Regional Frequency Analysis of Extremes Precipitation Using L-Moments and Partial L-Moments

1Department of Statistics, Quaid-i-Azam University, Islamabad, Pakistan 2Government Degree College, Lahore, Swabi, Khyber Pakhtunkhwa, Pakistan 3Department of Statistics, COMSATS Institute of Information Technology, Lahore, Pakistan 4Faculty of Health Studies, University of Bradford, Bradford, UK 5Bradford Institute for Health Research, Bradford Teaching Hospitals NHS Foundation Trust, Bradford, UK 6Arriyadh Community College, King Saud University, Riyadh, Saudi Arabia 7Workers University, Cairo, Egypt


Introduction
Hydraulic and hydrologic designs are key steps in planning of any water project.Any problem pitched at designing stage will result in the failure of design irrespective of the fact how correctly the other steps are taken.Hydrologists deal with water-related issues, problems of quantity, quality, and availability, in the society that known as hydrologic events.Stochastic methods are often used to understand sources of uncertainties in physical processes that give rise to observed hydrologic events, as precipitation and stream flow estimates depend on the past or future events.Several statistical methods offered to minimize and summarize the uncertainties of observed data and frequency analysis is one of them.It determines that how often a particular event will occur by estimating the quantile   for return period of , where  is the magnitude of the event that occurs at a given time and location.
Dalrymple [1] proposed regional frequency analysis (RFA) method for pooling various data samples.It is indexflood procedure in hydrology.Hosking et al. [2] studied the properties of probability-weighted moments (PWMs) method based on RFA method.This method is first used by Greis and Wood [3] and Wallis [4].Cunnane [5] reviewed twelve methods of RFA and related regional PWMs algorithm.
Initially, PWMs are considered as an alternative parameter estimation method; however, it was difficult to interpret directly as measures of the shape and scale parameters of distribution.RFA can forecast the flood flow using empirical We explain the methodology of the L-moments and PLmoments in Section 2.2.Section 2.3 shows the application of the RFA where we choose the appropriate distribution for the regional analysis.Section 3.3 provides the estimation of quantile for both, the small and large return period.The results of our simulations study will be presented in Section 4.

Study
Area and Data Sources.The data was collected for this study from Karachi Data Processing Center through Pakistan Meteorological (PMD) Islamabad.Monthly precipitation data has been recorded from 1999 to 2012.There are 17 meteorological stations of Northern areas and Khyber Pakhtunkhwa, Pakistan.These stations are full precipitation regions that affect areas in Pakistan, where water is essential for hydropower and flood plains.Figure 1 shows the location of the study area and geographic distribution of precipitation stations.There is no missing value in this data set.

The L-Moments.
Conventional moments method may be used for estimating the parameters of probability distribution.However, this approach has some serious drawbacks.Ratio of moment estimators is biased and often assumption of being normally distributed is violated, Wallis et al. [26].Furthermore, it is sensitive to outliers, Pearson [27].Therefore, it is unreliable for skewed distributions.
Hosking [28] proposed L-moments approach to overcome the above problems.The L-moments may precisely describe the statistical properties of hydrological information, and it can write as a linear function of the PWMs.The PWMs of order  were properly described by Greenwood et al. [29] as where , , and  are the real numbers and () is the cumulative distribution function of .A functional case is Therefore, where  = () is the cumulative distribution function of a random variable  and  is a nonnegative integer of the real number that is  = 0, 1, 2, 3, . ... Therefore the first four Lmoments, which are the linear combinations of the PWMs, are Advances in Meteorology The L-moments have no units of measurement, which are called the L-moments ratios.The L-moments ratios, proposed by Hosking [28], are computed as where  represents the L-coefficient of variation (L-Cv),  3 represents the L-coefficient of skewness (L-Cs), and  4 represents the L-coefficient of kurtosis (L-Ck).

The Partial L-Moments.
Wang [8,10,11] introduced a concept of partial probability-weighted moments (PPWMs) that will estimate the higher quantiles of flood flows.Data can be censored to the right tail or left tail.Initially, PPWMs were to take out the smaller values from the process of distribution fitting because such values have slight influence on the frequency analysis and are nuisance to the fitting process.
The left tail PPWMs are defined by Wang [8,11] as where  0 = ( 0 ) which is the lower limit of the censored observations and  0 is the censoring threshold value.The PPWMs elongated form described by Wang [10] are to be given a censored sample as If the value of  0 is starting from the zero, then the result of PPWMs will be the same as the usual PWMs.As  (1) ≤  (2) ≤  (3) ⋅ ⋅ ⋅ ≤  () is the arranged sample, Wang [10] describes the unbiased estimator of β  as where The censoring level,  0 , is the prior selection of the number of censored sample data.The procedure that determines the number of sample data points are to be censored: where  0 and  are the lengths of sample which are to be censored and uncensored, respectively.Similarly,  0 is the highest value of the censored sample.The first four PLmoments for the PPWMs are Similarly, the PL-moments ratios can be written as where τ , τ 3 , and τ 4 denote the partial L-coefficient of variation (PL-Cv), partial L-coefficient of skewness (PL-Cs), and partial L-coefficient of kurtosis (PL-Ck), respectively.Therefore, the first four sample PL-moments can be computed as And the first four sample PL-moments ratios can be computed as where t , t 3 , and t 4 represent the sample partial L-moments ratios of the PL-Cv, PL-Cs, and PL-Ck, respectively.The derivation is L-moment and PL-moments are given in Appendix.In the present study, different level of censoring threshold is selected.
2.3.Regional Frequency Analysis.Hosking and Wallis [7,30] identified the following four steps to explain the procedure of the RFA: (1) Data screening (2) Designing of the homogeneous region (3) Selection of an appropriate probability distribution (4) Parameters estimation of the appropriate probability distribution 2.3.1.Data Screening.We screened data anomalies before applying any statistical analysis.

Discordance Test.
Hosking and Wallis [7] suggested a discordancy measure (  ) test that recognizes the locations where sample L-moments are marked contrarily from the most other locations.Locations with the large flaws in the data will be flagged as discordant.
The discordancy test for a region containing  locations, for site , is proposed by Hosking and Wallis [7] as follows: where   is the vector containing the three sample L-moments ratios for the site  expressed as is the average vector of   for the overall region that is and  is the covariance matrix for the sample that can be expressed as Broadly speaking, a location or a site is considered to be discordant from the whole region or group if the value of   is larger than the critical value.

Heterogeneity Test.
The homogeneity measure (  ) identifies homogenous regions, Hosking and Wallis [7].It is also useful to tag the locations if they are plausible to handle as a homogeneous region.It estimates the amount of heterogeneity in the overall region.The heterogeneity test (  ) is computed as where  V and  V are representing the mean and standard deviation of the simulated   values.Also, Here,   2 ,   3 , and   4 are region average L-moments or PLmoments ratios.We assessed the heterogeneity of a region as suggested by Hosking and Wallis [7]: Region is acceptably homogeneous if  < 1.

Region is possibly heterogeneous
Region is definitely heterogeneous if  ≥ 2.

Selection of the Appropriate Probability Distribution.
Hosking and Wallis [7] proposed two approaches to select the distribution that fitted best the data: the L-moment ratios diagram and the -test.The L-moment ratios diagram is using the unbiased estimators, Hosking [28], Stedinger et al. [31], Vogel and Fennessey [32], and Hosking [33].The L-moments ratio diagram is a plot of the computed values L-Cs and the observed values L-Ck of the distribution function.The curves indicate the hypothetical connections between L-Cs and L-Ck of the candidate distribution.The L-moment ratio diagrams have been proposed for discriminating between the candidate probability distributions in describing the regional information (Hosking [28]; Stedinger et al. [31]; Hosking and Wallis [7]).The L-moments ratio diagrams have been used as a component of probability distribution process for regional information (Schaefer [34]; Pearson [35]; Vogel and Fennessey [32], Vogel et al. [36]; Chow and Watt [37]; Ön Öz and Bayazit [38]; Vogel and Wilson [39]; Peel et al. [40]).
Hosking and Wallis [7] suggested a measure to see how well the L-Cs and L-Ck of the fitted probability distribution match the regional average L-Cs and L-Ck of the observed information.
The measure goodness of fit for every single selected probability distribution is computed as follows: where  Dis 4 represents the value of the L-Ck of the fitted distribution,   4 represents the weighted regional average L-Ck, and  4 represents standard deviation of the   4 , which is obtained from the simulation of the Kappa probability distribution.
If the computed value of  Dis is equal to zero, the probability distribution will be the most suitable fit.If the computed

Estimation
The sites' information and statistic by using L-moments for the present study are presented in Table 1.In Table 1, mean represents the first sample L/PL-moments and ,  3 , and  4 are the sample L/PL-moments ratios of the L-Cv/PL-Cv, L-Cs/PL-Cs, and L-Ck/PL-Ck, respectively.The lower level censoring threshold is selected from 10 to 23%.Table 2 expresses the feasible threshold values according to the percentile technique along with Average Annual Occurrence Number (AAON).Jiang et al. [41] and Yuguo [42] suggested that the optimal threshold can be obtained if the values of AAON lie between 1 and 2. Table 2 shows that 90th percentile observations are suitable for the optimum threshold selection of most of areas in the present study.We have 168 values in each station; according to the above table, Astore station has 3.2 threshold values due to which 16 values are being censored in 168.According to censored level, 10.2% censored level was selected.Similarly, 10.5% was selected for Balakot and Muzaffarabad.According to this process, maximum threshold level 22.3% was selected for Gupis.By using the above process for each station, 17 different censoring levels were selected.So we decide that the range from 10 to 23% of censoring level should be kept for selecting threshold values.
3.1.Regional Frequency Analysis.The following four steps are considered as prerequisite for frequency analysis, Hosking and Wallis [7,30]: (1) Data screening (2) Designing of the homogeneous region (3) Selection of an appropriate probability distribution (4) Parameters estimation of the appropriate probability distribution 3.1.1.Data Screening.In this study, we use secondary data after carefully examining all locations for abnormalities and missing observations.Therefore, we use 14 years of data for RFA that were obtained from seventeen locations.

Discordance Test.
Table 3 shows   result of ( 18) for 17 locations of this study region.It can be observed from the results of L/PL-moments in Table 3 that the value of statistic varies from 0.07 to 2.44.If   is greater than 3, the location is considered to be discordant from the rest of the regional data, Hosking and Wallis [7].In this study region, no location is diagnosed as discordant (  ≥ 3).Therefore, we use all data for the development of the RFA based on Lmoments and PL-moments.
3.1.3.Regional Heterogeneity Measure.The next step is the formation of the homogeneous region, which is conventionally tougher and needs the higher number of subjective judgments.The homogeneity conditions are defined as the locations that have the same frequency distributions.For the present study area, realization of the Kappa probability distribution is used to conduct the heterogeneity test based on the L-moments and PL-moments.
Number of simulations are 10,000 for computing the heterogeneity.We computed the regional average L-moments ratios, the regional PL-moments ratios, and the corresponding parameter values of the fitted Kappa probability distribution (see Table 4).Table 4 shows the results of the heterogeneity measure using L-moments and PL-moments methods.It can be observed from Table 4 that the different values for the -statistic are −0.41,−1.60, and −2.89 based on L-moments and −0.06, −1.8, and −3.17 based on PL-moments.Therefore, we concluded that, by comparing these results and the heterogeneity conditions, study region is acceptably homogeneous for L-moments and PL-moments.No further subdivisions of the present study are necessary.

Fitting Appropriate Probability Distribution.
After homogeneity analysis of the study area, a suitable probability distribution is required for the RFA.The objective is not only to recognize a suitable probability distribution for RFA but also to observe a probability distribution that will provide robust quantile estimate for each location and for the regional growth cure.List of candidate probability distributions for RFA is GLO, GEV, GPA, and GNO.
We plotted L-moments and PL-moments diagrams for preliminary evaluation of the probability distribution for the study area.
Figure 2 illustrates an analogy of the observed and hypothetical relationships of the probability distribution.Figure 2 shows that GLO distribution is not a suitable candidate for the L-moments and PL-moments.
Interestingly, both analyses of the L-moments and the PLmoments diagram show that the sample average values are appropriately distinguished by the hypothetical L-moments and PL-moments for GPA and GNO distributions.
However, it is hard to find a suitable probability distribution that fits most of the regional observed data.Table 5 shows the goodness of fit test results for candidate probability distributions.
Table 5 shows that GLO distribution failed the goodness of fit test for both L-moments and for PL-moments methods as the calculated value of the -test for the GLO distribution is larger than the critical value of 1.64 (at 90% confidence level).It has been observed that the computed values of | Dis | are less than 1.64 (at 90% confidence level), namely, GEV, GNO, and GPA distributions.However, GEV, GNO, and GPA distributions are suitable for regional distribution based on L-moments and PL-moments methods and for obtaining the future estimates of the quantile.
Further, it can be noted that GPA distribution is suitable for L-moments method (lowest critical | Dis | value).Similarly, GNO distribution is suitable for PL-moments method (lowest critical value).Table 6 shows the estimates of the regional parameters for L-moments and PL-moments for the suitable probability distribution.

Estimation of the
Quantiles.The regional quantile estimates q(), with varying nonexceedance probability  for the GNO, GEV, and GPA distributions, are presented in Table 7 based on L-moments and PL-moments.Quantile function is normally represented as q(⋅) for fitted regional frequency distribution.The quantile estimate at location  is established by joining the estimate of   and (⋅).
Mathematical form of the quantile estimate with nonexceedance probability  is The regional growth curves for the GEV, GNO, and GPA distributions are shown in Figure 3.
Figure 3 shows the regional growth curves of each candidate distribution for L-moments and PL-moments.GEV, GNO, and GPA distributions are approximately identical up until 100-year return period ( = 0.99) for both Lmoments and PL-moments.However, afterward the growth curves of the GPA distribution lie below the GEV and GNO distributions.
Therefore, it is necessary to assess the performance of regional quantile estimates.

Accuracy of the Estimated Quantiles and the Regional Growth Curve
A Monte Carlo simulation is designed to assess accuracy of the regional quantile estimates that are obtained by the RFA.We use logical L-moments algorithm that has been reported by Hosking and Wallis [7] in Section 6.4.This algorithm takes samples from a region that has comparable characteristics as of the actual region, such as having the same record length, same number of locations, and the regional L-moments ratios.The area used for simulation should report the plausible heterogeneity in the area and intersite dependency if exist (Hosking and Wallis [7]).In the repeated sampling procedure, the quantile estimates are computed for the different nonexceedance probabilities.Suppose that, at th repetition and location , quantile estimate can be written  Reduced variate, − (− (F)) Figure 3: L-moments and PL-moments regional growth curves.
as Q()  () for the nonexceedance probability .The relative error for this estimator is The bias and the RMSE of the above quantity over all  repetition are Also, for the estimated quantile, the regional average bias and the relative RMSE are We use empirical quantities of quantile distribution for the assessment analysis that can be computed by taking the ratio of estimated to true values, Q ()/  () for the quantile, and q ()/  () for the regional growth curves.Therefore, 90% of the regional growth curve lie in between the interval: Inverting the expression for   (), we have q ()  0.05 () ≤  () ≤ q ()  0.05 () .
The 90% confidence interval limits show the measure of variation between the estimated and the true quantiles.These limits provide the expected magnitude of errors in the estimated quantiles and the regional growth curves.We computed L-moments ratios to find the most suitable distribution and the precision of original growth curves.The correlation between the study region sites varies from −0.05 to 0.86 with an average of 0.40.Therefore, we use algorithm from Table 6.1 of Hosking and Wallis [7].
We held out the analysis for recurrence of different years.We run 10,000 simulations with sample size of 30, 60, and 90 in each case.The whole process is repeated for GEV, GNO, and GPA distributions.From these repetitions, we computed several performance measures, such as the regional average relative bias, regional average RMSE, regional average relative RMSE, and the error bounds for the estimated regional growth curves for the selected nonexceedance probability .Overall results for the suitable probability distribution for both methods are presented in Tables 8, 9, and 10 for sample size of 30, 60, and 90, respectively.
Figures 4, 5, and 6 show estimated regional growth curves for sample sizes 30, 60, and 90, respectively, and also GEV, GNO, and GPA distributions with the 90% error bounds.
Tables 8, 9, and 10 show that increase in the sample size such as 30 to 90 improved the performance particularly in the prediction of the large nonexceedance probability .L-moments method provides similar performance for the GNO and GPA distributions in terms of relative bias that is presented in Tables 8, 9, and 10.We found that the GPA distribution produced the lowest relative bias compared to GEV and GNO distribution for the PL-moment for the various values of the nonexceedance probability .However, the GPA distribution performs better in terms of RMSE than the GEV and GNO distribution for both methods (Lmoments and PL-moments).Furthermore, RMSE is lowest for PL-moments compared to L-moments.In addition, the error bounds for the GPA distribution of regional quantiles are narrow compared to GEV and GNO distributions.It shows that the estimation of censored sample improves the prediction of extreme precipitation explicitly at large nonexceedance probability .

Discussion and Conclusion
This study provides a comprehensive evaluation of the L-moments and the PL-moments.First, revisiting RFA on L-moments by Hosking and Wallis [7], we aimed to develop similar connections of regional homogeneity for PLmoments.The L-moments and the PL-moments for candidate probability distributions (GLO, GEV, GNO, and GPA) are also developed for presenting the corresponding L-ratio and PL-ratio diagrams with the goodness of fit test results.The regional growth curves for the selected distribution have been shown in Figures 4, 5, and 6.At the lower tail, GEV, GPA, and GNO distributions are approximately the same, but, at the upper tail, there is variation between the regional quantiles.The regional homogeneity analysis starts by assuming 17 locations of Northern areas and Khyber Pakhtunkhwa, Pakistan, as one homogeneous region, based on L-moments and PL-moments at censoring level ranging from 10 to 23%.This assumption is statistically accepted after applying the heterogeneity and discordancy tests.The statistic provides appropriate distribution for modeling the monthly extreme precipitation in Northern areas and Khyber Pakhtunkhwa, Pakistan.We found that GPA distribution is suitable for the L-moments and GNO distribution for the PLmoments.
Finally, Monte Carlo simulation used for performance evaluation by commonly used error functions.Several accuracy measures such as relative bias, RMSE, relative RMSE, and error function bounds for the regional quantiles are computed with 10,000 runs of Monte Carlo simulations.We found that GPA distribution produced robust quantile estimates for both return periods and methods (L-moments and PL-moments).Our results support the finding of previous study (e.g., Cunnane [9]; Bhattarai [12]) for censored sample analysis where PL-moments method outperformed the Lmoments method for the estimation of large return periods events.Regional growth curve for GPA (based on PL-moments) Regional growth curve for GNO (based on L-moments) Regional growth curve for GNO (based on PL-moments) Regional growth curve for GEV (based on L-moments) Regional growth curve for GEV (based on PL-moments) Figure 4: Regional growth curves of the three distributions for L-moments and PL-moments with their 90% error bounds for sample size of 30.Reduced variate, − (− (F)) Reduced variate, − (− (F))

Advances in Meteorology
Regional growth curve for GPA (based on L-moments) Regional growth curve for GNO (based on L-moments) Regional growth curve for GEV (based on L-moments) Regional growth curve for GPA (based on PL-moments) Regional growth curve for GNO (based on PL-moments) Regional growth curve for GEV (based on PL-moments) Regional growth curve for GPA (based on L-moments) Regional growth curve for GNO (based on L-moments) Regional growth curve for GEV (based on L-moments) Regional growth curve for GPA (based on PL-moments) Regional growth curve for GNO (based on PL-moments)

Figure 1 :
Figure 1: Locations of Northern areas and Khyber Pakhtunkhwa and the meteorological stations ( axis indicates Longitude [E];  axis indicates Latitude [N]).

Figure 2 :
Figure 2: L-diagram and PL-diagram of the GLO, GEV, GPA, and GNO distributions.

Figure 5 :
Figure 5: Regional growth curves of the three distributions for L-moments and PL-moments with their 90% error bounds for sample size of 60.

Figure 6 :
Figure 6: Regional growth curves of the three distributions for L-moments and PL-moments with their 90% error bounds for sample size of 90.

Table 1 :
Statistics of annual extreme monthly precipitation for study region based on L-moments and PL-moments.
value of -statistic is less than 1.64 at 90% confidence level (i.e., | Dis | ≤ 1.64), it will indicate that the distribution qualifies the goodness of fit criteria.If there are more than one distribution that qualify the criteria, the most suitable distribution has the minimum | Dis | value.

Table 2 :
Precipitation threshold selection in GPA distribution for 17 stations.

Table 3 :
Discordance test result based on L-moments and PLmoments.

Table 4 :
Heterogeneity measures for the study region based on L-moments and PL-moments.

Table 5 :
-test result for the goodness of fit.

Table 6 :
Regional parameters for the three candidate distributions for L-moments and PL-moments.

Table 8 :
Accuracy measure of the estimated growth curves of the region for sample size of 30.

Table 9 :
Accuracy measure of the estimated growth curves of the region for sample size of 60.

Table 10 :
Accuracy measure of the estimated growth curves of the region for sample size of 90.