A Doubling Method for the Generalized Lambda Distribution

This paper introduces a new family of generalized lambda distributions GLDs based on amethod of doubling symmetric GLDs. The focus of the development is in the context of L-moments and L-correlation theory. As such, included is the development of a procedure for specifying double GLDs with controlled degrees of L-skew, L-kurtosis, and L-correlations. The procedure can be applied in a variety of settings such as modeling events and Monte Carlo or simulation studies. Further, it is demonstrated that estimates of L-skew, L-kurtosis, and L-correlation are substantially superior to conventional product-moment estimates of skew, kurtosis, and Pearson correlation in terms of both relative bias and efficiency when heavy tailed distributions are of concern.


Introduction
The conventional moment-based family of generalized lambda distributions GLDs is often used in various applied mathematics contexts to model and describe data by a single functional form 1, page 5 . Some examples include modeling non-log-normal security price distributions 2 , biological and physical phenomena 3 , and solar radiation data 4 . The GLD is also a popular tool for generating random variables for Monte Carlo or simulation studies. Some examples include studies in such areas as operations research 5 , microarray research 6 , and structural equation modeling 7 .
The family of GLDs is based on the transformation 1, page 21 , 9, 11 , 10, page 127 where u is uniformly distributed on the interval 0, 1 . The parameters λ 1  and scale parameters, while λ 3 and λ 4 are shape parameters that determine the skew and kurtosis of a GLD. The pdf and cdf associated with 1.1 can be expressed as 10, page 127 where f : → 2 and F : → 2 are parametric forms of the pdf and cdf with the mappings u → x, y and u → x, v with x q u , y 1/q u , v u, and where 1 and u are the regular uniform pdf and cdf, respectively. It is assumed that q u > 0 in 1.2 to ensure a valid pdf that is, the transformation in 1.1 is strictly increasing. An essential requirement for a valid pdf is that λ 2 , λ 3 , λ 4 in 1.1 all have the same sign 1, page 24 . For more specific details on the parameter space and conditions related to valid GLDs, see Karian and Dudewicz 1, pages 21-47 . Provided in Figure 1 is an example of a valid GLD pdf based on 1.1 and 1.2 . Symmetric GLDs are produced for the case where λ 3 λ 4 in 1.1 and where the mean α 1 , variance α 2 2 , skew α 3 , and kurtosis α 4 can be determined from 8

1.7
Numerical solutions for λ 2 and λ 3 in 1.5 and 1.7 can be found in 1, Appendix B , which are associated with standardized GLDs i.e., α 1 λ 1 0, α 2 2 1 . Note that the term β in 1.5 and 1.7 represents the complete beta function where the arguments cannot be negative. As such, for the kth moment to exist then we must have λ 3 > −1/k 8, 9, 11 . Thus, the condition λ 3 > −1/4 ensures that the first four moments exist. We propose a new family of asymmetric GLDs based on a technique referred to herein as doubling symmetric GLDs. More specifically, a family of double GLDs can be created by selecting a pair of constants λ L , λ R and transforming separately for u ≤ 1/2 and u ≥ 1/2 as follows: where λ L λ R is the nonzero shape parameter for the left right side of a distribution and δ is the positive parameter that determines the height of the double GLD at u 1/2. Provided in Figure 2 is an example of a double GLD pdf based on 1.2 and 1.8 . Inspection of Figures 1 and 2 clearly indicates that these two GLDs are markedly different even though both distributions have the same values of skew α 3 3 and kurtosis α 4 65 . Note that the values of λ L and λ R in Figure 2 were determined based on the equations for α 3 and α 4 given in the appendix. Conventional moment-based estimators e.g., α 3,4 have unfavorable attributes insofar as they can be substantially biased, have high variance, or can be influenced by outliers. For 4 ISRN Applied Mathematics example, inspection of Figure 2 indicates, on average, that the estimates of α 3,4 are only 67.70% and 14.70% of their associated population parameters. Note that each estimate of α 3,4 in Figure 2 was calculated based on samples of size n 250 and the formulae currently used by most commercial software packages such as SAS, SPSS, and Minitab, for computing skew and kurtosis.
L-moment-based estimators, such as L-skew τ 3 and L-kurtosis τ 4 , have been introduced to address the limitations associated with conventional moment-based estimators 12, 13 . Specifically, some of the advantages that L-moments have over conventional moments are that they a exist whenever the mean of the distribution exists, b are nearly unbiased for all sample sizes and distributions, and c are more robust in the presence of outliers. For example, the estimates of τ 3,4 in Figure 2 are relatively much closer to their respective parameters τ 3,4 with much smaller standard errors than their corresponding conventional moment-based analogs α 3,4 . More specifically, the estimates of τ 3,4 that were simulated are, on average, 98.89% and 99.23% of their parameters.
In the context of multivariate data generation, the methodology has been developed for simulating symmetric or asymmetric GLDs based on 1.1 with specified Pearson correlation structures 2, 14 . This methodology is based on conventional product moments and the popular NORTA NORmal To Anything, 15 approach, which begins with generating multivariate standard normal deviates. However, the NORTA approach is not without its limitations. Specifically, one limitation arises because the Pearson correlation is not invariant under nonlinear strictly increasing transformations such as 1.1 . As such, the NORTA approach must begin with the computation of an intermediate correlation IC matrix, which is different than the specified correlation matrix between the GLDs. The purpose of the IC matrix is to adjust for the effect of the transformation in 1.1 such that the resulting GLDs have their specified skew, kurtosis, and Pearson correlation matrix.
Two additional limitations associated with the NORTA approach in this context are that solutions to an IC matrix may neither a exist in the range of -1, 1 as the absolute values of the ICs must be greater than or equal to their specified Pearson correlations nor b yield a positive definite IC matrix albeit the specified Pearson correlation matrix is positive definite 16 . Further, these two problems can be exacerbated when heavy tailed distributions are involved in the computation of ICs as functions performing numerical integration can more frequently either fail to converge or yield incorrect solutions. In contradistinction, it has been demonstrated in the context of the L-correlation that the limitations associated with the NORTA approach are less pronounced because the solution values of an IC matrix are in closer proximity to their specified positive definite L-correlation matrix 17 .
In view of the above, the present aim is to derive the double GLD family of distributions based on 1.8 in the contexts of L-moment and L-correlation theory. Specifically, the purpose of this paper is to develop the methodology and a procedure for simulating double GLDs with specified L-moments and L-correlations. The primary advantages of the proposed procedure are that estimates of L-skew, L-kurtosis, and L-correlation are less biased and more efficient.
The remainder of this paper is organized into four sections. The next section provides a summary of univariate L-moment theory and the derivations of the system of equations and boundary conditions for generating double GLDs with specified values of L-skew and L-kurtosis. The section thereafter introduces the coefficient of L-correlation and the equation is subsequently derived for determining ICs for specified L-correlations between double GLDs. The steps for implementing the proposed L-moment procedure are subsequently described. A numerical example and results of a simulation are also provided to confirm the derivations and compare the new L-moment-based procedure with the conventional product-momentbased procedure. In the last section, the results of the simulation are discussed.

Preliminaries
Let X 1 , . . . , X j , . . . , X n be iid random variables each with continuous pdf f x , cdf F x , order statistics denoted as X 1 : n ≤ · · · ≤ X j : n ≤ · · · ≤ X n : n , and L-moments defined in terms of either linear combinations of a expectations of order statistics or b probability weighted moments β i . For the purposes considered herein, the first four L-moments associated with X j : n are expressed as 13, pages 20-22 where the β i are determined from where i 0, . . . , 3. The coefficients associated with β i in 2.5 are obtained from shifted orthogonal Legendre polynomials and are computed as shown in 13, page 20 . The L-moments Λ 1 and Λ 2 in 2.1 and 2.2 are measures of location and scale and are the arithmetic mean and one-half the coefficient of mean difference, respectively. Higherorder L-moments are transformed to dimensionless quantities referred to as L-moment ratios defined as τ r Λ r /Λ 2 for r ≥ 3, and where τ 3 and τ 4 are the analogs to the conventional measures of skew and kurtosis. In general, L-moment ratios are bounded in the interval −1 < τ r < 1 as is the index of L-skew τ 3 where a symmetric distribution implies that all L-moment ratios with odd subscripts are zero. Other smaller boundaries can be found for more specific cases. For example, the index of L-kurtosis τ 4 has the boundary condition for continuous distributions of 18

L-Moments for Double GLDs
The family of double GLDs, associated with 1.8 , based on L-moments is less restrictive than the family based on conventional moments insofar as we may consider the nonzero 6 ISRN Applied Mathematics parameters of λ L , λ R > −1 for any distribution with finite k order L-moments rather than λ L , λ R > −1/k for the kth-order conventional moment to exist. This advantage is attributed to Hosking's Theorem 1 12 which states that if the mean Λ 1 exists, then all other L-moments will have finite expectations. As such, the family of double GLDs can be derived in the context of L-moments by defining the probability weighted moments based on 2.5 in terms of q u in 1.1 and the regular uniform pdf and cdf as Integrating 2.7 for i 0, 1, 2, 3 and using 2.1 -2.4 yields Thus, given specified values of τ 3 and τ 4 , 2.10 and 2.11 can be numerically solved for the corresponding values of λ L and λ R . Note that the values of L-skew τ 3 and L-kurtosis τ 4 in 2.10 and 2.11 are independent of the value of δ selected in 1.8 . Further, inspection of 2.10 and 2.11 indicates that interchanging values for the parameters λ L and λ R reverses the direction of τ 3 and has no effect on τ 4 . For the special case of λ L λ R the double GLD is symmetric where τ 3 0 in 2.10 and τ 4 in 2.11 will simplify to the expression A double GLD will lie in the area above the curve graphed in the |τ 3 | and τ 4 plane.
Differentiating 2.12 with respect to λ L and equating the resulting expression to zero and solving for λ L yields a minimum value of L-kurtosis of where λ L λ R √ 6−1. As such, using 2.10 and 2.11 with λ L √ 6−1 and λ R ∈ −1, √ 6−1 , provided in Figure 3 is a graph of the region for feasible combinations of τ 3 and τ 4 for double GLDs. Feasible combinations of τ 3 and τ 4 for 2.10 and 2.11 will lie in the region above the curve graphed in the |τ 3 |, τ 4 plane of Figure 3.
Provided in Figure 4 are some examples of various double GLDs. These distributions are used in the simulation portion of this study in Section 4. The next section begins with an introduction to the L-correlation.

The L-Correlation for Double GLDs
The coefficient of L-correlation see 19 is introduced by considering two random variables Y j and Y k with continuous distribution functions F Y j and F Y k , respectively. The second L-moments of Y j and Y k can alternatively be expressed as As such, the L-correlations of Y j toward Y k and Y k toward Y j are expressed as The L-correlation in 3.5 or 3.6 is bounded such that −1 ≤ η jk ≤ 1 where a value of η jk 1 η jk −1 indicates a strictly increasing decreasing monotone relationship between the two variables. In general, we would also note that η jk / η kj . In the context of the L-moment-based double GLD, suppose it is desired to simulate T distributions of the form in 1.8 with a specified L-correlation matrix and where each distribution has its own specified values of τ 3 and τ 4 . Let Z 1 , . . . , Z T denote standard normal variables where the distribution functions and bivariate density function associated with Z j and Z k are expressed as Using 3.7 , it follows that the jth double GLD associated with 1.8 can be expressed as q j Φ z j because Φ z j ∼ U 0, 1 . As such, using 3.5 , the L-correlation of q j Φ z j toward q k Φ z k can be evaluated using solved values of λ L j and λ R j for q j Φ z j , a specified intermediate correlation IC ρ jk in 3.9 , and the following integral expressed as The double GLD in 3.10 is standardized by a linear transformation such that it has a mean of zero and one-half the coefficient of mean difference equal to that of the unit-normal distribution as where Λ 1 is the mean from 2.8 and ξ is a constant that scales Λ 2 in 2.9 and in the denominator of 3.5 to 1/ √ π as Analogously, the L-correlation of q k Φ z k toward q j Φ z j is expressed as Note for the special case that if q j Φ z j in 3.10 and q k Φ z k in 3.13 have the same parameters that is, λ L j λ L k and λ R j λ R k , then η jk η kj . Provided in Algorithm 1 is source code written in Mathematica 20 that implements the computation of an IC ρ jk based on 3.10 . The details for simulating double GLDs with specified L-correlations are described in the next section.

The Procedure and Simulation Study
To implement the procedure for simulating double GLDs with specified L-moments and Lcorrelations we suggest the following six steps.
1 Specify the L-moments for T transformations of the form in 1.8 , that is, q 1 Φ z 1 , . . . , q T Φ z T and obtain the solutions for the parameters of λ L j and λ R j by solving 2.10 and 2.11 using the specified values of L-skew τ 3 and L-kurtosis τ 4 for each distribution. Specify a T × T matrix of L-correlations η jk for q j Φ z j toward q k Φ z k , where j < k ∈ {1, 2, . . . , T}.
2 Compute the Pearson intermediate correlations ICs ρ jk by substituting the solutions of λ L j and λ R j from Step 1 into 3.10 and then numerically integrate to solve for ρ jk see Algorithm 1 for an example . Repeat this step separately for all T T −1 / 2 pairwise combinations of correlations.
3 Assemble the ICs into a T × T matrix and decompose this matrix using a Cholesky factorization. Note that this step requires the IC matrix to be positive definite. 4 Use the results of the Cholesky factorization from Step 3 to generate T standard normal variables Z 1 , . . . , Z T correlated at the intermediate levels as follows: where V 1 , . . . , V T are independent standard normal random variables and where a ij represents the element in the ith row and the jth column of the matrix associated with the Cholesky factorization performed in Step 3.
5 Substitute Z 1 , . . . , Z T from Step 4 into the following Taylor series-based expansion for the standard normal cdf 21 : where φ Z j denotes the standard normal pdf and where the absolute error associated with 4.2 is less than 8 × 10 −16 .
6 Substitute the regular uniform deviates, Φ Z j , generated from Step 5 into the T equations of the form in 1.8 , as noted in Step 1 , to generate the double GLDs with the specified L-moments and L-correlations.
To demonstrate the steps above and evaluate the proposed procedure, a comparison between the new L-moment and conventional product moment-based procedures is subsequently described. Specifically, the distributions in Figure 4 are used as a basis for a comparison using the specified correlation matrices in Table 1 where both strong and moderate levels of correlation are considered. Tables 2 and 3 give the solved IC matrices for the conventional moment-and L-moment-based procedures, respectively. See Algorithm 2 for the algorithm and an example for computing ICs for the conventional procedure. Tables 4 and 5  In terms of the simulation, a Fortran algorithm was written for both procedures to generate 25,000 independent sample estimates for the specified parameters of a conventional Φ j CDF NormalDistribution 0,1 , Z j ; Φ k CDF NormalDistribution 0,1 , Z k ; λ Lj 0.2779733598283232; λ Rj −0.08329479143312762; λ Lk 0.3662599069298994; λ Rk 0.1297771263080004;       based on Fisher's k-statistics, that is, the formulae currently used by most commercial software packages such as SAS, SPSS, and Minitab, for computing indices of skew and kurtosis  The estimate for ρ * jk was based on the usual formula for the Pearson product-moment of correlation statistic and the estimate for η jk was computed based on 3.5 using the empirical forms of the cdfs in 3.1 and 3.3 . The estimates for ρ * jk and η jk were both transformed using Fisher's z transformation. Biascorrected accelerated bootstrapped average estimates, confidence intervals C.I.s , and standard errors were subsequently obtained for the estimates associated with the parameters α 3,4 , τ 3,4 , z ρ * jk , z η jk using 10,000 resamples via the commercial software package Spotfire S 23 .
The bootstrap results for the estimates of z ρ * jk and z η jk were transformed back to their original metrics. Further, if a parameter P was outside its associated bootstrap C.I., then an index of relative bias RB was computed for the estimate E as RB E − P /P × 100. The results of the simulation are reported in Tables 6, 7 , 8, 9, 10, 11 and are discussed in the next section.

Discussion and Conclusion
One of the advantages that L-moment ratios have over conventional moment-based estimators is that they can be far less biased when sampling is from distributions with heavy tails 13, 19 . Inspection of the simulation results in Tables 6 and 7 clearly indicates that this is the case. Specifically, the superiority that estimates of L-moment ratios τ 3 , τ 4 have over their corresponding conventional moment-based counterparts α 3 , α 4 is obvious. For example, with samples of size n 25 the estimates of skew and kurtosis for Distribution 1 were, on average, only 64.07% and 10.22% of their associated population parameters, whereas the estimates of L-skew and L-kurtosis were 98.6% and 99.35% of their respective parameters. It is also evident from comparing Tables 6 and 7 that L-skew and L-kurtosis are more efficient estimators as their standard errors are substantially smaller and more stable than the conventional moment-based estimators of skew and kurtosis.
Presented in Tables 8, 9, 10, 11 are the results associated with the conventional Pearson and L-correlations. Overall inspection of these tables indicates that the L-correlation is superior to the Pearson correlation in terms of relative bias. For example, for strong correlations n 25 the relative bias for the two heavy tailed distributions i.e., distributions 1 and 2 was 8.66% for the Pearson correlation compared to only 1.17% for the L-correlation. Further, for large sample sizes n 1000 , the L-correlation bootstrap C.I.s contained all of     the population parameters, whereas the Pearson correlation C.I.s contained none of the parameters. It is also noted that the variability of the L-correlation appears to be more stable than that of the Pearson correlation both within and across the different conditions. In summary, the new L-moment-based procedure is an attractive alternative to the traditional conventional moment-based procedure. In particular, the L-moment-based double GLD procedure has distinct advantages when distributions with heavy tails are used. Finally,

System of Conventional Moment-Based Equations for Double GLDs
The moments μ r 1,...,4 based on 1.8 can be determined from A.2 In terms of the double GLD in Figure 2, setting δ 1/ √ 2π in A.1 for the unit normal distribution, the moments associated with the location and scale parameters in A.2 are