A Mixture of Generalized Tukey's g Distributions

Mixtures of symmetric distributions, in particular normal mixtures, have been widely studied as a tool in statistical modeling. In recent years, mixtures of asymmetric distributions have emerged as a strong alternative for analyzing statistical data. Tukey's g family of generalized distributions depends on the parameter g, which controls the skewness. This paper presents the probability density function (pdf) associated with a mixture of Tukey's g family of generalized distributions. The mixture of this class of skewed distributions is a generalization of Tukey's g family of distributions. In this paper, we calculate a closed form expression for the density and distribution of the mixture of two Tukey's g families of generalized distributions, which allows us to easily compute probabilities, moments, and related measures. This class of distributions contains the mixture of Log-symmetric distributions as a special case.


Introduction
The distribution of stock market returns is a main focus of interest in financial economics. Mandelbrot [1] suggested the family of stable Paretian distributions for stock market returns. Fama [2] established that the normality assumption does not hold for the empirical data because the distribution is fat tailed. Kon [3] and Tse [4] used a mixture of normal distributions for stock returns. Fielitz and Rozelle [5] proposed a mixture of nonnormal stable distributions for stock prices. Consequently, greater emphasis has been placed on distributions that are asymmetric and leptokurtic. Recently, Jiménez et al. [6] proposed an option pricing model based on a mixture of log-skew-normal distributions. If extreme events tend to occur more frequently than the normal distribution implies, then the skewness and kurtosis of nonnormal distributions play an essential role in explaining the volatility smile.
The most important and useful characteristic of Tukey's $g$-$h$ family of distributions, introduced by Tukey [7], is that it covers most of the Pearsonian family of distributions. It can also generate several known distributions, for example, lognormal, Cauchy, exponential, and Chi-squared (see Martínez and Iglewicz [8], page 363). From Tukey's $g$-$h$ family of distributions we obtain the $g$ distribution, which is closely related to the lognormal distribution and has similar moment properties. Tukey's $g$-$h$ family of distributions has been used to study financial markets. Badrinath and Chatterjee [9,10] and Mills [11] used the $g$-$h$ distribution to model the return on a stock index, as well as the return on shares in several markets. Dutta and Babbel [12] found that the skewness and leptokurtic behavior of LIBOR were modeled effectively using the $g$-$h$ distribution. Dutta and Babbel [13] used the $g$-$h$ distribution to model interest rates and options on interest rates, while Tang and Wu [14] proposed a new method for the decomposition of portfolio VaR. Dutta and Perry [15] and, more recently, Jiménez and Arunachalam [16] used the $g$-$h$ distribution to study operational risk for heavy-tailed severity models. Jiménez and Arunachalam [17] provided explicit expressions for VaR and CVaR calculations using the family of Tukey's $g$-$h$ distributions. Jiménez et al. [18] studied a generalization of Tukey's $g$-$h$ family of distributions in which the standard normal random variable is replaced by a continuous random variable $U$ with mean 0 and variance 1.
The subfamily of $g$ distributions exhibits skewness and is of great importance in the study of asymmetric distributions for analyzing data. This kind of distribution allows us to obtain scaled Log-symmetric distributions. Vitiello and Poon [19] considered a simple mixture of two $g$ distributions for option pricing data. The purpose of this paper is to present a mixture of Tukey's $g$ distributions and to derive some of its statistical properties, including the pdf and the moment generating function.
The paper is organized as follows: Section 2 presents Tukey's $g$-$h$ family of generalized distributions together with its pdf and cumulative distribution function (cdf). Section 3 presents some theoretical results for the mixture of two Tukey's $g$ families of generalized distributions, and Section 4 explains the estimation of the parameters by the method of moments. Section 5 discusses the fitting of our proposed model to real data on Heating-Degree-Days (HDD) indices, and Section 6 concludes.

Tukey's 𝑔 Family of Generalized Distributions
Tukey [7] introduced the family of $g$-$h$ distributions by means of two nonlinear transformations given by

$$T_{g,h}(Z) = \frac{1}{g}\left(e^{gZ} - 1\right)e^{hZ^{2}/2}, \tag{1}$$

with $g \neq 0$, $h \in \mathbb{R}$, where the distribution of $Z$ is standard normal. When these transformations are applied to a continuous random variable $U$ with mean 0 and variance 1, whose pdf $f_U(\cdot)$ is symmetric about the origin and whose cdf is $F_U(\cdot)$, the transformation $T_{g,h}(U)$ is obtained, which henceforth will be termed Tukey's $g$-$h$ generalized distribution. If $h = 0$, Tukey's $g$-$h$ generalized distribution reduces to

$$T_{g,0}(U) = \frac{1}{g}\left(e^{gU} - 1\right), \tag{2}$$

which is known as Tukey's $g$ generalized distribution.
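In order to model an arbitrary random variable $X$ using the transformation given in (2), Hoaglin and Peters [20] introduced two new parameters, $A$ (location) and $B$ (scale), and proposed the following linear transformation:

$$X = A + BY \quad \text{with} \quad Y = T_{g,0}(U). \tag{3}$$

The following properties for the pdf, cdf, and quantile functions of Tukey's $g$ generalized distribution were established by Jiménez et al. [18] in terms of the pdf and cdf of $U$:

$$f_X(x) = \frac{1}{|g|\,|x-\theta|}\, f_U\!\left(\frac{\ln|x-\theta| - \mu}{g}\right), \qquad F_X(x) = F_U\!\left(\frac{\ln|x-\theta| - \mu}{g}\right), \qquad x_p = A + B\,T_{g,0}(u_p), \tag{4}$$

where $\mu = \ln(B/|g|)$, $\theta = A - B/g$, $u_p$ is the $p$th quantile of $U$, and $x > \theta$ if $g > 0$ while $x < \theta$ if $g < 0$. We say that the random variable $X$ has a Log-symmetric distribution (such distributions are all asymmetric; see for reference Johnson et al. [21] and Stuart and Ord [22]) with three parameters: threshold ($\theta$), scale ($\mu$), and shape ($g$), denoted by $X \sim \mathrm{LS}(\theta, \mu, g)$.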
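The pdf, cdf, and quantile relations of this section can be checked numerically. The following is a minimal sketch, assuming a standard normal base variable $U$ (only one of the admissible symmetric choices) and $B > 0$; the function names `tukey_g`, `g_quantile`, `g_cdf`, and `g_pdf` are hypothetical and introduced here for illustration only.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumption of this sketch: U ~ N(0, 1)

def tukey_g(u, g):
    """Tukey's g transformation T_{g,0}(u) = (exp(g*u) - 1) / g, with g != 0."""
    return math.expm1(g * u) / g

def g_quantile(p, A, B, g):
    """Quantile of X = A + B*T_{g,0}(U); the map u -> x is increasing when B > 0."""
    return A + B * tukey_g(STD_NORMAL.inv_cdf(p), g)

def _standardize(x, A, B, g):
    """Return u such that A + B*T_{g,0}(u) = x, or None if x is outside the support."""
    theta = A - B / g              # threshold
    mu = math.log(B / abs(g))      # log-scale
    if (x - theta) * g <= 0:       # support is x > theta for g > 0, x < theta for g < 0
        return None
    return (math.log(abs(x - theta)) - mu) / g

def g_cdf(x, A, B, g):
    u = _standardize(x, A, B, g)
    if u is None:
        return 0.0 if g > 0 else 1.0
    return STD_NORMAL.cdf(u)

def g_pdf(x, A, B, g):
    u = _standardize(x, A, B, g)
    if u is None:
        return 0.0
    theta = A - B / g
    return STD_NORMAL.pdf(u) / (abs(g) * abs(x - theta))
```

With a normal base variable and $g > 0$ this is the three-parameter lognormal case; a different symmetric $U$ would only change the `STD_NORMAL` calls.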
The first expression of (4) allows us to obtain the pdf associated with Tukey's $g$ distribution. Table 1 shows the parameters of the pdf obtained for a selected set of well known symmetrical distributions (from Jiménez et al. [18]).
The $n$th moment of the random variable $Y = T_{g,0}(U)$ is given by

$$E\left[Y^{n}\right] = \frac{1}{g^{n}}\sum_{k=0}^{n}(-1)^{k}\binom{n}{k}M_U(g_k), \tag{5}$$

where $g_k = (n-k)g$ and $M_U(\cdot)$ is the moment generating function of the random variable $U$, which is an even function; that is, $M_U(t) = M_U(-t)$. Table 2 shows the parameters of the pdf and the moment generating function for a random variable $U$, using a selected set of well known symmetrical distributions. Expression (5) allows us to obtain the moments of Tukey's $g$ generalized distribution. The $n$th moment of the random variable $X$ given by (3) can be obtained using the formula where $B/g = \operatorname{sgn}(g)\,e^{\mu}$. Note that the above expression of the $n$th moment does not depend on the parameter .
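The moment formula obtained by expanding $(e^{gU}-1)^n$ with the binomial theorem can be evaluated directly once the moment generating function of $U$ is known. The sketch below assumes $U \sim N(0,1)$, so that $M_U(t) = e^{t^2/2}$; the names `raw_moment_Y`, `raw_moment_X`, and `skewness` are hypothetical, and `raw_moment_X` simply expands $(A + BY)^n$, which may differ in form from the expression used in the paper.

```python
import math

def mgf_std_normal(t):
    """Moment generating function of U ~ N(0, 1); an assumption of this sketch."""
    return math.exp(0.5 * t * t)

def raw_moment_Y(n, g, mgf=mgf_std_normal):
    """E[Y^n] for Y = (exp(g*U) - 1)/g, via the binomial expansion of (exp(g*U) - 1)^n."""
    total = sum((-1) ** k * math.comb(n, k) * mgf((n - k) * g) for k in range(n + 1))
    return total / g ** n

def raw_moment_X(n, A, B, g, mgf=mgf_std_normal):
    """E[X^n] for X = A + B*Y, obtained by expanding (A + B*Y)^n."""
    return sum(math.comb(n, j) * A ** (n - j) * B ** j * raw_moment_Y(j, g, mgf)
               for j in range(n + 1))

def skewness(g, mgf=mgf_std_normal):
    """Standardized skewness of Y, built from the first three raw moments."""
    m1, m2, m3 = (raw_moment_Y(n, g, mgf) for n in (1, 2, 3))
    variance = m2 - m1 ** 2
    third_central = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3
    return third_central / variance ** 1.5
```

Under the normal assumption the skewness reduces to the lognormal value $(e^{g^2}+2)\sqrt{e^{g^2}-1}$ with the sign of $g$, which gives a quick sanity check on the implementation.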
The formulas for the standardized skewness and the standardized excess kurtosis are written in terms of the signum function $\operatorname{sgn}(\cdot)$. Note that these expressions depend only on the parameter $g$ and its sign, respectively. Any LS distribution should satisfy the test given in Stuart and Ord [22].

The Mixture of Two 𝑔 Distributions
We assume that $X$ follows a Log-Symmetric Mixture (LSMIX) distribution. Let us assume that $f_X(x)$ is the weighted sum of $k$ component LS densities $f_i(x)$; that is,

$$f_X(x) = \sum_{i=1}^{k}\omega_i f_i(x).$$

We use the notation $X \sim \mathrm{LSMIX}(\Lambda)$, where $\Lambda = (\lambda_1, \ldots, \lambda_k)$, each element $\lambda_i = (\omega_i, \theta_i, \mu_i, g_i)$ is the parameter vector that defines the $i$th component, and the probability weights $\omega_i$ satisfy the conditions $0 \le \omega_i \le 1$ and $\sum_{i=1}^{k}\omega_i = 1$. According to Titterington et al. [23], the two-component mixture of known distributions is set by two weights. We can then assume that $f_X(x)$ is the weighted sum of two Tukey's $g$ densities such that $g_1 g_2 > 0$. The resulting density (12) is piecewise, where, without loss of generality, we let $\theta_1 < \theta_2$ and $0 \le \omega_1 \le 1$, with $\theta_i = A - (B/g_i)$ and $\mu_i = \ln(B/|g_i|)$ for $i = 1, 2$. We use the notation $X \sim \mathrm{GTMIX}(A, B, g_1, g_2, \omega_1)$. Vitiello and Poon [19] did not provide the piecewise nature of the mixture density function in (12). In this case the cdf of $X$ is given by (14), where $\omega_2 = 1 - \omega_1$. Since the quantile function is the inverse of the cdf, the quantile $x_p$ is obtained by taking $x_p > \theta_2$ in (14).

If we assume that $U \sim N(0, 1)$, (12) can be written as (16), in terms of the standard normal pdf $\phi(\cdot)$. Note that this expression matches the pdf of a mixture of three-parameter lognormal distributions. Letting $\theta_1 = \theta_2 = 0$, the pdf reduces to that of a mixture of two-parameter lognormal distributions.
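Since every normal pdf is a shifted and rescaled version of the standard normal pdf, if $U \sim N(\mu, \sigma^2)$ we have $f_U(u) = \frac{1}{\sigma}\phi\!\left(\frac{u-\mu}{\sigma}\right)$, and (12) can be rewritten accordingly. If the parameters $g_i$ are scaled by $\sigma$, that is, $g_i^{*} = \sigma g_i$, the resulting density has parameters $\theta_i^{*} = A - (B/g_i^{*})$ and $\mu_i^{*} = \ln(B/|g_i^{*}|) + (\mu/\sigma)g_i^{*}$. Note that this expression matches the pdf of a mixture of three-parameter lognormal distributions, which is a generalization of the pdf given in (16).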
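The piecewise behaviour of the two-component mixture can be seen in a short numerical sketch. It assumes a standard normal base variable, $B > 0$, shape parameters of equal sign, and a first-component weight called `w1` here; the function names `component`, `mixture_pdf`, and `mixture_cdf` are hypothetical. Because each component has its own threshold $A - B/g_i$, only one component contributes between the two thresholds.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # assumption of this sketch: U ~ N(0, 1)

def component(x, A, B, g):
    """Return (pdf, cdf) at x for one component X = A + B*(exp(g*U) - 1)/g."""
    theta = A - B / g
    mu = math.log(B / abs(g))
    if (x - theta) * g <= 0:                 # x lies outside this component's support
        return 0.0, (0.0 if g > 0 else 1.0)
    u = (math.log(abs(x - theta)) - mu) / g
    return PHI.pdf(u) / (abs(g) * abs(x - theta)), PHI.cdf(u)

def _check(g1, g2, w1):
    if g1 * g2 <= 0 or not 0.0 <= w1 <= 1.0:
        raise ValueError("need g1*g2 > 0 and 0 <= w1 <= 1")

def mixture_pdf(x, A, B, g1, g2, w1):
    """Weighted sum of the two component densities."""
    _check(g1, g2, w1)
    return w1 * component(x, A, B, g1)[0] + (1.0 - w1) * component(x, A, B, g2)[0]

def mixture_cdf(x, A, B, g1, g2, w1):
    """Weighted sum of the two component cdfs."""
    _check(g1, g2, w1)
    return w1 * component(x, A, B, g1)[1] + (1.0 - w1) * component(x, A, B, g2)[1]
```

For example, with $A = 0$, $B = 1$, $g_1 = 0.3$, and $g_2 = 0.6$ the thresholds are about $-3.33$ and $-1.67$, so at $x = -2$ the density equals the weighted first component alone.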

Estimation of the Mixtures of Two Tukey's 𝑔 Distributions
In this section, we explain the estimation of the mixture of two Tukey's $g$ distributions. The expected value of $X$ and the $n$th raw moment of $X$ are expressed through $M_U(g_{ik})$, where $g_1 g_2 \neq 0$, $g_{ik} = (n-k)g_i$, and $M_U(\cdot)$ is the moment generating function of the random variable $U$. The central moments of $X$ follow from the raw moments, and the first five central moments form system (23). Because $\theta_1 < \theta_2$, the population moments in (23) are equated to the corresponding sample moments. The left-hand side of system (23) is multiplied by $\omega_1 + \omega_2 = 1$, and the equations take the form (26), where $m_n$ ($n = 1, 2, \ldots$) denotes the $n$th central moment of the sample. Equations (26) accordingly constitute a system of five equations to be solved simultaneously for the estimates of the five parameters $A$, $B$, $g_1$, $g_2$, and $\omega_1$.
Note that the first equation of system (26) allows $\omega_1$ to be expressed in terms of the remaining parameters. We eliminate $\omega_1$ between the first and the subsequent equations of (26) in turn and thereby reduce the system to the four equations (28) in the four unknowns $A$, $B$, $g_1$, and $g_2$. This system is solved numerically with a scientific software package, and we do not need to verify the uniqueness of the solution taken as the parameter estimates. We skip further details and numerical illustration owing to space constraints.
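The moment-matching idea can be sketched as a residual function that a numerical solver drives to zero. This is an illustration under stated assumptions, not the paper's reduced system: it assumes $U \sim N(0,1)$, matches the mean and the central moments of orders 2 to 5, and keeps all five parameters instead of eliminating the weight. The names `mixture_moments`, `sample_moments`, and `moment_residuals` are hypothetical.

```python
import math

def mgf(t):
    """Moment generating function of U ~ N(0, 1); an assumption of this sketch."""
    return math.exp(0.5 * t * t)

def raw_moment_component(n, A, B, g):
    """E[(A + B*Y)^n] with Y = (exp(g*U) - 1)/g, by double binomial expansion."""
    total = 0.0
    for j in range(n + 1):
        e_yj = sum((-1) ** k * math.comb(j, k) * mgf((j - k) * g)
                   for k in range(j + 1)) / g ** j
        total += math.comb(n, j) * A ** (n - j) * B ** j * e_yj
    return total

def mixture_moments(A, B, g1, g2, w1, order=5):
    """Population mean followed by central moments of orders 2..order."""
    raw = [w1 * raw_moment_component(n, A, B, g1)
           + (1.0 - w1) * raw_moment_component(n, A, B, g2)
           for n in range(order + 1)]
    mean = raw[1]
    central = [sum(math.comb(n, j) * (-mean) ** (n - j) * raw[j] for j in range(n + 1))
               for n in range(2, order + 1)]
    return [mean] + central

def sample_moments(data, order=5):
    """Sample mean followed by sample central moments of orders 2..order."""
    m = sum(data) / len(data)
    return [m] + [sum((x - m) ** n for x in data) / len(data)
                  for n in range(2, order + 1)]

def moment_residuals(params, data):
    """Population minus sample moments; zero at a method-of-moments solution."""
    A, B, g1, g2, w1 = params
    return [p - s for p, s in zip(mixture_moments(A, B, g1, g2, w1), sample_moments(data))]
```

A general-purpose nonlinear least-squares or root-finding routine could be applied to `moment_residuals`; since high-order sample moments are noisy, the starting values and the scaling of the residuals would need care.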

Illustration
In this section we illustrate the results of Section 3 with two examples. In the first example, we discuss the pricing of a call option using a mixture of two Tukey's $g$-generalized distributions. In the second example, we examine empirical Heating-Degree-Day data to demonstrate the usefulness of the mixture of LS distributions.

Jiménez et al. [24] derived the price of a European option assuming that the terminal price distribution follows a $g$-generalized distribution. If we instead use a mixture of two Tukey's classes of $g$-generalized distributions, then the price of the call option with strike price $K$ and maturity date $T = t + \tau$ is given by (29), where $K > \theta_2$. When $U \sim N(0, 1)$, (29) reduces to (31), where $\Phi(\cdot)$ denotes the cdf of a standard univariate normal variable. If we assume that $\theta_1 = \theta_2 = 0$, then (31) simplifies further. Note that when $g_i = \sigma_i\sqrt{\tau}$, these expressions coincide with the option pricing formula given in Bahra [25]. The authors also established closed-form formulas for the sensitivity measures of the option price (the Greeks). Our mixture model uses fewer unknown parameters to price the option, whereas Vitiello and Poon [19] used nine unknown parameters for the mixture of two distributions. It is well known that increasing the number of parameters reduces the degrees of freedom, which weakens the case for the model as the best fit to the data. This is an advantage of our approach to the mixture of two generalized distributions.

We now present, as an example, the use of Heating-Degree-Days (HDD) in relation to winter temperature risk as a substitute for gas demand. HDD-based contracts are listed on the Chicago Mercantile Exchange (CME). We consider monthly aggregate HDD values at the Chicago O'Hare International Airport from December 1979 to December 2000, given in Wang [26] and also explored by Vitiello and Poon [19]. We first describe a method based on the three-parameter LS distribution to infer the implied risk-neutral probability density (RND). In Table 3, we present the estimated values of the three parameters of the lognormal and Log-Logistic distributions; our interest is to compare the risk-neutral densities of Vitiello and Poon [19] with our proposed mixture model.
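For the call option example of this section, the normal case admits a simple closed form because each component is then a shifted lognormal. The following is a sketch under stated assumptions, not the paper's formula: it assumes $U \sim N(0,1)$, positive shape parameters, a strike above both thresholds, and a flat discount rate; no risk-neutral (martingale) restriction on the parameters is imposed. The names `expected_call_payoff`, `call_price`, `r`, and `tau` are hypothetical.

```python
import math
from statistics import NormalDist

PHI = NormalDist()  # assumption of this sketch: U ~ N(0, 1)

def expected_call_payoff(K, A, B, g1, g2, w1):
    """E[max(X - K, 0)] for the two-component mixture with g1, g2 > 0."""
    total = 0.0
    for weight, g in ((w1, g1), (1.0 - w1, g2)):
        if g <= 0:
            raise ValueError("this sketch only covers positive shape parameters")
        theta = A - B / g            # component threshold
        mu = math.log(B / g)         # component log-scale
        if K <= theta:
            raise ValueError("strike must exceed both thresholds")
        # X - theta = exp(mu + g*Z), so this is a lognormal call with strike K - theta.
        d = (mu - math.log(K - theta)) / g
        total += weight * (math.exp(mu + 0.5 * g * g) * PHI.cdf(d + g)
                           - (K - theta) * PHI.cdf(d))
    return total

def call_price(K, A, B, g1, g2, w1, r, tau):
    """Discounted expected payoff with a flat rate r over time to maturity tau."""
    return math.exp(-r * tau) * expected_call_payoff(K, A, B, g1, g2, w1)
```

Each term has the familiar two-part structure of lognormal call formulas, which is why the thresholds being zero leads back to a mixture-of-lognormals price.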
The smaller value of the Kolmogorov-Smirnov (KS) test statistic indicates that the data follow the three-parameter LS distributions. Note that the Anderson-Darling (AD) test is more sensitive to the tails of the LS distributions than the KS test. In this case, we choose the Log-Logistic distribution as the best fit for the HDD data.
The implied risk-neutral densities (RND) of the LS distributions are shown in Figure 1 and compared with Figure 6 of Vitiello and Poon [19]. Our method yields a similar plot with fewer unknown parameters than the method of Vitiello and Poon [19]. Furthermore, their KS test value of 13.6326% is higher than the KS test values in Table 3, which favors the LS distributions as the better fit. Finite mixtures are therefore attractive in applications because their flexibility permits us to model distributions of various shapes. In Table 4, we give the estimated values of the parameters of the mixture of LS distributions. These parameters are estimated using (28). The estimated two $g$-densities and the implied risk-neutral densities (RND) are shown in Figure 2. We observe that the bimodal LS mixture distribution has the same fitting performance as the empirical distribution function (EDF), and that the lognormal mixture distribution gives the best goodness of fit under the KS test.

Conclusions
This paper presents a mixture of Tukey's $g$-generalized distributions and its properties. The estimation of the unknown parameters by the method of moments is also presented. The proposed model has the advantage of flexibility when the skewness, kurtosis, or other moments of the underlying distribution depart from those of a normal distribution. Some well known distributions are obtained as special cases of the proposed model.

Figure 2: Empirical and two-$g$ densities estimated from HDD.

Table 1 :
Parameters of the pdf of the random variable  = ln().

Table 2: Parameters of the pdf and moment generating functions of the random variable $U$.

Table 4: Estimates for adjusting the mixture of LS(Λ).