The Exponentiated Half-Logistic Family of Distributions: Properties and Applications


Introduction
The use of new generators of continuous distributions from classic distributions has become very common in recent years. One example is the beta-generated family of distributions proposed by Eugene et al. [4]. Another example is the gamma-generated family of distributions defined by Zografos and Balakrishnan [5]. Based on a baseline continuous distribution $G(x)$ with survival function $\bar{G}(x) = 1 - G(x)$ and density $g(x)$, their family is defined by the cumulative distribution function (cdf) and probability density function (pdf) (for $x \in \mathbb{R}$ and $\delta > 0$):
\[
F(x) = \frac{1}{\Gamma(\delta)} \int_0^{-\log \bar{G}(x)} t^{\delta-1} e^{-t}\, dt, \qquad f(x) = \frac{1}{\Gamma(\delta)} \{-\log \bar{G}(x)\}^{\delta-1} g(x), \tag{1}
\]
respectively, where $\Gamma(\delta) = \int_0^{\infty} t^{\delta-1} e^{-t}\, dt$ is the gamma function.
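Based on Zografos and Balakrishnan's [5] paper, we replace the gamma distribution by the exponentiated half-logistic ("EHL" for short) distribution to define a new family of continuous distributions by the cdf:
\[
F(x;\alpha,\lambda,\xi) = \int_0^{-\log[1-G(x;\xi)]} \frac{2\alpha\lambda\, e^{-\lambda t}\,(1-e^{-\lambda t})^{\alpha-1}}{(1+e^{-\lambda t})^{\alpha+1}}\, dt = \left\{\frac{1-[1-G(x;\xi)]^{\lambda}}{1+[1-G(x;\xi)]^{\lambda}}\right\}^{\alpha}, \tag{2}
\]
where $G(x;\xi)$ is the baseline cdf depending on a parameter vector $\xi$ and $\alpha > 0$ and $\lambda > 0$ are two additional shape parameters. For any continuous $G$ distribution, the EHL-$G$ distribution is defined by the cdf (2). Differentiating (2) gives the corresponding pdf
\[
f(x;\alpha,\lambda,\xi) = \frac{2\alpha\lambda\, g(x;\xi)\,[1-G(x;\xi)]^{\lambda-1}\,\{1-[1-G(x;\xi)]^{\lambda}\}^{\alpha-1}}{\{1+[1-G(x;\xi)]^{\lambda}\}^{\alpha+1}}, \tag{3}
\]
where $g(x;\xi) = dG(x;\xi)/dx$. Equation (2) defines a wider family of continuous distributions and includes some special models, such as those listed in Table 1.

... in applications to finance [6]. Now, we introduce a new four-parameter distribution called the EHLF distribution. Taking $G(x;\xi) = e^{-(\sigma/x)^{\beta}}$ to be the Fréchet cdf with scale parameter $\sigma > 0$ and shape parameter $\beta > 0$, where $\xi = (\sigma, \beta)^{T}$, the EHLF density function (for $x > 0$) is given by
\[
f(x) = \frac{2\alpha\lambda\beta\sigma^{\beta} x^{-(\beta+1)} e^{-(\sigma/x)^{\beta}}\, [1-e^{-(\sigma/x)^{\beta}}]^{\lambda-1}\, \{1-[1-e^{-(\sigma/x)^{\beta}}]^{\lambda}\}^{\alpha-1}}{\{1+[1-e^{-(\sigma/x)^{\beta}}]^{\lambda}\}^{\alpha+1}}. \tag{7}
\]
The cdf and hrf corresponding to (7) are
\[
F(x) = \left\{\frac{1-[1-e^{-(\sigma/x)^{\beta}}]^{\lambda}}{1+[1-e^{-(\sigma/x)^{\beta}}]^{\lambda}}\right\}^{\alpha}, \qquad h(x) = \frac{f(x)}{1-F(x)},
\]
respectively. A characteristic of the EHLF distribution is that its hrf can be monotonically increasing, monotonically decreasing, or upside-down bathtub shaped, depending on the parameter values. Plots of its density function and hrf for some parameter values are displayed in Figures 1 and 2, respectively.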
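The closed form of the family cdf makes the cdf, pdf, hrf, and quantile function easy to evaluate for any baseline. The following is a minimal Python sketch under stated assumptions: the cdf is taken as $\{(1-[1-G]^{\lambda})/(1+[1-G]^{\lambda})\}^{\alpha}$, the quantile function is obtained by inverting that expression, and the Fréchet baseline is parameterized by a scale `sigma` and a shape `beta`. All function names and the parameter values in the usage lines are illustrative and do not come from the paper.

```python
import math

def ehl_cdf(G, alpha, lam):
    """EHL-G cdf as a function of the baseline cdf value G in [0, 1]."""
    v = (1.0 - G) ** lam
    return ((1.0 - v) / (1.0 + v)) ** alpha

def ehl_pdf(G, g, alpha, lam):
    """EHL-G pdf from the baseline cdf value G and the baseline pdf value g."""
    v = (1.0 - G) ** lam
    return (2.0 * alpha * lam * g * (1.0 - G) ** (lam - 1.0)
            * (1.0 - v) ** (alpha - 1.0) / (1.0 + v) ** (alpha + 1.0))

def ehl_baseline_level(u, alpha, lam):
    """Baseline probability p such that ehl_cdf(p, alpha, lam) == u, 0 < u < 1."""
    w = u ** (1.0 / alpha)
    return 1.0 - ((1.0 - w) / (1.0 + w)) ** (1.0 / lam)

# Frechet baseline (assumed parameterization: scale sigma, shape beta).
def frechet_cdf(x, sigma, beta):
    return math.exp(-(sigma / x) ** beta)

def frechet_pdf(x, sigma, beta):
    return beta * sigma ** beta * x ** (-beta - 1.0) * math.exp(-(sigma / x) ** beta)

def frechet_quantile(p, sigma, beta):
    return sigma * (-math.log(p)) ** (-1.0 / beta)

def ehlf_cdf(x, alpha, lam, sigma, beta):
    return ehl_cdf(frechet_cdf(x, sigma, beta), alpha, lam)

def ehlf_pdf(x, alpha, lam, sigma, beta):
    return ehl_pdf(frechet_cdf(x, sigma, beta), frechet_pdf(x, sigma, beta), alpha, lam)

def ehlf_hrf(x, alpha, lam, sigma, beta):
    # Hazard rate: density divided by survival function.
    return ehlf_pdf(x, alpha, lam, sigma, beta) / (1.0 - ehlf_cdf(x, alpha, lam, sigma, beta))

def ehlf_quantile(u, alpha, lam, sigma, beta):
    return frechet_quantile(ehl_baseline_level(u, alpha, lam), sigma, beta)

if __name__ == "__main__":
    # Illustrative parameter values only.
    for x in (0.8, 1.5, 3.0):
        print(x, ehlf_cdf(x, 2.0, 1.5, 1.2, 3.0), ehlf_hrf(x, 2.0, 1.5, 1.2, 3.0))
```

Because the generator acts only on the baseline cdf value, swapping in another baseline requires nothing more than its cdf, pdf, and quantile function.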

Exponentiated Half-Logistic-Log-Logistic (EHLLL) Model.
The log-logistic (LL) distribution is widely used in practice and it is an alternative to the log-normal distribution since it presents a failure rate function that increases, reaches a peak after some finite period, and then declines gradually. The properties of the LL distribution make it an attractive alternative to the log-normal and Weibull distributions in the analysis of survival data [7]. This distribution can also exhibit a monotonically decreasing failure rate function for some parameter values. For $x > 0$, let $G(x;\xi) = 1 - [1 + (x/s)^{c}]^{-1}$ be the LL cdf, where $c > 0$ is the shape parameter and $s > 0$ is the scale parameter, where $\xi = (s, c)^{T}$. The EHLLL density function becomes
\[
f(x) = \frac{2\alpha\lambda\, (c/s)\,(x/s)^{c-1}\,[1+(x/s)^{c}]^{-(\lambda+1)}\,\{1-[1+(x/s)^{c}]^{-\lambda}\}^{\alpha-1}}{\{1+[1+(x/s)^{c}]^{-\lambda}\}^{\alpha+1}}.
\]
In Figure 3, we display some possible shapes of the EHLLL density function. The corresponding cdf and hrf are given by
\[
F(x) = \left\{\frac{1-[1+(x/s)^{c}]^{-\lambda}}{1+[1+(x/s)^{c}]^{-\lambda}}\right\}^{\alpha}, \qquad h(x) = \frac{f(x)}{1-F(x)},
\]
respectively. Plots of the EHLLL hrf for some parameter values are displayed in Figure 4.

Exponentiated Half-Logistic Generalized Half-Normal (EHLGHN) Model.
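The most popular models used to describe the lifetime process under fatigue are the half-normal (HN) and Birnbaum-Saunders (BS) distributions.
When modeling monotone hazard rates, the HN and BS distributions may be an initial choice because of their negatively and positively skewed density shapes. Consider $G(x;\xi)$ to be the generalized half-normal (GHN) distribution [8] with scale parameter $c > 0$ and shape parameter $a > 0$, where $\xi = (c, a)^{T}$, given by
\[
G(x;\xi) = 2\Phi[(x/c)^{a}] - 1 = \operatorname{erf}\!\left(\frac{(x/c)^{a}}{\sqrt{2}}\right),
\]
where $\Phi(\cdot)$ is the standard normal cdf and $\operatorname{erf}(\cdot)$ is the error function. Note that $\Phi(z) = \tfrac{1}{2}[1 + \operatorname{erf}(z/\sqrt{2})]$ and that the GHN density is $g(x;\xi) = \sqrt{2/\pi}\,(a/x)\,(x/c)^{a} \exp\{-\tfrac{1}{2}(x/c)^{2a}\}$. Then, the four-parameter EHLGHN density (for $x > 0$) can be expressed as
\[
f(x) = \frac{2\alpha\lambda\, g(x;\xi)\,[1-G(x;\xi)]^{\lambda-1}\,\{1-[1-G(x;\xi)]^{\lambda}\}^{\alpha-1}}{\{1+[1-G(x;\xi)]^{\lambda}\}^{\alpha+1}}. \tag{12}
\]
If $a = 1$, the EHLGHN model reduces to the exponentiated half-logistic half-normal (EHLHN) distribution. The cdf and hrf corresponding to (12) are
\[
F(x) = \left\{\frac{1-[1-G(x;\xi)]^{\lambda}}{1+[1-G(x;\xi)]^{\lambda}}\right\}^{\alpha}, \qquad h(x) = \frac{f(x)}{1-F(x)},
\]
respectively. A characteristic of the EHLGHN distribution is that its hrf can be bathtub shaped, monotonically increasing, monotonically decreasing, or upside-down bathtub shaped, depending on the parameter values. Plots of the EHLGHN density function and hrf for some parameter values are displayed in Figures 5 and 6, respectively.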

Shapes
The shapes of the density and hazard rate functions can be described analytically.The critical points of the EHL- density function are the roots of the equation: There may be more than one root to (14).Let () =  2 log[()]/ 2 .We have If  =  0 is a root of ( 14), then it corresponds to a local maximum if () > 0 for all  <  0 and () < 0 for all  >  0 .It corresponds to a local minimum if () < 0 for all  <  0 and () > 0 for all  >  0 .It refers to a point of inflexion if either () > 0 for all  ̸ =  0 or () < 0 for all  ̸ =  0 .The critical point of the hrf of , say ℎ(), is obtained from the following equation: There may be more than one root to (16).Let () =  2 log[ℎ()]/ 2 .We have If  =  0 is a root of ( 16), then it refers to a local maximum if () > 0 for all  <  0 and () < 0 for all  >  0 .It corresponds to a local minimum if () < 0 for all  <  0 and () > 0 for all  >  0 .It gives an inflexion point if either () > 0 for all  ̸ =  0 or () < 0 for all  ̸ =  0 .
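Critical points of the density can also be located numerically. The sketch below is an illustration rather than the paper's procedure: it assumes a log-logistic baseline with scale `s` and shape `c`, approximates $d\log f(x)/dx$ by central differences, scans a grid for sign changes, and refines each one by bisection. A change from positive to negative is labelled a local maximum and the reverse a local minimum (the first-derivative test). Function names, grid limits, and parameter values are hypothetical.

```python
import math

def ehlll_logpdf(x, alpha, lam, s, c):
    """Log-density of the EHL-G family with a log-logistic baseline (assumed form)."""
    z = (x / s) ** c
    surv = 1.0 / (1.0 + z)                      # 1 - G(x) for the LL baseline
    g = (c / s) * (x / s) ** (c - 1.0) * surv ** 2
    v = surv ** lam
    return (math.log(2.0 * alpha * lam * g) + (lam - 1.0) * math.log(surv)
            + (alpha - 1.0) * math.log(1.0 - v) - (alpha + 1.0) * math.log(1.0 + v))

def dlog(x, params, h=1e-6):
    """Central-difference approximation of d log f / dx."""
    return (ehlll_logpdf(x + h, *params) - ehlll_logpdf(x - h, *params)) / (2.0 * h)

def critical_points(params, lo=0.05, hi=10.0, n=2000):
    """Roots of d log f / dx on [lo, hi], labelled by the direction of the sign change."""
    found = []
    step = (hi - lo) / n
    a, da = lo, dlog(lo, params)
    for i in range(1, n + 1):
        b = lo + i * step
        db = dlog(b, params)
        if da * db < 0.0:
            left, right, dl = a, b, da
            for _ in range(60):                 # bisection refinement
                mid = 0.5 * (left + right)
                dm = dlog(mid, params)
                if dl * dm <= 0.0:
                    right = mid
                else:
                    left, dl = mid, dm
            found.append((0.5 * (left + right), "max" if da > 0.0 else "min"))
        a, da = b, db
    return found

if __name__ == "__main__":
    # (alpha, lam, s, c): illustrative values only.
    print(critical_points((2.0, 1.5, 1.0, 3.0)))
```

The same scan applies to the hrf by replacing the log-density with the log of the hazard rate.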

A Useful Expansion and Quantile Power Series
We can demonstrate that the cdf of $X$ given by (2) admits an expansion in terms of $H_{a}(x) = G(x)^{a}$, which denotes the exponentiated-$G$ ("exp-$G$") cumulative distribution with power parameter $a$. Some structural properties of the exp-$G$ distributions are investigated by Mudholkar et al. [9], Gupta and Kundu [10], and Nadarajah and Kotz [11], among others.
The density function of $X$ can be expressed as an infinite linear combination of exp-$G$ density functions $h_{k+1}(x;\xi) = (k+1)\, g(x;\xi)\, G(x;\xi)^{k}$, where $h_{k+1}$ denotes the density function of the exp-$G$ random variable $Y_{k+1} \sim$ exp-$G(k+1)$ with power parameter $k+1$. Equation (20) reveals that the EHL-$G$ density function is a linear combination of exp-$G$ density functions. Thus, some mathematical properties of the new family can be obtained directly from those properties of the exp-$G$ distribution.
Next, we derive an expansion for the argument of   (⋅) in ( 6): Using the generalized binomial expansion four times since  ∈ (0, 1), we can write and then where   = ∑ ∞ ,=0 ∑ ∞ =  ,,, and   = −  for  ≥ 1,  0 = 1− 0 , and Then, the quantile function of  can be expressed from (6) as where for  ≥ 1 and For any baseline  distribution, we can combine ( 21) with (28) to obtain and then using ( 22) and (23), we have where   = ∑ ∞ =0    , ,  ,0 =   0 , and, for  > 1, Equation ( 32) is the main result of this section since it allows to obtain various mathematical quantities for the EHL family as shown in the next sections.
The formulae derived throughout the paper can be easily handled in most symbolic computation software platforms such as Maple, Mathematica, and MATLAB.These platforms currently have the ability to deal with analytic expressions of formidable size and complexity.Established explicit expressions to calculate statistical measures can be more efficient than computing them directly by numerical integration.The infinity limit in these sums can be substituted by a large positive integer such as 20 or 30 for most practical purposes.

Moments
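Hereafter, we will assume that $G(x)$ is the cdf of a random variable $Y$ and that $F(x)$ is the cdf of the random variable $X$ having density function (3). The moments of $X$ can be obtained from the $(r, s)$th probability weighted moments (PWMs) of $Y$ given by
\[
\tau_{r,s} = E[Y^{r} G(Y)^{s}] = \int_{-\infty}^{\infty} x^{r}\, G(x)^{s}\, g(x)\, dx. \tag{34}
\]
An alternative expression for $\tau_{r,s}$ can be determined using (22) and (23). The PWMs for several distributions can be calculated from (34) and (35).
Using (20), the $r$th moment $\mu'_{r} = E(X^{r})$ can be written in terms of the baseline PWMs (equation (36)). Thus, the moments of any EHL-$G$ distribution can be expressed as an infinite weighted linear combination of the baseline PWMs. Equations (34)-(36) are the main results of this section. Further, the central moments ($\mu_{r}$) and cumulants ($\kappa_{r}$) of $X$ can be calculated as
\[
\mu_{r} = \sum_{k=0}^{r} (-1)^{k} \binom{r}{k} \mu_{1}'^{\,k}\, \mu'_{r-k}, \qquad \kappa_{r} = \mu'_{r} - \sum_{k=1}^{r-1} \binom{r-1}{k-1} \kappa_{k}\, \mu'_{r-k},
\]
respectively, where $\kappa_{1} = \mu'_{1}$, $\kappa_{2} = \mu'_{2} - \mu_{1}'^{\,2}$, $\kappa_{3} = \mu'_{3} - 3\mu'_{2}\mu'_{1} + 2\mu_{1}'^{\,3}$, and so forth. The skewness $\gamma_{1} = \kappa_{3}/\kappa_{2}^{3/2}$ and kurtosis $\gamma_{2} = \kappa_{4}/\kappa_{2}^{2}$ quantities follow from the second, third, and fourth cumulants.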
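As a numerical cross-check on series-based moment formulas, the moments can also be approximated directly from the quantile function, since $E(X^{r}) = \int_0^1 Q(u)^{r}\, du$. The sketch below is an illustration under assumptions, not the paper's method: it uses the inverse of the EHL-$G$ cdf $\{(1-[1-G]^{\lambda})/(1+[1-G]^{\lambda})\}^{\alpha}$, a plain midpoint rule, and the cumulant-based skewness $\kappa_3/\kappa_2^{3/2}$ and kurtosis $\kappa_4/\kappa_2^{2}$. With a unit exponential baseline and $\alpha = \lambda = 1$, the family reduces to the standard half-logistic distribution, whose mean is $\log 4$, which gives a convenient sanity check. All names are hypothetical.

```python
import math

def ehl_quantile(u, alpha, lam, baseline_quantile):
    """Quantile of the EHL-G family: invert the cdf, then apply the baseline quantile."""
    w = u ** (1.0 / alpha)
    p = 1.0 - ((1.0 - w) / (1.0 + w)) ** (1.0 / lam)
    return baseline_quantile(p)

def raw_moment(r, alpha, lam, baseline_quantile, n=200_000):
    """Midpoint-rule approximation of E(X^r) = integral over (0, 1) of Q(u)^r."""
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n
        total += ehl_quantile(u, alpha, lam, baseline_quantile) ** r
    return total / n

def skewness_kurtosis(alpha, lam, baseline_quantile):
    """Cumulant-based skewness k3 / k2^(3/2) and kurtosis k4 / k2^2."""
    m1, m2, m3, m4 = (raw_moment(r, alpha, lam, baseline_quantile) for r in (1, 2, 3, 4))
    k2 = m2 - m1 ** 2
    k3 = m3 - 3.0 * m2 * m1 + 2.0 * m1 ** 3
    k4 = m4 - 4.0 * m3 * m1 - 3.0 * m2 ** 2 + 12.0 * m2 * m1 ** 2 - 6.0 * m1 ** 4
    return k3 / k2 ** 1.5, k4 / k2 ** 2

def exp_quantile(p):
    """Quantile function of the unit exponential baseline."""
    return -math.log(1.0 - p)

if __name__ == "__main__":
    print(raw_moment(1, 1.0, 1.0, exp_quantile), math.log(4.0))
    print(skewness_kurtosis(1.0, 1.0, exp_quantile))
```

The midpoint rule is adequate here only because the required moments exist; for heavy-tailed baselines the integrand may not be integrable and the approximation is then meaningless.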

EHLF Model.
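Consider the Fréchet baseline cdf $G_{\sigma,\beta}(x) = e^{-(\sigma/x)^{\beta}}$ for $x > 0$ and corresponding pdf $g_{\sigma,\beta}(x) = dG_{\sigma,\beta}(x)/dx$ discussed in Section 2.2. Since $G_{\sigma,\beta}(x)^{k+1} = G_{\sigma^{*},\beta}(x)$ with $\sigma^{*} = \sigma (k+1)^{1/\beta}$, each exp-$G$ density in (20) is itself a Fréchet density $g_{\sigma^{*},\beta}(x)$. This reveals that the EHLF density function can be expressed as an infinite mixture of Fréchet densities. The $(r, s)$th PWM of the Fréchet distribution becomes
\[
\tau_{r,s} = \int_0^{\infty} x^{r}\, \beta\sigma^{\beta} x^{-(\beta+1)}\, e^{-(s+1)(\sigma/x)^{\beta}}\, dx.
\]
Setting $t = (s+1)(\sigma/x)^{\beta}$, $\tau_{r,s}$ reduces to
\[
\tau_{r,s} = \sigma^{r} (s+1)^{r/\beta - 1} \int_0^{\infty} t^{-r/\beta} e^{-t}\, dt.
\]
The integral converges absolutely for $r < \beta$ and then
\[
\tau_{r,s} = \sigma^{r} (s+1)^{r/\beta - 1}\, \Gamma(1 - r/\beta).
\]
Plots of the skewness and kurtosis as functions of one of the additional shape parameters, for some choices of the other and with the two Fréchet parameters fixed at 2.1 and 3.1, are displayed in Figure 7.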
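The Fréchet PWM can be checked numerically. Assuming the parameterization $G(x) = e^{-(\sigma/x)^{\beta}}$ (the symbols `sigma` and `beta` are this sketch's own labels), the substitution $t = (s+1)(\sigma/x)^{\beta}$ gives $\tau_{r,s} = \sigma^{r}(s+1)^{r/\beta-1}\Gamma(1-r/\beta)$ for $r < \beta$. The code below compares this closed form with a midpoint-rule approximation of $\int_0^1 Q_G(p)^{r} p^{s}\, dp$, where $Q_G$ is the Fréchet quantile function. The function names and test values are illustrative.

```python
import math

def frechet_pwm_closed_form(r, s, sigma, beta):
    """sigma^r (s+1)^(r/beta - 1) Gamma(1 - r/beta); requires r < beta."""
    if r >= beta:
        raise ValueError("the PWM is finite only for r < beta")
    return sigma ** r * (s + 1.0) ** (r / beta - 1.0) * math.gamma(1.0 - r / beta)

def frechet_pwm_numeric(r, s, sigma, beta, n=200_000):
    """Midpoint rule for the integral over (0, 1) of Q_G(p)^r p^s."""
    total = 0.0
    for i in range(n):
        p = (i + 0.5) / n
        q = sigma * (-math.log(p)) ** (-1.0 / beta)   # Frechet quantile
        total += q ** r * p ** s
    return total / n

if __name__ == "__main__":
    print(frechet_pwm_closed_form(1, 2, 2.1, 3.1), frechet_pwm_numeric(1, 2, 2.1, 3.1))
```

The numerical integrand has an integrable singularity at $p \to 1$, so the midpoint rule converges slowly; the comparison is meant as a sanity check, not a high-precision evaluation.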

Other Measures
In this section, we calculate the following measures: generating function, incomplete moments, mean deviations, reliability, entropies, and order statistics for the EHL- family.

Generating Function.
Here, we provide two formulae for the moment generating function (mgf) $M(t) = E(e^{tX})$ of $X$.
A first formula for $M(t)$ comes from (20) as a linear combination of the functions $M_{k+1}(t)$, where $M_{k+1}(t)$ is the generating function of the exp-$G$ distribution with power parameter $k+1$. Hence, $M(t)$ can be determined from the exp-$G$ generating function.

Incomplete Moments.
Incomplete moments of the income distribution form natural building blocks for measuring inequality. For example, the Lorenz and Bonferroni curves depend upon the incomplete moments of the income distribution. The $r$th incomplete moment of $X$ is defined as $m_{r}(y) = \int_{-\infty}^{y} x^{r} f(x)\, dx$. Here, we provide two formulae to calculate the incomplete moments of the EHL-$G$ family. First, the $r$th incomplete moment of $X$ can be expressed as a linear combination of integrals involving the baseline cdf and pdf. The integral in (54) can be computed at least numerically for most baseline distributions.
A second general formula for $m_{1}(y)$ can be derived by setting $u = G(x)$ in (57). Equations (55)-(59) are the main results of this section.

Entropies.
An entropy is a measure of variation or uncertainty of a random variable $X$. Two popular entropy measures are the Rényi and Shannon entropies. The Rényi entropy of a random variable with pdf $f(x)$ is defined (for $\gamma > 0$ and $\gamma \neq 1$) as
\[
I_{R}(\gamma) = \frac{1}{1-\gamma} \log\left(\int_{-\infty}^{\infty} f(x)^{\gamma}\, dx\right).
\]
The Shannon entropy of a random variable $X$ is given by $E\{-\log[f(X)]\}$, which is the special case of the Rényi entropy when $\gamma \uparrow 1$. After some algebraic manipulations, we obtain the following.

Proposition 1.
Let $X$ be a random variable with pdf given by (3). Then, The simplest formula for the entropy of $X$ becomes After some algebraic developments, we obtain an alternative expression for $I_{R}(\gamma)$: where  , ∼ Beta(1, ( + ) + ( − 1) + 1).
Order Statistics.
Order statistics make their appearance in many areas of statistical theory and practice. Suppose that $X_{1}, X_{2}, \ldots, X_{n}$ is a random sample from the EHL-$G$ distribution. Let $X_{i:n}$ denote the $i$th order statistic. From (18) and (20), the pdf of $X_{i:n}$ is given by
\[
f_{i:n}(x) = K f(x) F(x)^{i-1} [1-F(x)]^{n-i} = K \sum_{j=0}^{n-i} (-1)^{j} \binom{n-i}{j} f(x) F(x)^{j+i-1},
\]
where $K = n!/[(i-1)!(n-i)!]$. Using (22) and (23), $f_{i:n}(x)$ can be written as a series in exp-$G$ densities. Equation (72) is the main result of this section. It reveals that the pdf of the EHL-$G$ order statistics is a linear combination of exp-$G$ density functions. So, several structural quantities of the EHL-$G$ order statistics, such as ordinary and incomplete moments, generating function, and mean deviations, can be obtained from the corresponding quantities of exp-$G$ distributions.

Bivariate Extensions
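In this section, we introduce two extensions of the proposed model. The first extension is based on the idea of [15]. Let $X_{1} \sim$ EHL-$G(\alpha_{1}, \lambda, \xi)$, $X_{2} \sim$ EHL-$G(\alpha_{2}, \lambda, \xi)$, and $X_{3} \sim$ EHL-$G(\alpha_{3}, \lambda, \xi)$ be independent random variables. Further, we define $X = \max\{X_{1}, X_{3}\}$ and $Y = \max\{X_{2}, X_{3}\}$. Then, the joint cdf of the bivariate random variable $(X, Y)$ is given by
\[
F_{X,Y}(x, y) = F(x;\alpha_{1},\lambda,\xi)\, F(y;\alpha_{2},\lambda,\xi)\, F(z;\alpha_{3},\lambda,\xi),
\]
where $z = \min\{x, y\}$ and $F(\cdot\,;\alpha,\lambda,\xi)$ is the cdf (2). The marginal cdf's are $F_{X}(x) = F(x;\alpha_{1}+\alpha_{3},\lambda,\xi)$ and $F_{Y}(y) = F(y;\alpha_{2}+\alpha_{3},\lambda,\xi)$; that is, $X \sim$ EHL-$G(\alpha_{1}+\alpha_{3}, \lambda, \xi)$ and $Y \sim$ EHL-$G(\alpha_{2}+\alpha_{3}, \lambda, \xi)$.

Let $x_{1}, \ldots, x_{n}$ be observed values from $X \sim$ EHL-$G(\alpha, \lambda, \xi)$, where $\xi$ is a $p \times 1$ vector of unknown parameters of the baseline distribution $G(x;\xi)$. The log-likelihood function for $\theta = (\alpha, \lambda, \xi^{T})^{T}$ can be expressed as
\[
\ell(\theta) = n \log(2\alpha\lambda) + \sum_{i=1}^{n} \log g(x_{i};\xi) + (\lambda-1) \sum_{i=1}^{n} \log[1-G(x_{i};\xi)] + (\alpha-1) \sum_{i=1}^{n} \log\{1-[1-G(x_{i};\xi)]^{\lambda}\} - (\alpha+1) \sum_{i=1}^{n} \log\{1+[1-G(x_{i};\xi)]^{\lambda}\}. \tag{84}
\]
Equation (84) can be maximized either directly, for example, using SAS (Proc NLMixed) or Ox (subroutine MaxBFGS) (see [16]), or by solving the nonlinear likelihood equations obtained by differentiating (84) and setting the score vector to zero. Initial estimates of the parameters $\alpha$ and $\lambda$ may be inferred from the estimates of $\xi$. The components of the score vector $U(\theta)$ follow by differentiating (84) with respect to $\alpha$, $\lambda$, and $\xi$. For interval estimation and hypothesis tests on the model parameters, we require the $(p+2) \times (p+2)$ observed information matrix $J = J(\theta)$, calculated numerically. Under conditions that are fulfilled for parameters in the interior of the parameter space but not on the boundary, $\sqrt{n}(\hat{\theta} - \theta)$ is asymptotically normal $N_{p+2}(0, I(\theta)^{-1})$, where $I(\theta)$ is the expected information matrix. We can substitute $I(\theta)$ by $J(\hat{\theta})$, that is, the observed information matrix evaluated at $\hat{\theta}$, and then the multivariate normal $N_{p+2}(0, J(\hat{\theta})^{-1})$ distribution can be used to construct approximate confidence regions for the model parameters.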
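We can compute the maximum values of the unrestricted and restricted log-likelihoods to construct likelihood ratio (LR) statistics for testing some special models of the EHL-$G$ distribution. For example, comparing the EHLGHN and EHLHN distributions is equivalent to testing $H_{0}: a = 1$ versus $H_{1}: a \neq 1$, and the LR statistic reduces to
\[
w = 2\{\ell(\hat{\alpha}, \hat{\lambda}, \hat{c}, \hat{a}) - \ell(\tilde{\alpha}, \tilde{\lambda}, \tilde{c}, 1)\},
\]
where $\hat{\alpha}$, $\hat{\lambda}$, $\hat{c}$, and $\hat{a}$ are the MLEs under $H_{1}$ and $\tilde{\alpha}$, $\tilde{\lambda}$, and $\tilde{c}$ are the estimates under $H_{0}$.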
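The log-likelihood of the family is a sum of five terms that depend on the data only through the baseline cdf and pdf, so it is short to code. The sketch below assumes the density $2\alpha\lambda\, g\,(1-G)^{\lambda-1}\{1-(1-G)^{\lambda}\}^{\alpha-1}/\{1+(1-G)^{\lambda}\}^{\alpha+1}$ and a GHN baseline written as $G(x) = \operatorname{erf}((x/c)^{a}/\sqrt{2})$. The function names are hypothetical, the data in the usage lines are made up for illustration, and the maximization itself is left to whatever optimizer is at hand (the paper reports using SAS Proc NLMixed).

```python
import math

def ghn_cdf(x, c, a):
    """Generalized half-normal cdf (assumed parameterization: scale c, shape a)."""
    return math.erf((x / c) ** a / math.sqrt(2.0))

def ghn_pdf(x, c, a):
    z = (x / c) ** a
    return math.sqrt(2.0 / math.pi) * (a / x) * z * math.exp(-0.5 * z * z)

def ehl_loglik(xs, alpha, lam, cdf, pdf):
    """Log-likelihood of an EHL-G model for observations xs and a given baseline."""
    total = len(xs) * math.log(2.0 * alpha * lam)
    for x in xs:
        G, g = cdf(x), pdf(x)
        v = (1.0 - G) ** lam
        total += (math.log(g) + (lam - 1.0) * math.log(1.0 - G)
                  + (alpha - 1.0) * math.log(1.0 - v)
                  - (alpha + 1.0) * math.log(1.0 + v))
    return total

def lr_statistic(loglik_unrestricted, loglik_restricted):
    """w = 2 * (maximized unrestricted log-likelihood - maximized restricted one)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)

if __name__ == "__main__":
    data = [0.4, 0.9, 1.1, 1.6, 2.3]            # made-up values, illustration only
    # Log-likelihood at fixed (not maximized) parameter values:
    full = ehl_loglik(data, 1.3, 0.8, lambda x: ghn_cdf(x, 1.5, 1.4),
                      lambda x: ghn_pdf(x, 1.5, 1.4))
    print(full)
```

For a valid LR test, both arguments of `lr_statistic` must be maximized log-likelihoods; under the usual regularity conditions the statistic for $H_0: a = 1$ is then compared with a chi-square distribution with one degree of freedom.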

Applications
In this section, the potentiality of the EHL- family is illustrated by means of two applications using well-known data sets.We demonstrate the flexibility and applicability of the proposed model.The reason for choosing these data is that they allow us to show how in different fields it is necessary to have positively skewed distributions with nonnegative support.These data sets present different degrees of skewness and kurtosis.

Application 1: Tubercle Data.
The first data set corresponds to the survival times of guinea pigs injected with different doses of tubercle bacilli reported by Bjerkedal [17].
It is well known that guinea pigs have high susceptibility to human tuberculosis, which is why they were used in that study. Here, we are primarily concerned with the animals in the same cage that are under the same regimen; the data include $n = 72$ observations. These data were also analyzed by Kundu et al. [18] using the Birnbaum-Saunders distribution.
An alternative approach for modeling these data can be provided by the Weibull and Birnbaum-Saunders (BS) distributions. There are various extensions of these lifetime distributions. For example, Famoye et al. [19] proposed the beta Weibull (BW) distribution and Cordeiro et al. [1] studied some mathematical properties of the BW distribution, which is a quite flexible model for analysing positive data. More recently, Cordeiro and Lemonte [20] proposed the $\beta$-BS distribution.

Now, we apply formal goodness-of-fit tests in order to verify which distribution fits the data better. In particular, we consider the Cramér-von Mises ($W^{*}$) and Anderson-Darling ($A^{*}$) statistics. The $W^{*}$ and $A^{*}$ statistics are described in detail in Chen and Balakrishnan [21]. In general, the smaller the values of these statistics, the better the fit to the data. Let $F(x;\theta)$ be the cdf, where the form of $F$ is known but $\theta$ (a finite-dimensional parameter vector) is unknown. To obtain the statistics $W^{*}$ and $A^{*}$, we can proceed as follows: (i) compute $v_{i} = F(x_{i};\hat{\theta})$, where the $x_{i}$'s are in ascending order, and then $y_{i} = \Phi^{-1}(v_{i})$, where $\Phi(\cdot)$ is the standard normal cdf and $\Phi^{-1}(\cdot)$ is its inverse; ...
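The remaining steps of the $W^{*}$ and $A^{*}$ computation are not shown in full above. The sketch below fills them in as they are commonly stated for the Chen and Balakrishnan procedure, which should be treated as an assumption here: (ii) $u_{i} = \Phi\{(y_{i} - \bar{y})/s_{y}\}$ with $s_{y}$ the sample standard deviation; (iii) $W^{2} = \sum_{i}\{u_{i} - (2i-1)/(2n)\}^{2} + 1/(12n)$ and $A^{2} = -n - n^{-1}\sum_{i}\{(2i-1)\log u_{i} + (2n+1-2i)\log(1-u_{i})\}$; (iv) $W^{*} = W^{2}(1 + 0.5/n)$ and $A^{*} = A^{2}(1 + 0.75/n + 2.25/n^{2})$. The function name and the toy data are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

def chen_balakrishnan_statistics(xs, fitted_cdf):
    """Return (W*, A*) for data xs and a fitted cdf evaluated at the estimates."""
    std_normal = NormalDist()
    n = len(xs)
    # (i) probability transform of the ordered data, then normal scores
    v = [fitted_cdf(x) for x in sorted(xs)]
    y = [std_normal.inv_cdf(vi) for vi in v]
    # (ii) standardize the scores and map them back to (0, 1)
    y_bar, s_y = mean(y), stdev(y)
    u = [std_normal.cdf((yi - y_bar) / s_y) for yi in y]
    # (iii) Cramer-von Mises and Anderson-Darling statistics
    w2 = sum((u[i] - (2 * (i + 1) - 1) / (2 * n)) ** 2 for i in range(n)) + 1.0 / (12 * n)
    a2 = -n - sum((2 * (i + 1) - 1) * math.log(u[i])
                  + (2 * n + 1 - 2 * (i + 1)) * math.log(1.0 - u[i])
                  for i in range(n)) / n
    # (iv) small-sample modifications
    return w2 * (1.0 + 0.5 / n), a2 * (1.0 + 0.75 / n + 2.25 / n ** 2)

if __name__ == "__main__":
    # Toy data: exact quantiles of a unit exponential, tested against its own cdf.
    toy = [-math.log(1.0 - (i + 0.5) / 20) for i in range(20)]
    print(chen_balakrishnan_statistics(toy, lambda x: 1.0 - math.exp(-x)))
```

`NormalDist.inv_cdf` raises an error for arguments outside the open interval (0, 1), so fitted cdf values of exactly 0 or 1 need to be handled before calling the function.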
The $W^{*}$ and $A^{*}$ statistics for all the models are given in Table 3. From the figures in this table, the proposed EHLLL model fits the current data better than the other models. Therefore, the new family may be an interesting alternative to the other models available in the literature for modeling positive real data.
More information is provided by a visual comparison of the histogram of the data with the fitted EHLF, EHLLL, EHLGHN, Fréchet, LL, and GHN distributions. The plot of the fitted EHLLL density is displayed in Figure 10. Figure 11(a) displays plots of the empirical function and the estimated cdf's of the EHLF, EHLLL, EHLGHN, Fréchet, LL, and GHN distributions. We note a good fit of the EHLLL and LL models to these data.

Carbon Monoxide Data.
The second data set consists of the carbon monoxide (CO) measurements made in several brands of cigarettes in 1998. The reports show that nicotine levels, on average, had remained stable since 1980, after falling in the preceding decade. The report entitled "Tar, nicotine, and carbon monoxide of the smoke of 1206 varieties of domestic cigarettes for the year of 1998" includes the data sets and some information about the source of the data, smoker's behavior and beliefs about nicotine, and tar and carbon monoxide contents in cigarettes.
The CO data include $n = 345$ records of measurements of CO content, in milligrams, in cigarettes of several brands.
We fit the EHLF, EHLLL, EHLGHN, Fréchet, LL, GHN, BW, and $\beta$-BS distributions to the data. The computations were done using the NLMixed procedure in SAS. Table 4 lists the MLEs (and the corresponding standard errors in parentheses) of the model parameters and the values of AIC, BIC, and CAIC statistics for some models. These results indicate that the EHLF, EHLGHN, BW, and GHN models have the lowest AIC, BIC, and CAIC values.
The $W^{*}$ and $A^{*}$ statistics for all the models are given in Table 5. From the figures in this table, the proposed EHLF model fits the current data better than the other models.
In order to assess if the EHL-$G$ model is really appropriate, the plots of the fitted EHLF, EHLLL, EHLGHN, Fréchet, LL, and GHN density functions are displayed in Figure 12. Based on these plots, we conclude that the EHLF distribution provides the best fit to the carbon monoxide data.
Figure 13(a) displays plots of the empirical function and the estimated cdf's of the EHLF, EHLLL, EHLGHN, Fréchet, LL, and GHN distributions. We note a good fit of the EHLF model to these data.

Conclusions
We propose a new exponentiated half-logistic (EHL) family which represents a competitive alternative for lifetime data analysis. For any parent continuous distribution $G$, we can define the corresponding EHL-$G$ distribution with two additional positive parameters. So, the new family extends several common distributions such as the Fréchet, normal, log-normal, Gumbel, and log-logistic distributions. The mathematical properties of the new family, such as ordinary, incomplete, and factorial moments, generating and quantile functions, mean deviations, Bonferroni and Lorenz curves, Shannon entropy, Rényi entropy, reliability, and order statistics, are obtained for any EHL-$G$ distribution. The model parameters are estimated by maximum likelihood. Two applications to real data illustrate the importance and potentiality of the new family.

Figure 7: Skewness and kurtosis of the EHLLL distribution as a function of  for some values of .

Figure 8: Skewness and kurtosis of the EHLLL distribution as a function of  for some values of .
(b)  for the tubercle data.Clearly, the new EHLLL distribution provides a closer fit to the histogram.

Table 1: Some special models.
Reliability.
Here, we derive the reliability $R = P(X_{2} < X_{1})$ when $X_{1} \sim$ EHL-$G(\alpha_{1}, \lambda_{1}, \xi_{1})$ and $X_{2} \sim$ EHL-$G(\alpha_{2}, \lambda_{2}, \xi_{2})$ are independent random variables with positive support. This quantity has many applications, especially in engineering. Let $f_{i}$ denote the pdf of $X_{i}$ and let $F_{i}$ denote the cdf of $X_{i}$. By expanding the binomial terms in $f_{1}$ and $F_{2}$, we obtain
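A simple special case illustrates the reliability calculation. Assume, for this worked example only, that the two variables share the same $\lambda$ and the same baseline parameters $\xi$, and write $H(x)$ for the common base of the cdf so that $F_{i}(x) = H(x)^{\alpha_{i}}$ (the symbol $H$ is introduced here and is not the paper's notation). Then the reliability has a closed form that depends only on $\alpha_{1}$ and $\alpha_{2}$:

```latex
\begin{align*}
H(x) &= \frac{1-[1-G(x;\xi)]^{\lambda}}{1+[1-G(x;\xi)]^{\lambda}}, \qquad
F_{i}(x) = H(x)^{\alpha_{i}}, \quad i = 1, 2, \\
R &= P(X_{2} < X_{1}) = \int_{0}^{\infty} f_{1}(x)\, F_{2}(x)\, dx
   = \int_{0}^{\infty} \alpha_{1} H(x)^{\alpha_{1}-1} H'(x)\, H(x)^{\alpha_{2}}\, dx \\
  &= \alpha_{1} \int_{0}^{1} t^{\alpha_{1}+\alpha_{2}-1}\, dt
   = \frac{\alpha_{1}}{\alpha_{1}+\alpha_{2}}.
\end{align*}
```

In particular, $R = 1/2$ when $\alpha_{1} = \alpha_{2}$, as expected for identically distributed continuous variables.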

Table 2: MLEs of the model parameters for the tubercle data, the corresponding SEs (given in parentheses), and the statistics AIC, CAIC, and BIC.

Table 4: MLEs of the model parameters for the carbon monoxide data, the corresponding SEs (given in parentheses), and the AIC, CAIC, and BIC statistics.