A New Class of Distributions Generated by the Extended Bimodal-Normal Distribution

In this study, we present a new family of distributions through generalization of the extended bimodal-normal distribution. This family includes several special cases, such as the normal, Birnbaum-Saunders, Student’s t, and Laplace distributions, which are developed and defined using stochastic representation. The theoretical properties are derived, and easily implemented Monte Carlo simulation schemes are presented. An inferential study is performed for the Laplace distribution. We end with an illustration using two real data sets.


Introduction
Although the normal distribution is the most popular probability model in statistics, several random phenomena in nature cannot be described by it. In this regard, Azzalini [1] introduced an extension of the normal distribution called the skew-normal distribution. This model shares some properties with the standard normal model, is mathematically tractable, and has a wide range of coefficients of skewness and kurtosis. This work gave rise to an important line of research focused on finding new distributions that offer greater flexibility.
More recently, Elal-Olivero [2] introduced a new class of skew-normal distributions called the alpha-skew-normal distribution. In doing so, he first defined a new bimodal-symmetric normal distribution with probability density function given by
\[ f(x) = x^{2}\phi(x), \quad x \in \mathbb{R}, \tag{1} \]
where $\phi(\cdot)$ is the standard normal density; this is defined as the bimodal-normal (BN) distribution. Furthermore, he studied some properties of this distribution and presented its stochastic representation as the product of two independent random variables $\sqrt{W}$ and $T$, where $W \sim \chi^{2}(3)$ and $T$ is a discrete random variable such that $P(T = \pm 1) = 1/2$; that is, $X = T\sqrt{W}$ has the BN distribution. On the other hand, an extension of the BN density is given by
\[ f(x;\alpha) = \frac{1 + \alpha x^{2}}{1 + \alpha}\,\phi(x), \quad x \in \mathbb{R}, \tag{2} \]
where $\alpha \geq 0$ is the shape parameter. Note that this density function is symmetric and is characterized by incorporating bimodality into the normal distribution, which is controlled by the parameter $\alpha$. Elal-Olivero [2] presents this extension as the symmetric component of the alpha-skew-normal distribution. Furthermore, (2) can also be deduced from the model presented in Elal-Olivero et al. [3]. In this regard, Gui et al. [4] incorporated (2) into the slash distribution, developed its properties, and performed inferential studies, whereas Gómez and Guerrero [5] incorporated (2) into the Birnbaum-Saunders distribution, tested its bimodality, and demonstrated its principal properties.
The objective of this article is to present a new family of distributions through generalization of (2). This generalization can be applied to any density function, thereby producing a more flexible model that incorporates a shape parameter. Depending on the density to which we apply this generalization, the new model is flexible enough to support uni- and bimodal shapes. Furthermore,

A General Class of Distributions
This section describes a general class of distributions generated by (2), presents its basic properties, and derives explicit expressions for the normal, Birnbaum-Saunders, Student's $t$, and Laplace distributions.

Characterization and Properties
Theorem 1 (general class of distributions). Let $f$ be a probability density function and $h$ a positive continuous function such that $E[h(X)] = k < \infty$, where $X \sim f$. Then,
\[ g(x;\alpha) = \frac{1 + \alpha h(x)}{1 + \alpha k}\, f(x) \tag{3} \]
is a probability density function with shape parameter $\alpha \geq 0$.
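Proof. If we note that $g(x;\alpha)$ can be represented as a mixture of two densities, then the result follows immediately; that is,
\[ g(x;\alpha) = \frac{1}{1 + \alpha k}\, f(x) + \frac{\alpha k}{1 + \alpha k}\,\frac{h(x) f(x)}{k}. \]
Remark 2. On the basis of Theorem 1, we can make the following observations:
(1) If $\alpha = 0$, then $g(x;\alpha) = f(x)$, $x \in \mathbb{R}$.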
(3) The $r$-th moment of the random variable $Y$ with density $g(\cdot\,;\alpha)$ is given by
\[ E[Y^{r}] = \frac{E[X^{r}] + \alpha\,E[X^{r} h(X)]}{1 + \alpha k}, \]
where $X \sim f$ and $k = E[h(X)]$, provided the expectations exist.

Special Cases
In this section, explicit expressions are provided for the probability density function in (3) for the normal, Birnbaum-Saunders, Student's $t$, and Laplace distributions and different choices of $h$. These models are selected to show the benefits of the proposed extension, and the only condition on the choice of $h(x)$ is that it be a positive function with finite expectation.
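As a quick numerical sanity check of Theorem 1, the sketch below builds $g(x;\alpha) = \{1 + \alpha h(x)\} f(x) / (1 + \alpha k)$ for one illustrative choice (standard normal $f$ and $h(x) = x^{2}$) and verifies that it integrates to one. The function names `extended_density` and `trapezoid`, the integration range, and the value of $\alpha$ are assumptions made for this example and do not come from the paper.

```python
import math

def std_normal_pdf(x):
    """Standard normal density phi(x)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def h(x):
    """Illustrative positive weight function h(x) = x^2."""
    return x * x

def trapezoid(func, lo, hi, n=20000):
    """Plain trapezoidal rule on [lo, hi] with n sub-intervals."""
    step = (hi - lo) / n
    total = 0.5 * (func(lo) + func(hi))
    for i in range(1, n):
        total += func(lo + i * step)
    return total * step

def extended_density(f, h_func, k, alpha):
    """Return g(x; alpha) = (1 + alpha*h(x)) / (1 + alpha*k) * f(x)."""
    def g(x):
        return (1.0 + alpha * h_func(x)) / (1.0 + alpha * k) * f(x)
    return g

# k = E[h(X)] with X ~ f; for this choice it should be close to 1.
k = trapezoid(lambda x: h(x) * std_normal_pdf(x), -12.0, 12.0)

# Hypothetical shape parameter, chosen only for the illustration.
g = extended_density(std_normal_pdf, h, k, alpha=2.5)
total_mass = trapezoid(g, -12.0, 12.0)
print(k, total_mass)
```

Any other pair $(f, h)$ with finite $k$ could be substituted, as long as the integration range covers essentially all of the mass of $f$.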
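Corollary 5 (normal case). If $h(x) = x^{2}$ and $f(x) = \phi(x)$, then $Y$ has the probability density function given by
\[ g(y;\alpha) = \frac{1 + \alpha y^{2}}{1 + \alpha}\,\phi(y), \quad y \in \mathbb{R}, \]
and we say that $Y$ has an "extended normal distribution," which is denoted as $Y \sim \mathrm{EN}(\alpha)$.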
and we say that $Y$ has an "extended Birnbaum-Saunders distribution," which is denoted as $Y \sim \mathrm{EBS}(\lambda, \beta, \alpha)$.
and we say that $Y$ has an "extended Student's $t$ distribution," which is denoted as $Y \sim \mathrm{Et}(\nu, \alpha)$.
As Corollaries 5-8 and Figure 1 show, when the base density $f$ is symmetric, the effect of the extension is that the model supports both uni- and bimodal shapes. On the other hand, if the model has positive support, the bimodality depends on the choice of parameters, as seen in the Birnbaum-Saunders case.
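For the normal case, the boundary between the two shapes can be made explicit. The short derivation below is a sketch that is not taken from the paper; it assumes the extended normal density $g(y;\alpha) = (1 + \alpha y^{2})\phi(y)/(1 + \alpha)$, that is, Theorem 1 with $h(y) = y^{2}$ and $f = \phi$, and uses $\phi'(y) = -y\,\phi(y)$.

```latex
\begin{align*}
g'(y;\alpha)
  &= \frac{2\alpha y\,\phi(y) - (1 + \alpha y^{2})\,y\,\phi(y)}{1 + \alpha}
   = \frac{y\,\phi(y)\,\bigl(2\alpha - 1 - \alpha y^{2}\bigr)}{1 + \alpha}, \\
g'(y;\alpha) = 0
  &\iff y = 0 \quad \text{or} \quad y^{2} = \frac{2\alpha - 1}{\alpha}
   \quad (\text{the latter only when } \alpha > 1/2).
\end{align*}
```

Under this assumption the density is unimodal with mode at $0$ when $\alpha \leq 1/2$, and bimodal with modes at $\pm\sqrt{2 - 1/\alpha}$ and a local minimum at $0$ when $\alpha > 1/2$.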

Some Results of the Special Cases
In this section, we develop some properties associated with the models defined in Corollaries 5-8. The cumulative distribution function, moments, and stochastic representation are presented where they apply to the case at hand. Some proofs are straightforward and are, therefore, omitted.

Extended Normal Distribution.
Table 1: Alternative stochastic representation of special cases.
The stochastic representation of $Y \sim \mathrm{EN}(\alpha)$ is obtained through Theorem 3. Table 1 shows an alternative way to generate random variables $Y \sim \mathrm{EN}(\alpha)$. Furthermore, the stochastic representation has a form similar to the representation given in Henze [6] for the skew-normal distribution presented in Azzalini [1].
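The sampling scheme can be sketched as follows. This is a minimal illustration that assumes the stochastic representation is the two-component mixture implied by the proof of Theorem 1 with $k = 1$: draw from the standard normal with probability $1/(1+\alpha)$ and from the BN distribution otherwise, with BN generated as $T\sqrt{W}$, $W \sim \chi^{2}(3)$. The function names, the seed, and the value of $\alpha$ are hypothetical. The check at the end compares the sample second moment with $(1 + 3\alpha)/(1 + \alpha)$, which follows from $E[Z^{2}] = 1$ and $E[Z^{4}] = 3$ for a standard normal $Z$.

```python
import math
import random

def sample_bn(rng):
    """Draw from BN as T * sqrt(W), W ~ chi-square(3), P(T = 1) = P(T = -1) = 1/2."""
    w = rng.gammavariate(1.5, 2.0)  # chi-square(3) is Gamma(shape=3/2, scale=2)
    sign = 1.0 if rng.random() < 0.5 else -1.0
    return sign * math.sqrt(w)

def sample_en(alpha, rng):
    """Draw from EN(alpha) via the assumed two-component mixture."""
    if rng.random() < 1.0 / (1.0 + alpha):
        return rng.gauss(0.0, 1.0)  # standard normal component
    return sample_bn(rng)           # bimodal-normal component

rng = random.Random(2018)           # fixed seed, for reproducibility only
alpha = 2.0                         # hypothetical shape parameter
draws = [sample_en(alpha, rng) for _ in range(200_000)]

sample_mean = sum(draws) / len(draws)
sample_second_moment = sum(y * y for y in draws) / len(draws)
theoretical_variance = (1.0 + 3.0 * alpha) / (1.0 + alpha)
print(sample_mean, sample_second_moment, theoretical_variance)
```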
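Let $Y \sim \mathrm{EN}(\alpha)$.
(1) The cumulative distribution function is given by
\[ G(y;\alpha) = \Phi(y) - \frac{\alpha}{1 + \alpha}\, y\,\phi(y), \quad y \in \mathbb{R}, \]
where $\Phi(\cdot)$ is the standard normal cumulative distribution function.
(2) The $r$-th moment of the random variable is given by
\[ E[Y^{r}] = \frac{1 + (r + 1)\alpha}{1 + \alpha}\,(r - 1)!! \quad \text{for even } r, \qquad E[Y^{r}] = 0 \quad \text{for odd } r. \]
(3) The expected value and variance of the random variable are given by $E[Y] = 0$ and $\mathrm{Var}(Y) = (1 + 3\alpha)/(1 + \alpha)$.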

Extended Birnbaum-Saunders Distribution.
The Birnbaum-Saunders (BS) distribution (see Birnbaum and Saunders [7,8]) describes the lifetime of components exposed to fatigue caused by cyclical stress and tension. Since 1969, the number of studies that have investigated this distribution and developed both its theoretical properties and its applications has increased dramatically. Because of its significance, this distribution has been extended in a variety of ways to make it more flexible and thus applicable to a wide range of situations. For example, see Birnbaum and Saunders [7,8], Mann et al. [9], Desmond [10,11], Chang and Tang [12], Díaz-García and Leiva-Sánchez [13], Gómez et al. [14], and Olmos et al. [15]. The BS distribution with parameters $\lambda > 0$ and $\beta > 0$ has density function given by
\[ f(t;\lambda,\beta) = \phi\bigl(a_{t}(\lambda,\beta)\bigr)\,\frac{t^{-3/2}(t + \beta)}{2\lambda\beta^{1/2}}, \quad t > 0, \]
where $a_{t}(\lambda,\beta) = \frac{1}{\lambda}\bigl(\sqrt{t/\beta} - \sqrt{\beta/t}\bigr)$ is defined in Corollary 6, and is denoted as $\mathrm{BS}(\lambda,\beta)$. If $X_{1} \sim \mathrm{BS}(\lambda,\beta)$ and $h(t) = a_{t}^{2}(\lambda,\beta)$, then $k = 1$ and $X_{2}$ has density $h(t) f(t;\lambda,\beta)$, with a stochastic representation given by
\[ X_{2} = \beta\left[\frac{\lambda U}{2} + \sqrt{\left(\frac{\lambda U}{2}\right)^{2} + 1}\,\right]^{2}, \]
where $U \sim \mathrm{BN}$. From Theorem 1, the extended Birnbaum-Saunders distribution has density (9) and from Theorem 3 we can generate random variables $Y \sim \mathrm{EBS}(\lambda,\beta,\alpha)$. An alternative way to generate this random variable can be seen in Table 1.
Proof. The proofs follow immediately from the change-of-variable theorem.
Remark 11. As with the Birnbaum-Saunders distribution, observe that property (1) established in Theorem 10 implies that the EBS distribution belongs to the scale family, whilst property (2) implies that it also belongs to the family of random variables closed under reciprocation; see Saunders [16]. Furthermore, based on properties (1) and (2), we can have the two-parameter EBS distribution: / ∼ EBS( , , ).
where (⋅) is the cumulative distribution function of the Birnbaum-Saunders distribution.
(2) The -th moment of the random variable is given by where ∼ EN( ).
(3) The expected value and variance of the random variable is given by

Extended Student's $t$-Distribution.
The Student's $t$-distribution serves as a robust alternative when one wishes to model data sets with atypical values and with a coefficient of kurtosis greater than that of the normal distribution. The Student's $t$-distribution with parameter $\nu > 0$ has a density function given by
\[ f(x;\nu) = \frac{\Gamma\bigl((\nu + 1)/2\bigr)}{\sqrt{\nu\pi}\,\Gamma(\nu/2)}\left(1 + \frac{x^{2}}{\nu}\right)^{-(\nu + 1)/2}, \quad x \in \mathbb{R}. \]
If $X_{1} \sim t(\nu)$ and $h(x) = x^{2}(1 + \nu)/(x^{2} + \nu)$, then $k = 1$ and $X_{2}$ has density $h(x) f(x;\nu)$, with a stochastic representation given by $X_{2} = U/\sqrt{V/\nu}$, where $U \sim \mathrm{BN}$ and $V \sim \chi^{2}(\nu)$ are independent random variables. From Theorem 1, the extended Student's $t$-distribution has density (10) and from Theorem 3 we can generate random variables $Y \sim \mathrm{Et}(\nu, \alpha)$. An alternative way to generate this random variable can be seen in Table 1.
where (⋅) and (⋅) are the cumulative distribution function and probability density function of the -Student distribution, respectively. (2) The -th moment of the random variable is given by where ] > . (3) The expected value and variance of the random variable is given by , ] > 2.

Extended Laplace Distribution.
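The Laplace (L) or double exponential distribution, which was originally published by Pierre Laplace in 1774, is a symmetric distribution with density function given by
\[ f(x) = \tfrac{1}{2}\,e^{-|x|}, \quad x \in \mathbb{R}. \]
If $X_{1} \sim \mathrm{L}$ and $h(x) = |x|$, then $k = 1$ and $X_{2}$ has density $h(x) f(x) = \tfrac{1}{2}|x|\,e^{-|x|}$, with a stochastic representation given by $X_{2} = \sqrt{2V}\,U$, where $V \sim \exp(1)$ and $U \sim \mathrm{BN}$ are independent random variables. From Theorem 1, the extended Laplace distribution has density (11), that is,
\[ g(x;\alpha) = \frac{1 + \alpha|x|}{2(1 + \alpha)}\,e^{-|x|}, \quad x \in \mathbb{R}, \]
and from Theorem 3 we can generate random variables $Y \sim \mathrm{EL}(\alpha)$. An alternative way to generate this random variable can be seen in Table 1.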
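Let $Y \sim \mathrm{EL}(\alpha)$.
(1) The cumulative distribution function is given by
\[ G(y;\alpha) = \begin{cases} \dfrac{e^{y}}{2}\left(1 - \dfrac{\alpha y}{1 + \alpha}\right), & y < 0, \\[2ex] 1 - \dfrac{e^{-y}}{2}\left(1 + \dfrac{\alpha y}{1 + \alpha}\right), & y \geq 0. \end{cases} \]
(2) The $r$-th moment of the random variable is given by
\[ E[Y^{r}] = \frac{1 + (r + 1)\alpha}{1 + \alpha}\, r! \quad \text{for even } r, \qquad E[Y^{r}] = 0 \quad \text{for odd } r. \tag{30} \]
Table 1 shows an alternative way to generate random variables for the special cases defined in Corollaries 5-8. We can see that the extended normal distribution is the basis for the development of the specific cases discussed previously.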
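A simulation sketch for the extended Laplace case is given below. It assumes the mixture form implied by Theorem 1 with $h(x) = |x|$ and $k = 1$: with probability $1/(1+\alpha)$ a standard Laplace draw (generated here as a randomly signed unit exponential, a choice made for this example), and otherwise $X_{2} = \sqrt{2V}\,U$ with $V \sim \exp(1)$ and $U \sim \mathrm{BN}$. The function `el_cdf` is a closed form obtained by integrating the density $(1 + \alpha|x|)e^{-|x|}/\{2(1+\alpha)\}$ and should be checked against the paper's own expression. All function names, the seed, and the value of $\alpha$ are hypothetical.

```python
import math
import random

def el_cdf(x, alpha):
    """CDF of EL(alpha), obtained by integrating the assumed density."""
    tail = 0.5 * math.exp(-abs(x)) * (1.0 + alpha * abs(x) / (1.0 + alpha))
    return 1.0 - tail if x >= 0.0 else tail

def random_sign(rng):
    """Return +1 or -1 with equal probability."""
    return 1.0 if rng.random() < 0.5 else -1.0

def sample_bn(rng):
    """Draw from BN as T * sqrt(W) with W ~ chi-square(3)."""
    return random_sign(rng) * math.sqrt(rng.gammavariate(1.5, 2.0))

def sample_el(alpha, rng):
    """Draw from EL(alpha) via the assumed two-component mixture."""
    if rng.random() < 1.0 / (1.0 + alpha):
        # Laplace component: symmetric version of a unit exponential.
        return random_sign(rng) * rng.expovariate(1.0)
    # Weighted component: X2 = sqrt(2 V) * U with V ~ exp(1), U ~ BN.
    return math.sqrt(2.0 * rng.expovariate(1.0)) * sample_bn(rng)

rng = random.Random(1774)  # fixed seed, for reproducibility only
alpha = 1.0                # hypothetical shape parameter
draws = [sample_el(alpha, rng) for _ in range(200_000)]

empirical_cdf_at_1 = sum(1 for y in draws if y <= 1.0) / len(draws)
closed_form_cdf_at_1 = el_cdf(1.0, alpha)
print(empirical_cdf_at_1, closed_form_cdf_at_1)
```

Agreement between the empirical and closed-form values gives a simple joint check of the sampler and of the integrated density.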

Inferential Aspects of the EL Distribution
In this section, we study some inferential properties of the extended Laplace distribution defined in Corollary 8. We explore maximum likelihood estimation and Monte Carlo simulation and apply these results to two real data sets, comparing the fit with that of the Laplace distribution using the likelihood ratio and the Akaike Information Criterion (AIC).

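Maximum Likelihood Estimator. In practice, it is common to work with a location and scale transformation $X = \mu + \sigma Y$, where $\mu \in \mathbb{R}$, $\sigma > 0$, and $Y \sim \mathrm{EL}(\alpha)$ with $\alpha \geq 0$. Hence, the density for the random variable $X$, denoted as $X \sim \mathrm{EL}(\mu, \sigma, \alpha)$, is
\[ g(x;\theta) = \frac{1}{2\sigma(1 + \alpha)}\left(1 + \alpha\,\frac{|x - \mu|}{\sigma}\right)\exp\left(-\frac{|x - \mu|}{\sigma}\right), \quad x \in \mathbb{R}, \]
where $\theta = (\mu, \sigma, \alpha)^{\top}$. For a random sample $x_{1}, \ldots, x_{n}$, the log-likelihood function is
\[ \ell(\theta) = -n\log\{2\sigma(1 + \alpha)\} + \sum_{i=1}^{n}\log\left(1 + \alpha\,\frac{|x_{i} - \mu|}{\sigma}\right) - \frac{1}{\sigma}\sum_{i=1}^{n}|x_{i} - \mu|, \]
which is a continuous function in each parameter, but it is not differentiable at $\mu = x_{i}$, for $i = 1, \ldots, n$. Thus, by assuming $\mu \neq x_{i}$, for $i = 1, \ldots, n$, and writing $a_{i} = |x_{i} - \mu|/\sigma$, the elements of the score vector $U(\theta) = (U_{\mu}, U_{\sigma}, U_{\alpha})^{\top}$, where $U_{\eta} = \partial\ell/\partial\eta$, are given by
\[ U_{\mu} = \frac{1}{\sigma}\sum_{i=1}^{n}\operatorname{sgn}(x_{i} - \mu)\left(1 - \frac{\alpha}{1 + \alpha a_{i}}\right), \]
\[ U_{\sigma} = -\frac{n}{\sigma} + \frac{1}{\sigma}\sum_{i=1}^{n} a_{i}\left(1 - \frac{\alpha}{1 + \alpha a_{i}}\right), \]
\[ U_{\alpha} = -\frac{n}{1 + \alpha} + \sum_{i=1}^{n}\frac{a_{i}}{1 + \alpha a_{i}}, \]
where $\operatorname{sgn}(\cdot)$ denotes the sign function. Hence, the maximum likelihood estimator $\hat{\theta}$ solves the score equations $U(\theta) = 0$, which must be solved by a numerical method. Many software packages with optimization routines can be used to obtain the maximum likelihood estimates. To maximize the log-likelihood function, we used the optim function in R (see R Core Team [17]) with the Nelder-Mead method (see Nelder and Mead [18]), which uses only function values and is robust but relatively slow. It works reasonably well for nondifferentiable functions.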
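The objective function handed to a derivative-free optimizer can be sketched as below. This is an illustration in Python rather than the R code used in the study, and it assumes the location-scale density $g(x;\theta) = \{1 + \alpha|x - \mu|/\sigma\}\exp(-|x - \mu|/\sigma)/\{2\sigma(1 + \alpha)\}$. The name `el_negloglik` and the three data values are made up for the example.

```python
import math

def el_negloglik(theta, data):
    """Negative log-likelihood of EL(mu, sigma, alpha) for a list of observations."""
    mu, sigma, alpha = theta
    if sigma <= 0.0 or alpha < 0.0:
        return math.inf  # outside the parameter space
    total = len(data) * math.log(2.0 * sigma * (1.0 + alpha))
    for x in data:
        a = abs(x - mu) / sigma
        total += a - math.log(1.0 + alpha * a)
    return total

sample = [-1.0, 0.5, 2.0]  # made-up observations, not data from the paper

# alpha = 0 recovers the ordinary Laplace negative log-likelihood.
nll_laplace = el_negloglik((0.0, 1.0, 0.0), sample)
nll_extended = el_negloglik((0.0, 1.0, 1.0), sample)
print(nll_laplace, nll_extended)
```

Returning infinity outside the parameter space is one simple way to keep a simplex search inside $\sigma > 0$, $\alpha \geq 0$. In Python, such a function could for instance be passed to `scipy.optimize.minimize(el_negloglik, x0, args=(sample,), method="Nelder-Mead")`, which plays the role of R's optim here; SciPy is not needed for the sketch itself.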
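To obtain the standard errors of the maximum likelihood estimates, one should compute the information matrix $I(\theta)$. It is well known that the elements of $I(\theta)$ are given by
\[ I_{jk}(\theta) = -E\left[\frac{\partial^{2}\ell(\theta)}{\partial\theta_{j}\,\partial\theta_{k}}\right], \quad j, k = 1, 2, 3, \quad \theta = (\theta_{1}, \theta_{2}, \theta_{3})^{\top} = (\mu, \sigma, \alpha)^{\top}. \tag{34} \]
Since the expectation over the EL distribution is not straightforward, numerical methods are needed to obtain the information matrix. This matrix can be approximated by the observed information matrix $J(\hat{\theta})$, which is defined as minus the Hessian matrix evaluated at $\hat{\theta}$; that is,
\[ J(\hat{\theta}) = -\left.\frac{\partial^{2}\ell(\theta)}{\partial\theta\,\partial\theta^{\top}}\right|_{\theta = \hat{\theta}}, \tag{35} \]
where, writing $a_{i} = |x_{i} - \mu|/\sigma$, $w_{i} = 1 + \alpha a_{i}$, and $s_{i} = \operatorname{sgn}(x_{i} - \mu)$, the second derivatives (for $\mu \neq x_{i}$) are given below:
\[ \frac{\partial^{2}\ell}{\partial\mu^{2}} = -\frac{\alpha^{2}}{\sigma^{2}}\sum_{i=1}^{n}\frac{1}{w_{i}^{2}}, \qquad \frac{\partial^{2}\ell}{\partial\mu\,\partial\sigma} = -\frac{1}{\sigma^{2}}\sum_{i=1}^{n} s_{i}\left(1 - \frac{\alpha}{w_{i}^{2}}\right), \qquad \frac{\partial^{2}\ell}{\partial\mu\,\partial\alpha} = -\frac{1}{\sigma}\sum_{i=1}^{n}\frac{s_{i}}{w_{i}^{2}}, \]
\[ \frac{\partial^{2}\ell}{\partial\sigma^{2}} = \frac{n}{\sigma^{2}} - \frac{2}{\sigma^{2}}\sum_{i=1}^{n} a_{i}\left(1 - \frac{\alpha}{w_{i}}\right) - \frac{\alpha^{2}}{\sigma^{2}}\sum_{i=1}^{n}\frac{a_{i}^{2}}{w_{i}^{2}}, \qquad \frac{\partial^{2}\ell}{\partial\sigma\,\partial\alpha} = -\frac{1}{\sigma}\sum_{i=1}^{n}\frac{a_{i}}{w_{i}^{2}}, \qquad \frac{\partial^{2}\ell}{\partial\alpha^{2}} = \frac{n}{(1 + \alpha)^{2}} - \sum_{i=1}^{n}\frac{a_{i}^{2}}{w_{i}^{2}}. \]
Thus, we use the observed information matrix for computing the standard errors in the rest of the paper. Note that this approximation by the observed information matrix is obtained under a less stringent assumption, namely, that the density function is absolutely continuous, as is the case with the Laplace distribution (see Kotz et al. [19], remark 2.6.1).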
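When closed-form second derivatives are inconvenient, the observed information can also be approximated by central finite differences of the log-likelihood. The sketch below illustrates that idea and is not the procedure used in the paper; it assumes the EL log-likelihood $\ell(\theta) = -n\log\{2\sigma(1+\alpha)\} + \sum_i \log(1 + \alpha a_i) - \sum_i a_i$ with $a_i = |x_i - \mu|/\sigma$. The function names, the step size, the data, and the evaluation point are hypothetical, and the evaluation point must keep $\mu$ away from every observation.

```python
import math

def el_loglik(theta, data):
    """Log-likelihood of EL(mu, sigma, alpha) for a list of observations."""
    mu, sigma, alpha = theta
    total = -len(data) * math.log(2.0 * sigma * (1.0 + alpha))
    for x in data:
        a = abs(x - mu) / sigma
        total += math.log(1.0 + alpha * a) - a
    return total

def observed_information(theta, data, step=1e-4):
    """Minus the Hessian of the log-likelihood, by central finite differences."""
    p = len(theta)
    info = [[0.0] * p for _ in range(p)]
    for j in range(p):
        for k in range(p):
            def shifted(dj, dk):
                # Evaluate the log-likelihood at theta shifted along axes j and k.
                t = list(theta)
                t[j] += dj * step
                t[k] += dk * step
                return el_loglik(t, data)
            second = (shifted(1, 1) - shifted(1, -1)
                      - shifted(-1, 1) + shifted(-1, -1)) / (4.0 * step * step)
            info[j][k] = -second
    return info

data = [-1.3, -0.2, 0.4, 1.1, 2.5]  # made-up observations
theta_hat = (0.1, 1.2, 0.8)         # hypothetical evaluation point, mu != x_i
info = observed_information(theta_hat, data)
for row in info:
    print(row)
```

Standard errors would then be the square roots of the diagonal of the inverse of this matrix, evaluated at the maximum likelihood estimate.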

Numerical Study.
We use Monte Carlo simulation to evaluate the finite-sample performance of the maximum likelihood estimator. We ran 1000 Monte Carlo replications of simulated samples from the EL distribution for several sample sizes. Each sample was generated using the stochastic representation of the EL distribution described above. For each generated sample, we obtained the maximum likelihood estimates using the optim function in R with the Nelder-Mead method.
To analyze the point estimation results, we computed, for each sample size and for each estimator, the standard error from the observed information matrix defined in (35). The results can be seen in Table 2. They show that the estimates are quite stable and, as expected, asymptotically unbiased; that is, the bias becomes smaller as the sample size increases.

Data Illustration.
In this section we examine the application of the EL distribution to two real data sets. The first data set comes from the WHO MONICA project (World Health Organization Multinational Monitoring of Trends and Determinants in Cardiovascular Disease). This data set has been previously analyzed in Kuulasmaa et al. [20], Kulathinal et al. [21], and de Castro et al. [22] and corresponds to the average annual rate of occurrence of cardiovascular mortality or the presence of coronary disease. The data are as follows: −5.
The second data set consists of the heights in inches of 126 students from the University of Pennsylvania. This data set has been previously analyzed by Hassan and Hijazi [23] and Gui et al. [4]. The data are as follows: 55.00, 60.00, 60. 25
Table 3 shows a descriptive summary of the data sets analyzed.
From these data sets, the maximum likelihood estimators of the parameters associated with the Laplace (L($\mu$, $\sigma$)) and the extended Laplace (EL($\mu$, $\sigma$, $\alpha$)) distributions are obtained; the results of the comparison are summarized in Table 4. The AIC is used to compare the estimated models. As one can see, our model has the smallest AIC values and is therefore preferable. In addition, we can use the likelihood ratio (LR) test statistic to confirm our claim. To do this, we consider the hypotheses $H_{0}\colon \alpha = 0$ versus $H_{1}\colon \alpha \neq 0$. The value of the LR test statistic is 7.22 for the WHO MONICA data and 26.96 for the height data; comparing these quantities with the 5% critical value of the $\chi^{2}(1)$ distribution, 3.84, the null hypothesis is rejected in both cases. Figure 3 shows the QQ-plots for the WHO MONICA and HEIGHT data sets calculated with the Laplace and extended Laplace models fitted with the maximum likelihood estimates of the parameters. These also show the good agreement of the EL distribution with the two data sets.
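The comparison can be reproduced from the reported LR statistics with a few lines of code. The sketch below is illustrative only: the helper names `aic` and `lr_pvalue_df1` are hypothetical, and the p-value uses the identity $P(\chi^{2}(1) > x) = \operatorname{erfc}(\sqrt{x/2})$, which holds because a $\chi^{2}(1)$ variable is the square of a standard normal. Only the two LR values, 7.22 and 26.96, are taken from the text.

```python
import math

def aic(loglik, n_params):
    """Akaike Information Criterion: 2 * (number of parameters) - 2 * log-likelihood."""
    return 2.0 * n_params - 2.0 * loglik

def lr_pvalue_df1(lr_stat):
    """Upper-tail probability of a chi-square(1) variable at lr_stat."""
    return math.erfc(math.sqrt(lr_stat / 2.0))

# LR statistics reported in the text for the two data sets.
p_monica = lr_pvalue_df1(7.22)
p_height = lr_pvalue_df1(26.96)
print(p_monica, p_height)
```

Both p-values fall well below 0.05, in line with the rejection of $H_{0}$. Since the EL model has one more parameter than the Laplace model, the LR statistic and the two AIC values are linked by $\mathrm{LR} = \mathrm{AIC}_{\mathrm{L}} - \mathrm{AIC}_{\mathrm{EL}} + 2$.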

Conclusion and Final Comments
We have presented a generalization of the extended bimodal-normal distribution that depends on a shape parameter, which controls the effect of bimodality when the density is symmetric. More generally, it produces a more flexible model in terms of asymmetry and kurtosis coefficients. Additionally, we have demonstrated its basic properties and stochastic representation, the latter of which played a significant role in the development of this work. The family includes a large number of distributions. Four of them were presented as corollaries; the last, the Laplace distribution, was used to develop some inferential aspects and Monte Carlo simulation schemes, which were facilitated by the use of the stochastic representation in the generation of random variables. Finally, using two real data sets, we demonstrated that the proposed model performs better than the standard Laplace model. Moreover, the statistical literature contains a variety of extensions of the Laplace distribution that achieve greater flexibility, but without the effect of bimodality that fitted the data analyzed. Although bimodality can be achieved through a mixture of distributions, the proposed model is more parsimonious in terms of the number of parameters.
Journal of Probability and Statistics
It is important to emphasize that Theorem 1 can be extended, as demonstrated below.
is a probability density function with shape parameter $\alpha \geq 0$.
Note that when 1 ( ) = 2 ( ) = ( ) we have Theorem 1. Furthermore, this new extension has stochastic representation given by the following theorem.
Proof. Since ( ; ) is a mixture, the result follows immediately.

Data Availability
The WHO MONICA and HEIGHT data sets used to support the findings of this study are included within the article.

Disclosure
Preliminary results of this manuscript were presented as a poster in "Flexible Statistical Models for a Skewed World of Data: A Workshop in Honor of Reinaldo B. Arellano-Valle's 65th Birthday" 2017.