We propose a new family of lifetime distributions, namely, the complementary exponentiated exponential geometric distribution. This new family arises in a latent competing risk scenario, where the lifetime associated with a particular risk is not observable and only the maximum lifetime value among all risks is observed. The properties of the proposed distribution are discussed, including a formal proof of its probability density function and explicit algebraic formulas for its survival and hazard functions, moments, rth moment of the ith order statistic, mean residual lifetime, and modal value. Inference is implemented via a straightforward maximum likelihood procedure. The practical importance of the new distribution is demonstrated in three applications where our distribution outperforms several existing lifetime distributions, such as the exponential, the exponential-geometric, the Weibull, the modified Weibull, and the generalized exponential-Poisson distribution.
1. Introduction
In recent years, several new classes of models grounded in the simple exponential distribution have been introduced. The main idea is to propose lifetime distributions that can accommodate practical applications in which the underlying hazard function is nonconstant but monotone, since the exponential distribution does not provide a reasonable fit in such situations. Examples include the exponential geometric (EG) distribution [1], a variation of the exponential distribution with decreasing hazard function; the exponentiated exponential distribution [2], a generalization of the usual exponential distribution that accommodates increasing and decreasing hazard functions; the generalized exponential distribution [3], which also accommodates increasing and decreasing hazard functions; the exponentiated type distributions [4], which extend the Fréchet, gamma, Gumbel, and Weibull distributions; another modification of the exponential distribution with decreasing hazard function [5]; its generalization [6], which includes a power parameter and accommodates increasing, decreasing, and unimodal hazard functions; the Poisson-exponential distribution [7]; and the complementary exponential geometric distribution [8], which is complementary to the EG distribution proposed by [1]. The last two distributions accommodate increasing hazard functions.
In this paper, following [8], we propose a new distribution family by compounding the exponentiated exponential distribution [2] with a geometric distribution, hereafter the complementary exponentiated exponential geometric distribution or simply the CE2G distribution. The genesis of the new distribution is based on a complementary risk problem [9] in the presence of latent risks, in the sense that there is no information about which factor was responsible for the component failure and only the maximum lifetime value among all risks is observed. This family has one scale and two shape parameters and accommodates increasing, decreasing, and bathtub failure rates.
The paper is organized as follows. In Section 2 we introduce the new CE2G distribution, derive the expressions for the probability density, survival, hazard, and quantile functions, and present its genesis. In Sections 3 to 7 we present some of its properties, such as its characteristic function, mean and variance, skewness and kurtosis, order statistics, rth moment of the ith order statistic, Rényi entropy, stress-strength reliability, and mean residual lifetime. In Section 8 we present the inferential procedure, and in Section 9 a simulation study. In Section 10 the practical importance of the new distribution is demonstrated in three applications where our distribution outperforms several existing lifetime distributions, such as the exponential, the exponential-geometric, the Weibull, the modified Weibull, and the generalized exponential Poisson distribution. Some final comments in Section 11 conclude the paper.
2. The CE2G Model
Let Y be a nonnegative random variable denoting the lifetime of a component in some population. The random variable Y is said to have a CE2G distribution with parameters λ>0, α>0, and 0<θ<1 if its probability density function (pdf) is given by
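$$f(y)=\frac{\alpha\lambda\theta e^{-\lambda y}\left(1-e^{-\lambda y}\right)^{\alpha-1}}{\left[1-(1-\theta)\left(1-e^{-\lambda y}\right)^{\alpha}\right]^{2}},\quad y>0, \tag{1}$$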
where λ is a scale parameter of the distribution, and α and θ are shape parameters. Figure 1(a) shows the CE2G probability density function for λ=1, θ=0.05,0.5,0.95, and α=0.3,1.0,3 and we can see that the function can be decreasing or unimodal.
Figure 1: (a) Probability density function of the CE2G distribution. (b) Failure rate function of the CE2G distribution. In both panels λ=1.
The survival function of a CE2G distributed random variable is given by
$$S(y)=\frac{1-\left(1-e^{-\lambda y}\right)^{\alpha}}{1-(1-\theta)\left(1-e^{-\lambda y}\right)^{\alpha}},\quad y>0, \tag{2}$$
where α>0, θ∈(0,1), and λ>0.
From (2) and (1), the failure rate function, according to the relationship h(y)=f(y)/S(y), is given by
$$h(y)=\frac{\alpha\lambda\theta e^{-\lambda y}\left(1-e^{-\lambda y}\right)^{\alpha-1}}{\left[1-\left(1-e^{-\lambda y}\right)^{\alpha}\right]\left[1-(1-\theta)\left(1-e^{-\lambda y}\right)^{\alpha}\right]}. \tag{3}$$
The initial value h(0) is not finite if α<1; otherwise h(0)=λθ if α=1 and h(0)=0 if α>1. The long-term value of the hazard function is h(∞)=λ. The failure rate (3) can be increasing, decreasing, or bathtub shaped, as shown in Figure 1(b), which displays some failure rate shapes for λ=1, θ=0.05, 0.5, 0.95, and α=0.3, 1.0, 3.
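The quantile function of the CE2G distribution is given by
$$Q(u)=F^{-1}(u)=-\frac{1}{\lambda}\ln\left(1-\left(\frac{u}{\theta(1-u)+u}\right)^{1/\alpha}\right),\quad 0<u<1, \tag{4}$$
where F(y)=1-S(y) is the distribution function of Y. In particular, if u has the uniform U(0,1) distribution, then Q(u) has the CE2G distribution.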
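As an illustration, the functions in (1) to (4) can be sketched in Python using only the standard library. This is a minimal sketch and not the implementation used for the results in this paper; the function names (`ce2g_pdf`, `ce2g_sf`, `ce2g_hazard`, `ce2g_quantile`, `ce2g_sample`) are hypothetical, and sampling is done by plugging uniform draws into (4).

```python
import math
import random


def ce2g_pdf(y, alpha, lam, theta):
    """Density (1); requires y > 0 when alpha < 1."""
    g = 1.0 - math.exp(-lam * y)              # exponential cdf G(y)
    num = alpha * lam * theta * math.exp(-lam * y) * g ** (alpha - 1.0)
    return num / (1.0 - (1.0 - theta) * g ** alpha) ** 2


def ce2g_sf(y, alpha, lam, theta):
    """Survival function (2)."""
    w = (1.0 - math.exp(-lam * y)) ** alpha
    return (1.0 - w) / (1.0 - (1.0 - theta) * w)


def ce2g_hazard(y, alpha, lam, theta):
    """Failure rate (3), computed as f(y) / S(y)."""
    return ce2g_pdf(y, alpha, lam, theta) / ce2g_sf(y, alpha, lam, theta)


def ce2g_quantile(u, alpha, lam, theta):
    """Quantile function (4), for 0 <= u < 1."""
    w = u / (theta * (1.0 - u) + u)
    return -math.log(1.0 - w ** (1.0 / alpha)) / lam


def ce2g_sample(n, alpha, lam, theta, rng=random):
    """Inverse-transform sampling: uniform draws pushed through (4)."""
    return [ce2g_quantile(rng.random(), alpha, lam, theta) for _ in range(n)]
```

For example, `ce2g_hazard(0.0, 1.0, lam, theta)` reproduces the initial value h(0)=λθ stated for α=1, and evaluating the hazard at a large y gives a value close to λ.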
Consider a reliability study in which, for each component, only the maximum lifetime among all risks can be observed. On many occasions, information about which risk caused the failure of the component under analysis is not available, or the true cause of failure cannot be specified. Complementary risks (CR) problems arise in several areas and an extensive literature is available. Interested readers can see [10–12].
Then, in this context, our model can be derived as follows. Let M be a random variable denoting the number of failure causes, with geometric probability distribution given by
$$P(M=m)=\theta(1-\theta)^{m-1}, \tag{5}$$
where 0<θ<1 and m=1,2,….
Also consider $t_i$, i=1,2,3,…, realizations of random variables $T_i$ denoting the failure times, that is, the time-to-event due to the ith CR. The $T_i$ are assumed to be independent and identically distributed, independently of M, and, from [2], each $T_i$ has an exponentiated exponential probability distribution with parameters λ and α, given by
$$f(t_i;\lambda,\alpha)=\alpha g(t_i;\lambda)G(t_i;\lambda)^{\alpha-1}=\alpha\lambda\exp\{-\lambda t_i\}\left(1-\exp\{-\lambda t_i\}\right)^{\alpha-1}, \tag{6}$$
where g(·) and G(·) are the pdf and df, respectively, of the exponential distribution with parameter λ.
In the latent complementary risks scenario, the number of causes M and the lifetime $t_j$ associated with a particular cause are not observable (latent variables), and only the maximum lifetime Y among all causes is usually observed. So, we only observe the random variable given by
$$Y=\max\{T_1,T_2,\ldots,T_M\}. \tag{7}$$
The following result shows that the random variable Y has probability density function given by (1).
Proposition 1.
If the random variable Y is defined as (7), then, considering (5) and (6), Y is distributed according to a CE2G distribution, with probability density function given by (1).
Proof.
The conditional density function of (7) given M=m is given by
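$$f(y\mid M=m,\lambda,\alpha)=m\alpha\lambda e^{-\lambda y}\left(1-e^{-\lambda y}\right)^{\alpha-1}\left[\left(1-e^{-\lambda y}\right)^{\alpha}\right]^{m-1},\quad y>0,\ m=1,2,\ldots \tag{8}$$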
Then, the marginal probability density function of Y is given by
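$$\begin{aligned}
f(y)&=\sum_{m=1}^{\infty}m\alpha\lambda e^{-\lambda y}\left(1-e^{-\lambda y}\right)^{\alpha-1}\left[\left(1-e^{-\lambda y}\right)^{\alpha}\right]^{m-1}\times\theta(1-\theta)^{m-1}\\
&=\theta\alpha\lambda e^{-\lambda y}\left(1-e^{-\lambda y}\right)^{\alpha-1}\sum_{m=1}^{\infty}m\left[\left(1-e^{-\lambda y}\right)^{\alpha}(1-\theta)\right]^{m-1}\\
&=\theta\alpha\lambda e^{-\lambda y}\left(1-e^{-\lambda y}\right)^{\alpha-1}\sum_{m=1}^{\infty}\frac{\left[\left(1-e^{-\lambda y}\right)^{\alpha}(1-\theta)\right]^{m-1}}{1-\left(1-e^{-\lambda y}\right)^{\alpha}(1-\theta)}\\
&=\theta\alpha\lambda e^{-\lambda y}\left(1-e^{-\lambda y}\right)^{\alpha-1}\left[\frac{1}{1-(1-\theta)\left(1-e^{-\lambda y}\right)^{\alpha}}\right]^{2}.
\end{aligned} \tag{9}$$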
This completes the proof.
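The construction in Proposition 1 can also be checked by simulation. The following minimal sketch, which is an illustration and not part of the original study, draws M from (5), draws M exponentiated exponential lifetimes from (6) by inversion, keeps their maximum as in (7), and compares the empirical distribution function of the maxima with F(y)=1-S(y) from (2). All function names, the sample size, and the seed are hypothetical choices.

```python
import math
import random


def draw_geometric(theta, rng):
    """Number of latent causes M with P(M = m) = theta * (1 - theta)**(m - 1)."""
    m = 1
    while rng.random() >= theta:
        m += 1
    return m


def draw_ee(alpha, lam, rng):
    """Exponentiated exponential draw, obtained by solving G(t)**alpha = u."""
    u = rng.random()
    return -math.log(1.0 - u ** (1.0 / alpha)) / lam


def draw_latent_max(alpha, lam, theta, rng):
    """Observed lifetime Y = max{T_1, ..., T_M} as in (7)."""
    m = draw_geometric(theta, rng)
    return max(draw_ee(alpha, lam, rng) for _ in range(m))


def ce2g_cdf(y, alpha, lam, theta):
    """F(y) = 1 - S(y), with S(y) taken from (2)."""
    w = (1.0 - math.exp(-lam * y)) ** alpha
    return 1.0 - (1.0 - w) / (1.0 - (1.0 - theta) * w)


def max_cdf_gap(n, alpha, lam, theta, seed=2024):
    """Largest gap between the empirical cdf of n simulated maxima and F."""
    rng = random.Random(seed)
    ys = sorted(draw_latent_max(alpha, lam, theta, rng) for _ in range(n))
    return max(abs((k + 1) / n - ce2g_cdf(y, alpha, lam, theta))
               for k, y in enumerate(ys))
```

A small value of `max_cdf_gap(20000, 2.0, 1.0, 0.3)` is what one would expect if the simulated maxima indeed follow the CE2G distribution.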
3. Some Properties
Many of the most important features and characteristics of a distribution can be studied through its moments, such as the mean and variance. A general expression for the rth ordinary moment $\mu_r'=E(Y^r)$ of the CE2G distribution is hard to obtain, so we restrict attention to the mean and variance, as follows.
The characteristic function of Y with density function given by (1) can be obtained analytically if we consider the following expression, given in [13, page 329, Equation (1.6)]:
$$\int_{0}^{1}z^{p-1}(1-z)^{n-1}\left(1+bz^{m}\right)^{l}dz=\Gamma(n)\sum_{k=0}^{\infty}\binom{l}{k}b^{k}\frac{\Gamma(p+km)}{\Gamma(p+n+km)}. \tag{10}$$
For any real number t, let $\Phi_Y(t)$ be the characteristic function of Y, that is, $\Phi_Y(t)=E[e^{itY}]$, where i denotes the imaginary unit. With the preceding notation, we state the following.
Proposition 2.
For the random variable Y with CE2G distribution, we have that its characteristic function is given by
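$$\Phi_Y(t)=\alpha\theta\,\Gamma\left(1-\frac{it}{\lambda}\right)\sum_{k=0}^{\infty}\binom{-2}{k}\frac{\Gamma(\alpha[k+1])\,(\theta-1)^{k}}{\Gamma(\alpha[k+1]+1-it/\lambda)}, \tag{11}$$
where $i=\sqrt{-1}$.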
Proof.
Consider the following:
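$$\begin{aligned}
\Phi_Y(t)&=\int_{0}^{\infty}e^{ity}f(y)\,dy=\int_{0}^{\infty}e^{ity}\frac{\alpha\lambda\theta e^{-\lambda y}\left(1-e^{-\lambda y}\right)^{\alpha-1}}{\left[1-(1-\theta)\left(1-e^{-\lambda y}\right)^{\alpha}\right]^{2}}\,dy\\
&=\alpha\theta\int_{0}^{1}\frac{z^{\alpha-1}(1-z)^{-it/\lambda}}{\left(1-(1-\theta)z^{\alpha}\right)^{2}}\,dz,
\end{aligned} \tag{12}$$
where the last equality follows from the change of variable $z=1-e^{-\lambda y}$.
Comparing the last integral with (10), we identify n=1-it/λ, b=θ-1, m=α=p, and l=-2. Making these substitutions in (10) completes the proof.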
Proposition 3.
A random variable Y with density given by (1) has mean and variance given, respectively, by
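$$\begin{aligned}
E(Y)&=\frac{\theta}{\lambda}\sum_{k=0}^{\infty}\binom{-2}{k}\frac{(\theta-1)^{k}}{k+1}\left[\Psi(0,\alpha[k+1]+1)-\Psi(0,1)\right],\\
\operatorname{Var}(Y)&=\frac{\theta}{\lambda^{2}}\Bigg\{\sum_{k=0}^{\infty}\binom{-2}{k}\frac{(\theta-1)^{k}}{k+1}\Big(\Psi(0,1)^{2}+\frac{\pi^{2}}{6}+\Psi(0,\alpha[k+1]+1)\left[\Psi(0,\alpha[k+1]+1)-2\Psi(0,1)\right]-\Psi(1,\alpha[k+1]+1)\Big)\\
&\qquad\quad-\theta\Bigg[\sum_{k=0}^{\infty}\binom{-2}{k}\frac{(\theta-1)^{k}}{k+1}\left(\Psi(0,\alpha[k+1]+1)-\Psi(0,1)\right)\Bigg]^{2}\Bigg\},
\end{aligned} \tag{13}$$
where $\Psi(n,z)=(d^{n+1}/dz^{n+1})\ln\Gamma(z)$ is known as the PsiGamma (polygamma) function.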
Proof.
The first result follows from the relationship $\Phi_Y'(t)/i\,|_{t=0}=E(Y)$. For the second, we use $\Phi_Y''(t)/i^{2}\,|_{t=0}=E(Y^{2})$ together with $\operatorname{Var}(Y)=E(Y^{2})-[E(Y)]^{2}$; the result then follows after a little algebra.
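As a numerical illustration of the mean in (13), note that $\binom{-2}{k}=(-1)^{k}(k+1)$, so the coefficient $\binom{-2}{k}(\theta-1)^{k}/(k+1)$ reduces to $(1-\theta)^{k}$. The sketch below uses this simplification and compares the truncated series with a direct numerical integration of the survival function (2), since $E(Y)=\int_0^\infty S(y)\,dy$. It is a minimal sketch under stated assumptions: the digamma routine is a simple hand-written approximation (recurrence plus asymptotic expansion), and the function names, truncation point, and integration grid are hypothetical choices.

```python
import math


def digamma(x):
    """Digamma function Psi(0, x) for x > 0: recurrence, then asymptotic series."""
    acc = 0.0
    while x < 6.0:                 # shift upwards using psi(x) = psi(x + 1) - 1/x
        acc -= 1.0 / x
        x += 1.0
    inv2 = 1.0 / (x * x)
    tail = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return acc + math.log(x) - 0.5 / x - tail


def ce2g_mean_series(alpha, lam, theta, terms=2000):
    """E(Y) from (13), with the coefficient simplified to (1 - theta)**k."""
    psi1 = digamma(1.0)
    total = 0.0
    for k in range(terms):
        total += (1.0 - theta) ** k * (digamma(alpha * (k + 1) + 1.0) - psi1)
    return theta * total / lam


def ce2g_mean_numeric(alpha, lam, theta, steps=60000):
    """E(Y) as the integral of S(y) over (0, 60/lam), by Simpson's rule."""
    def sf(y):
        w = (1.0 - math.exp(-lam * y)) ** alpha
        return (1.0 - w) / (1.0 - (1.0 - theta) * w)

    upper = 60.0 / lam
    h = upper / steps
    total = sf(0.0) + sf(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * sf(i * h)
    return total * h / 3.0
```

For α=1 the series can be summed in closed form, giving E(Y)=-ln(θ)/(λ(1-θ)), which provides a further check on both routines.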
Skewness is a measure of the asymmetry of the probability distribution. The skewness value can be positive or negative, or even undefined. Qualitatively, a negative skew indicates that the tail on the left side of the probability density function is longer than the right side and the bulk of the values lie to the right of the mean. A positive skew indicates that the tail on the right side is longer than the left side and the bulk of the values lie to the left of the mean. The skewness of a random variable Y, say γ1, is given by the third standardized moment
$$\gamma_1=\frac{E[(Y-\mu)^{3}]}{\left(E[(Y-\mu)^{2}]\right)^{3/2}}=\frac{E(Y^{3})-3E(Y^{2})E(Y)+3E^{2}(Y)E(Y)-E^{3}(Y)}{\left[E(Y^{2})-E^{2}(Y)\right]^{3/2}}. \tag{14}$$
Kurtosis is any measure of the “peakedness” of the probability distribution of a real-valued random variable. In a similar way to the concept of skewness, kurtosis is a descriptor of the shape of a probability distribution. It is common practice to use the kurtosis to provide a comparison of the shape of a given distribution to that of the normal distribution. One common measure of kurtosis, originating with Karl Pearson, say γ2, is based on a scaled version of the fourth moment, given by
$$\gamma_2=\frac{E[(Y-\mu)^{4}]}{\left(E[(Y-\mu)^{2}]\right)^{2}}=\frac{E(Y^{4})-4E(Y^{3})E(Y)+6E(Y^{2})E^{2}(Y)-3E^{4}(Y)}{\left[E(Y^{2})-E^{2}(Y)\right]^{2}}. \tag{15}$$
Explicit algebraic expressions for the skewness and kurtosis are too lengthy to show, since they require the raw moments up to order four. The moments $E(Y)$, $E(Y^{2})$, $E(Y^{3})$, and $E(Y^{4})$ in (14) and (15) can be obtained by algebraic manipulation of the characteristic function (11). Figure 2 shows the kurtosis ($\gamma_2$) and skewness ($\gamma_1$) of the CE2G distribution as functions of α with λ=1, θ=0.1, 0.5, 0.9 and as functions of θ with λ=1, α=0.3, 1.0, 3.
Figure 2: (a) Kurtosis and skewness of CE2G distribution for fixed λ=1. (b) Kurtosis and skewness of CE2G distribution for fixed λ=2.
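Since closed forms for (14) and (15) are lengthy, the skewness and kurtosis can also be approximated numerically. The following sketch is an illustration only: it computes the raw moments from the identity $E(Y^{r})=r\int_0^\infty y^{r-1}S(y)\,dy$ by Simpson's rule, with S(y) taken from (2), and then applies (14) and (15). The function names, the truncation of the integral at 60/λ, and the grid size are hypothetical choices.

```python
import math


def raw_moment(r, alpha, lam, theta, steps=40000):
    """E(Y^r) = r * integral of y^(r-1) * S(y) over (0, 60/lam), Simpson's rule."""
    def integrand(y):
        w = (1.0 - math.exp(-lam * y)) ** alpha
        return y ** (r - 1) * (1.0 - w) / (1.0 - (1.0 - theta) * w)

    upper = 60.0 / lam
    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return r * total * h / 3.0


def skewness_kurtosis(alpha, lam, theta):
    """Skewness (14) and kurtosis (15) from the first four raw moments."""
    m1, m2, m3, m4 = (raw_moment(r, alpha, lam, theta) for r in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    skew = (m3 - 3.0 * m2 * m1 + 2.0 * m1 ** 3) / var ** 1.5
    kurt = (m4 - 4.0 * m3 * m1 + 6.0 * m2 * m1 ** 2 - 3.0 * m1 ** 4) / var ** 2
    return skew, kurt
```

As a sanity check, the boundary case α=1, θ=1 (which lies outside the parameter space 0<θ<1 and is used here only because it reduces (2) to the exponential survival function) should return values close to the exponential skewness 2 and kurtosis 9.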
4. Order Statistics
Order statistics are among the most fundamental tools in nonparametric statistics and inference. Let $Y_1,\ldots,Y_n$ be a random sample taken from the CE2G distribution and let $Y_{1:n},\ldots,Y_{n:n}$ denote the corresponding order statistics. Then, the pdf $f_{i:n}(y)$ of the ith order statistic $Y_{i:n}$ is given by
$$f_{i:n}(y)=\frac{n!}{(i-1)!\,(n-i)!}F(y)^{i-1}\left(1-F(y)\right)^{n-i}f(y). \tag{16}$$
The rth moment of the ith order statistic Yi:n can be obtained from the following result due to [14]:
$$E\left[Y_{i:n}^{r}\right]=r\sum_{p=n-i+1}^{n}(-1)^{p-n+i-1}\binom{p-1}{n-i}\binom{n}{p}\int_{0}^{\infty}y^{r-1}\left[S(y)\right]^{p}dy. \tag{17}$$
Consider the binomial series expansion given by
$$(1-x)^{-r}=\sum_{k=0}^{\infty}\frac{(r)_{k}}{k!}x^{k}, \tag{18}$$
where $(r)_{k}=r(r+1)\cdots(r+k-1)$ is the Pochhammer symbol and the series converges if |x|<1, and consider also the identity
$$(-r)_{k}=(-1)^{k}(r-k+1)_{k}. \tag{19}$$
Proposition 4.
For the random variable Y with CE2G distribution, we have that rth moment of the ith order statistic is given by
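$$E\left[Y_{i:n}^{r}\right]=\frac{r!}{\lambda^{r}}\sum_{p=n-i+1}^{n}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty}\sum_{l=0}^{p}\sum_{m=0}^{\infty}(-1)^{p-n+i+m+l-1}\binom{p-1}{n-i}\binom{n}{p}\times\frac{(1-\theta)^{j}(p)_{j}(p-l+1)_{l}\,(\alpha(j+l)+k-m+1)_{m}}{j!\,l!\,m!\,(m+1)^{r}}. \tag{20}$$
Proof.
From (2) and (18), we have that
$$\begin{aligned}
\int_{0}^{\infty}y^{r-1}\left[S(y)\right]^{p}dy&=\int_{0}^{\infty}y^{r-1}\left(\frac{1-\left(1-e^{-\lambda y}\right)^{\alpha}}{1-(1-\theta)\left(1-e^{-\lambda y}\right)^{\alpha}}\right)^{p}dy\\
&=\frac{(-1)^{r-1}}{\lambda^{r}}\int_{0}^{1}\frac{\ln^{r-1}(1-x)}{1-x}\left(\frac{1-x^{\alpha}}{1-(1-\theta)x^{\alpha}}\right)^{p}dx\\
&=\frac{(-1)^{r-1}}{\lambda^{r}}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty}\sum_{l=0}^{p}\frac{(1-\theta)^{j}(p)_{j}(-p)_{l}}{j!\,l!}\times\int_{0}^{1}x^{\alpha(j+l)+k}\ln^{r-1}(1-x)\,dx.
\end{aligned} \tag{21}$$
Using the change of variable ln(1-x)=-u, under which $\ln^{r-1}(1-x)=(-1)^{r-1}u^{r-1}$ cancels the sign factor in front of the sums, together with the expansion (18), each integral reduces to the kernel of a gamma density, and we obtain
$$\int_{0}^{\infty}y^{r-1}\left[S(y)\right]^{p}dy=\frac{1}{\lambda^{r}}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty}\sum_{l=0}^{p}\sum_{m=0}^{\infty}\frac{(1-\theta)^{j}(p)_{j}(-p)_{l}}{j!\,l!}\times\frac{\left(-[\alpha(j+l)+k]\right)_{m}}{m!}\,\frac{(r-1)!}{(m+1)^{r}}. \tag{22}$$
Now substituting (22) into (17) and using the property (19), the result follows.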
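In practice, the moments of order statistics can also be evaluated directly from (17) by computing the integrals of $y^{r-1}[S(y)]^{p}$ numerically instead of expanding them in series. The following is a minimal sketch of that route, not the procedure used in this paper; the function names, the truncation of the integral at 60/λ, and the Simpson grid are hypothetical choices.

```python
import math


def surv_power_integral(r, p, alpha, lam, theta, steps=40000):
    """Integral of y^(r-1) * S(y)^p over (0, 60/lam), by Simpson's rule."""
    def integrand(y):
        w = (1.0 - math.exp(-lam * y)) ** alpha
        s = (1.0 - w) / (1.0 - (1.0 - theta) * w)   # survival function (2)
        return y ** (r - 1) * s ** p

    upper = 60.0 / lam
    h = upper / steps
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


def order_stat_moment(r, i, n, alpha, lam, theta):
    """E(Y_{i:n}^r) from (17), with the integrals evaluated numerically."""
    total = 0.0
    for p in range(n - i + 1, n + 1):
        sign = (-1) ** (p - n + i - 1)
        coef = math.comb(p - 1, n - i) * math.comb(n, p)
        total += sign * coef * surv_power_integral(r, p, alpha, lam, theta)
    return r * total
```

One simple check is that the expected order statistics of a sample of size n add up to n times the population mean, that is, the sum of `order_stat_moment(1, i, n, ...)` over i=1,…,n equals n times `order_stat_moment(1, 1, 1, ...)`.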
5. Entropy
An entropy of a random variable Y is a measure of variation of the uncertainty. A popular entropy measure is Rényi entropy [15].
If Y has the probability density function (1) then Rényi entropy is defined by
$$\gamma(\rho)=\frac{1}{1-\rho}\log\left(\int f^{\rho}(y)\,dy\right), \tag{23}$$
where ρ>0 and ρ≠1.
Proposition 5.
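If the random variable Y is defined as in (7), then the Rényi entropy is given by
$$\gamma(\rho)=\frac{1}{1-\rho}\log\left((\alpha\theta)^{\rho}\lambda^{\rho-1}\sum_{k=0}^{\infty}\left[\frac{(1-\theta)^{k}(2\rho)_{k}\,\Gamma(\rho(\alpha-1)+k\alpha+1)\,\Gamma(\rho)}{k!\,\Gamma(\alpha(\rho+k)+1)}\right]\right). \tag{24}$$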
Proof.
From (23), we can calculate
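$$\begin{aligned}
\int f^{\rho}(y)\,dy&=\int_{0}^{\infty}\frac{(\alpha\lambda\theta)^{\rho}e^{-\lambda\rho y}\left(1-e^{-\lambda y}\right)^{\rho(\alpha-1)}}{\left[1-(1-\theta)\left(1-e^{-\lambda y}\right)^{\alpha}\right]^{2\rho}}\,dy\\
&=(\alpha\lambda\theta)^{\rho}\int_{0}^{\infty}\sum_{k=0}^{\infty}\left[\frac{(2\rho)_{k}}{k!}(1-\theta)^{k}e^{-\lambda\rho y}\left(1-e^{-\lambda y}\right)^{\rho(\alpha-1)+k\alpha}\right]dy\\
&=(\alpha\theta)^{\rho}\int_{0}^{\infty}\sum_{k=0}^{\infty}\left[\frac{(2\rho)_{k}}{k!}(1-\theta)^{k}\left(1-e^{-\lambda y}\right)^{\rho(\alpha-1)+k\alpha}\left(\lambda e^{-\lambda y}\right)^{\rho-1}\right]\lambda e^{-\lambda y}\,dy\\
&=(\alpha\theta)^{\rho}\lambda^{\rho-1}\sum_{k=0}^{\infty}\left[\frac{(1-\theta)^{k}(2\rho)_{k}}{k!}\int_{0}^{1}u^{\rho(\alpha-1)+k\alpha}(1-u)^{\rho-1}\,du\right]\\
&=(\alpha\theta)^{\rho}\lambda^{\rho-1}\sum_{k=0}^{\infty}\left[\frac{(1-\theta)^{k}(2\rho)_{k}}{k!}\times\frac{\Gamma(\rho(\alpha-1)+k\alpha+1)\,\Gamma(\rho)}{\Gamma(\alpha(\rho+k)+1)}\right],
\end{aligned} \tag{25}$$
where the fourth equality uses the change of variable $u=1-e^{-\lambda y}$. Substituting (25) into (23), the result follows.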
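The series for the Rényi entropy can be evaluated numerically as follows. This is a minimal sketch, assuming ρ(α-1)+1>0 so that every gamma function argument is positive; the terms are accumulated in log space with `math.lgamma` to avoid overflow, and a Simpson integration of $f^{\rho}$ is included only as a cross-check. The function names, the number of terms, and the integration grid are hypothetical choices.

```python
import math


def renyi_integral_series(rho, alpha, lam, theta, terms=500):
    """Series for the integral of f^rho, with each term computed in log space."""
    total = 0.0
    for k in range(terms):
        log_term = (k * math.log(1.0 - theta)
                    + math.lgamma(2.0 * rho + k) - math.lgamma(2.0 * rho)  # (2 rho)_k
                    - math.lgamma(k + 1.0)                                 # k!
                    + math.lgamma(rho * (alpha - 1.0) + k * alpha + 1.0)
                    + math.lgamma(rho)
                    - math.lgamma(alpha * (rho + k) + 1.0))
        total += math.exp(log_term)
    return (alpha * theta) ** rho * lam ** (rho - 1.0) * total


def renyi_integral_numeric(rho, alpha, lam, theta, steps=40000):
    """Integral of f^rho over (0, 40/lam) by Simpson's rule (for alpha >= 1)."""
    def f_rho(y):
        g = 1.0 - math.exp(-lam * y)
        f = (alpha * lam * theta * math.exp(-lam * y) * g ** (alpha - 1.0)
             / (1.0 - (1.0 - theta) * g ** alpha) ** 2)
        return f ** rho

    upper = 40.0 / lam
    h = upper / steps
    total = f_rho(0.0) + f_rho(upper)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * f_rho(i * h)
    return total * h / 3.0


def renyi_entropy(rho, alpha, lam, theta):
    """Renyi entropy (23), using the series for the integral of f^rho."""
    return math.log(renyi_integral_series(rho, alpha, lam, theta)) / (1.0 - rho)
```

Because the terms decay like $(1-\theta)^{k}$ up to polynomial factors, more terms are needed when θ is close to zero.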
6. Reliability
In the context of reliability, the stress-strength model describes the life of a component which has a random strength Y and is subjected to a random stress X. The component fails at the instant that the stress applied to it exceeds the strength, and the component functions satisfactorily whenever Y>X. So, R=Pr(X<Y) is a measure of component reliability. In the area of stress-strength models there has been a large amount of work on the estimation of the reliability R when Y and X are independent random variables belonging to the same univariate family of distributions.
Proposition 6.
If the random variable Y is defined as in (7), then the reliability R=P(X<Y), for X and Y i.i.d., is given by
$$R=\theta^{2}\sum_{k=0}^{\infty}\frac{(1-\theta)^{k}(3)_{k}}{k!\,(k+2)}. \tag{26}$$
Proof.
For X and Y i.i.d. CE2G r.v.'s where X is the stress and Y is the strength, the reliability R=P(X<Y) is given by
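$$\begin{aligned}
R&=\int_{0}^{\infty}\int_{0}^{y}\frac{\alpha\lambda\theta e^{-\lambda x}\left(1-e^{-\lambda x}\right)^{\alpha-1}}{\left[1-(1-\theta)\left(1-e^{-\lambda x}\right)^{\alpha}\right]^{2}}\times\frac{\alpha\lambda\theta e^{-\lambda y}\left(1-e^{-\lambda y}\right)^{\alpha-1}}{\left[1-(1-\theta)\left(1-e^{-\lambda y}\right)^{\alpha}\right]^{2}}\,dx\,dy\\
&=\int_{0}^{\infty}\frac{\theta\left(1-e^{-\lambda y}\right)^{\alpha}}{1-(1-\theta)\left(1-e^{-\lambda y}\right)^{\alpha}}\times\frac{\alpha\lambda\theta e^{-\lambda y}\left(1-e^{-\lambda y}\right)^{\alpha-1}}{\left[1-(1-\theta)\left(1-e^{-\lambda y}\right)^{\alpha}\right]^{2}}\,dy\\
&=\sum_{k=0}^{\infty}\frac{\theta^{2}\alpha\lambda(3)_{k}}{k!}(1-\theta)^{k}\int_{0}^{\infty}\left(1-e^{-\lambda y}\right)^{\alpha(k+2)-1}e^{-\lambda y}\,dy\\
&=\sum_{k=0}^{\infty}\sum_{j=0}^{\infty}\frac{\theta^{2}\alpha\lambda(3)_{k}\left(1-\alpha(k+2)\right)_{j}}{k!\,j!}(1-\theta)^{k}\int_{0}^{\infty}e^{-\lambda(j+1)y}\,dy\\
&=\sum_{k=0}^{\infty}\sum_{j=0}^{\infty}\frac{\theta^{2}\alpha(3)_{k}\left(1-\alpha(k+2)\right)_{j}}{k!\,j!\,(j+1)}(1-\theta)^{k}\\
&=\sum_{k=0}^{\infty}\frac{\theta^{2}(3)_{k}}{k!\,(k+2)}(1-\theta)^{k}.
\end{aligned} \tag{27}$$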
This completes the proof.
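As a short consistency check on (26), which is not part of the original derivation, the series can be summed in closed form. Using $(3)_{k}/k!=(k+1)(k+2)/2$ and $\sum_{k=0}^{\infty}(k+1)x^{k}=(1-x)^{-2}$ for |x|<1, with x=1-θ, we obtain

$$R=\theta^{2}\sum_{k=0}^{\infty}\frac{(1-\theta)^{k}(3)_{k}}{k!\,(k+2)}=\frac{\theta^{2}}{2}\sum_{k=0}^{\infty}(k+1)(1-\theta)^{k}=\frac{\theta^{2}}{2}\cdot\frac{1}{\theta^{2}}=\frac{1}{2}.$$

This agrees with the value P(X<Y)=1/2 that holds for any two i.i.d. continuous lifetimes, so under the i.i.d. assumption R does not depend on (α, λ, θ).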
7. Residual Lifetime Distribution
Given that there was no failure prior to time t, the residual lifetime distribution of a random variable X, distributed as CE2G distribution, has the survival function given by
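$$S_{t}(x)=\Pr[X>x+t\mid X>t]=\left(\frac{1-\left(1-e^{-\lambda(x+t)}\right)^{\alpha}}{1-\left(1-e^{-\lambda t}\right)^{\alpha}}\right)\times\left(\frac{1-(1-\theta)\left(1-e^{-\lambda t}\right)^{\alpha}}{1-(1-\theta)\left(1-e^{-\lambda(x+t)}\right)^{\alpha}}\right). \tag{28}$$
The mean residual lifetime of a continuous distribution with survival function S(x) is given by
$$\mu(t)=E(X-t\mid X>t)=\frac{1}{S(t)}\int_{t}^{\infty}S(u)\,du. \tag{29}$$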
Proposition 7.
For the random variable Y with CE2G distribution, we have that the mean residual lifetime is given by
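$$\mu(t)=\frac{1}{\lambda}\left(\frac{1-(1-\theta)\left(1-e^{-\lambda t}\right)^{\alpha}}{1-\left(1-e^{-\lambda t}\right)^{\alpha}}\right)\times\sum_{k=0}^{\infty}\sum_{i=0}^{\infty}\sum_{j=0}^{1}\frac{(1-\theta)^{i}(-1)^{j}}{j!}\times\left(\frac{1-\left(1-e^{-\lambda t}\right)^{\alpha(i+j)+k+1}}{\alpha(i+j)+k+1}\right). \tag{30}$$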
Proof.
From (29) and using S(y) given by (2), we have that
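$$\begin{aligned}
\frac{1}{S(t)}\int_{t}^{\infty}S(u)\,du&=\frac{1-(1-\theta)\left(1-e^{-\lambda t}\right)^{\alpha}}{1-\left(1-e^{-\lambda t}\right)^{\alpha}}\times\int_{t}^{\infty}\frac{1-\left(1-e^{-\lambda u}\right)^{\alpha}}{1-(1-\theta)\left(1-e^{-\lambda u}\right)^{\alpha}}\,du\\
&=\frac{1}{\lambda}\,\frac{1-(1-\theta)\left(1-e^{-\lambda t}\right)^{\alpha}}{1-\left(1-e^{-\lambda t}\right)^{\alpha}}\times\int_{1-e^{-\lambda t}}^{1}\frac{1-x^{\alpha}}{\left(1-x^{\alpha}(1-\theta)\right)(1-x)}\,dx.
\end{aligned} \tag{31}$$
Now, expanding $1/(1-x)$ and $1/(1-(1-\theta)x^{\alpha})$ by means of (18), writing $1-x^{\alpha}$ as a two-term sum, and integrating term by term, in a similar way to the proof of Proposition 4, the result follows.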
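The mean residual lifetime can also be evaluated directly from (29) by numerical integration, which avoids truncating a triple series. The following sketch is an illustration under stated assumptions: the integral is truncated at t+60/λ and computed by Simpson's rule, and the function names and grid size are hypothetical.

```python
import math


def ce2g_sf(y, alpha, lam, theta):
    """Survival function (2)."""
    w = (1.0 - math.exp(-lam * y)) ** alpha
    return (1.0 - w) / (1.0 - (1.0 - theta) * w)


def mean_residual_life(t, alpha, lam, theta, steps=40000):
    """mu(t) from (29): integral of S over (t, t + 60/lam), divided by S(t)."""
    width = 60.0 / lam
    h = width / steps
    total = ce2g_sf(t, alpha, lam, theta) + ce2g_sf(t + width, alpha, lam, theta)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * ce2g_sf(t + i * h, alpha, lam, theta)
    return (total * h / 3.0) / ce2g_sf(t, alpha, lam, theta)
```

At t=0 this returns the mean E(Y), and for large t the value approaches 1/λ, in line with the long-term hazard h(∞)=λ.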
8. Inference
Assuming the lifetimes are independently distributed and are independent from the censoring mechanism, the maximum likelihood estimates (MLEs) of the parameters are obtained by direct maximization of the log-likelihood function given by
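$$\begin{aligned}
\ell(\theta,\lambda,\alpha)&=\ln(\alpha\theta\lambda)\sum_{i=1}^{n}c_{i}-\lambda\sum_{i=1}^{n}c_{i}y_{i}+(\alpha-1)\sum_{i=1}^{n}c_{i}\ln\left(1-e^{-\lambda y_{i}}\right)\\
&\quad+\sum_{i=1}^{n}(1-c_{i})\ln\left(1-\left(1-e^{-\lambda y_{i}}\right)^{\alpha}\right)-\sum_{i=1}^{n}(1+c_{i})\ln\left(1-(1-\theta)\left(1-e^{-\lambda y_{i}}\right)^{\alpha}\right),
\end{aligned} \tag{32}$$
where $c_{i}$ is a censoring indicator, which is equal to 0 if the observation is censored and 1 if it is observed. The advantage of this procedure is that it runs immediately using existing statistical packages. We used the optim routine of R [16].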
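To make the structure of the log-likelihood (32) concrete, the following minimal Python sketch evaluates it for right-censored data. It is not the R code used in this paper; the function names are hypothetical. A second function rebuilds the same quantity from log f in (1) for observed failures and log S in (2) for censored units, which serves as a cross-check.

```python
import math


def ce2g_loglik(alpha, lam, theta, times, observed):
    """Log-likelihood (32); observed[i] is 1 for a failure and 0 for censoring."""
    total = 0.0
    for y, c in zip(times, observed):
        big_l = 1.0 - math.exp(-lam * y)
        total += c * (math.log(alpha * theta * lam) - lam * y
                      + (alpha - 1.0) * math.log(big_l))
        total += (1 - c) * math.log(1.0 - big_l ** alpha)
        total -= (1 + c) * math.log(1.0 - (1.0 - theta) * big_l ** alpha)
    return total


def ce2g_loglik_direct(alpha, lam, theta, times, observed):
    """Same quantity as the sum of log f (failures) and log S (censored units)."""
    total = 0.0
    for y, c in zip(times, observed):
        g = 1.0 - math.exp(-lam * y)
        denom = 1.0 - (1.0 - theta) * g ** alpha
        if c == 1:
            f = (alpha * lam * theta * math.exp(-lam * y)
                 * g ** (alpha - 1.0) / denom ** 2)
            total += math.log(f)
        else:
            total += math.log((1.0 - g ** alpha) / denom)
    return total
```

In practice the negative of `ce2g_loglik` would be passed to a general-purpose numerical optimizer, playing the role that the optim routine plays in R.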
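Large-sample inference for the parameters is based on the MLEs and their estimated standard errors. For (α,θ,λ), we consider the observed Fisher information matrix given by
$$I_{F}(\alpha,\theta,\lambda)=\left.\begin{pmatrix}I_{\alpha\alpha}&I_{\alpha\theta}&I_{\alpha\lambda}\\ I_{\theta\alpha}&I_{\theta\theta}&I_{\theta\lambda}\\ I_{\lambda\alpha}&I_{\lambda\theta}&I_{\lambda\lambda}\end{pmatrix}\right|_{(\alpha,\theta,\lambda)=(\hat{\alpha},\hat{\theta},\hat{\lambda})}, \tag{33}$$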
where the elements of the matrix IF(α,θ,λ) are given in the appendix.
Under conditions that are fulfilled for the parameters α, θ, and λ in the interior of the parameter space, the asymptotic distribution of $(\hat{\alpha}-\alpha,\hat{\theta}-\theta,\hat{\lambda}-\lambda)$, as n→∞, is trivariate normal with zero mean and variance-covariance matrix $I_{F}^{-1}(\alpha,\theta,\lambda)$.
In order to compare different distributions, we follow several authors in the literature, for example, [6, 17–19], who use the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), defined, respectively, by -2ℓ(·)+2q and -2ℓ(·)+q log(n), where ℓ(·) is the log-likelihood of the respective distribution evaluated at its MLE vector, q is the number of parameters estimated, and n is the sample size. The best distribution is the one with the lowest AIC and BIC values.
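The two criteria are straightforward to compute once the maximized log-likelihood is available. The following minimal sketch states them as Python functions; the function names are hypothetical, and the log-likelihood value used in the usage note is an illustrative number rather than a reported result.

```python
import math


def aic(max_loglik, q):
    """Akaike information criterion: -2 * loglik + 2 * q."""
    return -2.0 * max_loglik + 2.0 * q


def bic(max_loglik, q, n):
    """Bayesian information criterion: -2 * loglik + q * log(n)."""
    return -2.0 * max_loglik + q * math.log(n)
```

Note that BIC minus AIC equals q(log(n)-2) regardless of the fit, so for a three-parameter model and n=143 observations the two criteria differ by about 8.9. For instance, an illustrative maximized log-likelihood of -805.0 with q=3 and n=143 gives `aic(-805.0, 3)` equal to 1616.0 and `bic(-805.0, 3, 143)` close to 1624.9.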
9. Simulation Study
To assess the performance of the MLEs in the estimation process, a study was performed based on one hundred datasets generated from the CE2G distribution with six different sets of parameters for n=20, 50, 100, 200, 500, and 1000. In order to work with unbounded parameters, we reparametrize the model in the estimation process. For the parameter θ, we consider the transformation $\theta=e^{\theta^{*}}/(1+e^{\theta^{*}})$, where $\theta^{*}\in\mathbb{R}$, and for α and λ we consider an exponential transformation. By the invariance property of the MLEs, we can return to the original parameters through these transformations. To compute their variances, we use the delta method. The values (α,λ,θ)=(1,1,0.5) were used as the initial values for all numerical simulations, since λ>0, α>0, and 0<θ<1.
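The reparametrization and the delta method described above can be sketched as follows. This is an illustration rather than the code used in the simulation study: the function names are hypothetical, and the standard errors on the unconstrained scale are assumed to be supplied by the optimizer. Because each transformation acts on a single coordinate, the first-order standard error of each original parameter depends only on the standard error of its own transformed counterpart.

```python
import math


def to_unconstrained(alpha, lam, theta):
    """Map (alpha, lam, theta) to the real line: logs for alpha, lam; logit for theta."""
    return math.log(alpha), math.log(lam), math.log(theta / (1.0 - theta))


def to_original(a_star, l_star, t_star):
    """Inverse map: exponentials for alpha, lam; theta = e^t / (1 + e^t)."""
    return math.exp(a_star), math.exp(l_star), 1.0 / (1.0 + math.exp(-t_star))


def delta_method_se(a_star, l_star, t_star, se_star):
    """First-order standard errors of (alpha, lam, theta) from those of the
    transformed parameters: se(g(x)) is approximately |g'(x)| * se(x)."""
    alpha, lam, theta = to_original(a_star, l_star, t_star)
    se_a, se_l, se_t = se_star
    return (alpha * se_a,                    # d exp(a*) / d a* = alpha
            lam * se_l,                      # d exp(l*) / d l* = lam
            theta * (1.0 - theta) * se_t)    # d theta / d theta* = theta * (1 - theta)
```

With this parametrization, the initial values (α,λ,θ)=(1,1,0.5) correspond to the origin (0,0,0) on the unconstrained scale.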
The results are summarized in Table 1, which shows the averages of the MLEs, $\mathrm{Av}(\hat{\alpha},\hat{\lambda},\hat{\theta})$, together with the coverage probabilities of the 95% confidence intervals for the parameters of the CE2G distribution, C(α,λ,θ), the bias, the mean squared error (MSE), and the standard deviations, $\mathrm{Sd}(\hat{\alpha},\hat{\lambda},\hat{\theta})$. These results suggest that the MLEs perform adequately. The standard deviations of the MLEs decrease as the sample size increases. The empirical coverage probabilities are close to the nominal coverage level, particularly as the sample size increases.
Table 1: Mean of the MLEs, their standard deviations, coverages, bias, and MSE.

| (α,λ,θ) | n | Av(α̂,λ̂,θ̂) | Sd(α̂,λ̂,θ̂) | Bias | MSE | C(α,λ,θ) |
|---|---|---|---|---|---|---|
| (1.48, 3.10, 0.75) | 20 | (1.5716, 3.4497, 0.7522) | (0.7890, 1.1204, 0.3327) | (0.0916, 0.3497, 0.0022) | (0.6247, 1.3651, 0.1096) | (0.99, 0.99, 0.80) |
| | 50 | (1.4902, 3.4026, 0.7145) | (0.4478, 0.7103, 0.3066) | (0.0102, 0.3026, −0.0355) | (0.1987, 0.5911, 0.0943) | (0.99, 0.99, 0.86) |
| | 100 | (1.4765, 3.2589, 0.7233) | (0.2683, 0.4964, 0.2494) | (−0.0035, 0.1589, −0.0267) | (0.0713, 0.2692, 0.0623) | (0.99, 0.99, 0.91) |
| | 200 | (1.4798, 3.1846, 0.7379) | (0.2090, 0.3846, 0.2176) | (−0.0002, 0.0846, −0.0121) | (0.0433, 0.1536, 0.0470) | (0.99, 0.99, 0.97) |
| | 500 | (1.4725, 3.1617, 0.7361) | (0.1584, 0.2977, 0.1811) | (−0.0075, 0.0617, −0.0139) | (0.0249, 0.0916, 0.0326) | (0.99, 0.99, 0.99) |
| | 1000 | (1.5020, 3.1116, 0.7697) | (0.1061, 0.1832, 0.1321) | (0.0220, 0.0116, 0.0197) | (0.0116, 0.0334, 0.0177) | (0.99, 0.99, 0.92) |
| (1.25, 2.63, 0.24) | 20 | (1.6389, 2.7783, 0.4016) | (1.0305, 0.8411, 0.3342) | (0.3889, 0.1483, 0.1616) | (1.2026, 0.7224, 0.1367) | (0.99, 0.99, 0.99) |
| | 50 | (1.4826, 2.7004, 0.3459) | (0.7378, 0.5976, 0.2589) | (0.2326, 0.0704, 0.1059) | (0.5930, 0.3586, 0.0776) | (0.99, 0.99, 0.99) |
| | 100 | (1.3892, 2.6563, 0.3046) | (0.5549, 0.3699, 0.1893) | (0.1392, 0.0263, 0.0646) | (0.3242, 0.1362, 0.0396) | (0.99, 0.99, 0.99) |
| | 200 | (1.2869, 2.6143, 0.2729) | (0.3339, 0.2520, 0.1229) | (0.0369, −0.0157, 0.0329) | (0.1117, 0.0631, 0.0160) | (0.99, 0.99, 0.99) |
| | 500 | (1.2609, 2.6029, 0.2497) | (0.1980, 0.1444, 0.0632) | (0.0109, −0.0271, 0.0097) | (0.0389, 0.0214, 0.0041) | (0.99, 0.99, 0.99) |
| | 1000 | (1.2696, 2.6243, 0.2479) | (0.1621, 0.1123, 0.0517) | (0.0196, −0.0057, 0.0079) | (0.0264, 0.0125, 0.0027) | (0.99, 0.99, 0.99) |
| (0.25, 0.63, 0.20) | 20 | (0.3852, 0.6554, 0.4163) | (0.2658, 0.2378, 0.3376) | (0.1352, 0.0254, 0.2163) | (0.0882, 0.0566, 0.1596) | (0.92, 0.99, 0.99) |
| | 50 | (0.2809, 0.6400, 0.2641) | (0.1264, 0.1368, 0.1973) | (0.0309, 0.0100, 0.0641) | (0.0168, 0.0186, 0.0427) | (0.99, 0.99, 0.99) |
| | 100 | (0.2935, 0.6064, 0.2841) | (0.1162, 0.0931, 0.1732) | (0.0435, −0.0236, 0.0841) | (0.0152, 0.0091, 0.0368) | (0.99, 0.99, 0.99) |
| | 200 | (0.2657, 0.6354, 0.2246) | (0.0810, 0.0744, 0.1009) | (0.0157, 0.0054, 0.0246) | (0.0067, 0.0055, 0.0107) | (0.99, 0.99, 0.99) |
| | 500 | (0.2569, 0.6388, 0.2078) | (0.0429, 0.0492, 0.0537) | (0.0069, 0.0088, 0.0078) | (0.0019, 0.0025, 0.0029) | (0.99, 0.99, 0.99) |
| | 1000 | (0.2536, 0.6313, 0.2044) | (0.0307, 0.0303, 0.0339) | (0.0036, 0.0013, 0.0044) | (0.0009, 0.0009, 0.0012) | (0.99, 0.99, 0.99) |
| (0.30, 0.60, 0.90) | 20 | (0.3258, 0.7817, 0.8033) | (0.1165, 0.3750, 0.2751) | (0.0258, 0.1817, −0.0967) | (0.0141, 0.1723, 0.0843) | (0.99, 0.99, 0.80) |
| | 50 | (0.2813, 0.6879, 0.7639) | (0.0658, 0.2013, 0.2639) | (−0.0187, 0.0879, −0.1361) | (0.0046, 0.0479, 0.0875) | (0.99, 0.99, 0.85) |
| | 100 | (0.2869, 0.6535, 0.8123) | (0.0489, 0.1406, 0.2222) | (−0.0131, 0.0535, −0.0877) | (0.0025, 0.0224, 0.0566) | (0.99, 0.99, 0.93) |
| | 200 | (0.2905, 0.6325, 0.8364) | (0.0343, 0.0921, 0.1553) | (−0.0095, 0.0325, −0.0636) | (0.0013, 0.0095, 0.0279) | (0.99, 0.99, 0.97) |
| | 500 | (0.3007, 0.6117, 0.8884) | (0.0219, 0.0647, 0.1214) | (0.0007, 0.0117, −0.0116) | (0.0005, 0.0043, 0.0147) | (0.99, 0.99, 0.97) |
| | 1000 | (0.2970, 0.6053, 0.8821) | (0.0184, 0.0455, 0.1003) | (−0.0030, 0.0053, −0.0179) | (0.0003, 0.0021, 0.0103) | (0.99, 0.99, 0.98) |
| (0.50, 2.00, 0.40) | 20 | (0.5748, 2.3413, 0.4948) | (0.2790, 0.8066, 0.3586) | (0.0748, 0.3413, 0.0948) | (0.0826, 0.7606, 0.1363) | (0.99, 0.99, 0.99) |
| | 50 | (0.6019, 2.0303, 0.5348) | (0.2218, 0.4461, 0.2941) | (0.1019, 0.0303, 0.1348) | (0.0591, 0.1979, 0.1038) | (0.99, 0.99, 0.99) |
| | 100 | (0.5100, 2.0592, 0.4423) | (0.1622, 0.3178, 0.2465) | (0.0100, 0.0592, 0.0423) | (0.0262, 0.1035, 0.0620) | (0.99, 0.99, 0.99) |
| | 200 | (0.5307, 2.0009, 0.4503) | (0.1091, 0.2491, 0.1864) | (0.0307, 0.0009, 0.0503) | (0.0127, 0.0614, 0.0369) | (0.99, 0.99, 0.99) |
| | 500 | (0.5045, 1.9954, 0.4194) | (0.0727, 0.1594, 0.1154) | (0.0045, −0.0046, 0.0194) | (0.0053, 0.0252, 0.0136) | (0.99, 0.99, 0.99) |
| | 1000 | (0.5051, 2.0072, 0.4034) | (0.0493, 0.1002, 0.0598) | (0.0051, 0.0072, 0.0034) | (0.0024, 0.0100, 0.0036) | (0.99, 0.99, 0.98) |
| (2.00, 0.25, 0.80) | 20 | (2.1599, 0.3199, 0.6131) | (1.0176, 0.1112, 0.3449) | (0.1599, 0.0699, −0.1869) | (1.0508, 0.0171, 0.1527) | (0.99, 0.99, 0.79) |
| | 50 | (2.0826, 0.2743, 0.7193) | (0.5220, 0.0528, 0.2874) | (0.0826, 0.0243, −0.0807) | (0.2766, 0.0033, 0.0883) | (0.99, 0.99, 0.88) |
| | 100 | (1.9984, 0.2629, 0.7519) | (0.4419, 0.0418, 0.2711) | (−0.0016, 0.0129, −0.0481) | (0.1933, 0.0019, 0.0751) | (0.99, 0.99, 0.87) |
| | 200 | (2.0322, 0.2569, 0.7808) | (0.3046, 0.0272, 0.2050) | (0.0322, 0.0069, −0.0192) | (0.0929, 0.0008, 0.0420) | (0.99, 0.99, 0.97) |
| | 500 | (1.9945, 0.2552, 0.7849) | (0.1613, 0.0218, 0.1783) | (−0.0055, 0.0052, −0.0151) | (0.0258, 0.0005, 0.0317) | (0.99, 0.99, 0.92) |
| | 1000 | (1.9659, 0.2526, 0.7774) | (0.1358, 0.0160, 0.1496) | (−0.0341, 0.0026, −0.0226) | (0.0194, 0.0003, 0.0227) | (0.99, 0.99, 0.96) |
10. Applications
In this section, we compare the CE2G distribution fit with several usual lifetime distributions on three datasets extracted from the literature. The first dataset, T1, refers to the serum-reversal time (days) of 143 children contaminated with HIV from vertical transmission at the university hospital of the Ribeirão Preto School of Medicine (Hospital das Clínicas da Faculdade de Medicina de Ribeirão Preto) from 1986 to 2001 [20]. Serum reversal can occur in children born to mothers infected with HIV.
The second dataset, T2, is lifetimes in hours of 417 forty-watt, 110-volt internally frosted incandescent lamps taken from 42 weekly quality control [21]. Survival times, in days, are given for the control group of lamps on original dataset.
The third dataset, T3, gives the survival times for laboratory mice, which were exposed to a fixed dose of radiation at an age of 5 to 6 weeks. The cause of death for each mouse was determined after autopsy to be one of three possibilities: thymic lymphoma (C1), reticulum cell sarcoma (C2), or other causes (C3) [22]. We consider here the mice of C3 in the control group.
Firstly, in order to identify the shape of a lifetime data failure rate function, we will consider a graphical method based on the TTT plot [23]. In its empirical version, the TTT plot is given by G(r/n)=[(∑i=1rYi:n)+(n-r)Yr:n]/(∑i=1nYi:n), where r=1,…,n and Yi:n, i=1,…,n represent the order statistics of the sample. It has been shown that the failure rate function is increasing (decreasing) if the TTT plot is concave (convex). Figure 3(a) shows concave TTT plots for the T1, T2, and T3 datasets, indicating increasing failure rate functions.
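The empirical TTT transform defined above is simple to compute. The following minimal sketch, given for illustration only, returns the values G(r/n) for r=1,…,n from a sample of lifetimes; the function name is hypothetical.

```python
def ttt_transform(sample):
    """Empirical scaled TTT values G(r/n), r = 1, ..., n, for positive lifetimes."""
    ys = sorted(sample)              # order statistics Y_{1:n} <= ... <= Y_{n:n}
    n = len(ys)
    total = sum(ys)
    values = []
    running = 0.0                    # partial sum of the first r order statistics
    for r, y in enumerate(ys, start=1):
        running += y
        values.append((running + (n - r) * y) / total)
    return values
```

Plotting these values against r/n gives the TTT plot. For example, for the sample (3, 1, 2) the transform takes the values 1/2, 5/6, and 1, which lie above the diagonal, as expected for a concave plot.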
Figure 3: (a) Empirical TTT plots for the datasets T1, T2, and T3, respectively. (b) Fitted models for the datasets T1, T2, and T3, respectively.
We compare the CE2G distribution fits with those of the following distributions: the exponential distribution (E), with probability density function $f(x)=\lambda e^{-\lambda x}$; the exponentiated exponential distribution (EE), with probability density function $f(x)=\alpha\lambda e^{-\lambda x}(1-e^{-\lambda x})^{\alpha-1}$; the EG distribution [1], with probability density function $f(x)=\lambda\theta e^{-\lambda x}\left[1-(1-\theta)e^{-\lambda x}\right]^{-2}$; the Weibull distribution, with probability density function $f(x)=(\theta/\lambda)(x/\lambda)^{\theta-1}e^{-(x/\lambda)^{\theta}}$, where θ is the shape parameter and λ is the scale parameter; the gamma distribution, with probability density function $f(x)=x^{\theta-1}e^{-x/\lambda}/(\lambda^{\theta}\Gamma(\theta))$, with shape parameter θ and scale parameter λ; the modified Weibull (MW) distribution, with probability density function $f(x)=\alpha x^{\theta-1}(\theta+\lambda x)e^{\lambda x}\exp\{-\alpha x^{\theta}e^{\lambda x}\}$, where α,θ≥0 and λ>0; the generalized exponential Poisson (GEP) distribution [6], with probability density function $f(x)=\frac{\alpha\beta\lambda}{(1-e^{-\lambda})^{\alpha}}\left(1-e^{-\lambda+\lambda\exp(-\beta x)}\right)^{\alpha-1}e^{-\lambda-\beta x+\lambda\exp(-\beta x)}$; the generalized Birnbaum-Saunders (BS-G) distribution [24], with probability density function $f(y)=\frac{\sqrt{(y-\mu)/\beta}+\sqrt{\beta/(y-\mu)}}{2\alpha(y-\mu)}\,\phi\left(\left[\sqrt{(y-\mu)/\beta}-\sqrt{\beta/(y-\mu)}\right]/\alpha\right)$, where ϕ(·) is the probability density function of the standard normal distribution; and the Birnbaum-Saunders (BS) distribution, which is obtained by setting μ=0 in the BS-G probability density function.
Table 2 provides the AIC and BIC values for each distribution. They provide evidence in favor of our CE2G distribution for the datasets T1 and T2 under both comparison criteria. For the dataset T3, the CE2G distribution provides a fit similar to those of the Weibull and MW distributions, implying that the CE2G distribution is a competitor to the usual survival distributions. These results are corroborated by the empirical Kaplan-Meier survival functions and the fitted survival functions shown in Figure 3(b). The MLEs (with their corresponding standard errors in parentheses) of the parameters α, θ(×1000), and λ(×10000) of the CE2G distribution are, respectively, 3.7469 (0.5688), 41.4860 (9.7659), and 17536.46 (7.1814) for T1; 5.1765 (19.4159), 0.2625 (0.9915), and 94.6676 (3.8720) for T2; and 0.0018180 (0.9818), 0.0698 (0.3770), and 78.7704 (11.5084) for T3.
Table 2: AIC and BIC values for all fitted distributions.

| Dataset | Criterion | E | EE | EG | Weibull | Gamma | CE2G | MW | GEP | BS | BS-G |
|---|---|---|---|---|---|---|---|---|---|---|---|
| T1 | AIC | 1723.7 | 1657.2 | 1725.8 | 1630.5 | 1649.4 | 1616.0 | 1660.0 | 1659.3 | 1919.7 | 1708.5 |
| T1 | BIC | 1726.7 | 1663.2 | 1731.7 | 1636.5 | 1655.3 | 1624.9 | 1668.9 | 1668.2 | 1925.6 | 1717.3 |
| T2 | AIC | 6649.8 | 5703.2 | 6651.8 | 5599.0 | 5605.9 | 5571.0 | 5664.7 | 5705.3 | 5648.3 | 5601.3 |
| T2 | BIC | 6653.9 | 5711.3 | 6659.9 | 5607.1 | 5613.8 | 5583.1 | 5676.8 | 5717.4 | 5656.3 | 5613.4 |
| T3 | AIC | 549.8 | 538.2 | 551.8 | 530.3 | 536.5 | 530.6 | 530.7 | 540.3 | 550.8 | 534.0 |
| T3 | BIC | 551.5 | 541.6 | 555.2 | 533.7 | 539.8 | 535.6 | 535.7 | 545.3 | 554.1 | 539.0 |
11. Concluding Remarks
In this paper, a new lifetime distribution is provided and discussed. The CE2G distribution accommodates increasing, decreasing, and bathtub failure rate functions and arises in a latent complementary risks scenario, where the lifetime associated with a particular risk is not observable but only the maximum lifetime value among all risks. The properties of the proposed distribution are discussed, including a formal proof of its probability density function and explicit algebraic formulas for its survival and hazard functions, moments, rth moment of the ith order statistic, mean residual lifetime, modal value, and the observed Fisher information matrix. Maximum likelihood inference is implemented straightforwardly. The practical importance of the new distribution was demonstrated in three applications where the CE2G distribution provided the best fit in comparison with several other former lifetime distributions.
Appendix
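In this appendix, we show the values of the elements of the observed Fisher information matrix in (33). From (32), we obtain
$$\begin{aligned}
I_{\alpha\alpha}&=\sum_{i=1}^{n}\left(\frac{c_{i}}{\alpha^{2}}+\frac{(1-c_{i})L_{i}^{\alpha}\ln^{2}(L_{i})}{R_{i}}+\frac{(1-c_{i})L_{i}^{2\alpha}\ln^{2}(L_{i})}{R_{i}^{2}}-\frac{(1+c_{i})(1-\theta)L_{i}^{\alpha}\ln^{2}(L_{i})}{T_{i}}-\frac{(1+c_{i})(1-\theta)^{2}L_{i}^{2\alpha}\ln^{2}(L_{i})}{T_{i}^{2}}\right),\\
I_{\alpha\theta}=I_{\theta\alpha}&=\sum_{i=1}^{n}\left(\frac{(1+c_{i})L_{i}^{\alpha}\ln(L_{i})}{T_{i}}+\frac{(1+c_{i})(1-\theta)L_{i}^{2\alpha}\ln(L_{i})}{T_{i}^{2}}\right),\\
I_{\alpha\lambda}=I_{\lambda\alpha}&=\sum_{i=1}^{n}\Bigg(-\frac{c_{i}X_{i}}{L_{i}}+\frac{\alpha(1-c_{i})L_{i}^{\alpha}\ln(L_{i})X_{i}}{L_{i}R_{i}}+\frac{(1-c_{i})L_{i}^{\alpha}X_{i}}{L_{i}R_{i}}+\frac{\alpha(1-c_{i})L_{i}^{2\alpha}\ln(L_{i})X_{i}}{L_{i}R_{i}^{2}}\\
&\qquad\quad-\frac{\alpha(1+c_{i})(1-\theta)L_{i}^{\alpha}\ln(L_{i})X_{i}}{L_{i}T_{i}}-\frac{(1+c_{i})(1-\theta)L_{i}^{\alpha}X_{i}}{L_{i}T_{i}}-\frac{\alpha(1+c_{i})(1-\theta)^{2}L_{i}^{2\alpha}\ln(L_{i})X_{i}}{L_{i}T_{i}^{2}}\Bigg),\\
I_{\theta\theta}&=\sum_{i=1}^{n}\left(\frac{c_{i}}{\theta^{2}}-\frac{(1+c_{i})L_{i}^{2\alpha}}{T_{i}^{2}}\right),\\
I_{\theta\lambda}=I_{\lambda\theta}&=\sum_{i=1}^{n}\left(\frac{\alpha(1+c_{i})L_{i}^{\alpha}X_{i}}{L_{i}T_{i}}+\frac{\alpha(1+c_{i})(1-\theta)L_{i}^{2\alpha}X_{i}}{L_{i}T_{i}^{2}}\right),\\
I_{\lambda\lambda}&=\sum_{i=1}^{n}\Bigg(\frac{c_{i}}{\lambda^{2}}+\frac{(\alpha-1)c_{i}y_{i}X_{i}}{L_{i}}+\frac{(\alpha-1)c_{i}X_{i}^{2}}{L_{i}^{2}}-\frac{\alpha(1-c_{i})L_{i}^{\alpha}y_{i}X_{i}}{L_{i}R_{i}}-\frac{\alpha(1-c_{i})L_{i}^{\alpha}X_{i}^{2}(1-\alpha)}{L_{i}^{2}R_{i}}+\frac{\alpha^{2}(1-c_{i})L_{i}^{2\alpha}X_{i}^{2}}{L_{i}^{2}R_{i}^{2}}\\
&\qquad\quad+\frac{\alpha(1+c_{i})(1-\theta)L_{i}^{\alpha}y_{i}X_{i}}{L_{i}T_{i}}+\frac{\alpha(1+c_{i})(1-\theta)L_{i}^{\alpha}X_{i}^{2}(1-\alpha)}{L_{i}^{2}T_{i}}-\frac{\alpha^{2}(1+c_{i})(1-\theta)^{2}L_{i}^{2\alpha}X_{i}^{2}}{L_{i}^{2}T_{i}^{2}}\Bigg),
\end{aligned} \tag{A.1}$$
where $L_{i}=1-e^{-\lambda y_{i}}$, $R_{i}=1-L_{i}^{\alpha}$, $T_{i}=1-(1-\theta)L_{i}^{\alpha}$, and $X_{i}=y_{i}e^{-\lambda y_{i}}$.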
Acknowledgments
V. Marchi and F. Louzada are supported by the Brazilian organizations CAPES and CNPq, respectively. The authors are grateful to Dr. Gauss Cordeiro, Editor of this special issue, as well as to the anonymous referees for their comments, criticisms, and suggestions, which led to important improvements.
References
[1] Adamidis K., Loukas S. A lifetime distribution with decreasing failure rate.
[2] Gupta R. D., Kundu D. Exponentiated exponential family: an alternative to gamma and Weibull distributions.
[3] Gupta R. D., Kundu D. Generalized exponential distributions.
[4] Nadarajah S., Kotz S. The exponentiated type distributions.
[5] Kuş C. A new lifetime distribution.
[6] Barreto-Souza W., Cribari-Neto F. A generalization of the exponential-Poisson distribution.
[7] Louzada-Neto F., Cancho V. G., Barriga G. D. C. The Poisson-exponential distribution: a Bayesian approach.
[8] Louzada F., Roman M., Cancho V. G. The complementary exponential geometric distribution: model, properties, and a comparison with its counterpart.
[9] Louzada-Neto F. Poly-hazard regression models for lifetime data.
[10] Lawless J. F.
[11] Crowder M. J., Kimber A. C., Smith R. L., Sweeting T. J.
[12] Cox D. R., Oakes D.
[13] Gradshteyn I. S., Ryzhik I. M.
[14] Barakat H. M., Abdelkader Y. H. Computing the moments of order statistics from nonidentical random variables.
[15] Rényi A. On measures of entropy and information.
[16] R Development Core Team. R Foundation for Statistical Computing, Vienna, Austria, 2010, http://www.R-project.org
[17] Barreto-Souza W., de Morais A. L., Cordeiro G. M. The Weibull-geometric distribution.
[18] Adamidis K., Dimitrakopoulou T., Loukas S. On an extension of the exponential-geometric distribution.
[19] Mudholkar G. S., Srivasta D. K. Exponentiated Weibull family: a reanalysis of the bus-motor-failure data.
[20] Perdoná G. S. C., Louzada-Neto F. A general hazard model for lifetime data in the presence of cure rate.
[21] Davis D. An analysis of some failure data.
[22] Hoel D. G. A representation of mortality data by competing risks.
[23] Aarset M. V. How to identify a bathtub hazard rate.
[24] Birnbaum Z. W., Saunders S. C. A new family of life distributions.