Estimations of Generalized Exponential Distribution Parameters Based on Type I Generalized Progressive Hybrid Censored Data

The Type I generalized progressive hybrid censoring scheme is a combination of the Type I and Type II progressive hybrid censoring schemes, and it is one of the most recent advancements in data censoring. In this article, based on Type I generalized progressive hybrid censored data from the generalized exponential distribution, the maximum likelihood and Bayesian estimators of the distribution's parameters, as well as of the reliability and hazard functions, are computed numerically. The credible intervals of these quantities are also obtained. Since these quantities cannot be obtained in closed form, they are computed with Gibbs sampling and evaluated in a Monte Carlo simulation study. Finally, an illustrative example using a real data set is presented to compare the proposed procedures.


Introduction
Computational methods and data processing have important applications in many fields, such as medicine, engineering, and agriculture. For large data sets, statistical inference often has to rely on small samples drawn in a controlled way in order to limit time and cost; this motivated the idea of censored samples. Censored data arise in a variety of settings, including reliability and life testing experiments. The two most popular censoring techniques are the Type I and Type II censoring schemes. Epstein [1] first proposed the hybrid censoring scheme, which is a mixture of the Type I and Type II censoring schemes. One of the main drawbacks of Type I, Type II, and hybrid censoring (HC) is that they do not allow units to be removed at any point other than the end of the experiment. To address this issue, a more general censoring scheme known as progressive Type II censoring was introduced. Balakrishnan and Cramer [2] provide a full discussion of these censoring schemes as well as some recent advances. The disadvantage of the progressive Type II censoring scheme is that the experiment can take a very long time if the units are highly reliable. Therefore, Kundu and Joarder [3] proposed a progressive hybrid censoring scheme (PHCS), under which the duration of the experiment is no more than a prefixed time T. Recent studies on progressive hybrid censoring include Lin et al. [4], Panahi [5], Mohie El-Din et al. [6], and El-Din et al. [7]. On the other hand, one drawback of the PHCS is that it cannot be used when only a few failures occur before time T.
For this reason, Cho et al. [8] suggested a Type I generalized PHCS that allows us to observe a prespecified number of failures. As a result, a specified number of failures and their lifetimes are always available under the Type I generalized PHCS. A life testing experiment based on this censoring scheme can reduce the total test duration as well as the cost incurred due to unit failures. Furthermore, because more failures are observed, the efficiency of statistical estimation improves. The Type I generalized PHCS has been investigated, for instance, by El-Din et al. [9], Nagy et al. [10,11], and Nagy and Alrasheedi [12].
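In this paper, we consider statistical inference for the generalized exponential (GE) distribution under the Type I generalized PHCS. The probability density function (PDF) and cumulative distribution function (CDF) of the GE distribution, introduced by Gupta and Kundu [13], are given, respectively, by
$$f(x; \alpha, \lambda) = \alpha \lambda e^{-\lambda x} \left(1 - e^{-\lambda x}\right)^{\alpha - 1}, \quad x > 0, \tag{1}$$
$$F(x; \alpha, \lambda) = \left(1 - e^{-\lambda x}\right)^{\alpha}, \quad x > 0, \tag{2}$$
where α > 0 and λ > 0 denote the shape and scale parameters, respectively. For simplicity, let ψ(x; λ) = 1 − exp(−λx); then, the PDF and CDF of the GE distribution can be rewritten, respectively, as
$$f(x; \alpha, \lambda) = \alpha \lambda e^{-\lambda x} \left[\psi(x; \lambda)\right]^{\alpha - 1}, \tag{3}$$
$$F(x; \alpha, \lambda) = \left[\psi(x; \lambda)\right]^{\alpha}. \tag{4}$$
Therefore, the reliability (survival) and hazard functions may be written, respectively, as
$$S(t) = 1 - \left[\psi(t; \lambda)\right]^{\alpha}, \tag{5}$$
$$H(t) = \frac{\alpha \lambda e^{-\lambda t} \left[\psi(t; \lambda)\right]^{\alpha - 1}}{1 - \left[\psi(t; \lambda)\right]^{\alpha}}. \tag{6}$$
The GE distribution is a very flexible, positively skewed model that is used as an alternative to the lognormal, gamma, and Weibull distributions. It is often used to analyze positively skewed data. Inference for the GE distribution has been discussed by many authors, including Gupta and Kundu [14] and Jaheen [15].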
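The four GE functions can be evaluated directly from ψ(x; λ). The following is a minimal Python sketch; the function names are illustrative and do not come from the paper.

```python
import math


def psi(x, lam):
    """psi(x; lambda) = 1 - exp(-lambda * x), computed stably for small x."""
    return -math.expm1(-lam * x)


def ge_pdf(x, alpha, lam):
    """GE density: alpha * lambda * exp(-lambda x) * psi^(alpha - 1), for x > 0."""
    return alpha * lam * math.exp(-lam * x) * psi(x, lam) ** (alpha - 1.0)


def ge_cdf(x, alpha, lam):
    """GE distribution function: psi^alpha."""
    return psi(x, lam) ** alpha


def ge_survival(t, alpha, lam):
    """Reliability function S(t) = 1 - psi^alpha."""
    return 1.0 - ge_cdf(t, alpha, lam)


def ge_hazard(t, alpha, lam):
    """Hazard function H(t) = f(t) / S(t)."""
    return ge_pdf(t, alpha, lam) / ge_survival(t, alpha, lam)
```

With α = 1 the GE distribution reduces to the exponential distribution, so `ge_hazard(t, 1.0, lam)` should return approximately `lam` for any t > 0, which gives a quick sanity check.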
The aim of this article is to analyze Type I generalized progressive hybrid censored data from the generalized exponential distribution, computing the maximum likelihood and Bayesian estimators of the unknown parameters together with the approximate confidence intervals. The remainder of this paper is organized as follows. In Section 2, we provide an overview of the Type I generalized PHCS and give the likelihood function of the model under this scheme. The ML estimators of the GE distribution's parameters, the reliability function, and the hazard function, with the corresponding approximate confidence intervals, are derived in Section 3. In Section 4, we use MCMC with the Gibbs sampling procedure to compute the Bayesian estimates of the parameters and of the reliability and hazard functions of the GE distribution, and we also construct their credible intervals. Section 5 reports simulation studies that compare the performance of the proposed inference methods under the Type I generalized PHCS. In Section 6, computational results with real data are presented to illustrate all the inferential methods developed here. Finally, Section 7 concludes the paper.

The Likelihood Model Description
Consider a life testing experiment in which n identical units are placed on test. The time T ∈ (0, ∞), the integers k, m ∈ {1, 2, ⋯, n} with k < m, and the removal scheme R = (R_1, ⋯, R_m) are fixed in advance. Let d denote the number of failures observed up to time T, and let Y_{k:m:n} and Y_{m:m:n} be the times of the kth and mth failures, respectively. According to the Type I generalized PHCS introduced by Cho et al. [8], the experiment terminates at T* = max{Y_{k:m:n}, min{Y_{m:m:n}, T}}, and one of the following three cases is observed:
(1) If T is reached before the kth failure, the experiment terminates at Y_{k:m:n} and we observe {Y_{1:m:n} < ⋯ < Y_{d:m:n} < Y_{d+1:m:n} < ⋯ < Y_{k:m:n}}.
(2) If T is reached between the kth and mth failures, the experiment terminates at T and we observe {Y_{1:m:n} < ⋯ < Y_{k:m:n} < Y_{k+1:m:n} < ⋯ < Y_{d:m:n}}.
(3) If the mth failure occurs before T (i.e., Y_{m:m:n} ≤ T), the experiment terminates at Y_{m:m:n} and we observe {Y_{1:m:n} < ⋯ < Y_{k:m:n} < Y_{k+1:m:n} < ⋯ < Y_{m:m:n}}.
The joint density function of the observed data in these three cases is given in Equation (7), where R*_τ is the number of surviving units removed at T and y_i = y_{i:m:n} for simplicity of notation. Substituting Equations (3) and (4) into Equation (7) gives the likelihood function of α and λ based on the Type I generalized PHCS, Equation (11).
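The stopping rule T* = max{Y_{k:m:n}, min{Y_{m:m:n}, T}} and the three cases above can be sketched as a small Python helper. The function name and the numeric case labels are illustrative assumptions; in case 1 the mth failure time is never actually observed, so `y_m` only needs to be some value not smaller than `y_k`.

```python
def termination(y_k, y_m, T):
    """Return (T_star, case) for a Type I generalized PHCS.

    y_k, y_m: times of the k-th and m-th failures (y_k <= y_m).
    T: prefixed time threshold.
    """
    t_star = max(y_k, min(y_m, T))
    if T < y_k:
        case = 1  # T reached before the k-th failure: stop at y_k
    elif T < y_m:
        case = 2  # T falls between the k-th and m-th failures: stop at T
    else:
        case = 3  # m-th failure occurs no later than T: stop at y_m
    return t_star, case
```

For example, with `y_k = 2.0` and `y_m = 5.0`, thresholds `T = 1.0`, `T = 3.0`, and `T = 6.0` fall into cases 1, 2, and 3, respectively.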

The ML Estimations
To compute the ML estimates, we maximize the likelihood L(α, λ | y) with respect to α and λ or, equivalently, the logarithm of the likelihood function. The ML estimates α̂_ML and λ̂_ML of α and λ are obtained by setting the partial derivatives of the log-likelihood with respect to α and λ equal to zero and solving the resulting equations simultaneously.
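As a rough illustration of the function being maximized, the sketch below evaluates a GE log-likelihood under progressive censoring. It assumes the usual form ∏ f(y_i)[1 − F(y_i)]^{R_i} · [1 − F(T)]^{R*_τ} with the normalizing constant dropped, which may differ in detail from the paper's Equation (11); the function name and argument layout are hypothetical.

```python
import math


def ge_loglik(alpha, lam, y, R, T=None, R_tau=0):
    """Log-likelihood (up to an additive constant) of GE(alpha, lam).

    y: observed failure times; R: units removed at each failure time.
    T, R_tau: threshold and number of units removed at T (used in case 2).
    Assumed form: prod f(y_i) * S(y_i)^R_i * S(T)^R_tau.
    """
    if alpha <= 0 or lam <= 0:
        return -math.inf
    ll = 0.0
    for yi, ri in zip(y, R):
        psi = -math.expm1(-lam * yi)  # psi(y_i; lam) = 1 - exp(-lam * y_i)
        ll += math.log(alpha) + math.log(lam) - lam * yi
        ll += (alpha - 1.0) * math.log(psi)
        if ri > 0:
            ll += ri * math.log1p(-psi ** alpha)  # R_i * log S(y_i)
    if R_tau > 0 and T is not None:
        psi_T = -math.expm1(-lam * T)
        ll += R_tau * math.log1p(-psi_T ** alpha)  # R_tau * log S(T)
    return ll
```

In practice this function would be passed to a numerical optimizer to obtain the ML estimates; the choice of optimizer is not specified in the text and is left open here.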
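By the invariance property of ML estimators, the ML estimators of the reliability and hazard functions, Ŝ_ML(t) and Ĥ_ML(t), are obtained by substituting α̂_ML and λ̂_ML into Equations (5) and (6), respectively.

Approximate Confidence Intervals for α and λ.
For a large number of observed failures D*, the asymptotic variances and covariance of the ML estimators are obtained from the inverse of the observed Fisher information matrix of α and λ, that is, the negative Hessian of the log-likelihood evaluated at (α̂_ML, λ̂_ML).

Computational and Mathematical Methods in Medicine

Based on the normal approximation of the ML estimators, the standardized quantities (α̂_ML − α)/√V̂(α̂_ML) and (λ̂_ML − λ)/√V̂(λ̂_ML) are asymptotically normally distributed with mean 0 and variance 1, which yields (1 − γ)100% approximate confidence intervals for α and λ.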
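The interval construction described above can be sketched as follows, assuming the 2 × 2 observed information matrix has already been computed. The function name and return layout are illustrative; the intervals take the standard Wald form, estimate ± z_{γ/2} × standard error.

```python
import math
from statistics import NormalDist


def wald_intervals(est, info, gamma=0.05):
    """Approximate (1 - gamma) confidence intervals for (alpha, lam).

    est: (alpha_hat, lam_hat); info: 2x2 observed information matrix.
    Returns the two intervals and the estimated covariance matrix.
    """
    (a, b), (c, d) = info
    det = a * d - b * c
    if det <= 0:
        raise ValueError("information matrix must be positive definite")
    # Inverse of a 2x2 matrix gives the asymptotic covariance matrix.
    cov = [[d / det, -b / det], [-c / det, a / det]]
    z = NormalDist().inv_cdf(1.0 - gamma / 2.0)
    intervals = []
    for i, theta in enumerate(est):
        half = z * math.sqrt(cov[i][i])
        intervals.append((theta - half, theta + half))
    return intervals, cov
```

Because α and λ are positive, a lower limit that falls below zero is usually truncated at zero; that adjustment is not applied in this sketch.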

Approximate Confidence Intervals for S(t) and H(t).
In this subsection, we calculate approximate confidence intervals for the reliability and hazard functions using the delta method as described by Greene [16]. The delta method is a general approach for computing confidence intervals for functions of ML estimators whose variances are too complex to derive analytically. It builds a linear approximation of the function and then computes the variance of this simpler linear function, which can be used for large sample inference (see [17]). Let the gradients of S(t) and H(t) with respect to (α, λ) be evaluated at the ML estimates. Then, the approximate estimates of V(Ŝ(t)) and V(Ĥ(t)) are obtained by pre- and post-multiplying the estimated variance-covariance matrix of (α̂_ML, λ̂_ML) by the corresponding gradient. The standardized quantities (Ŝ(t) − S(t))/√V̂(Ŝ(t)) and (Ĥ(t) − H(t))/√V̂(Ĥ(t)) are asymptotically normally distributed with mean 0 and variance 1; these results yield (1 − γ)100% approximate confidence intervals for S(t) and H(t).
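A minimal sketch of the delta-method variance is given below. It swaps in central-difference numerical gradients for the analytic derivatives, so it is an approximation of the procedure rather than the paper's exact expressions; the helper names are hypothetical.

```python
import math


def survival(t, alpha, lam):
    """GE reliability function S(t) = 1 - (1 - exp(-lam * t))^alpha."""
    return 1.0 - (-math.expm1(-lam * t)) ** alpha


def delta_variance(g, alpha, lam, cov, h=1e-5):
    """Approximate Var[g(alpha_hat, lam_hat)] as grad' * cov * grad.

    g: function of (alpha, lam); cov: 2x2 covariance matrix of the estimators.
    The gradient is approximated by central differences with step h.
    """
    g_a = (g(alpha + h, lam) - g(alpha - h, lam)) / (2.0 * h)
    g_l = (g(alpha, lam + h) - g(alpha, lam - h)) / (2.0 * h)
    (v_aa, v_al), (v_la, v_ll) = cov
    return g_a * g_a * v_aa + g_a * g_l * (v_al + v_la) + g_l * g_l * v_ll
```

For instance, `delta_variance(lambda a, l: survival(50.0, a, l), alpha_hat, lam_hat, cov)` would estimate V(Ŝ(50)), where `alpha_hat`, `lam_hat`, and `cov` are placeholders for values obtained from the ML step.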
For more on different types of confidence intervals, we refer our readers to Banik et al. [18] and Almonte and Kibria [19] among others.

Bayesian Estimates
In this section, we derive the Bayesian estimates of the GE distribution's parameters α and λ based on Type I generalized PHCS data. We assume that α and λ are independent and follow gamma prior distributions π(α) and π(λ) with hyperparameters (a_1, b_1) and (a_2, b_2), respectively, where the hyperparameters a_i, b_i, i = 1, 2, are positive real constants that reflect prior knowledge about the parameters. If a_i and b_i, i = 1, 2, are all set equal to zero, the informative priors π(α) and π(λ) reduce to noninformative priors. The joint prior density of α and λ, Equation (31), is obtained by multiplying π(α) by π(λ). Combining Equations (11) and (31), the posterior distribution of α and λ given the Type I generalized PHCS data, Equation (32), is proportional to the product of the likelihood and the joint prior. Under the squared error loss (SEL) function, the Bayesian estimate of a parametric function g(θ) is its posterior mean. All Bayesian estimators based on the SEL function take the form of a ratio of two integrals for which there are no closed-form solutions, so appropriate numerical methods must be used to evaluate them. To compute Bayesian estimates and construct credible intervals for the distribution's parameters α and λ, we apply MCMC methods. Compared with traditional methods, the MCMC approach offers a more flexible alternative for parameter estimation. For further details about MCMC methods, see [20,21].
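The prior and posterior structure above can be sketched in a few lines of Python. The sketch assumes the shape-rate convention for the gamma priors, with kernel x^(a−1) exp(−bx); the text does not state which of a_i and b_i is the shape, so this is an assumption. The `loglik` argument is a placeholder for any function returning the log-likelihood at (α, λ).

```python
import math


def log_gamma_prior(x, a, b):
    """Log kernel of a gamma prior: (a - 1) * log(x) - b * x (shape a, rate b assumed)."""
    if x <= 0:
        return -math.inf
    return (a - 1.0) * math.log(x) - b * x


def log_joint_prior(alpha, lam, a1, b1, a2, b2):
    """Independent priors: the joint log prior is the sum of the two log priors."""
    return log_gamma_prior(alpha, a1, b1) + log_gamma_prior(lam, a2, b2)


def log_posterior(alpha, lam, loglik, hyper):
    """Unnormalized log posterior = log-likelihood + log joint prior.

    loglik: callable (alpha, lam) -> float; hyper: (a1, b1, a2, b2).
    """
    lp = log_joint_prior(alpha, lam, *hyper)
    if lp == -math.inf:
        return lp
    return loglik(alpha, lam) + lp
```

Setting all hyperparameters to zero makes the joint prior kernel equal to 1/(αλ), which is the noninformative case mentioned in the text.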

The Metropolis-Hastings Algorithm within Gibbs Sampling.
The Metropolis-Hastings (MH) algorithm was introduced by Metropolis et al. [22] as a general Markov chain Monte Carlo (MCMC) approach and was later generalized by Hastings [23]. The MH approach can be used to generate random samples from an arbitrarily complicated target distribution of any dimension that is known up to a normalizing constant. Gibbs sampling is a special case of the MCMC method. It produces a sequence of samples from the full conditional probability distributions of two or more random variables. Gibbs sampling requires decomposing the joint posterior distribution into full conditional distributions for each parameter and then sampling from them. To generate a sample from the posterior density function π*(α, λ | y), we propose using the Gibbs sampling approach. From Equation (32), the conditional posterior density function of α given λ and the data, π*(α | λ, y), is given in Equation (34); similarly, the conditional posterior density function of λ given α and the data, π*(λ | α, y), is given in Equation (35). The conditional posterior distributions of α and λ in Equations (34) and (35) cannot be reduced analytically to well-known distributions, so they cannot be sampled directly by standard methods; however, their plots show that they are similar to normal distributions. We therefore employ the MH technique within the Gibbs sampling scheme, with a normal proposal distribution, to generate random numbers from these distributions.
We now generate N values of α and λ with the MH-within-Gibbs algorithm. After the chain has been run, the first B generated values are discarded as burn-in from the total of N generated values of α and λ, and the remaining N − B values are used to compute the Bayesian estimates. Under the squared error loss function (SELF), the Bayesian estimates of α and λ are the averages of the retained values α_i and λ_i. Substituting α_i and λ_i into Equations (5) and (6) gives S_i(t) and H_i(t), for i = 1, 2, ⋯, N, and the Bayesian estimates of S(t) and H(t) under SELF are computed in the same way as the averages of the retained values. This algorithm is also useful for computing credible intervals of the unknown parameters: the (1 − γ)100% symmetric credible intervals of α, λ, S(t), and H(t) are obtained from the ordered retained values of each quantity.
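The sampling scheme described in this section can be sketched as follows. This is a generic random-walk MH-within-Gibbs sampler with normal proposals, not a reproduction of the paper's algorithm steps: the proposal standard deviations in `step`, the seed, and the index rule used for the interval limits are all assumptions, and `log_post` is a placeholder for the unnormalized log posterior of (α, λ).

```python
import math
import random


def mh_within_gibbs(log_post, init, n_iter=11000, burn_in=1000,
                    step=(0.3, 0.3), seed=0):
    """Draw (alpha, lam) pairs; each parameter gets one MH update per sweep."""
    rng = random.Random(seed)
    alpha, lam = init
    current = log_post(alpha, lam)
    draws = []
    for _ in range(n_iter):
        # Update alpha given lam; proposals outside (0, inf) are rejected.
        prop = rng.gauss(alpha, step[0])
        if prop > 0:
            cand = log_post(prop, lam)
            if math.log(1.0 - rng.random()) < cand - current:
                alpha, current = prop, cand
        # Update lam given the (possibly new) alpha.
        prop = rng.gauss(lam, step[1])
        if prop > 0:
            cand = log_post(alpha, prop)
            if math.log(1.0 - rng.random()) < cand - current:
                lam, current = prop, cand
        draws.append((alpha, lam))
    return draws[burn_in:]  # discard the burn-in values


def posterior_summary(values, gamma=0.05):
    """Posterior mean (SELF estimate) and a (1 - gamma) credible interval
    taken from the ordered draws at the gamma/2 and 1 - gamma/2 positions."""
    s = sorted(values)
    n = len(s)
    lo = s[int((gamma / 2.0) * n)]
    hi = s[min(n - 1, int((1.0 - gamma / 2.0) * n))]
    return sum(s) / n, (lo, hi)
```

Estimates of S(t) and H(t) follow by evaluating the reliability and hazard functions at each retained pair and passing the resulting values to `posterior_summary`.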

Simulation Study
In this section, a simulation study is carried out to evaluate the performance of the estimators presented in the preceding sections. Without loss of generality, a sample of size n = 60 is used with different values of (k, m), where m = 2k, different values of T, and three censoring schemes. We then generate Type I generalized PHCS samples from a GE distribution with α = 2 and λ = 1.2. To compute the Bayesian estimates and the credible intervals, the MCMC algorithm is run for N = 11,000 iterations, with the ML estimates of the unknown parameters α and λ used as initial values. The first values of the generated sequences may be far from the stationary part of the chain, so the first B = 1,000 values are discarded to remove the effect of the initial values, and the whole procedure is repeated 1000 times. We use informative gamma priors for the two parameters α and λ with hyperparameters a_1 = a_2 = 1 and b_1 = b_2 = 2 (prior 1) and noninformative priors (prior 2). The estimates and their mean squared errors (MSEs) are reported in Tables 1-4. Also, the average length (AL) and coverage probability (CP) of the 95% approximate confidence intervals are displayed in Tables 5-8.
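Lifetimes from the GE distribution can be generated by inverting the CDF: solving (1 − exp(−λx))^α = u gives x = −ln(1 − u^(1/α))/λ. The sketch below draws an ordered complete sample this way; it does not apply the progressive removals or the hybrid stopping rule, and the text does not state which generator the authors used, so the function names and approach are assumptions.

```python
import math
import random


def ge_quantile(u, alpha, lam):
    """Inverse CDF of GE(alpha, lam): x = -ln(1 - u^(1/alpha)) / lam."""
    return -math.log1p(-u ** (1.0 / alpha)) / lam


def ge_sample(n, alpha, lam, rng):
    """Ordered complete sample of size n drawn by inverse-transform sampling."""
    return sorted(ge_quantile(rng.random(), alpha, lam) for _ in range(n))


# Example with the simulation settings quoted in the text.
rng = random.Random(2024)  # the seed is arbitrary
lifetimes = ge_sample(60, 2.0, 1.2, rng)
```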
To check the convergence of the MCMC samples, we provide diagnostic trace plots together with posterior density plots for the different parameters, using the three schemes above at m = 40 and T = 1.5 with prior 1 and prior 2. Figures 1, 2, 3, and 4 show the trace plots and posterior densities of α, λ, S(t), and h(t), respectively. All censoring schemes are plotted in the same way, and the trace plots indicate that the chains converge well for all censoring schemes.
From Tables 1-4, as expected, we observe that the Bayesian estimates computed under the informative prior have smaller MSEs than both the Bayesian estimates computed under the noninformative prior and the ML estimates. Moreover, in most cases, the MSEs decrease as the sample sizes (m, k) increase for a fixed time T. Also, for the same sample sizes, the MSEs decrease as T increases. Furthermore, in most cases, the MSEs under scheme 1 are smaller than the corresponding MSEs under scheme 2 and scheme 3.
From Tables 5-8, it may be observed that the AL of the Bayesian intervals with informative priors is shorter than the corresponding AL of the Bayesian intervals with noninformative priors; this is to be expected. As a result, if prior information is available, it should be used. Finally, there are no significant differences between the AL of the ML intervals and the corresponding AL of the Bayesian intervals with noninformative priors. Hence, we can say that the maximum likelihood method performs worse than the Bayesian method based on informative priors.
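The three summary measures used in this comparison (MSE, AL, and CP) can be computed from the replicated estimates and intervals as in the following sketch. The function names are illustrative, and the inputs are assumed to be plain lists collected over the simulation replications.

```python
def mse(estimates, true_value):
    """Mean squared error of the replicated point estimates."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def interval_summary(intervals, true_value):
    """Average length (AL) and coverage probability (CP) of replicated intervals.

    intervals: list of (lower, upper) pairs, one per replication.
    """
    n = len(intervals)
    al = sum(hi - lo for lo, hi in intervals) / n
    cp = sum(1 for lo, hi in intervals if lo <= true_value <= hi) / n
    return al, cp
```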

Real Data Analysis
In this section, we analyze a real data set for illustrative purposes. The data set is from Lawless [24] and arose in tests on the endurance of deep groove ball bearings. The data are the numbers of million revolutions before failure for each of the 23 ball bearings in the life test. The data set has been analyzed by several authors; in particular, Gupta and Kundu [14] showed that the two-parameter GE distribution can be used quite effectively to analyze it. In addition, the Kolmogorov-Smirnov goodness-of-fit test was conducted for the GE distribution on the complete sample of size 23. The test statistic D = 0.105825 and the corresponding p value of 0.93511 were obtained, so the GE distribution provides an adequate fit to the data. We created three Type I generalized PHCS samples from this uncensored data set by fixing m = 15, k = 9, and R = (2, 0^(7), 3, 0^(5), 3), where 0^(j) denotes j consecutive zeros, and using different values of T; the resulting censoring schemes are given in Table 9. The point estimates and the 95% interval estimates of the parameters α, λ, S(t = 50), and h(t = 50) obtained with the ML method and with informative and noninformative priors are presented in Table 10.
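As a quick consistency check on the removal scheme quoted above, the compact notation can be expanded and compared with n = 23 and m = 15: every unit is either an observed failure or a removal, so m plus the total number of removals must equal n. The helper name is illustrative.

```python
def expand_scheme(blocks):
    """Expand [(value, count), ...] into the full removal vector R."""
    out = []
    for value, count in blocks:
        out.extend([value] * count)
    return out


# R = (2, 0^(7), 3, 0^(5), 3) from the text
R = expand_scheme([(2, 1), (0, 7), (3, 1), (0, 5), (3, 1)])
n, m = 23, 15
assert len(R) == m       # one removal count per observed failure
assert m + sum(R) == n   # 15 failures + 8 removals = 23 units
```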

Conclusions and Discussion
In this paper, using Type I generalized progressive hybrid censored data from the generalized exponential distribution, we construct the maximum likelihood and Bayesian estimators of the distribution's parameters and of the reliability and hazard functions. We determined approximate confidence intervals for the distribution's unknown parameters from the asymptotic normality of the ML estimators and, using the delta method, for the reliability and hazard functions.
Finally, we used the Markov chain Monte Carlo approach to carry out the Bayesian estimation and determine credible intervals. The simulation results showed that Bayesian estimation with informative priors gives smaller MSEs and shorter intervals than ML estimation.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that there is no conflict of interest.