Estimation of Generalized Gompertz Distribution Parameters under Ranked-Set Sampling

This paper studies estimation of the parameters of the generalized Gompertz distribution based on ranked-set sampling (RSS). Maximum likelihood (ML) and Bayesian approaches are considered. Approximate confidence intervals for the unknown parameters are constructed using both the normal approximation to the asymptotic distribution of the ML estimators and bootstrapping methods. Bayes estimates and credible intervals of the unknown parameters are obtained using differential evolution Markov chain Monte Carlo and Lindley's methods. The proposed methods are compared via Monte Carlo simulation studies and an example employing real data. The performance of both ML and Bayes estimates improves under RSS compared with simple random sampling (SRS), regardless of the sample size. Bayes estimates outperform the ML estimates for small samples, while the reverse holds for moderate and large samples.


Introduction
The Gompertz distribution was introduced by Gompertz [1] to describe human mortality and to establish actuarial tables. It was also found to be useful in the medical sciences because it gives a good fit to data coming from clinical trials on ordered subjects [2]. The Gompertz distribution has been extensively studied in the literature (see, for example, El-Din et al. [3] and the references therein). This paper focuses on the three-parameter generalized Gompertz distribution, which was proposed by El-Gohary et al. [4].
A random variable X is said to have a generalized Gompertz (GG) distribution with parameter vector θ = (λ, c, θ), denoted as X ~ GG(λ, c, θ), if its probability density function and distribution function are given by

f(x) = θλ e^{cx} e^{−(λ/c)(e^{cx} − 1)} [1 − e^{−(λ/c)(e^{cx} − 1)}]^{θ−1},  x > 0,   (1)

F(x) = [1 − e^{−(λ/c)(e^{cx} − 1)}]^{θ},  x > 0,   (2)

with λ, θ > 0 and c > 0. The GG distribution covers the generalized exponential distribution as a special case when c goes to zero, and the one-parameter exponential distribution when c goes to zero and θ = 1. It also covers the Gompertz distribution when θ = 1.
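The density, distribution, and quantile functions above translate directly into code. The following sketch (function names are illustrative, not from the paper) implements them for the parameterization given above; the quantile function is obtained by inverting F(x) = u in closed form and is useful for inversion sampling.

```python
import numpy as np

def gg_cdf(x, lam, c, theta):
    # F(x) = [1 - exp(-(lam/c)(e^{cx} - 1))]^theta
    return (1.0 - np.exp(-(lam / c) * (np.exp(c * x) - 1.0))) ** theta

def gg_pdf(x, lam, c, theta):
    # f(x) = theta * lam * e^{cx} * G(x) * [1 - G(x)]^{theta - 1},
    # where G(x) = exp(-(lam/c)(e^{cx} - 1)) is the Gompertz survival part.
    g = np.exp(-(lam / c) * (np.exp(c * x) - 1.0))
    return theta * lam * np.exp(c * x) * g * (1.0 - g) ** (theta - 1.0)

def gg_quantile(u, lam, c, theta):
    # Solve F(x) = u for x: x = (1/c) log(1 - (c/lam) log(1 - u^{1/theta})).
    return np.log1p(-(c / lam) * np.log1p(-u ** (1.0 / theta))) / c
```

A quick sanity check is that `gg_cdf(gg_quantile(u, ...), ...)` returns `u`, and that `gg_pdf` matches a numerical derivative of `gg_cdf`.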
The GG distribution takes different shapes of the failure rate curve, namely, increasing, constant, decreasing, or bathtub-shaped, depending on the value of θ (the shape parameter). The GG distribution is a strong candidate for the analysis of reliability data [4] and survival data [5]. Demir and Saracoglu [6] studied maximum likelihood estimation of the GG distribution parameters under progressively type II censored data. Ahmed [7] studied maximum likelihood and Bayesian estimation of the lifetime parameters of the GG distribution under progressively type II censored data. Based on the GG distribution, Borges [5] developed a regression model for survival data; the author proposed an Expectation-Maximization algorithm to estimate the regression parameters. Abu-Zinadah and Al-Oufi [8] studied the estimation of the GG parameters under complete samples using ML, least-squares, weighted least-squares, and percentile estimation methods. Estimation of the GG parameters under type II censored samples was studied by Abu-Zinadah and Al-Oufi [9]. The use of the GG distribution for lifetime data in the presence of a cure fraction, censored data, and covariates was studied by Martinez [10]. Different from the above, this article discusses ML and Bayesian parameter estimation of the GG distribution under the ranked-set sampling (RSS) scheme. RSS is a sampling scheme proposed by McIntyre [11] in the hope of improving the estimation of the population mean. RSS is very useful in situations where precise measurements of sample units are difficult, due to high cost or time consumption, but a set of sample units can be accurately ranked at negligible cost or time. For situations where RSS techniques have been found applicable, see Barnett and Moore [12], Wolfe [13], and Frey and Zhang [14].
The RSS scheme aims to collect observations from a population that are more representative of it than other probability sampling techniques, such as simple random sampling (SRS), based on the same number of collected observations. To implement RSS of n = mr observations from a population, follow these steps:

Step 1: randomly select m² units from the population and partition them into m sets of m units each.
Step 2: rank the units within each set by judgment or any inexpensive method, without actual measurement.
Step 3: measure only the i th ranked unit of the i th set, for i = 1, ..., m.
Step 4: repeat Steps 1-3 for r cycles to obtain a sample of size n = mr.

The resulting sample is denoted as X_{(i:m)j}, the i th ranked unit in a set of size m in the j th cycle, where i = 1, ..., m and j = 1, ..., r. Based on the above steps, the joint pdf of an RSS is given by the following equation (see Arnold et al. [15]):

f_RSS(x) = ∏_{j=1}^{r} ∏_{i=1}^{m} (m!/((i − 1)!(m − i)!)) [F(x_{(i:m)j})]^{i−1} [1 − F(x_{(i:m)j})]^{m−i} f(x_{(i:m)j}),   (3)

where i = 1, ..., m, j = 1, ..., r, and f and F are the pdf and cdf of the random variable X.

The remainder of this paper is organized as follows. Section 2 discusses ML estimation and confidence intervals of the model parameters under both RSS and SRS. Bayesian estimators along with credible intervals of the parameters are discussed in Section 3. Section 4 compares the ML and Bayesian methods through a Monte Carlo simulation study. A real data example is considered in Section 5 to implement the proposed methods. Section 6 concludes the paper.
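The steps above can be sketched in code. The sketch below draws an RSS from the GG distribution by inversion sampling (the quantile function is derived from the cdf given earlier); in each set, only the i th order statistic is "measured," mimicking the setting where ranking is cheap but measurement is costly. Function names are illustrative.

```python
import numpy as np

def gg_quantile(u, lam, c, theta):
    # Inverse of the GG cdf, used for inversion sampling.
    return np.log1p(-(c / lam) * np.log1p(-u ** (1.0 / theta))) / c

def rss_sample(m, r, lam, c, theta, rng):
    """Ranked-set sample of size n = m*r: in each of r cycles, draw m
    independent sets of m units, rank each set, and keep only the i-th
    order statistic of the i-th set (the unit X_{(i:m)j})."""
    sample = np.empty((r, m))
    for j in range(r):
        for i in range(m):
            units = gg_quantile(rng.uniform(size=m), lam, c, theta)
            sample[j, i] = np.sort(units)[i]  # measure the i-th ranked unit
    return sample

rng = np.random.default_rng(1)
x_rss = rss_sample(3, 2000, 1.0, 0.5, 0.5, rng)
```

With many cycles, the column means increase with the rank index, reflecting the stratification that makes RSS more informative than SRS of the same size.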

Maximum Likelihood Estimation
This section discusses parameter estimation of the GG distribution using the ML method under RSS. Moreover, interval estimation of the parameters is discussed based on the observed Fisher information matrix and the normal approximation to the asymptotic distribution of the MLEs. Bootstrap confidence intervals are also considered as an alternative to the normal approximation approach. For comparative purposes, ML and interval estimation under SRS are also investigated.

MLE under RSS.
Let (X_{(1:m)1}, X_{(2:m)1}, ..., X_{(m:m)1}, ..., X_{(1:m)r}, X_{(2:m)r}, ..., X_{(m:m)r}) be an RSS from GG(λ, c, θ), with pdf given in (1), and let x be the vector of realizations. Substituting (1) and (2) in (3), the likelihood and log-likelihood functions of an RSS from the GG distribution are

L_RSS(λ, c, θ; x) = ∏_{j=1}^{r} ∏_{i=1}^{m} c_i [F(x_{(i:m)j})]^{i−1} [1 − F(x_{(i:m)j})]^{m−i} f(x_{(i:m)j}),   (4)

ℓ_RSS(λ, c, θ; x) = ∑_{j=1}^{r} ∑_{i=1}^{m} { log c_i + (i − 1) log F(x_{(i:m)j}) + (m − i) log[1 − F(x_{(i:m)j})] + log f(x_{(i:m)j}) },   (5)

where c_i = m!/((i − 1)!(m − i)!). The MLEs of the parameters λ, c, and θ are obtained by maximizing the likelihood in (4) or, equivalently, the log-likelihood in (5). This can be accomplished by setting the partial derivatives in (6)-(8) equal to zero and solving the resulting equations simultaneously. These equations do not have closed-form solutions; therefore, the Newton-Raphson method is used to obtain the estimates. The algorithm comprises the following steps:

Step 1: start with an initial guess (λ^(0), c^(0), θ^(0)) as a starting point of the iterations.
Step 2: update the current iterate via θ^(k+1) = θ^(k) − H^{−1}(θ^(k)) ∇ℓ_RSS(θ^(k)), where ∇ℓ_RSS is the gradient in (6)-(8) and H is the Hessian of the log-likelihood.
Step 3: repeat Step 2 until successive iterates change by less than a preset tolerance.
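As a cross-check on the Newton-Raphson solution, the log-likelihood in (5) can also be maximized with a general-purpose optimizer. The sketch below uses Nelder-Mead instead of the paper's Newton-Raphson, purely for robustness of the illustration, and clips F away from 0 and 1 for numerical stability; the data layout and names are assumptions of this sketch.

```python
import numpy as np
from scipy.optimize import minimize

def gg_cdf(x, lam, c, theta):
    return (1.0 - np.exp(-(lam / c) * (np.exp(c * x) - 1.0))) ** theta

def gg_logpdf(x, lam, c, theta):
    # log f(x) = log(theta*lam) + c*x - z + (theta - 1) log(1 - e^{-z}),
    # with z = (lam/c)(e^{cx} - 1).
    z = (lam / c) * (np.exp(c * x) - 1.0)
    return np.log(theta * lam) + c * x - z + (theta - 1.0) * np.log1p(-np.exp(-z))

def neg_loglik_rss(params, data):
    """Negative RSS log-likelihood (constant log c_i terms dropped).
    `data` has shape (r, m); column i holds the (i+1)-th order statistics."""
    lam, c, theta = params
    if min(lam, c, theta) <= 0:
        return np.inf
    m = data.shape[1]
    ll = 0.0
    for i in range(m):
        x = data[:, i]
        F = np.clip(gg_cdf(x, lam, c, theta), 1e-12, 1 - 1e-12)
        ll += np.sum(i * np.log(F) + (m - 1 - i) * np.log(1.0 - F)
                     + gg_logpdf(x, lam, c, theta))
    return -ll
```

A call such as `minimize(neg_loglik_rss, x0=[0.8, 0.4, 0.6], args=(data,), method="Nelder-Mead")` then returns the (approximate) MLE.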
To obtain confidence intervals for the parameters, the asymptotic properties of the MLEs are used. The MLE is asymptotically normal with mean equal to the true parameter values and variance-covariance matrix equal to the inverse of the observed Fisher information matrix (see Lawless [16]). The observed Fisher information matrix I(λ, c, θ) is defined as the matrix of second partial derivatives of the negative log-likelihood with respect to the model parameters, evaluated at the MLE.

MLE under SRS.
Let (X_1, ..., X_n) be a SRS from GG(λ, c, θ), with x = (x_1, ..., x_n) being the vector of realizations. The likelihood and log-likelihood functions are L_SRS(λ, c, θ; x) = ∏_{i=1}^{n} f(x_i) and ℓ_SRS = log L_SRS. The MLEs of the model parameters based on SRS are obtained by setting the first partial derivatives of ℓ_SRS(λ, c, θ; x) in (16)-(18) equal to zero. This system of equations does not have a closed-form solution; therefore, the Newton-Raphson method is used. The second partial derivatives of ℓ_SRS(λ, c, θ; x) form the elements of I(λ, c, θ).

Bootstrap Confidence Interval.
Confidence intervals for the model parameters based on the normal approximation may not work well when the sample size n is small. Resampling methods are alternatives that may provide more accurate approximate confidence intervals; one popular resampling method is the bootstrap. This section discusses the percentile bootstrap (Boot-p) confidence interval proposed by Efron [17]. The Boot-p confidence interval can be obtained as follows:

(i) Select a random sample (whether RSS or SRS) from the population and obtain the MLE θ̂ of the model parameter θ as discussed in Section 2.
(ii) Based on the specified sampling scheme (RSS or SRS), generate a bootstrap random sample from the GG distribution with parameter θ̂.
(iii) Obtain the MLE of the model parameters based on the bootstrap sample and denote this bootstrap estimate by θ̂*.
(iv) Repeat steps (ii) and (iii) N times to obtain θ̂*_1, ..., θ̂*_N.
(v) Arrange the above estimates in ascending order to obtain the ordered estimates θ̂*_(1), ..., θ̂*_(N).
(vi) The 100(1 − α)% Boot-p confidence interval is then given by the 100(α/2) and 100(1 − α/2) empirical percentiles of the bootstrap estimates obtained in the previous step.
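Steps (i)-(vi) can be written generically. The sketch below takes any point estimator and any bootstrap resampler as arguments (both hypothetical placeholders, not the paper's code) and returns the percentile interval; the illustration uses a nonparametric resampler and the sample mean, whereas in the paper the resampler would be parametric, drawing from GG with the MLE plugged in.

```python
import numpy as np

def boot_p_ci(data, estimator, resampler, n_boot=1000, alpha=0.05, rng=None):
    """Percentile bootstrap (Boot-p) interval: re-estimate on n_boot bootstrap
    samples and take the alpha/2 and 1 - alpha/2 empirical percentiles."""
    rng = np.random.default_rng() if rng is None else rng
    boot = np.array([estimator(resampler(data, rng)) for _ in range(n_boot)])
    return np.quantile(boot, [alpha / 2.0, 1.0 - alpha / 2.0])

# Illustration: nonparametric resampling of exponential data, estimating the mean.
nonparam = lambda d, rng: rng.choice(d, size=len(d), replace=True)
rng = np.random.default_rng(2)
data = rng.exponential(scale=2.0, size=100)
lo, hi = boot_p_ci(data, np.mean, nonparam, rng=rng)
```

The returned pair `(lo, hi)` brackets the point estimate, and its width shrinks as the sample size grows.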

Bayesian Estimation
In this section, Bayes estimates and Bayesian credible intervals of the parameters (λ, c, θ) are obtained using Markov chain Monte Carlo (MCMC) methods under both RSS and SRS. Two important components of Bayesian analysis are the choice of the prior distribution of the parameters and the loss function. The prior distribution reflects the knowledge or information about the parameters of interest prior to collecting the data; if there is no such knowledge, then a weakly informative prior can be used. The loss function measures the loss incurred when estimating a parameter θ by an estimator θ̂ and is used as a criterion for good estimators.
Independent gamma priors are assumed for the parameters, that is,

π_i(θ_i) ∝ θ_i^{α_i − 1} e^{−β_i θ_i},  θ_i > 0,  i = 1, 2, 3,   (21)

where (θ_1, θ_2, θ_3) = (λ, c, θ). For the gamma priors to be weakly informative, the hyperparameters α_i and β_i, i = 1, 2, 3, are set to a small value such as 0.001. Bayesian inference is then based on the posterior distribution, the distribution of the parameters given the data D, that is,

π(θ | D) = L(θ; D) π(θ) / ∫ L(θ; D) π(θ) dθ,

where L(θ; D) is the likelihood function.
In this work, we use the most widely used loss function in Bayesian inference, the squared error loss (SEL) function, L(θ, θ̂) = (θ̂ − θ)². Bayes' estimator of the parameter θ under SEL is the posterior mean, θ̂_BAY = E(θ | D).

Bayesian Estimation under RSS.
The joint posterior distribution of the parameters λ, c, and θ under RSS is obtained by combining the likelihood in (4) and the prior in (21) via Bayes' theorem; up to a normalizing constant, it is given in (25). Using this posterior, one can obtain Bayes' estimator of any function g(λ, c, θ) of the parameters by finding its posterior mean, as in (26). The posterior distribution involves intractable integrals because the likelihood function based on RSS is complicated. Therefore, a Markov chain Monte Carlo (MCMC) method is used to obtain Bayes' estimates of the parameters. MCMC methods generate samples from the joint posterior density function and use them to compute the Bayes estimates of the parameters of interest. To implement the MCMC methodology, we consider the Metropolis-Hastings (M-H) sampler, which is summarized in the following steps:

Step 1: start with an initial guess (λ^(0), c^(0), θ^(0)) as a starting point of the M-H sampler.
Step 2: choose a proposal kernel from which it is easy to sample and which shares the main characteristics of the posterior. Denote this proposal by q(λ, c, θ).
Step 3: draw a candidate from the proposal and accept it with probability min(1, r), where r is the ratio of the posterior density at the candidate to that at the current state (adjusted by the proposal densities when the proposal is asymmetric); otherwise, retain the current state.
Step 4: repeat Step 3 for a large number of iterations, say M, until convergence is assured.
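Steps 1-4 can be sketched as a generic random-walk M-H sampler. In the sketch below the target is a toy log-posterior (an independent standard normal in d = 3 dimensions) rather than the GG posterior, and all names are illustrative; with a symmetric normal proposal the acceptance ratio reduces to the ratio of posterior densities.

```python
import numpy as np

def rw_mh(log_post, x0, cov, n_iter, rng):
    """Random-walk Metropolis-Hastings with a normal proposal kernel."""
    d = len(x0)
    L = np.linalg.cholesky(cov)          # factor the proposal covariance once
    x = np.array(x0, dtype=float)
    lp = log_post(x)
    chain = np.empty((n_iter, d))
    for k in range(n_iter):
        prop = x + L @ rng.standard_normal(d)      # Step 3: propose a candidate
        lp_prop = log_post(prop)
        if np.log(rng.uniform()) < lp_prop - lp:   # accept with prob min(1, r)
            x, lp = prop, lp_prop
        chain[k] = x                               # record current state
    return chain

# Toy run: standard normal target in d = 3 dimensions (as in the paper),
# with the proposal standard deviation scaled by 2.38/sqrt(d).
rng = np.random.default_rng(3)
d = 3
cov = (2.38 / np.sqrt(d)) ** 2 * np.eye(d)
chain = rw_mh(lambda x: -0.5 * np.sum(x ** 2), np.zeros(d), cov, 20000, rng)
```

Discarding the first half as burn-in, the empirical moments of the chain should match the target's.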
In our simulations, an independent normal kernel was used as the proposal distribution. The mean of this proposal is set equal to the previously sampled value, and the standard deviation equals the square root of the inverse of the observed Fisher information scaled by a factor of 2.38/√d, where d = 3 is the dimension of the parameter space (see Gelman et al. [18]). This M-H algorithm is called the random-walk M-H (RW-M-H), owing to the randomness with which it proposes a new realization from the posterior. One of the main drawbacks of the RW-M-H algorithm in complex posteriors is its slow convergence due to the dependency between parameters. In our simulations, we noticed that the RW-M-H algorithm did not attain the true nominal coverage of the Bayesian credible interval. Therefore, the differential evolution M-H (DE-M-H) developed by ter Braak [19] is used to improve the performance of the M-H algorithm. Ter Braak [19] stated that the main advantage of the DE-M-H is its ability to handle issues such as non-convergence, collinear parameters, and multimodal densities.
The DE-M-H consists of running multiple chains, say N, initialized from overdispersed states. Its main feature is that the proposed value in each chain uses information from two randomly selected other chains, which allows the chains to learn from each other through the process. Set θ = (λ, c, θ). The DE-M-H can be implemented as follows:

Step 1: initialize the N chains from overdispersed starting points θ_1^(0), ..., θ_N^(0).
Step 2: at iteration k, the proposal for the i th chain, i = 1, ..., N, is

θ_p = θ_i^(k−1) + γ(θ_l^(k−1) − θ_j^(k−1)) + e,   (27)

where θ_i^(k−1) is the previous state of the i th chain, and θ_l^(k−1) and θ_j^(k−1) are the previous states of two chains selected at random without replacement from the remaining chains (excluding the i th chain); this ensures that the chains learn from each other. The perturbation e is drawn from a normal(0, σ²) distribution, where σ is chosen to be small; in our simulations, σ is set equal to the standard deviations obtained from the observed Fisher information. The scaling factor γ is used to provide an acceptable acceptance probability; the default choice is γ = 2.38/√(2d) (see Gelman et al. [18]). The proposed value is accepted with probability min(1, r*), with

r* = π(θ_p | D)/π(θ_i^(k−1) | D).   (28)

Step 3: repeat Step 2 for a large number of iterations, say M.

The resulting posterior samples can be used to find the posterior estimates and credible intervals as follows.
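The DE-M-H update can be sketched as follows; the target is again a toy log-posterior rather than the GG posterior, the tuning constants follow the defaults discussed above (γ = 2.38/√(2d), small σ), and the names are illustrative.

```python
import numpy as np

def de_mh(log_post, inits, n_iter, rng, sigma=1e-4):
    """Differential-evolution M-H: chain i proposes a move along the
    difference of two other randomly chosen chains, plus a small jitter e."""
    N, d = inits.shape
    gamma = 2.38 / np.sqrt(2.0 * d)                 # default scaling factor
    x = inits.astype(float).copy()
    lp = np.array([log_post(xi) for xi in x])
    out = np.empty((n_iter, N, d))
    for k in range(n_iter):
        for i in range(N):
            others = [a for a in range(N) if a != i]
            l, j = rng.choice(others, size=2, replace=False)
            prop = x[i] + gamma * (x[l] - x[j]) + sigma * rng.standard_normal(d)
            lp_prop = log_post(prop)
            if np.log(rng.uniform()) < lp_prop - lp[i]:  # min(1, r*) acceptance
                x[i], lp[i] = prop, lp_prop
        out[k] = x
    return out

# Toy run: N = 10 chains on a 3-dimensional standard normal target,
# initialized from overdispersed states.
rng = np.random.default_rng(4)
inits = 5.0 * rng.standard_normal((10, 3))
draws = de_mh(lambda x: -0.5 * np.sum(x ** 2), inits, 3000, rng)
```

Pooling all chains after burn-in gives the posterior sample used for point estimates and credible intervals.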
In addition to the point estimator θ̂_BAY of θ, a Bayesian credible interval can be obtained from the posterior samples. One popular credible interval is the highest posterior density (HPD) credible interval. The HPD interval can be constructed from the empirical cumulative distribution function (cdf) of the posterior samples as the shortest interval for which the difference in the empirical cdf values of the endpoints equals the desired nominal probability.
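The HPD construction just described (the shortest interval with the required empirical coverage) can be computed directly from the sorted draws; a minimal sketch, with illustrative names:

```python
import numpy as np

def hpd_interval(samples, prob=0.95):
    """Shortest interval covering a fraction `prob` of the posterior draws,
    found by scanning every candidate interval of the required length."""
    s = np.sort(np.asarray(samples))
    n = s.size
    k = int(np.ceil(prob * n))           # number of draws the interval must cover
    widths = s[k - 1:] - s[: n - k + 1]  # width of each candidate interval
    i = int(np.argmin(widths))
    return s[i], s[i + k - 1]
```

For a symmetric posterior the HPD interval coincides with the equal-tailed interval; for a skewed posterior it is shorter.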

Bayesian Estimation under SRS.
The Bayesian approach under SRS is similar to the one under RSS; the only change is that the likelihood L_SRS is used in place of L_RSS in the formula of the posterior distribution in (25). The DE-M-H method was used to obtain posterior samples and to make Bayesian inferences.

Lindley's Approximation for Bayesian Estimates.
Another way of obtaining Bayesian estimates is to approximate the ratio of integrals in (26). Many methods have been proposed in the literature to approximate such a ratio of integrals; one popular method was proposed by Lindley [20]. Lindley's procedure is outlined as follows.
The ratio of integrals is written in the form

E[g(θ) | x] = ∫ g(θ) e^{Λ(θ)} dθ / ∫ e^{Λ(θ)} dθ,

where Λ(θ) = log(π(θ | x)) is the log-posterior distribution. By expanding Λ(θ) in a Taylor series about the posterior mode θ*, Lindley obtained the Bayes estimator of g(θ) as

ĝ_BAY ≈ g + (1/2) Σ_{i,j} g_{ij} τ_{ij} + (1/2) Σ_{i,j,k,l} Λ_{ijk} g_l τ_{ij} τ_{kl},

where all functions are evaluated at the posterior mode θ*. The summations run over all subscripts from 1 to d, the dimension of the parameter vector θ. The subscripts denote partial derivatives of the function with respect to the corresponding components of θ, that is, g_{ij} = ∂²g/∂θ_i ∂θ_j and Λ_{ijk} = ∂³Λ/∂θ_i ∂θ_j ∂θ_k, and τ_{ij} are the elements of the negative of the inverse of the Hessian matrix of Λ.

Simulation Study
To assess the performance of the proposed estimation methods (ML and Bayesian) under the SRS and RSS schemes, a Monte Carlo simulation study based on 5000 samples is conducted.
Comparisons are made based on the bias, mean square error (MSE), coverage probability (CP), and half length (HL) of 95% confidence intervals. Different combinations of the parameter values are considered so as to cover different shapes of the probability density function of the GG distribution (see Figure 1). Since the conclusions were similar for almost all combinations of the parameter values, the results of two combinations are presented, namely, (λ, c, θ) = (1, 0.5, 0.5) and (λ, c, θ) = (0.5, 0.5, 1). For each set of parameter values, four different sample sizes are studied: a small size n = 10, a moderate size n = 20, and two large sizes n = 50 and n = 100. Bayesian estimates are obtained using the DE-M-H method with N = 10 chains, each of length 10000; the first half of each chain is discarded as burn-in. The results of the simulation study are summarized in Tables 1-8. The following is observed:

(i) Bayesian estimation using the RW-M-H algorithm produces a coverage probability of the 95% credible interval lower than the nominal rate even for large sample sizes. This can be seen in Table 8.

(iv) For moderate sample sizes, the ML methods produce a smaller MSE than the Bayesian method.
(v) In terms of the coverage probability (CP) and half length (HL) of the confidence interval, the Bayesian approach produces better results (more accurate CP and smaller HL) than the normal approximation of the ML approach for small and moderate sample sizes.
(vi) Confidence intervals constructed using the bootstrap provide more accurate coverage probabilities with shorter half lengths than those obtained using the normal approximation for small and moderate sample sizes. The two approaches perform almost the same as the sample size increases.
(vii) Bootstrap confidence intervals were very comparable to Bayesian credible intervals in terms of half length and coverage probability.

In general, the Bayesian approach is recommended for small sample sizes because it performs better than the ML method. For moderate and large sample sizes, the ML approach is recommended due to its simplicity and speed. Whichever estimation method is used, we recommend RSS over SRS as a sampling scheme because it yields more efficient estimators even for small m.

Real Data Example
This section implements the statistical inference methods discussed in the previous sections on a real data example. The data, obtained from Aarset [21], represent failure times of 50 devices from a life-test experiment. These data have been analyzed by many authors assuming the GG distribution under different sampling schemes (see, for example, Ahmed [7] and El-Gohary et al. [4]). The GG distribution was found to fit the data fairly well and better than other candidate distributions such as the Gompertz, generalized exponential, and exponential distributions. The data are given in Table 9.
We assume the data in Table 9 to be our population.

Table 9: Failure times of the devices: 1, 2, 3, 6, 7, 11, 12, 18, 18, 18, 18, 18, 21, 32, 36, 40, 45, 45, 47, 50, 55, 60, 63, 63, 67, 67, 67, 67, 72, 75, 79, 82, 82, 83, 84, 84, 84, 85, 85, 85, 85, 85, 86, 86

The true values of the parameters, obtained in El-Gohary et al. [4], are λ = 0.00143, c = 0.044, and θ = 0.421. To implement the proposed estimation methods, we analyzed 1000 bootstrap samples randomly selected with replacement from the data. We considered two sample sizes, n = 10 and 20, and two values of the set size in RSS, m = 2 and 5. The mean and relative MSE of the estimators and the half length and coverage probability of the confidence intervals are provided in Tables 10 and 11. The DE-MCMC method was used to obtain Bayesian estimates and credible intervals. Bootstrap confidence intervals were constructed for the ML estimates.
It is clear that the results for the real data agree with the simulation study. Under both SRS and RSS, the Bayes estimators outperform the MLEs in terms of lower MSE and lower half length of the confidence intervals. The two estimation methods perform almost the same as the sample size increases. The performance of both estimation methods improves when using RSS compared with SRS.

Discussion and Conclusion
This paper develops ML and Bayesian methods to estimate the GG distribution parameters under RSS. The ML estimates are obtained, and the corresponding confidence intervals of the parameters are constructed using both the normal approximation to the distribution of the ML estimates and bootstrap methods.
The Bayesian estimates are obtained under the SEL function and weakly informative priors. It is observed that the Bayes estimators cannot be obtained in explicit form; therefore, a differential evolution MCMC method is developed to obtain Bayesian point and interval estimates. Lindley's procedure is also studied to obtain Bayesian estimates. It was observed that the Bayesian methods outperform the ML estimation methods in terms of MSE and coverage probability when the sample size is small; the opposite is true for moderate and large sample sizes. Under either method, estimates under RSS are more efficient than estimates under SRS.

Data Availability
The data used to support this study are obtained from the cited reference M. V. Aarset [21].

Conflicts of Interest
The authors declare that they have no conflicts of interest.