Efficient Estimation of the Generalized Quasi-Lindley Distribution Parameters under Ranked Set Sampling and Applications

Ranked set sampling is a useful method for collecting data when the actual measurement of the units in a population is difficult or expensive. The generalized quasi-Lindley distribution was recently suggested as a new continuous lifetime distribution. In this article, the ranked set sampling method is considered for estimating the parameters of the generalized quasi-Lindley distribution. Several estimation methods are used, including the maximum likelihood, the maximum product of spacings, ordinary least squares, weighted least squares, Cramer–von Mises, and Anderson–Darling methods. The performance of the proposed ranked set sampling based estimators is assessed through a simulation study in terms of bias and mean squared error, in comparison with simple random sampling. Additional results are obtained from real data on the survival times of 72 guinea pigs and on 23 ball bearings. The simulation results and the real data applications show the superiority of the proposed ranked set sampling estimators over their simple random sampling competitors based on the same number of measured units.


Introduction
Cost-effective sampling is one of the important fields of interest in statistics. The motivation for this field arises from its ability to facilitate data collection, especially when collecting the data of interest is time-consuming or expensive. Over the past decades, researchers have developed different sampling methods in order to achieve reliable results at lower cost and with greater accuracy.
McIntyre [1] proposed a new sampling method for estimating the mean of pasture and forage yields in Australia.
This method is known as ranked set sampling (RSS) and has greater efficiency than the commonly used simple random sampling (SRS). Later, Halls and Dell [2]; Takahasi and Wakimoto [3]; and Dell and Clutter [4] published other studies on the RSS method. Due to its cost-effectiveness, it is used in a wide range of applications, including reliability, estimation of population parameters, statistical quality control, medicine, acceptance sampling plans, and so on (see Chen et al. [5][6][7], Haq et al. [8], and Al-Omari and Haq [9,10]).
It is well documented that the RSS method is an attractive procedure that is well suited to the nature of the underlying data.
This advantage motivated researchers to propose and study new RSS schemes (see, for example, the double ranked set by Al-Saleh and Al-Kadiri [11]; median ranked set sampling by Muttlak [12]; neoteric ranked set sampling by Zamanzade and Al-Omari [13]; extreme ranked set sampling by Samawi et al. [14]; and L ranked set sampling by Al-Nasser [15]). For a good review of RSS and more motivation for this method, one can refer to Al-Omari and Bouza [16]; Al-Hadhrami and Al-Omari [17]; Haq et al. [18]; Santiago et al. [19]; and Haq et al. [20]. The problem of estimating distribution parameters, in general, has been considered by many authors. For example, Nassar et al. [21] considered parameter estimation for the new extension of the Weibull distribution. Nassar et al. [22] treated the estimation problem for the alpha power exponential distribution. Later, Afify and Mohamed [23] and Afify et al. [25] dealt with parameter estimation for the new three-parameter exponential distribution and the Weibull Marshall-Olkin Lindley distribution, respectively. Recently, Alfaer et al. (2021) considered the extended log-logistic distribution and estimated its parameters. Although these works considered parameter estimation in related distributions, they did not deal with sampling design techniques.
Undoubtedly, parametric estimation under a sampling design plays a vital role in statistical inference. Many studies have considered the estimation of parameters based on RSS designs and their extensions using different estimation methods. Yousef and Al-Subh [26] estimated the Gumbel parameters using the maximum likelihood method, the method of moments, and the method of regression. Hussian [27] used the Bayesian and maximum likelihood estimation methods to estimate the Kumaraswamy distribution parameters. Chen et al. [28] estimated the scale parameter for the scale distribution using moving extreme ranked set sampling, and Abu-Dayyeh et al. [29] considered the logistic method for parameter estimation based on both SRS and RSS. Pedroso et al. [30] considered RSS to estimate the parameters of the two-parameter Birnbaum-Saunders distribution, and Akgul et al. [31] used the same RSS in system reliability estimation for the generalized inverse Lindley distribution (see also Taconeli and Bonat [32] for some estimation methods based on RSS). For more details on parameter estimation, readers are referred to Stokes [33]; Dey et al. [34]; Khamnei and Mayan [35]; and Al-Saleh and Al-Hadhrami [36].
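This paper aims to study the performance of the RSS design in estimating the parameters of the generalized quasi-Lindley distribution (GQLD) introduced recently by Benchiha and Al-Omari [37]. A random variable X is said to follow a GQLD with parameters θ and α if its pdf is given by
$$f(x;\alpha,\theta)=\frac{\alpha^{2}x\left(\alpha^{2}x^{2}+6\theta\alpha x+6\theta^{2}\right)}{6(\theta+1)^{2}}\,e^{-\alpha x},\quad x>0,\ \theta>-1,\ \alpha>0, \tag{1}$$
with cdf given by
$$F(x;\alpha,\theta)=1-\frac{\alpha^{3}x^{3}+3(2\theta+1)\alpha^{2}x^{2}+6(\theta+1)^{2}\alpha x+6(\theta+1)^{2}}{6(\theta+1)^{2}}\,e^{-\alpha x}. \tag{2}$$
The first two moments of X, respectively, are
$$E(X)=\frac{2(\theta+2)}{\alpha(\theta+1)},\qquad E(X^{2})=\frac{2\left(3\theta^{2}+12\theta+10\right)}{\alpha^{2}(\theta+1)^{2}}.$$
The variance of the GQLD is given by
$$\mathrm{Var}(X)=\frac{2\left(\theta^{2}+4\theta+2\right)}{\alpha^{2}(\theta+1)^{2}}.$$
The corresponding reliability and hazard functions of the GQLD, for x > 0, θ > −1, α > 0, are given, respectively, by
$$R_{GQLD}(x;\alpha,\theta)=\frac{\alpha^{3}x^{3}+3(2\theta+1)\alpha^{2}x^{2}+6(\theta+1)^{2}\alpha x+6(\theta+1)^{2}}{6(\theta+1)^{2}}\,e^{-\alpha x},$$
$$h_{GQLD}(x;\alpha,\theta)=\frac{\alpha^{2}x\left(\alpha^{2}x^{2}+6\theta\alpha x+6\theta^{2}\right)}{\alpha^{3}x^{3}+3(2\theta+1)\alpha^{2}x^{2}+6(\theta+1)^{2}\alpha x+6(\theta+1)^{2}}.$$
To the best of our knowledge, there are no published papers that use RSS in estimating the parameters of the GQLD. The remainder of this paper is organized as follows.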
The RSS method is explained and the various suggested estimators for the GQLD are given in Section 2. In Section 3, a simulation study is provided to investigate the performance of the RSS estimators relative to their SRS counterparts based on the same number of measured units. Applications to real datasets fitted to the GQLD are given in Section 4. The paper ends in Section 5 with concluding remarks and suggestions for future work.
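As a numerical companion to the reliability function stated above, the following Python sketch evaluates it together with the quantities that follow from it (cdf = 1 − R, pdf = −dR/dx, hazard = pdf/R). The function names are illustrative and are not taken from the original article; only the standard library is used.

```python
import math


def gqld_reliability(x, alpha, theta):
    """Reliability R(x; alpha, theta) of the GQLD for x >= 0."""
    t = alpha * x
    c = 6.0 * (theta + 1.0) ** 2
    poly = t ** 3 + 3.0 * (2.0 * theta + 1.0) * t ** 2 + c * t + c
    return poly * math.exp(-t) / c


def gqld_cdf(x, alpha, theta):
    """Distribution function, obtained as 1 - R."""
    return 1.0 - gqld_reliability(x, alpha, theta)


def gqld_pdf(x, alpha, theta):
    """Density, obtained by differentiating the reliability: f = -dR/dx."""
    t = alpha * x
    c = 6.0 * (theta + 1.0) ** 2
    return alpha * t * (t * t + 6.0 * theta * t + 6.0 * theta ** 2) * math.exp(-t) / c


def gqld_hazard(x, alpha, theta):
    """Hazard rate f / R."""
    return gqld_pdf(x, alpha, theta) / gqld_reliability(x, alpha, theta)


if __name__ == "__main__":
    # Finite-difference check that the pdf matches -dR/dx at one point.
    x, a, th, h = 1.3, 0.8, 1.5, 1e-5
    slope = (gqld_reliability(x - h, a, th) - gqld_reliability(x + h, a, th)) / (2 * h)
    print(slope, gqld_pdf(x, a, th))
```

The finite-difference comparison in the `__main__` guard is only a sanity check on the algebra; it does not replace the analytical expressions.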

Methods of Estimation
In this section, six methods of estimation are considered for estimating the unknown parameters θ and α of the GQLD using the RSS design. These methods are the maximum likelihood method, method of maximum product of spacings, ordinary least squares method, weighted least squares method, Cramer-von Mises method, and Anderson-Darling method. The RSS strategy can be described as follows:
(1) Select a simple random sample of $k^{2}$ units from the target population and randomly partition them into k sets, each of size k, where k is known as the set size.
(2) Rank the units within each set of size k from smallest to largest with respect to the variable of interest.
(3) Select the ith ranked unit from the ith set, i = 1, 2, . . . , k, for actual measurement.
(4) Repeat steps (1)-(3) n times (cycles) to obtain a ranked set sample of size N = nk.
The resulting sample is denoted as $\{X_{[i]j},\ i=1,2,\ldots,k;\ j=1,2,\ldots,n\}$, where $X_{[i]j}$ is the ith ranked unit (the ith order statistic) in a set of size k in the jth cycle. Perfect ranking is assumed in this study.
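The procedure above (draw k sets of k units per cycle, rank each set, and keep the ith ranked unit of the ith set) can be sketched in Python as follows. This is a minimal illustration under perfect ranking, not the authors' implementation. The helper `draw_gqld` is hypothetical: it assumes θ ≥ 0 and uses the observation that the reliability function in Section 1 coincides with that of a mixture of gamma distributions with shapes 2, 3, 4 and common rate α, with weights proportional to θ², 2θ, and 1.

```python
import random


def draw_gqld(alpha, theta, rng):
    """One GQLD draw via a gamma mixture (assumption: theta >= 0)."""
    shape = rng.choices([2, 3, 4], weights=[theta ** 2, 2.0 * theta, 1.0])[0]
    # random.gammavariate takes (shape, scale); the scale is 1 / rate.
    return rng.gammavariate(shape, 1.0 / alpha)


def rss_sample(draw, k, n, rng):
    """Ranked set sample with set size k and n cycles (perfect ranking).

    `draw(rng)` must return one observation from the population.
    Returns a list of n cycles, each holding k measured units.
    """
    sample = []
    for _ in range(n):                      # cycles
        cycle = []
        for i in range(k):                  # i-th set within the cycle
            ranked = sorted(draw(rng) for _ in range(k))
            cycle.append(ranked[i])         # keep the (i+1)-th smallest unit
        sample.append(cycle)
    return sample


if __name__ == "__main__":
    rng = random.Random(2021)
    data = rss_sample(lambda r: draw_gqld(1.0, 1.0, r), k=5, n=3, rng=rng)
    for cycle in data:
        print([round(v, 3) for v in cycle])
```

Each cycle consumes k² draws but measures only k of them, which is where the cost saving of RSS comes from when ranking is cheap and measurement is expensive.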

Maximum Likelihood Estimation.
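Let $\{X_{[i]j},\ i=1,2,\ldots,k;\ j=1,2,\ldots,r\}$ denote the ith order statistic from the ith set of size k in the jth cycle, and take it as the RSS data for X with sample size n = kr. Then, the likelihood function based on the RSS sample is given by
$$L_{RSS}(\alpha,\theta)=\prod_{j=1}^{r}\prod_{i=1}^{k}\frac{k!}{(i-1)!\,(k-i)!}\left[F(x_{[i]j})\right]^{i-1}\left[1-F(x_{[i]j})\right]^{k-i}f(x_{[i]j}),$$
where $f(\cdot)=f(\cdot;\alpha,\theta)$ and $F(\cdot)=F(\cdot;\alpha,\theta)$ are the pdf and cdf of the GQLD. The log-likelihood function, $\ell_{RSS}=\ln L_{RSS}(\alpha,\theta)$, is
$$\ell_{RSS}=C+\sum_{j=1}^{r}\sum_{i=1}^{k}\left[(i-1)\ln F(x_{[i]j})+(k-i)\ln\left(1-F(x_{[i]j})\right)+\ln f(x_{[i]j})\right],$$
where C is a constant that does not depend on the parameters. The estimators of α and θ of the GQLD using RSS can be obtained by solving the nonlinear equations
$$\frac{\partial \ell_{RSS}}{\partial\alpha}=0,\qquad \frac{\partial \ell_{RSS}}{\partial\theta}=0.$$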
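To make the RSS likelihood concrete, the sketch below evaluates the log-likelihood of a perfectly ranked sample as a sum of log order-statistic densities, using the GQLD pdf and reliability implied by Section 1. The function names and the crude grid search are illustrative assumptions; in practice a numerical optimizer would be used, and the code assumes θ ≥ 0 and strictly positive data.

```python
import math


def gqld_reliability(x, alpha, theta):
    t = alpha * x
    c = 6.0 * (theta + 1.0) ** 2
    return (t ** 3 + 3.0 * (2.0 * theta + 1.0) * t ** 2 + c * t + c) * math.exp(-t) / c


def gqld_pdf(x, alpha, theta):
    t = alpha * x
    c = 6.0 * (theta + 1.0) ** 2
    return alpha * t * (t * t + 6.0 * theta * t + 6.0 * theta ** 2) * math.exp(-t) / c


def rss_loglik(sample, alpha, theta):
    """Log-likelihood of an RSS sample under perfect ranking.

    sample[j][i - 1] is the measured unit of rank i in cycle j;
    the set size k is the length of each cycle.
    """
    k = len(sample[0])
    total = 0.0
    for cycle in sample:
        for i, x in enumerate(cycle, start=1):
            rel = gqld_reliability(x, alpha, theta)
            # log of k! / ((i - 1)! (k - i)!)
            log_const = math.lgamma(k + 1) - math.lgamma(i) - math.lgamma(k - i + 1)
            total += (log_const
                      + (i - 1) * math.log(1.0 - rel)
                      + (k - i) * math.log(rel)
                      + math.log(gqld_pdf(x, alpha, theta)))
    return total


def grid_mle(sample, alphas, thetas):
    """Crude maximizer: best (alpha, theta) pair on a user-supplied grid."""
    return max(((a, t) for a in alphas for t in thetas),
               key=lambda p: rss_loglik(sample, p[0], p[1]))
```

A grid is used here only to keep the example dependency-free; the score equations above have no closed-form solution, so any standard numerical maximizer applied to `rss_loglik` serves the same purpose.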

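Method of Maximum Product of Spacings.
Cheng and Amin [38,39] introduced the method of maximum product of spacings (MPS). This method is based on maximizing the geometric mean of the spacings in the data. The MPS estimator is consistent and efficient in most general cases. Consider the ordered units $y_{(1:N)}, y_{(2:N)}, \ldots, y_{(N:N)}$ that form a ranked set sample of size N = nk from the GQLD, where n is the number of cycles and k is the set size. Then, the uniform spacings are given by
$$D_{i}(\alpha,\theta)=F(y_{(i:N)}\mid\alpha,\theta)-F(y_{(i-1:N)}\mid\alpha,\theta),\quad i=1,2,\ldots,N+1,$$
where $F(y_{(0:N)}\mid\alpha,\theta)=0$ and $F(y_{(N+1:N)}\mid\alpha,\theta)=1$. Clearly, $\sum_{i=1}^{N+1}D_{i}(\alpha,\theta)=1$. The geometric mean of the spacings is
$$G(\alpha,\theta)=\left[\prod_{i=1}^{N+1}D_{i}(\alpha,\theta)\right]^{1/(N+1)}. \tag{11}$$
The natural logarithm of (11) is
$$H(\alpha,\theta)=\frac{1}{N+1}\sum_{i=1}^{N+1}\ln D_{i}(\alpha,\theta).$$
The estimators $\hat{\alpha}_{MPS}$ and $\hat{\theta}_{MPS}$ of the parameters α and θ, respectively, maximize $H(\alpha,\theta)$ and can also be obtained by solving the nonlinear equations
$$\frac{1}{N+1}\sum_{i=1}^{N+1}\frac{\Psi_{s}(y_{(i:N)}\mid\alpha,\theta)-\Psi_{s}(y_{(i-1:N)}\mid\alpha,\theta)}{D_{i}(\alpha,\theta)}=0,\quad s=1,2,$$
where
$$\Psi_{1}(y\mid\alpha,\theta)=\frac{\partial}{\partial\alpha}F(y\mid\alpha,\theta), \tag{14}$$
$$\Psi_{2}(y\mid\alpha,\theta)=\frac{\partial}{\partial\theta}F(y\mid\alpha,\theta), \tag{15}$$
with $\Psi_{s}(y_{(0:N)}\mid\alpha,\theta)=\Psi_{s}(y_{(N+1:N)}\mid\alpha,\theta)=0$. The estimates can then be obtained numerically.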
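The MPS criterion can be illustrated with a short Python sketch that computes the mean log-spacing of the fitted cdf over the ordered sample, with the conventions F(y_(0:N)) = 0 and F(y_(N+1:N)) = 1. The function names and the small floor used to guard against tied observations are assumptions made for this example; the cdf is the one implied by the reliability function in Section 1.

```python
import math


def gqld_cdf(x, alpha, theta):
    t = alpha * x
    c = 6.0 * (theta + 1.0) ** 2
    poly = t ** 3 + 3.0 * (2.0 * theta + 1.0) * t ** 2 + c * t + c
    return 1.0 - poly * math.exp(-t) / c


def mps_objective(data, alpha, theta):
    """Mean log-spacing (log of the geometric mean of spacings); larger is better."""
    y = sorted(data)
    # Pad the fitted cdf values with 0 and 1 at the two ends.
    cdf_vals = [0.0] + [gqld_cdf(v, alpha, theta) for v in y] + [1.0]
    spacings = [b - a for a, b in zip(cdf_vals, cdf_vals[1:])]
    # Tied observations give zero spacings; floor them to keep the log finite.
    return sum(math.log(max(d, 1e-300)) for d in spacings) / len(spacings)


if __name__ == "__main__":
    toy = [0.5, 1.0, 2.0, 4.0]   # toy values for illustration only
    print(mps_objective(toy, 1.0, 1.0))
```

Because the N + 1 spacings sum to one, the objective can never exceed ln(1/(N + 1)), the value reached when all spacings are equal; this gives a simple sanity bound when checking an implementation.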

Methods of Least Squares.
Swain et al. [40] were the first to use the method of least squares to estimate the parameters of the beta distribution. The method rests on a well-known result in probability theory, namely that $F(X_{(i:N)})\sim \mathrm{Beta}(i, N-i+1)$, where F is a cumulative distribution function and $X_{(i:N)}$ is the ith order statistic of the random sample $(X_{1}, X_{2}, \ldots, X_{N})$. Therefore, in our case, we have
$$E\left[F(y_{(i:N)})\right]=\frac{i}{N+1},\qquad \mathrm{Var}\left[F(y_{(i:N)})\right]=\frac{i(N-i+1)}{(N+1)^{2}(N+2)}.$$
Using the above expectations and variances, we obtain two variants of the least squares method.

Ordinary Least Squares.
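Let the ordered units $y_{(1:N)}, y_{(2:N)}, \ldots, y_{(N:N)}$ constitute a ranked set sample of size N = nk. Then, the ordinary least squares (OLS) estimators, say $\hat{\alpha}_{OLS}$ and $\hat{\theta}_{OLS}$, of the parameters α and θ, respectively, can be obtained by minimizing the function
$$S(\alpha,\theta)=\sum_{i=1}^{N}\left[F(y_{(i:N)}\mid\alpha,\theta)-\frac{i}{N+1}\right]^{2}$$
with respect to α and θ. Alternatively, these estimates can also be obtained by solving the following nonlinear equations:
$$\sum_{i=1}^{N}\left[F(y_{(i:N)}\mid\alpha,\theta)-\frac{i}{N+1}\right]\Psi_{s}(y_{(i:N)}\mid\alpha,\theta)=0,\quad s=1,2,$$
where $\Psi_{1}(y_{(i:N)}\mid\alpha,\theta)$ and $\Psi_{2}(y_{(i:N)}\mid\alpha,\theta)$ are defined as in (14) and (15), respectively.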

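Weighted Least Squares.
Consider the RSS units $y_{(1:N)}, y_{(2:N)}, \ldots, y_{(N:N)}$ that form a ranked set sample of size N = nk. Then, the weighted least squares (WLS) estimators of α and θ, say $\hat{\alpha}_{WLS}$ and $\hat{\theta}_{WLS}$, respectively, can be obtained by minimizing the following function:
$$W(\alpha,\theta)=\sum_{i=1}^{N}\frac{(N+1)^{2}(N+2)}{i(N-i+1)}\left[F(y_{(i:N)}\mid\alpha,\theta)-\frac{i}{N+1}\right]^{2}$$
with respect to α and θ. Equivalently, the estimates are the solution of the following nonlinear equations:
$$\sum_{i=1}^{N}\frac{(N+1)^{2}(N+2)}{i(N-i+1)}\left[F(y_{(i:N)}\mid\alpha,\theta)-\frac{i}{N+1}\right]\Psi_{s}(y_{(i:N)}\mid\alpha,\theta)=0,\quad s=1,2,$$
where $\Psi_{1}(y_{(i:N)}\mid\alpha,\theta)$ and $\Psi_{2}(y_{(i:N)}\mid\alpha,\theta)$ are specified as in (14) and (15), respectively.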
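The two least squares criteria differ only in their weights, which the following Python sketch makes explicit. It compares the fitted cdf at each ordered observation with the Beta mean i/(N + 1), and the weighted version uses the reciprocal of the Beta variance as the weight. The function names are illustrative, and the cdf is the one implied by the reliability function in Section 1.

```python
import math


def gqld_cdf(x, alpha, theta):
    t = alpha * x
    c = 6.0 * (theta + 1.0) ** 2
    poly = t ** 3 + 3.0 * (2.0 * theta + 1.0) * t ** 2 + c * t + c
    return 1.0 - poly * math.exp(-t) / c


def ols_objective(data, alpha, theta):
    """Sum of squared gaps between F(y_(i)) and its expectation i / (N + 1)."""
    y = sorted(data)
    n_obs = len(y)
    return sum((gqld_cdf(v, alpha, theta) - i / (n_obs + 1)) ** 2
               for i, v in enumerate(y, start=1))


def wls_objective(data, alpha, theta):
    """Same gaps, each weighted by 1 / Var[F(y_(i))] from the Beta result."""
    y = sorted(data)
    n_obs = len(y)
    total = 0.0
    for i, v in enumerate(y, start=1):
        weight = (n_obs + 1) ** 2 * (n_obs + 2) / (i * (n_obs - i + 1))
        total += weight * (gqld_cdf(v, alpha, theta) - i / (n_obs + 1)) ** 2
    return total
```

The weights are largest for the smallest and largest order statistics, whose transformed values have the smallest variance, so WLS penalizes misfit in the tails more heavily than OLS does.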

Methods of Minimum Distances.
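Methods of estimation based on minimizing well-known goodness-of-fit statistics are useful in many cases and give good results. Here, two popular methods based on minimizing test statistics that measure the distance between the theoretical and empirical cumulative distribution functions are considered. The methods are the Cramer-von Mises method and the Anderson-Darling method (for more details, see D'Agostino and Stephens [41] and Luceño [42]). Let $y_{(1:N)}, y_{(2:N)}, \ldots, y_{(N:N)}$ be the ordered ranked set sample of size N = nk. Then, the Cramer-von Mises (CV) estimators $\hat{\alpha}_{CV}$ and $\hat{\theta}_{CV}$ of α and θ, respectively, are obtained by minimizing the function
$$C(\alpha,\theta)=\frac{1}{12N}+\sum_{i=1}^{N}\left[F(y_{(i:N)}\mid\alpha,\theta)-\frac{2i-1}{2N}\right]^{2}$$
with respect to α and θ.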

Anderson-Darling Method.
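Suppose that $y_{(1:N)}, y_{(2:N)}, \ldots, y_{(N:N)}$ is a ranked set sample of size N = nk. Then, the estimates based on the Anderson-Darling (AD) method for the GQLD parameters α and θ, denoted by $\hat{\alpha}_{AD}$ and $\hat{\theta}_{AD}$, can be obtained by minimizing the function
$$A(\alpha,\theta)=-N-\frac{1}{N}\sum_{i=1}^{N}(2i-1)\left[\ln F(y_{(i:N)}\mid\alpha,\theta)+\ln\left(1-F(y_{(N+1-i:N)}\mid\alpha,\theta)\right)\right]$$
with respect to α and θ, or equivalently by solving the following two equations:
$$\sum_{i=1}^{N}(2i-1)\left[\frac{\Psi_{s}(y_{(i:N)}\mid\alpha,\theta)}{F(y_{(i:N)}\mid\alpha,\theta)}-\frac{\Psi_{s}(y_{(N+1-i:N)}\mid\alpha,\theta)}{1-F(y_{(N+1-i:N)}\mid\alpha,\theta)}\right]=0,\quad s=1,2,$$
where $\Psi_{1}(y_{(i:N)}\mid\alpha,\theta)$ and $\Psi_{2}(y_{(i:N)}\mid\alpha,\theta)$ are as specified in (14) and (15), respectively.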
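The two minimum distance criteria can be written compactly in Python. The sketch below uses the standard computational forms of the Cramer-von Mises and Anderson-Darling statistics applied to the fitted GQLD cdf; the function names are illustrative, the data are assumed strictly positive, and the reliability is used directly for 1 − F to avoid cancellation in the upper tail.

```python
import math


def gqld_reliability(x, alpha, theta):
    t = alpha * x
    c = 6.0 * (theta + 1.0) ** 2
    return (t ** 3 + 3.0 * (2.0 * theta + 1.0) * t ** 2 + c * t + c) * math.exp(-t) / c


def cvm_objective(data, alpha, theta):
    """Cramer-von Mises distance between the fitted and empirical cdfs."""
    y = sorted(data)
    n_obs = len(y)
    total = 1.0 / (12.0 * n_obs)
    for i, v in enumerate(y, start=1):
        cdf = 1.0 - gqld_reliability(v, alpha, theta)
        total += (cdf - (2 * i - 1) / (2.0 * n_obs)) ** 2
    return total


def ad_objective(data, alpha, theta):
    """Anderson-Darling distance; pairs y_(i) with y_(N+1-i)."""
    y = sorted(data)
    n_obs = len(y)
    total = 0.0
    for i in range(1, n_obs + 1):
        log_cdf = math.log(1.0 - gqld_reliability(y[i - 1], alpha, theta))
        log_rel = math.log(gqld_reliability(y[n_obs - i], alpha, theta))
        total += (2 * i - 1) * (log_cdf + log_rel)
    return -n_obs - total / n_obs
```

Either function can be handed to a numerical minimizer over (α, θ); the Anderson-Darling version gives more weight to discrepancies in the tails because of the logarithms.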

Simulation
In order to evaluate the performance of the estimation methods under RSS, a simulation study is conducted using R software. A total of 1000 samples are generated from the GQLD with parameter values (1, 1), (1, 3), (0.5, 1), and (0.8, 1.5) and with different sample sizes for both RSS and SRS. For the RSS design, the number of cycles is n = 3, 4, and 5, while the set size k is taken as 5, 10, and 15.
In each case, we combine each n with all values of k to study the effect of the set size and the number of cycles. For the SRS design, the sample size is N = nk, so that both designs use the same number of measured units. Perfect ranking is assumed. The mean squared error (MSE) is calculated for each estimator in order to compare SRS and RSS. For an estimator $\hat{\delta}$ of a parameter $\delta\in\{\alpha,\theta\}$, the MSE and the efficiency (Eff) are calculated by
$$\mathrm{MSE}(\hat{\delta})=\frac{1}{1000}\sum_{t=1}^{1000}\left(\hat{\delta}_{t}-\delta\right)^{2},\qquad \mathrm{Eff}=\frac{\mathrm{MSE}(\hat{\delta}_{SRS})}{\mathrm{MSE}(\hat{\delta}_{RSS})},$$
so that Eff > 1 indicates that the RSS estimator is more efficient. The results are reported in Tables 1-8.
(iv) The MSE of the SRS estimators decreases as N = nk increases.
(v) In most cases, the efficiency is increasing when k is increasing. For instance, from
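The MSE and efficiency computations used in the simulation can be illustrated with a small Monte Carlo sketch in Python. To keep the example short and dependency-free, it estimates the population mean by the sample mean instead of fitting α and θ, so it is a stand-in for the six estimators of Section 2 and does not reproduce the results in Tables 1-8. The sampler assumes θ ≥ 0 and the gamma-mixture form implied by the reliability function (shapes 2, 3, 4 with weights proportional to θ², 2θ, 1), under which the mean is 2(θ + 2)/(α(θ + 1)). All function names are hypothetical.

```python
import random
import statistics


def draw_gqld(alpha, theta, rng):
    """One GQLD draw via a gamma mixture (assumption: theta >= 0)."""
    shape = rng.choices([2, 3, 4], weights=[theta ** 2, 2.0 * theta, 1.0])[0]
    return rng.gammavariate(shape, 1.0 / alpha)


def rss_sample(alpha, theta, k, n, rng):
    """Flat ranked set sample of size n * k under perfect ranking."""
    out = []
    for _ in range(n):
        for i in range(k):
            ranked = sorted(draw_gqld(alpha, theta, rng) for _ in range(k))
            out.append(ranked[i])
    return out


def mse(estimates, true_value):
    """Mean squared error of a list of replicated estimates."""
    return sum((e - true_value) ** 2 for e in estimates) / len(estimates)


def efficiency_of_mean(alpha, theta, k, n, reps, seed=1):
    """Return (MSE_SRS, MSE_RSS, Eff) for the sample mean, Eff = MSE_SRS / MSE_RSS."""
    rng = random.Random(seed)
    true_mean = 2.0 * (theta + 2.0) / (alpha * (theta + 1.0))
    srs_est, rss_est = [], []
    for _ in range(reps):
        srs = [draw_gqld(alpha, theta, rng) for _ in range(n * k)]
        srs_est.append(statistics.fmean(srs))
        rss_est.append(statistics.fmean(rss_sample(alpha, theta, k, n, rng)))
    mse_srs, mse_rss = mse(srs_est, true_mean), mse(rss_est, true_mean)
    return mse_srs, mse_rss, mse_srs / mse_rss


if __name__ == "__main__":
    print(efficiency_of_mean(1.0, 1.0, k=5, n=3, reps=1000))
```

Replacing the sample mean with any of the estimators from Section 2 gives the structure of the study described above: the same replication loop, one MSE per design, and their ratio as the efficiency.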

Application to Real Datasets
In this section, we illustrate the performance of the suggested estimators based on the RSS design for two well-known real datasets. The first dataset presents the survival times (in days) of 72 guinea pigs infected with virulent tubercle bacilli, observed and reported by Bjerkedal [43]. The data were previously studied by Afify et al. [44]. The data observations are
First, we fitted the GQLD model to both datasets. Then, we considered the Kolmogorov-Smirnov (KS) test with its p value to quantify the distance between the empirical distribution function of the real data and the cumulative distribution function evaluated at the estimated parameters in each dataset.
The results are summarized in Table 9. As shown in this table, the p value for each dataset is greater than 5%, which indicates that the GQLD model fits both datasets well.
To show the superiority of RSS over SRS using the different estimation methods, we considered the Kolmogorov-Smirnov (KS) test again, this time to quantify the distance between the empirical distribution function of the real data and the cumulative distribution function evaluated at the estimated parameters under each design, based on the choice of n and k. Note that we used the KS distance here as an alternative to the mean squared error and relative bias, and it is defined in our case as
$$KS=\sup_{x}\left|F_{n}(x)-F(x;\hat{\alpha},\hat{\theta})\right|,$$
where $F_{n}$ is the empirical distribution function and n is the sample size. Estimators with lower KS values and higher p values (greater than 5%) are better than their competitors. Recall that the ML estimates based on SRS for all datasets are treated as the true population parameters.
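The KS distance between the empirical distribution function and a fitted GQLD cdf can be computed with the usual two-sided formula over the ordered observations, as in the Python sketch below. The function names are illustrative, the cdf is the one implied by the reliability function in Section 1, and the p value is deliberately omitted because it requires the Kolmogorov distribution, which is not covered here.

```python
import math


def gqld_cdf(x, alpha, theta):
    t = alpha * x
    c = 6.0 * (theta + 1.0) ** 2
    poly = t ** 3 + 3.0 * (2.0 * theta + 1.0) * t ** 2 + c * t + c
    return 1.0 - poly * math.exp(-t) / c


def ks_distance(data, alpha, theta):
    """sup_x |F_n(x) - F(x; alpha, theta)| for the fitted GQLD."""
    y = sorted(data)
    n_obs = len(y)
    dist = 0.0
    for i, v in enumerate(y, start=1):
        cdf = gqld_cdf(v, alpha, theta)
        # The empirical cdf jumps from (i - 1)/n to i/n at y_(i),
        # so both sides of the jump must be checked.
        dist = max(dist, i / n_obs - cdf, cdf - (i - 1) / n_obs)
    return dist
```

Evaluating `ks_distance` on the full dataset at the SRS-based and RSS-based estimates gives the kind of comparison described above: the design whose estimates yield the smaller distance fits the data better.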
For the first dataset, we considered the SRS design and, for each estimation method, calculated the estimates using a sample of size nk = 12. Then, we used n = 2 and k = 6 for calculating the estimates under the RSS design, based on the cycles shown in Table 10, and compared the SRS and RSS designs in terms of the KS distance and p value. The results are given in Table 11, and the corresponding fits are displayed in Figure 1.
For the second dataset, the sample size is nk = 8 in the SRS design, while we used n = 2 and k = 4 for calculating the estimates under the RSS design using the cycles in Table 12. The estimates, KS distances, and p values are summarized in Table 13, and the corresponding fits are displayed in Figure 2.
The results in Tables 11 and 13 indicate that, for the estimates based on the RSS design, the KS distances are smaller than their SRS counterparts and the corresponding p values are larger than those obtained under the SRS design. Figures 1 and 2 support this claim.

Conclusion
In this paper, RSS-based estimation is presented for the GQLD. Six estimation methods are considered, including the maximum likelihood, the maximum product of spacings, ordinary least squares, weighted least squares, Cramer-von Mises, and Anderson-Darling methods. The performance of the proposed estimators is compared with that of their SRS counterparts using a simulation study and two real data applications. The numerical simulation results demonstrate that the proposed RSS estimators are better than their SRS counterparts in terms of the MSE for all results presented in the tables, based on the same number of measured units. The real data results also confirm the superiority of the RSS design over the SRS design.
For future work, the authors are interested in modifying the GQLD to the transmuted GQLD (see, for example, [46]) and estimating its parameters using the modified robust extreme ranked set sampling [47].
Data Availability
The real datasets related to guinea pigs and ball bearings used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.