Estimation of Population Median under Robust Measures of an Auxiliary Variable

In this paper, a generalized class of estimators for the population median is proposed under simple random sampling without replacement (SRSWOR) through robust measures of the auxiliary variable. Three robust measures of an auxiliary variable are used: the decile mean, the Hodges–Lehmann estimator, and the trimean. Mathematical properties of the proposed estimators, such as bias, mean squared error (MSE), and minimum MSE, are derived up to the first order of approximation. We consider various real-life datasets and a simulation study to check the potential of the proposed estimators over their competitors. Robustness is also examined through a real dataset. Based on these results, researchers are encouraged to use the proposed estimators for the population median under SRSWOR.


Introduction
Extensive work has been done on the estimation of the population mean, proportion, variance, regression coefficient, and so forth, but very little attention has been paid to proposing efficient estimators of the median. In many situations, researchers deal with variables such as income, expenditure, taxes, consumption, and production, and these variables have highly skewed distributions. In such situations, the median is a considerably more appropriate measure of location than the mean. The problem of estimating the median under a simple random sampling scheme has been discussed by Gross [1], Sedransk and Meyer [2], and Smith and Sedransk [3]. Kuk and Mak [4] were the first authors to investigate the estimation of the median using auxiliary information. Following Kuk and Mak [4], Singh et al. [5], Aladag and Cingi [6], Solanki and Singh [7], Shabbir and Gupta [8], Baig et al. [9], and Shabbir et al. [10] developed different estimators of the finite population median based on known conventional measures of the auxiliary variable under different sampling schemes. A brief explanation of Kuk and Mak's [4] estimator is as follows.
Let $Y$ and $X$ be the study and the auxiliary variables selected from a finite population $\Theta = \{\Theta_1, \Theta_2, \Theta_3, \ldots, \Theta_N\}$ of size $N$ under simple random sampling without replacement (SRSWOR), subject to the constraint $n < N$. Further, let $Y_i$ and $X_i$, $i = 1, 2, \ldots, N$, and $y_i$ and $x_i$, $i = 1, 2, \ldots, n$, be the values of the $i$th units of the population and sample, respectively. Let $M_Y$ and $M_X$ be the population medians of the study and auxiliary variables, with the probability density functions at the medians given by $f_Y(M_Y)$ and $f_X(M_X)$, respectively. We further assume that $f_Y(M_Y)$ and $f_X(M_X)$ are positive.
Suppose that $y_{(1)} \le y_{(2)} \le y_{(3)} \le \cdots \le y_{(n)}$ are the $y$ values of the sample units in ascending order; furthermore, let $s$ be the integer such that $Y_{(s)} \le M_Y \le Y_{(s+1)}$, and let $p = s/n$ be the proportion of $Y$ values in the sample that are less than or equal to $M_Y$. Kuk and Mak [4] considered a two-way classification ($p_{ij}$) as given in Table 1. The quantity $\rho_c$ (or $\rho$), defined as $\rho_c = 4p_{11} - 1$, ranges from $-1$ to $+1$ as $p_{11}$ increases from 0 to 0.5, where $p_{11}$ is the proportion of units in the population with $X \le M_X$ and $Y \le M_Y$. Gross [1] proved that the sample median $\hat{M}_Y$ is consistent and asymptotically normally distributed with mean $M_Y$ and a variance that depends on the sampling fraction $f = n/N$.
The efficiency of ratio-, product-, and regression-type estimators is uncertain in the presence of extreme values or outliers in the dataset. In the present study, the problem under consideration is to estimate the median of a finite population and to suggest some generalized classes of estimators that utilize known robust measures of an auxiliary variable under SRSWOR. The novelty of this work is as follows:
(i) Robust measures (i.e., decile mean, Hodges–Lehmann estimator, and trimean) of an auxiliary variable are utilized for the first time to improve the estimation of the population median
(ii) A variety of estimators can be generated through the proposed generalized estimator
(iii) A robustness study is carried out to check the performance of the proposed generalized estimator in the presence of outliers
Relative error terms and further notations are used to obtain the mathematical properties, such as bias, mean squared error (MSE), and minimum MSE, of the various estimators; these include the relative errors $e_0$ and $e_1$, the population coefficient of variation of $X$, and the finite population correction (f.p.c.) factor.
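The quantity $\rho_c = 4p_{11} - 1$ can be computed directly from paired population values. The following is a minimal sketch in Python; the function name `median_correlation` and the toy data are illustrative assumptions, not part of the original study.

```python
import statistics


def median_correlation(x, y):
    """Return (p11, rho_c) for paired population values x and y.

    p11 is the proportion of units with X <= M_X and Y <= M_Y,
    and rho_c = 4 * p11 - 1, following the definition in the text.
    """
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    m_x = statistics.median(x)  # population median of the auxiliary variable
    m_y = statistics.median(y)  # population median of the study variable
    below_both = sum(1 for xi, yi in zip(x, y) if xi <= m_x and yi <= m_y)
    p11 = below_both / len(x)
    return p11, 4 * p11 - 1


# Toy data (hypothetical): y increases with x in the first call,
# and decreases with x in the second.
x = [1, 2, 3, 4, 5, 6]
p11_up, rho_up = median_correlation(x, [2, 4, 6, 8, 10, 12])
p11_down, rho_down = median_correlation(x, [12, 10, 8, 6, 4, 2])
```

With perfectly concordant data, half of the units fall at or below both medians, so $p_{11} = 0.5$ and $\rho_c = 1$; with perfectly discordant data, $p_{11} = 0$ and $\rho_c = -1$.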
The rest of the article is organized as follows. Section 2 gives comprehensive details of existing estimators for the population median. Section 3 proposes generalized classes of estimators for estimating the population median using robust measures of an auxiliary variable. The bias, mean squared error (MSE), and minimum MSE of the generalized classes of estimators are derived up to the first degree of approximation in the same section. Four real-life datasets and a simulation study are used in Section 4 to check the potential of the new estimators as compared to the existing ones. The robustness of the proposed estimators is evaluated using a real-life dataset in Section 5. Section 6 contains the concluding remarks and some recommendations.

Existing Median Estimators
The major drawback of the estimators suggested so far for the population median is that they are based on the usual conventional measures of an auxiliary variable. In this section, we discuss the usual and well-known estimators of the population median under SRSWOR as suggested by different authors.
Kuk and Mak [4] suggested a ratio-type estimator, $\hat{M}_R$, which assumes that the median of the $X$ variable is known, together with an expression for its mean squared error. An exponential ratio-type estimator, $\hat{M}_{EX}$, is also available for the median, with its MSE given up to the first degree of approximation. Singh [11] developed an unbiased difference estimator, $\hat{M}_D$, where $d$ is an unknown constant whose value needs to be determined; its minimum MSE is obtained up to the first degree of approximation. Rao [12] and Gupta et al. [13], respectively, suggested three difference-type estimators of the median, among them $\hat{M}_{D2}$ and $\hat{M}_{D3}$, whose minimum MSEs are obtained at the optimum values of their constants. Shabbir and Gupta [8] suggested a generalized difference-type estimator of the median, equation (14), where $d_7$ and $d_8$ are unknown constants whose values need to be determined, $a$ and $b$ are known population parameters, and $\alpha_1$, $\alpha_2$, and $c$ are scalar quantities.
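To make the ratio-type and exponential ratio-type estimators concrete, here is a minimal Python sketch. The displayed equations are not available in this text, so the forms used below, $\hat{M}_Y (M_X / \hat{M}_X)$ and $\hat{M}_Y \exp\{(M_X - \hat{M}_X)/(M_X + \hat{M}_X)\}$, are the forms commonly used in the literature and should be treated as assumptions; the function names are hypothetical.

```python
import math
import statistics


def ratio_median_estimate(y_sample, x_sample, m_x):
    """Ratio-type estimate of the population median of Y.

    Assumed form: sample median of y, scaled by M_X / (sample median of x),
    where m_x is the known population median of the auxiliary variable.
    """
    m_y_hat = statistics.median(y_sample)
    m_x_hat = statistics.median(x_sample)
    return m_y_hat * m_x / m_x_hat


def exp_ratio_median_estimate(y_sample, x_sample, m_x):
    """Exponential ratio-type estimate (assumed form, see lead-in)."""
    m_y_hat = statistics.median(y_sample)
    m_x_hat = statistics.median(x_sample)
    return m_y_hat * math.exp((m_x - m_x_hat) / (m_x + m_x_hat))


# Hypothetical sample whose x-median (2) understates the known M_X (4),
# so both estimators adjust the sample median of y (20) upwards.
y_s = [10, 20, 30]
x_s = [1, 2, 3]
ratio_est = ratio_median_estimate(y_s, x_s, m_x=4)
exp_est = exp_ratio_median_estimate(y_s, x_s, m_x=4)
```

Both functions return the plain sample median of $y$ when the sample median of $x$ equals $M_X$, which is the sense in which the auxiliary information acts as a correction.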

Remark 2.
By substitution of the scalar quantities, equation (14) becomes

Proposed Generalized Estimator
One notable disadvantage of the existing estimators and classes of estimators is that they are typically based on conventional measures. The efficiency of these estimators is uncertain in the presence of extreme values in the dataset. In this section, we define a generalized class of estimators for the population median using robust measures of an auxiliary variable in linear combination with nonconventional measures: quartile deviation, midrange, interquartile range, and quartile average. We include three robust measures: the decile mean suggested by Rana et al. [14], the Hodges–Lehmann estimator suggested by Hettmansperger and McKean [15], and the trimean suggested by Wang et al. [16]. For more details of these robust measures, see the works of Irfan et al. [17,18]. The generalized estimator of the population median is given in equation (17).

Mathematical Problems in Engineering
In the proposed estimator, $m_1$ and $m_2$ are suitably chosen constants, and $\alpha_i$ ($i = 3, 4$) takes on the values $1$, $-1$, $2$, $-2$ for designing new estimators. Note that $\psi$ and $\delta$ may be any constant values or functions of the known robust measures as well as nonconventional measures associated with the $X$ variable.

Remark 3.
Robust measures related to $X$ are the following: the decile mean, the Hodges–Lehmann estimator, and the trimean.

Remark 4.
The nonconventional measures of an auxiliary variable are the interquartile range, midrange, quartile average, and quartile deviation.

Remark 5.
By putting different values of $\alpha_i$ ($i = 3, 4$) in equation (17), we get the following families of estimators, one for each choice:
(i) $\alpha_3 = 1$ and $\alpha_4 = 2$
(ii) $\alpha_3 = -1$ and $\alpha_4 = -1$
(iii) $\alpha_3 = -1$ and $\alpha_4 = -2$
(iv) $\alpha_3 = 2$ and $\alpha_4 = 2$
(v) $\alpha_3 = -2$ and $\alpha_4 = -1$

Remark 6.
When we put robust measures of the auxiliary variable in linear combination with the median, quartile deviation, midrange, interquartile range, and quartile average of the auxiliary variable in equation (17), we obtain different series of estimators. Some members of the class of estimators are given in Table 2. Placing the same values of $\psi$ and $\delta$ in the other families, we obtain a number of further estimators.
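The robust and nonconventional measures named in Remarks 3 and 4 can be computed as in the following Python sketch. The definitions used are the standard ones (decile mean as the mean of the nine deciles, Hodges–Lehmann as the median of the Walsh averages, trimean as $(Q_1 + 2Q_2 + Q_3)/4$); the quantile convention (`method="inclusive"`) and all function names are assumptions made for illustration, since the source's own formulas are not shown here.

```python
import statistics
from itertools import combinations_with_replacement


def quartiles(x):
    """Q1, Q2, Q3 using the 'inclusive' quantile convention (an assumption)."""
    return tuple(statistics.quantiles(x, n=4, method="inclusive"))


def decile_mean(x):
    """Mean of the nine deciles D1..D9."""
    return statistics.fmean(statistics.quantiles(x, n=10, method="inclusive"))


def hodges_lehmann(x):
    """Median of the Walsh averages (x_i + x_j) / 2 over all pairs with i <= j."""
    return statistics.median(
        (a + b) / 2 for a, b in combinations_with_replacement(x, 2)
    )


def trimean(x):
    """Weighted average of the quartiles: (Q1 + 2*Q2 + Q3) / 4."""
    q1, q2, q3 = quartiles(x)
    return (q1 + 2 * q2 + q3) / 4


def nonconventional_measures(x):
    """Interquartile range, midrange, quartile average, quartile deviation."""
    q1, _, q3 = quartiles(x)
    return {
        "interquartile_range": q3 - q1,
        "midrange": (min(x) + max(x)) / 2,
        "quartile_average": (q1 + q3) / 2,
        "quartile_deviation": (q3 - q1) / 2,
    }


# Hypothetical auxiliary values, symmetric about 5.
x_values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
measures = nonconventional_measures(x_values)
```

The Hodges–Lehmann computation above enumerates all $N(N+1)/2$ pairs, which is adequate for small populations but grows quadratically with $N$.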

Remark 7.
Putting appropriate constants or known conventional parameters of the auxiliary variable in place of ψ and δ in equation (17), we can get many optimal estimators. Conventional parameters associated with auxiliary variable X are variance, standard deviation, coefficient of variation, coefficient of skewness, coefficient of kurtosis, coefficient of correlation, and so forth.

Bias, MSE, and Minimum MSE of $T_{i(d)}$
The suggested generalized class of estimators $T_{i(d)}$ is first expressed in terms of $e_0$ and $e_1$, giving equation (23). After some simplification, equation (23) reduces to equation (24), where $\vartheta = \psi M_X / (\psi M_X + \delta)$. Subtracting $M_Y$ from both sides of equation (24) gives equation (25). The bias of the proposed estimators $T_{i(d)}$ is defined as $\mathrm{Bias}(T_{i(d)}) = E(T_{i(d)} - M_Y)$; taking expectations on both sides of equation (25), we get the bias of the generalized class of estimators $T_{i(d)}$. The MSE of the proposed estimators $T_{i(d)}$ is defined as $\mathrm{MSE}(T_{i(d)}) = E(T_{i(d)} - M_Y)^2$; squaring both sides of equation (25) gives equation (29).

Taking expectations on both sides of equation (29), we get the MSE of the proposed estimators up to the first order of approximation, equation (30). Partially differentiating equation (30) with respect to $m_1$ and $m_2$ and equating the derivatives to zero gives the optimal values of $m_1$ and $m_2$. Placing these optimal values in equation (30) gives the minimum MSE.
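As a generic illustration of this optimization step, suppose the first-order MSE is a quadratic form in $m_1$ and $m_2$ as written below. The coefficients $A$, $B$, $C$, $D$, and $E$ are hypothetical placeholders standing in for the source's own terms, which are not recoverable from this text, so this is a sketch of the technique and not necessarily the paper's exact expression.

```latex
\begin{align*}
\mathrm{MSE}(T) &\approx M_Y^2\left[1 + m_1^2 A + m_2^2 B + 2 m_1 m_2 C - 2 m_1 D - 2 m_2 E\right],\\
\frac{\partial\,\mathrm{MSE}(T)}{\partial m_1} = 0 &\;\Rightarrow\; A m_1 + C m_2 = D,\\
\frac{\partial\,\mathrm{MSE}(T)}{\partial m_2} = 0 &\;\Rightarrow\; C m_1 + B m_2 = E,\\
m_1^{\mathrm{opt}} &= \frac{BD - CE}{AB - C^2},\qquad
m_2^{\mathrm{opt}} = \frac{AE - CD}{AB - C^2},\\
\mathrm{MSE}_{\min}(T) &\approx M_Y^2\left[1 - \frac{BD^2 + AE^2 - 2CDE}{AB - C^2}\right].
\end{align*}
```

The last line follows because, at the optimum, the quadratic part equals $m_1 D + m_2 E$, so the bracket reduces to $1 - (m_1^{\mathrm{opt}} D + m_2^{\mathrm{opt}} E)$. The solution requires $AB - C^2 \neq 0$, and it is a minimum when $A > 0$ and $AB - C^2 > 0$.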

Application
In this section, the $T_{i(d)}$ estimators are compared with the other existing estimators under study using real-life applications and simulated datasets.

Real-Life Application.
We evaluated the performance of the proposed class of estimators against the other competing estimators in terms of the MSE. For this purpose, we selected four real-life datasets:
Population 1: source: Singh [11].
$Y$ = number of fish caught in the year 1995; $X$ = number of fish caught by marine recreational fishermen in the previous year, 1994.
Population 2:
$Y$ = number of teaching staff; $X$ = number of enrolled students.
Population 3: source: Singh [11].
$Y$ = number of fish caught in the year 1995; $X$ = number of fish caught by marine recreational fishermen in the year 1993.
Population 4:
$Y$ = number of households; $X$ = area in square miles.
Table 3 presents the detailed descriptions of each of the abovementioned populations.
We calculated the MSE and minimum MSE of all the estimators. For the dataset considered (see [19]), some important measures are noted, and the following steps are used to carry out the simulation study:
Step 1: select a SRSWOR of size $n$ from the population of size $N$
Step 2: use the sample data from step 1 to find the MSE of all the existing and proposed estimators
Step 3: repeat steps 1 and 2 for 20,000 iterations
Step 4: collect the 20,000 values of the MSE of each existing and proposed estimator
Step 5: take the average of the 20,000 values obtained in step 4 to get the simulated MSE of each estimator
The results reveal the following:
(ii) The minimum MSE of all the proposed estimators is smaller than that of all the existing estimators under study
(iii) As the sample size increases, the minimum MSE of all the proposed estimators decreases
It is concluded that our generalized estimator performs best in the presence of extreme value(s).
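The simulation steps can be sketched in Python as follows. This is a simplified variant under stated assumptions: instead of evaluating an MSE formula on each sample, it averages the squared error of each estimate around the known population median (an empirical MSE), and it compares only the sample median with the assumed ratio-type form $\hat{M}_Y (M_X / \hat{M}_X)$. The function name `simulate_mse` and the toy population are hypothetical.

```python
import random
import statistics


def simulate_mse(x_pop, y_pop, n, iterations=20000, seed=0):
    """Empirical MSE of two median estimators under SRSWOR.

    Returns a dict with the average squared error of the sample median
    ("median") and of the assumed ratio-type estimator ("ratio").
    Assumes the auxiliary values are positive so the ratio is defined.
    """
    big_n = len(y_pop)
    if len(x_pop) != big_n or not 0 < n < big_n:
        raise ValueError("need paired populations and 0 < n < N")
    rng = random.Random(seed)
    m_y = statistics.median(y_pop)  # target: population median of Y
    m_x = statistics.median(x_pop)  # known population median of X
    total = {"median": 0.0, "ratio": 0.0}
    for _ in range(iterations):
        # Step 1: draw a simple random sample without replacement.
        idx = rng.sample(range(big_n), n)
        # Step 2: compute the estimates and their squared errors.
        m_y_hat = statistics.median(y_pop[i] for i in idx)
        m_x_hat = statistics.median(x_pop[i] for i in idx)
        total["median"] += (m_y_hat - m_y) ** 2
        total["ratio"] += (m_y_hat * m_x / m_x_hat - m_y) ** 2
    # Steps 3 to 5: average over all iterations.
    return {name: value / iterations for name, value in total.items()}


# Toy population (hypothetical) in which Y is exactly proportional to X.
x_population = list(range(1, 51))
y_population = [2 * v for v in x_population]
result = simulate_mse(x_population, y_population, n=10, iterations=200, seed=1)
```

In this toy population the ratio-type estimate recovers the population median from every sample, up to floating-point rounding, because $y$ is an exact multiple of $x$; real data would show a nonzero simulated MSE for both estimators.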

Robustness of $T_{i(d)}$
In this section, robustness is examined to check the performance of the proposed generalized estimator in the presence of extreme values.

The scatter plot in Figure 1 confirms the presence of an extreme value in the dataset. Therefore, we can assess the robustness of the generalized estimator for this dataset. Numerical results based on the robustness study are reported in Table 11. Table 11 shows that the minimum MSE of all the proposed estimators is smaller than that of all the existing estimators under study. Moreover, as the sample size increases, the minimum MSE of all the proposed estimators decreases. Therefore, it is concluded that our proposed estimator performs best in the presence of extreme value(s).

Concluding Remarks and Recommendations
We proposed generalized classes of estimators for estimating the population median under simple random sampling using robust measures of an auxiliary variable. The bias, mean squared error, and minimum mean squared error of the proposed generalized classes are derived up to the first degree of approximation. Four real-life datasets are used to check the numerical performance of the new estimators. A simulation study based on a real dataset is also conducted to assess the potential of the suggested classes of estimators. Robustness is also examined through a real dataset. On the basis of the numerical findings, it is concluded that the new generalized classes can generate optimum estimators. Therefore, use of the proposed generalized class is recommended for future applications.
The possible extensions of this work are to estimate the following: (1) the finite population median under other sampling designs such as stratified random sampling, double sampling, ranked set sampling, and so forth; (2) other