Obtaining the best estimates of unknown population parameters has long been a central theme for statisticians. In this paper we suggest estimators that estimate the population parameters efficiently. Specifically, we propose ratio, product, and regression estimators using two auxiliary variables for situations in which the study and auxiliary variables have known maximum and minimum values. The properties of the proposed strategies, in terms of mean square errors (variances), are derived up to first order of approximation. The performance of the proposed estimators is established theoretically, and the theoretical conditions are verified numerically on four real data sets, for which the proposed class of estimators performs better than earlier work.
1. Introduction
In the literature of survey sampling, the use of ancillary information provided by auxiliary variables has been discussed by various statisticians in order to improve the efficiency of estimators of the most common population parameters, such as the population mean, population total, population variance, and population coefficient of variation. In such situations, ratio, product, and regression estimators provide better estimates of the population parameters. The work of Neyman [1] is considered among the earliest in which auxiliary information was used. Since then, a great deal of work has been done on estimating the finite population mean and other population parameters using auxiliary information and on improving estimator efficiency. For related work, see Das and Tripathi [2, 3], Upadhyaya and Singh [4], Singh [5], and so forth. Sisodia and Dwivedi [6] proposed a ratio estimator using the coefficient of variation of an auxiliary variable. Kadilar and Cingi [7] suggested an estimator for the population mean using two auxiliary variables. Khan and Shabbir [8] introduced a ratio-type estimator for the estimation of the population variance using quartiles of an auxiliary variable. Mouatasim and Al-Hossain [9] studied a reduced gradient method for minimax estimation of a bounded Poisson mean, in which the concept of auxiliary variables can easily be incorporated and studied. Further, Al-Hossain [10] studied inference on compound Rayleigh parameters with progressively type II censored samples, wherein the censored samples can serve as auxiliary variables. Recently, Khan and Shabbir [11] suggested different estimators of the finite population mean using maximum and minimum values.
Consider a finite population of $N$ distinct units $U=\{U_1,U_2,U_3,\ldots,U_N\}$. Let $y$, $x_1$, and $x_2$ be the study and auxiliary variables taking values $y_i$, $x_{1i}$, and $x_{2i}$, respectively, on the $i$th unit, $i=1,2,\ldots,N$, of the population $U$. Let $\bar{Y}=(1/N)\sum_{i=1}^{N}y_i$, $\bar{X}_1=(1/N)\sum_{i=1}^{N}x_{1i}$, and $\bar{X}_2=(1/N)\sum_{i=1}^{N}x_{2i}$ be the population means of the study and auxiliary variables, respectively; let $S_y^2=\frac{1}{N-1}\sum_{i=1}^{N}(y_i-\bar{Y})^2$, $S_{x_1}^2=\frac{1}{N-1}\sum_{i=1}^{N}(x_{1i}-\bar{X}_1)^2$, and $S_{x_2}^2=\frac{1}{N-1}\sum_{i=1}^{N}(x_{2i}-\bar{X}_2)^2$ be the corresponding population variances; let $C_y$, $C_{x_1}$, and $C_{x_2}$ be the coefficients of variation of the study and auxiliary variables, respectively; and let $\rho_{yx_1}$, $\rho_{yx_2}$, and $\rho_{x_1x_2}$ be the population correlation coefficients between $y$ and $x_1$, between $y$ and $x_2$, and between $x_1$ and $x_2$, respectively.
In order to estimate the unknown population mean, we draw a random sample of $n$ units from the finite population $U$ by simple random sampling without replacement. Let $y_i$, $x_{1i}$, and $x_{2i}$, $i=1,2,\ldots,n$, be the sample values of the study and auxiliary variables. Let $\bar{y}=(1/n)\sum_{i=1}^{n}y_i$, $\bar{x}_1=(1/n)\sum_{i=1}^{n}x_{1i}$, and $\bar{x}_2=(1/n)\sum_{i=1}^{n}x_{2i}$ be the sample means of the study and auxiliary variables, respectively; let $\hat{S}_y^2=\frac{1}{n-1}\sum_{i=1}^{n}(y_i-\bar{y})^2$, $\hat{S}_{x_1}^2=\frac{1}{n-1}\sum_{i=1}^{n}(x_{1i}-\bar{x}_1)^2$, and $\hat{S}_{x_2}^2=\frac{1}{n-1}\sum_{i=1}^{n}(x_{2i}-\bar{x}_2)^2$ be the corresponding sample variances. Also let $\hat{C}_y$, $\hat{C}_{x_1}$, and $\hat{C}_{x_2}$ be the sample coefficients of variation of $y$, $x_1$, and $x_2$, respectively, and let $\hat{S}_{yx_1}$, $\hat{S}_{yx_2}$, and $\hat{S}_{x_1x_2}$ be the sample covariances between $y$ and $x_1$, between $y$ and $x_2$, and between $x_1$ and $x_2$, respectively.
The usual unbiased estimator to estimate the population mean of the study variable is
(1) $\bar{y}=\frac{1}{n}\sum_{i=1}^{n}y_i.$
The variance of the estimator y- up to first order of approximation is given as follows:
(2) $\operatorname{var}(\bar{y})=\theta S_y^2,$
where $\theta=1/n-1/N$.
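As a small numerical sketch of (1) and (2), the following computes the unbiased sample mean and its variance under simple random sampling without replacement; the population values, $N$, and $n$ are hypothetical, not taken from the paper's data sets.

```python
import numpy as np

# Hypothetical population (not from the paper's data sets).
rng = np.random.default_rng(42)
y = rng.normal(50.0, 10.0, size=200)   # population of N = 200 study values
N, n = len(y), 30

S_y2 = y.var(ddof=1)                   # population variance S_y^2 (divisor N - 1)
theta = 1.0 / n - 1.0 / N              # theta = 1/n - 1/N
var_ybar = theta * S_y2                # var(y-bar) from (2)

# One simple random sample without replacement and the estimator (1).
sample = rng.choice(y, size=n, replace=False)
y_bar = sample.mean()
```
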
In many real data sets there exist some unusually large ($y_{\max}$) or small ($y_{\min}$) values, and estimating the unknown population parameters without accounting for this information is risky: the result may be severely overestimated or underestimated. To handle this situation, Sarndal [12] suggested the following unbiased estimator of the finite population mean using the maximum and minimum values:
(3) $\bar{y}_S=\begin{cases}\bar{y}+c & \text{if the sample contains } y_{\min} \text{ but not } y_{\max},\\ \bar{y}-c & \text{if the sample contains } y_{\max} \text{ but not } y_{\min},\\ \bar{y} & \text{for all other samples},\end{cases}$
where $c$ is a constant chosen to minimize the variance.
The minimum variance of the estimator y-S up to first order of approximation is given as
(4) $\operatorname{var}(\bar{y}_S)_{\min}=\operatorname{var}(\bar{y})-\frac{\theta(y_{\max}-y_{\min})^2}{2(N-1)},$
where the optimum value of $c$ is
(5) $c_{\text{opt}}=\frac{y_{\max}-y_{\min}}{2n}.$
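Sarndal's rule (3), with the optimal $c$ from (5), can be sketched as follows; the function name and the toy sample values below are our own.

```python
import numpy as np

def sarndal_mean(sample, y_min, y_max, n):
    """Sarndal's adjusted mean (3): shift y-bar by c_opt = (y_max - y_min)/(2n)
    when the sample contains exactly one of the population extremes."""
    c = (y_max - y_min) / (2 * n)      # c_opt from (5)
    y_bar = np.mean(sample)
    has_min, has_max = y_min in sample, y_max in sample
    if has_min and not has_max:
        return y_bar + c
    if has_max and not has_min:
        return y_bar - c
    return y_bar
```

For example, for a population with $y_{\min}=2$ and $y_{\max}=20$, the sample $[2,5,7]$ of size $n=3$ gives $c=3$ and an adjusted mean of $14/3+3$; a sample containing neither extreme is left unadjusted.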
The ratio estimator for estimating the unknown population mean of the study variable using two auxiliary variables is given by
(6) $\hat{\bar{Y}}_{R2}=\bar{y}\,\frac{\bar{X}_1}{\bar{x}_1}\,\frac{\bar{X}_2}{\bar{x}_2}.$
The mean square error of the estimator Y-^R2 up to first order of approximation is given by
(7) $\operatorname{MSE}(\hat{\bar{Y}}_{R2})=\theta\left[S_y^2+R_1^2S_{x_1}^2+R_2^2S_{x_2}^2+2R_1R_2S_{x_1x_2}-2R_2S_{yx_2}-2R_1S_{yx_1}\right],$
where $R_1=\bar{Y}/\bar{X}_1$ and $R_2=\bar{Y}/\bar{X}_2$.
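For illustration, the ratio estimator (6) and its first-order MSE (7), with $R_1=\bar{Y}/\bar{X}_1$ and $R_2=\bar{Y}/\bar{X}_2$, can be coded directly; all function names and inputs below are hypothetical.

```python
def ratio_two_aux(y_bar, x1_bar, x2_bar, X1_bar, X2_bar):
    """Ratio estimator (6): y-bar * (X1-bar / x1-bar) * (X2-bar / x2-bar)."""
    return y_bar * (X1_bar / x1_bar) * (X2_bar / x2_bar)

def mse_ratio_two_aux(theta, Y_bar, X1_bar, X2_bar,
                      S_y2, S_x12, S_x22, S_x1x2, S_yx1, S_yx2):
    """First-order MSE (7) with R1 = Y-bar/X1-bar and R2 = Y-bar/X2-bar."""
    R1, R2 = Y_bar / X1_bar, Y_bar / X2_bar
    return theta * (S_y2 + R1**2 * S_x12 + R2**2 * S_x22
                    + 2 * R1 * R2 * S_x1x2
                    - 2 * R2 * S_yx2 - 2 * R1 * S_yx1)
```
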
The product estimator for estimating the unknown population mean of the study variable using two auxiliary variables is given by
(8) $\hat{\bar{Y}}_{P2}=\bar{y}\,\frac{\bar{x}_1}{\bar{X}_1}\,\frac{\bar{x}_2}{\bar{X}_2}.$
The mean square error of the estimator Y-^P2 up to first order of approximation is given by
(9) $\operatorname{MSE}(\hat{\bar{Y}}_{P2})=\theta\left[S_y^2+R_1^2S_{x_1}^2+R_2^2S_{x_2}^2+2R_1R_2S_{x_1x_2}+2R_2S_{yx_2}+2R_1S_{yx_1}\right].$
When there are two auxiliary variables, then the regression estimator to estimate the finite population mean is given by
(10) $\hat{\bar{Y}}_{lr2}=\bar{y}+b_1(\bar{X}_1-\bar{x}_1)+b_2(\bar{X}_2-\bar{x}_2),$
where $b_1=\hat{S}_{yx_1}/\hat{S}_{x_1}^2$ and $b_2=\hat{S}_{yx_2}/\hat{S}_{x_2}^2$ are the sample regression coefficients of $y$ on $x_1$ and of $y$ on $x_2$, respectively.
The variance of the estimator Y-^lr2 up to first order of approximation is given as
(11) $\operatorname{MSE}(\hat{\bar{Y}}_{lr2})=\theta S_y^2\left[1-\rho_{yx_1}^2-\rho_{yx_2}^2+2\rho_{yx_1}\rho_{yx_2}\rho_{x_1x_2}\right].$
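A minimal sketch of the regression estimator (10), computing $b_1$ and $b_2$ from sample covariances; the toy data and names are our own.

```python
import numpy as np

def regression_two_aux(y, x1, x2, X1_bar, X2_bar):
    """Regression estimator (10): y-bar + b1*(X1-bar - x1-bar) + b2*(X2-bar - x2-bar),
    with b1, b2 the sample regression coefficients of y on x1 and of y on x2."""
    b1 = np.cov(y, x1)[0, 1] / np.var(x1, ddof=1)
    b2 = np.cov(y, x2)[0, 1] / np.var(x2, ddof=1)
    return y.mean() + b1 * (X1_bar - x1.mean()) + b2 * (X2_bar - x2.mean())
```

As a sanity check: for a sample in which $y=2x_1+3x_2$ exactly and $x_1$, $x_2$ are uncorrelated in the sample, the estimator reproduces $\bar{y}+2(\bar{X}_1-\bar{x}_1)+3(\bar{X}_2-\bar{x}_2)$.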
2. Proposed Estimators
Following Sarndal [12], we propose ratio, product, and regression estimators using two auxiliary variables when the study and auxiliary variables have known maximum and minimum values.
Case 1.
When the correlation between the study variable and the auxiliary variables is positive, drawing a larger value of an auxiliary variable means a larger value of the study variable is to be expected, and drawing a smaller value means a smaller value of the study variable is to be expected. Using this type of information, the ratio estimator based on two auxiliary variables becomes
(12) $\hat{\bar{Y}}_{RC2}=\bar{y}_{C11}\,\frac{\bar{X}_1}{\bar{x}_{1C21}}\,\frac{\bar{X}_2}{\bar{x}_{2C31}}=\begin{cases}(\bar{y}+c_1)\,\frac{\bar{X}_1}{\bar{x}_1+c_2}\,\frac{\bar{X}_2}{\bar{x}_2+c_3} & \text{if the sample contains } y_{\min} \text{ and } (x_{1\min},x_{2\min}),\\ (\bar{y}-c_1)\,\frac{\bar{X}_1}{\bar{x}_1-c_2}\,\frac{\bar{X}_2}{\bar{x}_2-c_3} & \text{if the sample contains } y_{\max} \text{ and } (x_{1\max},x_{2\max}),\\ \bar{y}\,\frac{\bar{X}_1}{\bar{x}_1}\,\frac{\bar{X}_2}{\bar{x}_2} & \text{for all other samples},\end{cases}$

(13) $\hat{\bar{Y}}_{lr(P1)}=\bar{y}_{C11}+b_1(\bar{X}_1-\bar{x}_{1C21})+b_2(\bar{X}_2-\bar{x}_{2C31}),$
where $(\bar{y}_{C11}=\bar{y}+c_1,\ \bar{x}_{1C21}=\bar{x}_1+c_2,\ \bar{x}_{2C31}=\bar{x}_2+c_3)$ if the sample contains $y_{\min}$ and $(x_{1\min},x_{2\min})$; $(\bar{y}_{C11}=\bar{y}-c_1,\ \bar{x}_{1C21}=\bar{x}_1-c_2,\ \bar{x}_{2C31}=\bar{x}_2-c_3)$ if the sample contains $y_{\max}$ and $(x_{1\max},x_{2\max})$; and $(\bar{y}_{C11}=\bar{y},\ \bar{x}_{1C21}=\bar{x}_1,\ \bar{x}_{2C31}=\bar{x}_2)$ for all other samples.
Case 2.
Similarly, when the correlation is negative, drawing a larger value of an auxiliary variable means a smaller value of the study variable is to be expected, and drawing a smaller value means a larger value of the study variable is to be expected. Using this type of information, the product estimator based on two auxiliary variables becomes
(14) $\hat{\bar{Y}}_{PC2}=\bar{y}_{C12}\,\frac{\bar{x}_{1C22}}{\bar{X}_1}\,\frac{\bar{x}_{2C32}}{\bar{X}_2}=\begin{cases}(\bar{y}+c_1)\,\frac{\bar{x}_1-c_2}{\bar{X}_1}\,\frac{\bar{x}_2-c_3}{\bar{X}_2} & \text{if the sample contains } y_{\min} \text{ and } (x_{1\max},x_{2\max}),\\ (\bar{y}-c_1)\,\frac{\bar{x}_1+c_2}{\bar{X}_1}\,\frac{\bar{x}_2+c_3}{\bar{X}_2} & \text{if the sample contains } y_{\max} \text{ and } (x_{1\min},x_{2\min}),\\ \bar{y}\,\frac{\bar{x}_1}{\bar{X}_1}\,\frac{\bar{x}_2}{\bar{X}_2} & \text{for all other samples},\end{cases}$

$\hat{\bar{Y}}_{lr(P2)}=\bar{y}_{C12}+b_1(\bar{X}_1-\bar{x}_{1C22})+b_2(\bar{X}_2-\bar{x}_{2C32}),$
where $(\bar{y}_{C12}=\bar{y}+c_1,\ \bar{x}_{1C22}=\bar{x}_1-c_2,\ \bar{x}_{2C32}=\bar{x}_2-c_3)$ if the sample contains $y_{\min}$ and $(x_{1\max},x_{2\max})$; $(\bar{y}_{C12}=\bar{y}-c_1,\ \bar{x}_{1C22}=\bar{x}_1+c_2,\ \bar{x}_{2C32}=\bar{x}_2+c_3)$ if the sample contains $y_{\max}$ and $(x_{1\min},x_{2\min})$; and $(\bar{y}_{C12}=\bar{y},\ \bar{x}_{1C22}=\bar{x}_1,\ \bar{x}_{2C32}=\bar{x}_2)$ for all other samples. Here $c_1$, $c_2$, and $c_3$ are unknown constants whose values are determined by the optimality conditions.
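A toy implementation of the proposed ratio estimator (12), using the optimal constants that are derived later in (21); the selection rule checks whether the drawn sample captured the joint population extremes. Function and variable names are our own.

```python
import numpy as np

def proposed_ratio(sy, sx1, sx2, py, px1, px2):
    """Proposed estimator (12): adjust the sample means by c1, c2, c3 from (21)
    when the sample contains the joint minima or the joint maxima; otherwise
    fall back to the usual ratio estimator (6)."""
    n = len(sy)
    c1 = (py.max() - py.min()) / (2 * n)
    c2 = (px1.max() - px1.min()) / (2 * n)
    c3 = (px2.max() - px2.min()) / (2 * n)
    y_bar, x1_bar, x2_bar = sy.mean(), sx1.mean(), sx2.mean()
    if py.min() in sy and px1.min() in sx1 and px2.min() in sx2:
        y_bar, x1_bar, x2_bar = y_bar + c1, x1_bar + c2, x2_bar + c3
    elif py.max() in sy and px1.max() in sx1 and px2.max() in sx2:
        y_bar, x1_bar, x2_bar = y_bar - c1, x1_bar - c2, x2_bar - c3
    return y_bar * (px1.mean() / x1_bar) * (px2.mean() / x2_bar)
```

For samples containing neither joint extreme, the estimator coincides with the usual ratio estimator (6).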
To obtain the properties of the proposed estimators in terms of bias and mean square error, we define the following relative error terms and their expectations.
Let $e_0=(\bar{y}_{C11}-\bar{Y})/\bar{Y}$, $e_1=(\bar{x}_{1C21}-\bar{X}_1)/\bar{X}_1$, and $e_2=(\bar{x}_{2C31}-\bar{X}_2)/\bar{X}_2$, such that $E(e_0)=E(e_1)=E(e_2)=0$. Consider also
(15) $E(e_0^2)=\frac{\theta}{\bar{Y}^2}\left(S_y^2-\frac{2nc_1}{N-1}\left(y_{\max}-y_{\min}-nc_1\right)\right),$
$E(e_1^2)=\frac{\theta}{\bar{X}_1^2}\left(S_{x_1}^2-\frac{2nc_2}{N-1}\left(x_{1\max}-x_{1\min}-nc_2\right)\right),$
$E(e_2^2)=\frac{\theta}{\bar{X}_2^2}\left(S_{x_2}^2-\frac{2nc_3}{N-1}\left(x_{2\max}-x_{2\min}-nc_3\right)\right),$
$E(e_0e_1)=\frac{\theta}{\bar{Y}\bar{X}_1}\left(S_{yx_1}-\frac{n}{N-1}\left(c_2(y_{\max}-y_{\min})+c_1(x_{1\max}-x_{1\min})-2nc_1c_2\right)\right),$
$E(e_0e_2)=\frac{\theta}{\bar{Y}\bar{X}_2}\left(S_{yx_2}-\frac{n}{N-1}\left(c_3(y_{\max}-y_{\min})+c_1(x_{2\max}-x_{2\min})-2nc_1c_3\right)\right),$
$E(e_1e_2)=\frac{\theta}{\bar{X}_1\bar{X}_2}\left(S_{x_1x_2}-\frac{n}{N-1}\left(c_3(x_{1\max}-x_{1\min})+c_2(x_{2\max}-x_{2\min})-2nc_2c_3\right)\right).$
Rewriting (12) in terms of the $e_i$'s, we have
(16) $\hat{\bar{Y}}_{RC2}=\bar{Y}(1+e_0)(1+e_1)^{-1}(1+e_2)^{-1}.$
Expanding the right-hand side of the above equation and retaining terms up to second powers of the $e_i$'s, that is, up to first order of approximation, we have
(17) $\hat{\bar{Y}}_{RC2}-\bar{Y}=\bar{Y}\left(e_0-e_1-e_2+e_1^2+e_2^2+e_1e_2-e_0e_1-e_0e_2\right).$
On squaring both sides of (17) and keeping powers of the $e_i$'s up to first order of approximation, we have
(18) $(\hat{\bar{Y}}_{RC2}-\bar{Y})^2=\bar{Y}^2\left[e_0^2+e_1^2+e_2^2-2e_0e_1-2e_0e_2+2e_1e_2\right].$
Taking expectation on both sides of (18), we get mean square error up to first order of approximation, given as
(19) $\operatorname{MSE}(\hat{\bar{Y}}_{RC2})=\theta\left(S_y^2+R_1^2S_{x_1}^2+R_2^2S_{x_2}^2+2R_1R_2S_{x_1x_2}-2R_2S_{yx_2}-2R_1S_{yx_1}\right)-\frac{2n\theta(c_1-R_1c_2-R_2c_3)}{N-1}\left\{(y_{\max}-y_{\min})-R_1(x_{1\max}-x_{1\min})-R_2(x_{2\max}-x_{2\min})-n(c_1-R_1c_2-R_2c_3)\right\}.$
To find the minimum mean square error of $\hat{\bar{Y}}_{RC2}$, we differentiate (19) with respect to $c_1$, $c_2$, and $c_3$ and set each derivative equal to zero; that is,
(20) $\frac{\partial\operatorname{MSE}(\hat{\bar{Y}}_{RC2})}{\partial c_1}=0,\quad \frac{\partial\operatorname{MSE}(\hat{\bar{Y}}_{RC2})}{\partial c_2}=0,\quad \frac{\partial\operatorname{MSE}(\hat{\bar{Y}}_{RC2})}{\partial c_3}=0,$
each of which reduces to
$(y_{\max}-y_{\min})-R_1(x_{1\max}-x_{1\min})-R_2(x_{2\max}-x_{2\min})-2n(c_1-R_1c_2-R_2c_3)=0.$
Differentiating (19) with respect to $c_1$, $c_2$, and $c_3$ thus yields a single equation in three unknowns, so a unique solution is not possible; we therefore let
(21) $c_1=\frac{y_{\max}-y_{\min}}{2n},\quad c_2=\frac{x_{1\max}-x_{1\min}}{2n},\quad c_3=\frac{x_{2\max}-x_{2\min}}{2n}.$
Substituting the optimum values of $c_1$, $c_2$, and $c_3$ from (21) into (19), we get the minimum mean square error of the proposed estimator:
(22) $\operatorname{MSE}(\hat{\bar{Y}}_{RC2})_{\min}=\operatorname{MSE}(\hat{\bar{Y}}_{R2})-\frac{\theta}{2(N-1)}\left[(y_{\max}-y_{\min})-R_1(x_{1\max}-x_{1\min})-R_2(x_{2\max}-x_{2\min})\right]^2,$
where $\operatorname{MSE}(\hat{\bar{Y}}_{R2})=\theta\left[S_y^2+R_1^2S_{x_1}^2+R_2^2S_{x_2}^2+2R_1R_2S_{x_1x_2}-2R_2S_{yx_2}-2R_1S_{yx_1}\right]$.
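The gain term subtracted in (22) is a simple function of the variable ranges and the ratios; a sketch (function and argument names are our own):

```python
def mse_gain(theta, N, D_y, D_x1, D_x2, R1, R2):
    """Reduction in MSE from (22): theta/(2(N-1)) * (D_y - R1*D_x1 - R2*D_x2)^2,
    where D_y = y_max - y_min, D_x1 = x1_max - x1_min, D_x2 = x2_max - x2_min.
    The squared factor makes the gain nonnegative for any inputs."""
    A = D_y - R1 * D_x1 - R2 * D_x2
    return theta / (2 * (N - 1)) * A**2
```
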
Similarly, the minimum mean square error of the product estimator, up to first order of approximation, is given by
(23) $\operatorname{MSE}(\hat{\bar{Y}}_{PC2})_{\min}=\operatorname{MSE}(\hat{\bar{Y}}_{P2})-\frac{\theta}{2(N-1)}\left[(y_{\max}-y_{\min})+R_1(x_{1\max}-x_{1\min})+R_2(x_{2\max}-x_{2\min})\right]^2,$
where $\operatorname{MSE}(\hat{\bar{Y}}_{P2})=\theta\left[S_y^2+R_1^2S_{x_1}^2+R_2^2S_{x_2}^2+2R_1R_2S_{x_1x_2}+2R_2S_{yx_2}+2R_1S_{yx_1}\right]$.
Now the minimum variance of the regression estimator in the case of positive correlation, up to first order of approximation, is given by
(24) $\operatorname{MSE}(\hat{\bar{Y}}_{lr(P1)})_{\min}=\operatorname{MSE}(\hat{\bar{Y}}_{lr2})-\frac{\theta}{2(N-1)}\left[(y_{\max}-y_{\min})-\beta_1(x_{1\max}-x_{1\min})-\beta_2(x_{2\max}-x_{2\min})\right]^2,$
where $\operatorname{MSE}(\hat{\bar{Y}}_{lr2})=\theta S_y^2\left[1-\rho_{yx_1}^2-\rho_{yx_2}^2+2\rho_{yx_1}\rho_{yx_2}\rho_{x_1x_2}\right]$.
Similarly for the case of negative correlation, the minimum variance of the regression estimator, up to first order of approximation, is given by
(25) $\operatorname{MSE}(\hat{\bar{Y}}_{lr(P2)})_{\min}=\operatorname{MSE}(\hat{\bar{Y}}_{lr2})-\frac{\theta}{2(N-1)}\left[(y_{\max}-y_{\min})+\beta_1(x_{1\max}-x_{1\min})+\beta_2(x_{2\max}-x_{2\min})\right]^2.$
Since the regression estimator performs well under both positive and negative correlation, we write the variance covering both cases as
(26) $\operatorname{MSE}(\hat{\bar{Y}}_{lr(P)})_{\min}=\operatorname{MSE}(\hat{\bar{Y}}_{lr2})-\frac{\theta}{2(N-1)}\left[(y_{\max}-y_{\min})-|\beta_1|(x_{1\max}-x_{1\min})-|\beta_2|(x_{2\max}-x_{2\min})\right]^2.$
3. Comparison of Estimators
In this section, we compare the proposed estimators with the ratio, product, and regression estimators and derive the efficiency conditions under which the proposed estimators perform better.
(i) By (7) and (22),
(27) $\left[\operatorname{MSE}(\hat{\bar{Y}}_{R2})-\operatorname{MSE}(\hat{\bar{Y}}_{RC2})_{\min}\right]\ge 0$
if
(28) $\left[(y_{\max}-y_{\min})-R_1(x_{1\max}-x_{1\min})-R_2(x_{2\max}-x_{2\min})\right]^2\ge 0.$
(ii) By (9) and (23),
(29) $\left[\operatorname{MSE}(\hat{\bar{Y}}_{P2})-\operatorname{MSE}(\hat{\bar{Y}}_{PC2})_{\min}\right]\ge 0$
if
(30) $\left[(y_{\max}-y_{\min})+R_1(x_{1\max}-x_{1\min})+R_2(x_{2\max}-x_{2\min})\right]^2\ge 0.$
(iii) By (11) and (26),
(31) $\left[\operatorname{MSE}(\hat{\bar{Y}}_{lr2})-\operatorname{MSE}(\hat{\bar{Y}}_{lr(P)})_{\min}\right]\ge 0$
if
(32) $\left[(y_{\max}-y_{\min})-|\beta_1|(x_{1\max}-x_{1\min})-|\beta_2|(x_{2\max}-x_{2\min})\right]^2\ge 0.$
From (i), (ii), and (iii) we observe that the proposed estimators perform better than the existing estimators, because each condition is the square of a real quantity and is therefore always satisfied.
4. Numerical Illustration
In this section we demonstrate the performance of the suggested estimators relative to the other estimators using four real data sets. The description and the necessary summary statistics of the populations are given as follows.
Population 1 (source: Agricultural Statistics (1999) [13], Washington, DC.)
The mean squared error of the proposed and the existing estimators is shown in Table 1.
Table 1: MSE of the existing and the proposed estimators.

Estimator              Population 1    Population 2    Population 3    Population 4
Existing:
  $\hat{\bar{Y}}_{R2}$     1671738.947     0.0004          1513620.253     1440668.498
  $\hat{\bar{Y}}_{P2}$     12164963.92     0.0029          11772394.82     11494548.32
  $\hat{\bar{Y}}_{lr2}$    1254819.885     0.0003          1217975.563     1231973.764
Proposed:
  $\hat{\bar{Y}}_{RC2}$    1293565.557     0.0003          957662.57       911989.0387
  $\hat{\bar{Y}}_{PC2}$    9654413.760     0.0021          8830692.238     8616039.070
  $\hat{\bar{Y}}_{lr(P)}$  971781.5401     0.00025         753868.3966     769477.0316
5. Conclusion and Future Work
We have developed ratio, product, and regression estimators under known maximum and minimum values using two auxiliary variables. Under the stated efficiency conditions, the proposed estimators are more efficient than the usual ratio, product, and regression estimators using two auxiliary variables. The numerical results in Table 1 confirm that the proposed estimators outperform these usual estimators on all four data sets. The concept of auxiliary variables can also be applied to minimax or maximin estimation of a bounded Poisson (or other distribution) mean and to censored samples. Thus the proposed estimators may be preferred over the existing estimators in practical applications.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
References

[1] Neyman, J., "Contributions to the theory of sampling human populations," 1938, 33(201), 101–116. doi:10.1080/01621459.1938.10503378.
[2] Das, A. K., and Tripathi, T. P., "Sampling strategies for population mean when the coefficient of variation of an auxiliary character is known," 1980, 42, 76–86.
[3] Das, A. K., and Tripathi, T. P., "A class of sampling strategies for population mean using information on mean and variance of an auxiliary character," Proceedings of the Indian Statistical Institute Golden Jubilee International Conference on Statistics: Applications and New Directions, December 1981, Calcutta, India, 174–181.
[4] Upadhyaya, L. N., and Singh, H. P., "Use of transformed auxiliary variable in estimating the finite population mean," 1999, 41(5), 627–636. doi:10.1002/(SICI)1521-4036(199909)41:5<627::AID-BIMJ627>3.3.CO;2-N.
[5] Singh, S., "Golden and Silver Jubilee year-2003 of the linear regression estimators," 2004, Toronto, Canada, American Statistical Association, 4382–4389.
[6] Sisodia, B. V. S., and Dwivedi, V. K., "A modified ratio estimator using coefficient of variation of auxiliary variable," 1981, 33(1), 13–18.
[7] Kadilar, C., and Cingi, H., "A new estimator using two auxiliary variables," 2005, 162(2), 901–908. doi:10.1016/j.amc.2003.12.130.
[8] Khan, M., and Shabbir, J., "A ratio type estimator for the estimation of population variance using quartiles of an auxiliary variable," 2013, 2(3), 157–162.
[9] Mouatasim, A. E., and Al-Hossain, A., "Reduced gradient method for minimax estimation of a bounded Poisson mean," 2009, 2(2), 185–199.
[10] Al-Hossain, Y., "Inferences on compound Rayleigh parameters with progressively type-II censored samples," 2013, 7, 885–892.
[11] Khan, M., and Shabbir, J., "Some improved ratio, product and regression estimators of finite population mean when using minimum and maximum values," 2013, Article ID 431868. doi:10.1155/2013/431868.
[12] Sarndal, C. E., "Sample survey theory vs. general statistical theory: estimation of the population mean," 1972, 40(1), 1–12.
[13] Agricultural Statistics (1999), Washington, DC, USA. http://www.nass.usda.gov/Publications/Ag_Statistics/1999/index.asp.