To improve estimation efficiency when two auxiliary variables are available, we propose a new estimator of a finite population mean under simple random sampling. Expressions for the bias and mean square error (MSE) of the proposed estimator are obtained. In a comparison study, the new estimator was consistently more efficient than the estimators of Abu-Dayyeh et al., Kadilar and Cingi, and Malik and Singh, as well as the regression estimator using two auxiliary variables; we also found that the minimum MSE values of the first three of these estimators are equal. Four numerical examples from agricultural, biomedical, and power engineering support these theoretical results, which enrich the theory of survey sampling through the development of a new estimator with two auxiliary variables.
1. Introduction
In sampling theory, it is well established that supplementary information provided by auxiliary variables or auxiliary attributes often improves the accuracy of estimators of unknown population parameters. Ratio-, product-, and regression-type estimators are three methods that exploit such information. For this reason, many authors have used auxiliary variables and attributes at the estimation stage to increase estimator efficiency. For example, the planting area and the proportion of good seeds in agricultural engineering are two important auxiliary variables when estimating average cotton output. Similarly, the breed of cow in animal husbandry engineering is an important auxiliary attribute when estimating average milk yield. Thus, auxiliary information can be used in education, biostatistics, medical research, agricultural and biomedical engineering, and so on.
In the literature, many efficient ratio-, product-, and regression-type estimators using one auxiliary variable or attribute have been proposed, including those of Singh and Vishwakarma [1], Grover and Kaur [2, 3], Singh et al. [4], Singh and Solanki [5], and Gupta and Shabbir [6]. More recently, several authors have proposed efficient estimators of a finite population mean using two auxiliary variables or attributes, including Abu-Dayyeh et al. [7], Kadilar and Cingi [8], Malik and Singh [9], Sharma and Singh [10], and Muneer et al. [11]. Although these studies are detailed, they do not give formulas for the minimum MSE, and the relationship between the minimum MSE values of these estimators seems not to have been noticed.
In this paper, we compare the estimators reported by Abu-Dayyeh et al. [7], Kadilar and Cingi [8], and Malik and Singh [9] and introduce a new estimator with two auxiliary variables to estimate a finite population mean for the variable of interest. We obtain bias and mean square error (MSE) expressions for the proposed estimator and compare the new estimator against those with relatively high efficiencies. An empirical study using four datasets from agricultural, biomedical, and power engineering was conducted, and the results were satisfactory both theoretically and numerically. The analysis of these issues is of great significance for agricultural, biomedical, and power engineering. Therefore, the proposed estimator could be applied across a broad spectrum of sample surveys.
2. Materials and Methods
2.1. Abu-Dayyeh Estimator
Abu-Dayyeh et al. [7] proposed the following estimator of the population mean when the population means $\bar{X}_1$ and $\bar{X}_2$ of the auxiliary variables are known:
$$\bar{y}_{AD} = \bar{y}\left(\frac{\bar{X}_1}{\bar{x}_1}\right)^{\alpha_1}\left(\frac{\bar{X}_2}{\bar{x}_2}\right)^{\alpha_2}, \tag{1}$$
where $\bar{y}$ denotes the sample mean of the variable $y$; $\bar{x}_i$ and $\bar{X}_i$ ($i = 1, 2$) denote, respectively, the sample and population means of the variable $x_i$; and $\alpha_1$ and $\alpha_2$ are real numbers.
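The MSE of $\bar{y}_{AD}$ is given by
$$\operatorname{MSE}(\bar{y}_{AD}) \cong \frac{1-f}{n}\,\bar{Y}^2\left(C_y^2 + \alpha_1^2 C_{x_1}^2 + \alpha_2^2 C_{x_2}^2 - 2\alpha_1 C_y C_{x_1}\rho_{yx_1} - 2\alpha_2 C_y C_{x_2}\rho_{yx_2} + 2\alpha_1\alpha_2 C_{x_1} C_{x_2}\rho_{x_1x_2}\right), \tag{2}$$
where $f = n/N$; $n$ and $N$ are, respectively, the numbers of units in the sample and the population; $C_y$, $C_{x_1}$, and $C_{x_2}$ are the coefficients of variation of $Y$, $X_1$, and $X_2$, respectively; and $\rho_{x_1x_2}$, $\rho_{yx_1}$, and $\rho_{yx_2}$ are the correlation coefficients between $X_1$ and $X_2$, $Y$ and $X_1$, and $Y$ and $X_2$, respectively.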
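To minimize $\operatorname{MSE}(\bar{y}_{AD})$, the optimum values of $\alpha_1$ and $\alpha_2$ are given by
$$\alpha_1^{*} = \frac{C_y\left(\rho_{yx_1} - \rho_{x_1x_2}\rho_{yx_2}\right)}{C_{x_1}\left(1 - \rho_{x_1x_2}^2\right)}, \qquad \alpha_2^{*} = \frac{C_y\left(\rho_{yx_2} - \rho_{x_1x_2}\rho_{yx_1}\right)}{C_{x_2}\left(1 - \rho_{x_1x_2}^2\right)}. \tag{3}$$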
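The minimum MSE of $\bar{y}_{AD}$ can be shown to be
$$\operatorname{MSE}_{\min}(\bar{y}_{AD}) = \delta\,\bar{Y}^2 C_y^2\,\frac{L}{A}, \tag{4}$$
where $\delta = (1-f)/n$; $L = 1 - \rho_{x_1x_2}^2 - \rho_{yx_1}^2 + 2\rho_{x_1x_2}\rho_{yx_1}\rho_{yx_2} - \rho_{yx_2}^2$; and $A = 1 - \rho_{x_1x_2}^2$.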
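Equations (2) to (4) can be checked numerically. The following is a minimal sketch, assuming hypothetical population summaries (the values in `pop`, the mean 50, and the sizes n = 20, N = 200 are illustrative and do not come from the paper); the function names are also our own. It evaluates (2) at the optimum exponents from (3) and compares the result with the closed form (4).

```python
def ad_optimum(cy, cx1, cx2, r_y1, r_y2, r_12):
    """Optimum exponents (alpha1*, alpha2*) from equation (3)."""
    a = 1.0 - r_12 ** 2
    alpha1 = cy * (r_y1 - r_12 * r_y2) / (cx1 * a)
    alpha2 = cy * (r_y2 - r_12 * r_y1) / (cx2 * a)
    return alpha1, alpha2


def mse_ad(alpha1, alpha2, ybar, n, N, cy, cx1, cx2, r_y1, r_y2, r_12):
    """First-order MSE of the Abu-Dayyeh estimator, equation (2)."""
    delta = (1.0 - n / N) / n
    inner = (cy ** 2 + alpha1 ** 2 * cx1 ** 2 + alpha2 ** 2 * cx2 ** 2
             - 2.0 * alpha1 * cy * cx1 * r_y1
             - 2.0 * alpha2 * cy * cx2 * r_y2
             + 2.0 * alpha1 * alpha2 * cx1 * cx2 * r_12)
    return delta * ybar ** 2 * inner


def min_mse_ad(ybar, n, N, cy, r_y1, r_y2, r_12):
    """Closed-form minimum MSE, equation (4): delta * Ybar^2 * Cy^2 * L / A."""
    delta = (1.0 - n / N) / n
    a = 1.0 - r_12 ** 2
    ell = 1.0 - r_12 ** 2 - r_y1 ** 2 + 2.0 * r_12 * r_y1 * r_y2 - r_y2 ** 2
    return delta * ybar ** 2 * cy ** 2 * ell / a


# Hypothetical population summaries, chosen only for illustration.
pop = dict(cy=0.8, cx1=0.6, cx2=0.7, r_y1=0.7, r_y2=0.6, r_12=0.5)
a1, a2 = ad_optimum(**pop)
at_optimum = mse_ad(a1, a2, ybar=50.0, n=20, N=200, **pop)
closed_form = min_mse_ad(50.0, 20, 200, pop["cy"],
                         pop["r_y1"], pop["r_y2"], pop["r_12"])
```

Under these assumptions `at_optimum` and `closed_form` agree up to floating-point error, and moving either exponent away from its optimum increases the value returned by `mse_ad`.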
2.2. Kadilar and Cingi Estimator
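Kadilar and Cingi [8] proposed an estimator using two auxiliary variables, $x_1$ and $x_2$, to estimate the population mean $\bar{Y}$, as follows:
$$\bar{y}_{KC} = \bar{y}\left(\frac{\bar{X}_1}{\bar{x}_1}\right)^{\alpha_1}\left(\frac{\bar{X}_2}{\bar{x}_2}\right)^{\alpha_2} + b_1\left(\bar{X}_1 - \bar{x}_1\right) + b_2\left(\bar{X}_2 - \bar{x}_2\right), \tag{5}$$
where $b_1 = S_{yx_1}/S_{x_1}^2$ and $b_2 = S_{yx_2}/S_{x_2}^2$; $S_{x_1}^2$ and $S_{x_2}^2$ are the variances of $X_1$ and $X_2$, respectively; and $S_{yx_1}$ and $S_{yx_2}$ are the covariances between $Y$ and $X_1$ and between $Y$ and $X_2$, respectively.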
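The MSE of $\bar{y}_{KC}$ is given by
$$\operatorname{MSE}(\bar{y}_{KC}) \cong \frac{1-f}{n}\,\bar{Y}^2\left(C_y^2 + \alpha_1^2 C_{x_1}^2 + \alpha_2^2 C_{x_2}^2 + 2\alpha_1 C_y C_{x_1}\rho_{x_1x_2}\rho_{yx_2} + 2\alpha_2 C_y C_{x_2}\rho_{x_1x_2}\rho_{yx_1} + 2\alpha_1\alpha_2 C_{x_1} C_{x_2}\rho_{x_1x_2} - C_y^2\rho_{yx_1}^2 - C_y^2\rho_{yx_2}^2 + 2C_y^2\rho_{x_1x_2}\rho_{yx_1}\rho_{yx_2}\right). \tag{6}$$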
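To minimize $\operatorname{MSE}(\bar{y}_{KC})$, the optimum values of $\alpha_1$ and $\alpha_2$ are given by
$$\alpha_1^{*} = \frac{C_y\,\rho_{x_1x_2}\left(\rho_{yx_1}\rho_{x_1x_2} - \rho_{yx_2}\right)}{C_{x_1}\left(1 - \rho_{x_1x_2}^2\right)}, \qquad \alpha_2^{*} = \frac{C_y\,\rho_{x_1x_2}\left(\rho_{yx_2}\rho_{x_1x_2} - \rho_{yx_1}\right)}{C_{x_2}\left(1 - \rho_{x_1x_2}^2\right)}. \tag{7}$$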
The minimum MSE of $\bar{y}_{KC}$ can be shown to be
$$\operatorname{MSE}_{\min}(\bar{y}_{KC}) = \delta\,\bar{Y}^2 C_y^2\,\frac{L}{A}. \tag{8}$$
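The equality of (8) and (4) can be traced to a change of variables. The following is a sketch of that argument, not the derivation given in [8]. It assumes the error terms $\varepsilon_0 = \bar{y}/\bar{Y} - 1$ and $\varepsilon_i = \bar{x}_i/\bar{X}_i - 1$, treats $b_1$ and $b_2$ as population regression coefficients to first order, and introduces the symbols $c_1$ and $c_2$ purely for illustration.

```latex
\begin{align*}
\frac{b_i \bar{X}_i}{\bar{Y}} &= \frac{\rho_{yx_i} C_y}{C_{x_i}}, \qquad i = 1, 2, \\
\bar{y}_{KC} - \bar{Y} &\approx \bar{Y}\left(\varepsilon_0 - c_1\varepsilon_1 - c_2\varepsilon_2\right),
\qquad c_i = \alpha_i + \frac{\rho_{yx_i} C_y}{C_{x_i}}, \\
\bar{y}_{AD} - \bar{Y} &\approx \bar{Y}\left(\varepsilon_0 - \alpha_1\varepsilon_1 - \alpha_2\varepsilon_2\right).
\end{align*}
% The first-order MSE of y_KC therefore has the form of (2) with alpha_i
% replaced by c_i, so it is minimised when c_i equals the optimum in (3):
\begin{align*}
\alpha_{1,KC}^{*}
  &= \frac{C_y\left(\rho_{yx_1} - \rho_{x_1x_2}\rho_{yx_2}\right)}{C_{x_1}\left(1 - \rho_{x_1x_2}^2\right)}
     - \frac{\rho_{yx_1} C_y}{C_{x_1}}
   = \frac{C_y\,\rho_{x_1x_2}\left(\rho_{yx_1}\rho_{x_1x_2} - \rho_{yx_2}\right)}{C_{x_1}\left(1 - \rho_{x_1x_2}^2\right)}.
\end{align*}
```

Under these assumptions the shifted optimum reproduces (7), and because the minimised quadratic form is the same as for $\bar{y}_{AD}$, the minimum value is again $\delta\bar{Y}^2C_y^2L/A$.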
2.3. Malik and Singh Estimator
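Malik and Singh [9] proposed an estimator to estimate the population mean $\bar{Y}$, as follows:
$$\bar{y}_{MS} = \bar{y}\,\exp\left(\frac{\bar{X}_1 - \bar{x}_1}{\bar{X}_1 + \bar{x}_1}\right)^{\beta_1}\exp\left(\frac{\bar{X}_2 - \bar{x}_2}{\bar{X}_2 + \bar{x}_2}\right)^{\beta_2} + b_1\left(\bar{X}_1 - \bar{x}_1\right) + b_2\left(\bar{X}_2 - \bar{x}_2\right), \tag{9}$$
where $\beta_1$ and $\beta_2$ are real numbers.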
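The MSE of $\bar{y}_{MS}$ is given by
$$\begin{aligned}
\operatorname{MSE}(\bar{y}_{MS}) \cong \delta\Big[\, & \bar{Y}^2\Big(C_y^2 + \tfrac{1}{4}\beta_1^2 C_{x_1}^2 + \tfrac{1}{4}\beta_2^2 C_{x_2}^2 + \tfrac{1}{2}\beta_1\beta_2 C_{x_1} C_{x_2}\rho_{x_1x_2} - \beta_1 C_{x_1} C_y\rho_{yx_1} - \beta_2 C_{x_2} C_y\rho_{yx_2}\Big) \\
& + \bar{Y}\Big(\beta_1 b_1 C_{x_1}^2\bar{X}_1 + \beta_2 b_2 C_{x_2}^2\bar{X}_2 + \beta_1 b_2 C_{x_1} C_{x_2}\rho_{x_1x_2}\bar{X}_2 + \beta_2 b_1 C_{x_1} C_{x_2}\rho_{x_1x_2}\bar{X}_1 \\
& \qquad - 2b_1 C_{x_1} C_y\rho_{yx_1}\bar{X}_1 - 2b_2 C_{x_2} C_y\rho_{yx_2}\bar{X}_2\Big) \\
& + b_1^2 C_{x_1}^2\bar{X}_1^2 + 2b_1 b_2 C_{x_1} C_{x_2}\rho_{x_1x_2}\bar{X}_1\bar{X}_2 + b_2^2 C_{x_2}^2\bar{X}_2^2 \,\Big].
\end{aligned} \tag{10}$$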
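To minimize $\operatorname{MSE}(\bar{y}_{MS})$, the optimum values of $\beta_1$ and $\beta_2$ are given by
$$\beta_1^{*} = \frac{2C_y\,\rho_{x_1x_2}\left(\rho_{yx_1}\rho_{x_1x_2} - \rho_{yx_2}\right)}{C_{x_1}\left(1 - \rho_{x_1x_2}^2\right)}, \qquad \beta_2^{*} = \frac{2C_y\,\rho_{x_1x_2}\left(\rho_{yx_2}\rho_{x_1x_2} - \rho_{yx_1}\right)}{C_{x_2}\left(1 - \rho_{x_1x_2}^2\right)}. \tag{11}$$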
The minimum MSE of $\bar{y}_{MS}$ can be shown to be
$$\operatorname{MSE}_{\min}(\bar{y}_{MS}) = \delta\,\bar{Y}^2 C_y^2\,\frac{L}{A}. \tag{12}$$
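The claim that the Malik and Singh estimator attains the same minimum can be checked numerically. The sketch below is one reading of (10), with the factor $\delta$ applied to all three groups of terms and with the factor 2 assumed to apply symmetrically to both optimum exponents; all numerical values and function names are hypothetical. The regression coefficients are expressed as $b_i = \rho_{yx_i} C_y \bar{Y} / (C_{x_i}\bar{X}_i)$, which follows from $b_i = S_{yx_i}/S_{x_i}^2$.

```python
def mse_ms(beta1, beta2, *, ybar, x1bar, x2bar, delta,
           cy, cx1, cx2, r_y1, r_y2, r_12):
    """First-order MSE of the Malik-Singh estimator, as read from (10)."""
    # b_i = S_yxi / S_xi^2 rewritten with coefficients of variation.
    b1 = r_y1 * cy * ybar / (cx1 * x1bar)
    b2 = r_y2 * cy * ybar / (cx2 * x2bar)
    t1 = ybar ** 2 * (cy ** 2 + 0.25 * beta1 ** 2 * cx1 ** 2
                      + 0.25 * beta2 ** 2 * cx2 ** 2
                      + 0.5 * beta1 * beta2 * cx1 * cx2 * r_12
                      - beta1 * cx1 * cy * r_y1 - beta2 * cx2 * cy * r_y2)
    t2 = ybar * (beta1 * b1 * cx1 ** 2 * x1bar + beta2 * b2 * cx2 ** 2 * x2bar
                 + beta1 * b2 * cx1 * cx2 * r_12 * x2bar
                 + beta2 * b1 * cx1 * cx2 * r_12 * x1bar
                 - 2.0 * b1 * cx1 * cy * r_y1 * x1bar
                 - 2.0 * b2 * cx2 * cy * r_y2 * x2bar)
    t3 = (b1 ** 2 * cx1 ** 2 * x1bar ** 2
          + 2.0 * b1 * b2 * cx1 * cx2 * r_12 * x1bar * x2bar
          + b2 ** 2 * cx2 ** 2 * x2bar ** 2)
    return delta * (t1 + t2 + t3)


def ms_optimum(cy, cx1, cx2, r_y1, r_y2, r_12):
    """Optimum exponents, assuming the factor 2 applies to both."""
    a = 1.0 - r_12 ** 2
    beta1 = 2.0 * cy * r_12 * (r_y1 * r_12 - r_y2) / (cx1 * a)
    beta2 = 2.0 * cy * r_12 * (r_y2 * r_12 - r_y1) / (cx2 * a)
    return beta1, beta2


# Hypothetical population summaries, chosen only for illustration.
p = dict(ybar=50.0, x1bar=30.0, x2bar=40.0, delta=0.045,
         cy=0.8, cx1=0.6, cx2=0.7, r_y1=0.7, r_y2=0.6, r_12=0.5)
bt1, bt2 = ms_optimum(p["cy"], p["cx1"], p["cx2"],
                      p["r_y1"], p["r_y2"], p["r_12"])
a_val = 1.0 - p["r_12"] ** 2
l_val = (1.0 - p["r_12"] ** 2 - p["r_y1"] ** 2
         + 2.0 * p["r_12"] * p["r_y1"] * p["r_y2"] - p["r_y2"] ** 2)
target = p["delta"] * p["ybar"] ** 2 * p["cy"] ** 2 * l_val / a_val
ms_at_optimum = mse_ms(bt1, bt2, **p)
```

With these illustrative inputs, `ms_at_optimum` matches `target`, the common value $\delta\bar{Y}^2C_y^2L/A$, to within rounding error.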
2.4. The Regression Estimator
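Rao [12] proposed an estimator using one auxiliary variable, $x_1$, to estimate the population mean $\bar{Y}$, as follows:
$$\bar{y}_{Rao} = w_1\bar{y} + w_2\left(\bar{X}_1 - \bar{x}_1\right). \tag{13}$$
Similarly, following Rao, a regression estimator of $\bar{Y}$ using two auxiliary variables, $x_1$ and $x_2$, is given by
$$\bar{y}_{RE} = w_1\bar{y} + w_2\left(\bar{X}_1 - \bar{x}_1\right) + w_3\left(\bar{X}_2 - \bar{x}_2\right), \tag{14}$$
where $w_1$, $w_2$, and $w_3$ are real constants.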
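The MSE of $\bar{y}_{RE}$ is given by
$$\begin{aligned}
\operatorname{MSE}(\bar{y}_{RE}) = {} & \bar{Y}^2\left[1 + w_1^2\left(1 + \delta C_y^2\right)\right] + \delta\left[\bar{X}_2^2 C_{x_2}^2 w_3^2 + \bar{X}_1 C_{x_1} w_2\left(\bar{X}_1 C_{x_1} w_2 + 2\bar{X}_2 C_{x_2} w_3\rho_{x_1x_2}\right)\right] \\
& - 2\bar{Y} w_1\left[\bar{Y} + \delta C_y\left(\bar{X}_1 C_{x_1} w_2\rho_{yx_1} + \bar{X}_2 C_{x_2} w_3\rho_{yx_2}\right)\right].
\end{aligned} \tag{15}$$
The optimum values of $w_1$, $w_2$, and $w_3$, obtained by minimizing (15), are given by
$$w_1^{*} = \frac{A}{A + \delta C_y^2 L}, \qquad w_2^{*} = \frac{\bar{Y} C_y\left(\rho_{yx_1} - \rho_{x_1x_2}\rho_{yx_2}\right)}{C_{x_1}\bar{X}_1\left(A + \delta C_y^2 L\right)}, \qquad w_3^{*} = \frac{\bar{Y} C_y\left(\rho_{yx_2} - \rho_{x_1x_2}\rho_{yx_1}\right)}{C_{x_2}\bar{X}_2\left(A + \delta C_y^2 L\right)}. \tag{16}$$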
The minimum MSE of $\bar{y}_{RE}$ can be shown to be
$$\operatorname{MSE}_{\min}(\bar{y}_{RE}) = \frac{\delta\,\bar{Y}^2 C_y^2 L}{A + \delta C_y^2 L}. \tag{17}$$
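The optimum weights and the minimum MSE of the regression estimator can be verified in the same way. This is a minimal sketch under hypothetical population values; `mse_re` encodes one reading of (15), in which $\bar{Y}^2$ multiplies the whole first bracket, and the function names are our own.

```python
def mse_re(w1, w2, w3, *, ybar, x1bar, x2bar, delta,
           cy, cx1, cx2, r_y1, r_y2, r_12):
    """MSE of the two-variable regression estimator, as read from (15)."""
    return (ybar ** 2 * (1.0 + w1 ** 2 * (1.0 + delta * cy ** 2))
            + delta * (x2bar ** 2 * cx2 ** 2 * w3 ** 2
                       + x1bar * cx1 * w2
                       * (x1bar * cx1 * w2 + 2.0 * x2bar * cx2 * w3 * r_12))
            - 2.0 * ybar * w1
            * (ybar + delta * cy * (x1bar * cx1 * w2 * r_y1
                                    + x2bar * cx2 * w3 * r_y2)))


def re_optimum(*, ybar, x1bar, x2bar, delta, cy, cx1, cx2, r_y1, r_y2, r_12):
    """Optimum weights (16) and minimum MSE (17)."""
    a = 1.0 - r_12 ** 2
    ell = 1.0 - r_12 ** 2 - r_y1 ** 2 + 2.0 * r_12 * r_y1 * r_y2 - r_y2 ** 2
    d = a + delta * cy ** 2 * ell
    w1 = a / d
    w2 = ybar * cy * (r_y1 - r_12 * r_y2) / (cx1 * x1bar * d)
    w3 = ybar * cy * (r_y2 - r_12 * r_y1) / (cx2 * x2bar * d)
    return (w1, w2, w3), delta * ybar ** 2 * cy ** 2 * ell / d


# Hypothetical population summaries, chosen only for illustration.
q = dict(ybar=50.0, x1bar=30.0, x2bar=40.0, delta=0.045,
         cy=0.8, cx1=0.6, cx2=0.7, r_y1=0.7, r_y2=0.6, r_12=0.5)
weights, re_min = re_optimum(**q)
re_at_optimum = mse_re(*weights, **q)
```

Here `re_at_optimum` and `re_min` coincide up to rounding, and `weights[0]` is slightly below 1, which reflects the shrinkage that makes (17) smaller than (4).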
2.5. The Proposed Estimator
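Singh and Espejo [13] proposed an estimator using one auxiliary variable, $x$, to estimate the population mean $\bar{Y}$, as follows:
$$\bar{y}_{SE} = \frac{\bar{y}}{2}\left(\frac{\bar{X}}{\bar{x}} + \frac{\bar{x}}{\bar{X}}\right). \tag{18}$$
Inspired by this work, we propose a new estimator with two auxiliary variables, as follows:
$$\bar{y}_{pr} = \frac{k_1\bar{y} + k_2\left(\bar{X}_1 - \bar{x}_1\right) + k_3\left(\bar{X}_2 - \bar{x}_2\right)}{4}\left(\frac{\bar{X}_1}{\bar{x}_1} + \frac{\bar{x}_1}{\bar{X}_1}\right)\left(\frac{\bar{X}_2}{\bar{x}_2} + \frac{\bar{x}_2}{\bar{X}_2}\right), \tag{19}$$
where $k_1$, $k_2$, and $k_3$ are real constants.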
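To make the structure of the proposed estimator concrete, the following sketch computes it from sample data. The function name, argument names, and the small data vectors are hypothetical; the constants $k_1$, $k_2$, and $k_3$ are simply passed in by the caller.

```python
def proposed_estimate(y, x1, x2, X1bar, X2bar, k1, k2, k3):
    """Evaluate the proposed estimator (19) for one sample.

    y, x1, x2 are equal-length sample vectors; X1bar and X2bar are the
    known population means of the two auxiliary variables.
    """
    ybar = sum(y) / len(y)
    x1bar = sum(x1) / len(x1)
    x2bar = sum(x2) / len(x2)
    # Regression-type linear part.
    linear = k1 * ybar + k2 * (X1bar - x1bar) + k3 * (X2bar - x2bar)
    # Ratio-product factors, one per auxiliary variable.
    factor1 = X1bar / x1bar + x1bar / X1bar
    factor2 = X2bar / x2bar + x2bar / X2bar
    return linear / 4.0 * factor1 * factor2


# Illustrative data: the sample means equal the population means here,
# so with k1 = 1 the estimator reduces to the sample mean of y.
estimate = proposed_estimate(
    y=[10.0, 12.0, 14.0], x1=[4.0, 5.0, 6.0], x2=[7.0, 8.0, 9.0],
    X1bar=5.0, X2bar=8.0, k1=1.0, k2=0.3, k3=0.2)
```

Each ratio-product factor equals 2 when the sample mean matches the population mean, which is why the denominator 4 is needed for the estimator to reduce to $k_1\bar{y}$ in that case.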
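Let $\varepsilon_0 = \bar{y}/\bar{Y} - 1$, $\varepsilon_1 = \bar{x}_1/\bar{X}_1 - 1$, and $\varepsilon_2 = \bar{x}_2/\bar{X}_2 - 1$. Under simple random sampling without replacement (SRSWOR), we have the following expectations:
$$\begin{aligned}
& E(\varepsilon_0) = E(\varepsilon_1) = E(\varepsilon_2) = 0, \\
& E(\varepsilon_0^2) = \delta C_y^2, \quad E(\varepsilon_1^2) = \delta C_{x_1}^2, \quad E(\varepsilon_2^2) = \delta C_{x_2}^2, \\
& E(\varepsilon_0\varepsilon_1) = \delta\rho_{yx_1} C_y C_{x_1}, \quad E(\varepsilon_0\varepsilon_2) = \delta\rho_{yx_2} C_y C_{x_2}, \quad E(\varepsilon_1\varepsilon_2) = \delta\rho_{x_1x_2} C_{x_1} C_{x_2}.
\end{aligned} \tag{20}$$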
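The second-moment identities above can be confirmed by brute force on a tiny population. The sketch below enumerates every SRSWOR sample and compares the exact expectation of $\varepsilon_0\varepsilon_1$ with $\delta\rho_{yx_1}C_yC_{x_1}$. It assumes that population variances and covariances use the divisor $N - 1$, which the text does not state explicitly; the six-unit population and the helper names are invented for illustration.

```python
from itertools import combinations


def avg(values):
    """Arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def check_cross_moment(y, x, n):
    """Return (exact E[e0 * e1] over all SRSWOR samples, delta*rho*Cy*Cx)."""
    N = len(y)
    y_pop, x_pop = avg(y), avg(x)
    # Finite-population covariance with divisor N - 1 (an assumption).
    s_yx = sum((a - y_pop) * (b - x_pop) for a, b in zip(y, x)) / (N - 1)
    delta = (1.0 - n / N) / n
    # rho * Cy * Cx equals S_yx / (Ybar * Xbar), so no square roots needed.
    theory = delta * s_yx / (y_pop * x_pop)
    products = []
    for idx in combinations(range(N), n):  # all equally likely samples
        e0 = avg(y[i] for i in idx) / y_pop - 1.0
        e1 = avg(x[i] for i in idx) / x_pop - 1.0
        products.append(e0 * e1)
    return avg(products), theory


# Hypothetical six-unit population, samples of size three.
y_units = [12.0, 15.0, 9.0, 20.0, 17.0, 11.0]
x_units = [30.0, 41.0, 25.0, 52.0, 44.0, 28.0]
exact_cross, theory_cross = check_cross_moment(y_units, x_units, n=3)
exact_var, theory_var = check_cross_moment(y_units, y_units, n=3)
```

Passing the same variable twice, as in the last line, checks the variance identity $E(\varepsilon_0^2) = \delta C_y^2$ with the same routine.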
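The proposed estimator $\bar{y}_{pr}$ can be rewritten as
$$\bar{y}_{pr} = \frac{k_1\bar{Y}\left(\varepsilon_0 + 1\right) - k_2\bar{X}_1\varepsilon_1 - k_3\bar{X}_2\varepsilon_2}{4}\left(\frac{1}{\varepsilon_1 + 1} + \varepsilon_1 + 1\right)\left(\frac{1}{\varepsilon_2 + 1} + \varepsilon_2 + 1\right). \tag{21}$$
By rewriting $\bar{y}_{pr}$, we have
$$\bar{y}_{pr} = \frac{k_1\bar{Y}\left(\varepsilon_0 + 1\right) - k_2\bar{X}_1\varepsilon_1 - k_3\bar{X}_2\varepsilon_2}{4}\left(\varepsilon_1 + 1 + 1 - \varepsilon_1 + \varepsilon_1^2 + \cdots\right)\left(\varepsilon_2 + 1 + 1 - \varepsilon_2 + \varepsilon_2^2 + \cdots\right). \tag{22}$$
By retaining only the terms up to the second degree in the $\varepsilon$'s, we have
$$\bar{y}_{pr} - \bar{Y} \cong \left(k_1 - 1\right)\bar{Y} + k_1\bar{Y}\varepsilon_0 - k_2\bar{X}_1\varepsilon_1 - k_3\bar{X}_2\varepsilon_2 + \tfrac{1}{2}k_1\bar{Y}\varepsilon_1^2 + \tfrac{1}{2}k_1\bar{Y}\varepsilon_2^2. \tag{23}$$
The bias of the proposed estimator is given by
$$\operatorname{Bias}(\bar{y}_{pr}) = E\left(\bar{y}_{pr} - \bar{Y}\right) \cong \bar{Y}\left[\left(k_1 - 1\right) + \tfrac{1}{2}\delta k_1 C_{x_1}^2 + \tfrac{1}{2}\delta k_1 C_{x_2}^2\right]. \tag{24}$$
The MSE of this new estimator with two auxiliary variables is given by
$$\begin{aligned}
\operatorname{MSE}(\bar{y}_{pr}) = E\left(\bar{y}_{pr} - \bar{Y}\right)^2 \cong {} & \bar{Y}^2\left[\left(k_1 - 1\right)^2 - \delta k_1\left(C_{x_1}^2 + C_{x_2}^2\right) + \delta k_1^2\left(C_{x_1}^2 + C_{x_2}^2 + C_y^2\right)\right] \\
& - 2\delta k_1\bar{Y} C_y\left(k_2 C_{x_1}\rho_{yx_1}\bar{X}_1 + k_3 C_{x_2}\rho_{yx_2}\bar{X}_2\right) \\
& + \delta\left(k_2^2 C_{x_1}^2\bar{X}_1^2 + 2C_{x_1} C_{x_2} k_2 k_3\rho_{x_1x_2}\bar{X}_1\bar{X}_2 + k_3^2 C_{x_2}^2\bar{X}_2^2\right).
\end{aligned} \tag{25}$$
The optimum values of $k_1$, $k_2$, and $k_3$ are given by
$$\begin{aligned}
k_1^{*} &= \frac{A\left[2 + \delta\left(C_{x_1}^2 + C_{x_2}^2\right)\right]}{2\left[A + C_y^2\delta L + A\delta\left(C_{x_1}^2 + C_{x_2}^2\right)\right]}, \\
k_2^{*} &= \frac{C_y\bar{Y}\left(\rho_{yx_1} - \rho_{x_1x_2}\rho_{yx_2}\right)\left[2 + \delta\left(C_{x_1}^2 + C_{x_2}^2\right)\right]}{2C_{x_1}\bar{X}_1\left[A + C_y^2\delta L + A\delta\left(C_{x_1}^2 + C_{x_2}^2\right)\right]}, \\
k_3^{*} &= \frac{C_y\bar{Y}\left(\rho_{yx_2} - \rho_{x_1x_2}\rho_{yx_1}\right)\left[2 + \delta\left(C_{x_1}^2 + C_{x_2}^2\right)\right]}{2C_{x_2}\bar{X}_2\left[A + C_y^2\delta L + A\delta\left(C_{x_1}^2 + C_{x_2}^2\right)\right]}.
\end{aligned} \tag{26}$$
The minimum MSE of $\bar{y}_{pr}$ can be shown to be
$$\operatorname{MSE}_{\min}(\bar{y}_{pr}) = \frac{\delta\,\bar{Y}^2\left[4C_y^2 L - A\delta\left(C_{x_1}^2 + C_{x_2}^2\right)^2\right]}{4\left[A + C_y^2\delta L + A\delta\left(C_{x_1}^2 + C_{x_2}^2\right)\right]}. \tag{27}$$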
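The passage from (21) to (23) rests on a geometric-series expansion. The following worked step is a sketch of that reasoning, assuming $|\varepsilon_i| < 1$ so that the series converges.

```latex
\begin{align*}
\frac{1}{1 + \varepsilon_i} &= 1 - \varepsilon_i + \varepsilon_i^2 - \cdots
  \quad\Longrightarrow\quad
  \frac{1}{1 + \varepsilon_i} + 1 + \varepsilon_i = 2 + \varepsilon_i^2 - \cdots, \\
\frac{1}{4}\left(2 + \varepsilon_1^2\right)\left(2 + \varepsilon_2^2\right)
  &= 1 + \tfrac{1}{2}\varepsilon_1^2 + \tfrac{1}{2}\varepsilon_2^2
     + \tfrac{1}{4}\varepsilon_1^2\varepsilon_2^2
   \cong 1 + \tfrac{1}{2}\varepsilon_1^2 + \tfrac{1}{2}\varepsilon_2^2, \\
\bar{y}_{pr}
  &\cong \left[k_1\bar{Y}(1 + \varepsilon_0) - k_2\bar{X}_1\varepsilon_1 - k_3\bar{X}_2\varepsilon_2\right]
     \left(1 + \tfrac{1}{2}\varepsilon_1^2 + \tfrac{1}{2}\varepsilon_2^2\right).
\end{align*}
```

The first-order terms in $\varepsilon_1$ and $\varepsilon_2$ cancel inside each ratio-product factor, so the only second-degree contribution from the factors is $\tfrac{1}{2}k_1\bar{Y}(\varepsilon_1^2 + \varepsilon_2^2)$; every other cross product is of third degree or higher and is dropped.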
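The optimum constants and the minimum MSE of the proposed estimator can be checked numerically against the MSE expression. The sketch below encodes one reading of (25) to (27); the population values and function names are hypothetical and are used only to confirm that the closed forms are mutually consistent.

```python
def mse_pr(k1, k2, k3, *, ybar, x1bar, x2bar, delta,
           cy, cx1, cx2, r_y1, r_y2, r_12):
    """First-order MSE of the proposed estimator, as read from (25)."""
    s = cx1 ** 2 + cx2 ** 2
    return (ybar ** 2 * ((k1 - 1.0) ** 2 - delta * k1 * s
                         + delta * k1 ** 2 * (s + cy ** 2))
            - 2.0 * delta * k1 * ybar * cy
            * (k2 * cx1 * r_y1 * x1bar + k3 * cx2 * r_y2 * x2bar)
            + delta * (k2 ** 2 * cx1 ** 2 * x1bar ** 2
                       + 2.0 * cx1 * cx2 * k2 * k3 * r_12 * x1bar * x2bar
                       + k3 ** 2 * cx2 ** 2 * x2bar ** 2))


def pr_optimum(*, ybar, x1bar, x2bar, delta, cy, cx1, cx2, r_y1, r_y2, r_12):
    """Optimum constants (26) and minimum MSE (27)."""
    a = 1.0 - r_12 ** 2
    ell = 1.0 - r_12 ** 2 - r_y1 ** 2 + 2.0 * r_12 * r_y1 * r_y2 - r_y2 ** 2
    s = cx1 ** 2 + cx2 ** 2
    denom = 2.0 * (a + cy ** 2 * delta * ell + a * delta * s)
    top = 2.0 + delta * s
    k1 = a * top / denom
    k2 = cy * ybar * (r_y1 - r_12 * r_y2) * top / (cx1 * x1bar * denom)
    k3 = cy * ybar * (r_y2 - r_12 * r_y1) * top / (cx2 * x2bar * denom)
    # The denominator of (27) is 4[...] = 2 * denom.
    min_mse = (delta * ybar ** 2
               * (4.0 * cy ** 2 * ell - a * delta * s ** 2) / (2.0 * denom))
    return (k1, k2, k3), min_mse


# Hypothetical population summaries, chosen only for illustration.
r = dict(ybar=50.0, x1bar=30.0, x2bar=40.0, delta=0.045,
         cy=0.8, cx1=0.6, cx2=0.7, r_y1=0.7, r_y2=0.6, r_12=0.5)
ks, pr_min = pr_optimum(**r)
pr_at_optimum = mse_pr(*ks, **r)
```

For these inputs `pr_at_optimum` agrees with `pr_min`, and perturbing any of the three constants increases the value returned by `mse_pr`.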
2.6. Comparison of $\bar{y}_{pr}$ with Some Existing Estimators
We compared the minimum MSE of the proposed estimator with two auxiliary variables, given in (27), with the minimum MSE of the estimators reported by Abu-Dayyeh et al. [7], given in (4); Kadilar and Cingi [8], given in (8); and Malik and Singh [9], given in (12); and with that of the regression estimator, given in (17). The comparison gives
$$\operatorname{MSE}_{\min}(\bar{y}_{pr}) < \operatorname{MSE}_{\min}(\bar{y}_{RE}) < \operatorname{MSE}_{\min}(\bar{y}_{AD}) = \operatorname{MSE}_{\min}(\bar{y}_{KC}) = \operatorname{MSE}_{\min}(\bar{y}_{MS}), \tag{28}$$
which always holds.
To examine the merits of the proposed estimator, we considered four natural population datasets from agricultural, biomedical, and power engineering. We used the following formula to calculate the percent relative efficiency (PRE) of the different estimators:
$$\operatorname{PRE}(\phi, \bar{y}_{AD}) = \frac{\operatorname{MSE}(\bar{y}_{AD})}{\operatorname{MSE}(\phi)} \times 100, \tag{31}$$
where $\phi = \bar{y}_{RE}$, $\bar{y}_{KC}$, $\bar{y}_{MS}$, or $\bar{y}_{pr}$.
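The ordering in (28) can be made explicit with two shorthand quantities. The following is a sketch of one way to see it, not necessarily the argument intended by the original derivation; the symbols $t$ and $s$ are introduced here only for illustration, and all expressions are first-order approximations.

```latex
\begin{align*}
t &= \frac{\delta C_y^2 L}{A} \ge 0, \qquad s = \delta\left(C_{x_1}^2 + C_{x_2}^2\right) > 0, \\
\mathrm{MSE}_{\min}(\bar{y}_{AD}) &= \bar{Y}^2 t, \qquad
\mathrm{MSE}_{\min}(\bar{y}_{RE}) = \bar{Y}^2\,\frac{t}{1 + t}, \qquad
\mathrm{MSE}_{\min}(\bar{y}_{pr}) = \bar{Y}^2\,\frac{4t - s^2}{4(1 + s + t)}, \\
\mathrm{MSE}_{\min}(\bar{y}_{AD}) - \mathrm{MSE}_{\min}(\bar{y}_{RE})
  &= \bar{Y}^2\,\frac{t^2}{1 + t} \ge 0, \\
\mathrm{MSE}_{\min}(\bar{y}_{RE}) - \mathrm{MSE}_{\min}(\bar{y}_{pr})
  &= \bar{Y}^2\,\frac{s\left(s + 4t + st\right)}{4(1 + t)(1 + s + t)} > 0.
\end{align*}
```

The first difference is strictly positive whenever $L > 0$, and the second is strictly positive whenever at least one auxiliary variable has a nonzero coefficient of variation.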
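The PRE formula is a simple ratio, and it can be applied directly to the tabulated MSE values. The sketch below recomputes the PRE entries for the $n = 10$ row of Table 1 from the reported MSE values; the function name `pre` and the dictionary layout are our own.

```python
def pre(mse_reference, mse_candidate):
    """Percent relative efficiency of a candidate against a reference."""
    if mse_candidate <= 0:
        raise ValueError("MSE of the candidate estimator must be positive")
    return 100.0 * mse_reference / mse_candidate


# MSE values reported for Population I with n = 10 (Table 1).
mse_n10 = {"AD": 0.738607, "KC": 0.738607, "MS": 0.738607,
           "RE": 0.716737, "pr": 0.433563}
pre_n10 = {name: round(pre(mse_n10["AD"], value), 2)
           for name, value in mse_n10.items()}
# pre_n10["RE"] is about 103.05 and pre_n10["pr"] is about 170.36,
# in line with the PRE entries reported in Table 1.
```

A PRE above 100 means the candidate has a smaller MSE than $\bar{y}_{AD}$; the reference estimator itself always scores exactly 100.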
Population I (source in biomedical engineering [14])
Y: number of “placebo” children.
X1: number of paralytic polio cases in the placebo group.
X2: number of paralytic polio cases in the “not inoculated” group.
MSE and PRE values of different estimators about population I can be seen in Table 1.
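Table 1: MSE and PRE values of different estimators for Population I.

| Measure | $n$ | $n/N$ | $\bar{y}_{AD}$ | $\bar{y}_{KC}$ | $\bar{y}_{MS}$ | $\bar{y}_{RE}$ | $\bar{y}_{pr}$ |
|---|---|---|---|---|---|---|---|
| MSE | 10 | 0.294 | 0.738607 | 0.738607 | 0.738607 | 0.716737 | 0.433563 |
| MSE | 15 | 0.441 | 0.390406 | 0.390406 | 0.390406 | 0.384209 | 0.297354 |
| MSE | 20 | 0.588 | 0.211030 | 0.211030 | 0.211030 | 0.209207 | 0.182492 |
| MSE | 25 | 0.735 | 0.105515 | 0.105515 | 0.105515 | 0.105057 | 0.098166 |
| PRE | 10 | 0.294 | 100 | 100 | 100 | 103.05 | 170.36 |
| PRE | 15 | 0.441 | 100 | 100 | 100 | 101.61 | 131.29 |
| PRE | 20 | 0.588 | 100 | 100 | 100 | 100.87 | 115.64 |
| PRE | 25 | 0.735 | 100 | 100 | 100 | 100.44 | 107.47 |

PRE denotes $\operatorname{PRE}(\phi, \bar{y}_{AD})$.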
MSE and PRE values of different estimators about population II can be seen in Table 2.
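Table 2: MSE and PRE values of different estimators for Population II.

| Measure | $n$ | $n/N$ | $\bar{y}_{AD}$ | $\bar{y}_{KC}$ | $\bar{y}_{MS}$ | $\bar{y}_{RE}$ | $\bar{y}_{pr}$ |
|---|---|---|---|---|---|---|---|
| MSE | 10 | 0.056 | 1.703302 | 1.703302 | 1.703302 | 1.688617 | 1.607564 |
| MSE | 50 | 0.278 | 0.260505 | 0.260505 | 0.260505 | 0.260159 | 0.258197 |
| MSE | 90 | 0.500 | 0.100194 | 0.100194 | 0.100194 | 0.100143 | 0.099852 |
| MSE | 130 | 0.722 | 0.038536 | 0.038536 | 0.038536 | 0.038529 | 0.038486 |
| PRE | 10 | 0.056 | 100 | 100 | 100 | 100.870 | 105.955 |
| PRE | 50 | 0.278 | 100 | 100 | 100 | 100.133 | 100.894 |
| PRE | 90 | 0.500 | 100 | 100 | 100 | 100.051 | 100.343 |
| PRE | 130 | 0.722 | 100 | 100 | 100 | 100.020 | 100.132 |

PRE denotes $\operatorname{PRE}(\phi, \bar{y}_{AD})$.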
MSE and PRE values of different estimators about population III can be seen in Table 3.
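Table 3: MSE and PRE values of different estimators for Population III.

| Measure | $n$ | $n/N$ | $\bar{y}_{AD}$ | $\bar{y}_{KC}$ | $\bar{y}_{MS}$ | $\bar{y}_{RE}$ | $\bar{y}_{pr}$ |
|---|---|---|---|---|---|---|---|
| MSE | 10 | 0.182 | 0.15048093 | 0.15048093 | 0.15048093 | 0.15040335 | 0.15032137 |
| MSE | 20 | 0.364 | 0.05981617 | 0.05981617 | 0.05981617 | 0.05980391 | 0.05979094 |
| MSE | 30 | 0.545 | 0.02821517 | 0.02821517 | 0.02821517 | 0.02821245 | 0.0282096 |
| MSE | 40 | 0.727 | 0.01279088 | 0.01279088 | 0.01279088 | 0.01279032 | 0.01278973 |
| PRE | 10 | 0.182 | 100 | 100 | 100 | 100.0516 | 100.1061 |
| PRE | 20 | 0.364 | 100 | 100 | 100 | 100.0205 | 100.0422 |
| PRE | 30 | 0.545 | 100 | 100 | 100 | 100.0097 | 100.0199 |
| PRE | 40 | 0.727 | 100 | 100 | 100 | 100.0044 | 100.0090 |

PRE denotes $\operatorname{PRE}(\phi, \bar{y}_{AD})$.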
MSE and PRE values of different estimators about population IV can be seen in Table 4.
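Table 4: MSE and PRE values of different estimators for Population IV.

| Measure | $n$ | $n/N$ | $\bar{y}_{AD}$ | $\bar{y}_{KC}$ | $\bar{y}_{MS}$ | $\bar{y}_{RE}$ | $\bar{y}_{pr}$ |
|---|---|---|---|---|---|---|---|
| MSE | 10 | 0.333 | 21.30709 | 21.30709 | 21.30709 | 21.30557 | 17.16434 |
| MSE | 15 | 0.500 | 10.54174 | 10.54174 | 10.54174 | 10.54137 | 9.52763 |
| MSE | 20 | 0.667 | 6.70838 | 6.70838 | 6.70838 | 6.70823 | 6.29770 |
| MSE | 25 | 0.833 | 3.51391 | 3.51391 | 3.51391 | 3.51387 | 3.40123 |
| PRE | 10 | 0.333 | 100 | 100 | 100 | 100.0071 | 124.1358 |
| PRE | 15 | 0.500 | 100 | 100 | 100 | 100.0035 | 110.6438 |
| PRE | 20 | 0.667 | 100 | 100 | 100 | 100.0022 | 106.5210 |
| PRE | 25 | 0.833 | 100 | 100 | 100 | 100.0012 | 103.3129 |

PRE denotes $\operatorname{PRE}(\phi, \bar{y}_{AD})$.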
In many studies, relative efficiency has been assessed against the traditional regression- or ratio-type estimators (Abu-Dayyeh et al. [7], Haq and Shabbir [15], and Verma et al. [16]). In contrast, we assessed relative efficiency against more efficient estimators that use two auxiliary variables and found that the proposed estimator is more efficient than all of the estimators noted above under all conditions considered.
Across different sample sizes and datasets, Tables 1, 2, 3, and 4 show that the proposed estimator of a finite population mean using two auxiliary variables is always more efficient than the estimators $\bar{y}_{AD}$, $\bar{y}_{KC}$, $\bar{y}_{MS}$, and $\bar{y}_{RE}$. Although the estimators reported by Abu-Dayyeh et al. [7], Kadilar and Cingi [8], and Malik and Singh [9] have different forms, the comparative study above shows that their minimum MSE values are equal and share the same expression. The regression estimator has a smaller MSE than these three estimators. As the sampling fraction increases, both the MSE and the PRE of the proposed estimator decrease; that is, the efficiency gain of the proposed estimator over the other estimators is largest when the sampling fraction is small. Moreover, with a large sampling fraction, the efficiency differences among all estimators in the present study are very small. Therefore, we suggest that the proposed estimator be used with small sampling fractions. From this viewpoint, the proposed estimator can save survey cost. In some sampling fields, the sampling fraction cannot be very large because the test is irreversible or costly; in such cases, the gain in accuracy from the proposed estimator is greater. Haq and Shabbir [15] also reported on estimators of a finite population mean using two auxiliary attributes and found that their MSE values decreased as the sample size increased. The findings of the present study are therefore consistent with that study.
4. Conclusions
In this paper, we proposed an improved estimator of a finite population mean that uses information on two auxiliary variables under simple random sampling (SRS). Bias and MSE expressions of the proposed estimator, $\bar{y}_{pr}$, were obtained. We showed that the new estimator is always more efficient than the estimators reported by Abu-Dayyeh et al. [7], Kadilar and Cingi [8], and Malik and Singh [9], as well as the regression estimator using two auxiliary variables. These theoretical results are also confirmed by four numerical examples from agricultural, biomedical, and power engineering. It should be noted that a smaller sample size yields a larger efficiency gain, in terms of PRE, than a larger sample size. Thus, for small sample sizes, the suggested estimator would be cost-saving in practice and is, therefore, recommended for efficient estimation of a finite population mean.
Conflicts of Interest
The author declares that there are no conflicts of interest.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (no. 11461051) and Natural Science Foundation of Inner Mongolia Autonomous Region of China (no. 2017MS(LH)0101), and the author gratefully acknowledges this support.
References
[1] H. P. Singh and G. K. Vishwakarma, "Modified exponential ratio and product estimators for finite population mean in double sampling," vol. 36, pp. 217-225, 2007.
[2] L. K. Grover and P. Kaur, "An improved estimator of the finite population mean in simple random sampling," vol. 6, no. 1, pp. 47-55, 2011. doi:10.3233/MAS-2011-0163
[3] L. K. Grover and P. Kaur, "An improved exponential estimator for finite population mean in simple random sampling using an auxiliary attribute," vol. 218, no. 7, pp. 3093-3099, 2011. doi:10.1016/j.amc.2011.08.035
[4] R. Singh, S. Malik, M. K. Chaudhary, H. K. Verma, and A. Adewara, "A general family of ratio-type estimators in systematic sampling," vol. 5, pp. 73-82, 2012.
[5] H. P. Singh and R. S. Solanki, "Improved estimation of population mean in simple random sampling using information on auxiliary attribute," vol. 218, no. 15, pp. 7798-7812, 2012. doi:10.1016/j.amc.2012.01.047
[6] S. Gupta and J. Shabbir, "On improvement in estimating the population mean in simple random sampling," vol. 35, no. 5-6, pp. 559-566, 2008. doi:10.1080/02664760701835839
[7] W. A. Abu-Dayyeh, M. S. Ahmed, R. A. Ahmed, and H. A. Muttlak, "Some estimators of a finite population mean using auxiliary information," vol. 139, no. 2-3, pp. 287-298, 2003. doi:10.1016/S0096-3003(02)00180-7
[8] C. Kadilar and H. Cingi, "A new estimator using two auxiliary variables," vol. 162, no. 2, pp. 901-908, 2005. doi:10.1016/j.amc.2003.12.130
[9] S. Malik and R. Singh, "An improved estimator using two auxiliary attributes," vol. 219, no. 23, pp. 10983-10986, 2013. doi:10.1016/j.amc.2013.05.014
[10] P. Sharma and R. Singh, "A class of exponential ratio estimators of finite population mean using two auxiliary variables," vol. 11, no. 2, pp. 221-229, 2015. doi:10.18187/pjsor.v11i2.759
[11] S. Muneer, J. Shabbir, and A. Khalil, "Estimation of finite population mean in simple random sampling and stratified random sampling using two auxiliary variables," vol. 46, no. 5, pp. 2181-2192, 2017. doi:10.1080/03610926.2015.1035394
[12] T. J. Rao, "On certain methods of improving ratio and regression estimators," vol. 20, no. 10, pp. 3325-3340, 1991. doi:10.1080/03610929108830705
[13] H. P. Singh and M. R. Espejo, "On linear regression and ratio-product estimation of a finite population mean," vol. 52, no. 1, pp. 59-67, 2003. doi:10.1111/1467-9884.00341
[14] S. Choudhury and B. K. Singh, "A class of chain ratio-product type estimators with two auxiliary variables under double sampling scheme," vol. 41, no. 2, pp. 247-256, 2012. doi:10.1016/j.jkss.2011.09.002
[15] A. Haq and J. Shabbir, "An improved estimator of finite population mean when using two auxiliary attributes," vol. 241, pp. 14-24, 2014. doi:10.1016/j.amc.2014.04.069
[16] H. K. Verma, P. Sharma, and R. Singh, "Some families of estimators using two auxiliary variables in stratified random sampling," vol. 36, no. 2, pp. 140-150, 2015.