An Improved Class of Chain Ratio-Product Type Estimators in Two-Phase Sampling Using Two Auxiliary Variables

Gajendra K. Vishwakarma and Manish Kumar

Department of Applied Mathematics, Indian School of Mines, Dhanbad, Jharkhand 826004, India (ismdhanbad.ac.in)

Research Article. Journal of Probability and Statistics, 2014, Article ID 939701, Hindawi Publishing Corporation. ISSN 1687-9538, 1687-952X. doi:10.1155/2014/939701. Academic Editor: Zhidong Bai. Received 13 September 2013; accepted 23 January 2014; published 6 March 2014.

Copyright © 2014 Gajendra K. Vishwakarma and Manish Kumar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This paper presents a technique for estimating the finite population mean of the study variable in the presence of two auxiliary variables, using a two-phase sampling scheme, when the regression line does not pass through the neighborhood of the origin. The properties of the proposed class of estimators are studied under large sample approximation. In addition, bias and efficiency comparisons are carried out to study the performance of the proposed class of estimators relative to the existing estimators. It is also shown that the proposed technique has wide applicability in survey research. An empirical study is carried out to demonstrate the performance of the proposed estimators.

1. Introduction

The use of auxiliary information for estimating the population mean of the study variable has wide applicability in survey research. It is utilized at the estimation stage and the design stage to obtain an estimator that improves on those not utilizing auxiliary information. The use of ratio and product strategies in survey sampling depends solely upon knowledge of the population mean $\bar{X}$ of the auxiliary variable $X$.

The ratio estimator was developed by Cochran [1] to estimate the population mean $\bar{Y}$ of the study variable $Y$ by using information on an auxiliary variable $X$ that is positively correlated with $Y$. The ratio estimator is most effective when the relationship between $Y$ and $X$ is linear through the origin and the variance of $Y$ is proportional to $X$. Robson [2] defined a product estimator that was revisited by Murthy [3]. The product estimator is used when the auxiliary variable $X$ is negatively correlated with the study variable $Y$.

When the population mean $\bar{X}$ of the auxiliary variable $X$ is not known before the start of a survey, a first-phase sample of size $n'$ is selected from the population of size $N$, on which only the auxiliary variable $X$ is measured in order to furnish a good estimate of $\bar{X}$. A second-phase sample of size $n$ is then selected from the first-phase sample of size $n'$, on which both the study variable $Y$ and the auxiliary variable $X$ are measured. This procedure of selecting the samples from the given population is known as two-phase sampling (or double sampling). The concept of double sampling was first introduced by Neyman [4]. Contributions to two-phase sampling have been made by Sukhatme [5], Hidiroglou and Sarndal [6], Fuller [7], Hidiroglou [8], Singh and Vishwakarma [9], and Sahoo et al. [10].
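The selection procedure described above can be sketched in Python as follows. This is a minimal illustration only: the function name `two_phase_srswor` and the use of unit indices are assumptions made here, and simple random sampling without replacement is assumed at both phases, as the paper states later for its estimators. The sizes match those of Population I in the empirical study.

```python
import random


def two_phase_srswor(population_size, n_first, n_second, seed=None):
    """Draw nested simple random samples without replacement.

    The first-phase sample has n_first units taken from the population;
    the second-phase sample has n_second units taken from the first-phase
    sample. Both are returned as lists of unit indices.
    """
    if not (0 < n_second <= n_first <= population_size):
        raise ValueError("need 0 < n_second <= n_first <= population_size")
    rng = random.Random(seed)
    first_phase = rng.sample(range(population_size), n_first)
    second_phase = rng.sample(first_phase, n_second)
    return first_phase, second_phase


# Population size 34, first-phase size 15, second-phase size 10.
first, second = two_phase_srswor(34, 15, 10, seed=1)
```

On the first-phase units only the auxiliary variable would be recorded; on the second-phase units both the study variable and the auxiliary variable would be recorded.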

One, two, or more auxiliary variables can be used while estimating the population mean of the study variable; keeping this fact in view, Chand [11] introduced chain ratio estimators. This led various authors, including Kiregyera [12], Singh and Upadhyaya [13], Prasad et al. [14], Singh et al. [15], Singh and Choudhury [16], and Vishwakarma and Gangele [17], to modify the chain type estimators and discuss their properties.

When the population mean $\bar{Z}$ of another auxiliary variable $Z$, which has a positive correlation with $X$ (i.e., $\rho_{XZ} > 0$), is known and if $\rho_{YX} > \rho_{YZ} > 0$, then it is advisable to estimate $\bar{X}$ by $\hat{\bar{X}} = \bar{x}'(\bar{Z}/\bar{z}')$, which would provide a better estimate of $\bar{X}$ than $\bar{x}'$.

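The usual chain type ratio and product estimators of $\bar{Y}$ under the double sampling scheme using two auxiliary variables $X$ and $Z$ are given, respectively, by

$$\bar{y}_{R}^{dc} = \bar{y}\,\frac{\bar{x}'}{\bar{x}}\,\frac{\bar{Z}}{\bar{z}'}, \qquad \bar{y}_{P}^{dc} = \bar{y}\,\frac{\bar{x}}{\bar{x}'}\,\frac{\bar{z}'}{\bar{Z}}. \tag{1}$$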

Singh and Choudhury [16] suggested the following exponential chain type ratio and product estimators of $\bar{Y}$ under the double sampling scheme using two auxiliary variables $X$ and $Z$:

$$\bar{y}_{Re}^{dc} = \bar{y}\exp\left\{\frac{(\bar{x}'/\bar{z}')\bar{Z} - \bar{x}}{(\bar{x}'/\bar{z}')\bar{Z} + \bar{x}}\right\}, \qquad \bar{y}_{Pe}^{dc} = \bar{y}\exp\left\{\frac{\bar{x} - (\bar{x}'/\bar{z}')\bar{Z}}{\bar{x} + (\bar{x}'/\bar{z}')\bar{Z}}\right\}, \tag{2}$$

where $\bar{x}'$ and $\bar{z}'$ are the sample means of $X$ and $Z$, respectively, based on the first-phase sample of size $n'$ drawn from the population of size $N$ by Simple Random Sampling Without Replacement (SRSWOR). Also, $\bar{y}$ and $\bar{x}$ are the sample means of $Y$ and $X$, respectively, based on the second-phase sample of size $n$ drawn from the first-phase sample of size $n'$ by SRSWOR.
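As a minimal sketch of how the estimators in (1) and (2) are evaluated from sample means, the following Python functions may help. The function names, argument names (`x_bar_1` and `z_bar_1` stand for the first-phase means), and the numerical values are hypothetical and chosen only for illustration.

```python
import math


def chain_x_estimate(x_bar_1, z_bar_1, Z_bar):
    """Chain estimate of the unknown mean of X from the first-phase means."""
    return x_bar_1 * Z_bar / z_bar_1


def chain_ratio(y_bar, x_bar, x_bar_1, z_bar_1, Z_bar):
    """Chain ratio estimator of equation (1)."""
    return y_bar * chain_x_estimate(x_bar_1, z_bar_1, Z_bar) / x_bar


def chain_product(y_bar, x_bar, x_bar_1, z_bar_1, Z_bar):
    """Chain product estimator of equation (1)."""
    return y_bar * x_bar / chain_x_estimate(x_bar_1, z_bar_1, Z_bar)


def exp_chain_ratio(y_bar, x_bar, x_bar_1, z_bar_1, Z_bar):
    """Exponential chain ratio estimator of equation (2)."""
    u = chain_x_estimate(x_bar_1, z_bar_1, Z_bar)
    return y_bar * math.exp((u - x_bar) / (u + x_bar))


def exp_chain_product(y_bar, x_bar, x_bar_1, z_bar_1, Z_bar):
    """Exponential chain product estimator of equation (2)."""
    u = chain_x_estimate(x_bar_1, z_bar_1, Z_bar)
    return y_bar * math.exp((x_bar - u) / (x_bar + u))


# Hypothetical sample means, for illustration only.
means = dict(y_bar=5.1, x_bar=2.4, x_bar_1=2.7, z_bar_1=3.0, Z_bar=2.91)
estimates = {
    f.__name__: f(**means)
    for f in (chain_ratio, chain_product, exp_chain_ratio, exp_chain_product)
}
```

In each pair the ratio and product versions are reciprocal adjustments of the second-phase mean of the study variable, so their product equals the square of that mean.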

2. Proposed Estimator

It has been theoretically established that, in general, the linear regression estimator is more efficient than the ratio (product) estimator except when the regression line of $Y$ on $X$ passes through the neighborhood of the origin, in which case the efficiencies of these estimators are almost equal. However, owing to stronger intuitive appeal, survey statisticians favour the use of ratio and product estimators. Further, we note that, in many practical situations, the regression line does not pass through the neighborhood of the origin. In these situations, the ratio estimator does not perform as well as the linear regression estimator. Considering this fact, Singh and Ruiz Espejo [18] made an attempt to improve the performance of these estimators and suggested the following ratio-product type estimator for the population mean $\bar{Y}$ under the double sampling scheme using a single auxiliary variable $X$:

$$\bar{y}_{RP}^{d} = \bar{y}\left[\alpha\,\frac{\bar{x}'}{\bar{x}} + (1 - \alpha)\,\frac{\bar{x}}{\bar{x}'}\right], \tag{3}$$

where $\alpha$ is a real constant.

We propose the following exponential chain ratio-product type estimator for the population mean $\bar{Y}$ under the double sampling scheme using two auxiliary variables $X$ and $Z$:

$$\bar{y}_{RPe}^{dc} = \bar{y}\left[\alpha\exp\left\{\frac{(\bar{x}'/\bar{z}')\bar{Z} - \bar{x}}{(\bar{x}'/\bar{z}')\bar{Z} + \bar{x}}\right\} + (1 - \alpha)\exp\left\{\frac{\bar{x} - (\bar{x}'/\bar{z}')\bar{Z}}{\bar{x} + (\bar{x}'/\bar{z}')\bar{Z}}\right\}\right], \tag{4}$$

where $\alpha$ is a real constant to be determined such that the Mean Square Error (MSE) of the proposed estimator $\bar{y}_{RPe}^{dc}$ is minimum. For $\alpha = 1$, $\bar{y}_{RPe}^{dc}$ reduces to $\bar{y}_{Re}^{dc}$, whereas, for $\alpha = 0$, $\bar{y}_{RPe}^{dc}$ reduces to $\bar{y}_{Pe}^{dc}$.
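The proposed estimator (4) and its two limiting cases can be sketched as below. This is an illustrative implementation under stated assumptions, not the authors' code: the function and argument names are hypothetical (`x_bar_1` and `z_bar_1` denote the first-phase means), and the sample means are made-up numbers.

```python
import math


def proposed_estimator(alpha, y_bar, x_bar, x_bar_1, z_bar_1, Z_bar):
    """Exponential chain ratio-product type estimator of equation (4)."""
    u = x_bar_1 * Z_bar / z_bar_1            # chain estimate of the mean of X
    theta = (u - x_bar) / (u + x_bar)        # exponent of the ratio-type part
    return y_bar * (alpha * math.exp(theta) + (1.0 - alpha) * math.exp(-theta))


# Hypothetical sample means, for illustration only.
args = dict(y_bar=5.1, x_bar=2.4, x_bar_1=2.7, z_bar_1=3.0, Z_bar=2.91)
theta = (2.7 * 2.91 / 3.0 - 2.4) / (2.7 * 2.91 / 3.0 + 2.4)

ratio_case = proposed_estimator(1.0, **args)     # exponential chain ratio estimator
product_case = proposed_estimator(0.0, **args)   # exponential chain product estimator
mixed_case = proposed_estimator(0.5, **args)     # equal weights on both parts
```

With equal weights the two exponentials average to a hyperbolic cosine, which makes clear that the constant controls how strongly the estimate leans towards the ratio-type or the product-type adjustment.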

Remark. It is noted that the proposed estimator in (4) is a special case of the class of estimators $\bar{y}_{\text{class}} = \bar{y}\,H(\bar{x}, \bar{z}')$ proposed by Srivastava [19], where $H(\cdot)$ is a parametric function such that $H(\bar{x}_{s1}, \bar{Z}) = 1$ and satisfies certain regularity conditions defined in Srivastava [19].

3. Bias and MSE of the Proposed Estimator

To obtain the Bias and Mean Square Error (MSE) of the proposed estimator $\bar{y}_{RPe}^{dc}$, we consider

$$\bar{y} = \bar{Y}(1 + e_0), \quad \bar{x} = \bar{X}(1 + e_1), \quad \bar{x}' = \bar{X}(1 + e_1'), \quad \bar{z}' = \bar{Z}(1 + e_2'), \tag{5}$$

such that

$$E(e_0) = E(e_1) = E(e_1') = E(e_2') = 0, \tag{6}$$

where $|e_0| < 1$, $|e_1| < 1$, $|e_1'| < 1$, and $|e_2'| < 1$.

Let $C_Y$, $C_X$, and $C_Z$ be the coefficients of variation of $Y$, $X$, and $Z$, respectively. Also, let $\rho_{YX}$, $\rho_{YZ}$, and $\rho_{XZ}$ be the correlation coefficients between $Y$ and $X$, $Y$ and $Z$, and $X$ and $Z$, respectively. Then, we have

$$\begin{gathered}
E(e_0^2) = f_1 C_Y^2, \quad E(e_1^2) = f_1 C_X^2, \quad E({e_1'}^2) = f_2 C_X^2, \quad E({e_2'}^2) = f_2 C_Z^2, \\
E(e_0 e_1) = f_1 \rho_{YX} C_Y C_X, \quad E(e_0 e_1') = f_2 \rho_{YX} C_Y C_X, \quad E(e_0 e_2') = f_2 \rho_{YZ} C_Y C_Z, \\
E(e_1 e_1') = f_2 C_X^2, \quad E(e_1 e_2') = f_2 \rho_{XZ} C_X C_Z, \quad E(e_1' e_2') = f_2 \rho_{XZ} C_X C_Z,
\end{gathered} \tag{7}$$

where

$$\begin{gathered}
f_1 = \frac{1}{n} - \frac{1}{N}, \quad f_2 = \frac{1}{n'} - \frac{1}{N}, \quad f_3 = f_1 - f_2 = \frac{1}{n} - \frac{1}{n'}, \\
C_Y^2 = \frac{S_Y^2}{\bar{Y}^2}, \quad C_X^2 = \frac{S_X^2}{\bar{X}^2}, \quad C_Z^2 = \frac{S_Z^2}{\bar{Z}^2}, \\
\rho_{YX} = \frac{S_{YX}}{S_Y S_X}, \quad \rho_{YZ} = \frac{S_{YZ}}{S_Y S_Z}, \quad \rho_{XZ} = \frac{S_{XZ}}{S_X S_Z}, \\
S_Y^2 = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i - \bar{Y})^2, \quad S_X^2 = \frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})^2, \quad S_Z^2 = \frac{1}{N-1}\sum_{i=1}^{N}(Z_i - \bar{Z})^2, \\
S_{YX} = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i - \bar{Y})(X_i - \bar{X}), \quad S_{YZ} = \frac{1}{N-1}\sum_{i=1}^{N}(Y_i - \bar{Y})(Z_i - \bar{Z}), \\
S_{XZ} = \frac{1}{N-1}\sum_{i=1}^{N}(X_i - \bar{X})(Z_i - \bar{Z}).
\end{gathered} \tag{8}$$
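The moments in (7) can be combined mechanically. As a small worked check, the sketch below assembles the second moment of the linear combination of relative errors that drives the first-order MSE (second-phase error of X, minus first-phase error of X, plus first-phase error of Z) and confirms that the correlation between X and Z cancels, leaving only the two variance terms. The function names are hypothetical; the numbers are the Population I summary values reported in the empirical study, taking 15 as the first-phase and 10 as the second-phase sample size.

```python
def sampling_fractions(N, n_first, n):
    """Return (f1, f2, f3) as defined in equation (8)."""
    f1 = 1.0 / n - 1.0 / N
    f2 = 1.0 / n_first - 1.0 / N
    f3 = 1.0 / n - 1.0 / n_first
    return f1, f2, f3


def second_moment_of_combination(f1, f2, cx2, cz2, rho_xz):
    """E[(e1 - e1' + e2')^2], assembled term by term from equation (7)."""
    cx, cz = cx2 ** 0.5, cz2 ** 0.5
    return (
        f1 * cx2 + f2 * cx2 + f2 * cz2     # three squared terms
        - 2.0 * f2 * cx2                   # -2 E(e1 e1')
        + 2.0 * f2 * rho_xz * cx * cz      # +2 E(e1 e2')
        - 2.0 * f2 * rho_xz * cx * cz      # -2 E(e1' e2')
    )


f1, f2, f3 = sampling_fractions(N=34, n_first=15, n=10)
cx2, cz2, rho_xz = 1.5175, 1.1492, 0.6837
moment = second_moment_of_combination(f1, f2, cx2, cz2, rho_xz)
closed_form = f3 * cx2 + f2 * cz2
```

The two cross terms involving the correlation between X and Z have equal magnitude and opposite sign, which is why that correlation does not appear in the first-order expressions that follow.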

Now, expressing the estimator $\bar{y}_{RPe}^{dc}$ in terms of $e_0$, $e_1$, $e_1'$, and $e_2'$ and neglecting the terms in $e_0$, $e_1$, $e_1'$, and $e_2'$ of degree greater than two, we get

$$\begin{aligned}
\bar{y}_{RPe}^{dc} = \bar{Y}\Big[\,&1 + \alpha\left(e_1' - e_2' - e_1 + e_0 e_1' - e_0 e_2' - e_0 e_1\right) - \frac{\alpha}{2}\left({e_1'}^2 - {e_2'}^2 - e_1^2\right) + e_0 \\
&- \frac{1}{2}\left(e_1' - e_2' - e_1 + e_0 e_1' - e_0 e_2' - e_0 e_1\right) \\
&+ \frac{1}{4}\left({e_1'}^2 - {e_2'}^2 - e_1^2 - e_1' e_2' + e_1 e_2' - e_1 e_1'\right) + \frac{1}{8}\left({e_1'}^2 + {e_2'}^2 + e_1^2\right)\Big].
\end{aligned} \tag{9}$$
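A quick numerical sanity check of the second-order expansion (9) is sketched below: the exact estimator, written in terms of the relative errors of (5), is compared with the truncated expansion for small, arbitrarily chosen error values. The function names and the error values are illustrative assumptions; the population means of X and Z cancel from the exponent, so they are set to 1.

```python
import math


def exact_relative(alpha, e0, e1, e1p, e2p):
    """Proposed estimator divided by the population mean of Y, exactly."""
    x = 1.0 + e1                       # second-phase mean of X (relative)
    u = (1.0 + e1p) / (1.0 + e2p)      # chain estimate of the mean of X (relative)
    theta = (u - x) / (u + x)
    return (1.0 + e0) * (alpha * math.exp(theta) + (1.0 - alpha) * math.exp(-theta))


def second_order(alpha, e0, e1, e1p, e2p):
    """Right-hand side of (9) divided by the population mean of Y."""
    lin = e1p - e2p - e1 + e0 * e1p - e0 * e2p - e0 * e1
    sq = e1p ** 2 - e2p ** 2 - e1 ** 2
    cross = -e1p * e2p + e1 * e2p - e1 * e1p
    return (
        1.0 + alpha * lin - 0.5 * alpha * sq + e0 - 0.5 * lin
        + 0.25 * (sq + cross)
        + 0.125 * (e1p ** 2 + e2p ** 2 + e1 ** 2)
    )


# Arbitrary small relative errors (hypothetical values).
errors = dict(e0=0.01, e1=-0.008, e1p=0.006, e2p=0.004)
gap = abs(exact_relative(0.8, **errors) - second_order(0.8, **errors))
```

For errors of order 0.01 the remaining gap is of third order in the errors, which is the behaviour expected of an expansion truncated after the second-degree terms.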

To the first degree of approximation, the Bias and Mean Square Error (MSE) of the proposed estimator $\bar{y}_{RPe}^{dc}$ are given by

$$B(\bar{y}_{RPe}^{dc}) = \bar{Y}\left[\frac{(4\alpha - 1)}{8}\left\{f_3 C_X^2 + f_2 C_Z^2\right\} - \frac{(2\alpha - 1)}{2}\left\{f_3\rho_{YX}C_Y C_X + f_2\rho_{YZ}C_Y C_Z\right\}\right], \tag{10}$$

$$\mathrm{MSE}(\bar{y}_{RPe}^{dc}) = \bar{Y}^2\left[f_1 C_Y^2 + \frac{(2\alpha - 1)^2}{4}\left\{f_3 C_X^2 + f_2 C_Z^2\right\} - (2\alpha - 1)\left\{f_3\rho_{YX}C_Y C_X + f_2\rho_{YZ}C_Y C_Z\right\}\right]. \tag{11}$$

To the first degree of approximation, the expressions for the Bias and Mean Square Error (MSE) of the estimators $\bar{y}_{R}^{dc}$, $\bar{y}_{P}^{dc}$, $\bar{y}_{Re}^{dc}$, $\bar{y}_{Pe}^{dc}$, and $\bar{y}_{RP}^{d}$ are, respectively, given by

$$\begin{aligned}
B(\bar{y}_{R}^{dc}) &= \bar{Y}\left[f_3 C_X^2 + f_2 C_Z^2 - f_3\rho_{YX}C_Y C_X - f_2\rho_{YZ}C_Y C_Z\right], \\
B(\bar{y}_{P}^{dc}) &= \bar{Y}\left[f_3\rho_{YX}C_Y C_X + f_2\rho_{YZ}C_Y C_Z\right], \\
B(\bar{y}_{Re}^{dc}) &= \bar{Y}\left[\frac{3}{8}\left\{f_3 C_X^2 + f_2 C_Z^2\right\} - \frac{1}{2}\left\{f_3\rho_{YX}C_Y C_X + f_2\rho_{YZ}C_Y C_Z\right\}\right], \\
B(\bar{y}_{Pe}^{dc}) &= \bar{Y}\left[-\frac{1}{8}\left\{f_3 C_X^2 + f_2 C_Z^2\right\} + \frac{1}{2}\left\{f_3\rho_{YX}C_Y C_X + f_2\rho_{YZ}C_Y C_Z\right\}\right], \\
B(\bar{y}_{RP}^{d}) &= \bar{Y}\left[\alpha f_3 C_X^2 - (2\alpha - 1) f_3\rho_{YX}C_Y C_X\right],
\end{aligned} \tag{12}$$

$$\begin{aligned}
\mathrm{MSE}(\bar{y}_{R}^{dc}) &= \bar{Y}^2\left[f_1 C_Y^2 + f_3 C_X^2 + f_2 C_Z^2 - 2 f_3\rho_{YX}C_Y C_X - 2 f_2\rho_{YZ}C_Y C_Z\right], \\
\mathrm{MSE}(\bar{y}_{P}^{dc}) &= \bar{Y}^2\left[f_1 C_Y^2 + f_3 C_X^2 + f_2 C_Z^2 + 2 f_3\rho_{YX}C_Y C_X + 2 f_2\rho_{YZ}C_Y C_Z\right], \\
\mathrm{MSE}(\bar{y}_{Re}^{dc}) &= \bar{Y}^2\left[f_1 C_Y^2 + \frac{1}{4}\left\{f_3 C_X^2 + f_2 C_Z^2\right\} - \left\{f_3\rho_{YX}C_Y C_X + f_2\rho_{YZ}C_Y C_Z\right\}\right], \\
\mathrm{MSE}(\bar{y}_{Pe}^{dc}) &= \bar{Y}^2\left[f_1 C_Y^2 + \frac{1}{4}\left\{f_3 C_X^2 + f_2 C_Z^2\right\} + \left\{f_3\rho_{YX}C_Y C_X + f_2\rho_{YZ}C_Y C_Z\right\}\right], \\
\mathrm{MSE}(\bar{y}_{RP}^{d}) &= \bar{Y}^2\left[f_1 C_Y^2 + 4\alpha^2 f_3 C_X^2 - 4\alpha f_3\left\{C_X^2 + \rho_{YX}C_Y C_X\right\} + f_3\left\{C_X^2 + 2\rho_{YX}C_Y C_X\right\}\right].
\end{aligned} \tag{13}$$
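The first-order MSEs above share three building blocks, which makes the relationships between them easy to check numerically. In the sketch below, `V`, `A`, and `B` are shorthand introduced here (not notation from the paper) for the variance term of the sample mean, the combined variance term of the auxiliary variables, and the combined covariance term; the numerical values are arbitrary. It confirms that the MSE of the proposed estimator at weights 1 and 0 equals the exponential chain ratio and product MSEs, and it also illustrates an observation that follows directly from the formulas: at weights 3/2 and -1/2 the first-order MSE coincides with that of the chain ratio and chain product estimators, although the estimators themselves differ.

```python
def rel_mse_proposed(alpha, V, A, B):
    """First-order MSE of the proposed estimator divided by the squared mean of Y.

    V = f1*C_Y^2
    A = f3*C_X^2 + f2*C_Z^2
    B = f3*rho_YX*C_Y*C_X + f2*rho_YZ*C_Y*C_Z
    """
    t = 2.0 * alpha - 1.0
    return V + 0.25 * t * t * A - t * B


def rel_mse_competitors(V, A, B):
    """First-order MSEs of the four chain-type competitors, same scaling."""
    return {
        "R_dc": V + A - 2.0 * B,      # chain ratio
        "P_dc": V + A + 2.0 * B,      # chain product
        "Re_dc": V + A / 4.0 - B,     # exponential chain ratio
        "Pe_dc": V + A / 4.0 + B,     # exponential chain product
    }


V, A, B = 0.07, 0.09, 0.05            # arbitrary illustrative values
others = rel_mse_competitors(V, A, B)
matches = {
    "Re_dc": rel_mse_proposed(1.0, V, A, B),
    "Pe_dc": rel_mse_proposed(0.0, V, A, B),
    "R_dc": rel_mse_proposed(1.5, V, A, B),
    "P_dc": rel_mse_proposed(-0.5, V, A, B),
}
```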

3.1. Optimum Value of α

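As noted above, $\alpha$ is determined so as to minimize the Mean Square Error (MSE) of the estimators $\bar{y}_{RP}^{d}$ and $\bar{y}_{RPe}^{dc}$. So, the optimum values of $\alpha$, for which $\mathrm{MSE}(\bar{y}_{RP}^{d})$ and $\mathrm{MSE}(\bar{y}_{RPe}^{dc})$ are minimum, are obtained by using the following conditions:

$$\frac{\partial}{\partial\alpha}\mathrm{MSE}(\bar{y}_{RP}^{d}) = 0, \qquad \frac{\partial}{\partial\alpha}\mathrm{MSE}(\bar{y}_{RPe}^{dc}) = 0. \tag{14}$$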

The optimum value of $\alpha$, which minimizes the Mean Square Error (MSE) of the estimator $\bar{y}_{RP}^{d}$, is given by

$$\alpha_{\text{opt}} = \frac{1}{2}\left[1 + \rho_{YX}\frac{C_Y}{C_X}\right]. \tag{15}$$

The optimum value of $\alpha$, which minimizes the Mean Square Error (MSE) of the estimator $\bar{y}_{RPe}^{dc}$, is given by

$$\alpha_{\text{opt}} = \frac{f_3\left(2\rho_{YX}C_Y C_X + C_X^2\right) + f_2\left(2\rho_{YZ}C_Y C_Z + C_Z^2\right)}{2\left(f_3 C_X^2 + f_2 C_Z^2\right)}. \tag{16}$$

Substituting the value of $\alpha$ from (15) in (13), we get the minimum MSE of $\bar{y}_{RP}^{d}$ as

$$\mathrm{MSE}(\bar{y}_{RP}^{d})_{\min} = \bar{Y}^2\left[f_1 C_Y^2 - f_3\rho_{YX}^2 C_Y^2\right]. \tag{17}$$

Substituting the value of $\alpha$ from (16) in (11), we get the minimum MSE of $\bar{y}_{RPe}^{dc}$ as

$$\mathrm{MSE}(\bar{y}_{RPe}^{dc})_{\min} = \bar{Y}^2\left[f_1 C_Y^2 - \frac{\left(f_3\rho_{YX}C_Y C_X + f_2\rho_{YZ}C_Y C_Z\right)^2}{f_3 C_X^2 + f_2 C_Z^2}\right]. \tag{18}$$
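The optimum weights and the minimum MSE can be checked numerically. The sketch below uses the Population I summary values reported in the empirical study (taking 15 as the first-phase and 10 as the second-phase sample size), evaluates the closed forms (15), (16), and (18), and compares the optimum for the proposed estimator with a brute-force grid search over the weight. The variable names and the grid are illustrative assumptions, and all MSEs are expressed relative to the squared population mean of Y.

```python
import math

# Population I summary values quoted in the empirical study.
N, n_first, n = 34, 15, 10
rho_yx, rho_yz = 0.7326, 0.6430
cy2, cx2, cz2 = 1.0248, 1.5175, 1.1492

f1, f2, f3 = 1 / n - 1 / N, 1 / n_first - 1 / N, 1 / n - 1 / n_first
cy, cx, cz = (math.sqrt(v) for v in (cy2, cx2, cz2))
V = f1 * cy2
A = f3 * cx2 + f2 * cz2
B = f3 * rho_yx * cy * cx + f2 * rho_yz * cy * cz


def rel_mse(alpha):
    """Equation (11) divided by the squared population mean of Y."""
    t = 2.0 * alpha - 1.0
    return V + 0.25 * t * t * A - t * B


alpha_opt_rp = 0.5 * (1.0 + rho_yx * cy / cx)     # equation (15)
alpha_opt_rpe = (2.0 * B + A) / (2.0 * A)         # equation (16)
min_rel_mse = V - B * B / A                       # equation (18), relative form

# Brute-force check on a grid of weights with step 0.001.
grid = [i / 1000.0 for i in range(-1000, 3001)]
alpha_grid = min(grid, key=rel_mse)
```

Because (11) is a parabola in the weight, the grid minimizer should agree with the closed form up to the grid spacing, and substituting the closed-form optimum should reproduce the relative minimum MSE.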

4. Efficiency Comparisons

It is well known that the Bias and variance of the usual unbiased estimator $\bar{y}$ for the population mean in SRSWOR are

$$B(\bar{y}) = 0, \tag{19}$$

$$V(\bar{y}) = f_1 S_Y^2 = f_1\bar{Y}^2 C_Y^2. \tag{20}$$

From (11), (13), and (20), we have the following:

(i) $\mathrm{MSE}(\bar{y}_{RPe}^{dc}) < V(\bar{y})$ if

$$\alpha < \frac{4\left(f_3\rho_{YX}C_Y C_X + f_2\rho_{YZ}C_Y C_Z\right) + f_3 C_X^2 + f_2 C_Z^2}{2\left(f_3 C_X^2 + f_2 C_Z^2\right)}, \tag{21}$$

(ii) $\mathrm{MSE}(\bar{y}_{RPe}^{dc}) < \mathrm{MSE}(\bar{y}_{R}^{dc})$ if

$$\alpha < \frac{4\left(f_3\rho_{YX}C_Y C_X + f_2\rho_{YZ}C_Y C_Z\right) - \left(f_3 C_X^2 + f_2 C_Z^2\right)}{2\left(f_3 C_X^2 + f_2 C_Z^2\right)}, \tag{22}$$

(iii) $\mathrm{MSE}(\bar{y}_{RPe}^{dc}) < \mathrm{MSE}(\bar{y}_{P}^{dc})$ if

$$\alpha < \frac{4\left(f_3\rho_{YX}C_Y C_X + f_2\rho_{YZ}C_Y C_Z\right) + 3\left(f_3 C_X^2 + f_2 C_Z^2\right)}{2\left(f_3 C_X^2 + f_2 C_Z^2\right)}, \tag{23}$$

(iv) $\mathrm{MSE}(\bar{y}_{RPe}^{dc}) < \mathrm{MSE}(\bar{y}_{Re}^{dc})$ if

$$\alpha < \frac{2\left(f_3\rho_{YX}C_Y C_X + f_2\rho_{YZ}C_Y C_Z\right)}{f_3 C_X^2 + f_2 C_Z^2}, \tag{24}$$

(v) $\mathrm{MSE}(\bar{y}_{RPe}^{dc}) < \mathrm{MSE}(\bar{y}_{Pe}^{dc})$ if

$$\alpha < \frac{2\left(f_3\rho_{YX}C_Y C_X + f_2\rho_{YZ}C_Y C_Z\right) + f_3 C_X^2 + f_2 C_Z^2}{f_3 C_X^2 + f_2 C_Z^2}, \tag{25}$$

(vi) $\mathrm{MSE}(\bar{y}_{RPe}^{dc}) < \mathrm{MSE}(\bar{y}_{RP}^{d})$ if

$$\alpha < \frac{4\left(f_3\rho_{YX}C_Y C_X - f_2\rho_{YZ}C_Y C_Z\right) + \left(3 f_3 C_X^2 - f_2 C_Z^2\right)}{2\left(3 f_3 C_X^2 - f_2 C_Z^2\right)}. \tag{26}$$

The range of α provides enough scope for choosing many estimators that are more efficient than the above considered estimators.
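One way to read the comparisons with the sample mean and the four chain-type estimators is sketched below. Because the first-order MSE of the proposed estimator is a parabola in the weight, it falls below a competitor's MSE exactly for weights lying between two values: the weight at which the two MSEs coincide (1/2 for the sample mean, 3/2 and -1/2 for the chain ratio and product estimators, 1 and 0 for their exponential versions) and its mirror image about the optimum weight. The mirror image is the bound quoted in the corresponding condition, and depending on the data it may be the upper or the lower end of the range. This reading is an editorial observation derived from the first-order MSE formulas, not a statement taken from the paper; `V`, `A`, and `B` are shorthand introduced here, with values rounded from Population I.

```python
def rel_mse(alpha, V, A, B):
    """First-order MSE of the proposed estimator over the squared mean of Y."""
    t = 2.0 * alpha - 1.0
    return V + 0.25 * t * t * A - t * B


def alpha_range(alpha_c, A, B):
    """Open interval of weights whose MSE is below the MSE at alpha_c."""
    alpha_opt = (2.0 * B + A) / (2.0 * A)
    mirror = 2.0 * alpha_opt - alpha_c
    return (min(alpha_c, mirror), max(alpha_c, mirror))


# Weight at which the proposed estimator's first-order MSE equals the competitor's.
MATCHING_ALPHA = {"y_bar": 0.5, "R_dc": 1.5, "P_dc": -0.5, "Re_dc": 1.0, "Pe_dc": 0.0}

# V = f1*C_Y^2, A = f3*C_X^2 + f2*C_Z^2, B = f3*rho_YX*C_Y*C_X + f2*rho_YZ*C_Y*C_Z
V, A, B = 0.0723, 0.0934, 0.0564      # rounded Population I aggregates
ranges = {name: alpha_range(a_c, A, B) for name, a_c in MATCHING_ALPHA.items()}
```

For these values the optimum weight lies inside every one of the five ranges, which is consistent with the remark that many choices of the weight yield an estimator more efficient than the competitors.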

5. Empirical Study

To examine the merits of the proposed estimator of $\bar{Y}$, we have considered the following natural population datasets.

Population I (source: Cochran [20]) is as follows:

$Y$: number of “placebo” children,

$X$: number of paralytic polio cases in the “placebo” group,

$Z$: number of paralytic polio cases in the “not inoculated” group,

$N = 34$, $n' = 15$, $n = 10$, $\bar{Y} = 4.92$, $\bar{X} = 2.59$, and $\bar{Z} = 2.91$,

$\rho_{YX} = 0.7326$, $\rho_{YZ} = 0.6430$, $\rho_{XZ} = 0.6837$, $C_Y^2 = 1.0248$, $C_X^2 = 1.5175$, and $C_Z^2 = 1.1492$.

Population II (source: Murthy [21]) is as follows:

$Y$: area under wheat in 1964,

$X$: area under wheat in 1963,

$Z$: cultivated area in 1961,

$N = 34$, $n' = 10$, $n = 7$, $\bar{Y} = 199.44$, $\bar{X} = 208.89$, and $\bar{Z} = 747.59$,

$\rho_{YX} = 0.9801$, $\rho_{YZ} = 0.9043$, $\rho_{XZ} = 0.9097$, $C_Y^2 = 0.5673$, $C_X^2 = 0.5191$, and $C_Z^2 = 0.3527$.

Here, we have computed

(i) the Absolute Relative Bias (ARB) of the different estimators of $\bar{Y}$ using the formula

$$\mathrm{ARB}(\cdot) = \left|\frac{\mathrm{Bias}(\cdot)}{\bar{Y}}\right|, \tag{27}$$

(ii) the Percentage Relative Efficiencies (PREs) of the different estimators of $\bar{Y}$ with respect to $\bar{y}$ using the formula

$$\mathrm{PRE}(\cdot, \bar{y}) = \frac{V(\bar{y})}{\mathrm{MSE}(\cdot)} \times 100. \tag{28}$$
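The two measures can be evaluated directly from the summary statistics of a population. The sketch below does so for Population I, using the first-order bias and MSE expressions of Section 3; the function name and dictionary keys are hypothetical. It assumes that the two estimators depending on a weight are evaluated at their optimum weights and that 15 and 10 are the first-phase and second-phase sample sizes; these assumptions are made here because they reproduce the tabulated Population I figures to the reported precision.

```python
import math


def arb_and_pre(N, n_first, n, rho_yx, rho_yz, cy2, cx2, cz2):
    """Return (ARB, PRE) dictionaries from first-order bias and MSE formulas."""
    f1, f2, f3 = 1 / n - 1 / N, 1 / n_first - 1 / N, 1 / n - 1 / n_first
    cy, cx, cz = math.sqrt(cy2), math.sqrt(cx2), math.sqrt(cz2)
    V = f1 * cy2                                     # relative variance of the sample mean
    A = f3 * cx2 + f2 * cz2                          # combined variance term
    B = f3 * rho_yx * cy * cx + f2 * rho_yz * cy * cz    # combined covariance term
    a_rp = 0.5 * (1 + rho_yx * cy / cx)              # optimum weight, ratio-product
    a_rpe = (2 * B + A) / (2 * A)                    # optimum weight, proposed
    arb = {
        "y": 0.0,
        "R_dc": abs(A - B),
        "P_dc": abs(B),
        "Re_dc": abs(3 * A / 8 - B / 2),
        "Pe_dc": abs(-A / 8 + B / 2),
        "RP_d": abs(a_rp * f3 * cx2 - (2 * a_rp - 1) * f3 * rho_yx * cy * cx),
        "RPe_dc": abs((4 * a_rpe - 1) / 8 * A - (2 * a_rpe - 1) / 2 * B),
    }
    rel_mse = {
        "y": V,
        "R_dc": V + A - 2 * B,
        "Re_dc": V + A / 4 - B,
        "RP_d": V - f3 * rho_yx ** 2 * cy2,          # minimum MSE at a_rp
        "RPe_dc": V - B * B / A,                     # minimum MSE at a_rpe
    }
    pre = {name: 100.0 * V / m for name, m in rel_mse.items()}
    return arb, pre


# Population I summary values.
arb_1, pre_1 = arb_and_pre(34, 15, 10, 0.7326, 0.6430, 1.0248, 1.5175, 1.1492)
```

The same function can be applied to Population II by substituting its summary values; small differences in the last reported digit may arise from rounding of the published inputs.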

6. Conclusion

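It is observed from Table 1 that,

(i) for population I,

$$\mathrm{ARB}(\bar{y}) < \mathrm{ARB}(\bar{y}_{RPe}^{dc}) < \mathrm{ARB}(\bar{y}_{Re}^{dc}) < \mathrm{ARB}(\bar{y}_{Pe}^{dc}) < \mathrm{ARB}(\bar{y}_{RP}^{d}) < \mathrm{ARB}(\bar{y}_{R}^{dc}) < \mathrm{ARB}(\bar{y}_{P}^{dc}), \tag{29}$$

(ii) for population II,

$$\mathrm{ARB}(\bar{y}) < \mathrm{ARB}(\bar{y}_{RP}^{d}) < \mathrm{ARB}(\bar{y}_{R}^{dc}) < \mathrm{ARB}(\bar{y}_{Re}^{dc}) < \mathrm{ARB}(\bar{y}_{Pe}^{dc}) < \mathrm{ARB}(\bar{y}_{RPe}^{dc}) < \mathrm{ARB}(\bar{y}_{P}^{dc}). \tag{30}$$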

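Table 1: Absolute Relative Bias (ARB) of different estimators of $\bar{Y}$.

| Estimator | Population I | Population II |
|---|---|---|
| $\bar{y}$ | 0.0000 | 0.0000 |
| $\bar{y}_{R}^{dc}$ | 0.0369 | 0.0042 |
| $\bar{y}_{P}^{dc}$ | 0.0564 | 0.0513 |
| $\bar{y}_{Re}^{dc}$ | 0.0068 | 0.0079 |
| $\bar{y}_{Pe}^{dc}$ | 0.0165 | 0.0198 |
| $\bar{y}_{RP}^{d}$ | 0.0222 | 0.0008 |
| $\bar{y}_{RPe}^{dc}$ | 0.0058 | 0.0243 |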

From Table 2, we see that the Percentage Relative Efficiency (PRE) of the proposed estimator $\bar{y}_{RPe}^{dc}$, for populations I and II, is higher than that of all the other existing estimators, that is, the usual unbiased estimator $\bar{y}$, the chain type ratio estimator $\bar{y}_{R}^{dc}$, the chain type product estimator $\bar{y}_{P}^{dc}$, the exponential chain type ratio estimator $\bar{y}_{Re}^{dc}$, the exponential chain type product estimator $\bar{y}_{Pe}^{dc}$, and the ratio-product type estimator $\bar{y}_{RP}^{d}$.

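Table 2: Percentage Relative Efficiencies (PREs) of different estimators of $\bar{Y}$ with respect to $\bar{y}$.

| Estimator | Population I | Population II |
|---|---|---|
| $\bar{y}$ | 100 | 100 |
| $\bar{y}_{R}^{dc}$ | 136.91 | 730.81 |
| $\bar{y}_{P}^{dc}$ | * | * |
| $\bar{y}_{Re}^{dc}$ | 184.36 | 259.55 |
| $\bar{y}_{Pe}^{dc}$ | * | * |
| $\bar{y}_{RP}^{d}$ | 133.95 | 156.96 |
| $\bar{y}_{RPe}^{dc}$ | 189.27 | 763.30 |

* Data is not applicable.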

Finally, from Tables 1 and 2, we conclude that the proposed estimator $\bar{y}_{RPe}^{dc}$ (based on two auxiliary variables $X$ and $Z$) is a more appropriate estimator in comparison to other existing estimators as it has appreciable efficiency as well as lower relative bias.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

The authors are grateful to the editor Professor Zhidong Bai and the learned referee for their comments leading to the improvement of the paper.

References

[1] W. G. Cochran, "The estimation of the yields of the cereal experiments by sampling for the ratio of grain to total produce," The Journal of Agricultural Science, vol. 30, pp. 262-275, 1940. doi:10.1017/S0021859600048012.

[2] D. S. Robson, "Applications of multivariate polykays to the theory of unbiased ratio-type estimation," Journal of the American Statistical Association, vol. 52, pp. 511-522, 1957. doi:10.1080/01621459.1957.10501407. MR0092323, ZBL0078.33504.

[3] M. N. Murthy, "Product method of estimation," The Indian Journal of Statistics A, vol. 26, pp. 69-74, 1964. MR0193706, ZBL0138.13002.

[4] J. Neyman, "Contribution to the theory of sampling human populations," Journal of American Statistical Association, vol. 33, pp. 101-116, 1938. doi:10.1080/01621459.1938.10503378. ZBL0018.22603.

[5] B. V. Sukhatme, "Some ratio-type estimators in two-phase sampling," Journal of the American Statistical Association, vol. 57, pp. 628-632, 1962. doi:10.1080/01621459.1962.10500551. MR0145632, ZBL0106.34203.

[6] M. A. Hidiroglou and C. E. Sarndal, "Use of auxiliary information for two phase sampling," Survey Methodology, vol. 24, pp. 11-20, 1998.

[7] W. A. Fuller, "Two-phase sampling," in Proceedings of the Annual Meeting of the Survey Methods Section of the Statistical Society of Canada, pp. 23-30, 2000.

[8] M. A. Hidiroglou, "Double sampling," Survey Methodology, vol. 27, pp. 143-154, 2000.

[9] H. P. Singh and G. K. Vishwakarma, "Modified exponential ratio and product estimators for finite population mean in double sampling," Austrian Journal of Statistics, vol. 36, no. 3, pp. 217-225, 2007.

[10] L. N. Sahoo, G. Mishra, and S. R. Nayak, "On two different classes of estimators in two-phase sampling using multi-auxiliary variables," Model Assisted Statistics and Applications, vol. 5, no. 1, pp. 61-68, 2010. doi:10.3233/MAS-2010-0143. 2-s2.0-77649125372.

[11] L. Chand, Some ratio type estimators based on two or more auxiliary variables [Ph.D. dissertation], Iowa State University, Ames, Iowa, USA, 1975.

[12] B. Kiregyera, "A chain ratio-type estimator in finite population double sampling using two auxiliary variables," Metrika, vol. 27, no. 4, pp. 217-223, 1980. doi:10.1007/BF01893599. MR598730, ZBL0445.62022.

[13] G. N. Singh and L. N. Upadhyaya, "A class of modified chain-type estimators using two auxiliary variables in two phase sampling," Metron, vol. 53, no. 3-4, pp. 117-125, 1995. MR1409762, ZBL0859.62017.

[14] B. Prasad, R. S. Singh, and H. P. Singh, "Some chain ratio-type estimators for ratio of two population means using two auxiliary characters in two phase sampling," Metron, vol. 54, no. 1-2, pp. 95-113, 1996. MR1450036, ZBL0018.22603.

[15] S. Singh, H. P. Singh, and L. N. Upadhyaya, "Chain ratio and regression type estimators for median estimation in survey sampling," Statistical Papers, vol. 48, no. 1, pp. 23-46, 2007. doi:10.1007/s00362-006-0314-y. MR2288170.

[16] B. K. Singh and S. Choudhury, "Exponential chain ratio and product type estimators for finite population mean under double sampling scheme," Global Journal of Science Frontier Research, vol. 12, no. 6, 2012.

[17] G. K. Vishwakarma and R. K. Gangele, "A class of chain ratio-type exponential estimators in double sampling using two auxiliary variates," Applied Mathematics and Computation, vol. 227, pp. 171-175, 2014. doi:10.1016/j.amc.2013.11.027. MR3146307.

[18] H. P. Singh and M. Ruiz Espejo, "Double sampling ratio-product estimator of a finite population mean in sample surveys," Journal of Applied Statistics, vol. 34, no. 1-2, pp. 71-85, 2007. doi:10.1080/02664760600994562. MR2364242, ZBL1119.62310.

[19] S. K. Srivastava, "A generalized estimator for the mean of a finite population using multi-auxiliary information," Journal of American Statistical Association, vol. 66, pp. 404-407, 1971. doi:10.1080/01621459.1971.10482277. ZBL0226.62055.

[20] W. G. Cochran, Sampling Techniques, John Wiley & Sons, New York, NY, USA, 1977. MR0474575.

[21] M. N. Murthy, Sampling Theory and Methods, Statistical Publishing Society, Calcutta, India, 1967. MR0474578.