A Study on the Chain Ratio-Type Estimator of Finite Population Variance

Yunusa Olufadi (1) (http://orcid.org/0000-0002-7187-7585) and Cem Kadilar (2)

(1) Department of Statistics and Mathematical Sciences, Kwara State University, PMB 1530, Malete, Ilorin, Nigeria (kwasu.edu.ng)
(2) Department of Statistics, Hacettepe University, Beytepe, 06800 Ankara, Turkey (hacettepe.edu.tr)

Research Article. Journal of Probability and Statistics, Volume 2014, Article ID 723982, Hindawi Publishing Corporation. ISSN 1687-952X, 1687-9538. DOI: 10.1155/2014/723982

Received 12 August 2013; Revised 15 January 2014; Accepted 15 January 2014; Published 24 February 2014

Academic Editor: Shein-chung Chow

Copyright © 2014 Yunusa Olufadi and Cem Kadilar. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We suggest an estimator that uses two auxiliary variables for the estimation of the unknown population variance. The bias and the mean square error of the proposed estimator are obtained to the first order of approximation. In addition, the problem is extended to the two-phase sampling scheme. After theoretical comparisons, a numerical comparison is carried out as an illustration to examine the performance of the suggested estimator against several existing estimators.

1. Introduction

Variations are present everywhere in our daily life. It is the law of nature that no two things or individuals are exactly alike. For instance, a physician needs a full understanding of variations in the degree of human blood pressure, body temperature, and pulse rate for adequate prescription. A manufacturer needs constant knowledge of the level of variations in people’s reaction to his product to be able to know whether to reduce or increase his price or improve the quality of his product. An agriculturist needs an adequate understanding of the variations in climatic factors especially from place to place (or time to time) to be able to plan on when, how, and where to plant his crop.

It is well known that the use of auxiliary information in sample survey designs results in efficient estimators of population parameters, such as variance, under some realistic conditions. For example, when information is available on the auxiliary variable that is positively correlated with the study variable, the ratio estimator is a suitable estimator for the estimation of the population variance.

Let $P$ be a finite population consisting of $N$ units, $P_1, P_2, \ldots, P_N$. The units of this finite population are identifiable in the sense that they are uniquely labeled from 1 to $N$ and the label on each unit is known. Let $y$ be the character under study taking the value $y_i$ on the unit $P_i$ ($i = 1, 2, \ldots, N$), and assume that a sample of size $n$ is drawn by simple random sampling without replacement (SRSWOR).
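The sampling setup described above can be sketched in a few lines of Python. This is only an illustration: the population values are hypothetical, and the sample variance is assumed to use the usual divisor $n - 1$.

```python
import random


def srswor_sample(population, n, seed=None):
    """Draw n distinct units by simple random sampling without replacement."""
    rng = random.Random(seed)
    return rng.sample(population, n)


def sample_variance(values):
    """Sample variance with divisor (len(values) - 1); an assumed convention."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)


# Hypothetical population of N = 8 values of the study variable y.
y_population = [12.0, 15.5, 9.8, 20.1, 17.3, 11.4, 14.9, 18.6]

# Sample of size n = 4 and the resulting sample variance s_y^2.
sample = srswor_sample(y_population, n=4, seed=1)
s_y2 = sample_variance(sample)
print(sample, s_y2)
```

Fixing the seed only makes the draw reproducible; any seed gives a valid SRSWOR sample.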

Suppose that, in a survey problem, we are interested in estimating the population variance, $S_y^2$. Isaki [1] presented the ratio estimator for the population variance using auxiliary information. The problem of estimating the population variance using information on a single auxiliary variable has also been discussed by various authors, including Prasad and Singh [2, 3], Biradar and Singh [4], Rueda Garcia and Arcos Cebrian [5], Arcos et al. [6], Kadilar and Cingi [7], and Singh et al. [8].

The mean square error (MSE) of the classical estimator of the population variance $S_y^2$, which we denote as $t_0$, is $V(t_0) = S_y^4 A_0$. Quite often, information on many auxiliary variables is available in the survey and can be utilized to increase the precision of the estimate. The ratio estimator of the population variance for a single auxiliary variable suggested by Isaki [1], denoted as $t_1$, and its two-phase sampling (TPS) version, denoted as $t_1^*$, are as follows:

$$
\begin{aligned}
t_1 &= s_y^2 \frac{S_{x_1}^2}{s_{x_1}^2}, & \mathrm{MSE}(t_1) &= S_y^4 \left( A_0 + A_1 - 2A_3 \right), \\
t_1^* &= s_y^2 \frac{s_{x_1}^{*2}}{s_{x_1}^2}, & \mathrm{MSE}(t_1^*) &= \mathrm{MSE}(t_1) - S_y^4 \left( A_1^* - 2A_3^* \right).
\end{aligned} \tag{1}
$$

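Following Olkin [9], Isaki [1] also presented the ratio estimator of the variance using two auxiliary variables, together with its TPS version, as follows:

$$
\begin{aligned}
t_2 &= W_1 \frac{s_y^2}{s_{x_1}^2} S_{x_1}^2 + W_2 \frac{s_y^2}{s_{x_2}^2} S_{x_2}^2, & \mathrm{MSE}(t_2) &= S_y^4 \left( C_1 + W_1^2 C_2 - 2 W_1 C_3 \right), \\
t_2^* &= M_1 \frac{s_y^2}{s_{x_1}^2} s_{x_1}^{*2} + M_2 \frac{s_y^2}{s_{x_2}^2} s_{x_2}^{*2} \quad (\text{TPS approach of } t_2), & \mathrm{MSE}(t_2^*) &= S_y^4 \left( D_1 + M_1^2 D_2 - 2 M_1 D_3 \right),
\end{aligned} \tag{2}
$$

where $W_i$ and $M_i$, for $i = 1, 2$, are weights chosen to minimize the MSE of $t_2$ and $t_2^*$, subject to $\sum_{i=1}^{2} W_i = 1$ and $\sum_{i=1}^{2} M_i = 1$, and where

$$
\begin{aligned}
s_{x_1}^2 &= \frac{1}{n} \sum_{i=1}^{n} (x_{1i} - \bar{X}_1)^2, & s_{x_2}^2 &= \frac{1}{n} \sum_{i=1}^{n} (x_{2i} - \bar{X}_2)^2, \\
s_{x_1}^{*2} &= \frac{1}{n'} \sum_{i=1}^{n'} (x_{1i} - \bar{x}_1')^2, & s_{x_2}^{*2} &= \frac{1}{n'} \sum_{i=1}^{n'} (x_{2i} - \bar{x}_2')^2, \\
\bar{x}_1' &= \frac{1}{n'} \sum_{i=1}^{n'} x_{1i}, & \bar{x}_2' &= \frac{1}{n'} \sum_{i=1}^{n'} x_{2i}, \\
C_1 &= A_0 + A_2 - 2A_4, & C_1^* &= A_2^* - 2A_4^*, & D_1 &= C_1 - C_1^*, \\
C_2 &= A_1 + A_2 - 2A_5, & C_2^* &= A_1^* + A_2^* - 2A_5^*, & D_2 &= C_2 - C_2^*, \\
C_3 &= A_2 + A_3 - A_4 - A_5, & C_3^* &= A_2^* + A_3^* - A_4^* - A_5^*, & D_3 &= C_3 - C_3^*, \\
A_0 &= \frac{1}{n} (\lambda_{400} - 1), & A_1 &= \frac{1}{n} (\lambda_{040} - 1), & A_2 &= \frac{1}{n} (\lambda_{004} - 1), \\
A_3 &= \frac{1}{n} (\lambda_{220} - 1), & A_4 &= \frac{1}{n} (\lambda_{202} - 1), & A_5 &= \frac{1}{n} (\lambda_{022} - 1), \\
A_1^* &= \frac{1}{n'} (\lambda_{040} - 1), & A_2^* &= \frac{1}{n'} (\lambda_{004} - 1), & A_3^* &= \frac{1}{n'} (\lambda_{220} - 1), \\
A_4^* &= \frac{1}{n'} (\lambda_{202} - 1), & A_5^* &= \frac{1}{n'} (\lambda_{022} - 1), \\
\lambda_{abc} &= \frac{\mu_{abc}}{\mu_{200}^{a/2} \, \mu_{020}^{b/2} \, \mu_{002}^{c/2}}, & \mu_{abc} &= \frac{1}{N-1} \sum_{i=1}^{N} (y_i - \bar{Y})^a (x_{1i} - \bar{X}_1)^b (x_{2i} - \bar{X}_2)^c .
\end{aligned} \tag{3}
$$

Here $a$, $b$, and $c$ are nonnegative integers, $n$ is the size of the sample on which $y$ is observed, and $n'$ is the size of the first-phase sample in TPS.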
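As a rough illustration of the moment definitions in (3), the following Python sketch computes $\mu_{abc}$, $\lambda_{abc}$, and the coefficients $A_0, \ldots, A_5$ for a small population. The data and the function names (`central_moment`, `lam`, `a_coefficients`) are hypothetical and serve only to make the definitions concrete.

```python
def central_moment(y, x1, x2, a, b, c):
    """mu_abc: mixed central moment with divisor N - 1."""
    N = len(y)
    y_bar, x1_bar, x2_bar = sum(y) / N, sum(x1) / N, sum(x2) / N
    total = sum(
        (yi - y_bar) ** a * (u - x1_bar) ** b * (v - x2_bar) ** c
        for yi, u, v in zip(y, x1, x2)
    )
    return total / (N - 1)


def lam(y, x1, x2, a, b, c):
    """lambda_abc: mu_abc scaled by powers of the three variances."""
    scale = (
        central_moment(y, x1, x2, 2, 0, 0) ** (a / 2)
        * central_moment(y, x1, x2, 0, 2, 0) ** (b / 2)
        * central_moment(y, x1, x2, 0, 0, 2) ** (c / 2)
    )
    return central_moment(y, x1, x2, a, b, c) / scale


def a_coefficients(y, x1, x2, n):
    """A_0..A_5 for sample size n; use the first-phase size for the starred ones."""
    orders = {
        "A0": (4, 0, 0), "A1": (0, 4, 0), "A2": (0, 0, 4),
        "A3": (2, 2, 0), "A4": (2, 0, 2), "A5": (0, 2, 2),
    }
    return {name: (lam(y, x1, x2, *abc) - 1) / n for name, abc in orders.items()}


# Hypothetical population values (not from the paper).
y = [12.0, 15.5, 9.8, 20.1, 17.3, 11.4, 14.9, 18.6]
x1 = [3.1, 4.0, 2.5, 5.2, 4.4, 2.9, 3.8, 4.9]
x2 = [7.0, 8.2, 6.1, 9.9, 9.0, 6.5, 7.7, 9.4]

A = a_coefficients(y, x1, x2, n=4)
print(A)
```

Calling `a_coefficients` with the first-phase sample size in place of `n` would give the starred coefficients $A_1^*, \ldots, A_5^*$ (the `"A0"` entry is then unused).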

Several authors (Srivastava et al. [10], Upadhyaya et al. [11], and Singh et al. [12]) adopted the TPS procedure proposed by Chand [13] and suggested chain ratio-type estimators for estimating the population mean $\bar{Y}$ of $y$. In the same vein, Gupta et al. [14] and Singh et al. [8] proposed the following classes of estimators under the assumption that the population variance of the first auxiliary variable, $S_{x_1}^2$, is not known, but the population variance of another auxiliary variable $x_2$, closely related to $x_1$, is available. The estimators suggested by Gupta et al. [14] and Singh et al. [8], together with their minimum MSEs, are, respectively, given by

$$
\begin{aligned}
t_3 &= s_y^2 \left( \frac{s_{x_1}^2}{s_{x_1}^{*2}} \right)^{I_1} \left( \frac{s_{x_2}^{*2}}{S_{x_2}^2} \right)^{I_2}, \\
\mathrm{MSE}_{\min}(t_3) &= S_y^4 \left[ A_0 - \phi \left( \frac{A_3^2}{A_1} \right) - \left( \frac{A_4^2}{n' A_2} \right) \right], \\
t_4 &= s_y^2 \left( \frac{s_{x_1}^2}{s_{x_1}^{*2}} \right)^{J_1} \left( \frac{s_{x_2}^{*2}}{S_{x_2}^2} \right)^{J_2} \left( \frac{s_{x_2}^2}{S_{x_2}^2} \right)^{J_3}, \\
\mathrm{MSE}_{\min}(t_4) &= S_y^4 A_0 \left[ 1 - \delta \gamma_{0.12}^{*2} - \theta \rho^{*2} \right],
\end{aligned} \tag{4}
$$

where $I_1$, $I_2$, and $J_i$, for $i = 1, 2, 3$, are constants chosen to minimize the MSE of $t_3$ and $t_4$; $\phi = (1/n - 1/n')$; $\theta = n/n'$; $\delta = (n' - n)/n'$; $\gamma_{0.12}^{*2} = (A_2 A_3^2 - 2 A_3 A_4 A_5 + A_1 A_4^2) / [A_0 (A_1 A_2 - A_5^2)]$; and $\rho^{*2} = A_4^2 / (A_0 A_2)$.

In most studies, several variables are considered simultaneously either to explain or to estimate (predict) the study variable. In most cases, information on several auxiliary variables closely related to the study variable may be easily obtained on all units in the population. For example, while conducting an educational survey, the investigator may be interested in studying characteristics such as age, gender, hours spent studying per day, sitting position, parents' educational level, parents' income, relationship with lecturers, and access to facilities (e.g., library, internet, laboratory), among others. With the main aim of suggesting a more efficient estimator, we propose in this paper, under SRSWOR, a chain ratio-type estimator for estimating the population variance when information on two auxiliary variables is available. In addition, the problem is extended to the case of TPS.

2. The Suggested Estimator

Following Abu-Dayyeh et al. [15], we define an estimator of the population variance, $S_y^2$, as follows:

$$
t = s_y^2 \left( \frac{S_{x_1}^2}{s_{x_1}^2} \right)^{\alpha_1} \left( \frac{S_{x_2}^2}{s_{x_2}^2} \right)^{\alpha_2}, \tag{5}
$$

where $\alpha_1$ and $\alpha_2$ are real constants to be determined such that the MSE of $t$ is minimum.
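A minimal sketch of how (5) might be evaluated once the sample and population variances are in hand. The function name and the numeric inputs are hypothetical; the two special cases simply show that $\alpha_1 = \alpha_2 = 0$ gives back $s_y^2$ and that $\alpha_1 = 1$, $\alpha_2 = 0$ gives the single-variable ratio estimator $t_1$ of (1).

```python
def chain_ratio_variance(s_y2, s_x1_2, s_x2_2, S_x1_2, S_x2_2, alpha1, alpha2):
    """Estimator (5): s_y^2 * (S_x1^2 / s_x1^2)^alpha1 * (S_x2^2 / s_x2^2)^alpha2."""
    return s_y2 * (S_x1_2 / s_x1_2) ** alpha1 * (S_x2_2 / s_x2_2) ** alpha2


# Hypothetical sample variances (lowercase s) and known population variances (S).
s_y2, s_x1_2, s_x2_2 = 4.0, 2.0, 1.0
S_x1_2, S_x2_2 = 2.5, 0.8

t_plain = chain_ratio_variance(s_y2, s_x1_2, s_x2_2, S_x1_2, S_x2_2, 0.0, 0.0)
t_isaki = chain_ratio_variance(s_y2, s_x1_2, s_x2_2, S_x1_2, S_x2_2, 1.0, 0.0)
t_both = chain_ratio_variance(s_y2, s_x1_2, s_x2_2, S_x1_2, S_x2_2, 1.0, 1.0)
print(t_plain, t_isaki, t_both)
```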

To determine the bias and MSE of $t$, we define

$$
s_y^2 = S_y^2 (1 + k_0), \qquad s_{x_1}^2 = S_{x_1}^2 (1 + k_1), \qquad s_{x_2}^2 = S_{x_2}^2 (1 + k_2), \tag{6}
$$

such that

$$
\begin{aligned}
E(k_0) &= E(k_1) = E(k_2) = 0, \\
E(k_0^2) &= A_0, \quad E(k_1^2) = A_1, \quad E(k_2^2) = A_2, \\
E(k_0 k_1) &= A_3, \quad E(k_0 k_2) = A_4, \quad E(k_1 k_2) = A_5 .
\end{aligned} \tag{7}
$$

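Now, expressing $t$ in terms of the $k$'s, we have

$$
\begin{aligned}
t &= S_y^2 (1 + k_0) (1 + k_1)^{-\alpha_1} (1 + k_2)^{-\alpha_2} \\
&= S_y^2 (1 + k_0) \left( 1 - \alpha_1 k_1 + \frac{\alpha_1 (\alpha_1 + 1)}{2} k_1^2 \right) \left( 1 - \alpha_2 k_2 + \frac{\alpha_2 (\alpha_2 + 1)}{2} k_2^2 \right).
\end{aligned} \tag{8}
$$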

We assume that $|k_1| < 1$ and $|k_2| < 1$ so that $(1 + k_1)^{-\alpha_1}$ and $(1 + k_2)^{-\alpha_2}$ are expandable in powers of the $k$'s. By expanding the right-hand side of (8), multiplying out, and neglecting terms involving powers of the $k$'s greater than two, we have

$$
t - S_y^2 = S_y^2 \left( k_0 - \alpha_1 k_1 - \alpha_2 k_2 + \alpha_1 \alpha_2 k_1 k_2 - \alpha_1 k_0 k_1 - \alpha_2 k_0 k_2 + \frac{\alpha_1 (\alpha_1 + 1)}{2} k_1^2 + \frac{\alpha_2 (\alpha_2 + 1)}{2} k_2^2 \right). \tag{9}
$$
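For reference, the second-order factors in (8) come from the binomial series. A short reminder, with $k$ standing for either $k_1$ or $k_2$ and $\alpha$ for the matching exponent:

```latex
(1 + k)^{-\alpha}
  = 1 - \alpha k
      + \frac{\alpha(\alpha + 1)}{2}\, k^{2}
      - \frac{\alpha(\alpha + 1)(\alpha + 2)}{6}\, k^{3}
      + \cdots ,
  \qquad |k| < 1 .
```

Dropping the cubic and higher terms gives the truncation used in the derivation.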

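Taking expectations on both sides of (9) and using (7), we get the bias of $t$, to the first degree of approximation, as

$$
B(t) = S_y^2 \left( \frac{\alpha_1 (\alpha_1 + 1)}{2} A_1 + \frac{\alpha_2 (\alpha_2 + 1)}{2} A_2 + \alpha_1 \alpha_2 A_5 - \alpha_1 A_3 - \alpha_2 A_4 \right). \tag{10}
$$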

Squaring both sides of (9) and neglecting terms involving powers of the $k$'s greater than two, we have

$$
(t - S_y^2)^2 = S_y^4 \left( k_0^2 + 2 \alpha_1 \alpha_2 k_1 k_2 - 2 \alpha_1 k_0 k_1 - 2 \alpha_2 k_0 k_2 + \alpha_1^2 k_1^2 + \alpha_2^2 k_2^2 \right). \tag{11}
$$

Taking expectations on both sides of (11), we get the MSE of $t$, to the first order of approximation, as

$$
\mathrm{MSE}(t) = S_y^4 \left( A_0 + \alpha_1^2 A_1 + \alpha_2^2 A_2 - 2 \alpha_1 A_3 - 2 \alpha_2 A_4 + 2 \alpha_1 \alpha_2 A_5 \right). \tag{12}
$$

The optimal values of $\alpha_1$ and $\alpha_2$ are obtained by differentiating (12) with respect to $\alpha_1$ and $\alpha_2$ and setting the derivatives equal to zero. After a little algebraic simplification, we have

$$
\alpha_1^* = \frac{A_2 A_3 - A_4 A_5}{A_1 A_2 - A_5^2}, \qquad \alpha_2^* = \frac{A_1 A_4 - A_3 A_5}{A_1 A_2 - A_5^2}. \tag{13}
$$

The minimum MSE of $t$ is obtained by substituting these optimal values of $\alpha_1$ and $\alpha_2$ in (12).
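The optimization in (12) and (13) can be checked numerically. The sketch below is illustrative only: it takes the $\lambda$ values and $n = 10$ reported in Section 5 as inputs, and the helper names are hypothetical. The MSE is expressed in units of $S_y^4$.

```python
# A_0..A_5 from the lambda values and n = 10 reported in Section 5.
n = 10
A0 = (2.2667 - 1) / n
A1 = (3.6500 - 1) / n
A2 = (2.8664 - 1) / n
A3 = (2.3377 - 1) / n
A4 = (2.2208 - 1) / n
A5 = (3.1400 - 1) / n


def mse_over_sy4(alpha1, alpha2):
    """Right-hand side of (12) divided by S_y^4."""
    return (A0 + alpha1 ** 2 * A1 + alpha2 ** 2 * A2
            - 2 * alpha1 * A3 - 2 * alpha2 * A4 + 2 * alpha1 * alpha2 * A5)


# Closed-form optimum (13).
det = A1 * A2 - A5 ** 2
alpha1_opt = (A2 * A3 - A4 * A5) / det
alpha2_opt = (A1 * A4 - A3 * A5) / det
mse_min = mse_over_sy4(alpha1_opt, alpha2_opt)

# First-order conditions: both partial derivatives of (12) should vanish.
grad1 = 2 * (alpha1_opt * A1 + alpha2_opt * A5 - A3)
grad2 = 2 * (alpha2_opt * A2 + alpha1_opt * A5 - A4)

print(alpha1_opt, alpha2_opt, mse_min)
```

With these inputs the minimum comes out near 0.045, in line with the entry for the suggested estimator in Table 1 up to rounding.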

3. Suggested Estimator in TPS

In certain practical situations, the population variances of the auxiliary variables, $S_{x_1}^2$ and $S_{x_2}^2$, are also unknown; in that case the technique of TPS, sometimes referred to as double sampling, is used. This scheme requires the collection of information on $x_1$ and $x_2$ in the first-phase sample $s'$ of size $n'$ ($n' < N$) and on $y$ in the second-phase sample $s$ of size $n$ ($n < n'$). The estimator $t^*$ in TPS takes the following form:

$$
t^* = s_y^2 \left( \frac{s_{x_1}^{*2}}{s_{x_1}^2} \right)^{\alpha_3} \left( \frac{s_{x_2}^{*2}}{s_{x_2}^2} \right)^{\alpha_4}. \tag{14}
$$

To obtain the bias and MSE of $t^*$, we write

$$
\begin{aligned}
s_y^2 &= S_y^2 (1 + k_0), \\
s_{x_1}^2 &= S_{x_1}^2 (1 + k_1), & s_{x_1}^{*2} &= S_{x_1}^2 (1 + k_1^*), \\
s_{x_2}^2 &= S_{x_2}^2 (1 + k_2), & s_{x_2}^{*2} &= S_{x_2}^2 (1 + k_2^*).
\end{aligned} \tag{15}
$$

Note that

$$
\begin{aligned}
E(k_1^*) &= E(k_2^*) = 0, \\
E(k_1^{*2}) &= A_1^*, \quad E(k_2^{*2}) = A_2^*, \quad E(k_1 k_1^*) = A_1^*, \quad E(k_2 k_2^*) = A_2^*, \\
E(k_0 k_1^*) &= A_3^*, \quad E(k_0 k_2^*) = \frac{1}{n'} (\lambda_{202} - 1) = A_4^*, \\
E(k_1 k_2^*) &= E(k_2 k_1^*) = E(k_1^* k_2^*) = A_5^* .
\end{aligned} \tag{16}
$$

Expressing $t^*$ in terms of the $k$'s and following the procedure explained in Section 2, we get the bias and MSE of the estimator $t^*$, respectively, as

$$
\begin{aligned}
B(t^*) &= S_y^2 \left( \frac{\alpha_3 (\alpha_3 + 1)}{2} E_3 + \frac{\alpha_4 (\alpha_4 + 1)}{2} E_4 + \alpha_3 \alpha_4 E_5 - \alpha_3 E_1 - \alpha_4 E_2 \right), \\
\mathrm{MSE}(t^*) &= \mathrm{MSE}(t) - S_y^4 \left( \alpha_3^2 A_1^* + \alpha_4^2 A_2^* + 2 \alpha_3 \alpha_4 A_5^* - 2 \alpha_3 A_3^* - 2 \alpha_4 A_4^* \right),
\end{aligned} \tag{17}
$$

where $\mathrm{MSE}(t)$ is given by (12) with $\alpha_1$ and $\alpha_2$ replaced by $\alpha_3$ and $\alpha_4$, and

$$
E_1 = A_3 - A_3^*, \quad E_2 = A_4 - A_4^*, \quad E_3 = A_1 - A_1^*, \quad E_4 = A_2 - A_2^*, \quad E_5 = A_5 - A_5^* . \tag{18}
$$

Minimization of the MSE in (17) with respect to $\alpha_3$ and $\alpha_4$ yields their optimum values as

$$
\alpha_3^* = \frac{E_1 E_4 - E_2 E_5}{E_3 E_4 - E_5^2}, \qquad \alpha_4^* = \frac{E_2 E_3 - E_1 E_5}{E_3 E_4 - E_5^2}. \tag{19}
$$

Substitution of $\alpha_3^*$ and $\alpha_4^*$ in (17) gives the minimum value of the MSE of $t^*$.
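A hedged numerical sketch of the TPS optimum follows. It assumes the first-order MSE written in terms of the differences $E_i$ of (18), namely $S_y^4 (A_0 + \alpha_3^2 E_3 + \alpha_4^2 E_4 - 2 \alpha_3 E_1 - 2 \alpha_4 E_2 + 2 \alpha_3 \alpha_4 E_5)$, which is the quadratic whose minimization leads to (19). The inputs are the $\lambda$ values of Section 5 with $n = 10$ and a first-phase size of 25; all variable names are hypothetical.

```python
n, n_first = 10, 25  # second-phase and first-phase sample sizes

# (lambda - 1) in the order 400, 040, 004, 220, 202, 022.
lam_minus_1 = [1.2667, 2.6500, 1.8664, 1.3377, 1.2208, 2.1400]
A = [v / n for v in lam_minus_1]             # A_0..A_5
A_star = [v / n_first for v in lam_minus_1]  # starred versions (index 0 unused)

# Differences defined in (18).
E1 = A[3] - A_star[3]
E2 = A[4] - A_star[4]
E3 = A[1] - A_star[1]
E4 = A[2] - A_star[2]
E5 = A[5] - A_star[5]


def mse_tps_over_sy4(alpha3, alpha4):
    """Assumed first-order MSE of t* divided by S_y^4, written with E_1..E_5."""
    return (A[0] + alpha3 ** 2 * E3 + alpha4 ** 2 * E4
            - 2 * alpha3 * E1 - 2 * alpha4 * E2 + 2 * alpha3 * alpha4 * E5)


# Optimum values from (19).
det = E3 * E4 - E5 ** 2
alpha3_opt = (E1 * E4 - E2 * E5) / det
alpha4_opt = (E2 * E3 - E1 * E5) / det
mse_tps_min = mse_tps_over_sy4(alpha3_opt, alpha4_opt)

print(alpha3_opt, alpha4_opt, mse_tps_min)
```

Because every $E_i$ here equals $(1 - n/n')$ times the matching $A_i$, the optimum exponents coincide with those of the single-phase estimator for this data set; that is a feature of these inputs, not a general claim.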

4. Efficiency Comparisons

In this section, we compare theoretically the performance of the suggested estimators ($t$ and $t^*$) with that of the traditional estimator ($t_0$), the Isaki [1] ratio estimators $t_1$, $t_1^*$, $t_2$, and $t_2^*$ (for single and double auxiliary variables), the Gupta et al. [14] estimator ($t_3$), and the Singh et al. [8] estimator ($t_4$). We have the following conditions:

$$
\begin{aligned}
&\text{(i)} && \mathrm{MSE}(t) - \mathrm{MSE}(t_0) < 0 \iff H_1 < 0, \\
&\text{(ii)} && \mathrm{MSE}(t) - \mathrm{MSE}(t_1) < 0 \iff H_2 < 0, \\
&\text{(iii)} && \mathrm{MSE}(t) - \mathrm{MSE}(t_2) < 0 \iff H_3 + H_4 < 0, \\
&\text{(iv)} && \mathrm{MSE}(t) - \mathrm{MSE}(t_3) < 0 \iff H_5 < H_6, \\
&\text{(v)} && \mathrm{MSE}(t) - \mathrm{MSE}(t_4) < 0 \iff H_5 < H_7, \\
&\text{(vi)} && \mathrm{MSE}(t^*) - \mathrm{MSE}(t_1^*) < 0 \iff \mathrm{MSE}(t) - \mathrm{MSE}(t_1) < S_y^4 H_8, \\
&\text{(vii)} && \mathrm{MSE}(t^*) - \mathrm{MSE}(t_2^*) < 0 \iff \mathrm{MSE}(t) - \mathrm{MSE}(t_2) < S_y^4 H_9,
\end{aligned} \tag{20}
$$

where

$$
\begin{aligned}
H_1 &= \alpha_1^2 A_1 + \alpha_2^2 A_2 + 2 \alpha_1 \alpha_2 A_5 - 2 \alpha_1 A_3 - 2 \alpha_2 A_4, \\
H_2 &= (\alpha_1^2 - 1) A_1 + \alpha_2^2 A_2 - 2 (\alpha_1 - 1) A_3 - 2 \alpha_2 A_4 + 2 \alpha_1 \alpha_2 A_5, \\
H_3 &= (\alpha_1 - W_1) \left[ (\alpha_1 + W_1) A_1 - 2 A_3 \right] + (\alpha_2 + W_1) \left[ (\alpha_2 - W_1) A_2 - 2 A_4 \right] + 2 A_4, \\
H_4 &= 2 (\alpha_1 \alpha_2 + W_1^2 - W_1) A_5 - (1 - 2 W_1) A_2, \\
H_5 &= (\alpha_1^*)^2 A_1 + (\alpha_2^*)^2 A_2 - 2 \alpha_1^* A_3 - 2 \alpha_2^* A_4 + 2 \alpha_1^* \alpha_2^* A_5, \\
H_6 &= -\phi \left( \frac{A_3^2}{A_1} \right) - \left( \frac{A_4^2}{n' A_2} \right), \\
H_7 &= -A_0 \left( \delta \gamma_{0.12}^{*2} + \theta \rho^{*2} \right), \\
H_8 &= (\alpha_3^2 - 1) A_1^* + \alpha_4^2 A_2^* - 2 (\alpha_3 - 1) A_3^* - 2 \alpha_4 A_4^* + 2 \alpha_3 \alpha_4 A_5^*, \\
H_9 &= H + C_1^* + M_1^2 C_2^* + 2 M_1 C_3^* .
\end{aligned} \tag{21}
$$
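Conditions (i) and (ii) can be spot-checked numerically. The sketch below is an illustration under stated assumptions: it uses the Section 5 inputs ($n = 10$), evaluates $H_1$ and $H_2$ at the optimal exponents of (13), and compares them with the MSE differences computed directly from (1) and (12), all in units of $S_y^4$.

```python
# A_0..A_5 from the Section 5 lambda values with n = 10.
n = 10
A0, A1, A2, A3, A4, A5 = [
    (v - 1) / n for v in (2.2667, 3.6500, 2.8664, 2.3377, 2.2208, 3.1400)
]

# Exponents at the optimum (13).
det = A1 * A2 - A5 ** 2
a1 = (A2 * A3 - A4 * A5) / det
a2 = (A1 * A4 - A3 * A5) / det

# H_1 and H_2 as defined in (21).
H1 = a1 ** 2 * A1 + a2 ** 2 * A2 + 2 * a1 * a2 * A5 - 2 * a1 * A3 - 2 * a2 * A4
H2 = ((a1 ** 2 - 1) * A1 + a2 ** 2 * A2 - 2 * (a1 - 1) * A3
      - 2 * a2 * A4 + 2 * a1 * a2 * A5)

# Direct MSE values divided by S_y^4.
mse_t = A0 + a1 ** 2 * A1 + a2 ** 2 * A2 - 2 * a1 * A3 - 2 * a2 * A4 + 2 * a1 * a2 * A5
mse_t0 = A0
mse_t1 = A0 + A1 - 2 * A3

condition_i = H1 < 0    # t beats t0
condition_ii = H2 < 0   # t beats t1
print(H1, H2, condition_i, condition_ii)
```

Both conditions hold for these inputs, which matches the ordering of the corresponding rows in Table 1.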

5. Numerical Illustration

In this section, we illustrate the performance of various estimators of the population variance, $S_y^2$, by considering the data on $Y$: output, $X$: number of workers, and $Z$: fixed capital, given in Murthy [16]. The data summary is as follows:

$$
\begin{aligned}
N &= 80, & n &= 10, \\
\lambda_{400} &= 2.2667, & \lambda_{040} &= 3.6500, & \lambda_{004} &= 2.8664, \\
\lambda_{220} &= 2.3377, & \lambda_{202} &= 2.2208, & \lambda_{022} &= 3.1400 .
\end{aligned} \tag{22}
$$

The MSE and percent relative efficiency (PRE) of various estimators of $S_y^2$, with respect to the conventional estimator $t_0$, have been computed and are presented in Table 1. Note that, for the calculation of the MSE of the TPS estimators such as $t^*$, we take $n' = 25$. Note also that the minimum MSEs of $t_2$ and $t_2^*$ are obtained using $\mathrm{MSE}_{\min}(t_2) = S_y^4 (C_1 - C_3^2 / C_2)$ and $\mathrm{MSE}_{\min}(t_2^*) = S_y^4 (D_1 - D_3^2 / D_2)$.

Table 1: The MSE and PRE of the different estimators with respect to $t_0$.

Estimator                  MSE (in units of $S_y^4$)   PRE
$t_0$                      0.1267                      100
$t_1$                      0.1241                      103
$t_1^*$                    0.1251                      101
$t_{2(\mathrm{opt})}$      0.0586                      217
$t_{2(\mathrm{opt})}^*$    0.0859                      147
$t_3$                      0.0543                      233
$t_4$                      0.0479                      265
$t_{(\mathrm{opt})}$       0.0451                      281
$t_{(\mathrm{opt})}^*$     0.0774                      164
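As a rough cross-check of the rows for $t_0$, $t_1$, $t_1^*$, $t_2$, and $t_2^*$, the sketch below recomputes their MSEs (in units of $S_y^4$) and PREs from the data summary in (22), using the formulas in (1) to (3) and a first-phase sample size of 25. Variable names are hypothetical, and the results should be expected to agree with the tabulated values only to roughly three decimal places because of rounding.

```python
n, n_first = 10, 25  # second-phase and first-phase sample sizes

lam = {"400": 2.2667, "040": 3.6500, "004": 2.8664,
       "220": 2.3377, "202": 2.2208, "022": 3.1400}
order = ("400", "040", "004", "220", "202", "022")

A0, A1, A2, A3, A4, A5 = [(lam[k] - 1) / n for k in order]
_, A1s, A2s, A3s, A4s, A5s = [(lam[k] - 1) / n_first for k in order]

# C, C* and D terms from (3).
C1, C2, C3 = A0 + A2 - 2 * A4, A1 + A2 - 2 * A5, A2 + A3 - A4 - A5
C1s, C2s, C3s = A2s - 2 * A4s, A1s + A2s - 2 * A5s, A2s + A3s - A4s - A5s
D1, D2, D3 = C1 - C1s, C2 - C2s, C3 - C3s

# MSE divided by S_y^4 for each estimator.
mse = {
    "t0": A0,
    "t1": A0 + A1 - 2 * A3,
    "t1*": (A0 + A1 - 2 * A3) - (A1s - 2 * A3s),
    "t2(opt)": C1 - C3 ** 2 / C2,
    "t2*(opt)": D1 - D3 ** 2 / D2,
}

# Percent relative efficiency with respect to t0.
pre = {name: 100 * mse["t0"] / value for name, value in mse.items()}

for name in mse:
    print(f"{name:10s} MSE = {mse[name]:.4f}  PRE = {pre[name]:.0f}")
```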

Table 1 reveals that the suggested estimator $t$ has the smallest MSE, and thus the highest PRE, among all the estimators considered in this study. The suggested TPS estimator $t^*$ also improves on the existing TPS estimators ($t_1^*$ and $t_2^*$) in variance estimation. It is also observed from Table 1 that the TPS estimators are less efficient than their corresponding single-phase counterparts.

6. Conclusion

We have developed a new estimator of the finite population variance under SRSWOR, which is found to be more efficient than the traditional estimator, the Isaki [1] ratio estimators (using single and double auxiliary variables), the Gupta et al. [14] estimator, and the Singh et al. [8] estimator when the conditions outlined in Section 4 are satisfied. This theoretical inference is also supported by the result of an application with original data. In the future, we hope to extend the estimators suggested here to develop a new estimator in stratified random sampling.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

References

[1] C. T. Isaki, "Variance estimation using auxiliary information," Journal of the American Statistical Association, vol. 78, no. 381, pp. 117-123, 1983. DOI: 10.1080/01621459.1983.10477939
[2] B. Prasad and H. P. Singh, "Some improved ratio-type estimators of finite population variance in sample surveys," Communications in Statistics, vol. 19, no. 3, pp. 1127-1139, 1990. DOI: 10.1080/03610929008830251
[3] B. Prasad and H. P. Singh, "Unbiased estimators of finite population variance using auxiliary information in sample surveys," Communications in Statistics, vol. 21, no. 5, pp. 1367-1376, 1992. DOI: 10.1080/03610929208830852
[4] R. S. Biradar and H. P. Singh, "An alternative to ratio estimator of population variance," Assam Statistical Review, vol. 8, no. 2, pp. 18-33, 1994.
[5] M. Rueda Garcia and A. Arcos Cebrian, "Repeated substitution method: the ratio estimator for the population variance," Metrika, vol. 43, no. 2, pp. 101-105, 1996. DOI: 10.1007/BF02613900
[6] A. Arcos, M. Rueda, M. D. Martínez, S. González, and Y. Román, "Incorporating the auxiliary information available in variance estimation," Applied Mathematics and Computation, vol. 160, no. 2, pp. 387-399, 2005. DOI: 10.1016/j.amc.2003.11.010
[7] C. Kadilar and H. Cingi, "Improvement in estimating the population mean in simple random sampling," Applied Mathematics Letters, vol. 19, no. 1, pp. 75-79, 2006. DOI: 10.1016/j.aml.2005.02.039
[8] H. P. Singh, S. Singh, and J. M. Kim, "Efficient use of auxiliary variables in estimating finite population variance in two-phase sampling," Communications of the Korean Statistical Society, vol. 17, no. 2, pp. 165-181, 2010.
[9] I. Olkin, "Multivariate ratio estimation for finite populations," Biometrika, vol. 45, pp. 154-165, 1958.
[10] S. R. Srivastava, S. R. Srivastava, and B. B. Khare, "Chain ratio type estimator for ratio of two population means using auxiliary characters," Communications in Statistics, vol. 18, no. 10, pp. 3917-3926, 1989. DOI: 10.1080/03610928908830131
[11] L. N. Upadhyaya, K. S. Kushwaha, and H. P. Singh, "A modified chain ratio-type estimator in two-phase sampling using multi auxiliary information," Metron, vol. 48, no. 1-4, pp. 381-393, 1990.
[12] V. K. Singh, H. P. Singh, H. P. Singh, and D. Shukla, "A general class of chain estimators for ratio and product of two means of a finite population," Communications in Statistics, vol. 23, no. 5, pp. 1341-1355, 1994. DOI: 10.1080/03610929408831325
[13] L. Chand, Some ratio-type estimators based on two or more auxiliary variables [Ph.D. thesis], Iowa State University, Ames, Iowa, USA, 1975.
[14] R. K. Gupta, S. Singh, and N. S. Mangat, "Some chain ratio type estimators for estimating finite population variance," Aligarh Journal of Statistics, vol. 12-13, pp. 65-69, 1992-1993.
[15] W. A. Abu-Dayyeh, M. S. Ahmed, R. A. Ahmed, and H. A. Muttlak, "Some estimators of a finite population mean using auxiliary information," Applied Mathematics and Computation, vol. 139, no. 2-3, pp. 287-298, 2003. DOI: 10.1016/S0096-3003(02)00180-7
[16] M. N. Murthy, Sampling Theory and Methods, Statistical Publishing Society, Calcutta, India, 1967.