Calibration Approach for Variance Estimation of Small Domain

Department of Statistics, Faculty of Science & Technology, Mahatma Gandhi Kashi Vidyapith, Varanasi, India; Department of Mathematics and Statistics, International Islamic University, Islamabad, Pakistan; Department of Mathematics and Statistics, PMAS-Arid Agriculture University, Rawalpindi, Pakistan; Department of Statistics, Federal Urdu University of Arts, Science and Technology, Islamabad, Pakistan; Department of Economics, Maasai Mara University, Narok, Kenya


Introduction
Sample surveys are generally used to estimate population parameters. This study, however, is concerned with estimating subpopulation totals. Subpopulations that are classified by characteristics such as socio-economic status or geographical region are also termed domains. Estimates of subpopulation (domain) totals have become a popular and effective tool for framing the programs and policies of government and the private sector, and the demand for such estimates has grown rapidly in recent years. Purcell and Kish [1] classified subpopulations by their size relative to the population. The simple classification of the subpopulations is as follows:
Major subpopulation: it comprises 1/10 of the population or more, for example, major geographical regions (north, east, west, south, central), 10-year age groups, or major classes such as occupations.
Minor subpopulation: it comprises between 1/10 and 1/100 of the population, for example, state populations, single years of age, or two-fold classifications such as education and occupation.
Mini subpopulation: it comprises between 1/100 and 1/10000 of the population, for example, the populations of counties (more than 3000 in the U.S.A.) or three-fold classifications such as age, education, and occupation.
Rare subpopulation: it comprises less than 1/10000 of the population, for example, health service regions classified by local region of residence.

Subpopulation total estimates can be used to measure social exclusion and well-being levels, and some environmental and epidemiological issues can also be addressed through subpopulation estimates. The properties of domain estimates obtained with the popular direct and indirect methods are explained in the literature (see Rao [2] and Singh [3]). Rahman [4] described direct and indirect estimation with model-based ideas. The number of sampled units available in the study subpopulation depends on the method. When that number is low or near zero for some subpopulations, we use the indirect method, in which the sample is selected from the population (which consists of the subpopulations) rather than from the subpopulation, and units from the surrounding subpopulations are used to compensate for the shortage of units in the study subpopulation. If the surrounding units are similar in nature to the subpopulation, then the estimate is precise enough to be acceptable at the desired level. Indirect estimates of subpopulation parameters such as the total, based on an auxiliary variable, are explained by Rao and Molina [5]. Singh et al. [6] estimated poverty indicators, many socio-economic indicators, and food insecurity for subpopulation totals. In the indirect method, the availability of the auxiliary variable for the subpopulation is of significant importance (Tikkiwal et al. [7]). Ratio and regression estimators of the population total through the calibration approach have been discussed by numerous authors, such as Särndal [8], Singh and Mohl [9], Singh et al. [10], and Wu and Sitter [11].
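The Purcell and Kish [1] size classes listed earlier in this section can be expressed as a small lookup. The following is a minimal sketch; the function name is hypothetical, and the treatment of the exact boundaries (a share of exactly 1/100 counted as minor, exactly 1/10000 counted as mini) is an assumption, since the text does not say which class owns a boundary value.

```python
from fractions import Fraction


def classify_subpopulation(domain_size, population_size):
    """Return the Purcell-Kish size class of a domain.

    Boundary handling is an assumption of this sketch: a share equal to a
    cut-off is assigned to the larger class.
    """
    if population_size <= 0 or not 0 <= domain_size <= population_size:
        raise ValueError("need 0 <= domain_size <= population_size and population_size > 0")
    # Exact rational arithmetic avoids floating-point trouble at the cut-offs.
    share = Fraction(domain_size, population_size)
    if share >= Fraction(1, 10):
        return "major"
    if share >= Fraction(1, 100):
        return "minor"
    if share >= Fraction(1, 10000):
        return "mini"
    return "rare"


if __name__ == "__main__":
    for size in (150, 50, 5):
        print(size, classify_subpopulation(size, 1000))
```

Both arguments are expected to be integer counts of units.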
The extension of calibration-based mean estimation through an auxiliary variable has been discussed by Koyuncu and Kadilar [12] and Koyuncu [13]. Furthermore, the estimation of the subpopulation parameter was considered by Khare et al. [14]. The calibration estimator was initiated by Horvitz and Thompson [15] for the study variable Y, to estimate the population total under simple random sampling without replacement designs. Deville and Särndal [16] incorporated the auxiliary variable under the restriction of minimum chi-square distance. Särndal [8] estimated the variance of the ratio and regression estimators for the population parameter. The mean square error using a model-based approach was given by Slud and Maiti [17]. More recently, classes of estimators for variance estimation have been discussed by Bhushan et al. [18,19]. Motivated by Särndal [8], we propose a calibration estimate for the subpopulation total as well as for its variance.

The rest of the article is organized as follows. First, the Methodology is explained, and the variance of the proposed estimator is evaluated using the calibration approach at the low and high levels, respectively. Then, a class of calibration estimators is presented. Furthermore, theoretical and empirical comparisons are presented and summarized along with the Concluding Results. Finally, Recommendations and Applications are outlined.

Methodology
Consider the $a$th subpopulation $\Omega_a$. A sample is selected from the population, and the sampled units that belong to the $a$th domain form $S_a$, with $S_a \subset \Omega_a$. The overall sample size is $n = \sum_{a=1}^{A} n_a$. The sample is selected from the population rather than from the subpopulation, and the subpopulation total of the auxiliary variable should be known, so that this auxiliary information can be used in the proposed estimator. In the current work, the ideas of Deville and Särndal [16] and Tikkiwal et al. [7] are used in the same context.

The proposed indirect generalized regression (GREG) estimator of the $a$th subpopulation total, (1), is a weighted sum of the study variable over the sampled units, where the weights $t_{i,a}$, $i = 1, \ldots, n$, are updated weights that are close to the design weights $d_i$. Using the auxiliary variable is a good option in the calibration estimator of the $a$th domain. The auxiliary equation (2) requires that the weighted sum of the auxiliary variable over the $n$ sampled units equals the known subpopulation total $X_a$ of the auxiliary variable $X$. The closeness of $t_{i,a}$ to $d_i$ is measured by the minimum chi-square distance function (3), where $q_i$ is a chosen constant. Combining (2) and (3) through the multiplier $\delta$ gives the calibration equation (4). Partially differentiating (4) with respect to the new weights $t_{i,a}$ gives (5), which simplifies to an expression for $t_{i,a}$ in terms of $\delta$. Substituting this $t_{i,a}$ into the auxiliary equation (2) gives the value of $\delta$, and substituting that value back gives the new weight (7). Substituting the new weight $t_{i,a}$ into (1) gives the proposed estimator (8).
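The steps described in this section match the usual chi-square calibration argument of Deville and Särndal [16]. The LaTeX sketch below writes that argument out under assumed notation. The factor 2 attached to the multiplier, the unnormalised distance, and the symbol $\hat{\beta}_a$ are conveniences chosen here, so the displayed forms may differ in detail from the authors' equations (1)-(8).

```latex
% Sketch only (requires amsmath). Notation assumed for illustration:
% d_i design weights, t_{i,a} calibrated weights, q_i chosen constants,
% X_a known domain total of x, \delta the multiplier.
\begin{align*}
\hat{T}_{PD,a} &= \sum_{i=1}^{n} t_{i,a}\, y_i
   && \text{estimator, cf. (1)} \\
\sum_{i=1}^{n} t_{i,a}\, x_i &= X_a
   && \text{auxiliary equation, cf. (2)} \\
\Delta &= \sum_{i=1}^{n} \frac{(t_{i,a}-d_i)^2}{d_i q_i}
   && \text{chi-square distance, cf. (3)} \\
\Phi &= \sum_{i=1}^{n} \frac{(t_{i,a}-d_i)^2}{d_i q_i}
        - 2\delta\Bigl(\sum_{i=1}^{n} t_{i,a}\, x_i - X_a\Bigr)
   && \text{calibration equation, cf. (4)} \\
\frac{\partial \Phi}{\partial t_{i,a}}
   &= \frac{2\,(t_{i,a}-d_i)}{d_i q_i} - 2\delta x_i = 0
   && \text{cf. (5)} \\
t_{i,a} &= d_i + \delta\, d_i q_i x_i
   && \text{cf. (6)} \\
\delta &= \frac{X_a - \sum_{i=1}^{n} d_i x_i}{\sum_{i=1}^{n} d_i q_i x_i^2} \\
t_{i,a} &= d_i + \frac{d_i q_i x_i}{\sum_{i=1}^{n} d_i q_i x_i^2}
           \Bigl(X_a - \sum_{i=1}^{n} d_i x_i\Bigr)
   && \text{new weight, cf. (7)} \\
\hat{T}_{PD,a} &= \sum_{i=1}^{n} d_i y_i
   + \hat{\beta}_a \Bigl(X_a - \sum_{i=1}^{n} d_i x_i\Bigr),
   \qquad
   \hat{\beta}_a = \frac{\sum_{i=1}^{n} d_i q_i x_i y_i}{\sum_{i=1}^{n} d_i q_i x_i^2}
   && \text{cf. (8)}
\end{align*}
```

Under these assumptions, $q_i = 1$ leaves a regression-type correction, while $q_i = 1/x_i$ collapses the weight to $t_{i,a} = d_i X_a / \sum_i d_i x_i$, the ratio form.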

Variance of $T_{PD,a}$ through Low-Level Calibration Approach
Särndal et al. [20] provided the variance of the GREG estimator, and Zaman and Bulut [21] recently discussed the variance of ratio estimators. The proposed estimator is like a regression estimator. To obtain the variance of the population total estimator of Deville and Särndal [16], the variance is written in the form of Yates and Grundy [22], with a weight $D_{ij}$ attached to each pair of sampled units $i$ and $j$. Because the proposed estimator is an indirect estimator for the subpopulation, we treat it as asymptotically unbiased. The variance of the subpopulation total is given in (9), where $e_{i,a} = y_{i,a} - \beta X_a$, and $\beta$ and $D_{ij}$ are as defined previously.

The members of the proposed estimator are as follows:
(I) For $q_i = 1$, the proposed estimator in (8) reduces to the GREG estimator.
(II) For $q_i = 1/x_i$, the proposed estimator in (8) becomes the ratio estimator.

The variance of $T_{PD,DS,a}$ can be obtained with the help of (9), and the same expression yields the variance of the proposed estimator and the variance of the ratio estimator. The GREG estimator is evaluated under simple random sampling without replacement. The probability that the $i$th unit is selected is $\pi_i = n/N$, the probability that the $j$th unit is selected is $\pi_j = n/N$, and the probability that both the $i$th and $j$th units are selected is $\pi_{ij} = n(n-1)/(N(N-1))$. Under this design, the low-level calibration variance of the GREG estimator of the subpopulation total is expressed through $e_{i,a} = y_{i,a} - \beta X_a$ and $f = n/N$, and the variance of the ratio estimator given in (15) is expressed through $e_{i,a} = y_{i,a} - (y/x) X_a$.

For the indirect regression estimator, we used the idea of Deng and Wu [23] to estimate the variance of the subpopulation total. The low-level calibration variance of the ratio estimator, $V_L(T_{YG,2,a})$, can be estimated for the subpopulation using (15). This leads to the class of estimators (19), where $e_{i,a} = y_{i,a} - \beta X_a$ and $(X_a/X)^h$ is expanded up to second order, with higher-order terms neglected because they are small. Substituting $h = 2$ in (19) gives the variance of the ratio estimator.

The variance of the linear regression estimator is a special case of the variance estimator of the ratio estimator of the domain total. If the subpopulation total is equal to the population total, that is, $(X_a/X) = 1$, then (19) reduces to the linear form of the class of estimators, which gives the variance of the regression estimator. The low-level calibration variance $V_L(T_{YG,1,a})$ of the regression estimator is more efficient than that of the ratio estimator when $(X_a/X) > 1$. However, for $(X_a/X) < 1$, the low-level calibration variance of the ratio estimator is always lower than that of the regression estimator. The variance of the GREG estimator is equal to that of the class of estimates of Deng and Wu [23].
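To make the two members and the Yates-Grundy form concrete, the following is a minimal Python sketch under simple random sampling without replacement. It is not the authors' code. The function names, the toy numbers, the pair weight $D_{ij} = (\pi_i \pi_j - \pi_{ij})/\pi_{ij}$, and the residual $e_i = y_i - \beta x_i$ are assumptions of this illustration. Passing the calibrated weights to the variance function is one common reading of "low-level calibration".

```python
import statistics


def calibrated_weights(d, x, q, X_a):
    """Chi-square calibrated weights t_i = d_i + delta * d_i * q_i * x_i."""
    x_ht = sum(di * xi for di, xi in zip(d, x))            # Horvitz-Thompson total of x
    denom = sum(di * qi * xi ** 2 for di, qi, xi in zip(d, q, x))
    delta = (X_a - x_ht) / denom                           # multiplier solving the constraint
    return [di + delta * di * qi * xi for di, qi, xi in zip(d, q, x)]


def yates_grundy_srswor(e, N, w=None):
    """0.5 * sum_{i != j} D_ij (w_i e_i - w_j e_j)^2 under SRSWOR.

    Assumed pair weight: D_ij = (pi_i * pi_j - pi_ij) / pi_ij.
    w defaults to the design weights N/n; pass calibrated weights instead
    for a low-level calibrated variant.
    """
    n = len(e)
    pi_i = n / N
    pi_ij = n * (n - 1) / (N * (N - 1))
    D = (pi_i * pi_i - pi_ij) / pi_ij                      # constant for every pair
    if w is None:
        w = [N / n] * n
    total = sum((w[i] * e[i] - w[j] * e[j]) ** 2
                for i in range(n) for j in range(n) if i != j)
    return 0.5 * D * total


def srswor_closed_form(e, N):
    """Textbook SRSWOR form N^2 (1 - f) s_e^2 / n, used here as a cross-check."""
    n = len(e)
    return N ** 2 * (1 - n / N) * statistics.variance(e) / n


if __name__ == "__main__":
    N = 50                                   # hypothetical population size
    x = [4.0, 7.0, 5.0, 9.0, 6.0]            # hypothetical auxiliary values
    y = [8.1, 13.8, 10.4, 18.3, 11.9]        # hypothetical study values
    d = [N / len(x)] * len(x)
    X_a = 62.0                               # hypothetical known domain total of x

    t_greg = calibrated_weights(d, x, [1.0] * len(x), X_a)           # member (I)
    t_ratio = calibrated_weights(d, x, [1.0 / xi for xi in x], X_a)  # member (II)
    beta = (sum(di * xi * yi for di, xi, yi in zip(d, x, y))
            / sum(di * xi ** 2 for di, xi in zip(d, x)))
    e = [yi - beta * xi for xi, yi in zip(x, y)]

    print("GREG-type total :", sum(t * yi for t, yi in zip(t_greg, y)))
    print("ratio-type total:", sum(t * yi for t, yi in zip(t_ratio, y)))
    print("low-level variance (GREG weights):", yates_grundy_srswor(e, N, w=t_greg))
```

With design weights, the pairwise form agrees with the closed form $N^2(1-f)s_e^2/n$, which gives a quick check that the pair weights are coded correctly.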

Variance of $T_{PD,a}$ through High-Level Calibration Approach
The high-level calibration adjusts the weights attached to the selection of the $i$th unit, the $j$th unit, and both the $i$th and $j$th units. We estimate and examine the variance of the ratio and regression estimators under the high-level calibration approach through (22), where $\Omega_{ij}$ is the new weight, which is very close to $D_{ij}$. In the simple calibration equation (23), the variance of the estimated auxiliary total is assumed known, and the auxiliary total of each subpopulation, $X_a = \sum_{i=1}^{N_a} x_i$, should also be known. This plays the role of the auxiliary equation (see (3)) for the Horvitz-Thompson estimator, which is written as $X_{HT} = \sum_{i=1}^{n} d_i x_i$. We estimated the value of the GREG estimator for the subpopulation total, using auxiliary information from census registers, previous surveys, or administrative registers. The estimation of the variance of regression and ratio estimators has been treated by Singh and Srivastava [24], Srivastava and Jhajj [25], Isaki [26], and Wu [27]. Fuller [28] gave the adjustment of the weights for the regression estimator.

The minimum chi-square distance function measures the distance between the new weights $\Omega_{ij}$ and the weights $D_{ij}$ of the $i$th and $j$th units ($i, j = 1, 2, \ldots, n$), where $L_{ij}$ is a chosen constant. The distance is minimized under the restriction (23), and the optimum value of the Lagrange multiplier $\lambda_a$ is obtained with the help of (23) and (24). Substituting the value of $\Omega_{ij}$ from (26) into (22) gives the variance estimate for the GREG estimator, (28). The members of the high-level calibration approach are as follows:
(I) If we put the weights $L_{ij,a} = 1/(d_i x_{i,a} - d_j x_{j,a})^2$ in (28), we obtain the variance of the ratio estimator through high-level calibration, where the sample variance is an asymptotically unbiased estimate of the population variance.
(II) For $L_{ij,a} = 1$, equation (28) becomes the variance of the regression estimator, given in (31). Equation (31) shows that the high-level calibration estimator is different from that of Deng and Wu [23].
The GREG and ratio estimators are members of the class of estimators of Srivastava and Jhajj [25], where $F(\cdot, \cdot)$ is a function such that $F(1, 1) = 1$ and that satisfies certain regularity conditions. The high-level calibration estimate is better than the low-level calibration estimates and than those of Srivastava and Jhajj [25] and Deng and Wu [23].
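The high-level step can be sketched numerically. The Python sketch below assumes a particular form of the problem, which the text does not display: the pair weights $\Omega_{ij}$ minimise $\sum_{i \neq j} (\Omega_{ij} - D_{ij})^2 / (D_{ij} L_{ij})$ subject to $\tfrac{1}{2}\sum_{i \neq j} \Omega_{ij} (d_i x_i - d_j x_j)^2 = V(X_{HT})$, with a constant $D_{ij}$ as under simple random sampling without replacement. The function names, the callable `L_of_g`, and the toy numbers are hypothetical, so this is an illustration rather than the authors' implementation.

```python
def pairwise_sq_diff(d, z):
    """g_ij = (d_i z_i - d_j z_j)^2 for every ordered pair i != j."""
    n = len(z)
    return {(i, j): (d[i] * z[i] - d[j] * z[j]) ** 2
            for i in range(n) for j in range(n) if i != j}


def high_level_weights(D, d, x, V_x, L_of_g):
    """Calibrated pair weights Omega_ij (assumed form, see lead-in).

    D      : constant Yates-Grundy pair weight (SRSWOR assumption)
    V_x    : known variance of the Horvitz-Thompson total of x
    L_of_g : callable returning the chosen constant L_ij from g_ij
             (a convenience of this sketch)
    """
    g = pairwise_sq_diff(d, x)
    L = {p: L_of_g(gp) for p, gp in g.items()}
    v_x = 0.5 * sum(D * gp for gp in g.values())            # uncalibrated variance estimate for x
    scale = (V_x - v_x) / (0.5 * sum(D * L[p] * g[p] ** 2 for p in g))
    return {p: D * (1.0 + scale * L[p] * g[p]) for p in g}


def pair_variance(omega, d, e):
    """0.5 * sum_{i != j} Omega_ij (d_i e_i - d_j e_j)^2."""
    h = pairwise_sq_diff(d, e)
    return 0.5 * sum(omega[p] * h[p] for p in omega)


if __name__ == "__main__":
    d = [10.0] * 5                        # design weights N/n with N = 50, n = 5
    x = [4.0, 7.0, 5.0, 9.0, 6.0]         # hypothetical, pairwise distinct auxiliary values
    e = [0.3, -0.5, 0.1, 0.4, -0.2]       # hypothetical residuals
    D = 0.225                             # (N - n) / (N (n - 1)) for N = 50, n = 5
    V_x = 1500.0                          # hypothetical known variance of X_HT

    omega_reg = high_level_weights(D, d, x, V_x, lambda g: 1.0)        # L_ij = 1
    omega_ratio = high_level_weights(D, d, x, V_x, lambda g: 1.0 / g)  # L_ij = 1 / g_ij
    print("regression-type high-level variance:", pair_variance(omega_reg, d, e))
    print("ratio-type high-level variance     :", pair_variance(omega_ratio, d, e))
```

Under these assumptions, the choice $L_{ij} = 1/(d_i x_i - d_j x_j)^2$ rescales every pair weight by the same factor, the known variance of $X_{HT}$ divided by its sample estimate, which is why that member has a ratio form. This choice needs $d_i x_i \neq d_j x_j$ for every pair.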

Class of Calibration Estimators
This section presents the class of estimators. A class of estimators is a collection of estimators that, under certain regularity conditions, have the same variance. Let $r = X_a / \sum_{i=1}^{n} d_i x_i$, and let $s$ be the ratio formed from the known variance $V(X_{HT})$ and its sample estimate. The variance of a class of estimators of the subpopulation total is expressed through a function $F(r, s)$ of $r$ and $s$. The function satisfies $F(1, 1) = 1$ and the following regularity conditions:
(1) The function $F(r, s)$ exists for all values of $(r, s)$ in a bounded subset of two-dimensional real space that contains the observed point $(r, s)$ defined above.
(2) The first- and second-order partial derivatives of the function $F(r, s)$ exist and are continuous and bounded.
Different members of the class of estimators exist under these regularity conditions. We consider three members, in which the values of $\eta$ and $\kappa$ depend on the estimated value. The asymptotic variance of the three estimators is the same as that of Srivastava and Jhajj [25] and Singh and Singh [29]. Our proposed estimator is better than those of Srivastava and Jhajj [25] and Singh and Singh [29], and hence also better than the class of estimators under regularity conditions (1) and (2).

Theoretical Comparison.
The theoretical comparison considers the efficiency of the high-level and low-level indirect variance estimates. The high-level variance of the ratio estimator for the subpopulation total and the estimated variance of the ratio estimator through the low-level calibration approach for the $a$th subpopulation total are compared first. From (37) and (35), the high-level variance of the ratio estimate is lower than the low-level variance, subject to a condition on $S^2$. The same comparison is made between the high-level and low-level variance estimates of the GREG estimator. With the help of (37) and (38), the restriction concerns the last term, which involves $s^2$. We can say that the high-level calibration approach is more efficient than the low-level calibration approach for both the ratio and regression estimators.

Empirical Study
This section presents an empirical comparison based on simulation. We take a real data set from Särndal et al. [30], in which the 1975 population is used as the auxiliary variable and the 1985 population is used as the study variable.
The subpopulations are regions of Sweden's municipalities. We take only five regions (subpopulations), 1, 2, 3, 7, and 8, out of the eight existing subpopulations 1, 2, 3, 4, 5, 6, 7, and 8. The study subpopulations have 25, 48, 32, 15, and 29 units. For the simulation, we select random samples of approximately 10%, 20%, and 30% of the units from the population, recording the study and auxiliary characters y and x, respectively. This process is repeated a finite number of times, and the estimated error is then obtained. Equations (34)-(38) are used to obtain the variances of the ratio estimator and the generalized estimator of the $a$th subpopulation with low-level and high-level calibration.

The variances of the ratio estimator through high-level calibration are lower than the low-level calibration estimates for all subpopulations. For sample sizes 15, 30, and 45, the high-level estimate is lower than the low-level estimate by 4.5 to 10.43, 1.75 to 9.5, and 2.53 to 7.1 percent across domains, respectively. The low-level calibration variance estimates decrease as the sample size increases from 15 to 45, and a similar pattern is observed for the high-level calibration approach.

The estimated variance of the GREG estimator through the high level is smaller than the low-level estimate for all the subpopulations. For sample sizes 15, 30, and 45, the high-level estimate is lower than the low-level GREG estimate by 2.4 to 12.55, 1.78 to 8.8, and 2.32 to 10.41 percent across subpopulations, respectively. The spread of the variance of the generalized estimator is wider than that of the ratio estimator for the subpopulations. The low-level calibration variance of the GREG estimator decreases as the sample size increases from 15 to 45. The variances of the ratio and regression estimators with the low-level and high-level calibration approaches for 15, 30, and 45 units are given; see Figures 1-3.
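The repeated-sampling procedure described in this section can be sketched as a small Monte Carlo experiment. The sketch below does not use the Swedish municipalities data. It generates a synthetic population, and every number in it (population size, domain labels, model for y, seed, number of repetitions) is a hypothetical choice. It evaluates an indirect ratio-type estimate of one domain total and reports the Monte Carlo variance of that estimate for each sample size.

```python
import random
import statistics


def simulate(sample_sizes=(15, 30, 45), reps=2000, seed=1):
    """Monte Carlo variance of an indirect ratio estimate of one domain total."""
    rng = random.Random(seed)

    # Synthetic population (NOT the data of Sarndal et al. [30]).
    N = 200
    x = [rng.uniform(10.0, 100.0) for _ in range(N)]          # auxiliary variable
    y = [1.1 * xi + rng.gauss(0.0, 5.0) for xi in x]          # study variable
    domain = [k % 5 for k in range(N)]                        # five artificial domains
    X_a = sum(xi for xi, a in zip(x, domain) if a == 0)       # known auxiliary total, domain 0

    results = {}
    for n in sample_sizes:
        estimates = []
        for _ in range(reps):
            s = rng.sample(range(N), n)                       # SRSWOR from the whole population
            ratio = sum(y[k] for k in s) / sum(x[k] for k in s)
            estimates.append(ratio * X_a)                     # indirect ratio-type estimate
        results[n] = statistics.variance(estimates)
    return results


if __name__ == "__main__":
    for n, var in simulate().items():
        print(f"n = {n:2d}  Monte Carlo variance = {var:.2f}")
```

Replacing the synthetic population with the real data and the Monte Carlo variance with the low-level and high-level variance estimates would bring the sketch closer to the comparison reported here.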

Concluding Results
Tables 1 and 2 and Figures 1-6 show that the high-level calibration estimate of the ratio estimator is preferred over the low-level calibration estimate for subpopulations 1, 2, 3, 4, and 5. Under the calibration approach, the ratio estimate is more effective than the generalized estimate. The low-level estimate performs worse than the high-level estimate for both estimators. The regression estimate has a wider range of variance across the subpopulations than the ratio estimate. The generalized estimator discussed here is a member of the generalized class of Srivastava and Jhajj [25]. The ratio estimate under high-level calibration is superior to that of Deng and Wu [23]. Based on the theoretical and empirical findings, we conclude that the proposed estimate is more efficient than the regression estimates proposed by Srivastava and Jhajj [25] and Deng and Wu [23] for subpopulations 1, 2, 3, 4, and 5.

Recommendations and Applications.
The following recommendations are given:
(1) The recommendations are based on the theoretical and empirical analysis of the ratio and generalized estimates of the subpopulation total.
(2) The present estimate is useful when the domain total of the auxiliary variable is available but the number of units in the subpopulation is small.
(3) The value of the indirect estimate also depends on how close the subpopulation value of the auxiliary variable is to the value estimated from the sample of the population.
(4) The high-level variance estimates for the subpopulations are a better option than the low-level variance estimates of the ratio and regression estimates. The high-level variance estimate of the ratio estimator can be applied to health-related problems such as epidemiological issues, to environmental issues, and to welfare programs, and to estimates for areas that are similar to other areas.
Data Availability
The data used to obtain the results are included within the study.

Conflicts of Interest
The authors declare that there are no conflicts of interest.

Mathematical Problems in Engineering