Estimation of a Finite Population Mean under Random Nonresponse Using Kernel Weights

Nonresponse is a potential source of error in sample surveys. It introduces bias and inflates the variance of estimators of finite population parameters. Regression models that use auxiliary data are a recognized technique for reducing the bias and variance due to random nonresponse. In this study, random nonresponse is assumed to occur in the survey variable at the second stage of cluster sampling, while full auxiliary information is assumed to be available throughout. The auxiliary information is used at the estimation stage, via an improved Nadaraya-Watson kernel regression technique, to compensate for random nonresponse. The asymptotic bias and mean squared error of the proposed estimator are derived. In addition, a simulation study indicates that the proposed estimator has smaller bias and smaller mean squared error than existing estimators of a finite population mean. The proposed estimator is also shown to yield shorter confidence intervals at the 95% coverage rate. The results obtained in this study are useful, for instance, in choosing efficient estimators of a finite population mean in demographic sample surveys.


Introduction
Many authors, such as [1][2][3][4], have studied estimation of a finite population mean in the presence of nonresponse under various assumptions. However, the estimators developed in these studies leave room for improvement in both efficiency and bias. To improve estimation of a finite population mean in the presence of random nonresponse, this study proposes an estimator based on the improved Nadaraya-Watson kernel regression technique first put forward by [5]. Auxiliary information is used through this technique to compensate for random nonresponse.
An improvement of the Nadaraya-Watson estimator [6,7] has been proposed by [5] using a local bandwidth factor $\lambda_{ij}$ determined using the algorithm of [8]. The improved Nadaraya-Watson estimator is given by equation (1), where $b$ is a smoothing parameter, while the local bandwidth factor $\lambda_{ij}$ is given by equation (2), where $a$ is an arithmetic mean given by $a = \sum_{i=1}^{n} \sum_{j=1}^{m} m(X_{ij})/mn$ and $\alpha$ is a sensitivity parameter which satisfies $0 \le \alpha \le 1$. It has been suggested by [8] that taking $\alpha = 1/2$ produces good results.
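A minimal Python sketch of a kernel regression fit with local bandwidth factors may help fix ideas. It is written under stated assumptions rather than as the estimator of [5]: the pilot values are taken to be a fixed-bandwidth kernel density estimate, the factors are computed as (pilot / arithmetic mean of pilot) raised to the power $-\alpha$, and each kernel term is scaled by $1/(\lambda_{ij} b)$. The function names and the linear test function are hypothetical.

```python
import numpy as np

def gaussian_kernel(w):
    """Standard normal density, used as the kernel K."""
    return np.exp(-0.5 * w**2) / np.sqrt(2.0 * np.pi)

def local_bandwidth_factors(x, b, alpha=0.5):
    """Local factors lambda = (pilot / mean(pilot)) ** (-alpha).

    ASSUMPTION: the pilot values are a fixed-bandwidth kernel density
    estimate at each x, normalised by their arithmetic mean.
    """
    u = (x[:, None] - x[None, :]) / b
    pilot = gaussian_kernel(u).mean(axis=1) / b
    return (pilot / pilot.mean()) ** (-alpha)

def improved_nw(x_eval, x, y, b, lam):
    """Nadaraya-Watson fit with one bandwidth lam * b per observation."""
    h = lam * b
    u = (np.atleast_1d(x_eval)[:, None] - x[None, :]) / h[None, :]
    k = gaussian_kernel(u) / h[None, :]   # ASSUMED 1/h scaling of each term
    return (k * y[None, :]).sum(axis=1) / k.sum(axis=1)

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=200)
y = 1.0 + 2.0 * x + rng.normal(0.0, 0.1, size=200)  # hypothetical mean function
lam = local_bandwidth_factors(x, b=0.1)
fit = improved_nw(np.array([0.25, 0.5, 0.75]), x, y, b=0.1, lam=lam)
print(fit)
```

With $\alpha = 0$ every factor equals one and the sketch reduces to the ordinary fixed-bandwidth Nadaraya-Watson fit; larger $\alpha$ widens the bandwidth where the pilot values are small.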

The Proposed Estimator of Finite Population Mean Using Improved Nadaraya-Watson Kernel Regression Technique
Consider a finite population of size $N$ consisting of $M$ clusters with $N_i$ elements in the $i$th cluster. A sample of $m$ clusters is selected so that, in cluster $i$, $n_{1i}$ units respond and $n_{2i}$ units fail to respond. Let $y_{ij}$ denote the value of the survey variable $y$ for unit $j$ in cluster $i$, for $i = 1, 2, \ldots, M$, $j = 1, 2, \ldots, N_i$, and let the population mean be given by The proposed estimator is given by where $\hat{y}_{ij}$ is an estimator of the nonresponse component of the sample. Assuming auxiliary information $X_{ij}$ is known throughout, $\hat{y}_{ij}$ can be obtained using the improved Nadaraya-Watson regression technique by so that the estimator of the finite population mean can be rewritten as A special case where $n_{1i} = n_{2i} = n$ is assumed in this study.
This simplifies mathematical computations so that equation (7) can be rewritten as where $m_{INW}(x_{ij})$ is the improved Nadaraya-Watson kernel regression estimator given in equation (1), which is a weighted sum of the values of the survey variable $Y_{ij}$. Data are generated using a regression model given by $Y_{ij} = m(X_{ij}) + e_{ij}$, where $m(\cdot)$ is an unknown smooth function of the auxiliary random variables $X_{ij}$. It is assumed that the error term $e_{ij}$ satisfies the following conditions: $\operatorname{Var}(e_{ij}) = \sigma_{ij}^2$ and $\operatorname{Cov}(e_i, e_j) = 0$. Hence, the unspecified function of the auxiliary random variables $m(x_{ij})$ is replaced by the improved Nadaraya-Watson kernel estimator $m_{INW}(x_{ij})$. The estimator can be rewritten as where the coefficients of $Y_{ij}$ are the improved Nadaraya-Watson kernel weights and $K(\cdot)$ is a given kernel function assumed to be symmetric. Since the choice of the kernel function is not critical for the performance of the kernel regression estimator, a Gaussian kernel with mean 0 and variance 1 is used in this study. This is given by $K(w) = \frac{1}{\sqrt{2\pi}} \exp(-w^2/2)$. In this case, the improved Nadaraya-Watson kernel estimate at any point $x_{ij}$ is given by where $b$ is the bandwidth while $\lambda_{ij}$ is given in equation (2) due to [5]. This provides a way of estimating the nonresponse values of the survey variable $Y_{ij}$ in the $i$th cluster, given the auxiliary values $x_{ij}$, for a specified kernel function.
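The idea of replacing nonrespondents' values by kernel-weighted averages of respondents' values can be sketched in Python as follows. This is an illustration under assumptions, not the proposed estimator itself: a single fixed bandwidth `h` is used (all local factors set to one), and the mean is taken as a simple average of observed and imputed values over the sample. All names are hypothetical.

```python
import numpy as np

def nw_predict(x_new, x_obs, y_obs, h):
    """Kernel-weighted average of y_obs; each row of weights sums to one."""
    u = (x_new[:, None] - x_obs[None, :]) / h
    k = np.exp(-0.5 * u**2)                # Gaussian kernel; constant cancels
    w = k / k.sum(axis=1, keepdims=True)   # kernel weights
    return w @ y_obs, w

def imputed_mean(x, y, responded, h):
    """Sample mean after filling nonrespondents with kernel predictions.

    ASSUMPTION: observed and imputed values are averaged with equal weight.
    """
    y_hat, _ = nw_predict(x[~responded], x[responded], y[responded], h)
    filled = y.copy()
    filled[~responded] = y_hat
    return float(filled.mean())

rng = np.random.default_rng(1)
x = rng.uniform(0.0, 1.0, size=400)                 # auxiliary values, fully known
y_true = 2.0 + x + rng.normal(0.0, 0.2, size=400)   # hypothetical survey values
responded = rng.random(400) < 0.5                   # random nonresponse
y = np.where(responded, y_true, np.nan)             # nonrespondents unobserved

est = imputed_mean(x, y, responded, h=0.1)
_, weights = nw_predict(x[~responded], x[responded], y[responded], 0.1)
print(est)
```

Only the respondents' survey values enter the prediction; the auxiliary values of the nonrespondents determine where the weights are centred.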

The Asymptotic Bias of the Proposed Estimator.
The expected value of the proposed estimator is given by Rewriting equation (5) using the property of symmetry associated with the Nadaraya-Watson estimator, Following the procedure by [9], equation (14) can be rewritten as where $g(x_{ij})$ is the estimated marginal density of the auxiliary variables $X_{ij}$. The bias of the estimator can be written as which reduces to Rewriting the regression model given by and substituting it in equation (15) gives Hence, the first term in equation (18) before taking expectation is given as $\frac{1}{Mn}\cdots$ Simplifying equation (21), the following is obtained: $\frac{1}{Mn}\cdots$ where Taking conditional expectation of equation (22) leads to equation (25). The following theorem due to [10] and applied by [11] was used in obtaining the asymptotic bias and variance of the estimator using conditional expectations.

Theorem 1. Let $K(w)$ be a symmetric density function with $\int wK(w)\,dw = 0$ and $\int w^2 K(w)\,dw = k_2$. Assume $n$ and $N$ increase together such that $n/N \to \pi$ with $0 < \pi < 1$.
Using this theorem, the asymptotic bias and variance are derived in the following sections. From the conditions of the error term stated in equation (9), it follows that can be obtained as follows: Using the substitution and change of variable technique given by equation (26) can be simplified to Using a Taylor series expansion about the point $x_{ij}$, the $k$th order kernel can be derived as follows: Similarly, Therefore, expanding equation (28) up to order $o((\lambda_{ij} b)^2)$ and simplifying gives Using the conditions due to [10] given by (31) can further be simplified to obtain Hence, the expected value of the second term in equation (25) then becomes Simplifying equation (33) gives where Using the equation of the bias given in (16) and the conditional expectation in equation (25), the following equation for the conditional bias of the estimator was obtained: In the next subsection, the asymptotic variance of the estimator is also derived.

Asymptotic Variance of the Proposed Estimator.
Using equation (7), the conditional variance of the estimator is given as where $m_{INW}(x_{ij})$ is given by is the estimated marginal density of the auxiliary variables $X_{ij}$; for details see [6,7]. Rewriting the regression model From equation (24), Hence, where Expressing equation (40) in terms of expectation, the following equation is obtained Using the fact that the conditional expectation $E(e_{ij} \mid X_{ij}) = 0$, the second term in equation (41) reduces to zero. Therefore, where $E(e_{ij}^2 \mid X_{ij}) = \sigma_{ij}^2$. Let $X = X_{ij}$ and $x = x_{ij}$ and make the following substitutions: so that

Journal of Probability and Statistics
Using the change of variables technique and simplifying, equation (44) reduces to Following the same procedure for getting the variance of can similarly be obtained as follows: Equation (46) can be rewritten as where $X = (\lambda_{ij} b)w + x$ so that $dX = (\lambda_{ij} b)\,dw$. Changing variables and applying a Taylor series expansion about the point $x_{ij}$ leads to which gives Following the procedure by [12] and simplifying, equation (49) reduces to For large samples, as $n \to N$, $m \to M$, and $b \to 0$, then $mn(\lambda_{ij} b) \to \infty$. Hence, the variance in equation (49) Substituting equation (45) in equation (52) yields the following:

Mean Squared Error of the Proposed Estimator.
The conditional MSE of the estimator of the finite population mean combines the conditional squared bias and the conditional variance of the estimator, that is, where $H(w) = \int K(w)^2\,dw$ and $d_k = \int w^2 K(w)\,dw$. From equation (55), it is noted that if the sample size is large, that is, as $n \to N$ and $m \to M$, the MSE of $Y_{INW}$ due to the kernel tends to zero for a sufficiently small bandwidth. The estimator $Y_{INW}$ is therefore asymptotically consistent since its MSE converges to zero in probability.
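For the standard Gaussian kernel used in this study, the kernel constants appearing in the bias and MSE expressions have closed forms: $\int wK(w)\,dw = 0$, $\int w^2 K(w)\,dw = 1$, and $\int K(w)^2\,dw = 1/(2\sqrt{\pi})$. The short Python check below evaluates them numerically on a grid; the grid limits and the helper name `integrate` are choices made for this illustration.

```python
import numpy as np

w = np.linspace(-10.0, 10.0, 200001)
K = np.exp(-0.5 * w**2) / np.sqrt(2.0 * np.pi)   # standard Gaussian kernel
dw = w[1] - w[0]

def integrate(f):
    """Riemann sum on the uniform grid; tails beyond |w| = 10 are negligible."""
    return float(np.sum(f) * dw)

first_moment = integrate(w * K)   # symmetry condition of Theorem 1
d_k = integrate(w**2 * K)         # second moment of the kernel
H = integrate(K**2)               # integral of the squared kernel

print(first_moment, d_k, H, 1.0 / (2.0 * np.sqrt(np.pi)))
```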

Simulation Study
A simulation experiment was conducted in R to compare the performance of the proposed estimator in two-stage cluster sampling with the transformed estimator due to [13] and the nonparametric regression estimator due to [14]. An asymptotic framework is used in which both the population number of clusters and the sample number of clusters are large. The number of elements within each cluster, $N_i$, is held constant so that no cluster dominates the population.
Both linear and nonlinear mean functions of auxiliary random variables due to [14] were considered in generating data, where $x \in (0, 1)$. The equations of the mean functions used in simulating the data are given in Table 1. The population auxiliary values $x_{ij}$ of size $M = 2000$ are generated as independent and identically distributed uniform (0, 1) random variables. The survey values are only known for the respondents in the selected sample. Using the auxiliary values, the nonresponse values are generated; that is, for every generated value $x_{ij}$, $i = 1, 2, \ldots, M$; $j = 1, 2, \ldots, N_i$, the mean survey nonresponse values are generated according to equation (56), where the $e_{ij}$ are independent and identically distributed normal random variables with mean zero and variance one. A Gaussian kernel with mean zero and variance one was used, since it has smooth and continuous derivatives at every data point. In addition, an optimal bandwidth obtained by the cross-validation technique due to [15] was used; it has been noted by [15] that this bandwidth leads to more informative estimates than other choices. The local bandwidth factor $\lambda_{ij}$ given in equation (2) was generated using the algorithm due to [8].
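Bandwidth selection by cross-validation can be illustrated with a standard leave-one-out least-squares criterion. The Python sketch below is an assumption about the general approach, not a reproduction of the procedure in [15] or of the R code used in the study; the sinusoidal mean function, the noise level, and the bandwidth grid are hypothetical.

```python
import numpy as np

def loo_cv_score(x, y, h):
    """Mean squared leave-one-out prediction error of a Nadaraya-Watson fit."""
    u = (x[:, None] - x[None, :]) / h
    k = np.exp(-0.5 * u**2)      # Gaussian kernel; constant cancels
    np.fill_diagonal(k, 0.0)     # leave each point out of its own fit
    pred = (k @ y) / k.sum(axis=1)
    return float(np.mean((y - pred) ** 2))

def select_bandwidth(x, y, grid):
    """Return the grid value minimising the leave-one-out score."""
    scores = np.array([loo_cv_score(x, y, h) for h in grid])
    return float(grid[np.argmin(scores)]), scores

rng = np.random.default_rng(3)
x = rng.uniform(0.0, 1.0, size=300)
y = np.sin(2.0 * np.pi * x) + rng.normal(0.0, 0.3, size=300)  # hypothetical

grid = np.linspace(0.02, 0.5, 25)
b_cv, scores = select_bandwidth(x, y, grid)
print(b_cv)
```

The grid is kept away from zero so that every left-out point still has neighbours with non-negligible kernel weight.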
At stage one, a sample of $m = 200$ clusters is selected by simple random sampling. At stage two, subsamples of elements within every selected cluster are drawn by simple random sampling with replacement, using a random sample of size $n_i$. The nonresponse mean survey values were then generated using equation (56). The estimates of the finite population mean were then computed using the estimator in equation (7), and the bias and mean squared error were also computed. The 95% confidence intervals were then constructed for the estimators of the finite population mean for comparative purposes.
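The two-stage design described above can be sketched in Python as follows. The values $M = 2000$ and $m = 200$ come from the text; the cluster size `N_i = 20`, the subsample size `n_i = 10`, the linear mean function, the without-replacement draw at stage one, and the rule that half of each subsample responds are assumptions made only for this illustration.

```python
import numpy as np

rng = np.random.default_rng(2020)
M, N_i = 2000, 20   # M from the text; N_i is a hypothetical cluster size
m, n_i = 200, 10    # m from the text; n_i is a hypothetical subsample size

def mean_linear(x):
    """Hypothetical stand-in for one of the Table 1 mean functions."""
    return 1.0 + 2.0 * (x - 0.5)

# Population: iid uniform(0, 1) auxiliary values and N(0, 1) errors.
x_pop = rng.uniform(0.0, 1.0, size=(M, N_i))
y_pop = mean_linear(x_pop) + rng.normal(0.0, 1.0, size=(M, N_i))

# Stage one: simple random sample of clusters (ASSUMED without replacement).
clusters = rng.choice(M, size=m, replace=False)
# Stage two: simple random sampling with replacement within each cluster.
units = rng.integers(0, N_i, size=(m, n_i))

x_s = x_pop[clusters[:, None], units]
y_s = y_pop[clusters[:, None], units]

# Special case n_1i = n_2i: half of each subsample responds (ASSUMED rule).
responded = np.zeros((m, n_i), dtype=bool)
responded[:, : n_i // 2] = True
y_obs = np.where(responded, y_s, np.nan)   # nonrespondents are unobserved
print(x_s.shape, int(responded.sum()))
```

Because the within-cluster draws are already in random order, marking the first half of each row as respondents amounts to a random split.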

Simulation Results
The values of the bias, mean squared error, and confidence interval lengths are given in the following tables. Note that $Y_{INW}$ is the estimator of the finite population mean proposed in this study, $Y_{TDM}$ is the transformation of data method estimator due to [13], and $Y_{REG}$ is the nonparametric regression estimator due to [14]. Both $Y_{TDM}$ and $Y_{REG}$ were used for comparison with the proposed estimator. The biases of the estimators considered are presented in Table 2. Negative values of the bias imply underestimation, while positive values indicate overestimation, of the finite population mean. The proposed estimator has relatively smaller values of the bias, followed by the transformation of data method estimator due to [13]. The nonparametric regression estimator due to [14] has larger bias than the other two estimators. It is also observed that the three estimators have relatively close values of the bias for the quadratic mean function, although the transformation of data method has positive bias for this mean function. Overall, among the three estimators of the finite population mean, the proposed estimator using the improved Nadaraya-Watson kernel regression technique performs better than the other two in terms of bias.
Mean squared error combines the variance and the squared bias of an estimator. The mean squared error values presented in Table 3 were simulated using the different mean functions indicated. For the proposed estimator, the quadratic mean function gives the smallest mean squared error, followed by the linear function. The estimator due to [14] has its largest mean squared error for the jump function. Overall, Table 3 shows that the mean squared error values for the proposed estimator are smaller than those of the other estimators considered. The transformation of data method estimator due to [13] follows closely in second place, with smaller mean squared error values than the nonparametric regression estimator due to [14]. Since the proposed estimator has the smaller MSE for all the mean functions, it can be concluded that it is more efficient than the other two estimators.
The 95% upper and lower confidence limits were constructed for the estimators of the finite population mean, and confidence interval lengths were then obtained. The results are given in Table 4. The confidence intervals for the proposed estimator are much shorter than those of the estimators due to [13,14]. Hence, at the 95% level of confidence, the estimator proposed in this study performs better than its rival estimators.
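The three performance measures can be computed from replicated estimates as in the Python sketch below. It assumes a normal-approximation interval (mean of the replicates plus or minus 1.96 standard deviations), which is one common choice and not necessarily the construction used for Table 4; the function name and the five replicate values are hypothetical.

```python
import numpy as np

def summarize(estimates, true_mean, z=1.96):
    """Bias, variance, MSE and normal-approximation CI length of replicates."""
    est = np.asarray(estimates, dtype=float)
    bias = est.mean() - true_mean            # negative: underestimation
    variance = est.var()                     # ddof=0, so MSE = var + bias**2
    mse = np.mean((est - true_mean) ** 2)
    se = est.std(ddof=1)
    lower, upper = est.mean() - z * se, est.mean() + z * se  # ASSUMED form
    return {
        "bias": float(bias),
        "variance": float(variance),
        "mse": float(mse),
        "ci": (float(lower), float(upper)),
        "ci_length": float(upper - lower),
    }

# Hypothetical replicate estimates of a population mean equal to 5.0.
result = summarize([4.9, 5.1, 5.0, 5.2, 4.8], true_mean=5.0)
print(result)
```

Using the same divisor for the variance and the MSE makes the decomposition into variance plus squared bias hold exactly for the replicates.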

Conclusion
This study has developed an estimator of the finite population mean in two-stage cluster sampling, assuming that random nonresponse occurs in the survey variable at the second stage of cluster sampling. Complete auxiliary information is assumed to be available in both stage one and stage two of cluster sampling. Kernel weights developed using the improved Nadaraya-Watson regression technique were used in the estimation process. The theoretical properties of the proposed estimator, namely its asymptotic bias, variance, and mean squared error, were derived. Simulation results show that the proposed estimator has smaller bias, smaller mean squared error, and shorter confidence intervals than the other estimators. Therefore, the estimator of the finite population mean proposed in this study dominates the estimators due to [13,14].

Data Availability
The data used to support the theoretical findings were generated via simulation using the R statistical package.

Disclosure
The abstract was to be presented at a conference organized by the World Academy of Science, Engineering and Technology, but due to financial constraints, participation in and presentation at the conference were withdrawn, and the organizer was informed accordingly.

Conflicts of Interest
The authors declare that they have no conflicts of interest.