Estimation of Finite Population Mean in Simple and Stratified Random Sampling by Utilizing the Auxiliary, Ranks, and Square of the Auxiliary Information

In this article, we address the estimation of the finite population mean under simple random and stratified random sampling schemes. Our proposal is based on the notion of using auxiliary information in a more rigorous fashion. Specifically, we use the ranks and squared values of the auxiliary variable in addition to its observed values. The applicability of the proposed family of estimators is demonstrated on real data sets from diverse fields of application. Moreover, a performance comparison is conducted with respect to a recently proposed family of estimators. The findings are encouraging: the suggested family of estimators shows superior performance throughout the article.


Introduction
In this age of aggressive flow of information, the notion of using auxiliary information under the argument of maximum use of available information is well cherished. The use of supplementary information to enhance the efficiency of procedures estimating the attributes of the population under study has a rich history in the multidisciplinary research literature. The advocacy of supportive information to assist a more elegant resolution of the estimation problem at hand can be traced to Pierre-Simon Laplace, an eminent name in eighteenth-century academic circles. While entrusted with the sensitive task of estimating the total population of eighteenth-century France, he advised: "The register of births, which are kept with care in order to assure the condition of the citizens, can serve to determine the population of great empire without resorting a census of its inhabitants. But for this it is necessary to know the ratio of population to annual the birth"; see [1]. The legitimacy of this idea can be witnessed through streams of research that aim to advance the theoretical and methodological frontiers dealing with the incorporation of additional information. For example, the seminal work of [2] instigated the idea of exploiting the underlying correlation structure driving both the study variable and the auxiliary variable. Over time, many researchers have paid tribute to the notable contribution of [2] by proposing useful amendments to the original doctrine. For example, [3] proposed the product estimator, which capitalizes on a negative correlation between the study variable and the supportive variable. Subsequently, [4] provided extensions of the classic ratio and product estimators, namely, the ratio-type exponential estimator and the product-type exponential estimator, respectively.
Yet another line of work facilitating the incorporation of additional information in the estimation procedure was motivated by the use of richer functional forms known for producing estimators with minimal standard errors. Under this motivation, [5] formulated a generalized family of exponent-based estimators encompassing numerous existing mainstream estimators as members of the resultant class. For a more elaborate understanding of the ongoing research activities, one may also see [6][7][8][9][10][11]. Recognizing the utility of accurate estimating procedures, this research develops a new family of estimators of the population mean through a more meticulous use of an auxiliary variable. The objectives are attained by capitalizing on the observed data, along with the sample ranks and the second raw sample moment of the auxiliary variable. It is noteworthy that including the second raw moment of the auxiliary information enables investigators to anticipate the stochastic dynamics of the available information. Moreover, the use of ranks in association with a raw moment covers parametric and nonparametric subtleties simultaneously. The working of the devised mechanism is explored under a simple random sampling scheme and a stratified random sampling framework. The applicability of the suggested formulation is evaluated on six diverse data sets coming from various fields of multidisciplinary inquiry. The comparative performance of the proposed methodology is assessed by means of rigorous mathematical and numerical pursuits. We compare the newly devised scheme with that of [5], as they documented in their article that the "proposed estimator always performs better than the usual mean, ratio, product, exponential ratio, exponential product, classical regression, [6,11], and Grover and [2,8] estimators."
The performance evaluation reveals the superior performance of the proposed family in comparison to the [5] family of estimators; it thus also outperforms the other noted estimators. In addition, our proposal accommodates the [5] family as a special case, which establishes the generality of our technique. The rest of the article is arranged in seven major parts. In Section 2, we present preliminaries with reference to simple random sampling (SRS) along with the family of estimators proposed by [5]. Section 3 is dedicated to the introduction of the proposed family of estimators, whereas the performance investigation is conducted in Section 4. Next, Section 5 documents the preliminaries for the stratified random sampling (StRS) scheme along with the extension of the [5] family to incorporate the stratification existing in the population under study. In Section 6, we present the proposed family of estimators in the case of StRS. The performance evaluation is pursued in Section 7, whereas general discussions are documented in Section 8.

Notation and Symbols.
Let $Z$ be a finite population of $N$ units, such that $Z = \{Z_1, Z_2, \ldots, Z_N\}$. We draw a sample of size $n$ from the population through the SRS without replacement (SRSWOR) scheme. Let $Y_i$ and $X_i$ be the values of the study and auxiliary variables, respectively. Moreover, let us denote the ranks and squared values of the auxiliary variable as $R_i$ and $U_i$, respectively, for the $i$th ($i = 1, 2, \ldots, N$) unit of the population.
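The construction of $R_i$ and $U_i$ from the auxiliary variable, followed by an SRSWOR draw, can be sketched as follows. This is a minimal illustration only: the function names `ranks` and `srswor_sample` are hypothetical, and assigning average ranks to tied values is an assumption, since the text does not state how ties are handled.

```python
import random

def ranks(values):
    """Return 1-based ranks (smallest value = 1); tied values share the average rank.

    The tie rule is an assumption made for this sketch.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1  # average of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            result[order[k]] = avg_rank
        pos = end + 1
    return result

def srswor_sample(Y, X, n, seed=0):
    """Draw n units without replacement and return the sampled (y, x, r, u) lists."""
    R = ranks(X)              # ranks are assigned over the whole population
    U = [x * x for x in X]    # squared values of the auxiliary variable
    rng = random.Random(seed)
    idx = rng.sample(range(len(Y)), n)
    return ([Y[i] for i in idx], [X[i] for i in idx],
            [R[i] for i in idx], [U[i] for i in idx])
```

Ranking the full population before sampling keeps $R_i$ a fixed population attribute, so that the sampled ranks behave like any other auxiliary variable with a known population mean.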
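Let $\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$ and $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ be the sample means of the study and auxiliary variables, corresponding to the population means $\bar{Y} = \frac{1}{N}\sum_{i=1}^{N} Y_i$ and $\bar{X} = \frac{1}{N}\sum_{i=1}^{N} X_i$, respectively. Similarly, let us define $\bar{r} = \frac{1}{n}\sum_{i=1}^{n} r_i$ as the sample mean of the ranks of the auxiliary variable and $\bar{u} = \frac{1}{n}\sum_{i=1}^{n} u_i$ as the sample mean of the squared values of the auxiliary variable, estimating the corresponding population attributes $\bar{R} = \frac{1}{N}\sum_{i=1}^{N} R_i$ and $\bar{U} = \frac{1}{N}\sum_{i=1}^{N} U_i$, respectively. On these grounds, the sample variances of the study and auxiliary variables are defined as $s_y^2 = \frac{1}{n-1}\sum_{i=1}^{n} (y_i - \bar{y})^2$ and $s_x^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$, whereas the sample variability of the ranks is quantified as $s_r^2 = \frac{1}{n-1}\sum_{i=1}^{n} (r_i - \bar{r})^2$ and the sample variance of the squared values of the auxiliary variable is given as $s_u^2 = \frac{1}{n-1}\sum_{i=1}^{n} (u_i - \bar{u})^2$. Furthermore, let us define the coefficients of variation of $X$, $Y$, $R$, and $U$ as $C_x$, $C_y$, $C_r$, and $C_u$, where $C_y = S_y/\bar{Y}$, $C_x = S_x/\bar{X}$, $C_r = S_r/\bar{R}$, and $C_u = S_u/\bar{U}$, with $S_y$, $S_x$, $S_r$, and $S_u$ denoting the corresponding population standard deviations. We now define the error terms as $e_0 = (\bar{y} - \bar{Y})/\bar{Y}$, $e_1 = (\bar{x} - \bar{X})/\bar{X}$, $e_2 = (\bar{r} - \bar{R})/\bar{R}$, and $e_3 = (\bar{u} - \bar{U})/\bar{U}$, such that $E(e_i) = 0$ for $i = 0, 1, 2, 3$, and $E(e_0^2) = \lambda C_y^2$, $E(e_1^2) = \lambda C_x^2$, $E(e_2^2) = \lambda C_r^2$, $E(e_3^2) = \lambda C_u^2$, where $\lambda = (1/n - 1/N)$, commonly known as sample fraction. The error covariances are derived as follows: $E(e_0 e_1) = \lambda C_y C_x \rho_{yx}$, $E(e_0 e_2) = \lambda C_y C_r \rho_{yr}$, $E(e_0 e_3) = \lambda C_y C_u \rho_{yu}$, $E(e_1 e_2) = \lambda C_x C_r \rho_{xr}$, $E(e_1 e_3) = \lambda C_x C_u \rho_{xu}$, and $E(e_2 e_3) = \lambda C_r C_u \rho_{ru}$, where $\rho_{yx}$, $\rho_{yr}$, $\rho_{yu}$, $\rho_{xr}$, $\rho_{xu}$, and $\rho_{ru}$ represent the correlation coefficients defined as $\rho_{yx} = S_{yx}/(S_y S_x)$, $\rho_{yr} = S_{yr}/(S_y S_r)$, $\rho_{yu} = S_{yu}/(S_y S_u)$, $\rho_{xr} = S_{xr}/(S_x S_r)$, $\rho_{xu} = S_{xu}/(S_x S_u)$, and $\rho_{ru} = S_{ru}/(S_r S_u)$.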
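The role of $\lambda = (1/n - 1/N)$ can be checked on a toy example. Under SRSWOR, a standard result is that the variance of the sample mean equals $\lambda S_y^2$, so that the second moment of the relative error $e_0$ equals $\lambda C_y^2$. The sketch below enumerates every possible sample from a small hypothetical population and compares the enumerated values with these formulas. The population values and function names are illustrative assumptions, and $S_y^2$ is taken with divisor $N - 1$.

```python
from itertools import combinations
from statistics import mean, variance

def lam(n, N):
    """lambda = 1/n - 1/N, as defined in the text."""
    return 1.0 / n - 1.0 / N

def exact_mean_variance(Y, n):
    """Variance of the sample mean over all equally likely SRSWOR samples of size n."""
    pop_mean = mean(Y)
    sample_means = [mean(s) for s in combinations(Y, n)]
    return mean([(m - pop_mean) ** 2 for m in sample_means])

# Hypothetical toy population, for illustration only.
Y = [12.0, 15.0, 9.0, 20.0, 14.0, 18.0]
n, N = 3, len(Y)

S2_y = variance(Y)               # statistics.variance uses the N - 1 divisor
C2_y = S2_y / mean(Y) ** 2       # squared coefficient of variation of Y

var_enumerated = exact_mean_variance(Y, n)
var_formula = lam(n, N) * S2_y

e0_second_moment_enumerated = var_enumerated / mean(Y) ** 2
e0_second_moment_formula = lam(n, N) * C2_y
```

Because the enumeration covers all samples, the two pairs of quantities agree up to floating-point rounding rather than only approximately, which makes this a convenient sanity check before coding the covariance terms.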

[9] Family of Estimators. Reference [9] aided the estimation of the finite population mean through the dual use of auxiliary information by proposing a general estimator, given in equation (2), in which $\omega_1$, $\omega_2$, and $\omega_3$ are unknown quantities chosen to minimize the MSE of the estimator. The optimal values are simplified using $\varphi^2_{yxr} = (\rho^2_{yx} + \rho^2_{yr} - 2\rho_{yx}\rho_{yr}\rho_{xr})/(1 - \rho^2_{xr})$, the coefficient of multiple determination of $Y$ on $X$ and $R_x$.
In equation (2), different settings of $a$ and $b$ yield different estimators, which enables [9] to propose a family of efficient estimators for the population mean. Table 1 lists the members of the [9] family corresponding to various values of $a$ and $b$. Reference [9] also provided the expressions for the bias and MSE of the family of estimators, which involve $\vartheta = a\bar{X}/(a\bar{X} + b)$.
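The two scalar quantities named above, the coefficient of multiple determination $\varphi^2_{yxr}$ and $\vartheta = a\bar{X}/(a\bar{X} + b)$, are straightforward to compute once the correlations and the pair $(a, b)$ are available. The following is a small sketch under the stated formulas; the function names and the numerical inputs are hypothetical and are not taken from Table 1.

```python
def multiple_determination(rho_yx, rho_yr, rho_xr):
    """Coefficient of multiple determination of Y on X and the ranks of X."""
    if abs(rho_xr) >= 1.0:
        raise ValueError("rho_xr must lie strictly between -1 and 1")
    numerator = rho_yx ** 2 + rho_yr ** 2 - 2.0 * rho_yx * rho_yr * rho_xr
    return numerator / (1.0 - rho_xr ** 2)

def theta(a, b, x_bar_pop):
    """theta = a * Xbar / (a * Xbar + b) for a given choice of (a, b)."""
    return a * x_bar_pop / (a * x_bar_pop + b)

# Hypothetical inputs, for illustration only.
phi2 = multiple_determination(rho_yx=0.90, rho_yr=0.80, rho_xr=0.85)
theta_no_shift = theta(a=1.0, b=0.0, x_bar_pop=50.0)   # b = 0 gives theta = 1
theta_shifted = theta(a=1.0, b=50.0, x_bar_pop=50.0)   # b = a * Xbar gives theta = 0.5
```

When $X$ and its ranks are uncorrelated, the expression reduces to $\rho^2_{yx} + \rho^2_{yr}$, which is a quick way to check an implementation.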

Proposed Family of Estimators
We now proceed by proposing a new family of estimators based on a more rigorous use of auxiliary information. The general expression of the proposed estimator is given in equation (6), where $\kappa_1$, $\kappa_2$, $\kappa_3$, and $\kappa_4$ are unknown constants whose values are determined by minimizing the MSE of the proposed family of estimators. Moreover, as in [9], $a$ and $b$ can take varying values and thus provide different members of our proposed family. Table 2 presents various values of $a$ and $b$ and the resultant estimators. In the interest of a fair comparison, we consider the same values of $a$ and $b$ as those of [9]. Next, we provide the calculations for the bias and MSE of our proposal. By using the error terms defined in Section 2.1, the proposed estimator given in equation (6) can be rewritten in terms of $e_0$, $e_1$, $e_2$, and $e_3$. On further solving and keeping terms up to the second degree in the $e_i$'s, we obtain equation (8).

Mathematical Problems in Engineering
By applying the expectation operator to both sides of equation (8), we obtain the expression for the bias. The MSE of the proposed family of estimators is obtained by taking the expectation of the square of equation (8), which gives equation (10). The optimal values of $\kappa_1$, $\kappa_2$, $\kappa_3$, and $\kappa_4$ are found by minimizing equation (10). The minimum MSE of $\hat{\bar{Y}}_k$, achieved by substituting the optimal values of $\kappa_1$, $\kappa_2$, $\kappa_3$, and $\kappa_4$, is given by equation (11).

Performance Comparison
This section is dedicated to evaluating and comparing the performance of the proposed family of estimators relative to the [9] family of estimators. To show the superior performance of the proposed family with respect to the [9] family, we need to show that $\mathrm{MSE}_{\min}(\hat{\bar{Y}}_{Haq}) - \mathrm{MSE}_{\min}(\hat{\bar{Y}}_{k}) > 0$. Comparing the MSEs given in equations (5) and (11) yields a general condition under which the proposed family performs better. Next, we empirically quantify the performance of all members of our proposed family (Table 2) through a one-by-one comparison with the members of the [9] family (Table 1).

Evaluating Empirically.
The empirical performance investigation is performed using three diverse and commonly used data sets. Reference [9] also considered the same data sets to delineate the applicability of their proposed family. Table 3 summarizes the performance comparison of the ten members of both families presented in Tables 1 and 2. We report the percentage relative efficiencies (PREs) of each member of our family and the [9] family with respect to SRS, along with the PREs with respect to each other. The superior performance of our proposed family is evident in Table 3. As far as SRS is concerned, every member of both families outperforms the usual estimation strategy. In the comparison between the two families, the resulting PREs reveal a better performance of our proposed method than the [9] family. These findings are consistent for all three populations and all members of the respective families.
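The PRE comparisons described above can be sketched in a few lines. The sketch assumes the usual definition, PRE = 100 × (MSE of the reference estimator) / (MSE of the estimator under study), which the text does not spell out; the function name `pre` and all numerical values are hypothetical and are not taken from Table 3.

```python
def pre(mse_reference, mse_estimator):
    """Percentage relative efficiency of an estimator against a reference.

    Assumes PRE = 100 * MSE(reference) / MSE(estimator); values above 100
    mean the estimator has a smaller MSE than the reference.
    """
    if mse_reference <= 0 or mse_estimator <= 0:
        raise ValueError("MSE values must be positive")
    return 100.0 * mse_reference / mse_estimator

# Hypothetical MSE values, for illustration only.
var_srs_mean = 4.0    # variance of the usual sample mean under SRS
mse_family_a = 2.5    # minimum MSE of a member of one family
mse_family_b = 2.0    # minimum MSE of the matching member of the other family

pre_a_vs_srs = pre(var_srs_mean, mse_family_a)
pre_b_vs_srs = pre(var_srs_mean, mse_family_b)
pre_b_vs_a = pre(mse_family_a, mse_family_b)
```

Under this definition, the condition $\mathrm{MSE}_{\min}(\hat{\bar{Y}}_{Haq}) - \mathrm{MSE}_{\min}(\hat{\bar{Y}}_{k}) > 0$ is equivalent to a PRE of the proposed member against the matching [9] member that exceeds 100.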

Preliminaries with respect to StRS
Next, we demonstrate the applicability of our proposed method in the estimation of finite population mean when the sample is drawn through the StRS scheme.
For the purpose of consistency, let $Y$, $X$, $R$, and $U$ be the study variable, the auxiliary variable, and the ranks and squared values of the auxiliary variable, taking values $Y_{ih}$, $X_{ih}$, $R_{ih}$, and $U_{ih}$, respectively, on the $i$th unit belonging to the $h$th stratum, where $i = 1, 2, \ldots, N_h$. Thus, $W_h = N_h/N$ is the weight of the $h$th stratum. We then draw a sample of size $n_h$ from the $h$th stratum using the SRSWOR scheme for the estimation of the population mean, ensuring that the total sample size is $n = \sum_{h=1}^{L} n_h$, where $L$ denotes the number of strata. On this basis, we define the population mean of the study variable, the population variances within each stratum, and the corresponding covariances. Based on these expressions, we obtain the correlation coefficients under the stratified sampling scheme. For the further mathematical development, the relative error terms are defined, and their expected values can be written in the general form of equation (18).

Extending the [9] Family under StRS Scheme. We proceed by deriving a general expression of the [9] proposal when the StRS method of sampling is under consideration, where $w_1$, $w_2$, and $w_3$ are unknown constants chosen to minimize the MSE. We derive the optimal values of the $w$'s and, furthermore, the bias and MSE of the [9] family. Table 4 lists all members of the [9] family extended to accommodate the StRS scheme.
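The stratified setup in the preliminaries, with stratum weights $W_h = N_h/N$ and an SRSWOR draw of size $n_h$ within each stratum, can be illustrated with the usual stratified mean $\bar{y}_{st} = \sum_h W_h \bar{y}_h$. This is a sketch of the baseline mean estimator only, not of the [9] family or the proposed family; the function names and the toy strata are hypothetical.

```python
import random
from statistics import mean

def stratum_weights(strata):
    """W_h = N_h / N for each stratum, where strata is a list of value lists."""
    N = sum(len(values) for values in strata)
    return [len(values) / N for values in strata]

def stratified_mean(strata, sample_sizes, seed=0):
    """Usual stratified estimator: sum over strata of W_h times the stratum sample mean."""
    rng = random.Random(seed)
    estimate = 0.0
    for values, n_h, w_h in zip(strata, sample_sizes, stratum_weights(strata)):
        sample = rng.sample(values, n_h)   # SRSWOR within the stratum
        estimate += w_h * mean(sample)
    return estimate

# Hypothetical toy population with two strata, for illustration only.
strata = [[1, 2, 3], [10, 20, 30, 40, 50]]
sample_sizes = [2, 3]                       # n_h per stratum, so n = 5
estimate = stratified_mean(strata, sample_sizes, seed=1)
```

Sampling every unit in every stratum reproduces the population mean exactly, which gives a simple check on the weights.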

Proposed Family of Estimators for StRS
In this section, we propose an extended version of our suggested family of estimators (equation (6)) to efficiently accommodate the underlying homogeneous structure prevalent in the population under study. In the general estimator, $\kappa_1$, $\kappa_2$, $\kappa_3$, and $\kappa_4$ are unknown constants chosen to minimize the MSE of the proposed family. To calculate the bias, we expand the estimator while keeping the $e$'s up to the second degree; on further solving, the bias is obtained. The MSE is deduced by squaring and taking the expectation on both sides of equation (23). Table 5 presents all members of our proposed family while taking into account the underlying stratification.

Performance Comparison
In this section, we compare both families, summarized in Tables 4 and 5. To establish the efficiency of our proposed family in comparison to that of [9], we need to show that $\mathrm{MSE}_{\min}(\hat{\bar{Y}}^{*}_{Haq}) - \mathrm{MSE}_{\min}(\hat{\bar{Y}}^{*}_{k}) > 0$, which on simplification provides the general efficiency condition. We now proceed by empirically demonstrating the efficiency of each member of our family (Table 5) with respect to the members of the extended [9] family (Table 4). The objective is achieved by using three data sets. Tables 6-8 summarize the population structures of the data sets under consideration.

7.1. Dataset 1: [13]. Y: the number of teachers and X: the number of students in both primary and secondary schools in Turkey in 2007 for 923 districts in six regions.

7.2. Dataset 2: [14]. Y: apple production amount in 1999 and X: the number of apple trees in 1999.

7.3. Dataset 3: [14]. Y: apple production amount in 1999 and X: the number of apple trees in 1999.

Table 9 presents the performance evaluation, comparing each member of both families with the usual mean estimator and with each other for all of the above-mentioned data sets. As anticipated, both families (the proposed family and the extended Haq et al. family) outperform the usual mean estimation procedure under the StRS scheme. Moreover, it is encouraging to witness the superior performance of our estimator, evident in the results of Table 9, for all data sets and for every member of the proposed family.

Discussion
This article describes the development of a family of estimators inherently capable of a more rigorous use of auxiliary information while estimating the finite population mean. We propose a threefold use of auxiliary information, in which the auxiliary variable is supplemented by its ranks and its second raw moment. It is then mathematically and numerically demonstrated that this threefold use of extra information enhances the performance of the mean-estimating family. The findings align with the notion of using auxiliary information to aid the estimation of the required attribute; we observe that a more rigorous use of relevant information enhances the efficiency of the estimating mechanism. The mathematical developments are established for the SRS and StRS methods of sampling. Furthermore, the proposal is applied to six commonly used data sets to assess the applicability of the introduced family. The performance comparison is conducted with respect to the family of estimators suggested by [9]. The findings reveal that more efficient use of supportive information yields better performance than the family of [9]. We anticipate that a similar strategy can be employed for the estimation of the population variance, but this is left as a future research topic.

Data Availability
The data sets used to support the study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.