Estimation of Finite Population Mean in Multivariate Stratified Sampling under Cost Function Using Goal Programming

In practical applications of stratified random sampling, the investigator must select a sample that maximizes the precision of the estimator of a finite population mean under a cost constraint. Allocating the sample becomes complicated when more than one characteristic is observed on each selected unit. In many real-life situations, a linear cost function of the stratum sample sizes $n_h$ is not a good approximation to the actual cost of a survey when the cost of traveling between selected units within a stratum is significant. In this paper, the sample allocation problem in multivariate stratified random sampling with a proposed cost function is formulated as an integer nonlinear multiobjective mathematical programming problem. A solution procedure is proposed using the extended lexicographic goal programming approach. A numerical example is presented to illustrate the computational details and to compare the efficiency of the proposed compromise allocation.


Introduction
In sample surveys related to agriculture, markets, industry, social research, and so forth, it is common for more than one characteristic to be observed on each sampled unit of the population. Stratified random sampling is more suitable than other survey designs for obtaining information from a heterogeneous population, for reasons of economy and efficiency. The theory of stratified random sampling deals with the properties of estimators constructed from a stratified random sample and with the best (optimum) choice of the sample sizes to be selected from the various strata, either to maximize the precision of the estimator for a fixed cost or to minimize the cost of the survey for a fixed precision. The sample sizes selected according to these criteria are known as the "optimum allocation." In general, the variance of the study variate differs from stratum to stratum, and this provides the basis for selecting the optimum sample sizes.
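As a minimal sketch of the single-characteristic case, the classical optimum allocation under a linear cost $C = c_0 + \sum_h c_h n_h$ sets $n_h$ proportional to $W_h S_h / \sqrt{c_h}$, where $W_h$ is the stratum weight and $S_h$ the stratum standard deviation. The function name and the three-stratum population below are hypothetical and serve only to illustrate the formula.

```python
import math

def optimum_allocation(W, S, c, budget, c0=0.0):
    """Continuous optimum allocation for one characteristic.

    Minimizes the variance of the stratified mean subject to
    c0 + sum(c_h * n_h) = budget, giving n_h proportional to
    W_h * S_h / sqrt(c_h).
    """
    denom = sum(w * s * math.sqrt(ch) for w, s, ch in zip(W, S, c))
    return [(budget - c0) * (w * s / math.sqrt(ch)) / denom
            for w, s, ch in zip(W, S, c)]

# Hypothetical three-stratum population (illustrative values only).
W = [0.5, 0.3, 0.2]      # stratum weights
S = [10.0, 20.0, 40.0]   # stratum standard deviations
c = [1.0, 4.0, 4.0]      # per-unit measurement costs
n = optimum_allocation(W, S, c, budget=200.0, c0=20.0)

# The allocation spends the whole budget but is generally not integer valued.
total_cost = 20.0 + sum(ch * nh for ch, nh in zip(c, n))
```

The resulting sample sizes are real numbers, so they must be rounded before use, which is the difficulty that motivates an integer programming formulation.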
Tschuprow [1] and Neyman [2] independently proposed an allocation procedure that minimizes the variance of the sample mean under a linear cost function of the sample sizes $n_h$ in stratified random sampling. Neyman [2] used the Lagrange multiplier technique to obtain the optimum sample sizes for a single study variable. In stratified sampling, the allocation problem becomes complicated when more than one characteristic is observed on each selected unit of a finite population. An allocation that is optimum for one characteristic may not be optimum for the others unless the characteristics are highly correlated. A compromise criterion is therefore needed that produces an allocation that is optimum for all characteristics in some sense, for example, an allocation that minimizes the trace of the variance-covariance matrix of the estimators of the population means, one that minimizes the weighted average of the variances, or one that maximizes the total relative efficiency of the estimators compared with the corresponding individual optimum allocations (Varshney et al. [3]). Many authors, such as Dalenius [4,5], Ghosh [6], Folks and Antle [7], Chromy [8], Bethel [9], Jahan et al. [10,11], Khan et al. [12], Khan et al. [13,14], Ansari et al. [15], Khan et al. [16], and Varshney et al. [17], have used different compromise criteria to solve the allocation problem in stratified random sampling.
Journal of Applied Mathematics
The cost of a survey is an important factor in allocating the sample to the various strata. The linear cost function used in stratified sampling is given as

$$C = c_0 + \sum_{h=1}^{L} c_h n_h, \tag{1}$$

where $C$ denotes the total budget available for the survey, $c_h$ ($h = 1, 2, \ldots, L$) represents the per-unit measurement cost in the $h$th stratum, $c_0$ represents the fixed cost of the survey, and $n_h$ is the number of sample units selected in the $h$th stratum. In many practical situations, the per-unit measurement cost and the travel cost within strata are both important components of the survey cost, and a nonlinear cost function that includes both is a good approximation to the actual cost of the survey. Beardwood et al. [18] showed that the length of the shortest route among $n$ randomly dispersed destinations within a region is asymptotically proportional to $\sqrt{n}$ for large $n$. Varshney et al. [17] used the following nonlinear cost function for large sample sizes:

$$C = c_0 + \sum_{h=1}^{L} c_h n_h + \sum_{h=1}^{L} t_h \sqrt{n_h}, \tag{2}$$

where $t_h$ is the travel cost within the $h$th stratum. The problem of finding the shortest route among the $n_h$ selected units in the $h$th stratum is often called the "shortest route problem" in the operations research literature. If the route map and the route lengths are given for each stratum, the shortest route among the $n_h$ units within a stratum can be found whether $n_h$ is small or large, and this shortest route can be used for practical purposes with confidence (Beardwood et al. [18]). Consider the following proposed nonlinear cost function:

$$\acute{C} = \sum_{h=1}^{L} c_h n_h + \sum_{h=1}^{L} t_h n_h^{\alpha}, \tag{3}$$

where $\acute{C} = C - c_0$ and $\alpha$ represents the effect of travel within strata on the cost function. The value of $\alpha$ is determined by solving the shortest route problem using the methods discussed by Hillier and Lieberman [19]. The cost function in (2) is the particular case of the proposed cost function (3) with $\alpha = 0.5$. Generally, the Lagrange multiplier technique (LMT) is used to determine the sample sizes. However, the constraint $2 \le n_h \le N_h$, where $n_h$ ($h = 1, 2, \ldots, L$) is an integer, is neglected when LMT is used. To obtain integer sample sizes $n_h$, a rounding rule is then applied, which may violate the optimality conditions, the feasibility conditions, or both. Integer values of $n_h$ are needed for practical implementation. Therefore, we do not use LMT; instead, we use integer programming to obtain integer stratum sample sizes $n_h$.
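The cost functions above can be sketched in a few lines. This is an illustrative sketch only: the function names, the two-stratum costs, and the budget are hypothetical, and the exponent is written `alpha` on the assumption that a single travel-effect constant applies to all strata, with `alpha = 0.5` giving the square-root travel cost.

```python
def survey_cost(n, c, t, alpha):
    """Variable survey cost: sum_h c_h*n_h + sum_h t_h*n_h**alpha."""
    return sum(ch * nh + th * nh ** alpha for nh, ch, th in zip(n, c, t))

def is_feasible(n, N, c, t, alpha, budget_var):
    """Check integrality, 2 <= n_h <= N_h, and the cost constraint.

    budget_var is the budget net of the fixed cost (C - c0).
    """
    within_bounds = all(isinstance(nh, int) and 2 <= nh <= Nh
                        for nh, Nh in zip(n, N))
    return within_bounds and survey_cost(n, c, t, alpha) <= budget_var

# Hypothetical two-stratum survey (illustrative values only).
N = [50, 60]        # stratum sizes
c = [2.0, 3.0]      # per-unit measurement costs
t = [1.0, 2.0]      # travel costs
n = [16, 25]        # candidate integer allocation

cost_sqrt = survey_cost(n, c, t, alpha=0.5)   # square-root travel cost
cost_lin = survey_cost(n, c, t, alpha=1.0)    # travel cost linear in n_h
ok_sqrt = is_feasible(n, N, c, t, 0.5, budget_var=150.0)
ok_lin = is_feasible(n, N, c, t, 1.0, budget_var=150.0)
```

The same allocation can be affordable for one value of the exponent and unaffordable for another, which is why feasibility has to be checked against the cost function actually in force.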
In this paper, we discuss a compromise allocation based on minimizing the coefficients of variation of the regression estimators of the population means in a multivariate stratified random sampling design under the proposed nonlinear cost function (3). The problem is formulated as a multiobjective integer nonlinear programming problem, and the extended lexicographic goal programming technique is applied to solve it. The GAMS-AlphaECP optimization software (Rosenthal [20]) is used to solve a numerical example that illustrates the computational details of the allocation procedure.

Formulation of the Problem
Consider a population of $N$ units divided into $L$ mutually exclusive strata of sizes $N_h$ ($h = 1, 2, \ldots, L$) such that $\sum_{h=1}^{L} N_h = N$. A simple random sample of size $n_h$ is drawn from each stratum independently. Suppose we observe $p \ge 2$ characteristics $y_j$ ($j = 1, 2, \ldots, p$) on each unit $i$ ($i = 1, 2, \ldots, N_h$) in the $h$th stratum and wish to estimate the population means of these $p$ characteristics. Let $\bar{y}_{jh}$ and $\bar{x}_{jh}$ be the sample means and $\bar{Y}_{jh}$ and $\bar{X}_{jh}$ the population means of the study variable $y_{jh}$ and the auxiliary variable $x_{jh}$, respectively, for the $j$th characteristic in the $h$th stratum. Further, $S^2_{y_{jh}}$ and $S^2_{x_{jh}}$ are the population variances and $S_{yx_{jh}}$ is the population covariance between the $j$th study and auxiliary variables in the $h$th stratum; $b_{jh} = s_{yx_{jh}}/s^2_{x_{jh}}$ and $\beta_{jh} = S_{yx_{jh}}/S^2_{x_{jh}}$ are the sample and population regression coefficients; and $W_h = N_h/N$ is the stratum weight.
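The mean square error (MSE) of the separate regression estimator $\bar{y}_{j(lr)} = \sum_{h=1}^{L} W_h \left[\bar{y}_{jh} + b_{jh}(\bar{X}_{jh} - \bar{x}_{jh})\right]$ of the population mean $\bar{Y}_j$ is given as

$$\mathrm{MSE}(\bar{y}_{j(lr)}) = \sum_{h=1}^{L} W_h^2 \left(\frac{1}{n_h} - \frac{1}{N_h}\right) S^2_{y_{jh}} (1 - \rho_{jh}^2),$$

where $\rho_{jh} = S_{yx_{jh}} / (S_{y_{jh}} S_{x_{jh}})$ is the population correlation coefficient between the $j$th study and auxiliary variables in the $h$th stratum. Equivalently,

$$\mathrm{MSE}(\bar{y}_{j(lr)}) = \sum_{h=1}^{L} \frac{W_h^2 S^2_{y_{jh}} (1 - \rho_{jh}^2)}{n_h} - \sum_{h=1}^{L} \frac{W_h^2 S^2_{y_{jh}} (1 - \rho_{jh}^2)}{N_h}. \tag{6}$$

If we ignore the second term on the right-hand side of (6) because it is independent of the sample sizes $n_h$, then

$$\mathrm{MSE}(\bar{y}_{j(lr)}) \approx \sum_{h=1}^{L} \frac{W_h^2 S^2_{y_{jh}} (1 - \rho_{jh}^2)}{n_h}.$$

Since different characteristics are measured in different units, we need a measure of precision that is independent of the unit of measurement. Therefore, the coefficient of variation is used instead of the mean square error; that is,

$$\mathrm{CV}_j = \frac{\sqrt{\mathrm{MSE}(\bar{y}_{j(lr)})}}{\bar{Y}_j}, \quad j = 1, 2, \ldots, p,$$

where $\bar{Y}_j = \sum_{h=1}^{L} W_h \bar{Y}_{jh}$. The sample sizes $n_h$ are determined under the proposed nonlinear cost function (3) so as to minimize the coefficients of variation of the estimators of the population means for all characteristics $y_j$ ($j = 1, 2, \ldots, p$). This problem may be formulated as the multiobjective integer nonlinear program in (12). Consider

$$\begin{aligned}
\text{Minimize} \quad & (\mathrm{CV}_1, \mathrm{CV}_2, \ldots, \mathrm{CV}_p) \\
\text{subject to} \quad & \sum_{h=1}^{L} c_h n_h + \sum_{h=1}^{L} t_h n_h^{\alpha} \le \acute{C}, \\
& 2 \le n_h \le N_h, \quad n_h \text{ integers}, \quad n_h \in F, \quad h = 1, 2, \ldots, L,
\end{aligned} \tag{12}$$

where $F$ represents the feasible region, that is, the set of allocations that satisfy all constraints and sign restrictions. Any solution that lies within the feasible region can be implemented in practice.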
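The objective for one characteristic can be sketched numerically as follows. This is a sketch under an assumption: it takes the approximate MSE of the separate regression estimator to be $\sum_h W_h^2 S^2_{y_h}(1-\rho_h^2)/n_h$ (the finite population term is dropped), with $\rho_h$ the within-stratum correlation between the study and auxiliary variables. The function name and the data are hypothetical.

```python
import math

def cv_regression(n, W, Sy2, rho, Ybar):
    """Approximate coefficient of variation of the regression estimator.

    n    : stratum sample sizes
    W    : stratum weights N_h / N
    Sy2  : stratum variances of the study variable
    rho  : stratum correlations between study and auxiliary variables
    Ybar : population mean of the study variable
    """
    mse = sum(w ** 2 * s2 * (1.0 - r ** 2) / nh
              for nh, w, s2, r in zip(n, W, Sy2, rho))
    return math.sqrt(mse) / Ybar

# Hypothetical two-stratum data (illustrative values only).
W = [0.5, 0.5]
Sy2 = [100.0, 400.0]
rho = [0.6, 0.8]
Ybar = 50.0

cv_small = cv_regression([16, 36], W, Sy2, rho, Ybar)
cv_large = cv_regression([32, 72], W, Sy2, rho, Ybar)  # doubled sample sizes
```

Because the approximate MSE is inversely proportional to the sample sizes, doubling every $n_h$ divides the coefficient of variation by $\sqrt{2}$; the division by the population mean makes the objectives of different characteristics comparable.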

Extended Lexicographic Goal Programming
Romero [21] proposed the extended lexicographic goal programming method, which provides a general framework that covers, and allows mixtures of, the most common methods for solving multiobjective decision-making problems. It also encompasses distance-based multicriteria decision-making techniques. Romero [22] extended this work.
Let $Z_j$ denote the $j$th objective in (12), that is, the coefficient of variation for the $j$th characteristic, and let $Z_j^{*}$ be its individual optimum value, obtained by minimizing $Z_j$ alone over the feasible region. These optimum values serve as the goals that the multiobjective program tries to achieve. Let $\hat{Z}_j$ be the value of the $j$th objective obtained by applying the multiobjective optimization method. Since $Z_j^{*}$ is the smallest attainable value of $Z_j$, we have $\hat{Z}_j \ge Z_j^{*}$, and $\hat{Z}_j - Z_j^{*} \ge 0$ is the increase in $Z_j$ caused by the compromise among the objectives. Suppose this increase is at most $d_j^{+} \ge 0$. To achieve the specified goals, we must have

$$\hat{Z}_j - Z_j^{*} \le d_j^{+}$$

or

$$\hat{Z}_j - d_j^{+} \le Z_j^{*}. \tag{15}$$

In the goal programming method, we minimize the deviations $d_j^{+}$ subject to the additional constraint (15). To solve the multiobjective allocation problem (12), the extended lexicographic goal programming model (16) is used, in which $\lambda$ is a constant that can take any value between a minimum of zero and a maximum of one, and $d_j^{+}$ is the positive deviational variable.
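For orientation, a single-priority-level extended goal programming model in Romero's general form can be sketched as below. This is an assumed generic form, not necessarily the exact model (16) of this paper: the symbols $D$ (the largest deviation), $d_j^{+}$, and $\lambda$, the absence of preference weights, and the side of the objective on which $\lambda$ appears are all assumptions.

```latex
\begin{aligned}
\text{Minimize} \quad & (1 - \lambda)\, D + \lambda \sum_{j=1}^{p} d_j^{+} \\
\text{subject to} \quad & Z_j(n_1, \ldots, n_L) - d_j^{+} \le Z_j^{*}, \quad j = 1, \ldots, p, \\
& d_j^{+} \le D, \quad j = 1, \ldots, p, \\
& \sum_{h=1}^{L} c_h n_h + \sum_{h=1}^{L} t_h n_h^{\alpha} \le \acute{C}, \\
& 2 \le n_h \le N_h, \quad n_h \text{ integer}, \quad h = 1, \ldots, L, \\
& d_j^{+} \ge 0, \quad D \ge 0, \quad 0 \le \lambda \le 1 .
\end{aligned}
```

In this form, $\lambda = 1$ minimizes the sum of the deviations (the most efficient aggregate solution), $\lambda = 0$ minimizes the largest deviation (the most balanced solution), and intermediate values trade off the two.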

Some Other Compromise Allocations
In this section, some other compromise allocations are discussed for the sake of comparison with the proposed allocation.
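Cochran's compromise allocation is obtained by averaging the individual optimum allocations over the characteristics. Khan et al.'s compromise allocation combines the objectives using the relative weights proposed by Khan et al.
The allocation problem formulated in multiobjective integer nonlinear programming is to minimize the coefficients of variation of all characteristics simultaneously, subject to the cost constraint (3).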
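A minimal sketch of Cochran's averaging rule follows, assuming the compromise is the stratum-by-stratum average of the individual optimum allocations rounded to integers. The individual allocations, costs, and budget are hypothetical. Nothing in the averaging step looks at the cost constraint, so feasibility has to be checked separately.

```python
def cochran_compromise(individual_allocations):
    """Average the individual optimum allocations stratum by stratum.

    individual_allocations[j][h] is the optimum n_h for characteristic j.
    Python's round() uses round-half-to-even for exact .5 values.
    """
    p = len(individual_allocations)
    n_strata = len(individual_allocations[0])
    return [round(sum(a[h] for a in individual_allocations) / p)
            for h in range(n_strata)]

def survey_cost(n, c, t, alpha):
    """Variable survey cost: sum_h c_h*n_h + sum_h t_h*n_h**alpha."""
    return sum(ch * nh + th * nh ** alpha for nh, ch, th in zip(n, c, t))

# Hypothetical individual optimum allocations for two characteristics.
individual = [[10, 20, 30, 40],
              [20, 20, 20, 40]]
n_avg = cochran_compromise(individual)

# Hypothetical costs and budget; check feasibility for two exponents.
c = [1.0, 1.0, 1.0, 1.0]
t = [2.0, 2.0, 2.0, 2.0]
budget_var = 150.0
feasible = {alpha: survey_cost(n_avg, c, t, alpha) <= budget_var
            for alpha in (0.5, 1.0)}
```

In this toy setting the averaged allocation is affordable for one exponent and not for the other, which mirrors the kind of infeasibility an averaging rule can produce under a nonlinear cost function.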

Proposed Compromise Allocation.
We used the extended lexicographic goal programming model (16) to allocate the sample to the different strata, taking two characteristics into account, subject to the constraints of the model with $n_h$ ($h = 1, 2, 3, 4$) integers. Let $\hat{Z}_1$ and $\hat{Z}_2$ be the coefficients of variation under the proposed allocation at various values of the constants $\alpha$ and $\acute{C}$, given in Table 6.
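The procedure can be illustrated on a toy problem small enough to solve by exhaustive enumeration. This is a sketch under stated assumptions, not the GAMS-AlphaECP computation of the paper: it uses two strata and two characteristics with hypothetical coefficients `A[j][h]` such that the coefficient of variation of characteristic `j` is `sqrt(sum_h A[j][h] / n_h)`, and it assumes an achievement function of the form `(1 - lam) * max(d) + lam * sum(d)`, where `d` holds the increases over the individual optima.

```python
import itertools
import math

# Hypothetical problem data (illustrative values only).
A = [[4.0, 1.0], [1.0, 4.0]]   # A[j][h]: variance coefficient over mean squared
N = [30, 30]                   # stratum sizes
c = [1.0, 1.0]                 # per-unit measurement costs
t = [1.0, 1.0]                 # travel costs
alpha = 0.5                    # travel exponent
budget = 40.0                  # budget net of fixed cost

def cv(j, n):
    """Coefficient of variation of characteristic j under allocation n."""
    return math.sqrt(sum(a / nh for a, nh in zip(A[j], n)))

def cost(n):
    """Nonlinear survey cost of allocation n."""
    return sum(ch * nh + th * nh ** alpha for nh, ch, th in zip(n, c, t))

# Feasible region: integer allocations within bounds and within budget.
feasible = [n for n in itertools.product(*(range(2, Nh + 1) for Nh in N))
            if cost(n) <= budget]

# Individual optima: minimize each objective on its own.
z_star = [min(cv(j, n) for n in feasible) for j in range(len(A))]

def achievement(n, lam):
    """Assumed extended goal programming achievement function."""
    d = [cv(j, n) - z_star[j] for j in range(len(A))]
    return (1.0 - lam) * max(d) + lam * sum(d)

def compromise(lam):
    """Feasible allocation with the smallest achievement value."""
    return min(feasible, key=lambda n: achievement(n, lam))

allocations = {lam: compromise(lam) for lam in (0.0, 0.5, 1.0)}
```

Because every candidate is drawn from the feasible set, the compromise allocation is an integer allocation that respects the cost constraint by construction; enumeration is only practical for very small problems, and a nonlinear integer solver is needed for realistic sizes.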

Khan et al. Compromise Allocation.
We have applied model (18)

Discussion
In this section, the proposed compromise allocation is compared with the Cochran compromise allocation, the Khan et al. compromise allocation, and the individual optimum allocation. The comparison is based on the trace of the variance-covariance matrix of the estimates of the finite population means under the compromise allocations. We assume that the characteristics are independent, so the covariances are zero. Table 3 gives the individual optimum allocation. Tables 4 and 5 give the Cochran compromise allocation and the Khan et al. compromise allocation, as discussed in Section 5. The proposed compromise allocation is given in Table 6. Table 4 shows that the Cochran compromise allocation gives higher trace values for $\alpha = 1, 1.5$ than the proposed compromise allocation in Table 6. For $\alpha = 0.5, 2.0$, the Cochran compromise allocation gives a slightly lower trace but is infeasible, because the corresponding cost exceeds the available budget. Table 5 shows that the Khan et al. compromise allocation gives higher trace values than the proposed compromise allocation. The percent relative efficiency (PRE) of the proposed compromise allocation with respect to the individual optimum allocation is the ratio of the trace under the individual optimum allocation to the trace under the proposed compromise allocation, expressed as a percentage. Table 7 shows that the proposed compromise allocation provides more efficient estimates of the population means than the individual optimum allocation.
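The comparison criterion can be sketched as follows, assuming independent characteristics so that the trace is simply the sum of the variances of the estimators, and assuming the percent relative efficiency is 100 times the ratio of the reference trace to the trace under the allocation being assessed. The function names, variances, and allocations are hypothetical.

```python
def trace_of_variances(n, W, S2):
    """Trace of a diagonal variance-covariance matrix.

    S2[j][h] is the (residual) variance of characteristic j in stratum h;
    the finite population correction is ignored.
    """
    return sum(sum(w ** 2 * s2 / nh for w, s2, nh in zip(W, S2_j, n))
               for S2_j in S2)

def pre(trace_reference, trace_proposed):
    """Percent relative efficiency of the proposed allocation."""
    return 100.0 * trace_reference / trace_proposed

# Hypothetical data: two strata, two characteristics.
W = [0.5, 0.5]
S2 = [[100.0, 400.0],
      [64.0, 36.0]]

trace_ref = trace_of_variances([10, 10], W, S2)   # reference allocation
trace_new = trace_of_variances([8, 12], W, S2)    # alternative allocation
efficiency = pre(trace_ref, trace_new)
```

A value above 100 means the alternative allocation gives a smaller trace, and hence more precise estimates overall, than the reference allocation.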

Conclusion
On the basis of the comparison made in Section 6, we conclude that the extended lexicographic goal programming approach always secures a feasible solution, which is not guaranteed by Cochran's compromise method, and that it gives more efficient results than the Khan et al. compromise approach and the individual optimum allocation approach.

Let $Y_1$ denote the quantity of corn harvested in 2002, $Y_2$ the quantity of oats harvested in 2002, $X_1$ the quantity of corn harvested in 1997, and $X_2$ the quantity of oats harvested in 1997. The data summary is given as $\bar{Y}_1 = 474973.90$, $\bar{X}_1 = 405654.19$, $\bar{Y}_2 = 1576.25$, and $\bar{X}_2 = 2116.70$. The detailed summary of the data is given in the tables.
Romero [22] extended this work to a more general form of the objective function. Goal programming is a technique used by decision makers to optimize more than one objective under given constraints. In goal programming, all specified objectives are included in the model, and the decision maker tries to minimize the potential deviations from them. Consider the following individual optimum problem for each characteristic: minimize $Z_j$ subject to the cost constraint (3), where $n_h$ are integers, $n_h \in F$, $h = 1, 2, \ldots, L$, and $j = 1, 2, \ldots, p$.

Table 2: Data summary.

$Z_1^{*}$ and $Z_2^{*}$ are the coefficients of variation under the individual optimum allocation at different values of $\alpha$ and $\acute{C}$, given in the corresponding table.

* The values $\hat{Z}_1$ and $\hat{Z}_2$ are the coefficients of variation under the Khan et al. compromise allocation, obtained by solving the above model at different values of the constants $\alpha$ and $\acute{C}$, given in Table 5.

Table 7: PRE of the proposed compromise allocation relative to the individual optimum allocation.