A Stochastic Restricted Principal Components Regression Estimator in the Linear Model

We propose a new estimator to combat multicollinearity in the linear model when there are stochastic linear restrictions on the regression coefficients. The new estimator, called the stochastic restricted principal components (SRPC) regression estimator, is constructed by combining the ordinary mixed estimator (OME) and the principal components regression (PCR) estimator. Necessary and sufficient conditions for the superiority of the SRPC estimator over the OME and the PCR estimator are derived under the mean squared error matrix criterion. Finally, we give a numerical example and a Monte Carlo study to illustrate the performance of the proposed estimator.


Introduction
In linear regression analysis, the presence of multicollinearity among regressor variables may cause highly unstable least squares estimates of the regression parameters. With multicollinear data, some coefficients may be statistically insignificant and may have the wrong signs. To overcome this problem, different remedial methods have been proposed. One approach to combating collinearity is to use biased estimators, the most notable of which are the Stein estimator by Stein [1], the principal components regression (PCR) estimator by Massy [2], the ordinary ridge regression (ORR) estimator by Hoerl and Kennard [3], and the Liu estimator by Liu [4]. Another approach is to collect and use additional information, which can take the form of exact or stochastic restrictions [5]. For stochastic linear restrictions, Durbin [6], Theil and Goldberger [7], and Theil [8] proposed the ordinary mixed estimator (OME) by combining the sample model with stochastic restrictions. Some other important references on this subject are Li and Yang [9,10], Xu and Yang [11], Yang and Cui [12], Yang and Wu [13], Yang and Xu [14], and so on.
In this paper, we will introduce a stochastic restricted principal components (SRPC) regression estimator, which is defined by combining in a special way the ordinary mixed estimator and the principal components regression estimator.
We will compare the new estimator with the PCR estimator and the OME, respectively, in the sense of the criterion of the mean squared error matrix (MSEM).
The rest of the paper is organized as follows. In Section 2, the new estimator is introduced. In Section 3, some properties of the new estimator are discussed. A numerical example and a Monte Carlo simulation study are given in Section 4.
In addition to model (1), suppose that some prior information about $\beta$ is available in the form of a set of $j$ independent stochastic linear restrictions:
$$r = R\beta + e, \qquad e \sim (0, \sigma^2 W), \tag{4}$$
where $r$ is a $j \times 1$ vector, $R$ is a $j \times p$ matrix with $\operatorname{rank}(R) = j$, $e$ is a $j \times 1$ vector of disturbances, and $W$ is assumed to be known and positive definite. Furthermore, it is also assumed that the random vector $e$ is independent of $\varepsilon$.
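For model (1), Massy [2] introduced the PCR estimator as
$$\hat{\beta}_{PCR} = T_k (T_k' X' X T_k)^{-1} T_k' X' y,$$
where $T_k = (t_1, \ldots, t_k)$ contains the orthonormal eigenvectors of $X'X$ corresponding to its $k$ largest eigenvalues. Xu and Yang [11] showed that the PCR estimator could be rewritten as follows:
$$\hat{\beta}_{PCR} = T_k T_k' \hat{\beta}_{OLS},$$
where $\hat{\beta}_{OLS} = S^{-1} X' y$ is the ordinary least squares estimator (OLSE) and $S = X'X$.
For model (1) with the stochastic restrictions (4), the OME is given by
$$\hat{\beta}_{OME} = (S + R' W^{-1} R)^{-1} (X' y + R' W^{-1} r).$$
Ozkale [15] showed that the OME could be rewritten as
$$\hat{\beta}_{OME} = \hat{\beta}_{OLS} + S^{-1} R' (W + R S^{-1} R')^{-1} (r - R \hat{\beta}_{OLS}). \tag{8}$$
Now, the stochastic restricted principal components (SRPC) regression estimator can be obtained by combining the OME and the PCR estimator. Substituting the PCR estimator for the OLSE in (8), we get the new estimator as follows:
$$\hat{\beta}_{SRPC} = \hat{\beta}_{PCR} + S^{-1} R' (W + R S^{-1} R')^{-1} (r - R \hat{\beta}_{PCR}).$$
We can see that $\hat{\beta}_{SRPC}$ is a general estimator which includes the PCR estimator and the OME as special cases: if $R = 0$, then $\hat{\beta}_{SRPC} = \hat{\beta}_{PCR}$; if $k = p$, then $\hat{\beta}_{SRPC} = \hat{\beta}_{OME}$.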
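The construction above can be sketched numerically. The following is a minimal NumPy sketch, not the author's code: the function names (`ols`, `pcr`, `ome`, `srpc`, `mixed_update`), the symbol `k` for the number of retained components, and the toy data are assumptions introduced for illustration. It assumes the SRPC estimator is the OLSE-based form of the OME with the PCR estimator substituted for the OLSE.

```python
import numpy as np

def ols(X, y):
    """Ordinary least squares estimate S^{-1} X'y with S = X'X."""
    return np.linalg.solve(X.T @ X, X.T @ y)

def top_eigvecs(X, k):
    """Orthonormal eigenvectors of X'X for its k largest eigenvalues (p x k)."""
    eigvals, eigvecs = np.linalg.eigh(X.T @ X)  # eigenvalues in ascending order
    return eigvecs[:, ::-1][:, :k]

def pcr(X, y, k):
    """PCR estimate in the form T_k T_k' beta_OLS."""
    T_k = top_eigvecs(X, k)
    return T_k @ (T_k.T @ ols(X, y))

def mixed_update(b, X, r, R, W):
    """Return b + S^{-1} R' (W + R S^{-1} R')^{-1} (r - R b)."""
    S_inv_Rt = np.linalg.solve(X.T @ X, R.T)
    gain = S_inv_Rt @ np.linalg.inv(W + R @ S_inv_Rt)
    return b + gain @ (r - R @ b)

def ome(X, y, r, R, W):
    """Ordinary mixed estimate: the update applied to the OLS estimate."""
    return mixed_update(ols(X, y), X, r, R, W)

def srpc(X, y, r, R, W, k):
    """SRPC estimate: the same update applied to the PCR estimate."""
    return mixed_update(pcr(X, y, k), X, r, R, W)

if __name__ == "__main__":
    # Toy data (hypothetical, for illustration only).
    rng = np.random.default_rng(0)
    X = rng.standard_normal((30, 4))
    y = X @ np.array([1.0, 0.5, -0.5, 2.0]) + rng.standard_normal(30)
    R = np.array([[1.0, -1.0, 0.0, 0.0]])   # one stochastic restriction
    W = np.array([[0.5]])
    r = np.array([0.3])
    print("PCR :", pcr(X, y, 2))
    print("OME :", ome(X, y, r, R, W))
    print("SRPC:", srpc(X, y, r, R, W, 2))
```

With all `p` components retained, `srpc` coincides with `ome`, and with a zero restriction matrix it coincides with `pcr`, which mirrors the two special cases stated in the text.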
For the sake of convenience, we list some notations and important lemmas needed in the following discussions. For an $n \times n$ matrix $M$, $M \geq 0$ means that $M$ is symmetric and positive semidefinite and $M > 0$ means that $M$ is symmetric and positive definite.
By Lemma 1, the following lemma is straightforward.

The Superiority of the New Estimator
The bias vector and the covariance matrix of the SRPC estimator are given by From (15), we can obtain that MSEM($\hat{\beta}_{SRPC}$) Following the above procedure, we can get where 2 = ( − ) .
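As a worked sketch of how such moments follow, assume the SRPC estimator has the form $\hat{\beta}_{PCR} + M(r - R\hat{\beta}_{PCR})$ with $M = S^{-1}R'(W + RS^{-1}R')^{-1}$ and $S = X'X$, that the model errors have mean zero and covariance $\sigma^2 I$, and that they are independent of the restriction errors $e$ with covariance $\sigma^2 W$. The symbols $M$, $T_k$, and $\Lambda_k = T_k' S T_k$ are notation assumed here, and the expressions below are derived from these assumptions rather than quoted from the paper, so they may differ in form from the paper's numbered equations.

```latex
\begin{aligned}
\hat{\beta}_{SRPC} &= (I_p - MR)\,\hat{\beta}_{PCR} + M r, \\
\operatorname{E}(\hat{\beta}_{PCR}) &= T_k T_k' \beta, \qquad
\operatorname{Cov}(\hat{\beta}_{PCR}) = \sigma^2 T_k \Lambda_k^{-1} T_k', \\
\operatorname{Bias}(\hat{\beta}_{SRPC}) &= (I_p - MR) T_k T_k' \beta + MR\beta - \beta
  = (I_p - MR)(T_k T_k' - I_p)\beta, \\
\operatorname{Cov}(\hat{\beta}_{SRPC}) &= \sigma^2 \left[ (I_p - MR)\, T_k \Lambda_k^{-1} T_k' \,(I_p - MR)' + M W M' \right], \\
\operatorname{MSEM}(\hat{\beta}_{SRPC}) &= \operatorname{Cov}(\hat{\beta}_{SRPC})
  + \operatorname{Bias}(\hat{\beta}_{SRPC})\operatorname{Bias}(\hat{\beta}_{SRPC})'.
\end{aligned}
```

Setting $R = 0$ gives $M = 0$ and recovers the moments of the PCR estimator, while $k = p$ gives $T_k T_k' = I_p$, so the bias vanishes, as expected for the OME.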
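In order to compare $\hat{\beta}_{SRPC}$ with $\hat{\beta}_{PCR}$ and $\hat{\beta}_{OME}$ in the MSEM sense, we now investigate the differences $\operatorname{MSEM}(\hat{\beta}_{PCR}) - \operatorname{MSEM}(\hat{\beta}_{SRPC})$ and $\operatorname{MSEM}(\hat{\beta}_{OME}) - \operatorname{MSEM}(\hat{\beta}_{SRPC})$. In the following theorems, we will give the necessary and sufficient conditions for the new estimator to be superior to the PCR estimator and the OME in the MSEM sense.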
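A comparison in the MSEM sense amounts to checking whether a difference of two MSEM matrices is nonnegative definite. The sketch below evaluates such differences numerically for given parameter values. It is an illustration under assumptions, not the paper's procedure: the MSEM expressions are derived from the assumed form of the SRPC estimator (the mixed-estimator update applied to the PCR estimator), and all function names and the symbol `k` are hypothetical.

```python
import numpy as np

def msem_pcr(X, beta, sigma2, k):
    """MSEM of PCR: sigma^2 T_k Lambda_k^{-1} T_k' + bias bias'."""
    eigvals, eigvecs = np.linalg.eigh(X.T @ X)        # ascending eigenvalues
    T_k, lam_k = eigvecs[:, ::-1][:, :k], eigvals[::-1][:k]
    bias = (T_k @ T_k.T - np.eye(X.shape[1])) @ beta
    return sigma2 * (T_k / lam_k) @ T_k.T + np.outer(bias, bias)

def msem_ome(X, R, W, sigma2):
    """MSEM of the (unbiased) OME: sigma^2 (S + R' W^{-1} R)^{-1}."""
    return sigma2 * np.linalg.inv(X.T @ X + R.T @ np.linalg.solve(W, R))

def msem_srpc(X, beta, sigma2, R, W, k):
    """MSEM of SRPC under the assumed form b_PCR + M (r - R b_PCR)."""
    S_inv_Rt = np.linalg.solve(X.T @ X, R.T)
    M = S_inv_Rt @ np.linalg.inv(W + R @ S_inv_Rt)
    A = np.eye(X.shape[1]) - M @ R
    return A @ msem_pcr(X, beta, sigma2, k) @ A.T + sigma2 * M @ W @ M.T

def is_nnd(D, tol=1e-10):
    """True if the symmetric part of D is nonnegative definite up to tol."""
    return bool(np.linalg.eigvalsh((D + D.T) / 2).min() >= -tol)

def srpc_dominates(X, beta, sigma2, R, W, k):
    """Report whether SRPC is superior to PCR and to OME in the MSEM sense."""
    m_srpc = msem_srpc(X, beta, sigma2, R, W, k)
    return {
        "vs_PCR": is_nnd(msem_pcr(X, beta, sigma2, k) - m_srpc),
        "vs_OME": is_nnd(msem_ome(X, R, W, sigma2) - m_srpc),
    }
```

In practice `beta` and `sigma2` are unknown, so a check of this kind can only be run with plug-in values; the scalar MSE of each estimator is the trace of the corresponding matrix.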

Numerical Example and Monte Carlo Simulation
In order to illustrate the performance of the proposed estimator, we first consider the real data example discussed in Gruber [17]; the data have also been analyzed by Akdeniz and Erol [18], Li and Yang [9], and Chang and Yang [19].
In this experiment, we note from the theorems that the comparison results depend on the unknown parameters $\beta$ and $\sigma^2$. Consequently, we cannot verify directly whether the conditions of the theorems hold, and the comparison results may vary with the parameter values. We therefore replace the unknown parameters by their unbiased estimators, that is, the OLS estimators. All results below were computed in R 2.8.0.
From the data, we can obtain the following results: Following Chang and Yang [19], we set the number of principal components to 3 and consider the following stochastic linear restriction: The estimated MSE values of PCR, OME, and SRPC are obtained by replacing all unknown parameters by their OLS estimators. Table 1 gives the results.
From Table 1, we can observe that the estimated MSE value of the new estimator is smaller than those of PCR and OME, which is in accordance with the theoretical findings in Theorems 5 and 6.
To further examine the MSE performance of the new estimator, we perform a Monte Carlo simulation study. Specifically, the explanatory variables and the observations are generated by
$$x_{ij} = (1 - \gamma^2)^{1/2} z_{ij} + \gamma z_{i,p+1}, \qquad y_i = \sum_{j=1}^{p} \beta_j x_{ij} + \varepsilon_i, \qquad i = 1, \ldots, n, \; j = 1, \ldots, p,$$
where $z_{ij}$ are independent standard normal pseudorandom numbers, $\varepsilon_i$ are independent normal pseudorandom numbers with mean zero and variance $\sigma^2$, and $\gamma$ is specified so that the correlation between any two explanatory variables is given by $\gamma^2$. In addition, a stochastic linear constraint to the model is considered:
In the simulation, we choose $\sigma^2 = 1$, $p = 6$, $n = 50, 100$, and $k = 2, 5$ for the number of retained principal components. Four different sets of correlations, namely, $\gamma = 0.9$, $\gamma = 0.99$, $\gamma = 0.999$, and $\gamma = 0.9999$, are considered to show the weak, strong, and severe collinearity between the explanatory variables following Liu [20]. We choose the normalized eigenvector corresponding to the largest eigenvalue of $X'X$ as the true value of $\beta$ following Chang and Yang [19]. The experiment is replicated 10000 times by generating new error terms. Then, the estimated MSE for an estimator $\tilde{\beta}$ is calculated as follows:
$$\operatorname{MSE}(\tilde{\beta}) = \frac{1}{N} \sum_{i=1}^{N} (\tilde{\beta}_{(i)} - \beta)'(\tilde{\beta}_{(i)} - \beta),$$
where $\tilde{\beta}_{(i)}$ is the estimator of $\beta$ in the $i$th replication of the experiment and $N = 10000$.
The simulation results are summarized in Tables 2 and 3, where the condition number of $X'X$, that is, $\lambda_1 / \lambda_5$, is also given. From the simulation results shown in Tables 2 and 3, we can see that, as the level of multicollinearity increases, the estimated MSE values of the three estimators increase in general. However, the proposed SRPC estimator behaves better than the competing estimators in most of the cases. In addition, the more severe the collinearity is, the more pronounced the superiority of SRPC is. Therefore, the proposed estimator is recommended when the explanatory variables are moderately or severely collinear.
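A simulation of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not a reproduction of the study: the stochastic restriction used here (a single restriction with `R` a row of ones and `W = I`) is hypothetical, the SRPC estimator is taken to be the mixed-estimator update applied to the PCR estimator, the default of 200 replications is chosen only to keep the run short, and the names `make_design` and `simulate_mse` are introduced for this sketch.

```python
import numpy as np

def make_design(n, p, gamma, rng):
    """x_ij = sqrt(1 - gamma^2) z_ij + gamma z_{i,p+1}; pairwise correlation gamma^2."""
    z = rng.standard_normal((n, p + 1))
    return np.sqrt(1.0 - gamma**2) * z[:, :p] + gamma * z[:, [p]]

def simulate_mse(n=50, p=6, k=2, gamma=0.99, sigma=1.0, reps=200, seed=0):
    """Estimated MSE of PCR, OME, and SRPC over `reps` replications."""
    rng = np.random.default_rng(seed)
    X = make_design(n, p, gamma, rng)
    S = X.T @ X
    eigvals, eigvecs = np.linalg.eigh(S)      # ascending eigenvalues
    beta = eigvecs[:, -1]                     # eigenvector of the largest eigenvalue
    T_k = eigvecs[:, ::-1][:, :k]             # k leading eigenvectors

    # Hypothetical restriction r = R beta + e with e ~ N(0, sigma^2 W), W = I.
    R = np.ones((1, p))
    W = np.eye(1)
    S_inv_Rt = np.linalg.solve(S, R.T)
    M = S_inv_Rt @ np.linalg.inv(W + R @ S_inv_Rt)

    totals = {"PCR": 0.0, "OME": 0.0, "SRPC": 0.0}
    for _ in range(reps):
        # New error terms in every replication; the design X stays fixed.
        y = X @ beta + sigma * rng.standard_normal(n)
        r = R @ beta + sigma * rng.standard_normal(1)
        b_ols = np.linalg.solve(S, X.T @ y)
        b_pcr = T_k @ (T_k.T @ b_ols)
        estimates = {
            "PCR": b_pcr,
            "OME": b_ols + M @ (r - R @ b_ols),
            "SRPC": b_pcr + M @ (r - R @ b_pcr),
        }
        for name, est in estimates.items():
            totals[name] += float((est - beta) @ (est - beta))
    return {name: total / reps for name, total in totals.items()}

if __name__ == "__main__":
    for gamma in (0.9, 0.99, 0.999, 0.9999):
        print(gamma, simulate_mse(gamma=gamma))
```

Because the restriction here is an assumption, the numbers this sketch prints are not comparable to Tables 2 and 3; it only shows the mechanics of generating collinear regressors and averaging the squared estimation error.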