A New Ridge-Type Estimator for the Linear Regression Model: Simulations and Applications

The ridge regression-type (Hoerl and Kennard, 1970) and Liu-type (Liu, 1993) estimators are widely used shrinkage methods for reducing the effects of multicollinearity in both linear and nonlinear regression models. This paper proposes a new estimator to address the multicollinearity problem in the linear regression model. Theory and simulation results show that, under some conditions, it performs better than both the Liu and ridge regression estimators in the smaller-MSE sense. Two real-life datasets (chemical and economic) are analyzed to illustrate the findings of the paper.


Introduction
To describe the problem, we consider the following linear regression model:

y = Xβ + ε,  (1)

where y is an n × 1 vector of the response variable, X is a known n × p full-rank matrix of predictor (explanatory) variables, β is a p × 1 vector of unknown regression parameters, and ε is an n × 1 vector of errors such that E(ε) = 0 and V(ε) = σ²I_n, where I_n is the n × n identity matrix. The ordinary least squares (OLS) estimator of β in (1) is defined as

β̂ = S^{-1}X′y,  (2)

where S = X′X is the design matrix. The OLS estimator dominated for a long time until it was shown to be inefficient in the presence of multicollinearity among the predictor variables. Multicollinearity is the existence of a near-to-strong or strong linear relationship among the predictor variables. Different authors have developed several estimators as alternatives to the OLS estimator.
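The OLS estimator in (2) can be sketched numerically as follows; this is a minimal illustration with simulated data (the design matrix and coefficients below are illustrative assumptions, not data from the paper):

```python
import numpy as np

# Minimal sketch of the OLS estimator in (2): beta_hat = S^{-1} X'y, S = X'X.
rng = np.random.default_rng(0)
n, p = 30, 3
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)

S = X.T @ X                              # the design matrix S = X'X
beta_ols = np.linalg.solve(S, X.T @ y)   # solves the normal equations S beta = X'y
```

Solving the normal equations directly is adequate here because S is well conditioned; the sections below deal with the case where it is not.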
These include the Stein estimator [1], principal component estimator [2], ridge regression estimator [3], contraction estimator [4], modified ridge regression (MRR) estimator [5], and Liu estimator [6]. Also, some authors have developed two-parameter estimators to combat the problem of multicollinearity. These authors include Akdeniz and Kaçıranlar [7]; Özkale and Kaçıranlar [8]; Sakallıoğlu and Kaçıranlar [9]; Yang and Chang [10]; and, very recently, Roozbeh [11]; Akdeniz and Roozbeh [12]; and Lukman et al. [13,14], among others.

The objective of this paper is to propose a new one-parameter ridge-type estimator for the regression parameter when the predictor variables of the model are linearly or near-to-linearly related. Since we want to compare the performance of the proposed estimator with the ridge regression and Liu estimators, we give a short description of each of them as follows.

The Ridge Regression Estimator. The ridge estimator is obtained by minimizing the objective function

(y − Xβ)′(y − Xβ) + kβ′β  (3)

with respect to β, which yields the normal equations

(S + kI_p)β = X′y,  (4)

where k is a nonnegative constant. The solution to (4) gives the ridge estimator, which is defined as

β̂(k) = (S + kI_p)^{-1}X′y = W(k)β̂,  (5)

where S = X′X, W(k) = [I_p + kS^{-1}]^{-1}, and k is the biasing parameter. Hoerl et al. [15] defined the harmonic-mean version of the biasing parameter for the ridge regression estimator as follows:

k̂_HM = pσ̂² / Σ_{i=1}^{p} α̂_i²,  (6)

where σ̂² is the estimated mean squared error from the OLS regression of equation (1) and α̂_i is the ith coefficient of α̂ = Q′β̂, defined under equation (17). A large number of techniques have been suggested by various authors to estimate the biasing parameter. To mention a few: McDonald and Galarneau [16]; Lawless and Wang [17]; Wichern and Churchill [18]; Kibria [19]; Sakallıoğlu and Kaçıranlar [9]; Lukman and Ayinde [20]; and, recently, Saleh et al. [21], among others.
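The ridge estimator (5) and the harmonic-mean biasing parameter (6) can be sketched as below. The collinear design is simulated (the generating scheme and the value ρ = 0.9 are illustrative assumptions); nothing is taken from the paper's datasets:

```python
import numpy as np

# Sketch of the ridge estimator (5) with the harmonic-mean k of Hoerl et al. (6).
rng = np.random.default_rng(1)
n, p, rho = 50, 3, 0.9
Z = rng.normal(size=(n, p + 1))
X = np.sqrt(1 - rho**2) * Z[:, :p] + rho * Z[:, [p]]   # correlated columns
y = X @ (np.ones(p) / np.sqrt(p)) + rng.normal(size=n)

S = X.T @ X
beta_ols = np.linalg.solve(S, X.T @ y)

# sigma^2 from the OLS fit, and alpha = Q'beta in the eigenbasis of S
resid = y - X @ beta_ols
sigma2 = resid @ resid / (n - p)
lam, Q = np.linalg.eigh(S)
alpha = Q.T @ beta_ols

k_hm = p * sigma2 / np.sum(alpha**2)                        # equation (6)
beta_ridge = np.linalg.solve(S + k_hm * np.eye(p), X.T @ y)  # equation (5)
```

For any k > 0 the ridge estimator shrinks the coefficient vector toward zero, which is the source of its variance reduction under multicollinearity.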

Liu Estimator.
The Liu estimator of β is obtained by augmenting dβ̂ = β + ε′ to (1) and then applying the OLS estimator to estimate the parameter. The Liu estimator is obtained to be

β̂(d) = F(d)β̂,  (7)

where F(d) = [S + I_p]^{-1}[S + dI_p]. The biasing parameter d for the Liu estimator is defined as follows:

d̂_opt = 1 − σ̂² [ Σ_{i=1}^{p} 1/(λ_i(λ_i + 1)) ] / [ Σ_{i=1}^{p} α̂_i²/(λ_i + 1)² ],  (8)

where λ_i is the ith eigenvalue of the X′X matrix and α̂ = Q′β̂ is defined under equation (17). If d̂_opt is negative, Özkale and Kaçıranlar [8] adopt the following alternative biasing parameter:

d̂_alt = min_i [ α̂_i² / (σ̂²/λ_i + α̂_i²) ].  (9)

For more on the Liu [6] estimator, we refer our readers to Akdeniz and Kaçıranlar [7]; Liu [22]; Alheety and Kibria [23]; Liu [24]; Li and Yang [25]; Kan et al. [26]; and, very recently, Farghali [27], among others.
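A minimal sketch of the Liu estimator (7) with the biasing parameter (8) and the fallback (9) follows; the data are simulated illustrations, not the paper's examples:

```python
import numpy as np

# Liu estimator (7): beta(d) = F(d) beta_ols, F(d) = (S + I)^{-1}(S + dI),
# with d chosen by (8), falling back to (9) when d_opt is negative.
rng = np.random.default_rng(2)
n, p, rho = 50, 3, 0.9
Z = rng.normal(size=(n, p + 1))
X = np.sqrt(1 - rho**2) * Z[:, :p] + rho * Z[:, [p]]
y = X @ (np.ones(p) / np.sqrt(p)) + rng.normal(size=n)

S = X.T @ X
beta_ols = np.linalg.solve(S, X.T @ y)
resid = y - X @ beta_ols
sigma2 = resid @ resid / (n - p)
lam, Q = np.linalg.eigh(S)
alpha = Q.T @ beta_ols

d_opt = 1 - sigma2 * np.sum(1 / (lam * (lam + 1))) / np.sum(alpha**2 / (lam + 1)**2)
d = d_opt if d_opt > 0 else np.min(alpha**2 / (sigma2 / lam + alpha**2))

F = np.linalg.solve(S + np.eye(p), S + d * np.eye(p))   # F(d) in (7)
beta_liu = F @ beta_ols
```

Note that d = 1 gives F(1) = I, so the Liu estimator reduces to OLS at that endpoint.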
In this article, we propose a new one-parameter estimator in the class of ridge and Liu estimators, which carries most of the characteristics of both the ridge and Liu estimators.

The New One-Parameter Estimator.
The proposed estimator is obtained by minimizing the following objective function:

(y − Xβ)′(y − Xβ) + k(β + β̂)′(β + β̂)  (10)

with respect to β, which yields the normal equations

(S + kI_p)β = X′y − kβ̂,  (11)

where k is a nonnegative constant. The solution to (11) gives the new estimator as

β̂_KL = (S + kI_p)^{-1}(S − kI_p)β̂,  (12)

where S = X′X. The new proposed estimator will be called the Kibria–Lukman (KL) estimator and denoted by β̂_KL.
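The closed form (12) is straightforward to compute; the sketch below uses simulated data, and the value k = 0.5 is an arbitrary illustration (data-driven choices of k are discussed later in the paper):

```python
import numpy as np

# KL estimator (12): beta_KL = (S + kI)^{-1}(S - kI) beta_ols.
rng = np.random.default_rng(3)
n, p, rho = 50, 3, 0.9
Z = rng.normal(size=(n, p + 1))
X = np.sqrt(1 - rho**2) * Z[:, :p] + rho * Z[:, [p]]
y = X @ (np.ones(p) / np.sqrt(p)) + rng.normal(size=n)

S = X.T @ X
beta_ols = np.linalg.solve(S, X.T @ y)

def kl_estimator(k):
    """Return the KL estimate for a given biasing parameter k >= 0."""
    I = np.eye(p)
    return np.linalg.solve(S + k * I, (S - k * I) @ beta_ols)

beta_kl = kl_estimator(0.5)
```

At k = 0 the estimator reduces to OLS, and for any k > 0 each canonical component is multiplied by (λ_i − k)/(λ_i + k), whose absolute value is below 1, so the KL estimator always shrinks the OLS solution.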

Properties of the New Estimator.
The proposed estimator is a biased estimator unless k = 0. Its bias is

Bias(β̂_KL) = E(β̂_KL) − β = −2k(S + kI_p)^{-1}β,

and the mean squared error matrix (MSEM) of an estimator β̃ is defined as

MSEM(β̃) = V(β̃) + Bias(β̃)Bias(β̃)′.

To compare the performance of the four estimators (OLS, RR, Liu, and KL), we rewrite (1) in the canonical form, which gives

y = Zα + ε,  (17)

where Z = XQ, α = Q′β, and Q is the orthogonal matrix whose columns are the eigenvectors of X′X, so that Z′Z = Q′X′XQ = Λ = diag(λ_1, …, λ_p). The OLS estimator of α is α̂ = Λ^{-1}Z′y, with MSEM(α̂) = σ²Λ^{-1}. The ridge estimator (RE) of α is

α̂(k) = W(k)α̂,

where W(k) = [I_p + kΛ^{-1}]^{-1} and k is the biasing parameter. The Liu estimator of α is

α̂(d) = F_d α̂,

where F_d = (Λ + I)^{-1}(Λ + dI). The proposed one-parameter estimator of α is

α̂_KL = (Λ + kI)^{-1}(Λ − kI)α̂,

with Bias(α̂_KL) = −2k(Λ + kI)^{-1}α. The following notations and lemmas are needed to prove the statistical properties of α̂_KL [28].
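The canonical form can be verified numerically: with Q the eigenvector matrix of S = X′X, the canonical OLS estimator Λ^{-1}Z′y coincides with Q′β̂. The data below are simulated purely for the check:

```python
import numpy as np

# Numerical check of the canonical form (17): alpha_hat = Lambda^{-1} Z'y
# with Z = XQ equals Q' beta_ols, and rotating back recovers beta_ols.
rng = np.random.default_rng(4)
n, p = 40, 3
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

S = X.T @ X
beta_ols = np.linalg.solve(S, X.T @ y)
lam, Q = np.linalg.eigh(S)   # S = Q diag(lam) Q'
Z = X @ Q                    # canonical regressors, Z'Z = diag(lam)
alpha_hat = (Z.T @ y) / lam  # Lambda^{-1} Z'y
```

Working in this basis diagonalizes all four estimators, which is why the MSEM comparisons below reduce to componentwise inequalities.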
The other parts of this article are organized as follows. The theoretical comparisons among the estimators and the estimation of the biasing parameter are given in Section 2. A simulation study is conducted in Section 3. Two numerical examples are presented in Section 4. This paper ends with concluding remarks in Section 5.

Comparison between α̂ and α̂_KL.
The difference between MSEM(α̂) and MSEM(α̂_KL) is

MSEM(α̂) − MSEM(α̂_KL) = σ²D − 4k²(Λ + kI)^{-1}αα′(Λ + kI)^{-1},

where D = Λ^{-1} − (Λ + kI)^{-1}(Λ − kI)Λ^{-1}(Λ − kI)(Λ + kI)^{-1} = diag(4k/(λ_i + k)²). We have the following theorem.

Theorem 1. If k > 0, the estimator α̂_KL is superior to the estimator α̂ using the MSEM criterion, that is, MSEM(α̂) − MSEM(α̂_KL) > 0, if and only if

4k²α′(Λ + kI)^{-1}[σ²D]^{-1}(Λ + kI)^{-1}α ≤ 1.

Proof. The difference between the two dispersion matrices is σ²D, whose ith diagonal element is 4kσ²/(λ_i + k)² > 0 for k > 0, so σ²D is positive definite. The stated condition then follows from the lemmas above [28].

Comparison between α̂(k) and α̂_KL.
The difference between MSEM(α̂(k)) and MSEM(α̂_KL) is

MSEM(α̂(k)) − MSEM(α̂_KL) = σ²(G − H) + b₁b₁′ − b₂b₂′,

where G = Λ(Λ + kI)^{-2}, H = (Λ − kI)Λ^{-1}(Λ − kI)(Λ + kI)^{-2}, b₁ = Bias(α̂(k)) = −k(Λ + kI)^{-1}α, and b₂ = Bias(α̂_KL) = −2k(Λ + kI)^{-1}α.

Theorem 2. When λ_max(HG^{-1}) < 1, the estimator α̂_KL is superior to α̂(k) in the MSEM sense if and only if

b₂′[σ²(G − H) + b₁b₁′]^{-1}b₂ ≤ 1,

where G, H, b₁, and b₂ are as defined above.

Proof. Using the dispersion matrix difference V₁ = σ²(G − H): it is obvious that, for k > 0, G > 0 and H > 0. According to Lemma 1, G − H > 0 if and only if λ_max(HG^{-1}) < 1, where λ_max(HG^{-1}) is the maximum eigenvalue of the matrix HG^{-1}. Consequently, V₁ is pd, and the stated condition follows.
Comparison between α̂(d) and α̂_KL. A similar result holds against the Liu estimator.

Proof. Using the difference between the dispersion matrices,

V₂ = σ²[F_dΛ^{-1}F_d − (Λ + kI)^{-1}(Λ − kI)Λ^{-1}(Λ − kI)(Λ + kI)^{-1}],

where F_d = (Λ + I)^{-1}(Λ + dI). We observe that the ith diagonal element of V₂ is

σ²[(λ_i + d)²/(λ_i(λ_i + 1)²) − (λ_i − k)²/(λ_i(λ_i + k)²)],

which is positive whenever (λ_i + d)(λ_i + k) > |λ_i − k|(λ_i + 1), and the MSEM condition then follows as in Theorem 2.

Determination of Parameter k.
There is a need to estimate the parameter of the new estimator for practical use. The ridge biasing parameter and the Liu shrinkage parameter were determined by Hoerl and Kennard [3] and Liu [6], respectively. Different authors have developed other estimators of these parameters. To mention a few, these include Hoerl et al. [15]; Kibria [19]; Kibria and Banik [31]; and Lukman and Ayinde [20], among others. The optimal value of k is the one that minimizes

m(k) = Σ_{i=1}^{p} σ²(λ_i − k)²/(λ_i(λ_i + k)²) + Σ_{i=1}^{p} 4k²α_i²/(λ_i + k)².

Differentiating m(k) with respect to k and setting ∂m(k)/∂k = 0, we obtain

k_i = σ²/(2α_i² + σ²/λ_i).  (39)

The optimal value of k in (39) depends on the unknown parameters σ² and α_i². These are replaced with their unbiased estimates. Consequently, we have

k̂_i = σ̂²/(2α̂_i² + σ̂²/λ_i).  (40)

Following Hoerl et al. [15], the harmonic-mean version of (40) is defined as

k̂_HM = pσ̂² / Σ_{i=1}^{p}(2α̂_i² + σ̂²/λ_i).  (41)

According to Özkale and Kaçıranlar [8], the minimum version of (41) is defined as

k̂_min = min_i [ σ̂²/(2α̂_i² + σ̂²/λ_i) ].  (42)
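The estimators (40)–(42) can be computed directly from the OLS fit, as sketched below on simulated data (the design is an illustrative assumption):

```python
import numpy as np

# Individual k_i of (40), harmonic-mean version (41), minimum version (42).
rng = np.random.default_rng(5)
n, p, rho = 50, 3, 0.9
Z = rng.normal(size=(n, p + 1))
X = np.sqrt(1 - rho**2) * Z[:, :p] + rho * Z[:, [p]]
y = X @ (np.ones(p) / np.sqrt(p)) + rng.normal(size=n)

S = X.T @ X
beta_ols = np.linalg.solve(S, X.T @ y)
resid = y - X @ beta_ols
sigma2 = resid @ resid / (n - p)          # sigma^2 estimated from OLS
lam, Q = np.linalg.eigh(S)
alpha = Q.T @ beta_ols

k_i = sigma2 / (2 * alpha**2 + sigma2 / lam)             # equation (40)
k_hm = p * sigma2 / np.sum(2 * alpha**2 + sigma2 / lam)  # equation (41)
k_min = k_i.min()                                        # equation (42)
```

Since (41) is the harmonic mean of the k_i, it always lies between the smallest and largest individual values, so k̂_min ≤ k̂_HM.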

Simulation Study
Since the theoretical comparisons among the ridge regression, Liu, and KL estimators in Section 2 give only the conditional dominance of the estimators, a simulation study has been conducted using the R 3.4.1 programming language to get a better picture of the performance of the estimators.

Simulation Technique.
The design of the simulation study depends on the factors that are expected to affect the properties of the estimator under investigation and on the criteria used to judge the results. Since the degree of collinearity among the explanatory variables is of central importance, following Gibbons [32] and Kibria [19], we generated the explanatory variables using the following equation:

x_ij = (1 − ρ²)^{1/2} z_ij + ρ z_{i,p+1},  i = 1, 2, …, n, j = 1, 2, …, p,  (43)

where the z_ij are independent standard normal pseudo-random numbers and ρ represents the correlation between any two explanatory variables. We consider p = 3 and 7 in the simulation. These variables are standardized so that X′X and X′y are in correlation form. The n observations for the dependent variable y are determined by the following equation:

y_i = β_1x_i1 + β_2x_i2 + ⋯ + β_px_ip + e_i,  i = 1, 2, …, n,  (44)

where the e_i are i.i.d. N(0, σ²), and, without loss of generality, we assume a zero intercept for the model in (44). The values of β are chosen such that β′β = 1 [33]. Wichern and Churchill [18] have found that the ridge regression estimator is better than the OLS when k is between 0 and 1, and Kan et al. [26] also suggested that a smaller value of k (less than 1) is better. Simulations are repeated 1,000 times for the sample sizes n = 30 and 100 and σ² = 1, 25, and 100. For each replicate, we compute the mean squared error (MSE) of the estimators by using the following equation:

MSE(α̂*) = (1/1000) Σ_{j=1}^{1000} (α̂*_j − α)′(α̂*_j − α),  (45)

where α̂* would be any of the estimators (OLS, ridge, Liu, or KL). The estimator with the smaller MSE is considered the best. The simulated results for n = 30, p = 3, and ρ = 0.70, 0.80 and ρ = 0.90, 0.99 are presented in Tables 1 and 2, respectively, and those for n = 100, p = 3, and ρ = 0.70, 0.80 and ρ = 0.90, 0.99 are presented in Tables 3 and 4, respectively. The corresponding simulated results for n = 30, 100 and p = 7 are presented in Tables 5–8. For better visualization, we have plotted MSE vs. d for n = 30, σ = 10, and ρ = 0.70, 0.90, and 0.99 in Figures 1–3, respectively.
We also plotted MSE vs. σ for n = 30, d = 0.50, and ρ = 0.90 and 0.99, which is presented in Figures 4 and 5, respectively. Finally, to see the effect of sample size on MSE, we plotted MSE vs. sample size for d = 0.5 and ρ = 0.90 in Figure 6. From Tables 1–8 and Figures 1–6, it appears that, as the values of σ increase, the MSE values also increase (Figure 3), while, as the sample size increases, the MSE values decrease (Figure 4).
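The Monte Carlo design in (43)–(45) can be re-created compactly as follows. The settings pick out a single cell of the paper's grid, and k is held fixed at 0.5 for brevity instead of being estimated from each sample, so the numbers are only indicative:

```python
import numpy as np

# Compact re-creation of the simulation: correlated predictors via (43),
# responses via (44), and empirical MSE (45) for OLS vs. the KL estimator.
rng = np.random.default_rng(6)
n, p, rho, sigma, reps = 30, 3, 0.99, 5.0, 1000
beta = np.ones(p) / np.sqrt(p)   # chosen so that beta'beta = 1

sse_ols = sse_kl = 0.0
k, I = 0.5, np.eye(p)
for _ in range(reps):
    Z = rng.normal(size=(n, p + 1))
    X = np.sqrt(1 - rho**2) * Z[:, :p] + rho * Z[:, [p]]   # equation (43)
    y = X @ beta + rng.normal(scale=sigma, size=n)          # equation (44)
    S = X.T @ X
    b_ols = np.linalg.solve(S, X.T @ y)
    b_kl = np.linalg.solve(S + k * I, (S - k * I) @ b_ols)  # KL estimator
    sse_ols += (b_ols - beta) @ (b_ols - beta)
    sse_kl += (b_kl - beta) @ (b_kl - beta)

mse_ols, mse_kl = sse_ols / reps, sse_kl / reps             # equation (45)
```

Under this severe-collinearity, high-noise cell, the shrinkage of the KL estimator cuts the empirical MSE well below that of OLS, consistent with the pattern reported in the tables.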

Numerical Examples
To illustrate our theoretical results, we consider two datasets: (i) the famous Portland cement data originally adopted by Woods et al. [34] and (ii) the French economy data from Chatterjee and Hadi [35]. They are analyzed in the following sections, respectively.

Example 1: Portland Data.
These data are widely known as the Portland cement dataset, originally adopted by Woods et al. [34]. It has also been analyzed by the following authors: Kaçıranlar et al. [36]; Li and Yang [25]; and, recently, Lukman et al. [13]. The regression model for these data is defined as

y_i = β_0 + β_1X_{i1} + β_2X_{i2} + β_3X_{i3} + β_4X_{i4} + ε_i,

where y_i = heat evolved after 180 days of curing, measured in calories per gram of cement, X_1 = tricalcium aluminate, X_2 = tricalcium silicate, X_3 = tetracalcium aluminoferrite, and X_4 = β-dicalcium silicate. The correlation matrix of the predictor variables is given in Table 9. The estimation results are presented in Tables 10 and 11. It appears from Table 11 that the proposed estimator performed the best in the sense of smaller MSE.

Example 2: French Economy Data. The French economy data in Chatterjee and Hadi [37] are considered in this example. They have been analyzed by Malinvaud [38] and Liu [6], among others. The variables are imports, domestic production, stock formation, and domestic consumption, all measured in milliards of French francs for the years 1949 through 1966. The regression model for these data is defined as

y_i = β_0 + β_1X_{i1} + β_2X_{i2} + β_3X_{i3} + ε_i,

where y_i = IMPORT, X_1 = domestic production, X_2 = stock formation, and X_3 = domestic consumption. The correlation matrix of the predictor variables is given in Table 12. The variance inflation factors are VIF_1 = 469.688, VIF_2 = 1.047, and VIF_3 = 469.338. The eigenvalues of the X′X matrix are λ_1 = 161779, λ_2 = 158, and λ_3 = 49.61, and the condition number is 32612. Reviewing the above correlation matrix, the VIFs, and the condition number, it can be said that severe multicollinearity is present in the predictor variables. The biasing parameter for the new estimator is defined in (41) and (42). The biasing parameters for the ridge and Liu estimators are provided in (6), (8), and (9), respectively.
We analyzed the data using the biasing parameters for each of the estimators and presented the results in Tables 10 and 11. It can be seen from Tables 10 and 11 that the proposed estimator performed the best in the sense of smaller MSE.
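The collinearity diagnostics quoted above are computed as follows: VIF_j is the jth diagonal entry of the inverse of the predictor correlation matrix, and the condition number is the ratio λ_max/λ_min of X′X. The data below are simulated stand-ins (the French economy series are not reproduced here), built so that two predictors are nearly collinear, as in the example:

```python
import numpy as np

# VIFs and condition number for a design with one nearly collinear pair.
rng = np.random.default_rng(7)
n = 18                                     # 1949-1966, as in the example
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = x1 + rng.normal(scale=0.05, size=n)   # nearly a copy of x1
X = np.column_stack([x1, x2, x3])

R = np.corrcoef(X, rowvar=False)
vif = np.diag(np.linalg.inv(R))            # variance inflation factors
lam = np.linalg.eigvalsh(X.T @ X)
cond = lam.max() / lam.min()               # condition number of X'X
```

As in the French economy data, the two collinear predictors get very large VIFs while the unrelated one stays near 1, and the condition number explodes.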

Summary and Concluding Remarks
In this paper, we introduced a new biased estimator to overcome the multicollinearity problem in the multiple linear regression model and provided an estimation technique for its biasing parameter. A simulation study has been conducted to compare the performance of the proposed estimator with the Liu [6] and ridge regression [3] estimators. The simulation results evidently show that the proposed estimator performed better than both the Liu and ridge estimators under some conditions on the shrinkage parameter. Two sets of real-life data are analyzed to illustrate the benefits of using the new estimator in the context of a linear regression model. The proposed estimator is recommended to researchers in this area. Its application can be extended to other regression models, for example, logistic, Poisson, ZIP, and related models, and those possibilities are under current investigation [37,39,40].

Data Availability
Data will be made available on request.

Conflicts of Interest
The authors declare that they have no conflicts of interest.