Combination forecasting takes the characteristics of each single forecasting method into consideration and combines them into a composite forecast, which increases forecasting accuracy. Existing research on combination forecasting selects single models arbitrarily, neglecting the internal characteristics of the forecasting object. After discussing the role of the cointegration test and the encompassing test in single model selection, supplemented by empirical analysis, the paper gives guidance on single model selection: for a given forecasting target, no more than five suitable single models should be selected from the many alternatives, which increases accuracy and stability.
In 1969, Bates and Granger [
The combination forecasting model is a process of selecting and utilizing the information of single forecasting models, but there is little dedicated research on the problem of selecting single models when forming a combination forecasting model. In previous research and applications, the single forecasting models involved in the combination are often not screened; instead, each model is chosen according to the obvious characteristics presented by the particular process and according to existing knowledge and experience.
During the regression analysis of economic measurement, an important index to evaluate the fitting degree is
The researches of Armstrong [
This paper therefore discusses how to select suitable models from the alternative single models to form a combination for a given forecasting target; following some selection principles, the number of single models in the combination is no more than five.
From the end of the 1980s to the 1990s, a great breakthrough in econometric modeling theory was the study of cointegration relationships between time series. Cointegration describes the long-run equilibrium relationship between different time series; it is a key idea in current econometrics and an important theoretical cornerstone of current research on combination forecasting based on time series.
If a forecasting series objectively reflects the sample series, the two should have an equilibrium relationship in the long term, even though in the short term they may deviate from the mean value due to random disturbance and other reasons. Therefore, we can calculate the forecasting series of all alternative single forecasting models, test each for cointegration with the sample series, and reject the models whose series are not cointegrated with it; this achieves the goal of selecting single models.
Suppose
The three mathematical equations above hold true for all
Suppose some series requires difference transformation for
Suppose
Let
The key of this result is to be able to explain that if some linear combination of two variables is
Generally speaking, economic series are nonstationary, and so is any reasonably good forecasting series. Supposing
Given that both predicted series
Only when every single forecasting in series
The former parts in Theorems
Attestation of Theorem
For different combination forecasting models, Theorem
Assuming the forecasting series
And
It is obvious in the above equation that
If the single model in combination forecasting is in inconformity with the cointegration conditions in Theorem
Step one of single model cointegration test in combination forecasting is to detect the integration order
The most widely used method for determining the integration order of a sample series or forecasting series is the unit root test. The standard unit root test is the Dickey-Fuller (DF) test, but the method most often used in practice is the augmented Dickey-Fuller (ADF) test [
The above formula adopts the lagged term
Unit root test is mainly performed to test the coefficient
The output result of ADF test includes the statistical magnitude
To know whether two variables
So it can lead to
It is called cointegration regression.
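The two-step idea described here (a cointegration regression followed by a unit root test on its residuals) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the helper names `ols_simple`, `df_stat_no_const`, and `is_cointegrated` are hypothetical, the residual test is a plain Dickey-Fuller regression without lagged difference terms, the data are simulated, and the critical value −4.21 is simply borrowed from the 1% values reported in the residual table later in the paper. Proper critical values depend on the sample size and on the regression specification.

```python
import random

def ols_simple(x, y):
    """Least-squares fit y = a + b*x; returns (a, b, residuals)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    resid = [yi - a - b * xi for xi, yi in zip(x, y)]
    return a, b, resid

def df_stat_no_const(u):
    """t-statistic of rho in  du_t = rho * u_{t-1} + e_t  (no constant, no lags)."""
    lag = u[:-1]
    diff = [u[t] - u[t - 1] for t in range(1, len(u))]
    sll = sum(v * v for v in lag)
    rho = sum(l * d for l, d in zip(lag, diff)) / sll
    sse = sum((d - rho * l) ** 2 for l, d in zip(lag, diff))
    s2 = sse / (len(diff) - 1)
    return rho / (s2 / sll) ** 0.5

def is_cointegrated(sample, fitted, crit):
    """Step 1: regress the sample on the fitted series. Step 2: DF test on residuals."""
    _, _, resid = ols_simple(fitted, sample)
    stat = df_stat_no_const(resid)
    return stat < crit, stat

# Simulated illustration: a random walk and a fitted series that tracks it.
random.seed(0)
sample = [100.0]
for _ in range(199):
    sample.append(sample[-1] + random.gauss(0, 1))
fitted = [2.0 + 0.9 * s + random.gauss(0, 0.5) for s in sample]

ok, stat = is_cointegrated(sample, fitted, crit=-4.21)  # assumed critical value
print(ok, round(stat, 2))
```

A single model whose fitted series fails this check would be dropped before the combination is formed.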
The testing methods for
In practice, the following procedure can also be used.
Expand the ADF regression of
Study the stability of the obtained
In formula (
M3-Competition sample series [
N999 fitting sample series.

| Quarter \ Year | 1990 | 1991 | 1992 | 1993 | 1994 | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 3944.5 | 3969 | 4389 | 4244 | 4497 | 4938.5 | 5031 | 4895 | 5120.5 | 5790 | 6302.5 |
| 2 | 3901 | 4068 | 4311.5 | 4244 | 4475.5 | 5012.5 | 4994 | 5034 | 5262.5 | 5944.5 | 6207 |
| 3 | 3802.5 | 4250 | 4274.5 | 4185 | 4697.5 | 5071 | 4984.5 | 5018.5 | 5355 | 5984.5 | 6805.5 |
| 4 | 3737.5 | 4373.5 | 4222 | 4370.5 | 4913.5 | 4978.5 | 4975.5 | 5145 | 5475.5 | 6009.5 | 7185 |
N999 forecasting sample series.

| Quarter \ Year | 2001 | 2002 |
|---|---|---|
| 1 | 7373.5 | 7608 |
| 2 | 7259.5 | 7592.5 |
| 3 | 7558.5 | 7722 |
| 4 | 7660.5 | 7611 |
In the paper, 12 single models are selected to fit and predict the N999 series: moving average (Naïve), single exponential smoothing (Single), linear exponential smoothing (Holts), damped trend exponential smoothing (Dampen), seasonal exponential smoothing (Winter), the time series decomposition model (Decomposition), ARIMA, ARARMA, BP neural network (BP), NARX neural network (NARX), grey forecasting (GM), and support vector machine (SVM). Real-world series contain both linear and nonlinear information; therefore, classical single models of both kinds are considered in the paper.
The evaluation indexes of forecasting are of a large number, and Goodwin and Lawton [
sMAPE was proposed to address a shortcoming of MAPE, namely, its heavier penalty on positive errors. For example, if
Next, we conduct a unit root test on the N999 sample series using EViews 6.0.
It can be observed from Table
Unit root test of N999 sample series after first difference.

| | t-Statistic | Prob.* |
|---|---|---|
| Augmented Dickey-Fuller test statistic | −5.025046 | 0.0002 |
| Test critical values: 1% level | −3.596616 | |
| 5% level | −2.933158 | |
| 10% level | −2.604867 | |

*MacKinnon (1996) one-sided p-values.

Augmented Dickey-Fuller test equation
Dependent variable:
Method: least squares
Date: 09/25/13, time: 00:18
Sample (adjusted): 1990Q3 2000Q4
Included observations: 42 after adjustments

| Variable | Coefficient | Std. error | t-Statistic | Prob. |
|---|---|---|---|---|
| | −0.822358 | 0.163652 | −5.025046 | 0.0000 |
| | 66.08964 | 24.93664 | 2.650303 | 0.0115 |

| Statistic | Value | Statistic | Value |
|---|---|---|---|
| R-squared | 0.386983 | Mean dependent var. | 10.07143 |
| Adjusted R-squared | 0.371658 | S.D. dependent var. | 182.3687 |
| S.E. of regression | 144.5601 | Akaike info. criterion | 12.83172 |
| Sum squared resid. | 835905.0 | Schwarz criterion | 12.91446 |
| Log likelihood | −267.4660 | Hannan-Quinn criter. | 12.86205 |
| F-statistic | 25.25109 | Durbin-Watson stat. | 2.023140 |
| Prob. (F-statistic) | 0.000011 | | |
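The test equation reported above can be re-estimated by hand as a cross-check. The sketch below is a plain-Python illustration, not the EViews procedure itself: it assumes a Dickey-Fuller regression with a constant and no lagged difference terms (which is what the 42 included observations suggest), applied to the first difference of the N999 fitting sample. The helper name `df_regression` is hypothetical. The resulting t-statistic can be compared with the 1% critical value in the table.

```python
# N999 fitting sample, 1990Q1 to 2000Q4, read year by year from the fitting sample table.
n999 = [
    3944.5, 3901, 3802.5, 3737.5,   # 1990
    3969, 4068, 4250, 4373.5,       # 1991
    4389, 4311.5, 4274.5, 4222,     # 1992
    4244, 4244, 4185, 4370.5,       # 1993
    4497, 4475.5, 4697.5, 4913.5,   # 1994
    4938.5, 5012.5, 5071, 4978.5,   # 1995
    5031, 4994, 4984.5, 4975.5,     # 1996
    4895, 5034, 5018.5, 5145,       # 1997
    5120.5, 5262.5, 5355, 5475.5,   # 1998
    5790, 5944.5, 5984.5, 6009.5,   # 1999
    6302.5, 6207, 6805.5, 7185,     # 2000
]

def df_regression(series):
    """OLS of  dy_t = c + rho * y_{t-1} + e_t ; returns (rho, c, t-statistic of rho)."""
    lag = series[:-1]
    diff = [series[t] - series[t - 1] for t in range(1, len(series))]
    n = len(diff)
    ml, md = sum(lag) / n, sum(diff) / n
    sll = sum((l - ml) ** 2 for l in lag)
    rho = sum((l - ml) * (d - md) for l, d in zip(lag, diff)) / sll
    c = md - rho * ml
    sse = sum((d - c - rho * l) ** 2 for l, d in zip(lag, diff))
    se_rho = (sse / (n - 2) / sll) ** 0.5
    return rho, c, rho / se_rho

# Test the first-differenced series for a unit root.
first_diff = [n999[t] - n999[t - 1] for t in range(1, len(n999))]
rho, c, t_stat = df_regression(first_diff)

CRIT_1PCT = -3.596616  # 1% critical value taken from the table
print(f"rho={rho:.6f}  c={c:.5f}  t={t_stat:.6f}  stationary={t_stat < CRIT_1PCT}")
```

A t-statistic below the critical value rejects the unit root, meaning the differenced series is treated as stationary.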
Next, unit root test is conducted for the fitting data of twelve single models, and only when the fitting sample of single model is also integrated of order can they go into cointegration test phase (see Table
Unit root test on fitting data of 12 single forecasts.

| Test sample | ADF statistic | 1% critical value | Stationary | ADF statistic after first difference | 1% critical value | Stationary after first difference | Result |
|---|---|---|---|---|---|---|---|
| Forecast sample | 2.66 | −3.59 | No | −5.03 | −3.60 | Yes | |
| Naive | −6.02 | −3.60 | Yes | - | - | - | |
| SINGLE | −6.02 | −3.60 | Yes | - | - | - | |
| Holts | 1.65 | −3.60 | No | −7.53 | −3.59 | Yes | |
| DAMPEN | 0.15 | −3.60 | No | −16.02 | −3.61 | Yes | |
| WINTER | 4.20 | −3.59 | No | −15.31 | −3.61 | Yes | |
| THETA | 5.12 | −3.59 | No | −9.01 | −3.60 | Yes | |
| ARIMA | 3.31 | −3.60 | No | −7.59 | −3.59 | Yes | |
| ARARMA | 2.08 | −3.60 | No | −11.4 | −3.60 | Yes | |
| AutoANN | 10.8 | −3.61 | No | 1.55 | −3.59 | No | |
| NARX | 5.21 | −3.60 | No | −19.74 | −3.60 | Yes | |
| GM(1, 1) | 3.11 | −3.59 | No | −12.56 | −3.59 | Yes | |
| SVM | 4.10 | −3.60 | No | −4.21 | −3.60 | Yes | |
The unit root test shows that the Naïve and Single series are stationary and that the integration order of AutoANN is greater than 1, so these three single models are removed first.
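The screening rule applied here can be written down directly from the reported statistics. In the sketch below the numbers are copied from the unit root table; the dictionary layout and the function name `integration_order` are hypothetical. A model is kept only if its fitted series has the same integration order as the sample series.

```python
# (ADF statistic, 1% critical value) in levels and after first differencing.
# None means no differenced test was reported because the level series was stationary.
adf_table = {
    "Naive":    ((-6.02, -3.60), None),
    "SINGLE":   ((-6.02, -3.60), None),
    "Holts":    ((1.65, -3.60), (-7.53, -3.59)),
    "DAMPEN":   ((0.15, -3.60), (-16.02, -3.61)),
    "WINTER":   ((4.20, -3.59), (-15.31, -3.61)),
    "THETA":    ((5.12, -3.59), (-9.01, -3.60)),
    "ARIMA":    ((3.31, -3.60), (-7.59, -3.59)),
    "ARARMA":   ((2.08, -3.60), (-11.4, -3.60)),
    "AutoANN":  ((10.8, -3.61), (1.55, -3.59)),
    "NARX":     ((5.21, -3.60), (-19.74, -3.60)),
    "GM(1, 1)": ((3.11, -3.59), (-12.56, -3.59)),
    "SVM":      ((4.10, -3.60), (-4.21, -3.60)),
}
sample_row = ((2.66, -3.59), (-5.03, -3.60))  # the N999 sample series itself

def integration_order(level, diff):
    """0 if stationary in levels, 1 if stationary after one difference, else None."""
    if level[0] < level[1]:
        return 0
    if diff is not None and diff[0] < diff[1]:
        return 1
    return None  # order above 1, or not determined by these two tests

target = integration_order(*sample_row)
kept = [m for m, row in adf_table.items() if integration_order(*row) == target]
removed = [m for m in adf_table if m not in kept]
print("kept:", kept)
print("removed:", removed)  # -> removed: ['Naive', 'SINGLE', 'AutoANN']
```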
The cointegration test is conducted to fitting series and sample series of the remaining nine single models; according to (
ADF test on residuals of 9 single forecasts’ fitting sample.

| Residual series of single forecast model | ADF statistic | 1% critical value | Stationary | Cointegrated with sample series |
|---|---|---|---|---|
| Holts | −5.47 | −4.21 | Yes | Yes |
| DAMPEN | −2.56 | −4.20 | No | No |
| WINTER | −7.12 | −4.22 | Yes | Yes |
| THETA | −7.33 | −4.21 | Yes | Yes |
| ARIMA | −9.35 | −4.23 | Yes | Yes |
| ARARMA | −12.38 | −4.22 | Yes | Yes |
| NARX | −15.21 | −4.23 | Yes | Yes |
| GM(1, 1) | 1.21 | −4.22 | No | No |
| SVM | −4.89 | −4.21 | Yes | Yes |
It can be observed from Table
Finally, the remaining seven single models are combined again using the simple average, weighted optimal, and ANN combination models; the results are shown in Table
Result of combined forecasts of single models by cointegration test.

| | Simple average | Weighted optimal | ANN |
|---|---|---|---|
| 12 single models’ combination without cointegration test | 5.00 | 5.42 | 5.88 |
| 7 single models’ combination with cointegration test | 4.90 | 5.21 | 5.44 |
It can be observed from Table
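Two of the combination schemes compared in this section can be sketched briefly. The simple average is unambiguous. The text does not spell out how the weighted optimal combination is computed, so the sketch substitutes inverse-MSE weights (in the spirit of Bates and Granger) as an assumed stand-in. All function names and the toy numbers are illustrative only.

```python
def simple_average(forecasts):
    """Equal-weight combination of several equally long forecast series."""
    k = len(forecasts)
    return [sum(vals) / k for vals in zip(*forecasts)]

def inverse_mse_weights(actual, fitted):
    """Weights proportional to 1 / in-sample MSE (assumed stand-in for 'weighted optimal')."""
    mses = [sum((a - f) ** 2 for a, f in zip(actual, fit)) / len(actual) for fit in fitted]
    inv = [1.0 / m for m in mses]
    total = sum(inv)
    return [w / total for w in inv]

def weighted_combination(forecasts, weights):
    """Weighted sum of the forecast series, one weight per model."""
    return [sum(w * v for w, v in zip(weights, vals)) for vals in zip(*forecasts)]

# Toy data: weights are estimated on fitted values, then applied to new forecasts.
actual = [100, 102, 101, 105]
fit_a = [98, 103, 100, 107]    # in-sample MSE 2.5
fit_b = [104, 98, 105, 101]    # in-sample MSE 16
fc_a, fc_b = [108, 110], [112, 106]

weights = inverse_mse_weights(actual, [fit_a, fit_b])
avg = simple_average([fc_a, fc_b])
combo = weighted_combination([fc_a, fc_b], weights)
print(weights, avg, combo)
```

The more accurate model receives the larger weight, whereas the simple average treats every retained model alike; this is why removing an accurate model can affect the two schemes differently.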
The numerous studies on forecasting accuracy mostly focus on evaluating accuracy and stability and cannot directly compare the differences between competing forecasts of the same object. Forecast encompassing theory arose naturally to enable such discriminative comparison; it was proposed by Meese and Rogoff [
Kisinbay [
Tomiyuki and Ryoji [
An encompassing test can identify the origins of accuracy differences between competing models and help forecasters distinguish whether a difference is caused by sample variability or by the significance of the information sets used to construct the models. However, we cannot ensure that the model with the best forecasting accuracy is able to explain all competing models. The research of Ericsson [
Combination forecasting theory holds that different forecasting models may contain different information, which leads to different forecasting results. Through combination forecasting we can obtain better results than the forecasts of the two original single models. However, if a single model contributes no additional information to the combination, the combination cannot achieve higher efficiency, and combination forecasting loses its significance.
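A small constructed example illustrates this argument. The error sequences below are artificial and chosen so that the arithmetic is exact; they are not taken from the paper. Model B has errors of the same size as model A but unrelated to them, while model C's errors are model A's errors plus extra noise, so C carries no information that A lacks.

```python
def mse(errors):
    """Mean squared error of a list of forecast errors."""
    return sum(e * e for e in errors) / len(errors)

n = 40
e_a = [1.0 if t % 2 == 0 else -1.0 for t in range(n)]     # errors of model A
e_b = [1.0 if t % 4 < 2 else -1.0 for t in range(n)]      # model B: uncorrelated with A
extra = [2.0 if t % 4 < 2 else -2.0 for t in range(n)]
e_c = [a + x for a, x in zip(e_a, extra)]                 # model C: A's error plus noise

# The error of an equal-weight combination is the average of the two errors.
avg_ab = [(a + b) / 2 for a, b in zip(e_a, e_b)]
avg_ac = [(a + c) / 2 for a, c in zip(e_a, e_c)]

print(mse(e_a), mse(e_b), mse(avg_ab))  # -> 1.0 1.0 0.5
print(mse(e_a), mse(e_c), mse(avg_ac))  # -> 1.0 5.0 2.0
```

Combining A with B halves the squared error, whereas combining A with C is worse than using A alone. In this constructed case the best weight on C would be zero.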
Based on the above analysis, besides applying the cointegration test to single models before combination forecasting, we should also adopt the forecast encompassing test to identify the encompassing relationships between alternative single models and select single models through certain heuristic strategies.
We first consider pairwise encompassing between single forecasting models by making
That model
Ericsson [
Fair and Shiller [
If
In testing process through (
Harvey, Leybourne, and Newbold proposed the HLN encompassing test to address the above defects [
Assuming that the
Here
Using the Monte Carlo method, we simulate and evaluate the power of the HLN test and find that it performs poorly on small samples.
Therefore, the paper amends the HLN test to improve its performance on small samples (the amended test is referred to as the MHLN test). The specific amendments are as follows: first, compare and test the statistic through
Computing the MHLN statistic is simple: calling the correlation test function in the MATLAB 7.0 Statistics Toolbox yields test results at various significance levels.
This section chooses N999 series in M3 as the sample series of encompassing test; through the cointegration selection in Section
Calculate the in-sample fitting sMAPE of the 7 single models for the N999 series and sort the models by that value. Note that the in-sample fitting sMAPE of a single model is different from its out-of-sample forecasting sMAPE.
Select the model with the lowest sMAPE value as the optimal model; for the optimal model, the NARX neural network, the adopted MHLN encompassing test (significance level of 0.01) and
Select the second-best model, ARARMA, and proceed as in step two. The result shows that ARIMA is encompassed by ARARMA, so we remove the ARIMA model from the alternative single models.
Select the third-best model, Decomposition, and proceed as in step two again, continuing until the encompassing test ends and no further encompassed single forecasting model is found.
Since the number of single models in combination forecasting should preferably be no more than 5, we repeat the above encompassing test procedure with the significance level raised to 0.05. The result shows that Decomposition encompasses SVM.
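The heuristic above (rank by in-sample sMAPE, then let each better model eliminate the models it encompasses) can be sketched as follows. Several assumptions are made and should be read as such: the paper's MHLN statistic is not reproduced; instead a one-step HLN-style t-statistic on d_t = e_i,t (e_i,t − e_j,t) is used, with a normal critical value as an approximation to the Student-t one. The sMAPE formula is the common M3 definition, and the function names and toy data are hypothetical.

```python
from statistics import NormalDist, mean, stdev

def smape(actual, fitted):
    """Symmetric MAPE in percent (common M3 definition, assumed here)."""
    return 100.0 * mean(2 * abs(a - f) / (abs(a) + abs(f)) for a, f in zip(actual, fitted))

def encompasses(actual, fit_i, fit_j, alpha=0.01):
    """True if 'model i encompasses model j' is NOT rejected at level alpha."""
    e_i = [a - f for a, f in zip(actual, fit_i)]
    e_j = [a - f for a, f in zip(actual, fit_j)]
    d = [ei * (ei - ej) for ei, ej in zip(e_i, e_j)]
    sd = stdev(d)
    if sd == 0:                       # degenerate case, e.g. identical forecasts
        return mean(d) <= 0
    t_stat = mean(d) / (sd / len(d) ** 0.5)
    return t_stat <= NormalDist().inv_cdf(1 - alpha)   # one-sided test

def select_models(actual, fits, alpha=0.01):
    """Rank by in-sample sMAPE; each surviving model removes the worse models it encompasses."""
    ranked = sorted(fits, key=lambda m: smape(actual, fits[m]))
    kept = list(ranked)
    for pos, best in enumerate(ranked):
        if best not in kept:
            continue
        for other in ranked[pos + 1:]:
            if other in kept and encompasses(actual, fits[best], fits[other], alpha):
                kept.remove(other)
    return kept

# Toy data: B's error is A's error plus extra noise; C's error is unrelated to A's.
n = 40
actual = [100.0] * n
e_a = [1.0 if t % 2 == 0 else -1.0 for t in range(n)]
noise = [1.0 if t % 4 < 2 else -1.0 for t in range(n)]
e_c = [1.5 if t % 8 < 4 else -1.5 for t in range(n)]
fits = {
    "A": [y - e for y, e in zip(actual, e_a)],
    "B": [y - e - 2.0 * z for y, e, z in zip(actual, e_a, noise)],
    "C": [y - e for y, e in zip(actual, e_c)],
}
selected = select_models(actual, fits)
print(selected)  # -> ['A', 'C']
```

Model B is dropped because the best model already contains its information, while model C survives because its errors are unrelated to those of A.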
Through the encompassing test, the seven single models are finally narrowed down to five. These five single models are combined again, and further analysis is made on the basis of Table
sMAPE of combined forecasts of single models after the encompassing test.

| | Simple average | Weighted optimal | ANN |
|---|---|---|---|
| 12 single models’ combination without cointegration test | 5.00 | 5.42 | 5.88 |
| 7 single models’ combination with cointegration test | 4.90 | 5.21 | 5.44 |
| 6 single models’ combination with encompassing test (0.01 significance level) | 5.31 | 5.14 | 5.69 |
| 5 single models’ combination with encompassing test (0.05 significance level) | 4.62 | 4.88 | 5.32 |
We find that ARARMA encompasses ARIMA; in fact, however, the out-of-sample forecasting accuracy of ARIMA is higher than that of ARARMA, which also confirms that “an advantage in out-of-sample forecasting accuracy is not a sufficient condition for forecast encompassing,” as mentioned at the beginning of this section.
Furthermore, we find that at the 0.01 significance level, the accuracy of the simple average combination of the six single models remaining after the encompassing test decreases, while the accuracy of the weighted optimal combination improves. The main reason is that the eliminated ARIMA model has high forecasting accuracy itself, so excluding it necessarily decreases the accuracy of the simple average combination; for the weighted optimal combination, however, excluding ARIMA significantly increases the weights of ARARMA and the NARX neural network, so the combination forecasting accuracy improves.
At the 0.05 significance level, SVM is excluded, and the accuracy of the combinations of the remaining five single models improves significantly. In particular, the sMAPE of the weighted optimal combination is lower than that of the Naïve single forecast, proving the effectiveness of combination forecasting.
Therefore, when the single models undergo encompassing tests, if no more than 5 single models remain after the test at the 0.01 significance level, the test ends directly; if more than 5 remain, the significance level should be adjusted to 0.05 or 0.1.
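This stopping rule amounts to a short loop over significance levels. In the sketch below, `select_with_escalation` and `reported_outcome` are hypothetical names; the latter does not run any test but simply replays the eliminations reported for the N999 series (ARIMA at 0.01, ARIMA and SVM at 0.05), using the model labels from the cointegration table.

```python
def select_with_escalation(candidates, run_test, levels=(0.01, 0.05, 0.10), max_models=5):
    """Run the encompassing selection at successive levels until at most max_models remain."""
    kept = list(candidates)
    for alpha in levels:
        kept = run_test(candidates, alpha)
        if len(kept) <= max_models:
            return kept, alpha
    return kept, levels[-1]   # still too many models even at the loosest level

# Stand-in for a real encompassing test: replays the outcomes reported in the text.
reported_removed = {0.01: {"ARIMA"}, 0.05: {"ARIMA", "SVM"}}

def reported_outcome(models, alpha):
    return [m for m in models if m not in reported_removed[alpha]]

cointegrated = ["Holts", "WINTER", "THETA", "ARIMA", "ARARMA", "NARX", "SVM"]
final, level = select_with_escalation(cointegrated, reported_outcome, levels=(0.01, 0.05))
print(final, level)  # -> ['Holts', 'WINTER', 'THETA', 'ARARMA', 'NARX'] 0.05
```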
This paper discusses single forecasting model selection in combination forecasting, first through the cointegration test and then through the encompassing test. The results show that forecasting accuracy improves to a certain extent after single model selection.
The cointegration test ensures that the fitted series of each single model and the real sample series have a consistent fluctuation trend, thus safeguarding the accuracy of the combination forecast as far as possible, especially for medium- and long-term forecasting. The encompassing test eliminates encompassed models: the premise of combining single models is that each model is built on a different information set and has a different model type. If one single model is encompassed by another, it loses its meaning in the combination model and only increases the burden of combination.
In the empirical analysis of the N999 sample series, the paper screens 12 alternative single models down to a final 5, and the forecasting accuracy is then calculated to demonstrate the effectiveness of single model selection.
The authors declare that there is no conflict of interests regarding the publication of this paper.