Research on Combinational Forecast Models for the Traffic Flow

To improve the prediction accuracy of traffic flow, this paper proposes two combinational forecast models based on GM, ARIMA, and GRNN. First, the paper proposes the concept of associate-forecast and a weight distribution method based on the reciprocal of the absolute percentage error, and then uses GM(1,1), ARIMA, and GRNN to establish a combinational model of highway traffic flow with fixed weight coefficients. Second, the paper proposes using a neural network to determine variable weight coefficients and establishes an Elman combinational forecast model based on GM(1,1), ARIMA, and GRNN, which integrates the three individual models. Lastly, the two combinational models are applied to highway traffic flow on Chongzun of China, and the experimental results verify their effectiveness compared with GM(1,1), ARIMA, and GRNN.


Introduction
Traffic flow forecasting is an important research topic in modern intelligent transportation, and accurate traffic prediction is a prerequisite for traffic control and planning [1]. Long-term traffic flow forecasting uses hours, days, months, or even years as the unit of time [2] and plays an important role in traffic forecasting. On the one hand, long-term traffic prediction contributes to the rational planning and construction of the road network. On the other hand, it helps the operating department carry out road maintenance and schedule construction progress in a timely manner [3].
Over the years, scholars have devoted themselves to research in this field and produced a series of predictive models. Kim and Hobeika established a real-time traffic flow forecast using an ARIMA model [4]. Lee and Fambro established a sub-ARIMA forecasting model for short-term traffic flow [5]. Yao and Cao used an ARIMA method to predict the traffic trend and analyzed the applicability of ARIMA with real data [6]. Brian and Demetsky used a neural network to forecast short-term traffic flow and showed that the method gave better results [7]. Dougherty and Cobbett established a prediction model with a BP neural network to predict urban traffic [8], and Dia established a neural network forecasting model with time delay [9]. Ma et al. used BP (Backpropagation) and RBF (Radial Basis Function) networks to establish a dynamic traffic flow forecast model and proposed data preprocessing methods and a forecast model evaluation [10]. Tiefeng used an improved genetic neural network model for urban traffic flow prediction [11], and the experiment confirmed its effectiveness. Guo et al., who considered delay and nonlinearity, used the GM(1, 1 | $\tau$, $r$) model to predict urban road traffic flow [12].
In fact, the forecast methods mentioned above have their own advantages and disadvantages. After all, the predictive ability of a single forecast model is limited. If the models can be effectively combined so that they complement one another, the prediction precision can be improved greatly. A scientifically and rationally constructed combinational forecast model can largely avoid the adverse effects of relying on a single model and exploit the respective advantages of its components. Bates and Granger first proposed a combinational forecast model [13], which has become an important development in forecast technology. Li et al. used a dynamic weight combination of the historical average, Euclidean distance, and dynamic time warping distance to predict the intersection traffic flow in Xiamen Lotus [14]. Baochun et al. proposed an adjustable parameter genetic gray system theory to predict short-term traffic flow [15]. Zhuoping and Yuxian established a variable weight combinational model for railway cargo traffic and obtained better prediction accuracy [16]. Zhang and Wang proposed a combinational forecast model of the genetic algorithm and the time delay neural network [17]. In addition, [18][19][20][21][22][23][24][25], respectively, merged data mining technology, rough set theory, fuzzy logic, support vector machines, particle swarm optimization, the ant colony algorithm, and the chaos algorithm into forecast models to predict the traffic flow.
From the above, we can see that the combinational model is an effective way to improve the accuracy of the traffic flow forecast. Motivated by a practical long-term traffic flow problem, we propose two fusion forecast models of GM, ARIMA, and GRNN. The first model is a fixed weight coefficient combination established on the three individual models. After using the associate-forecast and the reciprocal absolute percentage error to allocate fixed weights to the different models, we build a combinational forecast model and output the final forecast results. The second model is a variable weight coefficient combination based on the Elman neural network, in which GM(1, 1), ARIMA, and GRNN are integrated to establish a variable weight combination. By comparing the experimental results of the two combinational prediction models with those of the three individual models, the rationality and accuracy of the models in this paper are verified.

GM(1,1)
In 1986, Julong proposed gray system theory [26], which takes uncertain systems with small samples and poor information as its research object and extracts valuable information by processing and extending the partially known information. By uncovering the underlying regularities in the data, it aims at a correct understanding and effective control of the system behavior. With the help of concepts such as space and smooth discrete functions, gray system theory establishes a differential dynamic gray model from a discrete data sequence, which is called GM (Grey Model).
The modeling mechanism of GM is as follows: (1) The original irregular data are summed into a generated sequence by an accumulated generating operation.
(2) The values produced by GM must be restored by the inverse accumulated generating operation before they can be used.

GM($m$, $n$) is a GM model with an $m$-order equation and $n$ variables, which is applied to the analysis of the dynamic relationship between variables but is not suitable for forecasting. GM(1,1), on the other hand, is a system model that contains a single variable and a first-order differential equation and is applicable to predicting future changes according to the previous values of the single variable. It is the most common of all the GM models.
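(1) Let the original sequence be $X^{(0)} = \{x^{(0)}(1), x^{(0)}(2), \ldots, x^{(0)}(n)\}$ and let its accumulated sequence be $X^{(1)} = \{x^{(1)}(1), x^{(1)}(2), \ldots, x^{(1)}(n)\}$, where $x^{(1)}(k) = \sum_{i=1}^{k} x^{(0)}(i)$, $k = 1, 2, \ldots, n$.
(2) Then there is a first-order albino (whitening) differential equation with a single variable:
$$\frac{dx^{(1)}(t)}{dt} + a x^{(1)}(t) = u, \tag{1}$$
where $a$ and $u$ are undetermined coefficients: $a$ is the development coefficient and $u$ is the amount of grey. They can be obtained by the least square method; namely,
$$\hat{A} = [a, u]^{T} = (B^{T} B)^{-1} B^{T} Y, \quad B = \begin{bmatrix} -\frac{1}{2}\left(x^{(1)}(1) + x^{(1)}(2)\right) & 1 \\ -\frac{1}{2}\left(x^{(1)}(2) + x^{(1)}(3)\right) & 1 \\ \vdots & \vdots \\ -\frac{1}{2}\left(x^{(1)}(n-1) + x^{(1)}(n)\right) & 1 \end{bmatrix}, \quad Y = \left[x^{(0)}(2), x^{(0)}(3), \ldots, x^{(0)}(n)\right]^{T}. \tag{2}$$
(3) From the above result, the cumulative value of the sequence is obtained:
$$\hat{x}^{(1)}(k+1) = \left(x^{(0)}(1) - \frac{u}{a}\right) e^{-ak} + \frac{u}{a}, \quad k = 0, 1, 2, \ldots \tag{3}$$
(4) Finally, the prediction value is obtained by a reduction process:
$$\hat{x}^{(0)}(k+1) = \hat{x}^{(1)}(k+1) - \hat{x}^{(1)}(k). \tag{4}$$
GM(1,1) does not require large samples or data that follow a particular distribution. It can complete a forecast satisfactorily with only a small amount of data. Because its prediction curve is monotonic and smooth, it is suitable for dynamic prediction of short time series with few data and moderate volatility. When the amount of data is large and the series has strong stochastic volatility, the prediction error is often big and the prediction accuracy tends to be low.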
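The four GM(1,1) steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the gray system modeling software used in the experiments; the function names `gm11_fit` and `gm11_forecast` are hypothetical, the symbols `a` and `u` follow formulas (1)-(4), and the least squares problem of formula (2) is solved through its 2x2 normal equations rather than an explicit matrix inverse.

```python
import math


def gm11_fit(x0):
    """Estimate the development coefficient a and the amount of grey u."""
    n = len(x0)
    if n < 3:
        raise ValueError("GM(1,1) needs at least three observations")
    # Step (1): accumulated sequence x1(k) = x0(1) + ... + x0(k).
    x1, total = [], 0.0
    for value in x0:
        total += value
        x1.append(total)
    # Background values z(k) = 0.5 * (x1(k-1) + x1(k)), the first column of -B.
    z = [0.5 * (x1[k - 1] + x1[k]) for k in range(1, n)]
    y = x0[1:]
    # Step (2): least squares for y = -a * z + u via the normal equations.
    m = n - 1
    sz, sy = sum(z), sum(y)
    szz = sum(v * v for v in z)
    szy = sum(zi * yi for zi, yi in zip(z, y))
    det = m * szz - sz * sz
    a = (sz * sy - m * szy) / det
    u = (szz * sy - sz * szy) / det
    return a, u


def gm11_forecast(x0, steps):
    """Forecast the next `steps` values after the end of x0 (assumes a != 0)."""
    a, u = gm11_fit(x0)
    n = len(x0)

    def x1_hat(k):
        # Step (3): time response; index k gives the estimate of x1(k + 1).
        return (x0[0] - u / a) * math.exp(-a * k) + u / a

    # Step (4): restore the raw values by differencing the cumulative estimates.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


if __name__ == "__main__":
    series = [100.0 * 1.1 ** k for k in range(8)]  # hypothetical smooth growth
    print(gm11_forecast(series, 2))
```

On a smooth, monotonically growing series such as the hypothetical one above, the forecast stays close to the true continuation, which matches the remark that GM(1,1) suits short series with low volatility.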

ARIMA
ARIMA (Autoregressive Integrated Moving Average), proposed by Box and Jenkins, is a time series forecast method [27] that is also known as the Box-Jenkins model. In ARIMA($p$, $d$, $q$), AR denotes autoregression and $p$ is the number of autoregressive terms; MA denotes moving average and $q$ is the number of moving average terms; $d$ is the number of times the series is differenced to become stationary.
The data processed by an ARIMA model must be stationary; that is, the mean and covariance of the sequence do not change over time. The data sequence is first tested for stationarity with methods such as a plot inspection, autocorrelation and partial autocorrelation function tests, a run test, a characteristic root test, ADF, and KPSS. If the data sequence is not stationary, it must be made stationary by methods such as differencing or logarithmic differencing.
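After the data are made stationary, ARMA (Autoregressive and Moving Average) is used to fit them. It is a mixture of AR (Autoregressive) and MA (Moving Average) in the form of a difference equation. For a normal, stationary, zero-mean time sequence $\{x_t\}$, if $x_t$ relies not only on the historic values of the previous $p$ steps, $x_{t-1}, x_{t-2}, \ldots, x_{t-p}$, but also on the interferences of the previous $q$ steps, $\varepsilon_{t-1}, \varepsilon_{t-2}, \ldots, \varepsilon_{t-q}$ ($p, q = 1, 2, \ldots$), we can get the general ARMA model according to multiple linear regression:
$$x_t = \varphi_1 x_{t-1} + \varphi_2 x_{t-2} + \cdots + \varphi_p x_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \theta_2 \varepsilon_{t-2} - \cdots - \theta_q \varepsilon_{t-q}, \tag{5}$$
where $\varepsilon_t \sim \mathrm{NID}(0, \sigma_{\varepsilon}^{2})$. Formula (5) represents a model with a $p$-order AR part and a $q$-order MA part, recorded as ARMA($p$, $q$). $p$ and $q$ denote the orders of the AR and MA components, and $\varphi_i$ ($i = 1, 2, \ldots, p$) and $\theta_j$ ($j = 1, 2, \ldots, q$) denote, respectively, the parameters of each part.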
The general modeling steps of ARMA are as follows.
(1) Model Identification and Order. The appropriate model type is usually judged from the ACF (autocorrelation function) and the PACF (partial autocorrelation function), from which the values of $p$ and $q$ are then calculated. ACF and PACF give only a preliminary determination of the orders and a set of candidate models, from which the best model must then be chosen. If a regression coefficient passes the $t$-test, its effect is significant. Otherwise, some coefficients should be removed according to the actual situation, although not necessarily all of them, and the characteristic roots need to be tested further.
(2) Parameter Estimation and Adaptation Test. After the orders are initially fixed, the mean, the variance, and all the values of $\varphi_i$ and $\theta_j$ need to be determined for further model application. Next, we must determine whether the model properly describes the given time series, namely, the adaptive test of the model; in essence, this tests whether the residual sequence $\{\varepsilon_t\}$ is white noise.
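The identification step can be illustrated with a small sketch that differences a series and computes its sample ACF and PACF. It is not the EViews procedure used in the experiments: the function names are hypothetical, the ACF is the usual biased sample estimate, and the PACF is obtained with the Durbin-Levinson recursion, which is one common choice among several.

```python
def difference(series, d=1):
    """Apply first-order differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [out[i] - out[i - 1] for i in range(1, len(out))]
    return out


def sample_acf(series, max_lag):
    """Sample autocorrelations for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(v * v for v in dev)
    if c0 == 0.0:
        raise ValueError("constant series has no defined autocorrelation")
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def sample_pacf(series, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    rho = sample_acf(series, max_lag)
    pacf = [1.0]
    prev = []  # AR coefficients of the order k-1 fit
    for k in range(1, max_lag + 1):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)] + [phi_kk]
        pacf.append(phi_kk)
    return pacf


if __name__ == "__main__":
    raw = [3.0, 5.0, 4.0, 8.0, 7.0, 11.0, 10.0, 14.0, 13.0, 17.0]  # hypothetical data
    stationary = difference(raw, d=1)
    print(sample_acf(stationary, 3))
    print(sample_pacf(stationary, 3))
```

Lags at which the PACF and the ACF become negligible give the preliminary values of $p$ and $q$, which are then refined by comparing candidate models.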
ARIMA can be used for both short-term and long-term forecasts, and it has a smaller prediction error than AR. When the amount of data is sufficient, it has relatively high prediction accuracy. However, its parameter estimation is somewhat complex, and when the amount of data is insufficient and the regularity is weak, the large amount of computation results in an obvious time delay, which decreases the prediction accuracy.

GRNN
GRNN (generalized regression neural network), proposed in 1991 by D. F. Specht, an American scholar, is a branch of the RBF network. Based on nonparametric regression, it has a strong nonlinear mapping ability and is well suited to curve-fitting problems. Compared with RBF, GRNN has advantages in approximation ability and learning speed, and its training process is also more convenient. Therefore, it has been widely used in decision-making control systems, signal processing, finance, energy, and other fields [28,29].

The Structure of GRNN.
GRNN is composed of four layers, namely, the input layer, the hidden layer, the summation layer, and the output layer. When the network input

GRNN Modeling Process.
The input layer receives the input samples; its number of neurons is equal to the dimension of the input vector in the learning samples, and it passes the input variables directly to the hidden layer. The number of neurons in the hidden layer is equal to the number of learning samples $n$, and the neuron transfer function is described by formula (6), where $X$ is the network input variable, $X_i$ is the learning sample corresponding to the $i$th neuron, and $\sigma$ is the smooth factor. Neurons in the summation layer are partitioned into two types. The first type computes the arithmetic sum of the hidden-layer outputs (formula (7)) and the second type computes their weighted sum (formula (8)), where $y_{ij}$ is the connection weight between the $i$th neuron in the hidden layer and the $j$th one in the summation layer. The number of neurons in the output layer is equal to the dimension $k$ of the output vector in the learning samples, and the estimated values are obtained in this layer by dividing the second type of summation neurons by the first type (formula (9)):
$$p_i = \exp\left[-\frac{(X - X_i)^{T} (X - X_i)}{2\sigma^{2}}\right], \quad i = 1, 2, \ldots, n, \tag{6}$$
$$S_D = \sum_{i=1}^{n} p_i, \tag{7}$$
$$S_{Nj} = \sum_{i=1}^{n} y_{ij} p_i, \quad j = 1, 2, \ldots, k, \tag{8}$$
$$\hat{y}_j = \frac{S_{Nj}}{S_D}, \quad j = 1, 2, \ldots, k. \tag{9}$$

Figure 1: The structure of GRNN (input layer, hidden layer, summation layer, and output layer).

For generalized regression neural networks, once the learning samples are identified, the network structure and the connection weights between neurons are also determined. Therefore, only the smooth factor needs to be adjusted manually in the training process to get the regression estimation results [30].
Similar to other neural network models, GRNN is able to discover hidden rules in large data sets and to integrate this learned information into the linking weights between neurons, but its prediction accuracy is poor when the training samples are few.
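The layer-by-layer computation of GRNN can be sketched as follows. This is a minimal illustration under stated assumptions rather than the MATLAB implementation used in the experiments: the function name `grnn_predict` is hypothetical, the hidden-layer output is taken as a Gaussian of the squared Euclidean distance with smooth factor `sigma`, and the connection weights of the summation layer are taken to be the training targets.

```python
import math


def grnn_predict(train_x, train_y, x, sigma):
    """Estimate the output vector for input x from the learning samples."""
    # Hidden layer: one neuron per learning sample.
    p = []
    for xi in train_x:
        dist2 = sum((a - b) ** 2 for a, b in zip(x, xi))
        p.append(math.exp(-dist2 / (2.0 * sigma ** 2)))
    # Summation layer, first type: arithmetic sum of the hidden outputs.
    s_d = sum(p)
    if s_d == 0.0:
        raise ValueError("sigma is too small: every hidden output underflowed")
    # Summation layer, second type, divided by the first type in the output layer.
    dims = len(train_y[0])
    return [
        sum(pi * yi[j] for pi, yi in zip(p, train_y)) / s_d
        for j in range(dims)
    ]


if __name__ == "__main__":
    xs = [[0.0], [1.0], [2.0]]   # hypothetical learning inputs
    ys = [[0.0], [2.0], [4.0]]   # hypothetical learning targets
    print(grnn_predict(xs, ys, [1.5], sigma=0.5))
```

Because the network is fixed once the learning samples are stored, the only quantity left to tune is `sigma`: a small value makes the estimate follow the nearest sample, while a large value pulls it toward the mean of all targets.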

The Combinational Forecast Model Based on GM(1,1), ARIMA, and GRNN
An individual prediction model is often one-sided, so combining different forecast models in a proper form makes comprehensive use of the information each of them provides and lets the models complement one another, which effectively improves the prediction accuracy; this is the concept of combinational forecast [31]. Due to the influence of various factors such as weather, holidays, commercial and industrial layout, and social and economic activities, the traffic flow has strong uncertainty and complexity. Therefore, it is difficult to obtain a satisfactory prediction from a single mathematical model. According to how the weights are determined, combinational forecasts can be divided into fixed weight and variable weight forecasts. This paper proposes a fixed weight and a variable weight combinational forecast based on GM(1, 1), ARIMA, and GRNN, and the experimental results indicate that they are effective compared with the three individual models. The fixed weight combinational forecast model is as follows:
$$\hat{y}(t) = \sum_{i=1}^{m} w_i \hat{y}_i(t), \tag{10}$$
where $m$ is the number of individual models, $w_i$ and $\hat{y}_i(t)$ are the weight and prediction value of the $i$th individual model, respectively, and
$$\sum_{i=1}^{m} w_i = 1, \quad w_i \geq 0. \tag{11}$$
The selection of weights in a combinational forecast model is extremely important. A reasonable weight distribution improves the prediction accuracy, whereas an unreasonable one reduces it. Common weight-selection methods mainly include nonoptimal weighting and optimal weighting.
The nonoptimal weighting method is often based on error statistics and includes the arithmetic mean method, the inverse of the variance, the standard deviation, and other methods. The arithmetic mean method distributes the weights evenly and thus treats all models equally. Though it is easy to calculate, it ignores the differing importance of the prediction models, and therefore the prediction accuracy decreases.
The optimal weighting method constructs an objective function according to certain optimality criteria and minimizes it under some constraint conditions. This method has higher prediction accuracy, but the calculation is more complicated because it requires solving a linear or nonlinear programming problem.
Considering the advantages and disadvantages of these methods, we present a method to calculate the weights by associate-forecast mean absolute percentage error, which is described below.
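(1) Associate-forecast. Each individual model is first used to forecast the most recent observations whose real values are already known, and the mean absolute percentage error of this associate-forecast is calculated:
$$e_i = \frac{1}{h} \sum_{t=1}^{h} \left| \frac{y(t) - \hat{y}_i(t)}{y(t)} \right| \times 100\%, \tag{12}$$
where $y(t)$ is the real value, $\hat{y}_i(t)$ is the associate-forecast value of the $i$th model, and $h$ is the number of associate-forecast points.
(2) The greater the error of a model, the smaller its weight. Therefore, in the combinational model, the weights are calculated in the form of the reciprocal; namely,
$$w_i = \frac{1/e_i}{\sum_{j=1}^{3} 1/e_j}, \quad i = 1, 2, 3. \tag{13}$$
(3) The final combinational forecast model composed of GM(1, 1), ARIMA, and GRNN is as follows:
$$\hat{y}(t) = w_{\mathrm{GM}} \hat{y}_{\mathrm{GM}}(t) + w_{\mathrm{ARIMA}} \hat{y}_{\mathrm{ARIMA}}(t) + w_{\mathrm{GRNN}} \hat{y}_{\mathrm{GRNN}}(t). \tag{14}$$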
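The reciprocal weighting rule and the fixed weight combination can be illustrated with a short sketch. The function names and all numbers below are hypothetical and are not the experimental values of this paper; the sketch only assumes that each model's associate-forecast error is already available as a positive percentage.

```python
def reciprocal_weights(errors):
    """Weights proportional to 1 / error, normalized to sum to 1."""
    if any(e <= 0 for e in errors):
        raise ValueError("errors must be positive")
    inverse = [1.0 / e for e in errors]
    total = sum(inverse)
    return [v / total for v in inverse]


def combine(forecasts, weights):
    """Weighted sum of per-model forecast lists of equal length."""
    horizon = len(forecasts[0])
    return [
        sum(w * f[t] for w, f in zip(weights, forecasts))
        for t in range(horizon)
    ]


if __name__ == "__main__":
    # Hypothetical associate-forecast errors (in percent) of three models.
    weights = reciprocal_weights([10.0, 20.0, 20.0])
    # Hypothetical two-step forecasts of the same three models.
    forecasts = [[100.0, 200.0], [120.0, 180.0], [80.0, 220.0]]
    print(weights)
    print(combine(forecasts, weights))
```

A model whose associate-forecast error is half that of another receives twice its weight, so a poor model is down-weighted rather than discarded.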

Elman Neural Network.
A typical Elman neural network is composed of the input layer, the hidden layer, the link layer, and the output layer. The connection of the input, hidden, and output layers is similar to that of a feedforward network. The units in the input layer only transmit signals, and the transfer function of the hidden layer can be linear or nonlinear. The link layer, also known as the context or state layer, memorizes the output of the hidden units at the previous time step and feeds it, together with the network input, into the hidden layer at the next time step.
The nonlinear state space expression of the Elman neural network is
$$x(k) = f\left(w^{\mathrm{in}} u(k-1) + w^{c} x(k-1)\right),$$
$$y(k) = g\left(w^{\mathrm{out}} x(k)\right),$$
where $k$ is the training step of the neural network, $u$ is the input vector, $x$ is the output vector of the hidden layer neurons, and $y$ is the output vector. $w^{\mathrm{in}}$, $w^{c}$, and $w^{\mathrm{out}}$ denote the connection weight matrices from the input layer to the hidden layer, from the link layer to the hidden layer, and from the hidden layer to the output layer, respectively. $f(\cdot)$ is the transfer function of the hidden layer and $g(\cdot)$ is that of the output layer.
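One forward step of this recurrence can be sketched as follows. The sketch is an assumption-laden illustration rather than the MATLAB network used in the experiments: the function name `elman_step` is hypothetical, the hidden transfer function is taken to be tanh, the output transfer function is taken to be linear, and bias terms are omitted.

```python
import math


def elman_step(u, x_prev, w_in, w_c, w_out):
    """Return (hidden state, output) for one time step of an Elman network."""
    hidden = []
    for i in range(len(w_in)):
        # Input contribution plus the link-layer copy of the previous hidden state.
        s = sum(w_in[i][j] * u[j] for j in range(len(u)))
        s += sum(w_c[i][j] * x_prev[j] for j in range(len(x_prev)))
        hidden.append(math.tanh(s))
    output = [
        sum(w_out[o][i] * hidden[i] for i in range(len(hidden)))
        for o in range(len(w_out))
    ]
    return hidden, output


if __name__ == "__main__":
    # Hypothetical weights: 3 inputs (one per individual model), 2 hidden units, 1 output.
    w_in = [[0.2, 0.1, -0.1], [0.05, 0.3, 0.1]]
    w_c = [[0.1, 0.0], [0.0, 0.1]]
    w_out = [[0.7, 0.4]]
    state = [0.0, 0.0]
    for inputs in ([0.5, 0.4, 0.6], [0.6, 0.5, 0.7]):
        state, y = elman_step(inputs, state, w_in, w_c, w_out)
        print(y)
```

Carrying `state` from one call to the next plays the role of the link layer, which is what lets the effective weight of each input vary with the recent history.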

Elman Combinational Model Based on GM(1,1), ARIMA, and GRNN.
The structure of the Elman variable weight combinational forecast model based on GM(1, 1), ARIMA, and GRNN is shown in Figure 2. Suppose there are $N = n_1 + n_2$ data in the original traffic sequence, where $n_1$ is the number of training samples and $n_2$ is that of test samples. GM(1, 1), ARIMA, and GRNN are applied separately to the $n_1$ training samples for multistep forecast; their prediction results are taken as the training inputs of Elman and the corresponding original data as the desired output. Then the $n_2$ test samples are fed to the trained Elman network, and its output is the final prediction result.

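Evaluation Indicators
The models are evaluated with the mean absolute percentage error (MAPE), the root mean square error (RMSE), and the equal coefficient (EC):
$$\mathrm{MAPE} = \frac{1}{n_2} \sum_{t=1}^{n_2} \left| \frac{y_{\mathrm{sim}}(t) - y_{\mathrm{real}}(t)}{y_{\mathrm{real}}(t)} \right| \times 100\%,$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n_2} \sum_{t=1}^{n_2} \left(y_{\mathrm{sim}}(t) - y_{\mathrm{real}}(t)\right)^{2}},$$
$$\mathrm{EC} = 1 - \frac{\sqrt{\sum_{t=1}^{n_2} \left(y_{\mathrm{sim}}(t) - y_{\mathrm{real}}(t)\right)^{2}}}{\sqrt{\sum_{t=1}^{n_2} y_{\mathrm{sim}}(t)^{2}} + \sqrt{\sum_{t=1}^{n_2} y_{\mathrm{real}}(t)^{2}}},$$
where $y_{\mathrm{sim}}$ represents the predicted value of the traffic flow, $y_{\mathrm{real}}$ indicates the real value, and $n_2$ is the total number of the test samples. The smaller the MAPE and RMSE the better, while the bigger the EC the better [32].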
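The three indicators can be computed with a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, MAPE and RMSE follow their standard definitions, and EC is assumed here to be the equal coefficient, that is, one minus the root of the summed squared errors divided by the sum of the root sums of squares of the predicted and real series.

```python
import math


def mape(real, sim):
    """Mean absolute percentage error, in percent (real values must be nonzero)."""
    return 100.0 * sum(abs(s - r) / abs(r) for r, s in zip(real, sim)) / len(real)


def rmse(real, sim):
    """Root mean square error."""
    return math.sqrt(sum((s - r) ** 2 for r, s in zip(real, sim)) / len(real))


def ec(real, sim):
    """Equal coefficient (assumed definition); 1 means a perfect fit."""
    err = math.sqrt(sum((s - r) ** 2 for r, s in zip(real, sim)))
    scale = math.sqrt(sum(s * s for s in sim)) + math.sqrt(sum(r * r for r in real))
    return 1.0 - err / scale


if __name__ == "__main__":
    real = [100.0, 200.0, 300.0]  # hypothetical observations
    sim = [110.0, 190.0, 310.0]   # hypothetical predictions
    print(mape(real, sim), rmse(real, sim), ec(real, sim))
```

MAPE is scale free, RMSE penalizes large individual errors, and EC summarizes the overall closeness of the two series, so the three indicators complement one another.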

Experiments and Analysis
To analyze the effect of the combinational forecast models, this paper takes 54 monthly highway traffic observations on Chongzun of China from January 2009 to June 2013 as the experimental data and conducts a six-step prediction. The initial data are shown in detail in Figure 3. EViews 6 [33], the gray system theory modeling software 3.0, and MATLAB R2012b are used in these experiments.

Calculations of Three Individual Models and the Combinational Models
(1) GM(1, 1) Forecast. As seen from Figure 3, the annual export traffic increases gradually, so we choose GM(1, 1) to set up an individual model. The initial amount of data required by the GM(1, 1) model is not big, so we take the first 42 items of the initial sequence as $X^{(0)}$.
(2) ARIMA Forecast. The autocorrelation and partial autocorrelation functions are shown in Figure 4, which indicates that the raw data are nonstationary. After first-order differencing, the data are stationary. The autocorrelation and partial autocorrelation functions of the differenced sequence are shown in Figure 5, which shows that the PACF tends to 0 soon after lag 8 and so does the ACF after lag 7, so $p = 8$ and $q = 7$. In examining the candidate models, the smaller the MAPE the better and the smaller the TIC (Theil Inequality Coefficient) the better. Based on the above information, and because a lower-order ARMA model can be used instead of a higher-order MA model that is difficult to compute, ARMA(2, 1) is selected as the best predictive model.
Here the unit root test and the $Q$-statistic test are used to check for white noise. The unit root test is shown in Figure 6, and all the inverse roots lie inside the unit circle. The result of the $Q$-statistic test is shown in Figure 7. Using this model to make associate-forecasts for items 43-48, the MAPE is 21.077%.
(3) GRNN Forecast. We take the real traffic values at times $t-6$, $t-5$, $t-4$, $t-3$, $t-2$, and $t-1$ as the traffic flow factors at time $t$; that is, the number of input nodes of GRNN is 6 and the real value at time $t$ is the desired output. Because the sample is small, thirty training samples and six test samples are generated in the form of a rolling arrangement. Then a GRNN neural network is established to predict the traffic flow from item 43 to item 48. In the experiment, the smooth factor, spread, increases gradually from 0.1 to 10 with step 0.5, and the final choice of spread is 1 according to the smallest MAPE. With this setting, the prediction results of GRNN are 833966, 804699, 879517, 804699, 717632, and 879517 and the average error is 23.158%.
(4) The Fixed Weight Combinational Forecast. Based on the above calculation results, the associate-forecast average percentage errors of GM(1, 1), ARIMA(2, 1, 1), and GRNN are 8.966%, 21.077%, and 23.158%, respectively. According to formulas (12) and (13), it is obtained that $w_{\mathrm{GM}}$ = 54.668%, $w_{\mathrm{ARIMA}}$ = 23.469%, and $w_{\mathrm{GRNN}}$ = 21.359%, so the final fixed weight combinational forecast model is the weighted sum of the three individual forecasts with these coefficients.
(5) Elman Combinational Forecast. The associate-forecast values of GM(1, 1), ARIMA(2, 1, 1), and GRNN are fed into the Elman neural network as the input samples with the corresponding real values as the desired output, and their prediction values from item 49 to item 54 are used as the test samples; the Elman structure is shown in Figure 8. In this experiment, the maximum-minimum method is used for data normalization and the number of iterations is 1000. The training function is traingdx and the number of hidden layer nodes varies from 10 to 20 with an increment of 2. Based on the MAPE results, the final number of nodes in the hidden layer is 10. The final prediction results are 1136600, 1155800, 1136500, 1137400, 1136100, and 1136000.
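The rolling arrangement used to build the GRNN samples in step (3) can be sketched as follows. The function name `rolling_samples` and the toy series are hypothetical; the sketch only assumes that each sample pairs the six preceding values with the value that follows them, and it does not reproduce how the experiment split the samples into training and test sets.

```python
def rolling_samples(series, lags=6):
    """Build (inputs, targets) where each target is preceded by `lags` values."""
    inputs, targets = [], []
    for t in range(lags, len(series)):
        inputs.append(list(series[t - lags:t]))  # values at t-lags, ..., t-1
        targets.append(series[t])                # value at t
    return inputs, targets


if __name__ == "__main__":
    toy = list(range(1, 13))  # hypothetical series of 12 observations
    x, y = rolling_samples(toy, lags=6)
    for row, target in zip(x, y):
        print(row, "->", target)
```

A series of length $L$ yields $L - 6$ such samples, so a short series still provides a usable number of learning samples.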

Estimation Results.
Using GM(1, 1), ARIMA(2, 1, 1), GRNN, the fixed weight combinational forecast model, and the Elman one, respectively, the next six-step predictions are shown in Table 2 and the absolute percentage errors are listed in Table 3. As seen from Table 3, the minimum error rate of the fixed weight combinational forecast is 0.5655%, the maximum is 12.8971%, and the mean is just 3.461%, which indicates high prediction accuracy. Similarly, the minimum, maximum, and mean error rates of Elman are 0.1009%, 6.7957%, and 2.083%, respectively, which shows even higher prediction accuracy.
The evaluation of each model is shown in Table 4. As can be seen, both the MAPE and the RMSE of the fixed weight combination and of the Elman combination are lower than those of any individual forecast model, which indicates that the combinational models have better accuracy than any single one. Meanwhile, the EC of both combinational models is higher than that of any individual model, which also illustrates that both combinations are closer to the actual observations and have better prediction results. In short, whichever indicator is considered, the combinational forecasts are good choices. Furthermore, both the MAPE and the RMSE of the Elman combination are lower than those of the fixed weight combination, while the latter has a higher EC, which indicates that the former is more suitable for the traffic forecast. The Elman neural network learns from history and does not require a mathematical model to be designed. It dynamically adjusts the weight of every individual model and has high prediction accuracy.
The comparison of the traffic flow between the original series and each prediction model is shown in Figure 9 and the comparative graph further verifies two combinational models proposed in this paper can fit the actual data better.

Contrast of the Combinational Models.
As seen from Table 4, among the three individual models the prediction effect of GRNN is the worst, that of ARIMA(2, 1, 1) is in the middle, and that of GM(1, 1) is the best. However, is the combinational forecast of all three models the best? Can better prediction results be obtained if the worst individual model is removed?
To this end, this paper conducts a further experiment. First, the worst model, GRNN, is removed and GM(1, 1) and ARIMA(2, 1, 1) are used for the fixed weight and Elman combinational forecasts, respectively. Then ARIMA(2, 1, 1) is also removed and only GM is used for Elman prediction. The prediction percentage errors are shown in Figure 10. The comparison of average percentage errors is shown in Table 5, which includes the combinational forecast of the three individual models and the combinations without GRNN and without ARIMA(2, 1, 1) in turn. Although GM(1, 1) is the best and GRNN is the worst of the individual models, the combinational forecast results show that some useful information may be lost if the forecast model with the bigger error is simply given up. The precision of the fixed weight combination composed of GM(1, 1), ARIMA(2, 1, 1), and GRNN is higher than that of the fixed weight combination of GM(1, 1) and ARIMA(2, 1, 1) and much higher than that of the individual GM(1, 1). Similarly, the precision of the Elman combination of the three individual models is better than that of GM(1, 1) and ARIMA(2, 1, 1) and much better than that of GM-Elman.
In brief, by comparisons of all kinds of the combinational models, we can see that the proposed fixed weight and Elman combination of three models have higher accuracy than those of two models.Meanwhile, the accuracy of the combinational forecasts of two models is higher than that of any individual model.Of course, in these experiments, all the results of Elman combinational forecast are superior to the fixed weight combination in the same conditions.

Conclusions
This paper introduces the combinational models of GM(1, 1), ARIMA, and GRNN to predict highway traffic flow on Chongzun of China.The fixed weight combination takes the mean absolute percentage error of associate-forecast as the basic element of weight calculation.Elman variable weight one relies on neural network structure and learning algorithm automatically adjusting the weights and outputting the optimal prediction.As seen from the comparison of the simulation experiments, either combinational model has higher prediction accuracy than any individual one.Even if the prediction results are in the case of large differences between individual models, the combinational models can still obtain good prediction accuracy.
In short, the combinational model can overcome the shortcomings of various independent models and improve the prediction performance.Even if the prediction result of a model is not ideal, when combined with another good prediction model, it can also improve prediction effect rather than compromise prediction effect.Anyway, how to select the appropriate models for combination is still a question and how to distribute the weight of each individual model is worthy of thinking deeply.

Figure 9: The comparison of various models.

Figure 10: The error comparison of various combination models.

Table 1: Alternative model evaluation form.
Fitting analysis according to Table 1: the bigger the adjusted $R^2$ the better, while the smaller the AIC (Akaike Information Criterion) and SC (Schwarz Criterion) the better. Adjusted $R^2$ is the adjusted coefficient of determination and MAPE represents the mean absolute percentage error.

Table 3: Comparison of absolute percentage error.

Table 4: Indicating evaluation of all the models.

Table 5: Comparison of MAPE in different combinational models.