Short-Term Traffic Flow Prediction of Expressway: A Hybrid Method Based on Singular Spectrum Analysis Decomposition

Real-time expressway traffic flow prediction is an important research field in intelligent transportation, since it helps to guide and manage traffic flow in case of congestion. Based on the characteristics of traffic flow, this paper proposes a hybrid model, SSA-LSTM-SVR, to improve the forecasting accuracy of short-term traffic flow. Singular Spectrum Analysis (SSA) decomposes the traffic flow into one principal component and three random components; then, according to the different characteristics of these components, Long Short-Term Memory (LSTM) and Support Vector Regression (SVR) are applied to predict the principal component and the random components, respectively. By fusing the respective forecasts, SSA-LSTM-SVR obtains the final short-term predicted value. Experiments on the traffic flows of the Guizhou expressway in January 2016 show that the proposed SSA-LSTM-SVR model has lower prediction errors and higher accuracy and goodness of fit than the baselines. This illustrates that a hybrid model for traffic flow prediction based on component decomposition is more effective than a single model, since it can capture both the main regularity and the random variations of traffic flow.


Introduction
Expressway traffic flow prediction and management has long been an important research field in intelligent transportation [1]. Understanding how traffic flow changes and accurately predicting it in real time help in studying expressway capacity and in guiding and managing traffic flow, so as to relieve congestion, reduce fuel consumption, and lower the total social cost of travel [2]. Expressway traffic flow is affected by human travel behaviour, and the resulting series shows strong regularity, suddenness, and randomness. Currently, different prediction models, such as classical statistical models, machine learning models, and Artificial Neural Networks (ANN), predict short-term traffic flow with different accuracies by capturing different characteristics and change laws of the traffic flow [1]. The classical statistical models mainly include the Autoregressive Integrated Moving Average (ARIMA), Seasonal ARIMA (SARIMA) [3], and ARIMA hybrid models. To address SARIMA's need for a sound dataset for model building, Kumar and Vanajakshi [3] made the traffic flow stationary by differencing, then used the autocorrelation function, the partial autocorrelation function, and the maximum likelihood method to identify suitable parameters of the SARIMA model, and forecast traffic flow with limited data. Because ARIMA requires stationarity and autocorrelation in the time series, whereas traffic flow is highly variable, the ARIMA family is usually combined with nonlinear models for flow prediction. Ding et al. [4] incorporated dynamic volatility into short-term subway ridership forecasting and constructed four kinds of integrated ARIMA and GARCH models to model the mean part and the volatility part of short-term subway ridership. However, this method needs to identify the volatility characteristics and parameters of each subway station one by one.
Hou et al. [5] proposed an adaptive hybrid model in which ARIMA captures the linear laws and a Wavelet Neural Network captures the nonlinear changes. The outputs of the two individual models are then combined by fuzzy logic, and their weighted result is taken as the final predicted traffic flow. This hybrid model performs well in the data processing stage, but vehicles that are hard to identify are ignored. Support Vector Regression (SVR) has been widely employed in linear and nonlinear time series regression; however, choosing reasonable hyperparameters for SVR is difficult. To overcome this shortcoming, Cai et al. [6] employed the gravitational search algorithm (GSA) to search for optimal SVR parameters and achieved more accurate short-term traffic flow forecasting. SVR is generally combined with other models for short-term traffic flow forecasting. According to the different categories of passenger flows entering subway stations, Li et al. [7] determined the optimal time granularity and interval using Pearson's correlation coefficient and then used Empirical Mode Decomposition (EMD) and SVR (EMD-SVR) to predict the passenger flow for each station. This framework significantly improves passenger flow forecasting. However, it is difficult for the GSA and EMD approaches in [6, 7] to obtain uniform SVR hyperparameters because the volatility characteristics of traffic or passenger flow differ between stations. Feng et al. [8] explored both the randomness and nonlinearity of traffic flow and combined a polynomial kernel and a Gaussian kernel into an adaptive multi-kernel SVM (AMSVM), whose parameters are optimized by an adaptive particle swarm optimization algorithm. Spatial-temporal correlation information is incorporated into AMSVM to predict short-term traffic flow.
Multiple kernels and hybrid methods allow AMSVM to adapt better to the dynamic characteristics of traffic flow on urban roads and thus provide more accurate predictions, but the predicted values lag somewhat behind the real values. K-nearest neighbors (KNN)-based models, another common choice, forecast traffic flow by matching similar patterns in the historical data. Habtemichael and Cetin [9] gave more weight to the nearest neighbors, used a rank exponent to aggregate the candidates found by the enhanced KNN algorithm, and implemented short-term traffic forecasting. To explore the inherent spatial heterogeneity of city traffic and its unclear spatiotemporal dependencies, Cheng et al. [10] proposed an adaptive spatiotemporal KNN model (adaptive STKNN) for short-term traffic forecasting. Adaptive STKNN determines the sizes of the spatial neighborhoods and the lengths of the time windows for traffic influence using cross-correlation and autocorrelation functions and then introduces adaptive spatiotemporal weights into the distance functions to optimize the candidate neighbor search. The results demonstrated that the adaptive STKNN model outperforms other models in all time periods, especially the peak period. However, when the query pattern does not appear in the historical data, KNN-based models suffer large forecasting errors.
Compared with the traditional intelligent models, ANN and Deep Neural Network (DNN)-based models can provide more accurate traffic prediction owing to their complex structure and nonlinear function approximation. Among them, the Long Short-Term Memory network (LSTM) and the Gated Recurrent Unit (GRU) perform well, as they can learn the inherent regularity of time series data and the linear and nonlinear correlations over both short and long distances. Tian et al. [11] presented a multiscale temporal smoothing model to infer missing data and used a revised LSTM approach to learn the prediction residual and forecast traffic flow. To exploit the impact of temporal features on prediction, Mou et al. [12] proposed a T-LSTM model that takes additional temporal information as input to improve the accuracy of short-term traffic flow prediction. Furthermore, space-time hybrid models extend traffic flow forecasting from a single station to the whole road network by considering the influence of the surrounding stations. Wu et al. [13] proposed a DNN-based traffic flow prediction model that makes full use of weekly and daily periodicity and automatically learns the importance of past traffic flow; a convolutional neural network and a GRU are then used to mine the spatial and temporal features of traffic flow. Bogaerts et al. [14] utilized a Graph Convolution Network [14] and a Graph Attention mechanism [15] to capture the spatial features of traffic, with LSTM used to obtain both the short-term and long-term laws of traffic flows. This combination improves the accuracy of traffic flow prediction. Although the DNN-based models perform well in traffic flow prediction, they are difficult to interpret and require a large amount of data and considerable time and memory to train [3].
To distinguish the regularities of traffic flow, Empirical Mode Decomposition (EMD) and Singular Spectrum Analysis (SSA) are applied to reduce the noise of the traffic flow sequence. Chen et al. [16] introduced Ensemble EMD (EEMD) and wavelets to suppress potential data outliers and then used LSTM to perform the traffic flow prediction. Kolidakis et al. [17] combined SSA with ANN to analyze time series and forecast road traffic volume. However, these component decomposition algorithms may eliminate some random elements of the traffic flow, which distorts the raw traffic flow.
Considering the complex characteristics of traffic flow, this paper proposes a hybrid model (SSA-LSTM-SVR) that decomposes the flow into different components and forecasts each with a model suited to its features. First, SSA is used to decompose the traffic flow into one principal component and multiple random components. LSTM is used to make a short-term prediction of the principal component because of its regularity and periodicity, and the SVR model is introduced to predict the random components. The prediction results of all components are superimposed to form the final predicted value. In addition, the Particle Swarm Optimization (PSO) algorithm [18] is applied to optimize the decomposition parameters of the SSA-LSTM-SVR model. The rest of the paper is organized as follows. Section 2 illustrates the methodology. Section 3 contains the experimental results. Section 4 discusses the experimental results in depth. Section 5 contains the conclusions and future work.
Advances in Civil Engineering

Singular Spectrum Analysis (SSA).
Singular Spectrum Analysis (SSA) [19] is usually applied to the component decomposition and noise reduction of time series signals. To find the regularities and random factors of expressway traffic flow, this paper introduces the SSA algorithm to decompose the trajectory matrix of short-term traffic flow into principal and random components. SSA mainly includes five steps: embedding, singular value decomposition (SVD), grouping, diagonal averaging, and component decomposition (subcomponent extraction).
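Step 1. Embedding. Given a traffic flow series of length $T$, $X(t) = [x_1, x_2, \ldots, x_T]$, and the embedding dimension $L$ ($2 < L < T/2$), $X(t)$ is arranged with time delay to obtain an $L \times K$ trajectory matrix:

$$X = \begin{bmatrix} x_1 & x_2 & \cdots & x_K \\ x_2 & x_3 & \cdots & x_{K+1} \\ \vdots & \vdots & \ddots & \vdots \\ x_L & x_{L+1} & \cdots & x_T \end{bmatrix},$$

where $K = T - L + 1$, and the embedding dimension $L$ is also called the window length of $X(t)$, usually an integer multiple of the period of $X(t)$.

Step 2. Singular Value Decomposition (SVD). SVD of $XX^T$ is performed to obtain $L$ eigenvalues $\lambda_1 \ge \lambda_2 \ge \lambda_3 \ge \cdots \ge \lambda_L \ge 0$ and the corresponding orthogonal eigenvectors $U_1, U_2, \ldots, U_L$. Let $d = \max\{i : \lambda_i > 0\}$ and $V_i = X^T U_i / \sqrt{\lambda_i}$; then

$$X = \sum_{i=1}^{d} X_i, \qquad X_i = \sqrt{\lambda_i}\, U_i V_i^T,$$

where $\sqrt{\lambda_i}$ is the singular value of $X$, $U_i$ is the left eigenvector, $V_i$ is the right eigenvector, and $(\sqrt{\lambda_i}, U_i, V_i)$ is the $i$th eigentriple of $X$.

Step 3. Grouping. Given the contribution rate of $X_i$ to $X$ as $\lambda_i / \sum_{j=1}^{d} \lambda_j$, the contribution rate of the top $r$ components $X_i$ to $X$ will be

$$\frac{\sum_{i=1}^{r} \lambda_i}{\sum_{i=1}^{d} \lambda_i}.$$

Step 4. Diagonal Averaging (Reconstruction Component). Compute the diagonal averages of each $X_i = (z_{pq})_{L \times K}$ and convert $X_i$ into a time series $R_i(t) = [r_1, r_2, r_3, \ldots, r_T]$ of length $T$. This process is called reconstruction component (RC), and the sum of the RCs is equal to the original sequence; namely,

$$X(t) = \sum_{i=1}^{d} R_i(t).$$

The $k$th member of $R_i(t)$ is the average of all members $z_{pq}$ of $X_i$ with $p + q = k + 1$. Since $L < K$, the details are as follows:

$$r_k = \begin{cases} \dfrac{1}{k} \displaystyle\sum_{p=1}^{k} z_{p,\,k-p+1}, & 1 \le k < L, \\[2ex] \dfrac{1}{L} \displaystyle\sum_{p=1}^{L} z_{p,\,k-p+1}, & L \le k \le K, \\[2ex] \dfrac{1}{T-k+1} \displaystyle\sum_{p=k-K+1}^{L} z_{p,\,k-p+1}, & K < k \le T. \end{cases}$$

Step 5. Component Decomposition. The reconstruction components obtained by diagonal averaging are summed in groups of consecutive components to form $\Gamma$ subcomponents:

$$X_k(t) = \sum_{j=i}^{i+n} R_j(t),$$

where $i + n \le d$ and $1 \le k \le \Gamma$. $X_k(t)$ is a feature series of the original time series $X(t)$; each one has its own special features, and they differ from each other. The sum of the $X_k(t)$ is equal to the original series; namely,

$$X(t) = \sum_{k=1}^{\Gamma} X_k(t). \tag{8}$$

To make the prediction of short-term traffic flow as close as possible to the ground truth, we first decompose the traffic flow into a principal component and several random components using SSA, where Particle Swarm Optimization (PSO) [19] is used to optimize the decomposition parameters of SSA. Then, according to their respective characteristics, the following sections discuss the prediction models suited to each kind of component.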
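The five steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `ssa_reconstructed_components` and `group_components`, the numerical rank threshold, and the way groups are passed in are all hypothetical choices made for the example.

```python
import numpy as np


def ssa_reconstructed_components(series, window):
    """Return the reconstructed components (RCs) and contribution rates.

    Follows Steps 1, 2, and 4: embedding, SVD, diagonal averaging.
    """
    x = np.asarray(series, dtype=float)
    T = x.size
    L = window
    K = T - L + 1
    # Step 1: L x K trajectory matrix; column j holds x[j : j + L].
    X = np.column_stack([x[j:j + L] for j in range(K)])
    # Step 2: SVD of X; s[i] equals sqrt(lambda_i) of X X^T.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Numerical rank d (the threshold is an assumption of this sketch).
    d = int(np.sum(s > 1e-10 * s[0]))
    rcs = np.zeros((d, T))
    for i in range(d):
        Xi = s[i] * np.outer(U[:, i], Vt[i])
        # Step 4: average every anti-diagonal of X_i. Flipping the rows
        # turns anti-diagonals into ordinary diagonals.
        flipped = Xi[::-1]
        for k in range(T):
            rcs[i, k] = flipped.diagonal(k - (L - 1)).mean()
    # Step 3: contribution rate lambda_i / sum(lambda).
    contribution = s[:d] ** 2 / np.sum(s[:d] ** 2)
    return rcs, contribution


def group_components(rcs, groups):
    """Step 5: sum the RCs of each index group into one subcomponent."""
    return np.array([rcs[list(g)].sum(axis=0) for g in groups])
```

For a series `x` and a window such as one day of 15 min intervals (`window=96`), `rcs.sum(axis=0)` reproduces `x` up to rounding, and `group_components(rcs, [[0], range(1, len(rcs))])` would split it into a leading component and a remainder. How the paper chooses the groups is governed by the PSO-optimized parameters and is not reproduced here.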

Long Short-Term Memory Neural Network (LSTM).
Since the principal component of expressway traffic flow decomposed by SSA reflects the periodicity and regularity of traffic flow well, LSTM is introduced to predict it. LSTM is an improvement of the Recurrent Neural Network (RNN) [20] that effectively alleviates the vanishing and exploding gradient problems of RNN.
As LSTM can capture both long-term and short-term dependencies of a time series, it has been widely applied in short-term traffic flow prediction. The hidden layer of LSTM is composed of several memory blocks, and each memory block contains a cell and three gates, namely, the input gate, the forget gate, and the output gate. The memory unit of LSTM is shown in Figure 1.

As Figure 1 shows, $x_t$ and $h_t$ represent the input and output at interval $t$, and $f_t$, $g_t$, and $o_t$ are the outputs of the forget gate, input gate, and output gate, respectively, each produced by its own mapping function.
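For reference, the textbook form of the LSTM memory block can be written with the gate names used above ($f_t$, $g_t$, $o_t$). The weight matrices $W$, the bias vectors $b$, the candidate state $\tilde{c}_t$, and the cell state $c_t$ are symbols assumed here for illustration; they are not taken from the source, whose exact notation may differ.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right), \\
g_t &= \sigma\left(W_g [h_{t-1}, x_t] + b_g\right), \\
o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right), \\
\tilde{c}_t &= \tanh\left(W_c [h_{t-1}, x_t] + b_c\right), \\
c_t &= f_t \odot c_{t-1} + g_t \odot \tilde{c}_t, \\
h_t &= o_t \odot \tanh(c_t).
\end{aligned}
```

Here $\sigma$ is the logistic sigmoid and $\odot$ is the element-wise product. The additive update of $c_t$ is what lets information, and gradients, persist over many time steps.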

Support Vector Regression (SVR).
As the stochastic components of expressway traffic flow have complex features and strong nonlinearity, the Support Vector Regression (SVR) algorithm is introduced to predict these components. As a widely used traffic flow prediction model, SVR maps the data points into a high-dimensional space and seeks the hyperplane that fits them as closely as possible to realize regression and prediction. For a sample set $\{(x_i, y_i),\ i = 1, 2, 3, \ldots, T\}$ of size $T$, in which $x_i$ and $y_i$ are the input and output values of the $i$th sample, the regression function of SVR is

$$f(x_i) = \langle w \cdot x_i \rangle + b,$$

where $\langle w \cdot x_i \rangle$ represents the inner product of the vectors $w$ and $x_i$, and $b$ is the offset. The loss function will be

$$L_\varepsilon(y_i, f(x_i)) = \max\left(0,\ |y_i - f(x_i)| - \varepsilon\right),$$

where the error $\varepsilon > 0$. $w$ and $b$ can be obtained by solving the minimized objective function

$$\min_{w,\,b}\ \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{T} L_\varepsilon(y_i, f(x_i)), \tag{12}$$

where $C > 0$ is the penalty factor. Among different kernel functions, the radial basis function (RBF) kernel is widely used because it handles nonlinearity well, and it is defined as

$$K(x_i, x_j) = \exp\left(-c\,\|x_i - x_j\|^2\right),$$

where $c > 0$. At the same time, Lagrange multipliers $\tau_i$ and $\tau_i^*$ are introduced to form the dual problem of equation (12). By solving the dual problem under the KKT conditions, $\tau_i$, $\tau_i^*$, and $b$ are obtained, and the regression function of SVR is described as

$$f(x) = \sum_{i=1}^{T} (\tau_i - \tau_i^*)\, K(x_i, x) + b.$$

2.4. SSA-LSTM-SVR Structure.
As shown in Figure 2, SSA decomposes the input traffic flow $X(t)$ into $\Gamma$ subcomponents with parameters $\mu_1, \mu_2, \ldots, \mu_\Gamma$, which are optimized by the PSO algorithm. According to formula (8), the sum of all components is equal to $X(t)$, and the sum of the $\mu_i$ is equal to 1; namely,

$$X(t) = \sum_{k=1}^{\Gamma} X_k(t), \qquad \sum_{k=1}^{\Gamma} \mu_k = 1.$$

The first, principal component $X_1(t)$ reflects the traffic volumes at different moments and the long-term regularities and periodicity of the traffic flow. As discussed in Section 2.2, LSTM can capture these main, long-term laws of the flow by iterating over the inputs, so LSTM is used to forecast $X_1(t)$. On the other hand, the remaining components, $X_2(t), \ldots, X_\Gamma(t)$, mainly contain the short-term changes and fluctuations of the traffic flow, and SVR is more sensitive to short-term changes; therefore, SVR is applied to predict each of the remaining random components. Finally, the corresponding prediction results of the different components, $Y_1(t), Y_2(t), \ldots, Y_\Gamma(t)$, are superimposed to form the predicted value $Y(t)$ of the original time series $X(t)$, expressed as

$$Y(t) = \sum_{k=1}^{\Gamma} Y_k(t).$$
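The superposition step can be illustrated with a short, self-contained sketch. It is only a schematic under stated assumptions: `make_lag_samples`, `fuse_forecasts`, `persistence`, and `recent_mean` are hypothetical names, and the two simple forecasters merely stand in for a trained LSTM (principal component) and trained SVR models (random components), which are not reproduced here.

```python
from typing import Callable, List, Sequence, Tuple

# A forecaster maps the history of one component to its next value.
Forecaster = Callable[[Sequence[float]], float]


def make_lag_samples(series: Sequence[float],
                     n_lags: int) -> Tuple[List[List[float]], List[float]]:
    """Build (previous n_lags values -> next value) training samples."""
    count = len(series) - n_lags
    inputs = [list(series[i:i + n_lags]) for i in range(count)]
    targets = [series[i + n_lags] for i in range(count)]
    return inputs, targets


def fuse_forecasts(components: Sequence[Sequence[float]],
                   principal_model: Forecaster,
                   random_model: Forecaster) -> float:
    """Y(t) = sum_k Y_k(t): the first component goes to principal_model,
    every remaining component goes to random_model."""
    forecasts = [principal_model(components[0])]
    forecasts += [random_model(c) for c in components[1:]]
    return sum(forecasts)


def persistence(history: Sequence[float]) -> float:
    """Placeholder for the LSTM: repeat the last observed value."""
    return history[-1]


def recent_mean(history: Sequence[float]) -> float:
    """Placeholder for SVR: mean of the last three values."""
    tail = list(history[-3:])
    return sum(tail) / len(tail)
```

With components `[[100, 110, 120], [1, -2, 3], [0.5, 0.5, -1]]`, `fuse_forecasts(..., persistence, recent_mean)` adds the three per-component forecasts into one value. `make_lag_samples` shows one common way to turn each component into supervised samples for either model; the lag length actually used in the paper is not stated in this section.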

Dataset Description and Performance Indicators.
The traffic flows of 347 toll stations in Guizhou province, China, in January 2016 are obtained. Figure 3 shows the positions of the 347 toll stations. The maximum daily average traffic volume reaches 14120, the minimum is 5, and the traffic volume varies greatly from site to site. According to their daily average traffic volumes, the toll stations are divided into three types, namely, less than 2000, 2000 to 6000, and above 6000. The traffic flows of 46 stations with different volumes (about 13.5% of the toll stations) over one month are selected as experimental data to evaluate the performance of our algorithm. The intervals are set to 5, 10, 15, 30, 45, and 60 minutes (min).
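Usually, mean absolute error (MAE), root mean square error (RMSE), mean absolute percentage error (MAPE), accuracy, and goodness of fit ($R^2$) are used to evaluate the performance of predictive models. The error measures and $R^2$ are defined as follows:

$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left|\hat{y}_i - y_i\right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left(\hat{y}_i - y_i\right)^2}, \qquad \mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left|\frac{\hat{y}_i - y_i}{y_i}\right| \times 100\%,$$

and

$$R^2 = 1 - \frac{\sum_{i=1}^{n} \left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n} \left(y_i - \bar{y}\right)^2},$$

where $\hat{y}_i$ and $y_i$ are the predicted value and the ground truth, $\bar{y}$ is the mean of $y_i$, and $n$ is the number of samples. Low prediction errors (MAE, RMSE, and MAPE) and a high accuracy and $R^2$ imply good prediction performance.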
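The indicators can be computed as in the following sketch, which uses the standard definitions of MAE, RMSE, MAPE, and $R^2$. The function name `evaluate` is hypothetical, MAPE is returned as a fraction rather than a percentage, and the accuracy is taken to be 1 − MAPE, which is an assumption of this example rather than a definition confirmed by the source.

```python
import math


def evaluate(y_true, y_pred):
    """Return MAE, RMSE, MAPE (as a fraction), accuracy, and R^2."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # MAPE is undefined when a ground-truth value is zero.
    mape = sum(abs(e) / abs(t) for e, t in zip(errors, y_true)) / n
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot
    return {
        "MAE": mae,
        "RMSE": rmse,
        "MAPE": mape,
        "accuracy": 1.0 - mape,  # assumed definition, see lead-in
        "R2": r2,
    }
```

For example, with ground truth `[100, 200, 400]` and predictions `[110, 190, 400]`, the absolute errors are 10, 10, and 0, so the MAE is 20/3 and the MAPE is (0.10 + 0.05 + 0)/3 = 0.05.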

Characteristics of Traffic Flows and SSA Decomposition.
Taking one day as the statistical period and 15 minutes (min) as the time interval, the traffic flows of Guiyang North Toll Station in the same historical periods in January 2016 are shown in Figure 4 as an example. The traffic flows vary from day to day, with relatively fixed change trends and laws as well as strong randomness. For PSO, the population is set to 20, the maximum number of iterations to 100, the initial and final weight factors to 0.9 and 0.2, both learning factors ($\zeta_1$ and $\zeta_2$) to 2, the minimum and maximum moving speeds to −1 and 1, and the objective function to the goodness of fit ($R^2$). PSO and SSA are applied to the 15 min traffic flows of Guiyang North Toll Station from January 25 to 31, 2016. The number of components and their weights are then obtained: $\mu_1 = 0.3$, $\mu_2 = 0.3$, $\mu_3 = 0.3$, and $\mu_4 = 0.1$. Figure 5 shows the original traffic flow and the four decomposed components. The principal component (the first one, in Figure 5(b)) is basically consistent with the magnitude and change trends of the original traffic flow, and its curve is relatively smooth with obvious change rules. The remaining three random components (Figures 5(c)-5(e)) have small magnitudes, large random disturbances, and no obvious change rules. These characteristics greatly increase the difficulty of prediction; however, their influence on the original traffic flow cannot be ignored, and they cannot be discarded as noise in traffic flow prediction.
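The PSO settings listed above can be made concrete with a small sketch. It is a generic particle swarm maximizer, not the authors' code: the function name `pso_maximize`, the linear decay of the weight factor from 0.9 to 0.2, the clamping of positions to the search bounds, and the fixed seed are assumptions of this example. In the paper's setting the objective would be the $R^2$ of the forecast as a function of the decomposition parameters; any callable can be plugged in here.

```python
import random


def pso_maximize(objective, dim, lower, upper, population=20, max_iter=100,
                 w_start=0.9, w_end=0.2, c1=2.0, c2=2.0,
                 v_min=-1.0, v_max=1.0, seed=0):
    """Maximize objective(position) over the box [lower, upper]^dim."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(population)]
    vel = [[rng.uniform(v_min, v_max) for _ in range(dim)]
           for _ in range(population)]
    best_pos = [p[:] for p in pos]            # personal best positions
    best_val = [objective(p) for p in pos]    # personal best values
    g = max(range(population), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g][:], best_val[g]  # global best
    for it in range(max_iter):
        # Weight factor decays linearly from w_start to w_end (assumed).
        w = w_start - (w_start - w_end) * it / max(max_iter - 1, 1)
        for i in range(population):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (best_pos[i][d] - pos[i][d])
                     + c2 * r2 * (g_pos[d] - pos[i][d]))
                # Clamp the moving speed and keep the particle in bounds.
                vel[i][d] = min(max(v, v_min), v_max)
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lower), upper)
            val = objective(pos[i])
            if val > best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val > g_val:
                    g_val, g_pos = val, pos[i][:]
    return g_pos, g_val
```

As a toy check, maximizing `-(x - 0.3)**2 - (y - 0.7)**2` over the unit square drives the swarm toward the point (0.3, 0.7). Because the global best is only replaced by strictly better values, the returned objective value never decreases over the iterations.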

Predictive Performance.
With these decomposition parameters, the traffic flows of Guiyang North Toll Station at 5 min, 10 min, 15 min, 30 min, 45 min, and 60 min intervals are selected to verify the prediction performance of the SSA-LSTM-SVR model. Five prediction models, ARIMA, SVR, BP, KNN, and LSTM, are introduced for comparative analysis. These five comparative models have different implementation mechanisms: they capture the characteristics and variation laws of time series data and forecast traffic flow in different ways. Among them, ARIMA, BP, and LSTM directly learn the temporal dependencies of the time series. ARIMA is more sensitive to short-term linear changes, whereas BP and LSTM can capture both the linear and nonlinear variations of the time series with long-term and short-term dependencies and are more robust and stable than ARIMA. On the other hand, SVR and KNN rely on hyperplane fitting and similar-pattern matching, respectively, to predict traffic flow.
As shown in Table 1, the MAE, RMSE, and MAPE of the SSA-LSTM-SVR model are all the lowest, and its goodness of fit $R^2$ and prediction accuracy are higher than those of the other models. These results indicate that the SSA-LSTM-SVR model better recognizes the short-term change rules of the traffic flow time series and has better prediction performance. In Table 2, the MAE, MAPE, and RMSE of the SSA-LSTM-SVR model are still the lowest, and its $R^2$ and prediction accuracy are also higher than those of ARIMA, SVR, KNN, BP, and LSTM. This indicates that the SSA-LSTM-SVR model generalizes well and predicts better than the other models. The MAPE of the SSA-LSTM-SVR model is reduced by at least 5.00% and by 6.33% on average, and the prediction accuracy is improved by at least 5.0% and by 6% on average.

Predictive Performance on
Let the time interval be 15 min. SVR, LSTM, and SSA-LSTM-SVR are selected to further observe the differences between the predicted values and the ground truth at three stations with different traffic volumes. As Figure 6 shows, the predicted values of SSA-LSTM-SVR fit the ground truth well in both off-peak and peak hours, and the model captures subtle changes of the time series, including downward and upward tendencies. The overall prediction errors of SVR and LSTM are relatively large, especially when the traffic flow changes suddenly. As the traffic flow declines and ascends, LSTM and SVR cannot capture the change tendencies in time.

Discussion
Any prediction model has its applicable scope, advantages, and disadvantages. For example, ARIMA is suitable for stable time series but has difficulty capturing nonlinear relationships in the series. When the query pattern does not appear in the historical database, KNN produces a large prediction error, and when the input time step of LSTM exceeds 32, its prediction performance declines. Therefore, reasonable model selection is the basis for accurate prediction. For expressway traffic flow with periodic, sudden, and random changes, it is difficult for a single model to capture all characteristics and make accurate short-term predictions. In our SSA-LSTM-SVR model, SVD first decomposes the trajectory matrix constructed from the traffic flow series, which is equivalent to finding and extracting the intrinsic autocorrelation relationships of the traffic flow sequence. Then, according to these correlations, SSA finds the inherent change regularities and random features in the sequence through grouping, reconstruction, and component decomposition. These operations decompose the traffic flow series into different components, which helps to highlight their respective regularities and characteristics and to weaken the mutual influence caused by mixing the components. To take full advantage of the features of the different components, SSA-LSTM-SVR forecasts them separately. LSTM is more suitable for the principal component, which is highly regular and smooth, while SVR is more suitable for the remaining components, which show strong random and sudden changes. Finally, by combining the respective predictions, the SSA-LSTM-SVR model achieves more accurate prediction performance that reflects the actual short-term changes of expressway traffic flow.
In SSA-LSTM-SVR, SSA implements component decomposition according to the linear autocorrelation of the traffic flow series, and LSTM and SVR capture the nonlinear relationships of the different components. Therefore, compared with the ARIMA, KNN, BP, SVR, and LSTM models, the proposed SSA-LSTM-SVR model can capture more features and accordingly achieves better prediction performance. With SSA-LSTM-SVR, the differences between the predicted values and the real values are the smallest, and the fit is the best.
The above experiments and discussion illustrate that it is difficult for a single model to fully capture the characteristics of the complicated expressway traffic flow sequence, and a hybrid model is a more reasonable choice.

Conclusion and Future Work
The real-time prediction of expressway traffic flow has long been an important topic in traffic management. Owing to the inherent complexity of expressway traffic flow, the prediction accuracy of a single model is limited; thus, a combined prediction model is proposed in this paper. First, according to its characteristics, the traffic flow is decomposed by SSA into one principal component and three random components. As the magnitude and change regularities of the principal component are basically consistent with the original traffic flow, with smoother changes, LSTM is applied to capture these features and make the short-term forecast. Since the remaining three random components have small magnitudes, large random disturbances, and no obvious change rules, SVR is introduced to forecast each of them. The estimates of all components are then added together to obtain the final forecast values. Experiments on the actual traffic flows of the Guizhou expressway show that, compared with the ARIMA, LSTM, KNN, BP, and SVR models, the SSA-LSTM-SVR model has better prediction performance: its differences between the predicted and real values are the smallest, and its goodness of fit is the best. Meanwhile, the MAPE of SSA-LSTM-SVR is reduced by at least 5.0% and by 6.33% on average, and the prediction accuracy is improved by at least 5.0% and by 6% on average.
From these findings, we suggest that if a time series has complex characteristics, such as regularity, suddenness, and randomness, a hybrid model may be the preferable choice for short-term prediction. This paper only considers the traffic flows of a single toll station. In future research, the interaction of traffic flows between toll stations in the expressway network will be further discussed, so as to achieve more accurate short-term traffic flow prediction.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.