A Hybrid Short-Term Traffic Flow Multistep Prediction Method Based on Variational Mode Decomposition and Long Short-Term Memory Model

Timely and accurate traffic prediction information is essential for advanced traffic management systems (ATMS) and advanced traveler information systems (ATIS). Because traffic flow is nonlinear, nonstationary, and random, short-term traffic flow prediction remains a challenging task. In this study, a hybrid short-term traffic flow multistep prediction method is proposed by combining the variational mode decomposition (VMD) algorithm and the long short-term memory (LSTM) model. Firstly, the VMD algorithm is employed to decompose the original traffic flow data into a series of intrinsic mode function (IMF) components. Secondly, different LSTM models are established to predict different IMF components. For each prediction model, one-step to three-step predictions are carried out. Finally, the component prediction results are aggregated to obtain the final traffic flow multistep prediction values. The prediction performance of the proposed hybrid model is investigated using inductive loop data measured from the north-south viaduct expressway in Shanghai. The experiment results show that (1) the VMD algorithm can effectively avoid the problems of endpoint effects and modal aliasing, and its decomposition effect is better than that of the empirical mode decomposition algorithm and the wavelet decomposition algorithm; (2) among all the involved methods, the proposed hybrid model is the most effective and robust in extracting trend information and has the best multistep prediction performance.


Introduction
High-precision traffic flow prediction is considered one of the most important parts of intelligent transportation systems. Accurate and timely traffic flow prediction information can not only provide strong support for travel decisions but also transform the traffic management mode. For travelers, dynamic short-term traffic prediction information enables them to plan their trips ahead of time and to adjust their trip mode and route in a timely manner. For traffic managers, short-term traffic flow prediction information enables them to take traffic control measures early enough to avoid congestion rather than to deal with traffic problems after congestion has already occurred [1]. Prediction up to 15 minutes ahead is typically called short-term prediction. However, owing to the randomness and volatility of traffic flow data, it is difficult to develop satisfactory short-term traffic flow prediction models. More and more hybrid prediction models have been proposed. For example, Hou et al. proposed an adaptive hybrid short-term traffic flow prediction method. Firstly, the linear autoregressive integrated moving average (ARIMA) model and the nonlinear wavelet neural network model were used to predict short-term traffic flow. Then, the prediction results of the two single models were analyzed by fuzzy logic, and the weighted result was taken as the final prediction [20]. Zhang et al. proposed a traffic flow multistep forecasting method by decomposing the data into three modeling components: an intraday or periodic trend obtained with the spectral analysis technique, a deterministic part modeled by the ARIMA model, and the volatility estimated by the GJR-GARCH model [21]. Moretti et al. provided a one-hour forecast of urban traffic flow rates by combining artificial neural networks and a simple statistical approach [22].
In recent years, with the rapid development of data science, data-driven models for short-term traffic flow prediction have attracted wide attention. Generally, these data-driven models mainly consist of data processing algorithms and machine learning algorithms. The data processing algorithms are usually used for data denoising [23,24], data decomposition, and data reconstruction [25,26], which can effectively extract the significant features of the original traffic flow data [27]. Among the various data processing algorithms, wavelet decomposition (WD) [28] and empirical mode decomposition (EMD) are the two mainstream algorithms for short-term traffic flow forecasting. Additionally, some improved algorithms have also been extensively utilized, such as wavelet packet decomposition (WPD), ensemble empirical mode decomposition (EEMD), and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN). Numerous studies have demonstrated the effectiveness of these algorithms. For example, Zhang et al. [29] presented a short-term traffic flow prediction method based on improved wavelet packet analysis and a long short-term memory neural network model. Wei et al. [30] proposed a hybrid method combining EMD and backpropagation neural networks (BPN) to predict short-term passenger flow in metro systems. Yang [31] proposed a hybrid traffic flow multistep prediction method based on EMD and a stacked autoencoder model. Li et al. [32] proposed a travel time prediction model based on EEMD and a random vector functional link network, in which EEMD is employed to decompose the complex travel time data series into several simple functions. Tian et al. [33] presented a CEEMDAN-PE-OSELM forecasting model based on CEEMDAN, permutation entropy (PE), and the online sequential extreme learning machine (OSELM). However, there are still many shortcomings in these algorithms.
For instance, the decomposition effect of WD and WPD depends on the choice of the basis function and has poor adaptability, while EMD and its improved algorithms suffer from the endpoint effect and over-envelope problems. In order to overcome these drawbacks, a novel data decomposition algorithm called variational mode decomposition (VMD) has been put forward. The VMD algorithm has been verified to have superior prediction ability in wind speed prediction [34], foreign exchange rate prediction [35], and carbon price prediction [36].
The machine learning algorithms are widely applied in the field of traffic flow forecasting because of their excellent performance in data mining and pattern recognition. Zhu et al. [37] put forward a new method based on the RBF neural network to predict short-term traffic volume. Vanajakshi et al. [38] compared the support vector machine model and the artificial neural network model in short-term traffic flow prediction. Wei et al. [39] integrated the time-varying deviation of the day-to-day traffic variation into support vector regression and proposed an adaptive support vector regression short-term traffic forecasting model. Recently, different types of deep learning algorithms, such as long short-term memory (LSTM), convolutional neural network (CNN), deep belief network (DBN), and recurrent neural network (RNN), have generated great interest among researchers due to their ability to capture the complex nonlinear relationships of traffic flow. Ma et al. [40] developed a long short-term memory neural network model for short-term travel speed prediction, and the proposed algorithm can determine the optimal time window for the time series automatically. Polson [41] proposed an innovative deep learning architecture to predict traffic flows and showed that deep learning architectures can capture nonlinear spatiotemporal effects. Wu et al. [42] designed a novel traffic flow prediction method based on a deep learning framework.
Guo et al. [43] proposed a novel end-to-end deep learning framework to model the complex explanatory variables for traffic forecasting. Yu et al. [44] proposed graph-based learning models for traffic forecasting that mimicked real traffic propagation based on variable adjacency matrices.
Motivated by these successful applications, this paper puts forward a hybrid short-term traffic flow multistep prediction method based on the VMD algorithm and the LSTM model. The main contributions of this study are summarized as follows. (1) The variational mode decomposition algorithm is employed to decompose the original traffic flow data into several intrinsic mode function (IMF) components. (2) Different LSTM models are established to predict different IMF components, and the component prediction results are integrated to obtain the final traffic flow prediction values. (3) Comprehensive performance comparisons are conducted using inductive loop data measured from the north-south viaduct in Shanghai. The rest of this paper is organized as follows. Section 2 introduces the theoretical basis of the relevant algorithms and the framework of the proposed hybrid method. Section 3 presents a case study in which the prediction results of the proposed method and the other involved methods are evaluated. Finally, Section 4 concludes this research.

Variational Mode Decomposition Algorithm.
The variational mode decomposition (VMD) algorithm is a novel method for analyzing nonstationary and nonlinear signals, which decomposes a signal into a series of modes with specific bandwidth in the spectral domain [45]. Each mode is compact around a center pulsation determined during the decomposition process. The whole framework of VMD is a variational problem, which mainly includes the construction of the variational problem and its solution. The time-series data of traffic flow is regarded as a nonstationary signal $f$, and the variational problem is described as seeking $K$ mode functions $u_k(t)$ $(k = 1, 2, \ldots, K)$. In order to obtain the bandwidth of each mode, three steps should be fulfilled [46]:
(1) For each mode function $u_k(t)$, the Hilbert transform is adopted to obtain the unilateral frequency spectrum:
$$\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t). \tag{1}$$
(2) For each mode function $u_k(t)$, the frequency spectrum is modulated to the corresponding fundamental frequency band by applying an exponential tuned to the respective estimated center frequency:
$$\left[\left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t)\right] e^{-j w_k t}. \tag{2}$$
(3) The bandwidth of $u_k(t)$ is estimated from the Gaussian smoothness of the demodulated signal.
Then, the constrained variational problem can be expressed as
$$\min_{\{u_k\},\{w_k\}} \left\{ \sum_{k=1}^{K} \left\| \partial_t \left[ \left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t) \right] e^{-j w_k t} \right\|_2^2 \right\} \quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t), \tag{3}$$
where $f(t)$ indicates the original signal, $t$ indicates time, $K$ denotes the number of modes, $u_k$ denotes the $k$th mode, $\delta(t)$ denotes the Dirac distribution, $w_k$ is the center frequency of the $k$th mode, $*$ denotes convolution, and $j$ is the imaginary unit. In order to convert the above optimization problem into an unconstrained one, a penalty term and a Lagrange multiplier are employed, which gives
$$L(\{u_k\},\{w_k\},\lambda) = \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left(\delta(t) + \frac{j}{\pi t}\right) * u_k(t) \right] e^{-j w_k t} \right\|_2^2 + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2 + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle, \tag{4}$$
where $\alpha$ is the quadratic penalty factor and $\lambda(t)$ is the Lagrange multiplier. The alternating direction method of multipliers (ADMM) is used to solve the unconstrained problem in formula (4) by finding the saddle point of the augmented Lagrangian.
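Based on the alternating direction method of multipliers, $u_k$ and $w_k$ are updated alternately until the variational mode decomposition is complete. The solution of the optimization problem for $u_k$ is
$$\hat{u}_k^{n+1}(w) = \frac{\hat{f}(w) - \sum_{i \neq k} \hat{u}_i(w) + \hat{\lambda}(w)/2}{1 + 2\alpha (w - w_k)^2}, \tag{5}$$
where $n$ is the iteration number and $\hat{f}(w)$, $\hat{u}_i(w)$, $\hat{\lambda}(w)$, and $\hat{u}_k^{n+1}(w)$ are the Fourier transforms of $f(t)$, $u_i(t)$, $\lambda(t)$, and $u_k^{n+1}(t)$, respectively. The solution of the optimization problem in the frequency domain for $w_k$ is
$$w_k^{n+1} = \frac{\int_0^{\infty} w \left| \hat{u}_k^{n+1}(w) \right|^2 dw}{\int_0^{\infty} \left| \hat{u}_k^{n+1}(w) \right|^2 dw}. \tag{6}$$
The specific process of VMD is as follows:
Step 1: initialize $\hat{u}_k^1$, $w_k^1$, and $\hat{\lambda}^1$, and set the iteration number $n$ to 1.
Step 2: update $\hat{u}_k$ and $w_k$ according to formulas (5) and (6).
Step 3: for all $w \geq 0$, update $\hat{\lambda}^n$ according to the following formula, where $\tau$ is the update step of the Lagrange multiplier:
$$\hat{\lambda}^{n+1}(w) = \hat{\lambda}^{n}(w) + \tau \left( \hat{f}(w) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(w) \right). \tag{7}$$
Step 4: repeat Steps 2 to 3 until the convergence condition is met, that is,
$$\sum_{k=1}^{K} \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}{\left\| \hat{u}_k^{n} \right\|_2^2} < \varepsilon, \tag{8}$$
where $\varepsilon$ is the convergence tolerance.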
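The iteration described in Steps 1 to 4 can be illustrated with a short program. The following is a minimal sketch under stated assumptions, not the authors' implementation: the function name `vmd`, the uniform initialization of the center frequencies, the `max_iter` cap, and the use of a naive discrete Fourier transform on the non-negative frequency half are all choices made here for illustration. It omits the mirror extension of the signal that full implementations often apply, and it measures frequency in cycles per sample.

```python
import cmath
import math

def vmd(signal, K, alpha=2000.0, tau=0.3, eps=1e-7, max_iter=500):
    """Simplified VMD sketch; returns (modes, center_frequencies)."""
    N = len(signal)
    H = N // 2 + 1                       # number of non-negative frequency bins
    w = [m / N for m in range(H)]        # bin frequencies in cycles per sample
    # One-sided spectrum of the input (naive DFT, adequate for short series).
    f_hat = [sum(signal[t] * cmath.exp(-2j * math.pi * m * t / N) for t in range(N))
             for m in range(H)]
    # Step 1: initialise modes, center frequencies, and the multiplier.
    u_hat = [[0j] * H for _ in range(K)]
    omega = [0.5 * k / K for k in range(K)]   # assumed uniform initialisation
    lam = [0j] * H
    for _ in range(max_iter):
        change = 0.0
        for k in range(K):
            old = u_hat[k]
            # Step 2a: update mode k from the residual of all other modes.
            new = []
            for m in range(H):
                others = sum(u_hat[i][m] for i in range(K) if i != k)
                new.append((f_hat[m] - others + lam[m] / 2)
                           / (1 + 2 * alpha * (w[m] - omega[k]) ** 2))
            u_hat[k] = new
            # Step 2b: center frequency is the power-weighted mean frequency.
            power = [abs(c) ** 2 for c in new]
            omega[k] = sum(wm * p for wm, p in zip(w, power)) / (sum(power) + 1e-30)
            # Relative change of mode k (tiny constant avoids division by zero).
            change += (sum(abs(a - b) ** 2 for a, b in zip(new, old))
                       / (sum(abs(b) ** 2 for b in old) + 1e-30))
        # Step 3: dual ascent on the Lagrange multiplier.
        for m in range(H):
            lam[m] += tau * (f_hat[m] - sum(u_hat[k][m] for k in range(K)))
        # Step 4: stop once the summed relative change is below the tolerance.
        if change < eps:
            break
    # Back to the time domain, using Hermitian symmetry of a real signal.
    modes = []
    for k in range(K):
        mode = []
        for t in range(N):
            total = u_hat[k][0].real
            for m in range(1, H):
                term = (u_hat[k][m] * cmath.exp(2j * math.pi * m * t / N)).real
                # Interior bins stand for themselves and their mirrored bin.
                total += term if (N % 2 == 0 and m == H - 1) else 2 * term
            mode.append(total / N)
        modes.append(mode)
    return modes, omega

# Usage on a synthetic two-tone signal (illustrative data, not traffic flow).
N = 200
demo = [math.cos(2 * math.pi * 0.05 * t) + 0.5 * math.cos(2 * math.pi * 0.2 * t)
        for t in range(N)]
demo_modes, demo_omega = vmd(demo, K=2)
print([round(v, 3) for v in demo_omega])
```

On this synthetic input the two recovered center frequencies should settle near the two tone frequencies, and the sum of the modes should closely reproduce the input because the multiplier update enforces the reconstruction constraint.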

Prediction Model Based on LSTM.
The long short-term memory (LSTM) network is a special kind of recurrent neural network (RNN). Compared with the traditional RNN, LSTM adds a cell state in the hidden layer, which mitigates the vanishing and exploding gradient problems to which RNNs are prone. The LSTM network consists of the input layer, the hidden layers, and the output layer. The main characteristic of the LSTM is the memory cells in its hidden layers, which contain memory blocks rather than traditional neuron nodes. Each block has several self-connected memory cells and three multiplicative units: the input, output, and forget gates. The network structure of LSTM is shown in Figure 1. The LSTM generates a mapping from an input sequence of vectors to an output vector by calculating the network unit activations using formulas (9)-(14), where $x$ indicates the input vector, $y$ indicates the output vector, $i$, $f$, $o$, and $C$ denote the input gate, forget gate, output gate, and cell state, respectively, $W$ represents the linear transformation matrices, $b$ is the bias term, and $\sigma(\cdot)$ is the sigmoid nonlinear function. The output value of the model is calculated according to formulas (9)-(14), and the error term $S$ and the weight gradients of each LSTM cell are computed by backpropagation according to the defined error functions, namely, equations (15) and (16), where $N$ is the number of samples, $h_t$ is the predicted value, and $h_t^{*}$ is the measured value. Memory cells and gating units are introduced in LSTM to make effective use of long-range information and to respond to data that change sharply over time. Through multigate cooperation, LSTM has good robustness and can effectively deal with the problems of vanishing and exploding gradients. Based on the significant advantage of LSTM in long-term time-series prediction, the LSTM model is used to predict the traffic flow time-series data.
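For reference, one widely used formulation of the LSTM gate computations is sketched below. It is the standard variant without peephole connections and is an assumption here: the symbols $h_{t-1}$ (previous hidden output), $\tilde{C}_t$ (candidate cell state), and $\odot$ (elementwise product) are introduced for this sketch, and the article's own formulas (9)-(14) may differ in notation or detail.

```latex
\begin{aligned}
i_t &= \sigma\left(W_i [h_{t-1}, x_t] + b_i\right), \\
f_t &= \sigma\left(W_f [h_{t-1}, x_t] + b_f\right), \\
\tilde{C}_t &= \tanh\left(W_C [h_{t-1}, x_t] + b_C\right), \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t, \\
o_t &= \sigma\left(W_o [h_{t-1}, x_t] + b_o\right), \\
h_t &= o_t \odot \tanh\left(C_t\right).
\end{aligned}
```

In this form the forget gate $f_t$ scales the previous cell state while the input gate $i_t$ scales the new candidate, so the cell state is updated additively; this additive path is what lets gradients propagate over long time spans.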

The Whole Process of the Proposed Model.
It is quite a challenge to capture the fluctuation rules of short-term traffic flow data, so direct prediction of traffic flow data cannot achieve a good prediction effect. The greater the data fluctuation, the lower the prediction accuracy. Therefore, when forecasting short-term traffic flow, the traffic flow time series is preprocessed first. In this paper, we put forward a hybrid short-term traffic flow prediction method based on the VMD algorithm and the LSTM model. The architecture of the proposed method is depicted in Figure 2.
Step 1: the VMD algorithm is employed to decompose the original traffic flow data into a series of subsequences. The number of decomposition subsequences K is determined by observing the central frequency. When the central frequency of the last layer is relatively stable, K can be considered as the best value.
Step 2: different LSTM models are established to predict different IMF components obtained by VMD.
Step 3: for each prediction model, one-step, two-step, and three-step predictions are conducted. The multistep predictions are carried out by introducing the prediction result of the previous step into the prediction model.
Step 4: the component prediction results of each step are accumulated to obtain the final traffic flow multistep prediction values.
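The four steps above can be sketched as a small pipeline that decomposes a series, forecasts each component separately, and sums the component forecasts. This is only a structural illustration: `moving_average_split` is a stand-in for VMD and `persistence_forecast` is a stand-in for a trained LSTM, both hypothetical helpers chosen so that the example runs without external libraries. The flow values are made up.

```python
def moving_average_split(series, window=3):
    """Stand-in for VMD: split a series into a smooth trend and a residual."""
    trend = []
    for t in range(len(series)):
        chunk = series[max(0, t - window + 1): t + 1]
        trend.append(sum(chunk) / len(chunk))
    residual = [s - m for s, m in zip(series, trend)]
    return [trend, residual]          # the components sum back to the series

def persistence_forecast(component, steps):
    """Stand-in for a per-component LSTM: repeat the last observed value."""
    return [component[-1]] * steps

def hybrid_forecast(series, decompose, forecast, steps=3):
    """Decompose, forecast each component, then sum forecasts step by step."""
    components = decompose(series)                              # Step 1
    per_component = [forecast(c, steps) for c in components]    # Steps 2 and 3
    return [sum(p[s] for p in per_component) for s in range(steps)]  # Step 4

# Illustrative 5-minute flow counts (synthetic values).
flow = [120.0, 132.0, 128.0, 140.0, 151.0, 149.0]
demo_forecast = hybrid_forecast(flow, moving_average_split, persistence_forecast)
print(demo_forecast)
```

Because the decomposition and the forecaster are passed in as functions, a real VMD routine and trained per-component models could be substituted without changing the aggregation logic.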

Experiment Data and Evaluation Criteria.
The north-south viaduct expressway of Shanghai, China, is selected as the study scenario; the satellite map of the selected urban expressway is shown in Figure 3. This section of expressway runs from Yandong interchange to Gonghe interchange and includes 24 main line detection sections and 30 ramp detection sections. A total of 88 main line coil detectors and 60 ramp coil detectors are installed. The average distance between the main line detectors is about 500 meters. The data collection period runs from August 27 to August 31, 2018. The sampling interval of the data is 5 minutes. Figures 4 and 5 give the traffic flow data of NBDX 16(2) and NBXX 11(3) collected over five consecutive days.
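In this paper, three evaluation criteria are employed to judge the performance of the proposed method: the root mean square error (RMSE), the mean absolute error (MAE), and the mean absolute percentage error (MAPE). The smaller the values of these criteria are, the closer the prediction results are to the original traffic flow values. The evaluation criteria are defined as follows:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2},$$
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} \left| y_i - \hat{y}_i \right|,$$
$$\mathrm{MAPE} = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{y_i - \hat{y}_i}{y_i} \right| \times 100\%,$$
where $y_i$ represents the actual data, $\hat{y}_i$ represents the prediction data, and $n$ is the number of samples.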
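The three criteria are straightforward to compute. The sketch below uses the usual textbook definitions and reports MAPE in percent, which is an assumption about the convention used in the result tables; the function names and the sample values are illustrative only.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

# Illustrative values (synthetic, not from the experiment).
observed = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
print(rmse(observed, forecast), mae(observed, forecast), mape(observed, forecast))
```

Note that MAPE is undefined when an actual value is zero, which can matter for traffic counts during very low-flow intervals.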

VMD Results.
Before applying VMD to the traffic flow data, it is necessary to determine the number of decomposition subsequences K. If the value of K is too small, mode aliasing will appear in the decomposed subsequences. If the value of K is too large, mode repetition or additional noise will be produced. In this paper, the value of K is determined by observing the central frequency. Firstly, the original traffic flow time-series data is decomposed by VMD at different K values. Then, the center frequency in each layer is calculated. Next, the center frequency of the IMF component in the last layer is compared across K values in order to obtain the best value of K. When the central frequency of the last layer is relatively stable, K can be considered as the best value. The parameters of the VMD algorithm are set as follows: $\alpha = 2000$, $\tau = 0.3$, and $\varepsilon = 10^{-7}$. It can be seen from Tables 1 and 2 that when K is greater than 4, the center frequency of the IMF component in the last layer remains relatively stable, so $K = 5$ is selected as the decomposition number in the experiment. The decomposition results of the traffic flow data are illustrated in Figures 6 and 7.
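The selection rule for K is described qualitatively. One way to make it mechanical, offered here only as a hedged sketch, is to pick the smallest K whose last-layer center frequency changes by less than a relative tolerance when K is increased by one. The function name `select_k`, the tolerance `rel_tol`, and the frequency values below are all hypothetical; the values are invented for illustration and are not the entries of Tables 1 and 2.

```python
def select_k(last_center_freq, rel_tol=0.05):
    """Return the smallest K after which the last-layer center frequency is stable.

    last_center_freq maps each tested K to the center frequency of the last IMF
    obtained with that K. Returns None if no tested K satisfies the tolerance.
    """
    ks = sorted(last_center_freq)
    for k, k_next in zip(ks, ks[1:]):
        current = last_center_freq[k]
        following = last_center_freq[k_next]
        # Stable if moving from K to the next tested K barely shifts the frequency.
        if abs(following - current) <= rel_tol * abs(current):
            return k
    return None

# Invented last-layer center frequencies for K = 2..7 (illustration only).
example = {2: 0.10, 3: 0.21, 4: 0.33, 5: 0.41, 6: 0.415, 7: 0.412}
chosen = select_k(example)
print(chosen)
```

With these invented numbers the frequency still moves noticeably between K = 4 and K = 5 but barely between K = 5 and K = 6, so the rule returns 5. A visual check of the center frequencies, as done in the text, remains advisable because the tolerance is arbitrary.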
In Figures 6 and 7, the left side shows the five intrinsic mode functions obtained by VMD, and the right side shows the corresponding frequency spectra. It is obvious that the signals on different scales exhibit completely different characteristics and that the frequency of each component increases gradually from IMF1 to IMF5. The VMD algorithm decomposes the highly complex traffic flow data into multiple periodic and regular components, which helps to discover and learn more of the information contained in the traffic flow data and makes the prediction results more accurate.

The Experimental Results of the Proposed Model.
In order to evaluate the multistep prediction performance of the proposed method, one-step to three-step predictions are carried out. The LSTM model has four layers: an input layer, an output layer, and two LSTM layers. The number of neurons is set to 128, and the input time window is 4. According to the prediction logic, multistep prediction methods can be divided into two types: direct multistep prediction and iterative multistep prediction. In the direct multistep prediction method, a separate prediction model needs to be established for each prediction step. Moreover, the larger the prediction time span, the lower the correlation between the current input variables and the corresponding prediction target, which increases the prediction error. In the iterative multistep prediction method, introducing the predicted value into the input information leads to a certain amount of error propagation, especially when there are many prediction steps. According to the research results of [47], iterative multistep prediction performs better than direct multistep prediction. Therefore, iterative prediction is adopted in the multistep prediction process in this paper. As shown in Figure 8, when predicting $y_{t+2}$, the input variables are $y_{t-p+1}, \ldots, y_t, y_{t+1}$, where the value at time $t+1$ is the prediction from the previous step. Figures 9-14 give the one-step to three-step prediction results of NBDX 16(2) and NBXX 11(3). The blue curve represents the actual traffic flow data, and the red curve represents the prediction data.
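The iterative scheme can be sketched as a loop that feeds each prediction back into the input window. In the sketch below, `predict_one` is any one-step model; `linear_step` is a hypothetical stand-in for a trained LSTM, and keeping the window at a fixed length by dropping the oldest value is an assumption made here for illustration.

```python
def iterative_forecast(predict_one, history, window, steps):
    """Predict `steps` values ahead by feeding predictions back as inputs."""
    buffer = list(history[-window:])     # most recent `window` observations
    predictions = []
    for _ in range(steps):
        y_next = predict_one(buffer)
        predictions.append(y_next)
        buffer = buffer[1:] + [y_next]   # slide the window over the prediction
    return predictions

def linear_step(window_values):
    """Stand-in one-step model: extrapolate the last observed difference."""
    return 2 * window_values[-1] - window_values[-2]

# Window of 4 as in the experiment settings; the data values are synthetic.
three_step = iterative_forecast(linear_step, [10, 12, 14, 16], window=4, steps=3)
print(three_step)
```

Since later steps depend on earlier predictions rather than on measurements, any one-step error is carried forward, which is the error propagation effect noted in the text.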
As can be seen directly from Figures 9-14, the prediction results of the proposed method not only accurately describe the dynamic fluctuation trend of the original traffic flow sequence but are also numerically close to the real traffic flow data. Therefore, it can be concluded that the proposed hybrid method could effectively improve the prediction accuracy.

Experimental Results' Comparison of Different Models.
In order to avoid randomness and obtain reliable results, five-fold cross validation is used. As shown in Figure 15, [...] Tables 3 and 4 show the prediction performance of different models using data collected from NBDX 16(2) and NBXX 11(3). From the prediction results in Tables 3 and 4, several important conclusions can be drawn as follows: (1) All hybrid prediction methods clearly outperform the single models in terms of prediction accuracy, which indicates that decomposing the traffic flow time series prior to forecasting can significantly improve the prediction performance. (2) The multistep prediction accuracy of the VMD-LSTM hybrid method is generally much better than that of the EMD-LSTM method and the WD-LSTM method. This indicates that the VMD algorithm is a more effective decomposition algorithm.
(3) The prediction error of the VMD-LSTM hybrid method is much lower than that of the VMD-SVM method and the VMD-BP method, which indicates that the prediction performance of the LSTM model is better than that of the traditional prediction models.
From the above experiments, we can conclude that the VMD-LSTM hybrid model has superiority and wide applicability.
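The five-fold cross validation mentioned at the start of this comparison can be sketched as follows. The text does not state how the folds were formed, so splitting the series into five contiguous blocks is an assumption, and the function name `k_fold_indices` is hypothetical.

```python
def k_fold_indices(n_samples, k=5):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    # Distribute any remainder over the first folds so sizes differ by at most 1.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size

# Five days at a 5-minute interval give 5 * 288 = 1440 samples if none are missing.
folds = list(k_fold_indices(1440, k=5))
print([len(test) for _, test in folds])
```

With complete data each contiguous fold would correspond to one day of observations. For time series, contiguous or forward-chaining folds are often preferred over shuffled folds so that neighboring, highly correlated samples do not end up on both sides of the split.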

Conclusion
To achieve accurate short-term traffic flow forecasting, a hybrid short-term traffic flow multistep prediction method based on the VMD algorithm and the LSTM model is proposed. The main idea consists of two parts: the data processing technique and the prediction model. To handle the volatility and randomness of traffic flow, the VMD algorithm is employed to transform the unstable traffic flow sequence into a series of relatively stable intrinsic mode function (IMF) components. The periodic characteristics of the IMF components are more obvious than those of the original traffic flow sequence. Considering the nonlinear fitting ability and self-learning adaptive characteristic of the LSTM model, different LSTM models are established to predict different IMF components, and the results are summed to obtain the final prediction results. For each prediction model, 1-step to 3-step iterative predictions are carried out. The traffic flow data measured from the north-south viaduct expressway in Shanghai are used to carry out the experiment. According to the experiment results, it can be concluded that (1) compared with single models, the proposed hybrid model achieves a substantial improvement in short-term traffic flow prediction and (2) compared with the EMD algorithm and the WD algorithm, the VMD algorithm can effectively solve the problems of modal aliasing and endpoint effects, which makes the periodic characteristics of each IMF more obvious and improves the prediction performance.
Overall, this paper demonstrates the feasibility of forecasting short-term traffic flow based on the VMD algorithm and the LSTM model. The results of this paper can be used for the implementation of advanced traffic management systems and advanced traveler information systems. However, there are still several limitations in the current study. Firstly, the hyperparameters of the LSTM model are determined by reference to the existing literature. In future research, different parameter optimization algorithms could be used to obtain the optimal hyperparameters. Secondly, data with different time slices, such as 1 min and 2 min, can be tested to check the generalization ability of the proposed method. Thirdly, short-term traffic flow data in different scenarios, such as arterial roads and expressways, can be used to test the effectiveness of the proposed method.

Data Availability
The data used to support the findings of the study are included within the article.
Discrete Dynamics in Nature and Society