An Improved CEEMDAN-FE-TCN Model for Highway Traffic Flow Prediction



Introduction
With the development of the social economy, the existing transportation supply has gradually become unable to meet the increasing traffic demand. Urban traffic congestion continues to worsen, resulting in economic losses, environmental pollution, and energy waste [1]. The Intelligent Transport System (ITS), an essential part of traffic management, combines advanced communication, information, and artificial intelligence technologies. ITS aims to deliver accurate real-time traffic information to help travelers plan their routes better. At the same time, it can improve the identification of traffic evolution trends and particular traffic situations, support the traffic management department in giving early warning and command of emergencies, and effectively reduce casualties and economic losses [2][3][4]. Specifically, one of the critical technologies of ITS is short-term traffic prediction, which is the core of the active control of urban traffic systems [5]. Through deep mining of big data, the inherent evolution law of traffic flow can be mastered to achieve accurate and real-time prediction, which provides precise travel information for travelers and policy suggestions that let managers act beforehand. Traffic prediction over different horizons has its own application value. Short-term traffic flow prediction is essential for both the traffic management department and travelers. For the traffic management department, short-term traffic flow prediction can help identify the evolution of traffic flow so that short-term traffic control measures such as lane closure and ramp control can be formulated in advance, effectively alleviating potential traffic congestion. Besides, it can help travelers better understand the operating condition of the road network and plan their paths accordingly [6][7][8][9]. Therefore, short-term traffic flow prediction is of practical significance and worth studying.
Two kinds of methods have dominated traffic forecasting research in the existing literature: statistical methods and machine learning methods [10]. Statistical methods based on linear statistics include the ARIMA method [11], the Kalman filter method [12], the Markov chain method [13], etc., which are more suitable for road sections with stable traffic conditions. However, the nonlinearity of traffic flow becomes prominent as the prediction interval becomes smaller, resulting in low accuracy. Because of the fluctuation of traffic flow, prediction methods based on machine learning have drawn increasing attention; they mine the inherent law of traffic data to capture the dynamics of traffic flow. For example, Wu et al. applied Support Vector Regression (SVR) to predict travel time and mapped the data to a high-dimensional space for regression, achieving good prediction results [14]. The SVR model is robust to noisy data and is more suitable for a small sample size. Cai et al. introduced the K-Nearest Neighbor (KNN) model to realize multistep prediction in space and time, but the time complexity of the calculation was high [15]. Besides, Csikos et al. constructed an Artificial Neural Network (ANN) that learns the traffic speed dynamics from one month of traffic speed samples for prediction [16]. In recent years, with big data acquisition, deep learning models can capture more complex traffic features and have promising applications [17,18]. As one of the most typical methods, the Recurrent Neural Network (RNN) has a circular structure different from ANN. By feeding back the hidden layer information of the last moment to the input of the current moment, the temporal correlation of the traffic flow can be captured [19]. Traditional RNN mainly includes three structures: the Elman Neural Network [20], the Time-Delay Neural Network (TDNN) [21], and the Nonlinear Autoregressive with Exogenous inputs Neural Network (NARX NN) [22]. Unfortunately, they all suffer from gradient vanishing and explosion problems, making it challenging to capture long-term information.
Nevertheless, it has been shown that traffic events in the previous period usually affect the predicted time, so RNN forecasting methods need to be further improved. Ma et al. first applied the Long Short-Term Memory Neural Network (LSTM NN) to predict traffic speed; it memorizes helpful information over short and long horizons through gate units and overcomes the defect of traditional RNN. The results showed that the prediction performance was significantly better than other prevailing methods [23]. As a variant of LSTM, the Gated Recurrent Unit (GRU) simplifies the structure, improving the prediction efficiency. Gao et al. combined GRU with MFD to forecast the traffic speed [24]. Other improved models such as Attention-Based LSTM [25,26] and BiLSTM [27,28] achieved high accuracy in traffic prediction.
However, RNN models process sequences serially, meaning that later timesteps must wait for their predecessors to complete. To capture long-term sequence features, RNNs use much memory to store the partial results of their multiple cell gates. A Convolutional Neural Network (CNN) can extract the information of a long sequence in parallel because of the shared weights of the kernel [29]. However, as the length of the sequence increases, the network must be deepened to learn the features, making it challenging to train. With causal and dilated convolutions, the Temporal Convolutional Network (TCN) achieves a flexible receptive field size, capturing long-term historical information with a simple structure [30]. Zhao et al. improved the residual block of TCN for faster training and applied it to traffic flow prediction [31]. Zhang et al. used the genetic algorithm to optimize the hyperparameters of TCN. The results showed that the prediction performance was significantly better than other prevailing methods [32].
The inherent changing law of traffic flow is complex, consisting of various dynamics on different temporal scales. Although deep learning models can capture long-term historical information, they need a deep network and take up much training time and memory. Thus, it is necessary to decompose the traffic flow time series, which simplifies the structure of prediction models and extracts features thoroughly and effectively. Huang et al. proposed Empirical Mode Decomposition (EMD), which successively decomposes the trends or fluctuations of different scales in a signal to generate a series of intrinsic mode functions (IMFs) with different frequencies [33,34]. Unlike the wavelet transform, it is an adaptive, data-driven method that needs no predefined wavelet basis. Theoretically, signals with nonlinearity and randomness can be decomposed. However, the conventional EMD decomposes the signal incompletely, causing mode mixing and false modes.
Thus, several improved models were proposed to solve these problems [35][36][37][38]. In recent years, EMD-related methods have gradually been introduced to traffic prediction. For instance, Wei et al. combined EMD with a Backpropagation Neural Network (BPNN) to predict subway passenger flow, which showed notable performance. Only modes highly correlated with the original data were selected to improve the prediction efficiency [39]. Likewise, Chen et al. applied Ensemble Empirical Mode Decomposition (EEMD) to decompose the traffic flow time series, removed the high-frequency mode, and introduced LSTM NN to predict the remaining reconstructed modes [40]. However, each IMF plays an essential part in the time series, and discarding some modes outright loses detailed information on traffic flow features. Lu [43]. However, the value of K was not chosen on a theoretical basis, and the BiLSTM may use much memory. Though mode mixing was solved to some extent, the residual noise and spurious modes remained. Also, predicting every IMF separately resulted in poor efficiency. Moreover, the in-depth change features of traffic flow may not be captured because of the small training data size or high memory usage.
The existing research on the decomposition prediction of traffic flow time series is insufficient and remains preliminary. Problems such as incomplete decomposition, low prediction efficiency, high memory usage, and the deep capture of traffic flow dynamics need further investigation. Therefore, in this paper, an improved CEEMDAN-FE-TCN model is proposed to forecast highway traffic flow. First, the improved CEEMDAN method decomposes the nonlinear highway traffic flow into IMFs and a residual with different frequencies. Next, the fuzzy entropy (FE) of each mode is calculated. The IMFs and residual with similar chaos are recombined, highlighting the traffic dynamics.
Finally, the TCN is applied to predict the different recombined subsequences. After reconstructing the output of TCN submodels, the predicted traffic flow is obtained.
The contributions of the paper can be summarized as follows:
(i) The improved CEEMDAN method is first used for highway traffic flow decomposition. The changing features are decomposed to different temporal scales, making TCN extract the dynamics thoroughly.
(ii) The FE difference of different modes decomposed from the original data is calculated. On this basis, the modes are recombined as subsequences, which highlights the primary trend of traffic flow changes and retains specific fluctuations. The computational complexity is reduced, and the forecasting efficiency and accuracy are further improved.
(iii) The proposed improved CEEMDAN-FE-X framework can be applied to decrease the prediction error of prevailing models notably. Moreover, the improved CEEMDAN-FE-TCN model outperforms other models compared in this paper, which has strong robustness.
The rest of the paper is arranged as follows: In Section 2, an improved CEEMDAN-FE-TCN model is proposed for traffic flow prediction. Section 3, Section 4, and Section 5 introduce the improved CEEMDAN, FE, and TCN, respectively. The prediction effects of the proposed model are verified on two sensors in Section 6. Finally, Section 7 summarizes the conclusions and future directions.

The Improved CEEMDAN-FE-TCN Model
In this paper, an improved CEEMDAN-FE-TCN model is constructed for highway traffic flow prediction, which contains three modules: improved CEEMDAN decomposition, FE calculation, and TCN prediction.
TCN is applied as the core module to predict the highway traffic flow. As a new neural network with a convolutional structure, TCN has the large-scale parallel processing advantage of CNN and integrates the modeling ability for sequential tasks, which makes up for the long-term dependence problem of RNN [44]. RNN variants like LSTM and GRU memorize part of the information through the gated unit, while TCN can capture all the historical information with better prediction and faster training speed [30].
However, the traffic flow time series consists of changing features on different temporal scales, causing fluctuation and nonlinearity. It is challenging for TCN to extract the mixed dynamics thoroughly. So, the improved CEEMDAN model is adopted to decompose the sequence into IMFs and a residual, making TCN capable of capturing the features on every single temporal scale. The modes decomposed by improved CEEMDAN have physical significance. Nevertheless, from the traffic point of view, some IMFs may each represent only part of the traffic flow dynamics on a specific time scale. Besides, each IMF needs a corresponding TCN submodel for training and predicting, causing complex computation. Therefore, FE is introduced to calculate the complexity of every IMF decomposed from the traffic flow time series. Sequences with close FE have similar temporal scales and stationarity, indicating that TCN will have the same feature extracting ability on the recombined sequence as on every single sequence. The recombination highlights the changing features of traffic flow and eliminates the error accumulated when multiple similar sequences are predicted separately. Thus, the modes with similar FE are recombined as the input of TCN, reducing calculation complexity and improving prediction efficiency and accuracy.
The output of every TCN submodel is the predicted traffic flow on a different time scale. After reconstruction, the final predicted traffic flow is obtained. The framework of the proposed model is shown in Figure 1.
The specific procedures are as follows:
Step 1: the improved CEEMDAN method is introduced to decompose the original traffic flow time series to obtain k IMFs and a residual with different frequencies.
Step 2: the FE of each mode is calculated. According to the FE difference between the modes, the IMFs and residual with similar chaos are recombined into subsequences (RS).
Step 3: the TCN submodules are adopted to train and predict RS (1)-RS (n), respectively; then the prediction results are reconstructed to obtain the predicted highway traffic flow.
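The three steps can be sketched as a small pipeline. This is a minimal illustration, not the implementation used in the paper: the function name `predict_flow` and its callable arguments (`decompose`, `entropy`, `group`, `fit_predict`) are hypothetical placeholders for the improved CEEMDAN, FE, recombination rule, and TCN submodel, and it assumes that reconstruction is the element-wise sum of the submodel forecasts, which follows from the modes summing to the original series.

```python
def predict_flow(series, decompose, entropy, group, fit_predict, threshold):
    """Decompose, recombine by FE, predict each subsequence, and sum the forecasts.

    All callables are hypothetical stand-ins supplied by the caller:
      decompose(series)   -> list of modes (IMFs and residual), each as long as series
      entropy(mode)       -> FE value of one mode
      group(fe, thr)      -> list of index lists, one per recombined subsequence
      fit_predict(rs)     -> forecast sequence for one recombined subsequence
    """
    modes = decompose(series)                       # Step 1: decomposition
    fe = [entropy(mode) for mode in modes]          # Step 2: FE of each mode
    groups = group(fe, threshold)
    subsequences = [
        [sum(values) for values in zip(*(modes[k] for k in members))]
        for members in groups
    ]
    forecasts = [fit_predict(rs) for rs in subsequences]   # Step 3: one submodel per RS
    # Reconstruction (assumed): element-wise sum of the submodel forecasts.
    return [sum(values) for values in zip(*forecasts)]
```

Passing the components as arguments keeps the sketch independent of any particular decomposition or network library; each stand-in can be replaced without touching the pipeline logic.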

CEEMDAN Algorithm.
The CEEMDAN algorithm can eliminate mode mixing to some extent. Each IMF is calculated from the residual signal by adaptively adding white noise during the IMF decomposition, which reduces the reconstruction error. The method has good completeness and reduces the number of ensemble trials. The specific steps are as follows [37]:
Step 1: a series of Gaussian white noise is added adaptively to the original signal $x$:
$$x^{(i)} = x + \beta_0 \omega^{(i)}, \quad i = 1, 2, \ldots, I, \tag{1}$$
where $x^{(i)}$ denotes the time series after adding white noise for the $i$th time; $\beta_0$ denotes the noise factor; $\omega^{(i)}$ denotes the white noise added for the $i$th time; $I$ denotes the number of ensemble trials.
Step 2: the EMD algorithm is used to decompose $x^{(i)}$, and the first EMD mode $d_1^{(i)}$ is averaged to calculate the first CEEMDAN mode as follows:
$$d_1 = \frac{1}{I} \sum_{i=1}^{I} d_1^{(i)}.$$
Remove $d_1$ from $x$ to obtain the first residue:
$$r_1 = x - d_1.$$
Step 3: decompose $r_1 + \beta_1 E_1(\omega^{(i)})$ by the EMD algorithm to obtain the second CEEMDAN mode:
$$d_2 = \frac{1}{I} \sum_{i=1}^{I} E_1\left(r_1 + \beta_1 E_1(\omega^{(i)})\right),$$
where $E_k(\cdot)$ denotes the $k$th mode decomposed by the EMD algorithm.
Step 4: repeat the following process to calculate the remaining modes until the remaining residual cannot be decomposed further:
$$r_k = r_{k-1} - d_k, \qquad d_{k+1} = \frac{1}{I} \sum_{i=1}^{I} E_1\left(r_k + \beta_k E_k(\omega^{(i)})\right), \quad k = 2, \ldots, K,$$
where $K$ denotes the number of the CEEMDAN modes.
The final residual is calculated as
$$R = x - \sum_{k=1}^{K} d_k.$$
The original $x$ can be expressed as
$$x = \sum_{k=1}^{K} d_k + R.$$

Improvements on CEEMDAN.
Although the CEEMDAN method has largely overcome mode mixing, residual noise and spurious modes remain. On this basis, the improved CEEMDAN algorithm was proposed, which has two refinements. One is to estimate the local mean of the signal plus noise and define the difference between the current residue and the average of its local means as the primary mode, which reduces the residual noise existing in the decomposition mode. The other is to extract the $k$th mode by using $E_k(\omega^{(i)})$ to replace white noise, reducing mode overlap. Therefore, the improved CEEMDAN method is adopted to decompose the original traffic flow time series. The steps can be described as follows [38]:
Define operator $E_k(\cdot)$ as the $k$th mode decomposed by EMD, operator $M(\cdot)$ as the local mean of the mode, and operator $\langle \cdot \rangle$ as the mean operation over the $I$ noise realizations. Then,
Step 1: $x^{(i)} = x + \beta_0 E_1(\omega^{(i)})$ is constructed to calculate the first residue:
$$r_1 = \left\langle M\left(x^{(i)}\right) \right\rangle.$$
Step 2: the first mode can be calculated as
$$d_1 = x - r_1.$$
Step 3: the second residue is estimated as the mean of the local means of a series of $r_1 + \beta_1 E_2(\omega^{(i)})$, and the second mode is defined as
$$d_2 = r_1 - r_2 = r_1 - \left\langle M\left(r_1 + \beta_1 E_2(\omega^{(i)})\right) \right\rangle.$$
Step 4: for $k = 3, \ldots, K$, the $k$th residue is expressed as
$$r_k = \left\langle M\left(r_{k-1} + \beta_{k-1} E_k(\omega^{(i)})\right) \right\rangle.$$
Step 5: the $k$th mode of the improved CEEMDAN can be obtained:
$$d_k = r_{k-1} - r_k.$$
Step 6: go to Step 4 for the next $k$.

Fuzzy Entropy
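Fuzzy entropy (FE) measures the complexity of a time series and the probability of generating new patterns when the dimension changes. The higher the time series complexity, the higher the entropy [45]. The fuzzy membership function is introduced to make the fuzzy entropy vary continuously and smoothly with its parameters, which reduces its sensitivity to the parameters and keeps the statistical results stable [46]. The process of FE calculation is as follows [47]:
Step 1: the dimension $m$ is set for the IMF of the traffic flow time series $X = [x(1), x(2), \ldots, x(N)]$, and the $m$-dimensional vector is constructed as follows:
$$X^m(i) = [x(i), x(i+1), \ldots, x(i+m-1)] - u(i), \quad i = 1, 2, \ldots, N - m + 1;$$
then, $u(i)$ can be expressed as
$$u(i) = \frac{1}{m} \sum_{j=0}^{m-1} x(i+j).$$
Step 2: the distance $d^m_{ij}$ of vectors $X^m(i)$ and $X^m(j)$ is calculated as
$$d^m_{ij} = \max_{p = 0, \ldots, m-1} \left| \left(x(i+p) - u(i)\right) - \left(x(j+p) - u(j)\right) \right|, \quad i \neq j.$$
Step 3: a fuzzy membership function $A(\cdot)$ is introduced, in which $r$ denotes the similarity tolerance parameter, which means $R$ times the standard deviation of the original one-dimensional time series, namely, $r = R \times SD$. The similarity between vectors $X^m(i)$ and $X^m(j)$ is defined as
$$A^m_{ij} = A\left(d^m_{ij}\right).$$
Step 4: define the function
$$C^m_i(r) = \frac{1}{N - m} \sum_{j = 1, j \neq i}^{N - m + 1} A^m_{ij}.$$
Then,
$$\phi^m(r) = \frac{1}{N - m + 1} \sum_{i=1}^{N - m + 1} C^m_i(r).$$
Step 5: go to Step 1 for the next dimension $m + 1$ to obtain $\phi^{m+1}(r)$.
Step 6: the fuzzy entropy of the traffic flow time series can be expressed as
$$FE(m, r, N) = \ln \phi^m(r) - \ln \phi^{m+1}(r).$$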
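The six steps can be sketched in plain Python. This is an illustrative sketch under stated assumptions: the membership function is taken to be the exponential form exp(-(d/r)^n) that is common in the fuzzy entropy literature (the exact form used in this paper is not given here), and the default values `m=2`, `big_r=0.2`, and `n=2` are illustrative, not the paper's settings. The function and parameter names are hypothetical.

```python
import math


def fuzzy_entropy(series, m=2, big_r=0.2, n=2):
    """Fuzzy entropy of a 1-D sequence (sketch).

    m: embedding dimension; big_r: the factor R in r = R * SD;
    n: exponent of the assumed membership function exp(-(d / r) ** n).
    """
    size = len(series)
    mean = sum(series) / size
    sd = math.sqrt(sum((v - mean) ** 2 for v in series) / size)
    r = big_r * sd
    if r == 0:
        raise ValueError("constant series: the tolerance r is zero")

    def phi(dim):
        # Step 1: windows of length dim with their own mean removed.
        count = size - dim + 1
        vectors = []
        for i in range(count):
            window = series[i:i + dim]
            base = sum(window) / dim
            vectors.append([v - base for v in window])
        # Steps 2-4: average membership degree over all pairs i != j.
        total = 0.0
        for i in range(count):
            for j in range(count):
                if i != j:
                    d = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
                    total += math.exp(-((d / r) ** n))
        return total / (count * (count - 1))

    # Steps 5-6: repeat for dimension m + 1 and take the log ratio.
    return math.log(phi(m)) - math.log(phi(m + 1))


# Usage: a slow sine wave versus a chaotic logistic-map sequence.
sine = [math.sin(2 * math.pi * t / 200) for t in range(300)]
chaos, value = [], 0.3
for _ in range(300):
    value = 4 * value * (1 - value)
    chaos.append(value)
print(fuzzy_entropy(sine), fuzzy_entropy(chaos))
```

The regular sine wave should yield a much lower value than the chaotic sequence, which mirrors how low-frequency trend modes are separated from high-frequency modes by their FE. The double loop is quadratic in the sequence length, so a vectorized version would be preferable for full-length series.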

Temporal Convolutional Network
TCN combines the advantages of CNN and RNN: it captures global information and processes sequences in parallel. It contains three main modules: causal convolution, dilated convolution, and residual block.

Causal Convolution.
When processing sequential tasks, TCN needs to generate outputs with the same length as the input. All data in causal convolution strictly follow the causal relationship in time order, meaning that the value at time t depends only on the information at and before time t. Because of the strict time-constrained nature of causal convolution, TCN ensures causality and prevents future data leakage.

Dilated Convolution.
With the increasing length of the sequence, the network must be deepened to extract more historical features, making it hard to train. To simplify the network structure, dilated convolution is adopted, which enables an exponentially large receptive field. For a 1D sequence input $x \in \mathbb{R}^n$ and a filter $f: \{0, \ldots, k-1\} \to \mathbb{R}$, the dilated convolution operation $F$ on element $s$ of the sequence is defined as
$$F(s) = (x *_d f)(s) = \sum_{i=0}^{k-1} f(i) \cdot x_{s - d \cdot i},$$
where $d$ is the dilation factor, $k$ is the filter size, and $s - d \cdot i$ indicates the past direction. The structure of the causal and dilated convolution is shown in Figure 2. With the dilated convolution, the receptive field size of TCN is flexible, making it easy to capture the features of the global long sequence with a few hidden layers.
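The dilated causal convolution can be written directly from its definition. The sketch below is a plain-Python illustration rather than a network layer: the function names are hypothetical, positions before the start of the sequence are treated as zero padding (an assumption that keeps the output as long as the input), and the receptive field formula assumes one convolution per layer, whereas a residual block with two convolutions per dilation would see further back.

```python
def dilated_causal_conv(x, f, d=1):
    """F(s) = sum_i f[i] * x[s - d*i], treating x[t] as 0 for t < 0."""
    out = []
    for s in range(len(x)):
        acc = 0.0
        for i, weight in enumerate(f):
            idx = s - d * i
            if idx >= 0:  # earlier positions act as zero padding
                acc += weight * x[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Input steps visible to one output, with one convolution per dilation."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Usage: stack four layers with kernel size 2 and dilations 1, 2, 4, 8.
signal = [1.0] + [0.0] * 19          # unit impulse at t = 0
for dilation in (1, 2, 4, 8):
    signal = dilated_causal_conv(signal, [1.0, 1.0], dilation)
print(receptive_field(2, [1, 2, 4, 8]))
```

Doubling the dilation at each layer makes the receptive field grow exponentially with depth, which is why a few hidden layers suffice for long histories.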

Residual Block.
By learning the identity mapping function, the residual connection enables the network to transfer information across layers, which allows a deeper network, improves accuracy, and simplifies network training. Let X be the input of the residual module and F(·) the mapping learned by the layers that the connection skips; the result F(X) is added to the input X, so the output o of the residual module can be expressed as
$$o = \mathrm{Activation}(X + F(X)). \tag{22}$$
The structure of a residual block is shown in Figure 3.

Empirical Study
The empirical study uses data from two sensors on US101-S: VDS No. 717490 (mainline) and VDS No. 718462 (on-ramp). The sensors are shown in Figure 4, and the detailed information of the two sensors is shown in Table 1. The datasets were collected by Caltrans PeMS (https://pems.dot.ca.gov/) from 2018/8/1 to 2018/8/31. The flow of all lanes was aggregated into 5-minute intervals to reduce the volatility of the data and ensure real-time prediction. There were 8928 samples in each group of datasets. The rate of erroneous and missing data was less than 2%, making the datasets suitable for training and testing. The training and testing datasets were split at 2018/8/27. There were 7488 training samples and 1440 testing samples in each group, as shown in Figure 5. The autocorrelation of the traffic flow data obtained by VDS No. 717490 and VDS No. 718462 is shown in Figure 6. As the time lag increases, the autocorrelation of both sequences decreases slowly. Therefore, they are nonstationary time series with nonlinear changes which should be smoothed. In addition, when the time lag increases to 40, the autocorrelation is still over 0.3, indicating that the sequences have a long-time dependence, so TCN is suitable for the traffic flow prediction. The datasets were processed with TensorFlow 2.0.0 and Keras 2.3.1 under Python 3.6. Four indexes were introduced to measure the prediction accuracy: Mean Absolute Error (MAE), Root Mean Square Error (RMSE), R-squared, and the Geoffrey E. Havers (GEH) statistic.
They are calculated as follows:
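A minimal sketch of the four indexes, assuming their standard definitions: MAE is the mean of |ŷ − y|, RMSE is the square root of the mean of (ŷ − y)², R-squared is 1 − Σ(y − ŷ)² / Σ(y − ȳ)², and GEH for one sample is sqrt(2(ŷ − y)² / (ŷ + y)), averaged over all samples. The function names are hypothetical, and applying GEH directly to 5-minute flows (rather than hourly volumes) is an assumption of this sketch.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def geh_mean(actual, predicted):
    """Average GEH statistic; a sample with zero total flow contributes 0."""
    values = []
    for a, p in zip(actual, predicted):
        total = a + p
        values.append(math.sqrt(2 * (p - a) ** 2 / total) if total > 0 else 0.0)
    return sum(values) / len(values)


# Usage with illustrative (made-up) flows in vehicles per 5 minutes.
observed = [100.0, 200.0, 300.0]
forecast = [110.0, 190.0, 300.0]
print(mae(observed, forecast), rmse(observed, forecast),
      r_squared(observed, forecast), geh_mean(observed, forecast))
```

MAE and RMSE are in the units of the flow, while R-squared and GEH are dimensionless, so the latter two are easier to compare between the mainline and the ramp, whose flow magnitudes differ.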

Traffic Flow Sequence Decomposition and Recombination.
The improved CEEMDAN algorithm was adopted to decompose the traffic flow time series obtained by VDS No. 717490 and VDS No. 718462, respectively. The resulting 11 IMFs and one residual, ordered by frequency, are shown in Figure 7. The IMFs and residual with similar FE were recombined to reduce the calculation complexity and increase the forecasting efficiency and accuracy. The FE of each mode and the FE difference between modes were calculated for the traffic flow sequences of the two sensors, as shown in Table 2, and the changing trend of FE is shown in Figure 8. For VDS No. 717490, modes with FE differences less than 0.1 were recombined. Similarly, for VDS No. 718462, 0.05 was the difference threshold of recombination. The recombined subsequences of the two sensors are plotted in Figure 9. Figure 9 shows that each recombined subsequence reflects part of the traffic flow dynamics. For VDS No. 717490, IMF1, IMF2, and IMF3 are high-frequency modes with high FE and chaos. Although they are noise-like with poor predictability, reflecting the randomness and nonlinearity of traffic flow, they contain detailed information. Therefore, they need to be predicted separately. The FE of IMF4 is close to that of IMF5 and IMF6. The recombined subsequence reflects the specific daily change characteristics of traffic flow. There are two peaks every 288 data points, representing the morning and evening peaks of traffic flow, which have apparent differences. It is shown that the morning peak flow is higher than that of the evening on weekdays, while on weekends, the two peaks are similar and lower than those on weekdays.
The FE of IMF7 differs markedly from those of IMF6 and IMF8; IMF7 reflects that the overall trend of daily traffic flow increases first and then decreases. Together with IMF4, IMF5, and IMF6, it forms the medium-frequency modes, which have solid predictability and are the core of time series prediction.
IMF8, IMF9, IMF10, and IMF11, together with the residual, constitute the trend mode, reflecting the weekly traffic flow dynamics. It is shown that the traffic flow on weekdays is relatively stable and higher than that on weekends. The FE and the chaos of the trend mode are low, and the predictability is strong. The trend mode is the essential component of time series prediction. It is worth noting that the data on 2018/8/21 were unstable and fluctuating, so the corresponding subsequences of IMF7 and IMF8 + IMF9 + IMF10 + IMF11 + Residual changed noticeably on that day, disturbing the original changing cycle. The traffic flow obtained from VDS No. 718462 shows similar changing characteristics to VDS No. 717490. However, because the on-ramp has only one lane with more unstable traffic flow, the fluctuation frequency is higher than that of the mainline, resulting in weaker periodicity.
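The threshold-based recombination described above can be sketched as follows. The grouping rule is an assumption made for illustration: consecutive modes are merged while the FE difference to the previous mode stays below the threshold. The function names and the FE values in the usage example are hypothetical; the values were chosen only so that the grouping reproduces the one described for VDS No. 717490 and are not the values of Table 2.

```python
def group_by_fe(fe_values, threshold):
    """Merge consecutive modes whose FE differs from the previous mode by less than threshold."""
    groups = [[0]]
    for k in range(1, len(fe_values)):
        if abs(fe_values[k] - fe_values[k - 1]) < threshold:
            groups[-1].append(k)      # similar complexity: same subsequence
        else:
            groups.append([k])        # large FE gap: start a new subsequence
    return groups


def recombine(modes, groups):
    """Sum the modes of each group element-wise into one subsequence per group."""
    return [
        [sum(values) for values in zip(*(modes[k] for k in members))]
        for members in groups
    ]


# Hypothetical FE values for IMF1..IMF11 followed by the residual.
fe_example = [1.60, 1.20, 0.85, 0.50, 0.45, 0.42, 0.25, 0.08, 0.05, 0.03, 0.02, 0.01]
print(group_by_fe(fe_example, 0.1))
```

With these illustrative values, the 12 modes collapse into six subsequences (IMF1, IMF2, IMF3, IMF4 to IMF6, IMF7, and IMF8 to the residual), so six submodels are trained instead of twelve. Because recombination is a plain sum, the recombined subsequences still add up to the original series.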

Hyperparameter Optimization.
The accuracy of each prediction model is affected by various hyperparameters, which should be optimized before the prediction. For TCN, the number of filters, time lag, kernel size, and dilation factors are the crucial hyperparameters affecting the performance. The number of filters determines whether feature extraction is complete, the others affect the size of the receptive field, and all hyperparameters jointly influence the prediction accuracy of TCN. GridSearchCV from scikit-learn was used to score the performance of different hyperparameter combinations of each prediction model and to search for the best hyperparameters by 10-fold cross-validation. The range of each hyperparameter of each TCN module is shown in Table 3.

Results and Comparison.
The prediction effect can be divided into vertical and horizontal comparisons.

[Figure 2: Causal and dilated convolution. Layers from bottom to top: input, three hidden layers, output; dilation factors 1, 2, 4, and 8.]

The results show that the recombined subsequences have higher accuracy and less training time than the single IMFs. With recombination, the number of training models is reduced, and the computational complexity is decreased.
Besides the improved CEEMDAN algorithm, other methods based on EMD are introduced to optimize the prediction performance of TCN. The results are shown in Table 6 and Figure 10.
As shown in Figure 10, compared with the direct prediction, the accuracy of decomposition prediction is notably increased. As the EMD variant improves, the forecasting performance improves. Specifically, for VDS No. 717490, the error of ICEEMDAN-TCN is reduced by 69% compared with TCN. For VDS No. 718462, it is decreased by 59%. Furthermore, recombining similar modes according to FE can further improve efficiency and accuracy with less calculation complexity. In terms of prediction efficiency, the training time of the model after the recombination is reduced by 34%-49%. From the perspective of prediction accuracy, for VDS No. 717490, the error of ICEEMDAN-FE-TCN is further reduced by 3% compared with ICEEMDAN-TCN, and for VDS No. 718462, it is decreased by 5%. Overall, the proposed model has the lowest error and the least training time (except for the original TCN) on both sensors, indicating the best goodness of fit. Aug 31, 2018, was taken as an example to visualize the prediction performance of each model, as shown in Figure 11. Since there is little difference in visualization between X-TCN and X-FE-TCN, only X-FE-TCN is shown as representative. The prediction of the original TCN model approximately fits the actual data but has an apparent time delay. The reason is that the traffic flow time series consists of changing features with multiple frequencies. TCN cannot accurately capture the multiple-scaled dynamics, causing prediction error. The EMD-based models can decompose the sequence into different IMFs, making it easier for TCN to learn the characteristics of every scale so that the prediction performs better than the original model, and the hysteresis can be effectively eliminated. Among all the models compared, the improved CEEMDAN-FE-TCN achieves the best performance because of its superior decomposition ability.
Besides, although the fluctuation of ramp flow is stronger than that of the mainline flow, the proposed model also performs well on the ramp traffic flow prediction, which suggests strong robustness.
The results of the comparison with other prediction models (LSTM, GRU, SVR, and HA) are shown in Table 7 and Figure 12. The visualization is shown in Figures 13 and 14. The results above show that the traffic flow predicted single-step and multistep ahead by different algorithms approximately fits the original data, and the prediction accuracy obtained by decomposition forecasting is significantly improved compared with the direct prediction.
From the perspective of one-step-ahead prediction, for VDS No. 717490, the prediction accuracy of TCN is higher than that of LSTM, GRU, SVR, and HA. Under the framework of the improved CEEMDAN-FE-X, the prediction error of TCN, LSTM, GRU, and SVR is sharply decreased by 69%, 64%, 67%, and 44%, respectively. Among all the models compared, the improved CEEMDAN-FE-TCN model obtains the lowest MAE, RMSE, and GEH average at 7.36, 10.34, and 0.39, respectively, and the highest R-squared at 0.997. For VDS No. 718462, the prediction accuracy of TCN is higher than that of LSTM, GRU, SVR, and HA. Under the framework of the improved CEEMDAN-FE-X, the prediction error of TCN, LSTM, GRU, and SVR is sharply decreased by 59%, 55%, 54%, and 51%, respectively. Among all the models compared, the improved CEEMDAN-FE-TCN model obtains the lowest MAE, RMSE, and GEH average at 2.10, 2.96, and 0.39, respectively, and the highest R-squared at 0.968. Though the error of all models increases as the prediction step lengthens, the proposed model performs best on two-step and three-step ahead predictions, indicating its goodness of fit on long- and short-term predictions.
TCN, LSTM, and GRU all realize the memory of the long-term changing features. Therefore, with extensive training samples, they extract deeper traffic dynamics and are more accurate than the SVR and HA models, reducing the prediction error. However, unlike the RNN models, TCN can capture the whole long-term sequence features by convolving in parallel. So, it takes up less memory and avoids forgetting information, which lets it learn the global time series characteristics thoroughly and achieve higher accuracy than RNN. Furthermore, under the framework of the improved CEEMDAN-FE-X, the TCN, RNN, and SVR models all perform better than the direct prediction, which means the decomposition prediction is universal across different prediction models. What should be mentioned is that the proposed framework has the best optimization effect on the TCN model and has a better effect on the RNN model than on the SVR model for both the mainline and the ramp flow. In conclusion, for the reasons mentioned above, the improved CEEMDAN-FE-TCN outperforms the other models compared in this paper on the highway mainline and ramp traffic flow prediction.

[Figure: training time of TCN, EMD-TCN, EMD-FE-TCN, EEMD-TCN, EEMD-FE-TCN, CEEMDAN-TCN, CEEMDAN-FE-TCN, ICEEMDAN-TCN, and ICEEMDAN-FE-TCN; panel (a).]

Conclusions
In this paper, an improved CEEMDAN-FE-TCN model is proposed to forecast highway traffic flow. First, the improved CEEMDAN method decomposes the nonlinear highway traffic flow into IMFs and a residual with different frequencies. Then, the FE of each mode is calculated, and the modes with similar chaos are recombined as subsequences, highlighting the traffic dynamics. Finally, the TCN is applied to predict the different recombined subsequences. After reconstructing the output of TCN submodels, the predicted traffic flow data is obtained. The data of two sensors on US101-S, VDS No. 717490 and VDS No. 718462, collected from PeMS were tested. Compared with other models, the following conclusions are drawn:
(1) The improved CEEMDAN algorithm can decompose the traffic flow time series with different frequencies. The accuracy of time series prediction after decomposition and reconstruction is notably higher than direct prediction. Compared with conventional EMD-based models, the improved CEEMDAN-FE-TCN obtains the lowest prediction error.
(2) The FE algorithm can calculate the chaos of the modes decomposed from the original data. By recombining the modes with similar FE, the main dynamics of traffic flow are highlighted while amplifying the details of fluctuations. The prediction efficiency and accuracy are further improved.
(3) The improved CEEMDAN-FE-X framework has remarkable effects on single-step and multistep traffic flow prediction. Under this structure, the prediction accuracy of the TCN, LSTM, GRU, and SVR models is significantly increased. The proposed model outperforms the other models in this paper on the highway mainline and ramp traffic flow prediction, confirming the robustness.
Studies can be combined with other aspects in future work, such as adding spatial factors into the time series prediction. Decomposing the spatiotemporal graph and selecting suitable models for the subgraphs of different frequencies to make predictions may improve accuracy.

Data Availability
All of the data related to this paper are available on Caltrans PeMS (https://pems.dot.ca.gov/).

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.