Freeway travel time prediction is a key technology of Intelligent Transportation Systems (ITS). Many scholars have found that periodic functions help improve the accuracy of travel time prediction models. However, very few studies have comprehensively evaluated the impacts of different periodic functions on statistical and machine learning models. In this paper, our primary objective is to evaluate the performance of six commonly used multistep ahead travel time prediction models (three statistical models and three machine learning models). In addition, we compare the impacts of three periodic functions on multistep ahead travel time prediction at different temporal scales (5-minute, 10-minute, and 15-minute). The results indicate that the periodic functions improve the prediction performance of machine learning models for predictions more than 60 minutes ahead and improve the accuracy of statistical models for predictions more than 30 minutes ahead. The three periodic functions differ only slightly in how much they improve the accuracy of the six prediction models. For the same prediction step, the effect of the periodic function is more pronounced at a higher level of aggregation.

Travel time can effectively measure roadway traffic conditions [

Some researchers compared the performance of statistical models and machine learning models. For example, Stathopoulos et al. [

Traffic data usually exhibit periodic characteristics during weekdays, so accounting for the periodicity of the data can improve prediction performance. To date, three different approaches have been proposed to capture the periodic characteristics. Zou et al. [

Regarding the prediction interval (steps), some existing studies have investigated the impact of data resolution on model prediction performance, but there are no definitive results. For example, Park et al. [

Previous studies have compared statistical and machine learning models, and some scholars have shown that periodic functions improve travel time prediction. However, few studies have comprehensively evaluated the effects of different periodic functions on the two types of models under different prediction steps. Thus, this study focuses on multistep ahead travel time prediction with different periodic functions. The periodic characteristics of the travel time are captured by the SA, TPF, and DES models. The residual part is modeled by statistical models (ARIMA, the space-time (ST) model, and the vector autoregressive (VAR) model) and machine learning models (support vector machine (SVM), back propagation neural network (BPNN), and multilinear regression (MLR)). In total, 18 hybrid prediction models were established and compared. In addition, the performance of the prediction models was evaluated under different scenarios: multistep ahead prediction (1, 3, 6, and 12 steps ahead) at different aggregation levels (5-minute, 10-minute, and 15-minute).

The remainder of the paper is organized as follows. In Section

This study analyzed the travel time data of US-290 between IH-610 and FM-1960 in Houston, Texas. The total length is approximately 12 miles. The segment is divided into five links by six automatic vehicle identification (AVI) readers (Figure

Study links along the US-290 [

We calculate the historical average travel time per mile of the five links (Monday to Friday, January to August 2008) (Figure

Historical median travel times on the five links.

Changes in traffic flow have certain temporal and spatial characteristics. Autocorrelation and cross-correlation functions were calculated to examine the temporal and spatial correlation. The equation adopted here follows that of Zou et al. [

In this case, the cross-correlation function measures the temporal and spatial correlation between the travel time data pairs recorded on two selected links. For travel time data pairs _{c} cross-correlation function is

We found that the autocorrelation function of travel time shows a downward trend with time lag (Figure

Autocorrelation functions of 5-minute travel time on links A, B, C, D, and E. (a) Link A. (b) Link B. (c) Link C. (d) Link D. (e) Link E.

Cross-correlation functions of 5-minute travel time on link D.
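The autocorrelation and cross-correlation analysis described above can be illustrated with a minimal sketch. It assumes the conventional sample estimators (mean-centered, normalized by the zero-lag variance); the function names and the synthetic series are illustrative and are not taken from the study.

```python
import numpy as np

def autocorrelation(x, max_lag):
    """Sample autocorrelation of a 1-D series for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    denom = np.sum(xc ** 2)
    return np.array([np.sum(xc[lag:] * xc[:len(x) - lag]) / denom
                     for lag in range(max_lag + 1)])

def cross_correlation(x, y, max_lag):
    """Sample cross-correlation between x(t) and y(t + lag) for lags 0..max_lag."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    denom = np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2))
    return np.array([np.sum(xc[:len(x) - lag] * yc[lag:]) / denom
                     for lag in range(max_lag + 1)])

# Synthetic series with a period of 50 samples (illustrative only).
series = np.sin(np.arange(200) * 2 * np.pi / 50)
acf = autocorrelation(series, 60)
ccf = cross_correlation(series, series, 60)
```

For a periodic series the autocorrelation rises again near one full period, which is the kind of pattern that motivates modeling the periodic component separately.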

Previous research showed that travel time exhibits periodic characteristics during the weekdays. Similar periodic characteristics were found by Kamarianakis et al. [

Simple average method is one of the commonly used methods to describe the periodic characteristics [

The trigonometric polynomial uses sine and cosine terms to describe the periodic pattern. Equation (

Regarding the selection of optimal number of trigonometric series functions, Zou et al. [_{r} in equations (
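A trigonometric periodic component of this kind can be fitted by ordinary least squares. The sketch below assumes a constant term plus `n_terms` sine/cosine pairs with a known period; `fit_trig_polynomial`, `period`, and `n_terms` are hypothetical names, and the synthetic daily pattern (288 five-minute intervals per day) is for illustration only.

```python
import numpy as np

def fit_trig_polynomial(y, period, n_terms):
    """Least-squares fit of a constant plus n_terms sine/cosine pairs."""
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y))
    columns = [np.ones(len(y))]
    for k in range(1, n_terms + 1):
        angle = 2.0 * np.pi * k * t / period
        columns.append(np.sin(angle))
        columns.append(np.cos(angle))
    design = np.column_stack(columns)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef, design @ coef

# Three synthetic days of 5-minute data with a known periodic shape.
t = np.arange(288 * 3)
y = 3.0 + 2.0 * np.sin(2 * np.pi * t / 288) + np.cos(4 * np.pi * t / 288)
coef, fitted = fit_trig_polynomial(y, period=288, n_terms=2)
```

The coefficient vector is ordered as constant, then sine and cosine for each harmonic. Choosing `n_terms` trades smoothness of the periodic curve against closeness of fit.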

Double exponential smoothing is a widely used method for both smoothing and forecasting time series. This approach builds the prediction from the level (mean) and the trend of the series at time $t$. The model can be expressed as
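As a minimal sketch of double exponential smoothing, the code below assumes Holt's two-parameter form, which tracks a level and a trend. The smoothing constants `alpha` and `beta`, the initialization from the first two observations, and the function name are illustrative assumptions, not values or choices reported in this study.

```python
def double_exponential_smoothing(series, alpha, beta, horizon=1):
    """Holt-style forecast `horizon` steps past the end of `series`."""
    level = series[0]
    trend = series[1] - series[0]
    for value in series[1:]:
        previous_level = level
        # Update the level from the new observation and the previous projection.
        level = alpha * value + (1 - alpha) * (level + trend)
        # Update the trend from the change in level.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + horizon * trend

# A perfectly linear series is extrapolated exactly.
linear = [2 * t + 1 for t in range(10)]
forecast = double_exponential_smoothing(linear, alpha=0.5, beta=0.3, horizon=3)
```

With this initialization a linear series keeps its level and trend unchanged at every update, so the forecast simply continues the line.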

ARIMA model is as equation (

To build a model for the predictive spread,

The coefficients

The stability of the VAR model could be guaranteed through the characteristic polynomial

A

Then, it could be estimated that

The optimization problem above is solved using Lagrange multipliers. The regression function is given by

First, equation (

Second, the predicted value of the output layer can be calculated as

where the coefficients indexed $0, \ldots, j$ are the regression parameters, which are estimated from the training samples.

As mentioned in Section _{t} is the periodic component; and

Periodic component can be described by three kinds of functions (TPF, SA, and DES), and the residual part is modeled by six prediction models. We compare the impacts of different periodic functions on multistep ahead freeway travel time prediction models using travel time data with different aggregation levels.
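The decomposition into a periodic component plus a residual can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the periodic component uses a simple average over training days, and the residual is modeled with a first-order autoregression fitted by least squares as a stand-in for the six residual models. All function names are hypothetical.

```python
import numpy as np

def simple_average_profile(train, per_day):
    """Average travel time at each time-of-day slot across training days."""
    days = np.asarray(train, dtype=float).reshape(-1, per_day)
    return days.mean(axis=0)

def hybrid_forecast(train, per_day, steps):
    """Forecast `steps` intervals past the end of `train` (whole days only)."""
    train = np.asarray(train, dtype=float)
    profile = simple_average_profile(train, per_day)
    residual = train - np.tile(profile, len(train) // per_day)
    # AR(1) coefficient by least squares: r_t ~ phi * r_{t-1}.
    denom = np.dot(residual[:-1], residual[:-1])
    phi = np.dot(residual[1:], residual[:-1]) / denom if denom > 0 else 0.0
    # Multistep residual forecast decays geometrically from the last residual.
    residual_forecast = residual[-1] * phi ** np.arange(1, steps + 1)
    future_slots = np.arange(len(train), len(train) + steps) % per_day
    return profile[future_slots] + residual_forecast

# Toy example: 5 identical "days" of 4 intervals each (illustrative values).
pattern = np.array([60.0, 90.0, 120.0, 80.0])
train = np.tile(pattern, 5)
forecast = hybrid_forecast(train, per_day=4, steps=6)
```

Because the residual forecast shrinks toward zero as the horizon grows, long-horizon forecasts are dominated by the periodic profile. This is one intuition for why a periodic component matters more at longer horizons.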

To evaluate the multistep prediction performance of all prediction models, three indicators are considered together: mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). The equations for the three indicators are as follows:
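As a minimal sketch, the three indicators can be computed with their conventional definitions as shown below. The function names and the sample numbers are illustrative and are not taken from the study; MAPE is expressed in percent and assumes strictly positive observed travel times.

```python
import numpy as np

def mae(actual, predicted):
    """Mean absolute error, in the units of the data (seconds here)."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return np.mean(np.abs(actual - predicted))

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return 100.0 * np.mean(np.abs(actual - predicted) / actual)

def rmse(actual, predicted):
    """Root mean square error, in the units of the data (seconds here)."""
    actual, predicted = np.asarray(actual, float), np.asarray(predicted, float)
    return np.sqrt(np.mean((actual - predicted) ** 2))

observed = [100.0, 200.0, 400.0]   # illustrative travel times (s)
forecast = [110.0, 180.0, 400.0]   # illustrative predictions (s)
scores = (mae(observed, forecast), mape(observed, forecast), rmse(observed, forecast))
```

RMSE penalizes large errors more heavily than MAE, while MAPE normalizes each error by the observed travel time, so the three indicators can rank models differently.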

So far, there is no automatic way to select the model training period. This study considered training periods of 15, 20, 25, 30, 40, 50, and 60 days. For comparison, the travel time data in August (21 weekdays) were used as the test set. Figure

Comparison of training period for the different models.

In this part, the multistep ahead prediction performance of the SVM, BPNN, MLR, ARIMA, ST, and VAR models under different aggregation levels (i.e., 5-minute, 10-minute, and 15-minute) is evaluated using the travel time data observed on link D. In addition, we explore the impacts of different periodic functions on the statistical and machine learning models under different aggregation levels of the input data. The testing period is 15:30 to 19:30 on the 21 weekdays from 1 August to 31 August.
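The relationship between aggregation level, prediction step, and minutes ahead can be sketched as follows. The sketch assumes that coarser series are obtained by averaging consecutive 5-minute values, which is an assumption about the preprocessing rather than a detail stated here; the function names are hypothetical.

```python
import numpy as np

def aggregate(series_5min, level_minutes):
    """Average consecutive 5-minute values into `level_minutes` intervals."""
    if level_minutes % 5 != 0:
        raise ValueError("level_minutes must be a multiple of 5")
    factor = level_minutes // 5
    series = np.asarray(series_5min, dtype=float)
    usable = len(series) - len(series) % factor  # drop an incomplete last bin
    return series[:usable].reshape(-1, factor).mean(axis=1)

def steps_for_horizon(horizon_minutes, level_minutes):
    """Prediction steps needed for a horizon, or None if it is not reachable."""
    if horizon_minutes % level_minutes != 0:
        return None
    return horizon_minutes // level_minutes

five_min = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # illustrative values
ten_min = aggregate(five_min, 10)
fifteen_min = aggregate(five_min, 15)
```

A horizon that is not a multiple of the aggregation level cannot be reached with a whole number of steps, for example 10 minutes ahead at the 15-minute level. This is consistent with the dashes in the 10-minute and 15-minute result tables.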

The study provides the MAE, MAPE, and RMSE values of SVM, BPNN, MLR, ARIMA, ST, and VAR models for different forecasting horizons under different aggregation levels (5-minute, 10-minute, and 15-minute) for the input data (Tables

MAE, MAPE, and RMSE of the six models for different minutes-ahead predictions at the 5-minute aggregation level.

Time scale: 5 min | Minutes ahead predictions | ||||||
---|---|---|---|---|---|---|---|

MAE(s) | Model | 5 min | 10 min | 15 min | 30 min | 45 min | 60 min |

Machine learning models | SVM | ||||||

BPNN | 18.10 | 26.10 | 34.87 | 48.02 | 54.07 | 60.94 | |

MLR | 20.21 | 34.01 | 46.32 | 74.53 | 95.71 | 111.05 | |

Statistical models | ARIMA | 19.80 | 31.13 | 40.56 | 57.02 | 65.98 | 77.50 |

ST | |||||||

VAR | 19.39 | 30.74 | 39.31 | 56.48 | 68.82 | 78.16 | |

MAPE (%) | Model | 5 min | 10 min | 15 min | 30 min | 45 min | 60 min |

Machine learning models | SVM | ||||||

BPNN | 7.20 | 10.57 | 14.73 | 23.77 | 26.28 | 30.82 | |

MLR | 9.38 | 17.66 | 25.16 | 43.13 | 55.28 | 64.55 | |

Statistical models | ARIMA | 9.42 | 15.90 | 21.66 | 32.78 | 38.26 | 45.21 |

ST | 38.60 | ||||||

VAR | 9.10 | 15.15 | 19.72 | 28.19 | 33.16 | ||

RMSE(s) | Model | 5 min | 10 min | 15 min | 30 min | 45 min | 60 min |

Machine learning models | SVM | 41.08 | 83.07 | 91.32 | |||

BPNN | 28.38 | 53.60 | 73.15 | ||||

MLR | 29.67 | 46.96 | 61.95 | 92.99 | 118.52 | 134.65 | |

Statistical models | ARIMA | 28.99 | 44.47 | 56.80 | 77.89 | 108.58 | |

ST | 28.93 | 55.60 | 94.27 | 108.50 | |||

VAR | 43.62 | 78.74 | 98.34 |

MAE, MAPE, and RMSE of the six models for different minutes-ahead predictions at the 10-minute aggregation level.

Time scale: 10 min | Minutes ahead predictions | ||||||
---|---|---|---|---|---|---|---|

MAE(s) | Model | 5 min | 10 min | 15 min | 30 min | 45 min | 60 min |

Machine learning models | SVM | — | 21.79 | — | — | ||

BPNN | — | — | 44.41 | — | 59.36 | ||

MLR | — | 30.32 | — | 71.39 | — | 109.57 | |

Statistical models | ARIMA | — | 29.32 | — | 62.79 | — | 86.27 |

ST | — | — | — | 71.44 | |||

VAR | — | 26.89 | — | 51.77 | — | ||

MAPE (%) | Model | 5 min | 10 min | 15 min | 30 min | 45 min | 60 min |

Machine learning models | SVM | — | — | — | |||

BPNN | — | 8.92 | — | 20.81 | — | 30.91 | |

MLR | — | 15.51 | — | 41.05 | — | 64.00 | |

Statistical models | ARIMA | — | 14.93 | — | 36.55 | — | 50.69 |

ST | — | — | — | 37.60 | |||

VAR | — | 13.49 | — | 26.79 | — | ||

RMSE(s) | Model | 5 min | 10 min | 15 min | 30 min | 45 min | 60 min |

Machine learning models | SVM | — | — | — | |||

BPNN | — | 35.67 | — | 70.50 | — | 87.33 | |

MLR | — | 41.98 | — | 89.53 | — | 132.75 | |

Statistical models | ARIMA | — | 41.49 | — | 83.54 | — | 115.66 |

ST | — | 40.09 | — | — | 104.54 | ||

VAR | — | — | 73.83 | — |

MAE, MAPE, and RMSE of the six models for different minutes-ahead predictions at the 15-minute aggregation level.

Time scale: 15 min | Minutes ahead predictions | ||||||
---|---|---|---|---|---|---|---|

MAE(s) | Model | 5 min | 10 min | 15 min | 30 min | 45 min | 60 min |

Machine learning models | SVM | — | — | 45.79 | |||

BPNN | — | — | 25.41 | 38.48 | 61.38 | ||

MLR | — | — | 37.14 | 68.42 | 88.75 | 106.79 | |

Statistical models | ARIMA | — | — | 32.64 | 56.16 | 65.82 | 79.20 |

ST | — | — | 49.01 | 60.88 | 68.28 | ||

VAR | — | — | 30.87 | ||||

MAPE (%) | Model | 5 min | 10 min | 15 min | 30 min | 45 min | 60 min |

Machine learning models | SVM | — | — | ||||

BPNN | — | — | 11.09 | 16.25 | 21.66 | 28.05 | |

MLR | — | — | 20.99 | 39.86 | 51.47 | 62.47 | |

Statistical models | ARIMA | — | — | 18.19 | 33.01 | 37.99 | 48.02 |

ST | — | — | 31.12 | 35.59 | |||

VAR | — | — | 16.48 | 25.19 | |||

RMSE(s) | Model | 5 min | 10 min | 15 min | 30 min | 45 min | 60 min |

Machine learning models | SVM | — | — | ||||

BPNN | — | — | 41.57 | 65.63 | 72.81 | 95.45 | |

MLR | — | — | 49.56 | 85.06 | 112.00 | 130.03 | |

Statistical models | ARIMA | — | — | 45.50 | 75.91 | 90.63 | 106.43 |

ST | — | — | 45.06 | 71.97 | 91.30 | 100.50 | |

VAR | — | — |

Bold values indicate the smallest MAE, MAPE, and RMSE values in machine learning models and statistical models, respectively.

To investigate whether the periodic functions improve the performance of the six prediction models, both the hybrid models and the original prediction models are used to predict travel times on link D for the same testing period (15:30 to 19:30 from 1 August to 31 August). The effect of the periodic functions is consistent across the MAE, MAPE, and RMSE indices. The figure shows the RMSE results of the six models and 18 hybrid models for different forecasting horizons under different aggregation levels (5-minute, 10-minute, and 15-minute) (Figure

RMSE values of six models for different minutes ahead predictions. (a) RMSE values of SVM models for different minutes ahead predictions. (b) RMSE values of BPNN models for different minutes ahead predictions. (c) RMSE values of MLR models for different minutes ahead predictions. (d) RMSE values of ARIMA models for different minutes ahead predictions. (e) RMSE values of ST models for different minutes ahead predictions. (f) RMSE values of VAR models for different minutes ahead predictions.

From the above analysis, we can conclude that the periodic functions clearly improve the accuracy of the six prediction models for multistep ahead prediction. We then quantify the impact of each periodic function on the prediction models using the mean absolute error difference (MAED), defined as follows:

MAED for multistep ahead prediction with different aggregation levels (5-minute, 10-minute, and 15-minute). (a) MAED for 1-step ahead prediction. (b) MAED for 3-step ahead prediction. (c) MAED for 6-step ahead prediction. (d) MAED for 12-step ahead prediction.

In this section, we discuss the aggregation level and periodic function suggestions. According to the conclusion of Table

Aggregation level suggestions and periodic function suggestions.

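| Models | Minutes forecasting ahead (min) | Suggested aggregation level (min) | Suggested periodic function |
|---|---|---|---|
| SVM, BPNN | 5 | 5 | — |
| SVM, BPNN | 10 | 10 | — |
| SVM, BPNN | 15 | 15 | — |
| SVM, BPNN | 30 | 15 | — |
| SVM, BPNN | 45 | 15 | — |
| SVM, BPNN | 60 | 15 | TPF |
| SVM, BPNN | 90 | 15 | TPF |
| SVM, BPNN | 120 | 15 | TPF |
| ARIMA, ST, VAR, MLR | 5 | 5 | TPF |
| ARIMA, ST, VAR, MLR | 10 | 10 | SA |
| ARIMA, ST, VAR, MLR | 15 | 15 | SA |
| ARIMA, ST, VAR, MLR | 30 | 15 | SA |
| ARIMA, ST, VAR, MLR | 45 | 15 | SA |
| ARIMA, ST, VAR, MLR | 60 | 15 | SA |
| ARIMA, ST, VAR, MLR | 90 | 15 | SA |
| ARIMA, ST, VAR, MLR | 120 | 15 | SA |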

This paper evaluated the multistep ahead prediction performance of the SVM, BPNN, MLR, ARIMA, ST, and VAR models using freeway travel time data collected from automatic vehicle identification readers along US-290 in Houston, Texas. The performance of the six prediction models was compared under different aggregation levels (5-minute, 10-minute, and 15-minute), and the impacts of different periodic functions on the machine learning and statistical models were investigated under the same aggregation levels. Several important conclusions can be drawn from the results. First, the periodic functions improve the prediction performance of machine learning models for predictions more than 60 minutes ahead and improve the accuracy of all statistical models for predictions more than 30 minutes ahead. Second, the three periodic functions differ only slightly in how much they improve the accuracy of the six prediction models in multistep ahead prediction. Third, as the number of prediction steps increases, the impact of the periodic function on the prediction model becomes more pronounced. Fourth, for the same prediction step, the effect of the periodic function becomes more pronounced as the data aggregation level increases. For future work, since nonrecurrent events (incidents, special events, etc.) may disturb the cyclical pattern of travel time, it would be interesting to analyze and compare the impacts of periodic functions on prediction models under nonrecurrent traffic conditions. In addition, artificial intelligence has greatly promoted the development of traffic science. Deep learning algorithms in particular, such as deep residual networks, recurrent neural networks, and convolutional neural networks, have developed rapidly in the transportation field. It would also be interesting to examine the impact of different periodic functions on deep learning algorithms.

The data used to support the findings of this study are available from the corresponding author upon request.

The authors declare that they have no conflicts of interest.

This research was funded by the National Key Research and Development Program of China (grant no. 2018YFE0102800).