Traffic Speed Forecast in Adjacent Region between Highway and Urban Expressway: Based on MFD and GRU Model



Introduction
With the continuous growth of China's highway network and traffic volume, the traffic load on intercity highways in some developed cities is increasing. Many adjacent regions (networks of in- and out-of-town roads) between highways and expressways have become part of the urban commuting road network. In addition, because of traffic management restrictions, trucks are not allowed to enter the urban area during fixed hours. They can only drive in the adjacent regions between the highway and the urban expressway, which causes trucks to accumulate, resulting in low road capacity and service level during peak hours, severe traffic congestion, and frequent traffic accidents. Moreover, abnormal weather often leads to local traffic congestion, which gradually spreads through the regional network. It is therefore essential to study traffic state prediction at the intersection of highways and urban expressways and to publish accurate traffic guidance information to travelers to alleviate traffic congestion. The evolution of road traffic flow has complex nonlinear characteristics [1], which makes accurate traffic flow prediction challenging. Many machine learning algorithms have been applied in transportation research [2], especially for traffic flow prediction. Liu et al. proposed a hybrid road network traffic speed prediction model based on the state-space neural network and extended Kalman filter [3]. Zhang et al. predicted the traffic speed considering the heterogeneity of different roads [4]. Dhivyabharathi et al. proposed a method for predicting river traffic time using the particle filtering method [5]. Zhao et al. integrated the charging data and microwave detection data to predict traffic speed [6]. Zhao et al. proposed a prediction algorithm combining equal spacing interpolation and Sage-Husa adaptive Kalman filtering [7]. Wang et al.
improved the reliability method of driving time prediction based on GPS point velocity distribution by calculating the variable velocity distribution coefficient [8]. Zhou et al. proposed a recurrent neural network based microscopic car-following model for predicting traffic oscillation [9]. Wang and Goodchild developed a logit model to determine the factors influencing truck routes and to estimate the driving time [10]. Jula et al. developed a hybrid method composed of dynamic programming and a genetic algorithm to find trucks' shortest paths [11]. Dong et al. proposed a traffic crash prediction method based on the support vector regression (SVR) model [12]. Yu et al. proposed the random forests based on near neighbor (RFNN) method to predict bus travel time [13]. Xie and Wei proposed an Elman neural network to predict truck speed [14]. Wang and Xu constructed a short-term traffic flow prediction model for urban expressways based on the Long Short-Term Memory (LSTM) network [15]. Luo et al. proposed a short-term traffic flow prediction model based on deep learning that combines the features of a convolutional neural network and a support vector regression classifier [16]. Yao et al. discussed the application of Support Vector Machine theory to predict road travel time [17]. Wu et al. proposed a traffic flow prediction model based on Deep Neural Networks (DNN) that utilizes the weekly/daily periodicity and space-time characteristics of traffic flow [18]. Jia et al. proposed a deep belief network (DBN) model, a deep learning method for short-term traffic speed prediction [19]. Because commonly used weight optimization algorithms cannot adjust the learning rate adaptively, Zhao et al. adopted Adam, Adadelta, and RMSprop to optimize the weights in the GRU model [20]. Wang et al.
established a travel time prediction model based on the LSTM (Long Short-Term Memory) network that considers precipitation data [21]. Zang et al. proposed an all-day traffic speed prediction method for elevated highways based on deep learning [22]. Peng et al. proposed a 3D Convolutional Neural Network-Deep Neural Network method to recognize and predict traffic status from aerial videos [23]. The authors have also conducted many studies on how to improve prediction accuracy [24][25][26]. Existing machine learning algorithms cannot fully capture the essential characteristics of traffic flow. Deep learning models, such as GRU, can learn the inherent complex features effectively and predict traffic flow without prior knowledge [27]. Using deep learning algorithms to mine the regularities of traffic flow has become the direction of traffic state prediction. The prediction of road segment traffic speed belongs to the study of microscopic traffic flow characteristics. It is also affected by traffic states at the macro level, such as those of the road network and neighboring regions. At present, macro- and microcharacteristics are rarely combined to predict traffic speed. The Macroscopic Fundamental Diagram (MFD) is a model reflecting the road network's macroscopic traffic state. According to specific indicators, MFD divides a complex and large road network into several independent subregions and implements appropriate control optimization strategies according to the subareas' characteristics. Based on the subregion division results, the road sections that are most similar to the predicted target road can be selected. Meanwhile, the spatial and temporal correlation between traffic flows in each subregion can be analyzed. The subregions with strong correlation can be selected to construct the traffic flow sequence dataset that is input to the prediction model, which helps improve the prediction accuracy. Many scholars have studied the subregional division method of MFD.
Ji and Geroliminis proposed a static subregion division method based on the normalized cut (Ncut), which uses the spatial characteristics of traffic congestion and minimizes the subarea's vehicle density [28]. Ji et al. also proposed a dynamic subarea delineation method based on GPS data targeting the maximum connected element [29]. Haddad and Geroliminis proposed a division method based on the operational stability of the subregions [30]. Ma et al. used a spectral approach to divide traffic zones based on neighboring intersections [31]. Ncut is a graph theory-based partitioning algorithm derived from the image partitioning domain. This algorithm not only considers the similarities within regions but also normalizes the similarities within regions using the similarities between regions. The cutting scheme that minimizes the similarities between regions after normalization is then found. In this paper, the Ncut algorithm is used to divide the traffic subregions.
Then, the stability of the MFD for each subregion is calculated and analyzed to justify the division results.
GPS data has a wide coverage area and can better reflect the characteristics of urban road traffic flow [32]. Moreover, in recent years, China has strengthened the supervision of freight vehicles. GPS devices are required to be installed on large heavy-haul trucks to monitor their running status. This produces a large amount of trajectory data, especially in the adjacent region between highway and expressway. In this paper, based on the average roadway speed and flow data extracted from the truck GPS data, a short-time traffic flow prediction method combining MFD and GRU is proposed. Using the characteristics of MFD, the road network area is divided into subregions, and the microscopic traffic flow characteristics and macroscopic traffic conditions are combined to develop a traffic forecasting method. Tests on real traffic flow data show that the proposed method has lower prediction errors and higher accuracy than the existing prediction models. It is a reasonable and effective method for predicting short-time traffic flow. The technical framework of this paper is shown in Figure 1.

Subdivision Method of Road Network Based on MFD

Construction of Road Network Weighted Graph Based on Traffic Operation Similarity.
A stable MFD exists in a network of roads with operational homogeneity. A large area can be divided into subregions based on the operational homogeneity of traffic. The starting point of the MFD theory is to study the relationship between traffic demand and traffic supply in the road network. The maximum traffic volume directly reflects the traffic supply of each road section and of the overall road network, and traffic volume data can be easily obtained through traffic flow detection. Thus, the traffic volume is taken as the fundamental traffic characteristic of the road section in this paper. The road section's maximum traffic volume is used to define the traffic operation similarity between adjacent connected sections.
The similarity degree of traffic operation between adjacent connected sections i and j in road network G is defined in terms of $q_i^m$, the maximum traffic volume of section i in road network G, and $q_j^m$, the maximum traffic volume of section j in road network G. Using an exponential (natural constant) transformation, the squared difference between the maximum traffic volumes of adjacent connected sections is mapped to the interval 0-1. A similarity of 0 indicates the lowest traffic operation similarity between the road sections, and a similarity of 1 indicates the highest.
Based on graph theory, a "node-arc transformation" is first applied to the road network, so that the similarity degree of traffic operation between adjacent road sections is expressed as the weight of the arc in the graph. The Laplace matrix is constructed based on the similarity degree of traffic operation, which is the basis for the subgraph division in graph theory. The details are as follows.
When road segment i in road network G is connected with road segment j, $a_{ij} = 1$. When segment i is disconnected from segment j, $a_{ij} = 0$. When i is equal to j, $a_{ij} = 0$. The weighted adjacency matrix of road network G is W, whose elements weight each connection by the traffic operation similarity of the two sections. This paper adopts road sections as the nodes V of the undirected graph. With D the diagonal degree matrix of W, $D - W$ is the Laplace matrix of road network G, in which every row and every column sums to zero. Based on these transformations and calculations, it is possible to obtain a road network weighted by the traffic operation similarity between adjacent connected sections.
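The construction above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the similarity is assumed to take the form exp(-(q_i - q_j)^2), which matches the verbal description (an exponential transformation of the squared difference, mapped to 0-1) but may differ from the paper's exact formula, for example by a scaling factor. The function names and the sample volumes, which are scaled to 0-1, are hypothetical.

```python
import math


def similarity(q_i, q_j):
    # Assumed form: exponential of the negative squared gap, giving a value in (0, 1].
    return math.exp(-((q_i - q_j) ** 2))


def weighted_laplacian(q_max, edges):
    """Return (W, D, L) with L = D - W, treating road sections as graph nodes."""
    n = len(q_max)
    W = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        if i != j:  # a_ii = 0: a section is not connected to itself
            W[i][j] = W[j][i] = similarity(q_max[i], q_max[j])
    # D is diagonal and holds the row sums of W.
    D = [[sum(W[i]) if i == j else 0.0 for j in range(n)] for i in range(n)]
    L = [[D[i][j] - W[i][j] for j in range(n)] for i in range(n)]
    return W, D, L


# Hypothetical maximum volumes, scaled to 0-1 so the squared gaps stay small.
q_max = [0.9, 0.8, 0.3, 0.2]
edges = [(0, 1), (1, 2), (2, 3)]
W, D, L = weighted_laplacian(q_max, edges)
```

Sections with similar maximum volumes (0 and 1, or 2 and 3) receive weights close to 1, while the link between the dissimilar sections 1 and 2 receives a smaller weight, which is what makes it a natural place to cut.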

Road Network Subarea Division Method Based on Normalized Cut.
Ncut is one of the subgraph partition methods in graph theory and divides subgraphs at the macro level. The focus is not on the graph's details but on its overall characteristics. The optimal normalized cut problem seeks the partition of the graph that minimizes the normalized cut value, and finding this minimum is an NP-hard problem. The spectral clustering method is widely used to solve such NP-hard problems approximately by computing eigenvalues and eigenvectors. Therefore, the Fiedler method is used to calculate the eigenvalues and eigenvectors of the matrix and divide the subareas of the road network. The point set V of graph G is divided into two subsets, A and B, and the optimal normalized cut problem of A and B is expressed through x, a column vector consisting of 1 and −1 whose i-th element indicates whether the i-th node is assigned to subregion A. Since all rows and columns of a Laplace matrix sum to 0, the matrix always has an eigenvalue of 0. If graph G is connected, the second smallest eigenvalue is positive. The corresponding eigenvector is called the Fiedler vector, which contains important information about the graph; that is, the numerical size of the elements in the Fiedler vector reflects the correlation of their corresponding vertices. When the road network is divided according to the Fiedler vector, the vertices can be split according to different critical values S. There are many methods for selecting S, among which the 0-point method is practical and straightforward.
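The bipartition step described above can be illustrated with a small, dependency-free sketch. It approximates the Fiedler vector of the unnormalized Laplace matrix D - W by power iteration and then applies the 0-point rule (S = 0). Two assumptions should be noted: the paper does not say how the eigenvectors are computed, and a full normalized-cut relaxation would usually solve the generalized problem (D - W)y = λDy rather than the plain eigenproblem used here. The function names and the example matrix are hypothetical.

```python
def fiedler_vector(L, iterations=500):
    """Approximate the eigenvector of the second-smallest eigenvalue of L.

    Power iteration on (c*I - L); the constant vector is projected out at
    every step so the iteration cannot settle on the trivial eigenvector.
    """
    n = len(L)
    # 2 * max degree bounds the largest eigenvalue (Gershgorin); +1 keeps c*I - L positive definite.
    c = 2.0 * max(L[i][i] for i in range(n)) + 1.0
    x = [float(i) for i in range(n)]  # deterministic, non-constant start
    for _ in range(iterations):
        mean = sum(x) / n
        x = [v - mean for v in x]  # remove the component along (1, ..., 1)
        y = [c * x[i] - sum(L[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(v * v for v in y) ** 0.5
        x = [v / norm for v in y]
    return x


def zero_point_split(vector):
    """Split node indices by the sign of their Fiedler-vector entry (S = 0)."""
    part_a = [i for i, v in enumerate(vector) if v >= 0.0]
    part_b = [i for i, v in enumerate(vector) if v < 0.0]
    return part_a, part_b


# Hypothetical Laplace matrix of a 4-section chain with a weak middle link (weight 0.1).
L = [
    [1.0, -1.0, 0.0, 0.0],
    [-1.0, 1.1, -0.1, 0.0],
    [0.0, -0.1, 1.1, -1.0],
    [0.0, 0.0, -1.0, 1.0],
]
part_a, part_b = zero_point_split(fiedler_vector(L))
```

For this example the split separates sections {0, 1} from sections {2, 3}, cutting the weakest link. Applying the same split recursively to each part is one way to obtain more than two subareas.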

Stability Calculation of MFD
By calculating the stability of MFD, the rationality of the subregion division can be verified. MFD stability depends on stability in the critical state. In the critical state, if the average traffic volume fluctuates less under the same road network density, the road network traffic operation is more controllable. This article follows the method (fuzzy c-means algorithm, FCM) in our research [33] to divide the test data sets into three categories: unblocked, critical, and congested. Two indicators, the regional traffic volume and regional density, are used to classify the three traffic states. Firstly, FCM is used to divide the data points of the spatial distribution of multidimensional data into specific classes. Each data point belongs to a class to some extent, and the membership degree indicates the degree to which each data point belongs to a given cluster. FCM divides n vectors into c fuzzy groups and calculates each group's clustering center so as to minimize the objective function of the nonsimilarity index. Then, the traffic state is divided into three stages: unblocked, critical, and congested. The dispersion of road network traffic operation, that is, the dispersion of the weighted average traffic volume of the road network in the critical state, represents the MFD's stability in the critical state. The lower the dispersion of road network traffic operation, the more stable the road network operation and the higher the MFD stability; the higher the dispersion, the more unstable the road network operation and the lower the MFD stability. The dispersion of road network traffic operation is defined in terms of the following quantities: $q^w_e$ is the average traffic volume in the test data, $q^w_c$ is the critical average traffic volume of the road network, $q^w_m$ is the maximum average traffic volume of the road network, and α is an undetermined parameter, 0 < α < 1.
The whole road network's dispersion degree is calculated as the weighted average of each subarea's dispersion degree.
The calculation results can be used as the judgment index of the subarea division to characterize the whole network's MFD stability. If the entire road network is divided into N subareas and the dispersion of road network traffic operation in subarea i is $s_i$, then the dispersion of the whole road network traffic operation is $S_N$, where $q^w_{ci}$ is the critical average traffic volume in subarea i.
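The FCM classification step used in this section to separate unblocked, critical, and congested states can be sketched as follows. This is a textbook fuzzy c-means with fuzzifier m = 2, not the implementation from [33]. The deterministic initialization, the function name, and the sample (density, volume) points, which are scaled to 0-1, are assumptions made for illustration.

```python
import math


def fuzzy_c_means(points, c=3, m=2.0, iterations=100):
    """Return (centers, memberships) for points given as (density, volume) tuples."""
    ordered = sorted(points)
    # Deterministic start: spread the initial centers over the density-sorted data.
    centers = [ordered[(2 * k + 1) * len(ordered) // (2 * c)] for k in range(c)]
    for _ in range(iterations):
        # Membership update: u_ik = 1 / sum_j (d_ik / d_ij)^(2 / (m - 1)).
        u = []
        for p in points:
            d = [max(math.dist(p, ctr), 1e-12) for ctr in centers]
            u.append([
                1.0 / sum((d[k] / d[j]) ** (2.0 / (m - 1.0)) for j in range(c))
                for k in range(c)
            ])
        # Center update: mean of the points weighted by u_ik^m.
        centers = [
            tuple(
                sum((u[i][k] ** m) * points[i][dim] for i in range(len(points)))
                / sum(u[i][k] ** m for i in range(len(points)))
                for dim in range(len(points[0]))
            )
            for k in range(c)
        ]
    return centers, u


# Hypothetical scaled (density, volume) samples: free flow, critical, congested.
points = [
    (0.10, 0.30), (0.12, 0.34), (0.11, 0.32),
    (0.40, 0.95), (0.42, 1.00), (0.41, 0.97),
    (0.80, 0.40), (0.82, 0.36), (0.81, 0.38),
]
centers, u = fuzzy_c_means(points)
labels = [max(range(3), key=lambda k: row[k]) for row in u]
```

Each row of `u` holds the membership degrees of one data point and sums to 1. Taking the largest membership gives a hard label, and the cluster with the highest volume center corresponds to the critical state on which the dispersion is evaluated.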

Traffic Speed Prediction Method Based on the Spatial-Temporal Correlation of Subareas
The evolution of traffic speed on a road section in a specific subarea is affected not only by the temporal evolution law of the traffic flow on that road section but also by the spatial influence of road sections in other subareas. This paper proposes a traffic speed prediction algorithm that considers the spatial-temporal correlation of subareas.

Spatial Correlation Analysis.
Firstly, the spatial correlation between each subarea is analyzed.
In spatial correlation analysis, it is necessary to measure the adjacency relationship of the neighboring subregions.
This requires a quantitative description of the adjacency relationship between adjacent regions so that spatial correlation statistics can be calculated.
In this paper, the spatial adjacency matrix is used to express the spatial relationship between subregions.
Suppose that there are m subregions in the study area, and the spatial weight matrix is $W^{sp} = [w^{sp}_{ij}]$, where $W^{sp}$ is the m × m spatial weight matrix and $w^{sp}_{ij}$ is the spatial weight between regional units i and j.

Journal of Advanced Transportation
In addition, to ensure that a subregion cannot be adjacent to itself, $w^{sp}_{ij} = 0$ is specified when j = i. Two subregions are adjacent when they share one or more nodes. The spatial weights are then standardized.
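As an illustration of the standardization step, the sketch below applies row standardization, in which each weight is divided by the sum of its row so that every subregion's weights add up to 1. Row standardization is a common choice for spatial weight matrices, but it is an assumption here; the paper's own formula may differ. The function name and the example adjacency are hypothetical.

```python
def row_standardize(w_sp):
    """Divide each row of a spatial weight matrix by its row sum."""
    standardized = []
    for row in w_sp:
        total = sum(row)
        # A subregion with no neighbors keeps an all-zero row.
        standardized.append([v / total if total else 0.0 for v in row])
    return standardized


# Hypothetical adjacency of three subregions: 0 touches 1 and 2; 1 and 2 do not touch.
w_sp = [
    [0, 1, 1],
    [1, 0, 0],
    [1, 0, 0],
]
w_std = row_standardize(w_sp)
```

The diagonal stays at zero, so a subregion is never treated as its own neighbor.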

Temporal Correlation Analysis.
The Pearson correlation coefficient formula is improved to measure the time correlation of two regions. If two subregions in the study area are spatially adjacent, their time correlation over a certain period is calculated from the following quantities: $q_i(t)$ and $q_j(t)$ are the traffic volumes of subregions i and j at time t; $u_i$ and $u_j$ are the mean traffic volumes of i and j; and $\sigma_i$ and $\sigma_j$ are the variances. For the road network area studied in this paper, the regional correlation is calculated as $\omega^{sp}_{ij} c_{ij}(ac)$, where $\omega^{sp}_{ij}$ is the spatial correlation of areas i and j and $c_{ij}(ac)$ is the correlation of the two areas in the study period.
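A minimal sketch of the regional correlation follows. It uses the plain Pearson coefficient as a stand-in for the temporal term, since the paper's improved variant is not reproduced here, and multiplies it by the spatial weight as described above. The function names and sample series are hypothetical.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def regional_correlation(w_sp_ij, q_i, q_j):
    """Spatial weight times temporal correlation of two subregions' volume series."""
    return w_sp_ij * pearson(q_i, q_j)


# Hypothetical 5-minute volume series for two adjacent subregions.
q_i = [120.0, 150.0, 180.0, 160.0, 130.0]
q_j = [100.0, 135.0, 170.0, 150.0, 115.0]
score = regional_correlation(0.5, q_i, q_j)
```

Non-adjacent subregions have a spatial weight of 0 and therefore a regional correlation of 0, regardless of how similar their volume series are.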
In this paper, the data of the K regions most relevant to the region containing the predicted target segment are selected to construct the input matrix of the prediction model. The K value is determined as follows: for each K (K = 0, 1, ..., 4), the data from the K most relevant regions are input into the GRU model, and the K value with the smallest prediction error is taken.

GRU-Based Traffic Speed Prediction Algorithm.
RNN (recurrent neural network) is a kind of deep neural network designed to process sequence data and plays an important role in the field of sequence mining. The GRU model is an improvement of the recurrent neural network and has been one of the popular deep learning techniques in recent years. Unlike the traditional recurrent neural network, the internal structure of the GRU's hidden layer nodes does not use a single activation function.
The specific calculation steps of GRU are as follows. Firstly, the current input $z_t$ and the previous output $h_{t-1}$ are fed into the update gate, which outputs a value between 0 and 1, where 0 represents completely discarding the information and 1 represents completely retaining it; the calculation is shown in formula (12). Secondly, $z_t$ and $h_{t-1}$ enter the reset gate, whose sigmoid layer outputs a value between 0 and 1. Meanwhile, the tanh layer creates a new candidate value vector $\tilde{h}_t$; the calculations are shown in equations (13) and (14). Thirdly, the update gate is used as the weight vector, and the candidate vector and the output vector at the previous moment are combined in a weighted average to obtain the output $h_t$ of the GRU cell; the calculation is shown in equation (15). Here, η represents the update gate vector; r represents the reset gate vector; p represents the bias vector; U represents the input weight; Q represents the recurrent weight; $z_t$ represents the input vector at time t; and $h_t$ represents the output vector at time t.
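The three steps above can be written out as a single GRU step in plain Python. The sketch follows the widely used GRU formulation with the symbols defined above (η, r, U, Q, p, z_t, h_t). Whether the update gate weights the candidate or the previous output differs between references, so the convention used in the last line is an assumption, as are the function names and the parameter layout.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def gru_step(z_t, h_prev, params):
    """One GRU step; params maps 'eta', 'r', 'h' to (U, Q, p) triples."""

    def affine(name, hidden):
        U, Q, p = params[name]
        return [a + b + c for a, b, c in zip(matvec(U, z_t), matvec(Q, hidden), p)]

    eta = [sigmoid(v) for v in affine("eta", h_prev)]  # update gate
    r = [sigmoid(v) for v in affine("r", h_prev)]  # reset gate
    gated = [r_k * h_k for r_k, h_k in zip(r, h_prev)]  # reset gate scales the old state
    h_cand = [math.tanh(v) for v in affine("h", gated)]  # candidate state
    # Weighted average of the previous output and the candidate (assumed convention).
    return [(1.0 - e) * hp + e * hc for e, hp, hc in zip(eta, h_prev, h_cand)]


# Hypothetical sizes: 1 input feature, 2 hidden units, all parameters zero.
zeros = ([[0.0], [0.0]], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0])
params = {"eta": zeros, "r": zeros, "h": zeros}
h_t = gru_step([1.0], [0.4, -0.2], params)
```

With all parameters at zero, both gates output 0.5 and the candidate is 0, so the new output is half of the previous one. This makes the weighted-average role of the update gate easy to check by hand.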
Regularization is generally defined as a modification of the learning algorithm whose goal is to reduce the generalization error rather than the training error. Common regularization methods include L1 and L2 parameter norm penalties, dropout, multitask learning, and early stopping. The L2 and L1 parameter norm penalty terms are defined over parameters $\theta_i$, where $\theta_i$ can be expressed as the reciprocal of the weight Q of each layer. This means that for a layer whose learned weights are too high, the degree of updating should be reduced; conversely, for a node in the layer whose learned weight is too low, the degree of updating should be increased, so that the weight values are spread evenly within the layer.
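For reference, the L2 and L1 parameter norm penalties in their common textbook form are given below. The coefficient $\lambda$ and the loss symbols $J$ and $\tilde{J}$ are introduced here for illustration only, and the paper's exact expressions (for example, a factor of 1/2 on the L2 term) may differ.

```latex
\Omega_{L2}(\theta) = \lambda \sum_{i} \theta_i^{2},
\qquad
\Omega_{L1}(\theta) = \lambda \sum_{i} \lvert \theta_i \rvert,
\qquad
\tilde{J}(\theta) = J(\theta) + \Omega(\theta)
```

The penalty is added to the training loss $J$, so larger values of $\lambda$ push the parameters $\theta_i$ more strongly toward zero.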
To sum up, the flow of the travel speed prediction algorithm proposed in this paper is shown in Figure 2. The input is a three-dimensional tensor composed of features, time steps, and samples. This 3D tensor is input to the GRU model with a dropout layer and a fully connected layer to obtain the predicted traffic speed. One column of the matrix in equation (11) corresponds to the input of one time step of the GRU model.
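One way to assemble such a 3D input from the speed series of the target section and its K most relevant regions is a sliding window, sketched below. The axis order (samples, time steps, features), the function name, and the toy series are assumptions; the window length of 12 corresponds to one hour of 5-minute intervals.

```python
def build_samples(series_by_region, target_index=0, window=12):
    """Build sliding-window samples from equal-length speed series.

    Returns X with shape (samples, window, features) as nested lists, and y with
    the target series' speed at the step right after each window.
    """
    n_steps = len(series_by_region[0])
    X, y = [], []
    for start in range(n_steps - window):
        # One row per time step; one feature per region or section.
        X.append([[series[t] for series in series_by_region]
                  for t in range(start, start + window)])
        y.append(series_by_region[target_index][start + window])
    return X, y


# Hypothetical series: the target section plus K = 1 related region, 15 intervals each.
target = [float(t) for t in range(15)]
related = [float(10 + t) for t in range(15)]
X, y = build_samples([target, related], target_index=0, window=12)
```

With K = 0 the feature axis has length 1 and the model reduces to a single-series predictor. Each additional region adds one feature per time step.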

Road Network Subarea Division and Stability Calculation of MFD.
This paper uses truck GPS data as the basis for algorithm validation. As shown in Figure 3, the experimental area selected in this paper is located in the southeast of Beijing. Beijing's expressways and main roads, including the 5th and 6th Ring Roads and the Jingtai, Jinghu, and Jingha highways, are selected to verify the algorithm's accuracy. The area is approximately 110 square kilometers, and the total length of roads in the road network is about 131 km. According to the analysis, the selected area has more accidents and more truck GPS data. The data used for validation cover May 1, 2018, to July 31, 2018. The methods of map matching, anomaly data processing, and traffic speed time series extraction for the truck GPS data follow [20]. The collected truck GPS data are organized into a time series of the traffic speed of each road section with a period of 5 minutes. Then, according to the chosen K value, the series are organized into the form of equation (11). L is set to 12; that is, the prediction is made using the previous hour's data.
Sample size is a critical concern when using probe vehicles to collect real-time traffic information, and it is necessary to determine the number of probe vehicles needed for traffic state estimation. In this paper, the required sample size for different combinations of confidence levels of the study area is determined with reference to the method in [34].
The average speed data of May and June are used as the training data set for GRU model training. The rest of the data serves as the test set for the algorithm. The study area's road network is abstracted into the road network diagram shown in Figure 4, which has 32 road sections and 21 nodes. The subarea division under each candidate scheme is shown in Figure 5. The dispersion of traffic operation in the subareas and in the whole road network is shown in Table 1. When the network is divided into 5 subareas, the dispersion of the whole road network is the smallest, 0.05673. At this point the network has already been divided into small regions, and dividing it further changes the dispersion of the whole road network only slightly. For speed prediction, however, the dimensionality of the data input to the model would increase, and with it the prediction difficulty. Therefore, five subareas are taken as the optimal division scheme. The MFD of each subarea is shown in Figure 6. The traffic state classification results based on the FCM algorithm for subarea 1 are shown in Figure 7. The clustering centers are shown in Table 2.

Traffic Speed Prediction.
Two indexes, MAPE and RMSE, are selected to evaluate the prediction accuracy of the model. They are calculated from the following quantities: $V_Y(t)$ is the predicted traffic speed at time t, $V(t)$ is the actual traffic speed at time t, and L is the total number of predicted cycles. This paper selects nodes 17 to 18 (Section 1) and nodes 9 to 10 (Section 2), which belong to different regions, as the experimental verification sections. The accuracy of the algorithm is verified in four different scenarios. The prediction results are compared with those of a GRU prediction algorithm based on a single time series of the road segment. This GRU model has the same parameter settings as the model presented in this paper. The first step is to determine the number of regions K that are input to the GRU model, so the relationship between the number of inputs and the prediction accuracy of the model is analyzed. Table 3 shows the prediction accuracy of Section 1 for different K values. Table 4 shows the prediction accuracy of Section 2 for different K values. From Table 3, K = 1 is selected for Section 1. From Table 4, K = 1 is selected for Section 2.
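The two indexes can be computed as in the sketch below, which uses the standard definitions of mean absolute percentage error and root mean square error over the L predicted cycles. The function names and sample speeds are hypothetical.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(p - a) / a for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the speeds."""
    return math.sqrt(sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical actual and predicted speeds for two cycles.
actual = [50.0, 100.0]
predicted = [55.0, 90.0]
mape_value = mape(actual, predicted)
rmse_value = rmse(actual, predicted)
```

MAPE is relative to the actual speed, so the same absolute error weighs more heavily during congestion, when speeds are low, whereas RMSE penalizes large absolute errors regardless of the speed level.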


Working Days
(1) Section 1. The predicted results of Section 1 on July 2 (working day) are shown in Figure 8. The errors are shown in Figure 9. On July 2, the average speed of the section is low during the morning and noon rush hours, indicating a congested state, whereas the section flows smoothly at night. The algorithm proposed in this paper achieves acceptable prediction results. The results of the error evaluation indicators MAPE and RMSE are shown in Table 5. MAPE was 2.30% and 3.05%, and the RMSE was 1.34 and 1.68, respectively.
(2) Section 2. The predicted results of Section 2 on July 2 (working day) are shown in Figure 10. The errors are shown in Figure 11. The algorithm proposed in this paper achieves acceptable prediction results. The results of the error evaluation indicators MAPE and RMSE are shown in Table 6. MAPE was 1.42% and 2.53%, and the RMSE was 1.05 and 1.79, respectively.

Weekend. (1) Section 1.
The predicted results of Section 1 on July 1 (weekend) are shown in Figure 12. The errors are shown in Figure 13. The algorithm proposed in this paper achieves acceptable prediction results. The results of the error evaluation indicators MAPE and RMSE are shown in Table 5. MAPE was 3.93% and 4.25%, and the RMSE was 2.17 and 2.31, respectively.
(2) Section 2. The predicted results of Section 2 on July 1 (weekend) are shown in Figure 14. The errors are shown in Figure 15. The algorithm proposed in this paper achieves acceptable prediction results. The results of the error evaluation indicators MAPE and RMSE are shown in Table 6. MAPE was 2.46% and 3.93%, and the RMSE was 1.89 and 2.18, respectively.

Rainy Day. (1) Section 1.
The predicted results of Section 1 on July 5 (rainy day) are shown in Figure 16. The errors are shown in Figure 17. The algorithm proposed in this paper achieves acceptable prediction results. The results of the error evaluation indicators MAPE and RMSE are shown in Table 5. MAPE was 2.86% and 3.23%, and the RMSE was 1.65 and 1.86, respectively.
(2) Section 2. The predicted results of Section 2 on July 5 (rainy day) are shown in Figure 18. The errors are shown in Figure 19. The results of the error evaluation indicators MAPE and RMSE are shown in Table 6. MAPE was 1.49% and 2.55%, and the RMSE was 1.03 and 1.82, respectively.

Accident.
The predicted results of Section 2 on July 11 are shown in Figure 20. The errors are shown in Figure 21.
There was a traffic accident in the early hours of the morning. The results of MAPE and RMSE are shown in Table 6. MAPE was 7.35% and 8.94%, and the RMSE was 5.05 and 5.83, respectively.
It can be seen from the above figures and tables that, compared with the traditional prediction method based on a single segment time series, the prediction accuracy of the proposed algorithm is improved. When no accident happened, MAPE improved by about 0.5%, and on the day of the accident, MAPE improved by about 1.5%. The RMSE also improved more on the accident day than on days without an accident.

Conclusions
The traffic speed in adjacent regions between highway and expressway has gradually become important information for highway managers and travelers. This paper proposes a road traffic speed prediction method that considers both microscopic traffic flow characteristics and macroscopic traffic status, based on the road section average speed and flow data extracted from GPS data.
Based on MFD, road network subareas are divided and evaluated. Firstly, the Ncut algorithm is used to divide the road network. Secondly, to ensure the stability of each divided subarea's MFD, a definition of the road network's dispersion is proposed. The traffic state is classified with FCM, and the best subregion division scheme is obtained after calculating the dispersion of the whole network. A spatial-temporal correlation coefficient is proposed to measure the correlation between subareas. Then, the traffic speed time series of the study subarea and the related areas are used to build a traffic speed matrix. This regional traffic speed matrix is input into the GRU model, and the output is the predicted traffic speed of the studied region.
This paper takes the adjacent region between the highway and expressway of Beijing as an example to verify the algorithm. The southeast corner of the Beijing road network is selected as the research area. The area consists of two ring expressways and three highways with a total area of approximately 110 square kilometers. Truck GPS data from this region are the basis of this study. The proposed algorithm's accuracy is verified under working day, weekend, rainy day, and accident scenarios. The results show that, compared with the traditional prediction method based on a single segment time series, the prediction accuracy of the proposed algorithm is improved. This will enhance the level of traffic information services in the adjacent region between the highway and urban expressway and ease traffic congestion.

Data Availability
The data used to support the findings of this study have not been made available because the authors have signed a confidentiality agreement with the data providers.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.