Prediction of Road Network Traffic State Using the NARX Neural Network



Introduction
Transportation is a key link in social development. According to the latest statistics from the Traffic Management Bureau of the Ministry of Public Security, the number of motor vehicles in China had reached 384 million by June 2021, and this figure is still growing. It is difficult to meet the increasing demand for comfort in traveling, and the negative effects of congestion on various routes within urban road networks are even more serious. In this context, accurate and reliable traffic information is very important for intelligent transportation systems (ITS), advanced traffic management systems (ATMS), and advanced traveler information systems (ATIS) [1]. In particular, it is important to use traffic information wisely and to determine accurately the future traffic situation of the regional road network, in order to support traffic managers and travelers in their management or travel decisions. The key to the problem is how to accurately predict traffic trips and guide travelers in choosing travel routes, thus spreading the pressure of traffic trips across the road network.
To tackle the traffic problems mentioned above, some scholars have proposed forecasting traffic events [2,3] and traffic demand [4][5][6] in the study of traffic state prediction. Others have proposed short-term prediction of various traffic state parameters and have designed various prediction models [7,8]. As a key component of traffic congestion mitigation, short-term prediction of traffic state combines traffic information with mathematical algorithms: the future short-term traffic state parameters are predicted by building the corresponding prediction models. Usually, the time interval of short-term prediction does not exceed 15 min [9,10]. However, due to the randomness and complexity of the traffic state, traffic flow shows stronger nonlinearity as the time interval is shortened [11], which adds difficulty to short-term traffic state prediction. Meanwhile, although short-term prediction of traffic flow parameters enables traffic managers to discover abnormal traffic operation states in time and take corresponding control measures, it can hardly satisfy the demand of long-term trip planning, and the information obtained by users is limited by the form of information output. In contrast, long-term traffic state prediction can provide the most direct traffic operation information for travelers and has greater application value.
Among studies on the state of regional road networks, many have focused the prediction target on a single section of varying length within the road network. Such studies do not consider the complex road conditions at the regional level and cannot demonstrate generalisability. Studies on the traffic state prediction of whole road networks are fewer and need to be improved and supplemented. Therefore, the challenges of current research in both traffic state prediction and highway network state studies are as follows: (i) With limited data collection devices and the impact of the COVID-19 pandemic, how can traffic parameters and the corresponding states of the entire road network be obtained? Despite the more advanced and diverse means of data collection on road networks at the present stage, some cities are relatively underdeveloped and lack sufficient data collection equipment and services (as in the area selected for this paper) to represent the effects of the data. The key in this situation is the effective mining and processing of the collected data that are available. (ii) Expanding along both temporal and spatial dimensions, how can the traffic state of a road network be represented and predicted by using cross-sectional flows when faced with a larger road network? From a spatial perspective, basic traffic flow parameters are required as quantitative criteria to characterize traffic, whether for a section, a road, or a network. However, the analytical methods utilized, the forms of representation, and their meanings are similar but not identical at the microscopic and macroscopic scales. From a temporal perspective, it is unclear what time scales define long-term and short-term traffic prediction, and what duration achieves optimum performance.
To address the above challenges, we propose a long-term prediction model for the state of the road network that combines the nonlinear autoregressive with exogenous inputs (NARX) neural network and the Macroscopic Fundamental Diagram (MFD). First, the traffic state is classified by the proposed formula of the traffic state efficiency index and the design speed of the classified highway, and the MFD of the current state is drawn. Then, to achieve the best prediction performance, the performance parameters of the NARX neural network (NARX NN), such as the number of nodes and the delay parameters, are adjusted. Finally, with the traffic state efficiency index and the predicted traffic output, the predicted MFD is drawn to express the traffic operation state. The model exploits the capability of the MFD to characterize traffic state relationships at the macroscopic scale and the capability of the NARX NN to handle traffic flows with time-series characteristics. To the best knowledge of the authors, this paper is one of the first attempts to combine a neural network forecasting model with the MFD and employ it for long-term traffic flow prediction on a regional road network. The main contributions of the paper are summarised as follows:
(1) The NARX-MFD method for predicting long-term traffic state parameters is proposed, and the NARX NN is verified to be capable of long-term traffic prediction.
(2) The formula of the traffic state efficiency index is proposed. Based on the analysis of the traffic flow parameter curves, the traffic state of the regional road network is classified into four categories by using the smooth speeds in the curves and the design speeds of the classified highway. This classification is used to evaluate the predicted state.
(3) With the predicted traffic flow parameters and traffic state efficiency indicators, the predicted traffic state curve is produced and provides the traveler with more intuitive feedback on the future state of the road network.
The rest of the paper is organised as follows. The second section reviews related work on prediction methods for traffic state parameters and on macroscopic evaluation of traffic state, and explains the reasons for choosing the NARX NN and the MFD. The third section addresses the theoretical system of traffic state evaluation. It introduces the formulas of the traffic state efficiency index and the evaluation grading of regional road networks derived from it. The fourth section focuses on data description and experimental design and describes the time-series prediction models involved in the experiments. Section 5 then presents the results of the comparison experiments of the prediction models and the outputs of the predicted traffic states and analyzes the reasons for them. Finally, in Section 6, we conclude the paper and give an outlook on future research.

Related Work
In this section, the related work is reviewed and summarised in terms of two aspects: prediction methods of traffic state parameters and macroscopic evaluation of traffic state.

Traffic State Parameters Prediction.
In the context of the ITS, numerous prediction methods for traffic state parameters have emerged. In the past research, prediction models for various types of traffic state parameters have been proposed in the relevant literature. According to the type of parameter, these include traffic flow, speed, density, occupancy, and the resulting travel time and travel time index. Based on the traffic flow parameter relationships proposed by Greenshields, when conducting the selection of indicators, some have argued that the selection of volumes is important because of the stability of volumes [12]. 2 Journal of Advanced Transportation Some have argued that speed prediction is more critical [1,13], while others have argued that prediction accuracy can be improved by the combined effect of multiple traffic metrics, such as simultaneous characterization of volume and density [14]. In recent years, with the continuous improvement of algorithms and the development of data collection equipment, travel time and travel time index have received increasing attention as predictors. Li et al. [15] proposed a model based on an ensemble empirical mode decomposition and random vector functional link network to predict the travel time of a highway network. Xu et al. [16] chose the travel time index when dealing with the prediction of regional road networks. In contrast, due to the limitations of the collection equipment, other indices are difficult to obtain. So for the current work, we use traffic data for consideration. However, when these data are available, a multi-index prediction approach is taken.
A comparative study by Smith et al. [17] found that studies of flow prediction models can be divided into parametric and nonparametric methods, with parametric methods being more time consuming and nonparametric methods being more effective when dealing with the random data.
Research on parametric methods started much earlier and is more established. Jensen [18] used linear regression methods for traffic forecasting in 1972, and Hogberg [19] used nonlinear regression methods to estimate the parameters in traffic forecasting models in 1973. Lee et al. [20] used the ARIMA model for short-term freeway traffic volume forecasting and showed that it can accurately predict traffic variations with time-series characteristics. The method contains fewer parameters but has better forecasting results. ARIMA is widely applied as a parametric model for time-series forecasting, which is obtained by introducing hidden variables through moving average (MA). Tatsuya [21] eliminated the seasonal factor based on ARIMA and applied the SARIMA model to forecast both short-term and long-term traffic flows, which showed that SARIMA is more suitable for long-term prediction. On the freeway, sudden changes in traffic conditions are often caused by the effects of weather, accidents, and surges in demand. Thus, Bezuglov [22] tested three grey system theory models and demonstrated that grey system models are better adapted to sudden changes in parameters. Cai et al. [23] then formulated a noise-immune Kalman filter model for short-term traffic flow prediction, which addressed the problem that the standard Kalman filter does not handle traffic flows contaminated by non-Gaussian noise well.
The research on nonparametric methods mainly includes neural network models, support vector regression (SVR), and K-nearest neighbor (KNN) algorithms. Among them, the first application of KNN in traffic flow prediction was in [24]. Meanwhile, Harrou et al. [25] monitored the traffic congestion problem with the KNN algorithm, which was very effective and built on the advantages of the Kalman filter (KF) and a piecewise switched linear traffic (PWSL) model. SVR models were applied and proven to be appropriate for traffic flow forecasting on urban roads, e.g., [26].
Castro-Neto et al. [27] demonstrated that the OL-SVR model deals well with atypical conditions when they occur and outperforms other prediction models. The Gauss-SVR model proposed by Li et al. [28] has better predictive performance than the neural network for urban traffic flow forecasting, but it should be noted that the structure of the BP neural network used in that paper is simpler.
Neural network models have generally been considered better suited to short-term traffic forecasting [29][30][31], and it has been argued that prediction accuracy gradually decreases as the prediction horizon increases, so that neural networks do not work well for long-term traffic forecasting. However, Çetiner et al. [32] applied an artificial neural network (ANN) to predict long-term (1 h) data and short-term (5 min) data for one day in a historical dataset for the city of Istanbul and found that better results were obtained with the long-term dataset. Hao et al. [33] demonstrated that Long Short-Term Memory (LSTM) neural networks can consistently and effectively achieve long-term traffic forecasting based on dynamic traffic flow probability graphs and alleviate the effect of missing data on the forecasting results. Therefore, neural networks can support long-term forecasting. It has also been demonstrated that the NARX NN is capable of predicting long time-series tasks [34].
While parametric and nonparametric methods each have advantages and disadvantages, both should be tested and compared with a baseline model to obtain the optimal model. However, combining models, algorithms, or heuristic approaches for forecasting appears to be the current trend. This is very common in complex traffic forecasting, e.g., [23,28]. But some problems remain. In [35], Vlahogianni compared the use of baseline and combined models and raised the question of when to combine predictions, pointing out that combined models may make it difficult to control errors in long-term prediction.
Most of the studies of traffic parameter prediction mentioned above were for single sites or individual road sections. In contrast, when analyzing macroscopic characteristics, traffic flow parameters have to be predicted for multiple sections on the road and for the entire road network in an integrated manner. In a previous study, Williams [36] used the ARIMA model to predict traffic flows at several test sections within the city of Bonn, Germany. Kamarianakis [37,38] applied the STARIMA model to traffic flow forecasting on major arterial road sections in the city of Athens, Greece, and described the topological distance relationship between the sections by using a weight matrix. A comparison with ARIMA and VARMA revealed similar prediction performance. In a recent study, Li et al. [39] proposed a short-term multistep traffic prediction model for macroscopic road networks, the Dynamic Graph Convolutional Network (DGCN), which is based on the Dynamic Graph Convolutional (DGC) model. Therefore, the above studies show that it is feasible to forecast the traffic state of the regional road network.

Macroscopic Evaluation of Traffic State.
In macroscopic traffic state evaluation, the range of study is often extended to multiple cross sections or the entire road network based on the fundamental diagram theory of traffic flow. To describe the traffic state of urban road networks, scholars have proposed various theories, such as dimensional analysis theory [40] and two-fluid theory [41][42][43]. The MFD was proposed by Godfrey in 1969 [44]. In 2008, Daganzo et al. [45,46] revealed the characterization of flows and trip completion rates by using GPS data of taxis in Yokohama, Japan, which has a high level of congestion. The existence of the MFD was formulated and verified, and a new paradigm was initiated. With the MFD, it is possible to significantly reduce the complexity of the traffic flow model, thus achieving a controlled design of the model for the urban network.
After that, analytical modeling and control studies based on the MFD have been continuously proposed. The main difference between these models lies in their control strategies; the models are summarised as follows.
(1) Three-parameter baseline model: according to Edie's [47] simple definition of road network traffic flow, the arithmetic mean or weighted mean of the section traffic flow parameters can be obtained, as shown in equation (1), where Q, K, and V are the averages of the volume (pcu/h), density (pcu/km), and speed (km/h) in the road network, while N is the number of sections or road segments detected.
(2) Perimeter control models [48]: By defining a generic mathematical model for multireservoir networks with well-defined MFDs for each reservoir, these models maximize the throughput of the system without the need to calculate workload and future demand data.
(3) Adaptive control models [49]: These models optimize the controller through real-time observation of changes to adapt the model to uncertainties and state delays under different parameters.
(4) Hierarchical perimeter control models [50]: This model is proposed to address the problem of traffic congestion with high traffic demand and heterogeneous distribution. A hierarchical control strategy is introduced, and each layer solves the corresponding problem according to different demands to achieve the stability of the traffic system.
(5) Robust control models [51,52]: This model mainly considers the parameter uncertainty of single or multiple regional networks and sets different algorithms for robust controllers according to various demands to ensure stable operation of the model.
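The arithmetic-mean case of the three-parameter baseline model in item (1) can be written out as below. This is only a sketch consistent with the definitions of Q, K, V, and N given above; the per-section symbols $q_i$, $k_i$, and $v_i$ are notation introduced here for illustration, and the weighted-mean variant is not shown.

```latex
Q = \frac{1}{N}\sum_{i=1}^{N} q_i, \qquad
K = \frac{1}{N}\sum_{i=1}^{N} k_i, \qquad
V = \frac{1}{N}\sum_{i=1}^{N} v_i
```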
In addition to the MFD, new progress has been made in the study of macroscopic traffic flow analysis. Thonfer et al. [53] modularized the urban traffic network and proposed a model that contains signal control and is adapted to large-scale urban road networks, which can improve operational efficiency through parallel computation. Xiao et al. [54] presented traffic speed cloud maps that can dynamically capture the process of congestion formation and degradation, but the scale of data resources required is larger than that of the MFD.
In summary, scholars have demonstrated in previous studies that traffic data is highly nonlinear, periodic, and time-varying, and that the NARX model supports the prediction of traffic data with time-series features when compared with the SARIMA model [55]. Although the NARX NN has long-term forecasting capability, whether it can fulfill the task of long-term traffic state parameter prediction still needs to be verified [56]. In this paper, we therefore conduct comparative experiments between the NARX model and other baseline models. Because the MFD is better than other methods of macroscopic state evaluation at presenting the traffic state characteristics in terms of data relationships, we chose the MFD.

Evaluation Model of Regional Road Network's Traffic State Based on the MFD
For our study, considering the large scope of road network evaluation, the traditional microscopic traffic models are not effective, so the MFD is used to reflect the traffic conditions of the road network.

Traffic State Classification and Index Calculation.
As multiple indicators can reflect traffic conditions, the selection of the appropriate indicators is a key step. Therefore, the selection of indicators in this paper follows these principles: (1) clear purpose, (2) comprehensive scope of the evaluation, and (3) being practical and operable. After reviewing the relevant literature [57,58], the currently used traffic state evaluation indicators were compiled; the parameters of flow, speed, and occupancy were the most commonly used (see Figure 1). Considering the difficulty of index acquisition, evaluation criteria, and model applicability, the traffic flow parameters of flow, density, and speed were selected in this paper to reflect the condition of the road network. The essence of road network condition evaluation is to follow the basic evaluation principles and discriminate the traffic state through scientific and reasonable methods. In traditional traffic evaluation systems, the traffic flow fundamental diagram model is based on the analysis of historical data. It is established with statistical methods, characterizes the relationship between the parameters of a single section of road, and has a limited scope of application. The relationships between spatially averaged flow variables of multiple sections or regional road networks are analyzed with MFDs, which provide an intuitive and systematic description of the overall structure of the road network.
To discern whether the road network is congested or not, it is assumed that the road network is homogeneous and all the cross-sectional traffic parameters are compatible with the flow relationship curve. According to the critical speed in the flow model proposed by Greenshields, it is divided into congested and noncongested areas.
Let $\partial Q/\partial V = 0$; then $V_m = V_f/2$, corresponding to the maximum flow $Q_m$, can be obtained, and thus the point $(V_f/2, Q_m)$ is the maximum of the flow-velocity curve. The extreme value $Q_m$ of the flow-velocity curve can be used as the dividing point between the uncongested and congested areas (see Figure 2).
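The critical speed quoted above follows from Greenshields' linear speed-density relation. The derivation below is a standard sketch of that step; the jam density $K_j$ is a symbol introduced here for illustration and does not appear in the text.

```latex
\begin{aligned}
V &= V_f\left(1 - \frac{K}{K_j}\right)
  \;\Rightarrow\; K = K_j\left(1 - \frac{V}{V_f}\right),\\
Q &= K V = K_j\left(V - \frac{V^2}{V_f}\right),\\
\frac{\partial Q}{\partial V} &= K_j\left(1 - \frac{2V}{V_f}\right) = 0
  \;\Rightarrow\; V_m = \frac{V_f}{2},\qquad Q_m = \frac{K_j V_f}{4}.
\end{aligned}
```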
In this paper, we introduce $M$, the traffic state efficiency index of the road network, which characterizes the state of vehicles per unit time in the road network. Its calculation is given by equation (2), where $L$ is the length of the road section in the road network. From equation (2), it can be seen that $M$ is a cubic function of $V$, and there must be a maximum $M_{max}$ at some speed $V$.
When $\partial M/\partial V = 0$, $V_A = 2V_f/3$ can be obtained, which corresponds to $M_{max}$ (see Figure 3).
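The two critical speeds can be checked numerically. The sketch below assumes the Greenshields flow $Q = K_j (V - V^2/V_f)$ and, since equation (2) is not reproduced in this text, further assumes that $M$ is proportional to $Q \cdot V$ (a cubic in $V$ whose maximum lies at $2V_f/3$, as the text states). The function names and the values of `V_F` and `K_J` are illustrative only.

```python
def greenshields_flow(v, v_f, k_j):
    """Flow Q as a function of speed under Greenshields' linear model."""
    return k_j * (v - v * v / v_f)


def efficiency_index(v, v_f, k_j):
    """ASSUMPTION: M taken as proportional to Q * V (constant factors dropped)."""
    return greenshields_flow(v, v_f, k_j) * v


def argmax_on_grid(func, v_f, k_j, steps=6000):
    """Return the speed in [0, v_f] that maximises func on a uniform grid."""
    best_v, best_val = 0.0, float("-inf")
    for i in range(steps + 1):
        v = v_f * i / steps
        val = func(v, v_f, k_j)
        if val > best_val:
            best_v, best_val = v, val
    return best_v


V_F, K_J = 80.0, 120.0  # illustrative free-flow speed (km/h) and jam density (pcu/km)
v_m = argmax_on_grid(greenshields_flow, V_F, K_J)  # expected near V_F / 2
v_a = argmax_on_grid(efficiency_index, V_F, K_J)   # expected near 2 * V_F / 3
```

Under these assumptions the grid search recovers the two boundaries used later for state classification, without relying on the exact constant factors in equation (2).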
In summary, the traffic state of the road network can be divided into four classes based on the traffic state curve. The intervals of each class are shown in Table 1.
(1) The traffic state evaluation model for road networks of the same level uses the average speed of all vehicles in the road network as the state evaluation index, as shown in equation (3),

Figure 2: Traffic-velocity model congestion area division.
Figure 3: Traffic state curve.
where $V_{jk}$ (km/h) is the speed state for all roads of grade $j$ at the $k$-th inspection time interval; $V_{ijk}$ (km/h) is the speed state for the $i$-th road monitoring station of the grade $j$ roads at the $k$-th inspection time interval; and $n$ is the number of grade-$j$ road monitoring stations in the road network.
(2) Equation for calculating road weights: The importance of a road section is related to its capacity and length. Variations in the operational state of road sections with higher capacities or smaller lengths will have a greater impact on the operational state of the road network. The importance weights of intersections and sections are modeled in equation (4), where traffic monitoring stations are treated as intersections; here $\omega_c(L_{ij})$ is the section importance weight, $C(L_{ij})$ is the section capacity, in veh/h, and $l(L_{ij})$ is the section length, in m.
Using the road section level and weights, a comprehensive weighting model for road sections is established using equation (5), where $\omega_k(L_{ij})$ is the comprehensive weight for the i-[…] (4) and (5), as shown in equation (6), where $F_k$ is the traffic state value of the road network in the $k$-th time interval, in km/h. In this study, we only include primary roads and secondary roads due to the limitations of road network grades and monitoring devices in the selected areas. The higher the grade of a road, the more transport functions it undertakes, and the larger the weight that needs to be assigned. In terms of the capacity of each road class, the weights of these two classes of roads were determined to be 0.6 and 0.4.
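To make the weighting idea concrete, the sketch below averages station speeds within each road grade, as described for equation (3), and then combines the grade means with the grade weights 0.6 and 0.4. The combination rule is an assumption made here for illustration: it is a simplified stand-in for equations (4)-(6), whose exact forms are not reproduced in this text, and it ignores the capacity and length based section weights. The function names and the example speeds are hypothetical.

```python
def grade_mean_speed(station_speeds):
    """Mean speed over the n monitoring stations of one road grade."""
    if not station_speeds:
        raise ValueError("need at least one station speed")
    return sum(station_speeds) / len(station_speeds)


def network_state_value(speeds_by_grade, grade_weights):
    """ASSUMED form: weight-normalised sum of the grade-level mean speeds."""
    total_weight = sum(grade_weights[g] for g in speeds_by_grade)
    weighted = sum(
        grade_weights[g] * grade_mean_speed(speeds)
        for g, speeds in speeds_by_grade.items()
    )
    return weighted / total_weight


# Grade weights from the text; assigning 0.6 to the higher (primary) grade
# follows the statement that higher grades receive larger weights.
GRADE_WEIGHTS = {"primary": 0.6, "secondary": 0.4}

# Hypothetical station speeds (km/h) for one time interval k.
example_speeds = {"primary": [62.0, 58.0], "secondary": [45.0, 55.0]}
f_k = network_state_value(example_speeds, GRADE_WEIGHTS)
```

With these made-up inputs the grade means are 60 and 50 km/h, so the weighted value is 56 km/h.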

Experiment
After analyzing and obtaining the above method, the experimental flowchart of this paper is summarised as follows (see Figure 4).

Data Description.
To test the operational effectiveness of the state prediction model over a large area, the location chosen for this study was Linzi District, Zibo City, Shandong Province, which has an area of 672.58 km². The area has long latitudinal boundaries and is crossed by many roads of various grades leading to Qingzhou City. The traffic flow is inevitably high, as vehicles must pass through the area to enter Qingzhou. Considering the large collection area, four monitoring stations were selected, Zi River, Bei Liu, Reed River, and Wu Tai, which are distributed on the main national roads (e.g., G308, G309, and G233) and provincial roads (e.g., S227, S228, and S102) in the region (see Figure 5(a)). Among them, there are more factories near the Reed River monitoring station, and large vehicles pass through the station, so the sectional traffic volume is larger than at the other stations. The locations are relatively dispersed to reflect the intrinsic links of the regional road network. The weights are then divided according to the grade of each road, as shown in Figure 5. To reflect the correlation and weight expression of adjacent links in the road network more intuitively, a spatial weight matrix W (0.6, 0.4) was established, as shown in Figure 6. In the figure, "0.6" and "0.4" indicate the corresponding road level weights between two nodes, and "0" indicates that the two nodes are not connected. The total length of the roads involved is 122.135 km. The raw data, covering 12/01/2019 to 02/28/2020, were provided by the Shandong Provincial Highway Traffic Investigation and Management Institute in Zibo, Shandong Province, China. All vehicles passing on these roads are recorded in the station's database with radar-based detectors and video collecting equipment at the monitoring stations. The traffic flow data were measured every 1 hour and the speed data every 5 min.
In addition, due to the epidemic, some roads were closed and detector communication was faulty during the collecting period, which resulted in some traffic data being missing (1.94%). To minimize the influence, the missing values were recovered during data processing with the average values measured at the same time on other days. Because monitoring accuracy may be degraded during rain and snow, the data was recalibrated using video equipment to ensure accuracy. After these processing and imputation steps, the traffic flow data and average speeds obtained from the four detection points were grouped by hour, yielding 8,640 hourly records each for flow and speed (see Figure 7), that is, 90 days × 24 h = 2,160 records per monitoring station across the four stations.
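The imputation rule described above (replace a missing hourly value with the mean of the values observed at the same hour on other days) can be sketched as follows. This is a minimal illustration rather than the authors' processing code; the function name and the use of `None` as the missing-value marker are assumptions.

```python
def impute_same_hour_mean(series, hours_per_day=24):
    """Fill missing hourly values (None) with the mean of the same hour on other days.

    `series` is a flat list of hourly records ordered in time, starting at hour 0
    of the first day. Values that cannot be imputed are left as None.
    """
    filled = list(series)
    for idx, value in enumerate(series):
        if value is not None:
            continue
        hour = idx % hours_per_day
        # Collect all observed values recorded at this hour of day.
        same_hour = [
            series[j]
            for j in range(hour, len(series), hours_per_day)
            if series[j] is not None
        ]
        if same_hour:
            filled[idx] = sum(same_hour) / len(same_hour)
    return filled


# Toy example with 2 "hours" per day over 3 days; the first slot of day 2 is missing.
toy = [10, 20, None, 40, 30, 60]
toy_filled = impute_same_hour_mean(toy, hours_per_day=2)
```

In the toy series the missing value is replaced by the mean of 10 and 30, the observations at the same slot on the other two days.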

Traffic Flow Prediction Modeling and Baseline.
As previously mentioned, one of the main objectives of this paper is to investigate the feasibility of the NARX NN in long-term […]
[Figure 4 flowchart labels: Step 1, state classification intervals of regional road network; Step 2 …]

NARX NN.
The NARX NN can be seen as a neural network version of the time-series model, which can consider external time-series inputs. Its predictions are based on historical data from the same series. As a standard NARX neural network architecture, the parallel architecture feeds the output back into the input of the feedforward neural network, where TDL is the tapped delay line. The other architectural mode is a series-parallel architecture that uses the real output rather than the output estimated by the feedback. It enables the input to the feedforward network to be more accurate and can be trained by using static backpropagation [59,60]. The simple structure is shown in Figure 8. As a typical nonlinear prediction model with input and output delays, the NARX model obtains the traffic flow $y(t+1)$ from the inputs and the output flow predictions before $t$. The model's expression is shown in equation (7), where $f$ is a nonlinear function and $d$ is the order of delay of the input and output.
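The difference between the series-parallel (open-loop) and parallel (closed-loop) architectures can be illustrated with a short sketch. It assumes the common NARX form in which $y(t+1)$ is a function $f$ of the last $d+1$ outputs $y(t), \ldots, y(t-d)$ and the last $d+1$ exogenous inputs $x(t), \ldots, x(t-d)$; equation (7) is not reproduced in this text, so this form, the helper names, and the toy function `toy_f` are assumptions used only for illustration. In practice $f$ would be the trained feedforward network.

```python
def narx_features(y, x, t, d):
    """Regressor for predicting y[t + 1]: the d + 1 latest outputs, then inputs."""
    return [y[t - k] for k in range(d + 1)] + [x[t - k] for k in range(d + 1)]


def open_loop_pairs(y, x, d):
    """Series-parallel mode: training pairs built from the measured outputs."""
    return [(narx_features(y, x, t, d), y[t + 1]) for t in range(d, len(y) - 1)]


def closed_loop_forecast(f, y_hist, x_all, d, steps):
    """Parallel mode: each prediction is fed back as an input for the next step.

    x_all must cover the history and the forecast horizon (known exogenous input).
    """
    y = list(y_hist)
    for _ in range(steps):
        t = len(y) - 1
        y.append(f(narx_features(y, x_all, t, d)))
    return y[len(y_hist):]


def toy_f(features):
    """Hypothetical stand-in for the trained network: y(t+1) = 0.5*y(t) + x(t), d = 1."""
    return 0.5 * features[0] + features[2]


forecast = closed_loop_forecast(toy_f, [0.0, 4.0], [1.0, 1.0, 1.0, 1.0], d=1, steps=2)
pairs = open_loop_pairs([0.0, 4.0, 3.0, 2.5], [1.0, 1.0, 1.0, 1.0], d=1)
```

The open-loop pairs use only measured values and suit static training, whereas the closed-loop forecast propagates its own predictions, which is what multi-step long-term prediction requires.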

LSTM NN.
As a special form derived from the RNN model, LSTM NN can better deal with data of longer sequences. Long-term memory is preserved through gating and cell state updates, and time-related information is retained through storage cells to solve the gradient disappearance and gradient explosion problems in RNN models.

GRU NN.
The GRU NN, like the LSTM NN mentioned above, is an excellent variant of the RNN [61]. However, unlike the LSTM NN, the GRU network simplifies the structure by merging the forget gate and the input gate into an update gate. The updated state of the cell is controlled through the update gate, and the reset gate determines how to combine the new input information with the previous memory, which makes convergence faster.
To compare the traffic flow prediction effect of the NARX NN, LSTM NN, and GRU NN models, the same traffic flow datasets are chosen. The root mean square error (RMSE), mean absolute percentage error (MAPE), and $R^2$ were selected to evaluate the difference between the real results $Y(t) = \{y_1, y_2, \ldots, y_n\}$ and the predicted results $\hat{Y}(t) = \{\hat{y}_1, \hat{y}_2, \ldots, \hat{y}_n\}$, as shown in equations (8)-(10), where $p$ is the number of samples, $y_i$ is the flow's measured value, and $\hat{y}_i$ is the corresponding predicted value.
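The three metrics can be computed as in the sketch below, which uses their standard textbook definitions; these are assumed to match equations (8)-(10), which are not reproduced in this text. MAPE is returned in percent and is undefined when a measured value is zero.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (measured values must be nonzero)."""
    return 100.0 * sum(abs((a - b) / a) for a, b in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative measured and predicted hourly flows (made-up numbers).
measured = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 300.0]
scores = (rmse(measured, predicted), mape(measured, predicted), r_squared(measured, predicted))
```

Lower RMSE and MAPE and a higher $R^2$ indicate a better fit, which is the sense in which the models are compared in the results.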

Experimental Settings.
The purpose of this experiment is to predict the traffic values of the following day from three months of data and then to obtain the predicted traffic status from the MFD and the traffic status evaluation intervals. Given the structure of the dataset described in Section 4.1 and the baseline models in Section 4.2, the dataset of each monitoring station is partitioned into a training set, a validation set, and a test set in a ratio of 7 : 1.5 : 1.5. It is worth noting that the same datasets were used for all methods so that comparisons between them are fair. Moreover, some of the parameters were kept at constant or default values to facilitate the control of variables. The open-loop configuration (see Figure 8(a)) is chosen for the training of the NARX NN, while the closed-loop configuration (see Figure 8(b)) is chosen for its prediction, with a learning rate of 0.005, which is intended to obtain better prediction results in long-term traffic flow prediction.
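A 7 : 1.5 : 1.5 partition of one station's hourly series can be sketched as follows. The text does not state whether the split is chronological or random; the sketch assumes a chronological split, which is the usual choice for time series, and the function name is hypothetical.

```python
def chronological_split(series, ratios=(0.70, 0.15, 0.15)):
    """Split a time-ordered series into train/validation/test without shuffling.

    ASSUMPTION: the split is chronological; the source gives only the ratio.
    """
    n = len(series)
    n_train = round(n * ratios[0])
    n_val = round(n * ratios[1])
    train = series[:n_train]
    val = series[n_train:n_train + n_val]
    test = series[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# One station: 90 days * 24 h = 2,160 hourly records (indices used as stand-ins).
station_series = list(range(2160))
train_set, val_set, test_set = chronological_split(station_series)
```

For 2,160 records this gives 1,512 training, 324 validation, and 324 test records.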
The NARX NN was trained with the Levenberg-Marquardt (LM) algorithm to obtain a faster iteration rate. The training process for both the LSTM NN and the GRU NN utilized the Adam optimizer, with Max-Epochs set to 250, a gradient threshold of 1, and an initial learning rate of 0.01. After the 125th training round, the learning rate is multiplied by 0.2. As both the input and the output are single data sequences, each is set to 1 dimension.
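The learning-rate schedule described for the LSTM NN and GRU NN can be expressed as a small function. This is a sketch of a piecewise schedule that matches the stated values (0.01 initially, multiplied by 0.2 after epoch 125, 250 epochs in total); the function name and the 1-based epoch convention are assumptions.

```python
def piecewise_learning_rate(epoch, initial_lr=0.01, drop_period=125, drop_factor=0.2):
    """Learning rate for a 1-based epoch index, dropped after every `drop_period` epochs."""
    if epoch < 1:
        raise ValueError("epoch is 1-based")
    return initial_lr * drop_factor ** ((epoch - 1) // drop_period)


# Learning rate for each of the 250 training epochs.
schedule = [piecewise_learning_rate(e) for e in range(1, 251)]
```

Epochs 1 to 125 use 0.01 and epochs 126 to 250 use 0.002, so the drop occurs exactly once within the stated training length.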
To compare the different prediction methods on a uniform platform, all experiments were run on a 6-core AMD 3600X 3.80 GHz CPU, 16 GB of RAM, and a GEFORCE RTX 2060S 8 GB GPU, with MATLAB 2021a as the experimental platform.

Results
Following the procedure above, the steps of Figure 4 and the results obtained are as follows.
Step 1. In this step, when classifying the regional states, the traffic state classification intervals in Table 1 and the design speeds of the corresponding roads are used. We can obtain the smooth flow speed $V_f$ and, consequently, $2V_f/3$, $V_f/2$, $V_f/3$, and the boundary value of each state. In this study, we used the data for the three months from December 2019 to February 2020 as the basis to obtain the actual state classification intervals of the regional road network by the above method (see Table 2).
According to the relationship between the traffic flow parameters within the regional road network and the speed classification interval in the table above, we can obtain the MFD of the road network (see Figure 9). At this time, the traffic flow of the overall network is classified into four different states according to the average speed values, with different colors representing different traffic flow states. As there are a large number of primary roads in the network and the monitoring stations are mainly located at the intersection of the two grades of roads, the classification intervals will be based on the primary roads.
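The speed-based classification in Step 1 can be sketched as a simple threshold function. The boundaries $2V_f/3$, $V_f/2$, and $V_f/3$ come from the text; the numbering of the classes (1 for the fastest state, 4 for the slowest), the treatment of values exactly on a boundary, and the function name are assumptions, since Table 1 is not reproduced here.

```python
def classify_traffic_state(avg_speed, v_f):
    """Return a state class from 1 (fastest) to 4 (slowest) for an average speed.

    ASSUMPTION: classes are the four intervals separated by 2*v_f/3, v_f/2 and
    v_f/3, with each boundary value assigned to the faster class.
    """
    if avg_speed >= 2.0 * v_f / 3.0:
        return 1
    if avg_speed >= v_f / 2.0:
        return 2
    if avg_speed >= v_f / 3.0:
        return 3
    return 4


# Illustrative smooth flow speed of 60 km/h gives boundaries at 40, 30 and 20 km/h.
states = [classify_traffic_state(v, 60.0) for v in (50.0, 35.0, 25.0, 10.0)]
```

Applying the same function to predicted network speeds gives the predicted state for each time interval, which is how the predicted MFD can be coloured by state.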
Step 2. After Step 1, we first evaluated and compared the performance of the above prediction models on the datasets from the four monitoring stations, following the experimental design in Section 4.3. By varying the number of hidden layers or units, we identified the parameter settings that gave the highest prediction accuracy for each model. The specific parameter data are shown in Tables 3-5. The optimal parameter selection for the different prediction models at each monitoring station is collated in Figure 10.
The indicators in Tables 3-5 and Figure 10 show that, because the flow data follow a similar trend at each monitoring station, the selected parameters and evaluation indices are very similar across locations for each method. Comparing the performance evaluation metrics of each model with its best parameters, the $R^2$ values of the NARX NN were larger than those of both the LSTM NN and the GRU NN, and its MAPE values were smaller than those of both models, whereas its RMSE values were smaller only at some monitoring stations.
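For reference, the three evaluation indices compared above can be computed as in the following Python sketch. It uses the standard definitions of RMSE, MAPE, and $R^2$. The function names and the sample values are hypothetical and are not taken from the paper's data.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Hypothetical flows (vehicles per interval), for illustration only.
observed = [100.0, 200.0, 300.0]
forecast = [110.0, 190.0, 300.0]
scores = (rmse(observed, forecast), mape(observed, forecast), r_squared(observed, forecast))
```

A better model has a larger $R^2$ and smaller RMSE and MAPE, which is the sense in which the models are ranked in the comparison.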
Furthermore, it is surprising that continuously increasing the number of nodes in the hidden layer of the LSTM NN yields no gain in performance, but rather a decrease. The reason is that the method is already close to the optimal number of nodes for this dataset, and adding nodes complicates the network, making it prone to overfitting and prolonging the training time. Due to its simpler structure, the GRU NN converges faster than the LSTM model during training but has lower prediction accuracy than the LSTM. Therefore, the traffic flow of the four observation stations was predicted using the optimal number of hidden layers or units for each of the above-mentioned models. The results obtained are shown in Figure 11.
The prediction curves show that all three models fit well at all four monitoring stations and that the NARX NN converges faster and with relatively smaller errors than the other two models. However, in some time intervals, the NARX model oscillates less than the other two models. This indicates that while the model adapts well to the time-varying characteristics of the traffic flow data, it is not able to cope with sudden variations in the flow at some sites.
Step 3. After the first two steps, the flow for the entire network is calculated with equations (4) and (5), the classified road weights, and the predicted flows obtained for the four monitoring stations. After processing the traffic data for the next day, a comparison of the three neural network models shows that the NARX model adapts well to nonlinear traffic data and responds in real time, while the LSTM has better lagging performance (see Figure 12). The original values of traffic flow matched well with the predicted values. The predicted results are then evaluated. As can be seen from Table 6, the three evaluation indicators of the NARX model are slightly better than those of the other two models for predicting traffic flow for the next day. Although the results are similar, the NARX model converges more quickly.
Finally, according to the traffic state classification interval, the traffic state of the road network for the next day is obtained from the current MFD, the predicted traffic flow, and the corresponding speed (see Figure 13).
The analysis in the figure shows that the free flow is distributed during the low peak hours of the day. The blocked flows are mainly distributed during the morning and evening peak hours, and the synchronic flows are distributed after them with the temporal progression, as the main state after the peak hours. Harmonic flows occur less frequently and occupy fewer hours. The main reason for the above phenomenon is that most of the selected monitoring stations are located in the surroundings of the central city. Residents have to pass through these points

Conclusions and Prospects
In this paper, we developed a state prediction model for regional road networks (NARX-MFD) and proposed a traffic state efficiency index formula. The traffic state of the regional road network is classified into four categories by analyzing the free flow speed and the design speed of the classified roads in the traffic flow parameter curves. This classification is used to evaluate the predicted state. Then, from the traffic state parameters measured at the four monitoring stations, the MFD of the road network in Linzi District of Zibo City is obtained for the selected period.
Afterward, a comparison experiment of LSTM, GRU, and NARX on the same dataset showed that NARX had slightly better prediction performance than LSTM and GRU, converged fastest, and coped well with long-time traffic data. However, it did not cope well with sudden changes in traffic, and its oscillation amplitude was not large. The main reason is that only the test set contains such abrupt-change data, so a model trained on a training set with regular variation copes well only with regular data. Finally, a prediction diagram of the next day's traffic state is obtained from the MFD, the predicted flows, and the corresponding speeds, based on the state classification intervals of the regional road network. The model thus realizes data quality control, prediction, and visualization of the operation state of the road network. In summary, the road network operation state prediction model can provide a reliable basis for traffic managers' decision-making and effective real-time traffic information for travelers, thus reducing travel time and improving travel efficiency.
However, shortcomings remain in both the macroscopic traffic state evaluation and the prediction models in this paper, and our future work will address these two aspects. In the macroscopic traffic evaluation, the number of road class categories affects the calculation of weights, the calculation of capacity, and even the classification of the road network state. Only three road class categories were selected in the network, and these could be further refined. In the prediction model, firstly, because data were difficult to obtain and the amount of data is not large enough, further research is needed to determine whether the NARX model can maintain its current prediction performance on the large datasets encountered in practical applications. Secondly, combining the NARX model with other models to improve the prediction accuracy of traffic flow is part of future research.

Data Availability
The data used to support the results of this study were obtained from the Shandong Provincial Highway Traffic Investigation and Management Institute in Zibo, Shandong Province, China, and are available from the corresponding author upon request.