Real-Time Corrected Traffic Correlation Model for Traffic Flow Forecasting

This paper focuses on short-term traffic flow forecasting. The main goal is to put forward a traffic correlation model and a real-time correction algorithm for traffic flow forecasting. The traffic correlation model is established based on the temporal-spatial-historical correlation characteristic of traffic big data. To simplify the traffic correlation model, this paper presents a correction coefficients optimization algorithm. Considering the multistate characteristic of traffic big data, a dynamic part is added to the traffic correlation model. A real-time correction algorithm based on a Fuzzy Neural Network is presented to overcome the nonlinear mapping problem. A case study based on a real-world road network in Beijing, China, is implemented to test the efficiency and applicability of the proposed modeling methods.


Introduction
It is of practical significance to predict traffic flow quickly, precisely, and in a timely manner. Short-term traffic flow forecasting provides an important basis for traffic guidance and control. Existing studies of short-term traffic flow forecasting in the transportation literature can be classified into six categories: (a) linear system theory based models, such as Autoregressive Integrated Moving Average (ARIMA) [1] and the Kalman Filtering model [2]; (b) data mining based models, such as Neural Network [3], Nonparametric Regression [4], and Support Vector Machine [5]; (c) nonlinear system theory based models, such as Wavelet Analysis [6], Catastrophe Theory [7], and Chaos Theory [8]; (d) simulation based models [9]; (e) combination models [10]; (f) other models.
The era of big data brings both opportunities and challenges to short-term traffic flow forecasting. During data processing, traffic big data meets the same difficulties as general big data, such as capture, storage, search, sharing, analytics, and visualization. Therefore, a short-term traffic flow forecasting method needs the capacity to deal with traffic big data. Traffic big data holds several characteristics, such as temporal correlation, spatial correlation, historical correlation, and multistate. Considering the advantages of traffic big data, data-driven mathematical models can be set up, and the physical meaning of these models can be described clearly. In addition, a real-time correction algorithm can be put forward to improve the accuracy of traffic flow forecasting.
However, taking into account the existing research in this field, there is still a lack of consideration of traffic big data and real-time correction for traffic flow forecasting. Further research remains to be conducted on traffic big data analysis. In this paper, a method of short-term traffic flow forecasting is proposed in detail. The remainder of this paper is organized as follows. Section 2 presents the basic mathematical model. In Section 3, the real-time corrected traffic correlation model is established. A case study based on a real-world road network is carried out in Section 4 to demonstrate the performance and applicability of the proposed method. Finally, conclusions are drawn in Section 5.

Big Data Driven Based Traffic Correlation Model.
Traffic big data has a strong temporal-spatial-historical correlation, as follows.
(i) In the temporal series, the current traffic flow can be regarded as a continuation of the traffic flow at the last moment. Dynamic traffic flow data change continuously over time with a certain trend.
(ii) In the spatial series, the traffic flow of downstream sections can be seen as a continuation of the upstream traffic flow. There exists a spatial association between the traffic flow data of neighboring junctions or sections and those of the target junctions or sections.
(iii) In the historical series, the traffic demand characteristics determine that the traffic flow characteristics in the same period of the same day are similar. The periodicity of traffic flow is especially evident.
Therefore, the basic form of the traffic correlation model [11] is expressed as
$$x_i(t) = \lambda_T T_i(t) + \lambda_S S_i(t) + \lambda_H H_i(t), \tag{1}$$
where $x_i(t)$ is the traffic flow parameter of section $i$ at time $t$, representing flow, speed $V_i(t)$, or occupancy. $T_i(t)$, $S_i(t)$, and $H_i(t)$ are the estimated values of $x_i(t)$. $\lambda_T$, $\lambda_S$, and $\lambda_H$ are the coefficients of these three variables.
$T_i(t)$ is calculated by temporal correlation analysis, generally based on the Regression Analysis Model [12]. $S_i(t)$ is calculated by spatial correlation analysis, generally based on the Neighbor Regression Model [13]. $H_i(t)$ is calculated by historical correlation analysis, generally based on the Discrete Fourier Transform Model [14].
Thus, the simplified equation of the traffic correlation model is obtained:
$$x_i(t) = \sum_{j=1}^{n} a_j x_j(t - \tau), \tag{2}$$
where $a_j$ is the regression coefficient of $x_j(t - \tau)$ and $n$ is the number of $a_j$.
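As a rough illustration of the basic model, the sketch below combines a temporal, a spatial, and a historical estimate into one forecast and fits the three combination coefficients by ordinary least squares. The paper does not state how these coefficients are estimated, so the least-squares fit, the function names, and the sample series are all assumptions made for illustration only.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small dense system a @ x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    # Augmented matrix so that row operations act on both sides at once.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: component estimates are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_combination_weights(
    components: Sequence[Sequence[float]], observed: Sequence[float]
) -> List[float]:
    """Least-squares weights w such that observed is close to sum_k w[k] * components[k]."""
    k = len(components)
    # Normal equations: Gram matrix of the component series against the observations.
    gram = [
        [sum(p * q for p, q in zip(components[i], components[j])) for j in range(k)]
        for i in range(k)
    ]
    rhs = [sum(p * y for p, y in zip(components[i], observed)) for i in range(k)]
    return solve_linear_system(gram, rhs)


def combine(components: Sequence[Sequence[float]], weights: Sequence[float], t: int) -> float:
    """Weighted sum of the component estimates at time index t."""
    return sum(w * series[t] for w, series in zip(weights, components))


# Hypothetical estimates of one traffic parameter over five intervals.
temporal = [10.0, 12.0, 15.0, 11.0, 9.0]
spatial = [8.0, 13.0, 14.0, 12.0, 10.0]
historical = [11.0, 11.0, 16.0, 10.0, 8.0]
observed = [0.5 * a + 0.3 * b + 0.2 * c for a, b, c in zip(temporal, spatial, historical)]
weights = fit_combination_weights([temporal, spatial, historical], observed)
```

Because the synthetic observations are built as an exact weighted sum, the fit recovers the weights used to generate them; on real detector data the residual would be nonzero.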

Correction Coefficients Optimization Algorithm.
Both the speed and the accuracy of data processing are important for a big data driven method. To improve the speed of the traffic correlation model, the number of unknown variables in formulation (2) must be reduced. However, reducing the variables may decrease the accuracy of the model.
Therefore, this paper defines a threshold of computing speed and derives the maximum acceptable number of variables. Thus, the number of unknown variables $m$ can be obtained. The correlation coefficients $r_{ij}$ between the studied section $i$ and the related section $j$ are used to choose the $m$ variables. In addition, for each $x_j(\cdot)$, a unique value of $x_j(t - \tau_j)$ is determined, so many variables are removed. When the maximum of the correlation coefficient $r_{ij}$ corresponding to each section $j$ is calculated, the corresponding time delay $\tau_j$ and the unique value of $x_j(t - \tau_j)$ are obtained. If the value of $r_{ij}$ is large enough, the variable $x_j(t - \tau_j)$ is preserved; otherwise, it is removed. After several data tests, it is found that the values of $r_{ij}$ are relatively high and not very different when $m$ is less than 4, whereas a rapid decay of $r_{ij}$ to relatively low values is observed when $m$ is more than 8. Therefore, $m \in [4, 8]$.
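The selection procedure can be sketched as follows, under the assumption that the correlation coefficient is the ordinary Pearson coefficient between the target series and a delayed copy of each candidate series. The function names, the `max_lag` search bound, and the toy series are hypothetical and are not taken from the paper.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long series (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0.0 or spread_y == 0.0:
        return 0.0
    return cov / (spread_x * spread_y)


def best_lag(target: Sequence[float], candidate: Sequence[float], max_lag: int) -> Tuple[int, float]:
    """Delay in 1..max_lag that maximizes corr(target[t], candidate[t - delay])."""
    best_delay, best_r = 0, float("-inf")
    for lag in range(1, max_lag + 1):
        # Align target[t] with candidate[t - lag] by trimming both ends.
        r = pearson(target[lag:], candidate[:-lag])
        if r > best_r:
            best_delay, best_r = lag, r
    return best_delay, best_r


def select_variables(
    target: Sequence[float],
    candidates: Dict[str, Sequence[float]],
    max_lag: int,
    m: int,
) -> List[Tuple[str, int, float]]:
    """Keep the m candidate sections with the largest maximum lagged correlation."""
    scored = [(name, *best_lag(target, series, max_lag)) for name, series in candidates.items()]
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:m]


# Toy data: the target repeats the upstream series two intervals later.
upstream = [3.0, 7.0, 1.0, 9.0, 4.0, 8.0, 2.0, 6.0, 5.0, 10.0, 0.0, 11.0]
side_road = [5.0, 6.0, 5.0, 6.0, 5.0, 6.0, 5.0, 6.0, 5.0, 6.0, 5.0, 6.0]
target = [0.0, 0.0] + upstream[:-2]
chosen = select_variables(target, {"upstream": upstream, "side_road": side_road}, max_lag=3, m=1)
```

Each retained entry carries the section name, its single best delay, and the corresponding correlation, which mirrors the idea of keeping one delayed variable per related section.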
Variable reduction makes $a_j$ meaningless. So, a new variable $r'_{ij}$, which is the normalized $r_{ij}$, is selected to replace the variable $a_j$. Since this replacement introduces some errors, which are likely to be systematic, a linear correction algorithm is presented. Two correction variables $\alpha$ and $\beta$ are introduced to calibrate the error. The simplified traffic correlation model is
$$x_i(t) = \alpha \sum_{j=1}^{m} r'_{ij} x_j(t - \tau_j) + \beta, \tag{6}$$
where $\alpha$ and $\beta$ are calibrated against $V(t)$, the actual value of $x_i(t)$.
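A minimal sketch of this simplification step is given below. It assumes that "normalized" means dividing each retained correlation coefficient by their sum, and that the two correction variables (named `a` and `b` here) are fitted by simple linear regression of the actual values on the weighted input. Both choices are assumptions, as are all names and numbers.

```python
from typing import List, Sequence, Tuple


def normalize(correlations: Sequence[float]) -> List[float]:
    """Scale the retained correlation coefficients so that they sum to one."""
    total = sum(correlations)
    if total == 0.0:
        raise ValueError("correlation coefficients sum to zero")
    return [r / total for r in correlations]


def weighted_input(weights: Sequence[float], lagged_values: Sequence[float]) -> float:
    """Weighted sum of the delayed observations of the retained sections."""
    return sum(w * x for w, x in zip(weights, lagged_values))


def fit_linear_correction(z: Sequence[float], actual: Sequence[float]) -> Tuple[float, float]:
    """Closed-form least-squares fit of actual ~ a * z + b."""
    n = len(z)
    mean_z, mean_y = sum(z) / n, sum(actual) / n
    sxx = sum((v - mean_z) ** 2 for v in z)
    if sxx == 0.0:
        raise ValueError("weighted input is constant; slope is undefined")
    sxy = sum((v - mean_z) * (y - mean_y) for v, y in zip(z, actual))
    a = sxy / sxx
    b = mean_y - a * mean_z
    return a, b


# Hypothetical maximum correlations of three retained sections.
weights = normalize([0.9, 0.6, 0.5])
# Hypothetical weighted inputs and matching actual values over four intervals.
a, b = fit_linear_correction([1.0, 2.0, 3.0, 4.0], [2.5, 4.5, 6.5, 8.5])
```

With only two free parameters left, the model can be recalibrated quickly, which is the stated purpose of replacing the regression coefficients.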

Real-Time Correction Problem Statement.
The basic mathematical model can be used for traffic flow forecasting. The error of traffic flow forecasting is written as
$$\Delta(t) = V(t) - x_i(t). \tag{7}$$
Therefore, $V(t)$ is obtained:
$$V(t) = \alpha \sum_{j=1}^{m} r'_{ij} x_j(t - \tau_j) + \beta + \Delta(t). \tag{8}$$
The variation range of $\Delta(t)$ reflects the accuracy of traffic flow forecasting. To improve the accuracy of traffic flow forecasting, $\Delta(t)$ is used as a compensation variable for $x_i(t)$. Then, $V(t)$ is replaced by $x_i(t)$ in formulation (8):
$$x_i(t) = \alpha \sum_{j=1}^{m} r'_{ij} x_j(t - \tau_j) + \beta + \Delta(t). \tag{9}$$
For different traffic states, the propagation of traffic congestion is different. So, the temporal-spatial-historical correlation variables are dynamic. As shown in Section 2, the basic mathematical model for traffic flow forecasting is put forward based on the temporal-spatial-historical correlation characteristic of traffic big data. However, the multistate characteristic is largely ignored; the temporal-spatial-historical correlation variables are treated as static variables. Although a linear fitting method is used to improve accuracy, the error that makes up the main part of $\Delta(t)$ still exists.
An analysis of the effectiveness of the traffic correlation model is shown in Figure 1. The error in the congested stage is larger than that in the free-flowing stage. The error at the last moment may affect the error at the current moment.
$\Delta(t)$ changes with the traffic flow state and over the temporal series. It is difficult for a mathematical model to describe the characteristics of $\Delta(t)$. This paper therefore presents a real-time correction algorithm based on nonlinear mapping.

Real-Time Correction Algorithm.
The main goal of the real-time correction algorithm is to calculate the error term $\Delta(t)$ in formulation (9). Because this is a nonlinear mapping problem, a Fuzzy Neural Network (FNN) can be used to overcome it. The structure of the Fuzzy Neural Network is shown in Figure 2.
Every output unit in the Fuzzy Neural Network corresponds to a certain fuzzy subset of the output variable, which is $\Delta(t)$. To get the membership degree models of LOS and $\Delta$, SAGA-FCM is used to obtain the clustering centers. LOS is divided into six types [15], and each sample data point, a two-component vector whose first component is the speed $V(t)$, has a membership degree for each LOS type. The set of clustering centers for LOS contains six elements, each of which is a two-component vector of the same form as the sample data.

Figure 2: Structure of the Fuzzy Neural Network (input layer with fuzzy subsets LOS(t − 1) and Δ(t − 1); hidden layer).

Likewise, each sample data point $\Delta(t)$ has a membership degree for each type of $\Delta$, and the set of clustering centers for $\Delta$ contains five elements. Historical data is applied to train the Fuzzy Neural Network. The input signal corresponds to the target output. After training, the network can be seen as a container of fuzzy relations; to draw other conclusions from the network, the only thing that needs to be done is to input the real values and obtain the result after defuzzification.
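The fuzzification and defuzzification steps can be illustrated with the standard fuzzy c-means membership rule and a membership-weighted average of the cluster centers. This is only a sketch: the paper obtains its centers with SAGA-FCM and does not spell out these formulas here, so the membership rule, the fuzzifier value of 2, the defuzzification rule, and the example centers are all assumptions.

```python
from math import dist
from typing import List, Sequence


def fcm_membership(
    sample: Sequence[float],
    centers: Sequence[Sequence[float]],
    fuzzifier: float = 2.0,
) -> List[float]:
    """Standard fuzzy c-means membership of one sample in each cluster."""
    distances = [dist(sample, center) for center in centers]
    # A sample lying exactly on a center belongs fully to that center.
    if any(d == 0.0 for d in distances):
        hits = [1.0 if d == 0.0 else 0.0 for d in distances]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (fuzzifier - 1.0)
    return [1.0 / sum((dk / dj) ** power for dj in distances) for dk in distances]


def defuzzify(activations: Sequence[float], centers: Sequence[float]) -> float:
    """Membership-weighted average of one-dimensional cluster centers."""
    total = sum(activations)
    if total == 0.0:
        raise ValueError("all activations are zero")
    return sum(a * c for a, c in zip(activations, centers)) / total


# Hypothetical two-component LOS centers and a sample halfway between them.
los_centers = [(60.0, 0.1), (30.0, 0.4)]
memberships = fcm_membership((45.0, 0.25), los_centers)
# Hypothetical error-cluster centers and network outputs.
delta_estimate = defuzzify([0.2, 0.8], [-10.0, 10.0])
```

In practice the two components of a sample would normally be rescaled to comparable ranges before distances are computed; otherwise the component with the larger numeric range dominates the memberships.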

Real-Time Corrected Traffic Correlation Model.
The real-time corrected traffic correlation model, which is seen as the improved traffic correlation model, is composed of a static part and a dynamic part. The static part is $\alpha \sum_{j=1}^{m} r'_{ij} x_j(t - \tau_j) + \beta$, which shows the physical meaning of traffic correlation. The dynamic part is $\Delta(t)$, which shows the physical meaning of the traffic multistate characteristic. The traffic flow forecasting framework is outlined in Figure 3.
The steps of traffic flow forecasting are as follows.
Step 1 (traffic correlation model). Based on historical data, temporal data, and spatial data, the basic traffic correlation model, as shown in formulation (1), is built.
Step 4 (real-time corrected traffic correlation model). The real-time corrected traffic correlation model, as shown in formulation (8), is stored in the database as system knowledge.
Step 5 (short-term traffic flow forecasting). Based on system knowledge, real-time data is processed to calculate the forecasting results.
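The forecasting step can be pictured as adding a state-dependent error estimate to the static linear model. In the sketch below the trained correction network is replaced by a placeholder callable, and every name, parameter, and number is hypothetical; it is meant only to show how the static and dynamic parts fit together.

```python
from typing import Callable, Sequence


def static_forecast(
    weights: Sequence[float], lagged_values: Sequence[float], a: float, b: float
) -> float:
    """Static part: linearly corrected weighted sum of delayed observations."""
    return a * sum(w * x for w, x in zip(weights, lagged_values)) + b


def corrected_forecast(
    weights: Sequence[float],
    lagged_values: Sequence[float],
    a: float,
    b: float,
    previous_state: Sequence[float],
    previous_error: float,
    corrector: Callable[[Sequence[float], float], float],
) -> float:
    """Static part plus the dynamic error term estimated from the last interval."""
    delta = corrector(previous_state, previous_error)
    return static_forecast(weights, lagged_values, a, b) + delta


def placeholder_corrector(state: Sequence[float], error: float) -> float:
    """Stand-in for the trained network: carries half of the last error forward."""
    return 0.5 * error


forecast = corrected_forecast(
    weights=[0.5, 0.3, 0.2],
    lagged_values=[100.0, 80.0, 90.0],
    a=1.0,
    b=2.0,
    previous_state=[45.0, 0.25],
    previous_error=4.0,
    corrector=placeholder_corrector,
)
```

Keeping the corrector behind a plain callable means the stored static parameters and the trained network can be updated independently.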

Case Study
4.1. Data Characteristics. A section of the Second Ring Road (Section 1, as shown in Figure 4) and its surrounding trunk roads in Beijing, China, are taken as the object of study to verify the effectiveness and feasibility of the proposed method. Basic traffic flow data are detected by microwave detectors. The interval of the traffic flow data is 5 min. Goodness of Fit is shown in Table 1, in which the value of $R^2$ shows the high effectiveness of formulation (16).

Model Improvement.
The real-time correction algorithm is presented to obtain the dynamic part of the real-time corrected traffic correlation model, which is $\Delta(t)$. The steps are as follows.
Step 1 (clustering center search). Based on SAGA-FCM, LOS is divided into six types and $\Delta$ is divided into five types. The clustering centers are shown in Tables 2 and 3.
Step 2 (calculation of membership degree). The formulations are shown in Section 3.2.
Step 3 (training the Fuzzy Neural Network). The input layer has 11 nodes; the transfer function of the neurons in the hidden layer is tansig; the output layer has 5 nodes and the transfer function of its neurons is logsig; the training function is traingdx. Historical data is used to train the Fuzzy Neural Network.
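The network layout in Step 3 can be sketched as a plain forward pass with 11 inputs (six LOS memberships plus five Δ memberships), a hyperbolic tangent hidden layer, and five logistic outputs. The hidden layer size of 8, the random initial weights, and the function names are assumptions, since the paper does not report them, and training (gradient descent with momentum and an adaptive learning rate) is not shown.

```python
import math
import random
from typing import Callable, List, Sequence


def tansig(x: float) -> float:
    """Hyperbolic tangent sigmoid, with outputs in (-1, 1)."""
    return math.tanh(x)


def logsig(x: float) -> float:
    """Logistic sigmoid, with outputs in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in: int, n_out: int, rng: random.Random) -> List[List[float]]:
    """One weight row per output unit; the last entry of each row is the bias."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)] for _ in range(n_out)]


def layer_forward(
    weights: Sequence[Sequence[float]],
    inputs: Sequence[float],
    activation: Callable[[float], float],
) -> List[float]:
    """Apply one fully connected layer followed by its activation."""
    return [
        activation(sum(w * x for w, x in zip(row[:-1], inputs)) + row[-1])
        for row in weights
    ]


def fnn_forward(
    inputs: Sequence[float],
    hidden_weights: Sequence[Sequence[float]],
    output_weights: Sequence[Sequence[float]],
) -> List[float]:
    """Map 11 input memberships to 5 output activations."""
    hidden = layer_forward(hidden_weights, inputs, tansig)
    return layer_forward(output_weights, hidden, logsig)


rng = random.Random(0)
n_hidden = 8  # assumed; the hidden layer size is not given in the text
hidden_weights = init_layer(11, n_hidden, rng)
output_weights = init_layer(n_hidden, 5, rng)
# Hypothetical memberships at t - 1: six for LOS, five for the error term.
memberships = [0.1, 0.6, 0.2, 0.1, 0.0, 0.0, 0.0, 0.3, 0.5, 0.2, 0.0]
outputs = fnn_forward(memberships, hidden_weights, output_weights)
```

The five outputs play the role of activations of the five error subsets, which would then be defuzzified into a single error estimate.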

Short-Term Traffic Flow Forecasting.
Taking one day as an example, the result is shown in Figure 5. Mean

Conclusions
Traffic big data strongly shows temporal-spatial-historical correlation and the multistate characteristic. The traffic correlation model is established based on temporal-spatial-historical correlation. A correction coefficients optimization algorithm is put forward to reduce the number of parameters while ensuring calculation accuracy. To improve the effectiveness of short-term traffic flow forecasting, a real-time correction algorithm is presented based on the multistate characteristic. A Fuzzy Neural Network is used to overcome the problem of nonlinear mapping. The case study shows that the real-time correction algorithm can improve the effectiveness of the traffic correlation model.
The core of this paper is to present a short-term traffic flow forecasting method based on traffic big data analysis. The advantages of the real-time corrected traffic correlation model for traffic flow forecasting are as follows.
(1) The temporal-spatial-historical correlation, which is considered in the static part of the model, explains the physical meaning of traffic flow forecasting through a mathematical model.
(2) The multistate characteristic, which is considered in the dynamic part of the model, explains the dynamic characteristic of traffic flow.
(3) The real-time correction algorithm improves the accuracy of traffic flow forecasting.
The case study shows the high efficiency and applicability of the proposed methods. Moreover, the proposed methods can be extended to 15- or 30-minute-ahead forecasting. The next steps of this work are to study traffic incidents and their influence on short-term traffic flow forecasting. In addition, longer-horizon traffic flow forecasting, one hour ahead or more, also deserves attention.

Figure 1: Effective analysis of traffic correlation model.

Figure 4: Spatial location of research object.

Figure 5: Effective analysis of real-time correction algorithm.

Table 1: Goodness of Fit.

Table 2: Clustering centers of LOS.

Table 4. Comparing these two kinds of model, the real-time corrected traffic correlation model is better than the basic mathematical model. That is, the static part of real-time corrected traffic correlation

Table 4: Evaluation result of proposed models.