Development of Urban Road Network Traffic State Dynamic Estimation Method

Traffic state estimation is a key problem with considerable implications for modern traffic management. This paper develops a simple, general, and complete approach to the design of an urban network traffic state and phase estimator. A uniform structure for dynamic traffic state estimation is designed, consisting of three steps. (1) A preprocessing method for floating-car data and radio frequency identification data is proposed to remove abnormal data and complete the map matching process. (2) A section speed estimation method is proposed based on the degree of confidence. (3) A traffic phase identification method is proposed based on the estimated section speed. A number of simulation and field investigations have been conducted to test the estimator performance. The results indicate that the proposed approach estimates section speed with high accuracy and smoothness and effectively eliminates the influence of abnormal data fluctuations and insufficient data. The traffic phase identification method effectively filters out the abnormal distortion of estimated section speed around the threshold value and corrects step changes in traffic phase caused by abnormal data.


Introduction
Real-time traffic state estimation is a fundamental task for urban traffic management centers and is often a critical element of Intelligent Transportation Systems (ITS), which have been regarded as an important means of relieving traffic jams in cities [1]. Traffic state estimation for an urban network refers to estimating the traffic congestion level of the network at the current time based on available real-time traffic-aware data. For this purpose, various sensors are used to collect traffic information. To restore the real traffic conditions, the raw traffic information must be processed by various methods. Using multiple traffic data sources allows better inferences when deriving the traffic-aware data and yields a reproduction of the real traffic system that cannot be achieved from a single source of data. Furthermore, as sources of data become increasingly available, traffic state estimation needs to be applied not only to realize these benefits but also to use the data more efficiently.
For traffic state estimation, a limited number of research works have proposed estimation algorithms, almost exclusively based on the seminal methodology of Kalman filtering [2] and its extensions for nonlinear systems [3]. The Kalman filter is an optimal state estimator applied to a dynamic system that involves random noise and includes a limited amount of noisy real-time measurements. Although it was originally derived for linear systems, the Kalman filter can be extended to nonlinear systems via on-line Taylor expansions of the original nonlinear system. The filter obtained in this way is called the extended Kalman filter (EKF).
Early applications of traffic state estimation were reported in traffic surveillance systems for short interdetector distances by Gazis and Knapp [4], Knapp [5], Szeto and Gazis [6], Nahi and Trivedi [7], and Nahi [8]. In these and later investigations, for example, Smulders [9] and Bhouri et al. [10], only short freeway sections with a length below 2 km were considered; see the review by Cremer [11]. The applied models were relatively simple because of the short section lengths. Later approaches started using more comprehensive dynamic traffic flow models, which opened the way to the consideration of longer freeway stretches (2-4 km); see Cremer [12] and Cremer et al. [13]. More recent investigations elaborated on some technical details based on previously proposed basic ideas; see Kohan and Bortoff [14] and Meier and Wehlan [15]. Modeling approaches other than the Kalman filter have also been used, such as neural networks (e.g., [16-20]) and others (e.g., [21, 22]). Karlaftis and Vlahogianni [23] compare statistical methods and neural networks in transportation research, highlighting some of the differences and similarities of the two types of data analysis tools. Wang et al. [24, 25] developed a general approach to the real-time estimation of the complete traffic state in freeway stretches based on the extended Kalman filter.
In China, several approaches have been proposed as the economy and technology have developed. Jia et al. [26] used floating-car data in Beijing to identify the traffic state based on average vehicle speed. Dou et al. [27] proposed a logistic regression algorithm for traffic state probability forecasting, in order to obtain accurate and objective traffic state information that meets the demand of traffic guidance. Wang and Guo [28] presented heterogeneous data fusion methods and a road traffic code model for Beijing.
As indicated above, there has been substantial progress in developing traffic state estimation technologies. Compared with the research on freeway networks, however, few approaches for dynamically estimating urban area network traffic states are found in the literature. Developing the dynamic estimation of urban road network traffic states is of critical importance in managing urban traffic. The primary objective of this paper is to develop a traffic state dynamic estimation method based on degree of confidence for urban road networks.
This paper presents an urban road network traffic state estimation approach that combines a section speed estimation procedure based on confidence value with a traffic phase identification method to provide the most reasonable traffic phase identification results. The proposed method was evaluated on part of the road network in Nanjing, China, and its effectiveness was compared with that of the weighted average algorithm.
This paper is organized as follows. The first part introduces the traffic state estimation method. The second part presents the section speed estimation model based on confidence value. The third part covers the traffic phase identification method based on section speed and the performance evaluation. The last part gives the conclusions.

Traffic State Estimation
Several related terms are used in the remainder of this paper; to avoid any ambiguity or confusion between them, they are defined here based on existing research [27]. (Traffic) state: the state in which traffic is at any given time can be described by a number of parameters such as flow, density, and speed; as these are continuous variables, there can be an infinite number of traffic states. Traffic phase: simple traffic flow theory models assume a small number (usually 2 or 3) of traffic phases, which have direct physical interpretations (e.g., congested or uncongested). Traffic-aware data: a number of real-time estimated parameters, derived by big data technology from raw traffic data obtained from traffic sensors, which are used to describe traffic states.
Figure 1 shows the structure of the traffic state estimation method based on traffic-aware data developed in this study.
Input data include floating car data, radio frequency identification data, video data, and network data. Floating car data is generally updated at intervals ranging from 1 second to 10 minutes and is delivered using GPRS. After being interpreted and processed, the data includes information about the terminal, time, longitude, latitude, speed, flag, rate, and update time. Electronic license plates with RFID tags are installed on some vehicles, which enables the roadside RFID detectors to record the passing time, encrypted license plate number, and other vehicle properties when the vehicles pass by. Both the roadside RFID detectors and the video detectors are installed at the top of the intersection exit. After being interpreted and processed, the video data provides the section flow volume. Network geometry data is static and is provided by the government. After real-time raw data reading and storage, the raw data is preprocessed to amend errors. The degree of confidence calculation, which is based on data analysis, starts as soon as the raw data has been preprocessed. A historic data storage approach is proposed to support section average speed estimation. Traffic-aware data is then calculated from the different input data and the degree of confidence. Next, a section average speed estimation model based on degree of confidence is established to calculate the traffic phase identification parameters. Finally, a traffic phase identification method is developed to obtain the section traffic phase, and the final outputs are the estimated traffic phases.

Raw Traffic Sensors Data Analysis and Preprocessing
Usually, raw traffic sensor data is not good enough to be used directly. This section presents the raw data analysis results and the raw data preprocessing method.
Floating car data preprocessing is a preliminary screening of the raw floating car data to remove abnormal data. Abnormal data falls mainly into three categories: abnormal velocity data, wrong direction data, and time duplicate data.
The abnormal data is analyzed using floating car data collected on November 19, 2009, from 7:00 to 19:00 in Hangzhou, China. The total number of floating cars is 3839, and the total sample size is 2045574. Filter results are shown in Table 1.
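The three abnormal categories can be screened with a simple filter. The sketch below is illustrative only: the record fields `heading_deg` and `link_bearing_deg`, the 120 km/h speed bound, and the 90 degree direction tolerance are assumptions made for the example, not values taken from this study.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FcdRecord:
    terminal: str       # vehicle terminal id
    timestamp: int      # report time in seconds
    speed_kmh: float    # reported speed
    heading_deg: float  # hypothetical field: reported direction of travel

MAX_SPEED_KMH = 120.0        # assumed plausibility bound, not from the paper
MAX_HEADING_DIFF_DEG = 90.0  # assumed direction tolerance, not from the paper

def heading_diff(a, b):
    """Smallest absolute angle between two bearings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)

def filter_fcd(records, link_bearing_deg):
    """Drop time duplicates, abnormal velocities, and wrong-direction records."""
    seen = set()
    kept = []
    for r in records:
        key = (r.terminal, r.timestamp)
        if key in seen:
            continue  # time duplicate data
        seen.add(key)
        if not (0.0 <= r.speed_kmh <= MAX_SPEED_KMH):
            continue  # abnormal velocity data
        if heading_diff(r.heading_deg, link_bearing_deg) > MAX_HEADING_DIFF_DEG:
            continue  # wrong direction data
        kept.append(r)
    return kept
```

In practice the bounds would be tuned per road class, and the direction check would use the bearing of the matched road section.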
Because the positioning accuracy of GPS data is limited, vehicle positions often do not fall exactly on the road. Map matching algorithms integrate positioning data from a Global Positioning System (or a number of other positioning sensors) with a spatial road map, with the aim of identifying the road segment on which a user (or a vehicle) is travelling and the location on that segment. Map matching algorithms have been studied extensively in recent years [29-34]. Among the families of map matching algorithms (geometric, topological, probabilistic, and advanced), topological algorithms are relatively simple and fast, which allows them to be implemented in real time.
Thus, in this research, topological map matching algorithms are used to preprocess the floating car data. Radio frequency identification data is a kind of fixed single-point detector data. A points matching algorithm is proposed to pick out the records of the same car as it passes the detectors installed upstream and downstream of the section. The algorithm used to preprocess the raw radio frequency identification data in traffic state estimation is shown in Figure 2.
For each matched pair of point records, the section travel time is calculated from the two passing times. It is then combined with the section length between the pair of points to obtain the travel speed. In this step, abnormal data is removed from the database. Abnormal velocity data is the main type of abnormal RFID data.
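The pairing and speed calculation can be sketched as follows. This is a minimal illustration, not the algorithm of Figure 2: the function name, the tuple layout of the records, and the 120 km/h bound used to discard abnormal velocities are assumptions.

```python
def match_rfid_speeds(upstream, downstream, length_m, max_speed_kmh=120.0):
    """Pair upstream and downstream RFID records of the same plate.

    upstream, downstream: lists of (encrypted_plate, passing_time_s),
    assumed to be ordered by time. Returns a list of (plate, speed_kmh).
    """
    last_upstream = {}
    for plate, t in upstream:
        last_upstream[plate] = t  # later entries overwrite earlier ones

    speeds = []
    for plate, t_down in downstream:
        t_up = last_upstream.get(plate)
        if t_up is None or t_down <= t_up:
            continue  # no usable pair of point records
        speed_kmh = length_m / (t_down - t_up) * 3.6  # m/s to km/h
        if speed_kmh <= max_speed_kmh:  # assumed bound for abnormal velocity
            speeds.append((plate, speed_kmh))
    return speeds
```

For a 500 m section, a vehicle detected upstream at 0 s and downstream at 36 s yields a travel speed of about 50 km/h, while a pair only 1 s apart is discarded as abnormal.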
Video data is preprocessed by license plate recognition technology. Because the recognition precision is low (less than 30%), video data is used in this study only to provide the real-time traffic volume of each section.
Raw traffic sensor data analysis and preprocessing is the first step in estimating the traffic state. The next section proposes the section speed estimation method.

Section Speed Estimation Method Based on Degree of Confidence
Algorithm Outline and Explanation of Parameters.
In this study, section speed is used to estimate the traffic phase in each section. Floating car data and RFID data are both collected from only part of the vehicles in a road section. Considering the characteristics of the floating cars and of the vehicles with RFID tags, a weighted average method based on the degree of confidence is used to calculate the section speed. The degree of confidence is obtained from the sample size of the floating car data and RFID data as well as from the data features of both. Specifically, if the sample size is sufficient to estimate the section speed, the weighted average method is applied to calculate the section speed after map matching. If the sample size is not sufficient, a fusion of recent data and historic data of the same period is used. Several related terms are used in the algorithm models of this method; to avoid any ambiguity or confusion between them, they are defined here. Single vehicle weight: the method assigns a different weight to every piece of data (FCD and RFID) by an algorithm proposed in this paper, based on the speed and the sample size of a section.
Degree of confidence: the degree of confidence is the sum of the sample weights in one section. It is an important parameter in the fusion of recent data and historic data of the same period.
Sample size $n$ and minimum sample size $n_{\min}$: to ensure that the estimated section speed reflects the real-time traffic state, the sample size in one section should not be less than the minimum sample size $n_{\min}$. If the real-time sample size is not sufficient, recent data (e.g., data from 5 minutes ago) and historic data of the same period are used to estimate the section speed. The value of $n_{\min}$ is obtained by simulation experiments [35].
Count variable of small samples $m$ and confidence attenuation parameter $m_{\max}$: the number of consecutive data process cycles in which section $i$ lacks samples is counted as $m_i$. When $n < n_{\min}$, $m_i^{t} = m_i^{t-1} + 1$, and when $n > n_{\min}$, the value of $m_i$ returns to zero. $m_{\max}$ is the confidence attenuation parameter, which indicates that recent data older than $m_{\max}$ data process cycles is invalid for estimating the section speed.
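The counter update can be sketched in a few lines, using the names n, n_min, m, and m_max that appear in the flowchart text accompanying the figures. Treating the case n = n_min like an insufficient sample follows the flowchart test "n > n_min" and is an assumption; the function name is hypothetical.

```python
def update_small_sample_counter(n, n_min, m_prev, m_max):
    """Return the new count of consecutive cycles with too few samples.

    n: real-time sample size of the section in this cycle
    m_prev: counter value from the last data process cycle
    """
    if n > n_min:
        return 0  # enough samples: reset the counter
    # Too few samples: count one more cycle, capped at m_max.
    return min(m_prev + 1, m_max)
```

Once the counter reaches m_max, the recent data is too old to be used and the estimate has to rely on historic data of the same period.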

Single Vehicle Weight and Degree of Confidence.
In this study, the single vehicle weight is calculated from the FCD and RFID speed data, the sample size, and the speed distribution in one section, and it is then used to obtain the degree of confidence. Each piece of RFID and FCD data is assigned by its speed to one of three groups: high speed data, medium speed data, and low speed data. A separate weighting factor is assigned to each group (a high speed, a medium speed, and a low speed weighting factor). At the same time, the number of data in each group is counted for each section, giving the number of high speed data, the number of medium speed data, and the number of low speed data. The high speed single vehicle weight is then calculated by formula (1). The same procedure may be easily adapted to obtain the medium speed and low speed single vehicle weights. In this way, data in the group with the higher proportion is assigned a higher weight, while adjusting the weighting factors assigns different weights to the data of each group.
Based on the above method, the degree of confidence is calculated by formula (2), in which $n$ is the sample size of the section. The next step is to estimate the section speed. The section speed estimation process based on degree of confidence is shown in Figure 3.
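As an illustration only, the sketch below uses one form that is consistent with the description: a vehicle's weight is its group's weighting factor times the share of the section's samples that fall in that group, and the degree of confidence is the sum of these weights. This proportional form, the function names, and the assignment of 0.4, 0.5, and 0.1 to the high, medium, and low groups are assumptions and should not be read as formulas (1) and (2). The 15 km/h and 30 km/h group limits are those used in the performance evaluation.

```python
from collections import Counter

HIGH_KMH, LOW_KMH = 30.0, 15.0  # group limits from the performance evaluation
# Assumed mapping of the three weighting factors to the speed groups.
FACTORS = {"high": 0.4, "medium": 0.5, "low": 0.1}

def speed_group(v_kmh):
    if v_kmh > HIGH_KMH:
        return "high"
    if v_kmh >= LOW_KMH:
        return "medium"
    return "low"

def single_vehicle_weights(speeds):
    """Assumed form: group factor times the group's share of the samples."""
    if not speeds:
        raise ValueError("at least one speed sample is required")
    groups = [speed_group(v) for v in speeds]
    counts = Counter(groups)
    n = len(speeds)
    return [FACTORS[g] * counts[g] / n for g in groups]

def degree_of_confidence(weights):
    """Sum of the sample weights in one section."""
    return sum(weights)

def weighted_section_speed(speeds, weights):
    """Weighted average of the sample speeds."""
    return sum(w * v for w, v in zip(weights, speeds)) / sum(weights)
```

With speeds of 40, 35, 20, and 10 km/h, the two high speed samples dominate the section, so each of them receives a larger weight than the single low speed sample.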
If the sample size is sufficient ($n > n_{\min}$), section speed is estimated by formula (3). If the sample size is not sufficient ($n < n_{\min}$), section speed is estimated from real-time data, recent data, and historic data by formula (4). In (4), the recent data is described by the degree of confidence of the last time period data and by $V_0$, the estimated section speed in the last data process cycle. The historic data is described by the degree of confidence of historic data in the same period and by the section speed estimated from historic data in the same period.
Formula (4) contains two parameters that select between the data sources. When the first parameter takes the value 0, formula (4) becomes the same as formula (3). This reflects the case in which the sample size is sufficient to estimate the section speed, so the weighted average method is used.
When the first parameter takes the value 1 and the second takes the value 1, there is no real-time data and the recent data sample size is sufficient to estimate the section speed, and thus recent data is used to calculate the section speed. When the first parameter takes the value 1 and the second takes the value 0, there is no real-time data and the recent data sample size is not sufficient, and thus historic data is used to calculate the section speed. In (4), when each parameter takes an extreme value, the formula corresponds to one of these special data statuses. When the parameters take other values, the formula estimates the section speed from real-time data, recent data, and historic data together.
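The limiting cases described above are reproduced by a nested convex combination, sketched here under stated assumptions. The names `alpha` and `beta` and the exact form of the combination are hypothetical; in particular, how the two parameters follow from the degrees of confidence and the small-sample counter is not modeled.

```python
def fuse_section_speed(v_realtime, v_recent, v_historic, alpha, beta):
    """Blend three speed estimates (km/h) with two parameters in [0, 1].

    alpha = 0           -> real-time estimate only
    alpha = 1, beta = 1 -> recent estimate only
    alpha = 1, beta = 0 -> historic estimate only
    """
    if not (0.0 <= alpha <= 1.0 and 0.0 <= beta <= 1.0):
        raise ValueError("alpha and beta must lie in [0, 1]")
    non_realtime = beta * v_recent + (1.0 - beta) * v_historic
    return (1.0 - alpha) * v_realtime + alpha * non_realtime
```

When there is no real-time data, any finite placeholder can be passed for `v_realtime`, because `alpha = 1` removes it from the result.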

Traffic Phase Identification Method Based on Section Speed
In this paper, we analyze the raw FCD and RFID data collected in Nanjing, China, from March 29, 2011, 7:00 to March 29, 2011, 7:00.
Table 2 shows part of the results on the speed distribution, grouped by the speed thresholds used to identify the traffic phase in Nanjing. With the present traffic phase thresholds, the proportions of the speed groups are almost balanced. After removing invalid data and zero speed data, the raw speed data close to the thresholds is analyzed, and the result is shown in Table 3. Table 3 shows that more than 20% of the original data is close to a traffic phase threshold. When these data are used to identify the traffic phase, the result varies frequently between two phases. This behavior is not in accordance with traffic continuity, and thus a traffic phase identification method is proposed to make the identification results more accurate.
The process of traffic phase identification method is shown in Figure 4.
Speed threshold value and traffic phase are defined as follows.
When section speed is below $V_1$, the preidentified traffic phase in this section is 0. When section speed is between $V_1$ and $V_2$, the preidentified traffic phase is 1. When section speed is between $V_2$ and $V_3$, the preidentified traffic phase is 2. And when section speed is above the highest threshold, the preidentified traffic phase equals the number of thresholds. For the case described above, formula (5) applies. In general, the number of thresholds is 2, which indicates that the section has three traffic phases (free-flow phase, uncongested phase, and congested phase).
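Preidentification amounts to locating the section speed among the sorted thresholds. The sketch below assumes phases are numbered from 0 for the slowest phase, so that two thresholds give three phases as stated in the text, and that a speed exactly on a threshold belongs to the faster phase. The function name is hypothetical, and the 15 km/h and 30 km/h values are the thresholds used in the performance evaluation.

```python
from bisect import bisect_right

def preidentify_phase(speed_kmh, thresholds):
    """Return the index of the speed interval that contains speed_kmh.

    thresholds must be sorted in ascending order; with k thresholds the
    result ranges from 0 (below the lowest) to k (above the highest).
    """
    return bisect_right(thresholds, speed_kmh)

THRESHOLDS_KMH = (15.0, 30.0)  # values from the performance evaluation
```

For example, `preidentify_phase(20.0, THRESHOLDS_KMH)` falls between the two thresholds and so gives the middle phase.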
The near-threshold speed correcting method and the step change correcting method are described by formula (6). Formula (6) gives the estimated traffic phase in the current data process cycle (the estimated result) in terms of the estimated traffic phase in the last data process cycle and the preidentified traffic phase in the current data process cycle. $V_0$ is the section speed in the last data process cycle, and it is compared with the section speed in the current data process cycle. $V$ is the threshold that the current section speed is close to. Three parameters are used: a tiny-change parameter, which judges whether the change in section speed within a data process cycle is tiny; a near-threshold parameter, which judges whether the current section speed is close to the traffic phase threshold; and an obvious-change parameter, which judges whether the section speed changes obviously within a data process cycle.
When the absolute difference between the current section speed and $V_0$ is below the tiny-change parameter and the absolute difference between the current section speed and $V$ is below the near-threshold parameter, the speed change is tiny and the speed is close to the traffic phase threshold. Because of traffic continuity, the estimated traffic phase is set equal to the estimated traffic phase of the last cycle. When the current section speed exceeds $V_0$ by more than the obvious-change parameter and is within the near-threshold parameter of $V$, the section speed increases significantly and is close to the traffic phase threshold. This shows that the traffic phase changes obviously, so the estimated traffic phase is the last-cycle phase plus 1. The same procedure may be easily adapted to obtain the estimated traffic phase when $V_0$ exceeds the current section speed by more than the obvious-change parameter and the current section speed is close to $V$.
In other cases, for example, when the section speed increases significantly but is not close to the traffic phase threshold, the preidentified traffic phase reflects the actual traffic phase, so the estimated traffic phase is set equal to the preidentified one. The three parameters in (6) are calibrated by simulation test in this study (see Performance Evaluation). After the traffic phase identification process, the final traffic phase is obtained.
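The correction rules can be sketched as a single function. This is a hedged reading of the text rather than formula (6) itself: the parameter names `tiny`, `near`, and `jump`, the choice of the nearest threshold as $V$, the decrement by 1 for a significant slowdown, and the clamping to the valid phase range are all assumptions.

```python
def correct_phase(p_pre, p_prev, v_now, v_prev, thresholds, tiny, near, jump):
    """Correct a preidentified phase near a threshold.

    p_pre: preidentified phase in the current cycle
    p_prev: estimated phase in the last cycle
    v_now, v_prev: section speeds (km/h) in the current and last cycle
    tiny, near, jump: tiny-change, near-threshold, obvious-change parameters
    """
    v_thr = min(thresholds, key=lambda t: abs(v_now - t))  # closest threshold
    if abs(v_now - v_thr) >= near:
        return p_pre  # not close to a threshold: trust the preidentified phase
    dv = v_now - v_prev
    if abs(dv) < tiny:
        return p_prev  # tiny change near a threshold: keep the last phase
    if dv > jump:
        p = p_prev + 1  # significant speed increase
    elif -dv > jump:
        p = p_prev - 1  # significant speed decrease (assumed symmetric rule)
    else:
        return p_pre
    # Assumed safeguard: keep the phase within 0..len(thresholds).
    return max(0, min(p, len(thresholds)))
```

With thresholds of 15 and 30 km/h and parameters of 5, 2.5, and 12.5 km/h, a speed that drifts from 28 to 31 km/h stays in its previous phase instead of flipping at the 30 km/h threshold.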

Performance Evaluation
In this study, FCD and RFID speed data are used to estimate section speed, and the data sample size, the speed distribution, and the continuity of the traffic flow are considered in the estimation method to improve the accuracy and stability of the estimated results. To test the section speed estimation method, a numerical test on simulation data compares the speed average method with the method in this paper. In this experiment, speeds above 30 km/h are high speed data, speeds between 15 km/h and 30 km/h are medium speed data, and speeds below 15 km/h are low speed data. The three speed weighting factors are set to 0.4, 0.5, and 0.1. The result is shown in Figure 5(a).
As can be seen from the figure, when the sample size is small, the section speed estimated by the speed average method has a larger error and fluctuates considerably over time. The result of the method in this study is closer to the expected value, so the algorithm improves the accuracy and stability of the estimated results. In addition, real-time FCD and RFID data from January 19, 2009, is used to conduct a field test in Nanjing, China. Data from November 12, 2008, to November 14, 2008, is used to establish the historic database. The test site is the section between Ertiao Rd. and Yixian Bridge on Zhongshan East Rd. The true value of section speed is obtained by live video capture and analysis. The result is shown in Figure 5(b).
In this experiment, the accuracy of the section speed estimation method in this paper is 87.25%, and the trend of the result follows the true value more closely. The error analysis is shown in Table 4.
The traffic phase identification method is also tested by a numerical experiment. According to the macroscopic fundamental diagram (MFD) of the experiment area, section speeds above 30 km/h belong to the free-flow phase, section speeds between 15 km/h and 30 km/h belong to the uncongested phase, and section speeds below 15 km/h belong to the congested phase. The traffic phase thresholds are $V_1$ = 15 km/h and $V_2$ = 30 km/h. The three parameters of (6) are set to 5 km/h, 2.5 km/h, and 12.5 km/h. The experiment result is shown in Figure 6(a). In this experiment, 600 section speed results were identified by the method in this paper. Of these, 49 results were corrected (8.1%), and 83 traffic phase changes were fixed by the method (13.8%). The results show that the traffic phase identification method makes the results more accurate and reasonable.
In addition, a case study based on the estimated section speed data of March 18, 2009, is presented to verify the effectiveness and feasibility of this method. The result is shown in Figure 6(b).
In this case study, 5 traffic phase identification results were corrected. A comparison of the revised results with the original ones shows that the corrected result is more reasonable.

Conclusions
The primary objective of this paper is to develop a traffic state dynamic estimation method based on degree of confidence for urban road networks. A simple, general, and complete approach to the design of an urban network traffic state and phase estimator has been developed. A uniform traffic state dynamic estimation structure is designed, starting with an FCD and RFID preprocessing method that removes abnormal data and completes the map matching process. A section speed estimation method is then proposed based on degree of confidence, and a traffic phase identification method is proposed based on the estimated section speed. A number of simulation and field investigations have been conducted to test the estimator performance. The results indicate that the proposed approach estimates section speed with high accuracy and smoothness and effectively eliminates the influence of abnormal data fluctuations and insufficient data. The traffic phase identification method effectively filters out the abnormal distortion of estimated section speed around the threshold value and corrects step changes in traffic phase caused by abnormal data, which improves the identification results.
This research is currently applied in Nanjing, China, where a transportation information service system has been established to provide a service to the government. Future research includes the development of an efficient calibration method for the parameters in this paper and the enhancement of the simulation model to evaluate the performance of the strategy in large road networks.

Figure 1: Structure of traffic state dynamic estimation method.

Figure 2: Radio frequency identification data points matching algorithm.
Flowchart steps: get the sample size $n$ of section $i$; if $n > n_{\min}$, reset the counter ($m = 0$); otherwise, if $m_0 < m_{\max}$, set $m = m_0 + 1$, else set $m = m_{\max}$; then go to the next section ($i = i + 1$).

Figure 3: Section speed estimation process based on degree of confidence.

Figure 4: The process of traffic phase identification method.

Figure 5: Test and verification results of the section speed estimation method.

Figure 6: Test and verification results of the traffic phase identification method.

Table 1: Abnormal floating car data analysis results.

Table 2: Speed distribution by the speed threshold to identify the traffic phase in Nanjing.

Table 3: Near-threshold raw speed data analysis.