A Bayesian Combined Model for Time-Dependent Turning Movement Proportions Estimation at Intersections

Time-dependent turning movement flows are important input data for intelligent transportation systems, but they cannot be detected directly by current traffic surveillance systems. Existing estimation models have not proved sufficiently accurate and reliable across all time intervals. One way to address this problem is to develop a combined model framework that integrates multiple submodels running simultaneously. This paper first presents a back propagation neural network model to estimate dynamic turning movements, solved with a self-adaptive learning rate approach and the gradient descent with momentum method. Second, it develops an efficient Kalman filtering model and designs a revised sequential Kalman filtering algorithm. Using a Bayesian method that calibrates errors with both historical data and currently estimated results, the paper then integrates the two submodels into a Bayesian combined model framework and proposes a corresponding algorithm. A field survey at an intersection in Beijing collected time series of link counts and actual time-dependent turning movement flows, including historical and present data. The reported estimation results show that the Bayesian combined model is more accurate and stable than the two submodels.


Introduction
Time-dependent turning movement flows are important input data for traffic signal control systems, route guidance systems, and other ITS applications. However, existing traffic detection devices cannot measure the turning movement flows at intersections directly. Since the real-time link flows of entering and exiting legs can be detected conveniently, methods that estimate dynamic turning movements from detected time series of link counts, that is, dynamic origin-destination flow estimation (DODE), have been studied extensively in the literature.
Existing studies have formulated many DODE models, which can be generally classified into five categories according to the modeling technique: parameter optimization, entropy maximization, maximum likelihood, Kalman filtering (KF), and variational inequality (VI) methods. The first three categories are extended from static origin-destination (O-D) estimation problems, the KF model is a state-space method, and the VI model is a relatively new technique.
Along a different line, existing DODE models have focused on different target networks, including intersections, freeway segments, and general road networks. Intersection models were the earliest to be developed, and most of them were extended from static O-D estimation models. Cremer and Keller (1981) [1], Cremer (1983) [2], and Cremer and Keller (1984) [3] constructed relations between time series of link traffic flows and dynamic turning movement flows, that is, dynamic O-D flows, and formulated a series of least square models. Nihan and Davis (1987) [4] further proposed a revised least square model using a recursive method and designed truncation and normalization processes to satisfy the constraints on dynamic turning proportions. Bell [5] also presented a revised optimization model to estimate intersection turning movement flows, which considered the platoon dispersion between entrance and exit legs. All of the above studies employed only the inequality constraints in the optimization models. To further incorporate the equality constraints, Li and De Moor (1999) [6] put forward a new least square model. Considering that occasional outliers in the detected flows may greatly influence the results of least square models, Jiao et al. [7] proposed a least absolute deviation model and designed a genetic algorithm to obtain the optimal solution. All these models fall within the scope of the parameter optimization method. They can achieve rather accurate estimation results; however, they are not efficient enough.
The introduction of dynamic travel time and dynamic route choice extended intersection models to freeway segments and general networks. Most of these models employed state-space formulations, for example, Okutani (1987) [8], Ashok and Ben-Akiva (2000, 2002) [9,10], Lin and Chang (2007) [11], and Li et al. (2009) [12]. These models were all formulated using KF techniques and were applicable to freeway segments or general road networks. Since this paper mainly deals with intersection problems, we do not review these models in detail, but the efficiency of the KF method is valuable for time-dependent turning movement proportion estimation at intersections.
More recently, Nie and Zhang (2008) [13] employed the variational inequality technique in the DODE problem and formulated several new VI models. Lou and Yin (2010) [14] inferred turning movements with incomplete information at intersections and further incorporated them into a decomposition scheme for estimating dynamic O-D flows on actuation-controlled signalized arterials. Lu et al. (2013) [15] considered the influences of congestion and presented a single-level nonlinear optimization model to estimate the dynamic O-D flows, integrating a dynamic user equilibrium constraint. All these models have proved to be rather accurate.
Further analyses of existing studies show that the estimated dynamic O-D flows often fluctuate within a range around the actual values; that is, in some intervals, some models tend to overestimate the O-D flows, while other models tend to underestimate them. One possible way to approach this problem is to develop a combined model framework that automatically integrates multiple submodels running simultaneously.
To the best of our knowledge, no existing research estimates intersection turning movements in a combined model framework. Similar methods, however, have been used in short-term traffic forecasting, where multiple prediction models are combined using the errors the models made in previous time intervals, for example, Zheng et al. (2006) [16] and Dong et al. (2010) [17]. The nature of dynamic turning movement estimation is nevertheless different from that of short-term traffic forecasting. Therefore, the key feature of this paper is to integrate several different estimation models and to present a combined model to estimate the turning movement proportions at intersections.
The rest of this paper is organized as follows. Section 2 gives the basic problem statement and variable definitions. Section 3 presents a revised back propagation (BP) neural network model to estimate intersection turning movement proportions and further develops a self-adaptive learning rate approach and the gradient descent with momentum method to solve the BP neural network model. Section 4 formulates an efficient KF method and designs a revised sequential KF algorithm. Based on these two submodels, Section 5 proposes a Bayesian combined model, which is calibrated using both historical data and the results estimated in the past several intervals. Section 6 reports the evaluation results based on practical traffic survey data. Section 7 concludes the paper and recommends directions for future research.

Problem Statement and Variable Definitions
This paper deals with a typical intersection with $m$ entrance legs and $n$ exiting legs, as described in Figure 1.
For convenience of illustration, we first define the following variables used in this paper:
$q_i(k)$ is the detected link flow at entering leg $i$ during interval $k$, $i = 1, 2, \ldots, m$;
$y_j(k)$ is the detected link flow at exiting leg $j$ during interval $k$, $j = 1, 2, \ldots, n$;
$f_{ij}(k)$ is the turning flow entering from leg $i$ during interval $k$ with destination at leg $j$, that is, the time-varying turning movement flow;
$b_{ij}(k)$ is the corresponding turning proportion, that is, $b_{ij}(k) = f_{ij}(k) / q_i(k)$.
Obviously,
$$0 \le b_{ij}(k) \le 1, \tag{1}$$
$$\sum_{j=1}^{n} b_{ij}(k) = 1, \tag{2}$$
$$y_j(k) = \sum_{i=1}^{m} q_i(k)\, b_{ij}(k). \tag{3}$$
Additionally, we use the superscript "N" to denote the specific variables in the BP neural network model, "KF" to indicate the specific variables in the KF model, and "H" to mark the historical data.
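As a minimal sketch of the definitions above, the following Python snippet converts a matrix of turning flows into turning proportions and recovers the exit-leg flows from the entering flows. The function names `turning_proportions` and `exit_flows` and the flow values are hypothetical and chosen only for illustration; the sketch assumes that vehicles entering during an interval also exit during that interval.

```python
import numpy as np

def turning_proportions(f):
    """Turning proportions b[i, j] = f[i, j] / q[i], with q[i] the entering flow of leg i."""
    f = np.asarray(f, dtype=float)
    q = f.sum(axis=1, keepdims=True)  # entering flow of each leg
    return f / q

def exit_flows(q, b):
    """Exit-leg flows y[j] = sum_i q[i] * b[i, j]."""
    return np.asarray(q, dtype=float) @ np.asarray(b, dtype=float)

# Hypothetical turning flows (vehicles per interval) for a four-leg
# intersection; the zero diagonal means no U-turns.
f = np.array([[0, 20, 50, 10],
              [15, 0, 25, 60],
              [40, 12, 0, 18],
              [30, 55, 9, 0]])
q = f.sum(axis=1)           # detected entering flows
b = turning_proportions(f)  # every row sums to one
y = exit_flows(q, b)        # equals the column sums of f
```

In the estimation problem only `q` and `y` are observed, and `b` is the unknown to be recovered.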
Based on the $q_i(k)$ and $y_j(k)$ obtained from traffic detection devices, the key issue of this paper is to estimate the time-varying turning movements $f_{ij}(k)$ or turning proportions $b_{ij}(k)$ from the detected time series of link counts.

Back Propagation Neural Network Model
The BP neural network model has the following main advantages.
(1) It has very strong self-learning abilities and is capable of addressing very complicated nonlinear estimation problems, especially for short-term cases.
(2) A large amount of training data can be obtained directly from traffic surveillance systems, which greatly improves the performance of the estimation.
(3) The back propagation of estimation errors enables the dynamic adjustment of weights, which speeds up the training process and improves the accuracy of the model.
There are three layers in the BP neural network model.
(1) Input layer: three neurons, corresponding to the link counts of the upstream approach. The number of neurons depends on the number of upstream lanes and detectors.
(2) Hidden layer: 15 neurons, a number determined through repeated experiments. The logarithmic sigmoid transfer function is used in the hidden layer; its output lies between 0 and 1, matching the actual range of turning movement proportions. The transfer function is
$$f(x) = \frac{1}{1 + e^{-x}}. \tag{4}$$
(3) Output layer: three neurons, corresponding to the turning movement flows of the left-turn, go-straight, and right-turn directions, that is, the estimation results. A linear transfer function is used in the output layer.
In this paper, historically detected link flows and surveyed turning proportions are used as the training set to train the BP neural network model, and currently detected link flows are used as the test set to estimate the turning fractions.
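The three-layer structure described above can be sketched as a single forward pass. This is an illustrative sketch only: the random weights, the additive bias convention for the thresholds, and the names `logsig` and `forward` are assumptions, not the authors' MATLAB implementation.

```python
import numpy as np

def logsig(x):
    """Logarithmic sigmoid transfer function; its output lies in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def forward(x, W1, b1, W2, b2):
    """Forward pass: 3 inputs -> 15 logsig hidden neurons -> 3 linear outputs."""
    hidden = logsig(W1 @ x + b1)   # hidden layer activations
    output = W2 @ hidden + b2      # linear output layer
    return output, hidden

# Hypothetical, untrained parameters (assumption: thresholds enter as additive biases).
rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.1, size=(15, 3))
b1 = np.zeros(15)
W2 = rng.normal(scale=0.1, size=(3, 15))
b2 = np.zeros(3)

x = np.array([0.4, 0.7, 0.2])  # hypothetical scaled link counts of one approach
out, hidden = forward(x, W1, b1, W2, b2)
```

After training, the three outputs would be read as the left-turn, go-straight, and right-turn estimates for the approach.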

Algorithm of Self-Adaptive Learning Rate and Gradient Descent with Momentum. Since the common BP neural network method is rather slow in the training process and often converges to a locally optimal solution [18], this paper adopts an integrated algorithm of self-adaptive learning rate and gradient descent with momentum, which accelerates the training process considerably and improves the reliability of the algorithm.
The self-adaptive learning rate approach is defined in terms of the following quantities: $\eta(t)$ is the learning rate at step $t$, $\lambda$ is the growth factor of the learning rate, $w(t)$ is the weight at step $t$, $\Delta w(t)$ is the difference between the current weight and the previous weight, and $\mathrm{sign}[\cdot]$ is the function returning the sign of the variable in the square brackets.
Here the learning rate is adjusted automatically according to the gradient in the previous step, with the initial value within [0.01, 0.8].
The gradient descent with momentum method updates the weights and thresholds at step $t + 1$ from their values at step $t$, where $i$ denotes neurons in the input layer, $j$ indicates neurons in the hidden layer, $k$ denotes neurons in the output layer, $w_{ij}$ is the weight connecting neuron $i$ in the input layer and neuron $j$ in the hidden layer, $w_{jk}$ is the weight connecting neuron $j$ in the hidden layer and neuron $k$ in the output layer, $\alpha$ is the momentum factor with the constraint $0 < \alpha < 1$, $\theta_j$ is the threshold of node $j$ in the hidden layer, and $\theta_k$ is the threshold of node $k$ in the output layer. Both $\theta_j$ and $\theta_k$ are updated throughout the training process. The above algorithm is coded in the MATLAB M language.
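To illustrate how a sign-based adaptive learning rate and a momentum term interact, the sketch below applies both to a toy quadratic loss. It is not the paper's update rule, whose exact equations are not reproduced here. The multiplicative factors `grow` and `shrink`, the clipping of the learning rate to [0.01, 0.8], and the function name `train_momentum_adaptive` are assumptions made for illustration.

```python
import numpy as np

def train_momentum_adaptive(grad, w0, eta0=0.1, alpha=0.5,
                            grow=1.05, shrink=0.7,
                            eta_bounds=(0.01, 0.8), steps=200):
    """Gradient descent with momentum and a sign-based adaptive learning rate."""
    w = np.asarray(w0, dtype=float).copy()
    eta = eta0
    prev_grad = np.zeros_like(w)
    step = np.zeros_like(w)  # previous weight change
    for _ in range(steps):
        g = grad(w)
        # Assumption: raise the rate when successive gradients agree in
        # direction, lower it when they disagree.
        agreement = np.sign(np.dot(g, prev_grad))
        if agreement > 0:
            eta *= grow
        elif agreement < 0:
            eta *= shrink
        eta = float(np.clip(eta, *eta_bounds))
        # Momentum: reuse a fraction alpha of the previous weight change.
        step = alpha * step - eta * g
        w = w + step
        prev_grad = g
    return w, eta

# Toy problem: minimise 0.5 * ||w - target||^2, whose gradient is w - target.
target = np.array([0.2, 0.5, 0.3])
w, eta = train_momentum_adaptive(lambda w: w - target, np.zeros(3))
```

In a BP network the same update would be applied to every weight and threshold, with `grad` supplied by back propagation of the output error.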

KF Model.
To estimate the dynamic turning movement flows efficiently, we further propose a revised Kalman filtering model using the turning movement proportions $b_{ij}(k)$ as the state variables to reflect the interrelations between the link flows at entering and exiting legs.
We formulate the state transition equation as follows:
$$\mathbf{B}(k+1) = \mathbf{B}(k) + \mathbf{W}(k), \tag{7}$$
where $\mathbf{B}(k)$ is a column vector form of $b_{ij}(k)$ and $\mathbf{W}(k)$ is a column vector of random errors. $\mathbf{W}(k)$ is a Gaussian white noise vector with zero mean and covariance matrix $\mathbf{D}\delta_{kl}$, where $\mathbf{D}$ is a constant positive semidefinite matrix and $\delta_{kl}$ is the Kronecker delta; that is, $\delta_{kl} = 1$ if $k = l$; otherwise $\delta_{kl} = 0$.
According to (3), we formulate the measurement equation as follows:
$$\mathbf{Y}(k) = \mathbf{Q}(k)\,\mathbf{B}(k) + \mathbf{e}(k), \tag{8}$$
where $\mathbf{Y}(k)$ is a column vector of detected link flows at exiting legs, $\mathbf{Q}(k)$ is the corresponding measurement matrix, and $\mathbf{e}(k)$ is the column vector of link flow detection errors, with zero mean and covariance matrix $\mathbf{R}\delta_{kl}$, where $\mathbf{R}$ is also a constant positive semidefinite matrix, just like $\mathbf{D}$. Equations (7) and (8) constitute the Kalman filtering model to estimate the time-dependent turning movement proportions in a state-space formulation.
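One way to build the measurement matrix from the entering flows is sketched below, assuming the state vector stacks the turning proportions row by row (leg 1 to all exits, then leg 2, and so on). The function name `measurement_matrix` and the numerical values are hypothetical.

```python
import numpy as np

def measurement_matrix(q, n_exit):
    """Measurement matrix Q(k) such that Q(k) @ B(k) gives the exit-leg flows.

    Assumes B(k) is the row-major flattening of the proportion matrix, so
    the entry for entering leg i and exit leg j sits at index i * n_exit + j.
    """
    q = np.asarray(q, dtype=float)
    return np.kron(q.reshape(1, -1), np.eye(n_exit))

# Hypothetical turning proportions and entering flows for a four-leg intersection.
b = np.array([[0.00, 0.25, 0.50, 0.25],
              [0.20, 0.00, 0.30, 0.50],
              [0.60, 0.15, 0.00, 0.25],
              [0.30, 0.60, 0.10, 0.00]])
q = np.array([100.0, 80.0, 120.0, 90.0])

B = b.ravel()                 # state vector
Q = measurement_matrix(q, 4)  # shape (4, 16)
Y = Q @ B                     # noise-free exit-leg flows
```

Because the matrix depends on the entering flows of the current interval, it has to be rebuilt at every time step.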

Revised Sequential KF Algorithm.
To improve the estimation efficiency, we design a revised sequential KF algorithm [19] to avoid the inverse matrix computation. The truncation and normalization processes [4] are also integrated into the algorithm so that the results satisfy the inherent constraints of $b_{ij}(k)$ in (1) and (2).
Furthermore, the initial values of $b_{ij}(k)$ are preset according to the number of lanes turning to different directions instead of the conventional average value 0.33, as shown in (9). This revision of the initial values accelerates the convergence of the revised KF algorithm:
$$b_{ij}(0) = \frac{l_{ij}}{\sum_{j'} l_{ij'}}, \tag{9}$$
where $l_{ij}$ is the number of lanes from entering leg $i$ turning to exiting leg $j$. A mixed lane is divided equally among the directions it serves.
Except for the initial values of $b_{ij}(k)$ and the truncation and normalization processes, the other steps in the algorithm are similar to those of existing sequential KF algorithms [19].
This algorithm is also coded in the MATLAB M language.
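The following sketch shows one possible form of such a sequential update, written in Python rather than MATLAB. Each exit-leg measurement is processed as a scalar, so only a scalar division is needed instead of a matrix inverse; this is valid under the assumption that the measurement errors of different exit legs are uncorrelated. The noise levels, lane counts, and function names are hypothetical, and the sketch is not the authors' implementation.

```python
import numpy as np

def initial_proportions(lanes):
    """Lane-based initial values: each row of lane counts scaled to sum to one."""
    lanes = np.asarray(lanes, dtype=float)
    return lanes / lanes.sum(axis=1, keepdims=True)

def truncate_normalize(b):
    """Clip proportions to [0, 1], then rescale every row to sum to one."""
    b = np.clip(b, 0.0, 1.0)
    row = b.sum(axis=1, keepdims=True)
    safe_row = np.where(row > 0, row, 1.0)
    uniform = np.full_like(b, 1.0 / b.shape[1])  # fallback for an all-zero row
    return np.where(row > 0, b / safe_row, uniform)

def sequential_kf_step(b, P, q, y, D, r):
    """One interval of a sequential KF: scalar updates, no matrix inverse."""
    m, n = b.shape
    x = b.ravel().copy()
    P = P + D                      # time update for the random-walk state
    for j in range(n):             # one exit-leg measurement at a time
        h = np.zeros(m * n)
        h[j::n] = q                # row j of the measurement matrix
        Ph = P @ h
        s = h @ Ph + r[j]          # scalar innovation variance
        gain = Ph / s
        x = x + gain * (y[j] - h @ x)
        P = P - np.outer(gain, Ph)
    return truncate_normalize(x.reshape(m, n)), P

# Hypothetical lane counts (rows: entering legs, columns: exit legs).
lanes = np.array([[0, 1, 2, 1], [1, 0, 1, 2], [2, 1, 0, 1], [1, 2, 1, 0]])
true_b = np.array([[0.0, 0.2, 0.6, 0.2], [0.3, 0.0, 0.2, 0.5],
                   [0.5, 0.2, 0.0, 0.3], [0.2, 0.6, 0.2, 0.0]])
b0 = initial_proportions(lanes)
b, P = b0.copy(), np.eye(16) * 0.01
D, r = np.eye(16) * 1e-4, np.full(4, 1.0)
rng = np.random.default_rng(1)
for _ in range(50):
    q = rng.uniform(50, 150, size=4)  # simulated entering flows
    b, P = sequential_kf_step(b, P, q, q @ true_b, D, r)
```

The covariance matrix is left unchanged by the truncation and normalization step, which is a simplification of this sketch.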

Bayesian Combined Model
Through the above two models, one can obtain the time-dependent turning movement proportions $b^{\mathrm{N}}_{ij}(k)$ and $b^{\mathrm{KF}}_{ij}(k)$, respectively, where $b^{\mathrm{N}}_{ij}(k)$ is the result from the BP neural network model and $b^{\mathrm{KF}}_{ij}(k)$ is the result from the KF model. To get more accurate and stable estimation results, we integrate these two models into a combined model framework, as shown in Figure 3.
Figure 3 shows that the Bayesian combined model is a weighted average of the BP neural network and KF models:
$$b^{\mathrm{B}}_{ij}(k) = w^{\mathrm{N}}(k)\, b^{\mathrm{N}}_{ij}(k) + w^{\mathrm{KF}}(k)\, b^{\mathrm{KF}}_{ij}(k), \tag{10}$$
where $b^{\mathrm{B}}_{ij}(k)$ is the estimation result of the combined model, $w^{\mathrm{N}}(k)$ is the weight of the BP neural network model, and $w^{\mathrm{KF}}(k)$ is the weight of the KF model. By adjusting these two weights, each submodel may be strengthened or weakened. If the weight of a submodel is set to zero, that submodel is neglected. The weights are determined by comparing estimation errors, as detailed below.
According to the historical estimated results of the two submodels and the historical actual turning proportions, we can get the mean absolute percentage error (MAPE) of each submodel. Here we use $\mathrm{EH}^{\mathrm{N}}$ to denote the historical MAPE of the BP neural network model and $\mathrm{EH}^{\mathrm{KF}}$ to denote the historical MAPE of the KF model. The prior probabilities of choosing the BP neural network model and the KF model are determined from these historical MAPEs in (11), where the superscript "N" means the BP neural network model, the superscript "KF" means the KF model, the function $\Pr(\cdot)$ denotes a choice probability, $\Pr(M^{\mathrm{N}})$ is the prior probability of model $M^{\mathrm{N}}$ (the BP neural network model), and $\Pr(M^{\mathrm{KF}})$ is the prior probability of model $M^{\mathrm{KF}}$ (the KF model). Equations in (11) illustrate the influences of historical estimation errors.
To further reflect the influences of current estimation errors, we define $E^{\mathrm{N}}$ and $E^{\mathrm{KF}}$ as the MAPEs of the currently running BP neural network and KF models, respectively. Here the MAPEs are computed over the previous 5 intervals; that is, they are updated at every interval. In (12), $\Pr(b \mid M^{\mathrm{N}})$ is the probability of generating estimation $b$ using model $M^{\mathrm{N}}$ and $\Pr(b \mid M^{\mathrm{KF}})$ is the probability of generating estimation $b$ using model $M^{\mathrm{KF}}$. The most important issue in (12) is that the present actual turning proportions are unavailable. To address this problem, we use the Bayesian weighted results instead of the actual values. For the first five intervals, the corresponding historical actual values are used instead. Since the window of the previous five intervals keeps rolling, there is an updating mechanism in the Bayesian combined model stated here, as shown in Figure 3.
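The rolling five-interval error window can be sketched with a fixed-length queue. The class name `RollingMape` is hypothetical, and errors are expressed as fractions rather than percentages. The reference value passed to `update` would be the historical actual proportion during the first five intervals and the Bayesian weighted result afterwards, as described above.

```python
from collections import deque

class RollingMape:
    """Mean absolute percentage error over the most recent `window` intervals."""

    def __init__(self, window=5):
        # A deque with maxlen drops the oldest error automatically.
        self.errors = deque(maxlen=window)

    def update(self, estimate, reference):
        """Record the absolute relative error of one interval."""
        self.errors.append(abs(estimate - reference) / abs(reference))

    def value(self):
        """Current rolling MAPE (as a fraction, not a percentage)."""
        return sum(self.errors) / len(self.errors)

# Hypothetical usage: one interval with 20% error, then five with 10% error.
tracker = RollingMape(window=5)
tracker.update(0.60, 0.50)
for _ in range(5):
    tracker.update(0.55, 0.50)
rolling = tracker.value()  # only the last five intervals remain in the window
```

One tracker per submodel would be kept and updated at every interval.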
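According to Bayes' theorem, the posterior probability [20] can be formulated as follows:
$$\Pr(M^{\mathrm{N}} \mid b) = \frac{\Pr(b \mid M^{\mathrm{N}})\Pr(M^{\mathrm{N}})}{\Pr(b \mid M^{\mathrm{N}})\Pr(M^{\mathrm{N}}) + \Pr(b \mid M^{\mathrm{KF}})\Pr(M^{\mathrm{KF}})}, \qquad \Pr(M^{\mathrm{KF}} \mid b) = \frac{\Pr(b \mid M^{\mathrm{KF}})\Pr(M^{\mathrm{KF}})}{\Pr(b \mid M^{\mathrm{N}})\Pr(M^{\mathrm{N}}) + \Pr(b \mid M^{\mathrm{KF}})\Pr(M^{\mathrm{KF}})}, \tag{13}$$
where $\Pr(M^{\mathrm{N}} \mid b)$ is the posterior probability of the BP neural network model and $\Pr(M^{\mathrm{KF}} \mid b)$ is the posterior probability of the KF model.
Finally, we obtain the weights of the two submodels from these posterior probabilities in (14). Using (10) and (14), we can obtain the final estimation results of the Bayesian combined model.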
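The weighting procedure can be sketched end to end as follows. Two parts of this sketch are assumptions, since the corresponding formulas are not reproduced in the text: priors and likelihoods are taken as normalized inverse MAPEs, and the weights are set equal to the posterior probabilities. The function names `bayesian_weights` and `combine` are hypothetical.

```python
def bayesian_weights(hist_mape_n, hist_mape_kf, cur_mape_n, cur_mape_kf):
    """Posterior-based weights for the BP neural network and KF submodels.

    Assumption: a smaller MAPE gives a proportionally larger probability
    (normalized inverse errors). All MAPEs must be strictly positive.
    """
    # Priors from historical errors.
    prior_n = (1.0 / hist_mape_n) / (1.0 / hist_mape_n + 1.0 / hist_mape_kf)
    prior_kf = 1.0 - prior_n
    # Likelihoods from the errors of the most recent intervals.
    like_n = (1.0 / cur_mape_n) / (1.0 / cur_mape_n + 1.0 / cur_mape_kf)
    like_kf = 1.0 - like_n
    # Bayes' theorem: posterior is likelihood times prior over the evidence.
    evidence = like_n * prior_n + like_kf * prior_kf
    w_n = like_n * prior_n / evidence
    return w_n, 1.0 - w_n  # assumption: weights equal the posteriors

def combine(b_n, b_kf, w_n, w_kf):
    """Weighted average of the two submodel estimates."""
    return w_n * b_n + w_kf * b_kf

# Hypothetical errors: the KF model is twice as accurate in both periods.
w_n, w_kf = bayesian_weights(0.20, 0.10, 0.20, 0.10)
b_combined = combine(0.30, 0.36, w_n, w_kf)
```

With equal errors the two weights are both 0.5; when one submodel has been more accurate both historically and recently, its weight grows accordingly.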

Case Study
To test the accuracy and efficiency of the two submodels and the Bayesian combined model, we collected 2 hours of real-world data through a field survey during morning peak hours at the intersection of Zhaodengyu road and Pinganli west road, which is located in Xicheng district, Beijing, China. We obtained sufficient data for the case study, including time-varying link flows on all entering and exiting legs and dynamic turning movement flows from all entrance legs. The survey was conducted on two days during the same morning peak period. Data from the first day are used as the historical information, and data from the second day are used as the currently detected information.
Furthermore, we implemented the three algorithms and obtained the estimated turning movement proportions from the three models. A time interval of 3 minutes is used, and the evaluation indices are averaged over the total number of time intervals. The evaluation indices of the two submodels and the Bayesian combined model are all reported in Table 1. Because the estimation results of the KF model during its initialization period are usually unreliable, the results of the first 20 intervals are excluded from the index statistics.
Graphical illustrations of these estimation results during the last 35 intervals are shown in Figures 4, 5, and 6.
The case study yields the following observations.
(1) All three models are rather accurate. In relative terms, the Bayesian combined model yields better estimation results than the two submodels, as is evident from the magnitudes of all four evaluation indices in Table 1.
(3) Results of all three models fluctuate within a range around the actual values; that is, during some intervals, the BP neural network model tends to overestimate the turning fractions, while the KF model tends to underestimate them, and vice versa during some other intervals. However, the fluctuation of the estimation results from the Bayesian combined model is the smallest, which indicates that the proposed Bayesian combined model is the most stable and robust.
(4) The results for the go-through direction are better than those for the left-turn and right-turn directions, mainly because the go-through proportions are larger.
(5) Estimation results of all turning fractions show that the KF model is somewhat more accurate than the BP neural network model. A possible reason is that only 40 intervals of data are used for training, which reduces the accuracy of the BP neural network model.

Conclusions
This paper addresses three models for time-dependent turning movement proportion estimation at intersections: a BP neural network model, a KF model, and a Bayesian combined model. We first present a revised BP neural network model and develop a self-adaptive learning rate approach and the gradient descent with momentum method for its solution. For more efficient estimation, we propose a revised KF model and design a modified sequential KF algorithm. Taking into account historical information and estimation errors, we further integrate the two submodels into a Bayesian combined model framework, which is calibrated using both historical and present estimation errors.
The reported examples based on practical traffic survey data demonstrate that the Bayesian combined model is more accurate and stable than the two submodels. Future research is directed towards two aspects. The first is to take into account the travel time of intersection turning vehicles under traffic congestion. The second is to extend the combined model to freeway segments and general road networks and to estimate O-D flows for large-scale networks.

Figure 2: Architecture of back propagation neural network model.

3.1. Architecture of the BP Neural Network Model. The architecture of the BP neural network model is shown in Figure 2. It mainly has the following advantages.

Figure 3: Updating flow of the Bayesian combined model.

Table 1: Statistical results of evaluation indices.
For the historical estimation, the first 30 intervals are used to train the BP neural network model, and the results of the last 10 intervals are estimated. Furthermore, all 40 historical intervals are used to train the BP neural network model for the present estimation. The right-turn at the east entrance, go-straight at the west entrance, and left-turn at the north entrance, that is, $b_{14}(k)$, $b_{31}(k)$, and $b_{41}(k)$, are taken as examples to show the accuracy of the proposed models. The mean absolute percentage error (MAPE), mean percentage error (MPE), root mean square error (RMSE), and normalized root mean square error (NRMS) between actual and estimated values are selected as the evaluation criteria. A reduction in these indices thus represents a potential improvement of the proposed models and algorithms.
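The four evaluation criteria named above can be sketched as follows. The definitions of MAPE, MPE, and RMSE are the standard ones; the sign convention of MPE (estimate minus actual) and the normalization of NRMS by the mean actual value are assumptions, since several normalizations are in common use. Errors are returned as fractions rather than percentages, and the sample values are hypothetical.

```python
import numpy as np

def mape(actual, estimate):
    """Mean absolute percentage error (as a fraction)."""
    actual, estimate = np.asarray(actual, float), np.asarray(estimate, float)
    return np.mean(np.abs((estimate - actual) / actual))

def mpe(actual, estimate):
    """Mean percentage error; positive values indicate overestimation."""
    actual, estimate = np.asarray(actual, float), np.asarray(estimate, float)
    return np.mean((estimate - actual) / actual)

def rmse(actual, estimate):
    """Root mean square error."""
    actual, estimate = np.asarray(actual, float), np.asarray(estimate, float)
    return np.sqrt(np.mean((estimate - actual) ** 2))

def nrms(actual, estimate):
    """RMSE normalized by the mean actual value (assumed normalization)."""
    return rmse(actual, estimate) / np.mean(np.asarray(actual, float))

# Hypothetical turning proportions over two intervals.
actual = [0.5, 0.5]
estimate = [0.6, 0.4]
scores = (mape(actual, estimate), mpe(actual, estimate),
          rmse(actual, estimate), nrms(actual, estimate))
```

In this example the over- and underestimates cancel in the MPE while the MAPE stays at 0.2, which is why the two indices are reported together.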