Real-Time Prediction of Lane-Based Queue Lengths for Signalized Intersections
Journal of Advanced Transportation

Queue length is one of the most important traffic evaluation indexes for traffic signal control at signalized intersections. Most previous studies have focused on estimating queue length rather than predicting it, so queue length cannot be obtained effectively in advance. In this paper, we applied the Lighthill–Whitham–Richards shockwave theory and Robertson’s platoon dispersion model to predict the arrival of vehicles in advance at intervals of 5 seconds. This approach fully described the relationship between disparate upstream traffic arrivals (as a result of vehicles making different turns) and the variation of incremental queue accumulation. It also addressed the shortcomings of the uniform arrival assumption in previous research. In addition, to predict the queue length of multiple lanes at the same time, we integrated the prediction of the traffic volume proportions in each lane using the Kalman filter. We tested this model in a field experiment, and the results showed that the model had satisfactory accuracy. We also discussed the limitations of the proposed model.


Introduction
Queue length is the most important index for signal control evaluation [1] or signal optimization [2][3][4][5][6]. Over the years, many researchers have devoted themselves to the study of queue length, which can be divided into three categories (i.e., detection, estimation, and prediction), according to queue length acquisition methods. The first category, the direct detection of queue length using equipment such as cameras, is one of the most commonly used methods to obtain queue length in recent research [7][8][9]. This method can simply and quickly obtain the queue length, but it does not consider fluctuations in traffic flow, and the maximum queue will not be obtained when the queue length exceeds the visual range of the camera.
The second category, queue length estimation, is the most studied by scholars. In the literature, queue length estimation methods can generally be classified into two categories [10, 11]: input-output models [2, 12-14] and shockwave models [10, 11, 15-18]. The input-output model analyzes the cumulative traffic input-output (arrival-departure curve) of a link to estimate the queue length. This kind of model has simple conceptual properties. It is limited, however, by the inability to capture the spatial queue in actual arterial traffic. At the same time, the traditional input-output analysis cannot describe the spatial distribution of queue length in real time, nor is this model suitable for the estimation of queue length at oversaturated intersections [10].
Recently, much attention has been given to the formation and dissipation of queues using traffic shockwave theory. The shockwave model provides a better analysis framework for queue length estimation [6]. With the development of traffic data acquisition technology, the estimation of queue length by probe vehicles has also become a common method [19][20][21][22]. Because of the unique mobility of probe vehicle data and limitations on probe vehicle size, the precision of queue length estimation can be guaranteed only when the penetration rate of probe vehicles is high. A penetration rate of 30% was recommended by Ban et al. [11] and by Goodall, Park, and Smith [23]. According to Hao et al. [24], penetration rates at or above 10% are able to provide mean absolute error within ±3 vehicles in queue length estimation. In addition, most researchers take isolated intersections as the research subject to establish their queue length estimation model [19][20][21][22]. It is usually assumed that the vehicle arrives from a uniform traffic distribution [11, 19-21]. Intersections in real road networks often are not isolated, however, and downstream vehicle arrival is often closely related to upstream vehicle release characteristics, flow rate, and travel time [25, 26]. The uniform distribution of vehicles cannot accurately describe the dispersed characteristics of vehicle arrival, nor can it describe the real-time attributes of the arrival of vehicles with disparate characteristics of queue length. How to describe this disparity is a primary focus of this paper.
The third category is the prediction of queue length. With the improvement of traffic control requirements, predictive traffic signal control has become a developing trend, which relies on the ability to obtain relevant parameters of traffic control in advance [27][28][29][30][31]. Therefore, the prediction of queue length is essential for predictive control optimization. Research on queue length prediction is scarce. Hao and Ban [24] used mobile data to estimate queue length and noted that queue length prediction is the direction of future research. RHODES [32] and Sharma et al. [13] predicted vehicle arrival and release rate using conventional input-output and queue length estimations. Akçelik [12] modified the parameters of the Highway Capacity Manual queue estimation model by statistical analysis and established a queue length prediction model. This model, however, cannot describe the spatial distribution and evolution of queue length, nor is it suitable for prediction of queue length at oversaturated intersections. Geroliminis and Skabardonis [25] combined platoon dispersion characteristics and Lighthill-Whitham-Richards (LWR) theory to predict queue length effectively, but the model considers only the maximum queue, and it cannot describe the evolution of the queue in real time or analyze the dynamic effects of different upstream turning flows on the queue length. In addition, this model cannot predict the queue length of multiple lanes at the same time.
To overcome the shortcomings of previous studies, in which vehicle arrival was assumed to be uniformly distributed and the evolutionary process of queue length could not be described, we combined the advantages of traffic wave theory and the platoon dispersion model to analyze vehicle arrival. We predicted the queue length in real time and obtained changes in queue length in advance, which provided support for predictive traffic signal optimization.
This study makes the following contributions: (1) We obtained the different upstream turning flows in real time at intervals of 5 seconds and fully considered the dispersion characteristics of vehicles to predict downstream vehicle arrival, which overcame the limitation of the uniform arrival assumption in previous research on queue length estimation.
(2) The proposed model predicted the lane-based queue length in real time. The prediction included incremental queue accumulation (IQA), queue trajectory, maximum queue, and residual queue, which overcame the shortcoming of previous research that could not obtain the evolution trend of queuing in advance. We also determined that the predicted advance interval of the queue length depends on the travel time between the upstream intersection and the downstream intersection. This makes it convenient to design an optimal strategy for proactive signal control.
(3) The proposed model obtained the influence of the different upstream turning flows on downstream IQA in real time, and the relationship between upstream and downstream intersections was enhanced, which was helpful for fine signal coordination optimization design.
The remainder of this paper is organized as follows. In Section 2, we introduce the assumptions made and terminology used in this paper and provide the simplified queue-forming and queue-discharging process. Section 3 presents models to predict the proportion of lane-based traffic volume and the real-time queue length. The model is then tested using field data in Section 4. Finally, Section 5 summarizes the findings and provides directions for future work.

Preliminaries
In this section, we provide the assumptions for deriving the queue length prediction model, some terminology definitions, and a simplified queue-forming and queue-discharging process.

Assumptions.
We make the following assumptions: (1) The vehicle follows the first-in-first-out (FIFO) principle, and there is no obvious overtaking phenomenon.
(2) The vehicle has the same acceleration and deceleration behavior.
(3) The influence of buses on traffic flow is disregarded.
The necessity for these assumptions is explained in Appendix A.

Terminology.
In this paper, real-time queue length is defined as the number of vehicles queuing at any given time.

Simplified Queue-Forming and Queue-Discharging Process.
On the basis of these assumptions, the queue-forming and queue-discharging process can be described as shown in Figure 1(b). The triangular fundamental diagram for constructing this process is shown in Figure 1(a). The figures use three traffic states: the jam state, with zero volume and the jam density; the saturation state, with the saturation volume and the optimal density; and the arrival state, given by the volume and density of the downstream arrival flow in lane i during the nth cycle at intervals of 5 seconds, as will be described in detail in Section 3.3. Four shockwaves, indexed 1 to 4, separate these states: the queue-forming wave in lane i, the queue-discharging wave, the departure wave, and the residual queue-forming wave, respectively. As shown in Liu et al. [10] and Ban et al. [11], the speed of each of the four waves can be calculated from the volumes and densities of the two states it separates.
In Figure 1(b), A, B, C, and D each describe the queue accumulation of traffic flow in different directions (i.e., through and right-turn movement, right-turn movement, left-turn and right-turn movement, and right-turn movement, respectively) from the upstream intersection to the downstream intersection. A key point of this paper is describing the dynamic differences in vehicle arrival during conditions of different turning flow at the upstream intersection.
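As a minimal sketch of how the four wave speeds arise from a triangular fundamental diagram, the snippet below computes each speed as the slope between the two traffic states a wave separates, which is the standard LWR shockwave relation. The symbol names (`q_m`, `k_m`, `k_j`, `q_a`, `k_a`) and the sample values are assumptions made for illustration, not the paper's notation; speeds are signed, so a negative value means the wave moves upstream, whereas the cited papers may work with magnitudes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TriangularFD:
    """Triangular fundamental diagram (hypothetical names and units)."""
    q_m: float  # saturation volume, veh/h per lane
    k_m: float  # optimal density, veh/km
    k_j: float  # jam density, veh/km


def wave_speeds(fd, q_a, k_a):
    """Return signed speeds (km/h) of the four shockwaves.

    Each speed is (change in volume) / (change in density) between the
    two states the wave separates. q_a and k_a describe the arrival state.
    """
    v1 = (0.0 - q_a) / (fd.k_j - k_a)         # queue-forming: arrival vs. jam
    v2 = (fd.q_m - 0.0) / (fd.k_m - fd.k_j)   # queue-discharging: jam vs. saturation
    v3 = (fd.q_m - q_a) / (fd.k_m - k_a)      # departure: arrival vs. saturation
    v4 = (0.0 - fd.q_m) / (fd.k_j - fd.k_m)   # residual queue-forming: saturation vs. jam
    return v1, v2, v3, v4


# Illustrative values only: 6.4 m spacing gives the jam density; the
# saturation volume and the 40 km/h speed are assumed.
fd = TriangularFD(q_m=1800.0, k_m=45.0, k_j=1000.0 / 6.4)
v1, v2, v3, v4 = wave_speeds(fd, q_a=600.0, k_a=600.0 / 40.0)
```

Under these assumptions the queue-discharging and residual queue-forming waves have the same speed, because both separate the jam and saturation states.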

The Initial Moment of Queue Length Prediction.
Oversaturated traffic conditions evolve with the gradual increase of traffic demand in undersaturated traffic conditions. In an undersaturated traffic condition, the queue length of lane i is generally equal to 0 at the end of the effective green time. Therefore, we can use the end of the effective green time as the starting time for the queue length calculation. As shown in Figure 2, this is the moment when the first (current) vehicle actually moves. To predict the arrival of vehicles in advance, this paper drew upon the processing methods in Mirchandani and Head [32] and Geroliminis and Skabardonis [25] and took, as the advance running moment of the first vehicle, the end (or duration) of the green time for the nth cycle of lane i minus the average travel time of vehicles from the upstream section (upstream site A) to the downstream stopline (downstream site B). The average travel time is the distance between upstream site A and downstream site B divided by V, the average speed of the section, which means that the predicted advance interval of the queue length depends on the travel time. Namely, as shown in Figure 2, the queue length prediction process is such that, according to the current state, we determined the advance state according to the travel time, and then we predicted the queue length (giving the predicted state) according to the arrival prediction.
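The relation between travel time and the advance moment can be illustrated with a short sketch. The function names and the 511 m distance are hypothetical; the distance was chosen only because it reproduces the travel-time range of roughly 46 s to 92 s that the case study reports for speeds of 40 km/h and 20 km/h.

```python
def travel_time_s(distance_m, speed_kmh):
    """Average travel time in seconds: distance divided by speed."""
    return distance_m / (speed_kmh / 3.6)


def advance_moment_s(green_end_s, distance_m, speed_kmh):
    """Advance running moment: end of green minus the average travel time."""
    return green_end_s - travel_time_s(distance_m, speed_kmh)


# Hypothetical example: a 511 m section and a green phase ending at t = 120 s.
for v in (20.0, 30.0, 40.0):
    print(v, round(travel_time_s(511.0, v), 1), round(advance_moment_s(120.0, 511.0, v), 1))
```

A lower average speed gives a longer travel time and therefore an earlier advance moment, which is why the choice of V shifts the initial moment of the prediction.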

Lane-Based Traffic Proportions Prediction.
To predict the queue length of all lanes at the same time, we needed to predict the proportion of lane-based traffic volume. A Kalman filter [33] is a highly efficient recursive (or autoregressive) filter that can be used to estimate the state of a dynamic system from a series of measurements with moderate noise. Because of its good state estimation and prediction accuracy, as well as its ease of calculation and implementation, it has been applied extensively to traffic flow estimation and prediction [20, 34-36].
To start the recursive process, we set the system variables, time update equations, and measurement equations. To use the Kalman filter to predict state variables, integrated transformations of these equations are carried out.
In these equations, the process noise in the (n-2)th cycle downstream in lane i is assumed to be white noise with zero mean, and its covariance matrix is defined for the same cycle and lane.
According to this analysis, we used the following steps to predict traffic flow using the Kalman filter.
(1) Set the Initial Parameters. We set the initial value of the state transition matrix in the Kalman filter equation as the unit matrix, with dimension 2 × 2. We obtained the initial values of the process noise correlation matrix and the observation noise correlation matrix using the random function and covariance function in MATLAB: the former was set to cov(rand(2, 2)), and, because the observed data in this paper were a one-dimensional time series, the latter was set to cov(rand(1, 1)). The initial value of the state vector prediction was [0], and its error autocorrelation matrix was the zero matrix. To make the filter gain converge faster, we estimated the initial value of the state vector estimate using the R programming language to fit the linear relation (by the method of least squares) between the value of the 0th interval and the values of its previous three intervals, and its error autocorrelation matrix was the zero matrix.
(2) Run a Recursive Prediction Based on the Kalman Filter.
Step 1. Set the recursion cycle variable, where the number of recursions is the predicted length. Then calculate the following quantities.
Step 2. The Kalman gain matrix.
Step 3. The observation error.
Step 4. The optimal estimate of the state vector.
Step 5. The correlation matrix of the error of the state vector prediction.
Step 6. The optimal estimate of the correlation matrix of the error of the state vector.
Step 7. The estimated value of the state vector.
Step 8. The prediction of the observation value based on the estimated state value.
Step 9. The Kalman filter estimate of the observation value based on the filtered state estimate.
Step 10. Finally, increment the loop variable and repeat the steps until the loop variable is equal to the predicted length.
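For illustration only, the recursion in the steps above can be sketched with a scalar Kalman filter. This is a deliberate simplification and not the authors' formulation: it tracks a single lane proportion with a random-walk state (transition and observation coefficients both equal to 1), whereas the paper uses a state vector and matrix quantities. The function name and the noise variances `q` and `r` are assumed values.

```python
def kalman_one_step_predictions(observations, q=1e-4, r=1e-2, x0=0.0, p0=1.0):
    """Scalar random-walk Kalman filter returning one-step-ahead predictions.

    observations: measured lane proportions, one per interval
    q, r: process and observation noise variances (assumed values)
    x0, p0: initial state prediction and its error variance
    """
    x_pred, p_pred = x0, p0
    predictions = []
    for z in observations:
        predictions.append(x_pred)        # predicted observation for this interval
        gain = p_pred / (p_pred + r)      # Kalman gain
        innovation = z - x_pred           # observation error
        x_est = x_pred + gain * innovation  # optimal state estimate
        p_est = (1.0 - gain) * p_pred     # error variance of the estimate
        x_pred = x_est                    # state prediction for the next interval
        p_pred = p_est + q                # error variance of that prediction
    predictions.append(x_pred)            # prediction for the first unseen interval
    return predictions


# Hypothetical proportions of traffic volume observed in one lane.
observed = [0.31, 0.35, 0.33, 0.36, 0.34, 0.32]
print([round(p, 3) for p in kalman_one_step_predictions(observed)])
```

The returned list has one more element than the input: the last value is the prediction for the next interval that has not yet been observed.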
In the previous steps, the quantities are defined as follows:

Analysis of Platoon Dispersion Characteristics.
The queue at a signalized intersection presented the problem of stochastic vehicle arrival and fixed service rate. The process of receiving service was relatively simple: when the red light was turned on, the service rate was zero and the vehicle stopped; when the green light was turned on, the service rate was the saturated flow rate. The number of vehicles leaving the intersection was related to the duration of the green light, so part of the problem was to determine the variable service rate. The problem of vehicle arrival was complex, however, so it was necessary to consider the influence of the signal design and platoon dispersion characteristics [25]. The different platoon dispersion characteristics determined different arrival times, and the varying arrival rate determined the dynamic change of the queue length.
When the queuing vehicles of the upstream intersection leave the intersection during the green phase, squeezing and separation between vehicles break the platoon into smaller groups and individual vehicles, so the vehicles do not reach the next intersection uniformly. Thus, the "dispersion phenomenon" occurs in the platoon travel process [37][38][39]. The platoon dispersion model can dynamically describe arrival characteristics and predict downstream vehicle arrival [40]. Because the tail of the geometric distribution is longer than that of the corresponding normal distribution, Robertson's model can better predict the platoon dispersion for any given mean travel time [41]. In addition, because of the low computational requirements of Robertson's model, it is easy to apply this model both to the signal optimization of large road networks [37, 42-44] and to the development of other traffic theories [31, 45-49].
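Robertson's model is commonly written as a one-step recursion in which the downstream arrival rate at interval x + t is a weighted average of the upstream passing rate at interval x and the downstream rate one interval earlier. The sketch below assumes that commonly cited form; the function name, the smoothing factor `f`, and the sample numbers are illustrative and are not taken from the paper.

```python
def robertson_arrivals(upstream, f, t_steps):
    """Predict downstream arrivals with Robertson's recursion.

    upstream: vehicles passing the upstream section in each interval
    f: dispersion (smoothing) factor, 0 < f <= 1
    t_steps: offset in intervals between upstream passage and first arrival
    """
    if t_steps < 1:
        raise ValueError("t_steps must be at least 1")
    if not 0.0 < f <= 1.0:
        raise ValueError("f must lie in (0, 1]")
    n = len(upstream) + t_steps
    downstream = [0.0] * n
    for j in range(t_steps, n):
        q0 = upstream[j - t_steps]  # upstream rate t_steps intervals earlier
        downstream[j] = f * q0 + (1.0 - f) * downstream[j - 1]
    return downstream


# Hypothetical platoon: 10 vehicles in each of two 5-second intervals.
print(robertson_arrivals([10.0, 10.0, 0.0, 0.0], f=0.5, t_steps=2))
```

With `f` below 1 the platoon arrives spread over more intervals than it occupied upstream, and the geometric tail continues beyond the returned horizon, so the listed arrivals sum to slightly less than the upstream total.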
In light of this, when the vehicles are controlled by traffic signals and leave in the form of a platoon, we used Robertson's model to predict downstream vehicle arrival (as shown in Figure 1, A and C). When the vehicles are not controlled by traffic signals, we used the upstream observation value as the downstream predicted arrival value (as shown in Figure 1, B and D). According to Robertson's model, the estimated vehicle arrival rate on the downstream section in the nth cycle at the (x + t)th interval is related to the vehicle passing rate in the upstream section in the nth cycle at the xth interval, where the offset t is 0.8 times the average travel time between the two sections, through a coefficient giving the degree of dispersion of the traffic flow in the process of platoon movement, known as the discrete coefficient of the traffic flow. This value was obtained by Bie et al. [45] from the following quantities: the number of vehicles in platoon i; the moment at which the lead vehicle of the platoon passes through the upstream data collection point (e.g., upstream site A in Figure 2); the moment at which the tail vehicle of the platoon passes through the upstream data collection point; the number of lanes at the upstream data collection point in the direction of the traffic movements; and the capacity per lane. In addition, to predict the queue length of different lanes at the same time, the effect of the proportion of lane-based traffic should be considered, so Robertson's model is extended with the lane-based traffic proportion.

Queue Formation Process, Part One.
As shown in Figure 3, this paper divides the queue length formation process into two parts: part one includes the queue length formed from the moment of initial calculation to the end of the red signal, and part two lasts from the end of the red signal to the time when the maximum queue length appears. First, we analyzed part one of the queue formation process.
(1) Calculate Queue Length in Intervals of 5 Seconds. Following Bell [50] and Shen et al. [31], we took 5 seconds as the time interval in the application of Robertson's model. To express the dynamic evolution of traffic waves more clearly, we introduced the cell transmission model (CTM) [51,52] to describe the formation of traffic waves in intervals of 5 seconds (Figure 4). CTM is a convergent numerical approximation to the LWR model and is widely recognized as a good candidate for dynamic traffic simulation. The speed of the queue-forming wave in each 5-second interval is further expressed in terms of the arrival volume and the arrival density, where the arrival density can be obtained by dividing the arrival volume by V [10]. The queue length at each 5-second interval of the nth cycle follows from this wave speed. Given the ratio of queue length to section length, the travel time of the hth interval is adjusted accordingly, where h is the index of the 5-second intervals before the maximum queue length occurs and runs from 1 to the duration from the initial moment of queue length prediction to the appearance of the maximum queue length, divided by 5 and rounded. Thus, Robertson's model is modified to use this interval-specific travel time, and the queue length of lane i at the end of the nth red signal can be calculated from it.
As shown in Figure 3, the key to calculating the maximum queue length is to determine the intersection of the queue-forming wave and the queue-discharging wave, that is, to determine the moment at which the maximum queue length occurs. We determined this moment from the equations of these two waves.
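As an illustrative sketch only, and not the paper's own equations, the idea of locating the maximum queue can be expressed as follows: the back of the queue advances in each 5-second interval at the queue-forming wave speed implied by that interval's arrival flow, the discharge front starts at the stopline when the green begins, and the maximum queue occurs where the two meet. All names, units, and sample values are assumptions; time is measured from the start of the prediction, and positions are distances upstream of the stopline.

```python
def predict_max_queue(arrival_flows_vph, green_start_s, q_m, k_m, k_j,
                      speed_kmh, dt=5.0):
    """Return (t_max_s, l_max_m), or None if the waves do not meet in the horizon.

    arrival_flows_vph: predicted arrival flow for each dt-second interval (veh/h)
    q_m, k_m, k_j: saturation volume (veh/h), optimal and jam density (veh/km)
    speed_kmh: average section speed, used to turn arrival flow into density
    """
    w2 = q_m / (k_j - k_m) / 3.6  # discharge wave speed magnitude, m/s
    back_m = 0.0                  # position of the back of the queue
    for h, q_a in enumerate(arrival_flows_vph):
        t0 = h * dt
        t1 = t0 + dt
        k_a = q_a / speed_kmh              # arrival density, veh/km
        v1 = q_a / (k_j - k_a) / 3.6       # queue-forming wave speed magnitude, m/s
        if v1 >= w2:
            raise ValueError("arrival flow must stay below saturation")
        back_end_m = back_m + v1 * dt
        if t1 > green_start_s and w2 * (t1 - green_start_s) >= back_end_m:
            # Solve back_m + v1 * (t - t0) = w2 * (t - green_start_s) for t.
            t_max = (back_m - v1 * t0 + w2 * green_start_s) / (w2 - v1)
            return t_max, back_m + v1 * (t_max - t0)
        back_m = back_end_m
    return None


# Hypothetical case: constant 600 veh/h arrivals, green starting after 30 s.
print(predict_max_queue([600.0] * 12, 30.0, 1800.0, 45.0, 1000.0 / 6.4, 40.0))
```

Dividing the returned length by the average vehicle spacing in a stationary queue converts it to a number of vehicles.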

Residual Queue Length Calculation.
After the maximum queue length appeared, the queue dissipated at the departure wave, as shown in Figure 5, where the density in front of the stopline was the optimal density. Assuming that the tail end of the maximum queue began to move before the end of the green signal, the residual queue length (at intervals of 5 seconds) during the queue-discharging period can be determined from the maximum queue length and the departure wave. The time for queue clearance can be easily obtained. When the clearance time did not exceed the time remaining between the moment of maximum queue length and the end of the green signal, there was no queue at the end of the green signal; when it exceeded the remaining time, a residual queue remained at the end of the green signal. When the residual queue existed, the traffic was in an oversaturated state, and the queue length could be predicted in real time from the residual queue length and the previous process, which enabled us to design a signal control strategy to prevent queue overflow in advance.

Test Sites and Basic Data.
In the case study, we selected the intersection of South Qilin Road and Wenchang Street (Qujing, China) as the testing site. The data collection time was from 15:00 to 18:00 on October 31, 2017 (Tuesday), in which 15:00 to 17:30 was the off-peak period and 17:30 to 18:00 was the evening peak period. Figure 6 shows the lane configuration of the intersection of the study area. Table 1 shows the signal timing parameters at the intersection of South Qilin Road and Wenchang Street. The yellow interval of each signal stage lasted 3 seconds, there was no red clearance interval, and right-turn vehicles were not controlled by traffic signals. The letters T and L represent through movement and left-turn movement, and E, W, N, and S represent the eastbound approach, the westbound approach, the northbound approach, and the southbound approach, respectively. Table 2 shows the fundamental parameters for model validation. We estimated the jam density (in vehicles per kilometer) as 1000 divided by the average vehicle spacing in a stationary queue, which, according to field investigation, is 6.4 m.

Calculation Results and Analysis.
We used the mean absolute error (MAE), the mean absolute percentage error (MAPE), and the root mean square error (RMSE) to evaluate the accuracy of the proposed model. Each measure is computed over the total number of intervals (a total of 28 intervals for prediction of the proportions of traffic volume) or cycles (a total of 48 cycles for queue length prediction) in this experiment.
As shown in Table 3, when using information from all lanes, the MAE, MAPE, and RMSE were lower than when using information from a single lane, meaning that it was necessary to take all lane information into account when predicting traffic flow proportions. The average MAE (all lanes) and RMSE (all lanes) of each lane were close to three, which indicated that the average error of queue length prediction in the proposed model did not exceed three vehicles and showed satisfactory prediction accuracy. In addition, the MAE and RMSE of different lanes were close and showed no obvious deviation, which indicated that the calculation results of the proposed model were stable and reliable. The overall average MAPE (all lanes) was 10.33%, which showed favorable prediction accuracy, especially for lane 2 (6.52%) and lane 3 (6.71%). The average MAPE of lane 1 was the largest (17.75%). The main reason was that the left-turn volume was lower than that of the other lanes, and its observed traffic volume was relatively small, which made the MAPE value increase; however, its average MAE (2.43) showed that the prediction result was satisfactory.
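The three error measures used in this section have standard definitions, which the following minimal sketch computes. The function names and the sample numbers are illustrative; the MAPE is expressed in percent and requires nonzero observed values.

```python
import math


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be nonzero)."""
    return 100.0 * sum(abs((o - p) / o) for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))


# Hypothetical queue lengths (vehicles) over four cycles.
obs = [10.0, 20.0, 8.0, 16.0]
pred = [12.0, 18.0, 9.0, 16.0]
print(mae(obs, pred), mape(obs, pred), rmse(obs, pred))
```

Because the MAPE divides by the observed value, a lane with small observed volumes or queues can show a large MAPE even when its MAE is small, which is the effect described for lane 1.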
Furthermore, as shown in Table 3, when using information from a single lane (the previous method with the Kalman filter), every MAE, MAPE, and RMSE was larger than that of the other traditional prediction methods (single exponential smoothing, quadratic exponential smoothing, and third-order moving average); however, when using information from all lanes (the proposed method with the Kalman filter), every MAE, MAPE, and RMSE was the smallest of all the noted methods. This further demonstrated the effectiveness of the proposed method.

Queue Length Predictions for Each Lane.
As shown in Table 4, the average MAE and RMSE of each lane were less than three, which indicated that the average error of queue length prediction in the proposed model showed satisfactory prediction accuracy. The average MAE of lane 3 was slightly higher than that of lane 1 and lane 2, because lane 3 was the through and right-turn lane, and the right-turn vehicles were not controlled by the signals. Thus, some of the right-turn vehicles would leave the intersection during the red signal, resulting in increased error. However, the overall average MAE was 1.82, less than 2; the overall average RMSE was 2.33, less than 3; and the MAE and RMSE of different lanes were close and showed no obvious deviation, which showed that the calculation results of the proposed model were satisfactory. The overall average MAPE was 16.12%, close to 15%, and the average MAPE of lane 1 was the largest (20.94%). As when predicting the traffic volume proportions, the main reason for this finding was that the left-turn volume was the smallest, and its observed queue length was relatively small, which made the MAPE value increase. However, it can be seen from its average MAE (1.52) that the prediction result was better than that of the other lanes. Overall, Table 4 shows that the proposed model performed very well in the calculated results of all three lanes.
Figure 8 shows that the proposed model can predict the burst phenomenon of queue growth in advance and thus is convenient for the optimization of predictive signal control. Figure 9 gives the queue length variation, which shows that the proposed model clearly described the process of queue formation and discharge. The quadrilaterals depicted the residual queue trajectory points in the queue discharge process (at intervals of 5 seconds). Figure 10 shows that the queue length prediction at intervals of 5 seconds can dynamically reflect the effect of different upstream turning flow releases on IQA (R, T and R, and L and R represent the right-turn flow, the through and right-turn flow, and the left-turn and right-turn flow at the upstream intersection, respectively). The IQA was consistent with the dynamic trend of traffic flow. The IQA when the upstream through flow was released was obviously larger than the IQA when the other traffic flows were released (see Table 5). This change could provide powerful support for the precise analysis of signal control parameters, for example, in delay analysis [53].
We then examined how the results depend on the average speed V in Table 2 (in reality, other parameters are relatively fixed). As discussed in Section 3.1, changing V is actually a change in the initial moment. The survey showed that 90% of the vehicles travel within a speed range of 20 km/h ≤ V ≤ 40 km/h (excluding 5% low-speed vehicles and 5% high-speed vehicles, respectively), so the corresponding travel time was between 46 s and 92 s. In accordance with the upstream data acquisition interval, we changed the initial moment at an interval of 5 seconds to calculate the accuracy of queue length estimation.
As shown in Figure 11 and Table 6, when the travel time was between 60 s and 75 s, the average MAE and RMSE of all lanes were less than 3, and the average MAPE of all lanes was less than 20%, which showed that the results of the model were satisfactory in the range of 20 seconds (four 5-second intervals); when the travel time was 55 s or less, or 80 s or more, the calculation error of the model increased gradually. Therefore, it was evident that the calculation results of the model were stable and reliable in a certain range of initial parameters. At the same time, the calculation also reflected that the selection of initial parameters had a significant impact on the accuracy of the model. How to dynamically select the calculation parameters of the model is the next step to be improved.
Inevitably, a delay occurred between the observed and predicted time of the model. In the verification of the maximum queue length, we predicted the maximum queue length and its occurrence time by the proposed model, and the calculation error was compared with the actual maximum queue length value without considering its occurrence time (which was consistent with the processing method of Liu et al. [10]). By comparing the time of maximum queue length between the predicted value and the observed value, we found that, for travel times between 60 s and 75 s, the average time difference between the two was less than three 5-second intervals (15 seconds), which indicated that model accuracy would not be affected when the average error between the observed time and the predicted time was within three intervals. The results clearly showed that the maximum queue length at 15:46:02 was predicted 65 seconds in advance, which fully demonstrated the proactivity of the model.

Conclusion
In this paper, we used LWR shockwave theory and Robertson's model to establish a real-time prediction model of lane-based queue length, which effectively predicted queue length (including IQA, queue trajectory, maximum queue, and residual queue). This model is convenient for the optimal design of predictive signal control. In the proposed model, vehicle arrival was described with an interval of 5 seconds in Robertson's model. In this way, we described the formation and dissipation of queue length in real time and dynamically described the influence of different upstream vehicles, arriving from disparate turning lanes, on IQA. In addition, the model predicted the queue length of multiple lanes at the same time by predicting the proportion of traffic volume using the Kalman filter. The computational complexity of the model was relatively low, and it was convenient for engineering and design.
Several directions for future research can be summarized as follows: (1) Lane-changing phenomenon. This paper assumed that vehicle lane changing had no effect on vehicle arrival characteristics. However, research has shown that when the vehicle lane-changing phenomenon was prominent, the vehicle running state was disturbed [20,54]. Therefore, analyzing the effect of lane changing on queue length prediction is a promising research direction.
(2) Arrival effect of heterogeneous traffic flow. When the bus occupied a large proportion of the lane, the travel characteristics of the bus (such as passengers, slower speed relative to cars, and so on) interfered with car travel and affected the travel time distribution and formation and dissipation of traffic waves [55]. Thus, another important research direction is to study the queue length under the arrival characteristics of heterogeneous traffic flow.
(3) Dynamic correction of travel time. In this paper, the travel time of Robertson's model was fixed, but the travel time will be different according to the change of the traffic flow. Another research direction will be to optimize and perfect this model while incorporating the short-term prediction of travel time for probe vehicles.

Supplementary Materials
See Tables S1-S6 in the Supplementary Materials for comprehensive data analysis.