Estimation of Urban Link Travel Time Distribution Using Markov Chains and Bayesian Approaches

Despite the wide application of Floating Car Data (FCD) in urban link travel time and congestion estimation, the sparsity of observations from a low penetration rate of GPS-equipped floating cars make it difficult to estimate travel time distribution (TTD), especially when the travel timesmay havemultimodal distributions that are associatedwith the underlying traffic states. In this case, the study develops a Bayesian approach based on particle filter framework for link TTD estimation using real-time and historical travel time observations from FCD. First, link travel times are classified by different traffic states according to the levels of vehicle delays. Then, a state-transition function is represented as a Transition Probability Matrix of the Markov chain between upstream and current links with historical observations. Using the state-transition function, an importance distribution is constructed as the summation of historical link TTDs conditional on states weighted by the current link state probabilities. Further, a sampling strategy is developed to address the sparsity problem of observations by selecting the particles with larger weights in terms of the importance distribution and a Gaussian likelihood function. Finally, the current link TTD can be reconstructed by a generic Markov ChainMonte Carlo algorithm incorporating high weighted particles.The proposed approach is evaluated using real-world FCD. The results indicate that the proposed approach provides good accurate estimations, which are very close to the empirical distributions. In addition, the approach with different percentage of floating cars is tested. The results are encouraging, even when multimodal distributions and very few or no observations exist.


Introduction
Information of travel time distribution (TTD) is essential for route guidance and advanced traveler information systems.It reveals underlying traffic conditions for measuring travel time reliability, evaluating the level of service of intersection, and detecting the congestion and incident, which can be introduced to improve service reliability performance [1].
Up to now, the increasingly available vehicles equipped with GPS, named as floating cars, provide a significant amount of vehicle trajectory data in urban road networks.Many works have proposed analytical or data-driven models to infer mean link travel time from this data, but yet research on the estimation of link TTD is still evolving, especially when the travel times appear to be multistate features.In reality, the travel times of vehicles traversing an urban link are heavily affected by traffic lights, interaction among vehicles, and conflicting traffic from cross streets [1][2][3][4][5].The characteristics of interrupted traffic flow would likely result in different clusters of travel times, which reflect how vehicles experienced traffic states on a link.Therefore, rather than to provide the mean value that is generally used in planning applications, a probability distribution, that fully characterizes different components of travel times for retaining maximum traffic state information, is preferred.
In order to estimate the link TTD, a relatively large number of floating cars is needed to generate useful travel times among various errors such as GPS errors, mapmatching errors, and estimation errors.Unfortunately, the spatial distribution of floating cars is not uniform.Taxicabs, which have been suggested as the main types of floating cars [6], are used widely to collect travel time information for urban link travel time and congestion estimation [4,5,7].However, taxicabs cover mainly densely populated and commercially well-developed areas, since these areas may have higher passenger flow and taxicabs are more likely to pick up passengers.For those areas with limited coverage by floating cars, it would raise inquiries about how to utilize the limited data sample from a small number of floating cars to produce an accurate link TTD, even for the multistate feature of travel times.
In this situation, it may be possible to combine realtime traffic information incorporating the current traffic states with prior knowledge from historical link TTDs that provide similar data sequences to the actual distributions to reconstruct the TTD.The objective of this study focuses on developing a methodology to estimate such probability distributions under situation when floating car density is quite low and link travel times may follow multimodal distributions.
The main contributions of this paper are as follows.
(1) This research can be regarded as an initial attempt to estimate multistate link TTDs in the context of the sparse and incomplete observations from real-time FCD.The corresponding solutions are given by a Bayesian approach based on particle filter framework, in which the state-transition trends are modeled by historical trends to construct an initial solution providing past travel time patterns, as well as the sampling strategy that focuses on a fundamental issue of the irregular, unsmooth TTDs directly constructed by the only real-time FCD and allows much less real-time information.(2) The results of case study show that multistate link TTD models can be robustly realized by using our proposed approach for different percentage of floating cars, even for the condition with no data available.Moreover, the applications help provide in-depth understanding on the heterogeneity of link travel times that show multistate distributions and imply significantly different state combinations changing with time of day, in which the state converging most of travel times may play a dominant role on the whole link traffic conditions.

Literature Review
In review of literature, models on the estimation of travel time distribution (TTD) can be generally divided into two categories, namely, unimodal and multimodal/multistate distributions.Unimodal distributions, such as Weibull, exponential, Lognormal, or normal distributions, are often used for modeling travel time.For example, Hunter et al. [8] proposed a probabilistic model for inferring vehicle path and travel time based on low frequency GPS data in the arterial network.An expectation maximization (EM) algorithm for the proposed model is employed to simultaneously learn the likely paths and the parameters of lognormal link travel time distributions using the path travel time observations.A recent work of the same group by Herring et al. [9] proposed a statistical approach to capture the evolution of traffic states with a coupled hidden Markov model, since the actual state of current link is unobserved (hidden) and depends on adjacent links (coupled).The authors assumed that travel times between successive links are independent, and only one traffic state on each link can exist in the same period.The authors went on to assume that the link travel time distribution is conditional on its hidden state.Likewise, an EM algorithm is used to iteratively update the parameters of the distributions and the state-transition matrix.The evaluation is done using sparse probe data from taxicabs in San Francisco, CA.
A similar approach was proposed by Westgate et al. [10].In this work, a Markov Chain Monte Carlo (MCMC) algorithm was used to simultaneously estimate the likely paths, travel times, and parameters of link TTDs from GPSequipped ambulances.The findings from these studies in general concluded that the lognormal distribution provides a superior fit over the alternative unimodal distributions.The main limitation of unimodal distributions is the assumption that travel time is the result of a single traffic state on a link.Hence unimodal distributions usually provide some poor fit results under complex traffic conditions, such as interrupted flow.
The multimodal travel time distribution comprising two or more distributions outperforms the single distribution by reflecting the coexistence of heterogeneous traffic states simultaneously [11,12].One of the most attractive features of the multimodal distribution provides a straightforward relationship for the probability of encountering state (e.g., congestion) and the component distribution accordingly under such state.In [11], a multistate model was proposed by using a mixture of normal distributions to fit travel times on freeways.The authors revealed that different traffic states can exist in the same analysis period such as free-flow and congested states.Besides each component distribution is associated with an underlying state.In the research by Park et al. [12], the authors attempted to quantify the impact of traffic incidents on travel time reliability using multistate (3 states) model.The state of a section on freeways describes the level of congestion.They concluded that the congested state corresponding to the third component distribution becomes more dominant since incidents increase the travel time variability.
While very few multimodal link TTD models using GPS data were proposed up to now, for the estimation of the parameters of multimodal distribution, Hofleitner et al. [13] developed a dynamic Bayesian network for the transition between link states using GPS probe data.A simulationbased EM algorithm is proposed to estimate the transition probabilities and the travel time distribution parameters.Feng et al. [7] integrated signal timing and arterial geometry information with real-time GPS data to dynamically update the parameters of the bimodal urban link TTDs based on MCMC simulation.Ji et al. [1] proposed a hierarchical Bayesian bimodal TTD to capture the interrupted nature of urban traffic flow as well as analyze traffic operations from probe vehicles.A further development using a hierarchical MCMC algorithm for producing the posterior distribution of Gaussian mixture models and the lognormal prior was presented in Li et al. [14].The authors assumed that real-time GPS samples are distributed in each component distribution.These studies all adapted iterative procedures to estimate the parameters of link TTDs using maximum likelihood estimation and Bayesian approach.However, one limitation of their work is that travel times of probe vehicles are not readily available for each unobserved state since probe vehicles penetration rate is fairly low.Another limitation of their work is that they assumed the link travel times are independent.In reality, adjacent link travel times may have strong correlations.As an alternative, the multistate nature of travel times and their correlations between adjacent links can be modeled by a Markov chain methodology.Ramezani and Geroliminis [15] proposed a Markov model to estimate arterial route TTD.The model used a Transition Probability Matrix (TPM) constructed from simulated probe data to capture the spatial correlations.However, the approach conducted by Ramezani and Geroliminis [15] needs ample real-time observations conditional on link states to estimate the TPMs, which limits the application under urban areas with missing data (no or few data available).
In this paper, TPMs of the Markov chain between upstream and current links are calibrated from historical travel time observations to alleviate the dependence on the limited data sample.On this basis, the study develops a Bayesian approach based on particle filter framework for estimation of multistate link TTD by combining past information of link TTD represented as an importance distribution with real-time travel time information represented by a likelihood function.The importance distribution, used as an initial solution for approximating the actual distribution, is constructed by the summation of past TTDs conditional on link states weighted by the estimated state probabilities.The likelihood function is chosen as a normal distribution that describes how likely a particle (i.e., sample) tends to follow the real-time observations.As a result, the particles with high similarity are retained and replicated to address the sparsity problem of observations and then can be integrated into a Bayesian algorithm for producing a multistate link TTD.

Data Introduction.
Nanjing is the capital of Jiangsu province, lying in the east of China, with over 8,236,000 inhabitants.The city has been rather well equipped with ITS infrastructures for nearly 10 years.Radio Frequency Identification Data (RFID) technology is applied in some main arterials.Currently, more than 0.7 million vehicles, including taxicabs and private vehicles, have been equipped with RFID tags.There are over 7700 GPS-equipped taxicabs to collect vehicle trajectory data as Floating Car Data (FCD).In this paper, FCD are used to evaluate the performance of the proposed approach, as well as RFID being chosen to produce the ground truth distributions compared to the corresponding estimated TTDs from FCD.
FCD vehicles generate a data record every 30 seconds including the vehicle's position, timestamp, speed, and status (with or without passengers).The process of matching the reported locations to road network links from raw data records has been accomplished by a widely used mapmatching approach [16].Since this is not the focus of this paper, we will not cover the detail here.Besides, the impact of different data polling frequency on the required sample size determination is not considered in this paper.In order to eliminate the effects from vacant taxi driving intentionally slowly, and searching for potential passengers, only FCD from hired taxis are used.As to RFID, several arterials are installed with both RFID sensors and video cameras at upstream and downstream locations of signalized intersections.RFID sensors scan the vehicle plate with high frequency of 20 milliseconds and mainly record plate number, timestamp, and station number when tagged vehicles pass by the RFID stations.More than 80% of vehicles on arterials are detected by RFID sensors [3][4][5].Hence RFID can almost obtain the information of all the passing vehicles which constitutes a full-sample data.
As shown in Figure 1, the case road link divided by two consecutive RFID stations (i.e., the stations between 6004 and 6006) is around 1054-meter long.The road link has eight-lanes double way from Caochangmen Street and Qingliangmen Street all along and intersects with many other streets in between.The design speed of the selected studied link is 60 km/h.
For the source of FCD, the 1-month (November 2011) data for weekdays (Monday to Friday) between 7 a.m. and 8 p.m. were taken as the historical database.For both sources (FCD and RFID), data from 7 a.m. to 8 p.m. on Dec. 1, 2011 (Thursday), were used for the comparison and validation.Considering the word limit, we choose the south bound traffic flow to analyze the travel time.

Characterization of Travel Times.
In order to estimate link travel time for individual floating car traversing a link, a travel time allocation method proposed by Sanaullah et al. [17] is applied here because of the simple calculation and high estimation accuracy.Then, mean link travel time can be calculated with an average of travel times for individual floating cars in an analysis interval.Note that an observation from FCD (RFID) is treated as the travel time in each link experienced by a vehicle.Besides, the time interval for estimation is set as 15 min to avoid unavailable observations.
To better understand the differences between the mean value and the probability distribution for explaining the trends or patterns of traffic states under interrupted flow, Figure 2 shows the temporal variation of travel times and the number of vehicles counted from both FCD and RFID in 15min intervals.Apparently, the development of the mean travel times from FCD and RFID is quite consistent.The mean absolute percentage error (MAPE) is chosen for evaluating the accuracy of the estimated travel time.It turns out that the value of average MAPE is 8.46%, which appears to be quite acceptable.
However, Figure 2(a) shows an intuitive understanding on travel times, which can be clustered into two or more components from the scattered observations in a certain interval.This finding implies that the mean is not a good tracking index for capturing the multistate features of link travel times, once compared to TTD.On the other hand, Figure 2(b) illustrates that the proportions of floating cars are relatively low during time of day and that observations from FCD largely distribute at the high-speed regime (nearly between 25 and 50 km/h).These may likely result in inadequate sample to monitor the remaining travel time states.To   better model multistate TTD, a Bayesian approach based on particle filter framework is then introduced with limited FCD sample.

Methodology
The proposed solution framework, consisting of five major parts, is outlined in Figure 3.The state identification is presented first in this section, followed by a description of how the states transfer between two adjacent links.After that, the importance distribution is introduced to produce historical travel time patterns as prior knowledge.The following subsection presents the sampling strategy to generate a certain number of high weighted particles.Finally, the Gibbs sampling algorithm is applied to approximate the posterior distribution of Gaussian mixture model (GMM).

State Identification.
The multistate nature of travel times on urban links is principally attributed to the road geometric features (e.g., roadway alignment, link length, lane width, and number of lanes), interaction among vehicles (e.g., weaving, diverging, and merging), and signal controls.With the results

Transition Probability Matrix (TPM) calibration from historical data between upstream and current links
Construct the importance distribution: The sum of the conditional TTDs weighted by the estimated state probabilities (Eq.( 8)) Design sampling strategy: (i) Draw N particles (i.e.samples) from the importance distribution.Take density as weight and normalize the weights of particles.
(ii) Update the weights using a Gaussian observation function which describes the likelihood between the particles and real-time observations.
(iii) Select N particles with larger weight by systematic resampling method.

Gibbs sampler for Gaussian mixture model (GMM)
with the selected particles Create the posterior distribution of GMM at time t of the interaction among these factors, four typical traffic states that exhibit travel times on a link experienced by vehicles are shown in Figure 4, and can be defined as follows: In order to distinguish the heterogeneous traffic states, intuitive traffic engineering rules are applied to quantify state boundaries so each state represents the corresponding cluster of travel times.Here the boundaries between two neighbor states can be approximated as follows, respectively.
(i) The first boundary between state 1 and state 2 is to judge whether vehicles pass the signalized link without stopping.
Considering a free-flow traffic condition, the travel times of vehicles traversing a signalized link directly depend on the signal controls and their distribution appears to be bimodal [1][2][3][4][5]7], which features two clusters of travel times corresponding to the states of nonstopped and stopped vehicles, respectively.To achieve the state partitions, a two-component mixed normal distribution, rather than the skewed mixture models (e.g., the components are both lognormal distributions), is adopted to obtain the travel times in state 1 and state 2 individually as a tradeoff between computation complexity and explanatory power [18].Besides, the travel times between 02:00 a.m. and 04:00 a.m. are extracted from FCD for the free-flow condition [19].Hence, the first travel time boundary (denoted as   ), determined by the point of intersection of two normal distributions [20], is stated mathematically as where N( * ) is the normal distribution,  is travel time variable,  ,1 and  ,2 are the mean of the first and second normal distribution, and  ,1 and  ,2 are the standard deviation of the first and second normal distribution.
(ii) The second boundary between state 2 and state 3 is to distinguish whether vehicles only experience intersection signal delay.
Signal delay is the primary factor that influences travel time patterns on an urban link [3].Unfortunately, in most cases, the signal control plans at intersections are generally unknown.As an alternative, we use the red time for through movement (denoted as   ) as additional travel times to judge whether vehicles stop at the end of the link.This part of travel times is assumed to be only generated by the signal control, since it makes a major contribution to travel delay rather than traffic flow.Note that the means of  ,1 and  ,2 for state 1 and state 2 are closely related to the signal settings, and their difference between nonstopped and stopped vehicles can be used to approximately estimate the red time.Consequently, the second boundary by adding the red time to free-flow travel time (denoted as TT  ) is represented as (iii) The last boundary is to judge whether vehicles experience oversaturated conditions.
The oversaturated conditions on urban principal arterials, characterized as extremely low speeds and extensive queuing, is defended as level of service F [21].In this case, a residual queue is the most likely occurring at the signalized intersection in one cycle time.Hence, the last boundary for oversaturated conditions (denoted as TT  ) where vehicles encounter two or more queues can be initially estimated by link length and 30% of speed limit: where L is the link length and V  is the speed limit of the arterial link.

State-Transition Function
Modeling.This section adopts work of Ramezani and Geroliminis [15] to represent the spatial state transition between two adjacent links based on Markov chain framework.The spatial state transition exhibits how vehicles experience multiple traffic states on route's successive links whereas it differs from the traffic congestion propagation.For example, a vehicle that passed the upstream link without stopping would have a certain probability of stopping on the downstream link, and it would have another probability of not stopping on the downstream link.This simple case illustrates that the travel times on the current link are dependent on the upstream link state.
In this paper, there are four traffic states constituting a four by four TPM.The TPM  −1, () between any pair of successive links n-1 and n at time t is shown as follows: where   () denotes the transition probability from state i on link n-1 to state j on link n at time t and can be expressed mathematically as the corresponding observed frequencies: where   () =  indicates that the state on link n at time t is j and   () is the number of observations produced by the same vehicles experiencing state i on link n-1 and state j on link n at time t.Note that we adapt a relatively static TPM to relax the constraint of limited real-time observations and use the historical data during a 1-month period on workdays to calibrate the TPM at each time t from 7 a.m. to 8 p.m.
Given a set of real-time observations on link n-1 at time t, the state probability vector  −1 (), consisting of the probabilities of four states, can be estimated by where  1 −1, is the probability of state 1 on link n-1 at time t and   () is the number of observations of state k on link n-1 at time t.Finally, combined with (4)∼( 6), the current state probabilities   () on link n at time t are calculated as

Importance Distribution Construction.
A straightforward approach for estimation of link TTD is to take advantage of only real-time observations from FCD.However, under many circumstances, it is very difficult to represent the potentially multiple states with an incomplete collection of observations due to the low penetration rate of floating cars.Consequently, the chief challenge in our methodology is to construct an importance distribution (also often referred to as the proposal distribution) properly providing a high similarity to the posterior distribution, from which a series of samples with weights are randomly generated for the heterogeneous traffic states.More importantly, the importance distribution can be modified with the real-time observations to further approximate the posterior distribution.
Here we consider an importance distribution that consists of four historical normal distributions.And each historical normal distribution for the corresponding traffic state is fitted by historical observations located in the same state boundaries at time t.Combining with the estimated state probability vector   (), the importance distribution ( , |  −1, ) on link n incorporating the observation  −1, on link n-1 at time t can be adjusted as where   , is the mean of historical normal distribution for state i on link n at time t and   , is the standard deviation of historical normal distribution for state i on link n at time t.

Sampling Strategy Design.
In this section, a sampling strategy is developed from the concept of the particle filter approach to generate a set of high weighted particles for representing the heterogeneous traffic states.Each particle refers to a travel time sample with a corresponding weight.Here a general particle filter approach mainly consists of the state update process showing a system evolving with time and the measurement update process showing the relationship between the system states and the system outputs at a certain time.Finally, a posterior distribution can be represented by a collection of weighted particles using the particle filter approach.More details about this approach are given in the work of Doucet et al. [22].
Similar to the particle filter approach, the proposed sampling strategy comprises three components, that is, state update, measurement update, and resampling scheme.First, the importance distribution based on the spatial TPMs is applied to produce the particles instead of using a temporal state-transition model in the state update process.Then, with the arrival of new observations, probability densities as weights are updated by the likelihood function with input of the particles and real-time observations.Note that, in Figure 2(a), each observation from FCD has an error distributed around the actual mean from RFID.More importantly, the means from FCD and RFID are quite close.If we take the means from FCD as ground truth, then it is reasonable to assume that errors between particles and the mean of observations in 15-minute interval can be described as a normal distribution.In this way, the smaller error between a particle and the mean achieves the larger weight assigned to the corresponding particle.Hence, the likelihood function is represented by a zero-mean normal distribution with a standard deviation of 50 seconds based on empirical evaluation (i.e., N(0, 50)).Finally, a resampling procedure is used to eliminate particles with low weights and concentrate on particles with large weights.Residual, stratified, and systematic resampling are three common resampling schemes.In this paper, systematic resampling is preferred to be easily implemented since it aims to minimize the Monte Carlo variation and save computational time.
To summarize, the sampling strategy is specified as follows.
Step 1 (state update).Set the number of particles  = 100 and generate N particles   , with the weights   , (the weight refers to the probability density of the random sample) according the importance distribution ( , |  −1, ) on link n at time t: Step 2 (measurement update).
(2.1) Calculate the mean of M observations   , from FCD on link n at time t: (2.2) Update the weights by likelihood function   : Step 3 (resampling).Normalize the weights to ensure that the sum of weights is equal to one.Thereafter, the systematic resampling method, similar to the roulette wheel strategy in a basic genetic algorithm, is applied to retain and replicate the particles with larger weights.For more details of the method the reader can refer to [23].

Travel Time Distribution Estimation.
As mentioned in the literature review, travel times on urban links may have multimodal property which are associated with the heterogeneous traffic states.Hence, Gaussian mixture model (GMM), from which the flexible fitting structure and explanatory power can be provided, is preferred to fit multistate link TTD.Specifically, a GMM for travel time with finite K components can be formulated as follows: where   is the mixture coefficient for the  ℎ normal distribution such that   ≥ 0 and ∑  =1   = 1.  and  2  are the mean and variance for the  ℎ normal distribution, respectively.Θ = {( 1 ,  2 , . . .,   ), ( 1 ,  2 , . . .,   ), ( 2  1 ,  2 2 , . . .,  2  )} is the set of GMM parameters. = ( 1 ,  2 , . . .,   ) is a vector of observations.In this paper, the number of components in GMM ranges from  = 2 to  = 4 in terms of four states in total.
To estimate the GMM parameters, EM algorithm and MCMC simulation are two commonly used methods.However, the EM algorithm based on the asymptotic theory needs samples, and it may result in a local maximum problem [24].Unlike the EM algorithm, the MCMC simulation as a golden method for Bayesian inference can provide a posterior distribution of the parameters in the proposed model, which incorporates prior knowledge with smaller number of sample size in an iterative sampling procedure [25].Such a sampling process would repeat many times until the estimated parameters tend to converge and produces the posterior distribution.
In Bayesian inference, the posterior distribution (Θ | ) for GMM can be expressed in terms of the prior distribution (Θ) and likelihood function   ( | Θ).

𝑝 (Θ | 𝑥
where according to the conjugacy property that describing the posterior distribution has the same parametric form as the prior distribution, the priors for distributions {, , } are assumed to be as follows [24]: where parameters {, , , } are taken as known 'hyperparameters' .In the initialization of the MCMC simulation, the hyperparameters can be set to { = 150,  = 0.1,  = 1,  = 2}.Finally, the pseudocode of the proposed MCMC method using Gibbs sampling algorithm is presented in Table 1. Another important point is that the number of components in GMM is unknown.An information criterion, known as Bayesian Inference Criterion (BIC) for the goodness of fit assessment, is chosen to find the optimal number of components [26].In use of the BIC, a smaller value of BIC indicates a better fitting model.Specifically, the BIC taking into account the log likelihood, the number of estimated parameters d, and sample size N is defined as

Multistate Travel Time Histograms.
Before utilizing the proposed method for estimation of multistate link TTD, the simple histogram can exhibit intuitive information on the TTD. Figure 5 shows two or more clusters in travel time histograms for two different traffic conditions.During off-peak period, Figure 5(a) gives a distinctly bimodal pattern for the observations from RFID, and most of the observations fall into state 1 and state 3. Conversely, during peak period, as shown in Figure 5(b), state 4 gathering over 80% of the observations plays a dominant role in the congested situation.In terms of the FCD, one (Figure 5(a)) is very small number of observations in state 1 and state 2 whereas there are no observations in state 3, and the other (Figure 5(b)) is the observations largely distributed beyond the two peaks in state 3 and state 4, which represent the major clusters of travel times.Overall, this simple example implies that the issues of sparse and incomplete observations from FCD may result in irregular, unsmooth TTDs.To improve the estimation of link TTDs, it is suggested that historical information on prior travel time patterns is strongly needed.

Estimation of Link TTD.
For each number of components, the sampling procedure was run in a total of 20,000 iterations and the first 10,000 iterations were discarded as burn-in, while the remaining samples were used for inferencing the posterior distribution.Convergence of the , where,  ()  is the effective number of particles assigned to component K, and can be calculated by

End for End for Step 3. BIC calculation
Calculate the BIC for a K component GMM using Eq. ( 19) Note: we first execute the sampling with increasing number of components from  = 2 to  = 4. estimated parameters was monitored by the trace plots of the generated sample values, the model log likelihood, and the autocorrelations [26].
To evaluate the performance of the proposed method, the Hellinger distance (HD) and Kolmogorov-Smirnov (KS) test are employed to assess the similarity between the estimated distribution from FCD and the empirical (actual) one from RFID.For discrete distributions, the HD between the estimated distribution TTD  and the empirical one TTD  is defined as follows [26]: where   () and   () are the estimated and observed probabilities for an observation   from RFID.Besides, the HD satisfies two requirements in (21).
It is indicated that HD has the nature of symmetry and nonnegativity.Furthermore, if (TTD  , TTD  ) = 0, the two distributions will be close to each other, whereas (TTD  , TTD  ) = 1 means that there exists the most difference between distributions.
The KS test is a common nonparametric test used to test if the estimated distribution passes the null hypothesis that the observations are distributed according to the estimated distribution.A larger p value provided from the test indicates a better goodness of fit.The estimated distribution is rejected when p value is smaller than 0.05 at a 95% confidence level.
Figure 6 shows a comparison of the estimated TTD and the empirical one.It indicates that if the importance distribution is not accurate enough as shown in Figure 6(a), the observations from FCD imply that there is no data in between the two peaks, and then Gaussian likelihood function tends to pull the posterior distribution in the right direction.Besides, the results of link TTD estimation show the advantage of the proposed approach with few data in terms of the calibration of TPMs from historical travel times.
If the importance distribution is quite accurate, as shown in Figure 6(b), however, the posterior distribution is close to the empirical one, the Bayesian posterior distribution says that some data with high probabilities are missing in the first peak (around 185 s).If we observe the travel time histogram at a.m.peak periods in Figure 5(b), eight observations from FCD located in between two major components from RFID, there should not be such an obvious gap between state 3 and state 4 under congested situation.Besides, the observations nearby a mean of 210 seconds in the Gaussian likelihood function will be assigned larger weights.Altogether, with the Bayesian update, the posterior distribution that incorporates information not only from the prior travel time pattern, but also from the real-time data, may provide a more convincing result.
Table 2 summarizes the model parameters and performance results.It can be seen that the GMM with three components provides a superior fit for empirical data during peak hours.This could be explained by the underlying situations that present two kinds of state combinations, that is, state 1 to state 3, and state 2 to state 4, while the larger weight implies the corresponding state in a state combination may have a leading impact on the whole link traffic conditions.These findings assist with insight into the heterogeneity in travel times.Moreover, the performance results show that of all 52 periods used for the KS test, only nine of them fail to pass the KS test, which help confirm the statistical superiority of the proposed method.

Effect of Floating Car Percentages on the Proposed
Approach.To examine the robust of the proposed method, six different percentages of floating cars, i.e., floating cars% = 0%, 5%, 10%, 20%, 50%, and 100%, are conducted as an input of estimating the link TTDs.For each percentage of floating cars, the sampling procedure was run 100 times to randomly select the corresponding samples from observations and calculate the average values of HD and percentages of passing KS test in each 15-min interval.In particular, if the number of samples is zero, we use an alternative method with the steady-state probabilities and the means of historical travel times instead of the estimated state probabilities in (7) and the means of real-time observations in likelihood function.
According to the irreducible and ergodic properties of the Markov chain, the steady-state TPM Π  () on link n at time t can be defined as follows [27]: where r is the number of steps.When r is big enough, each row of Π  () will all reach steady-state probabilities regardless of whatever the state probabilities on link n-1 are. Figure 7 shows the results of average HD and KS test for different percentages of floating cars during time of day.It can be observed that, with the percentage of floating cars decreasing, the main trend of the average HD increases; on the contrary, the percentage of passing KS test in general decreases.Especially when the number of samples is zero, the performance of the alternative method (i.e., the average HD largely varying between 0.1 and 0.3 while the percentage of passing KS test mainly ranging from 70% to 90%) is superior to the proposed method with sample size less than five, which mainly generates the worst result.This is likely due to the random sampling resulting in higher travel time variability for estimation input, whereas the alternative method only depending on the historical travel time information is not sensitive to the input data.
Overall, the results confirm that the proposed method still performs well with small sample size.For the worst condition, i.e., floating cars% = 0%, the alternative method is preferred in terms of the estimation performance.

Conclusions
This paper presents a Bayesian approach based on particle filter framework to estimate link TTDs using the limited observations and historical data set from FCD.The main objective of the approach is to produce the multistate TTDs with the low penetrate rate of floating cars.Unlike previous studies that require a complete data in modeling the multistate TTDs, the proposed approach uses less observations to update the importance distribution.Concisely, the importance distribution is regarded as prior knowledge and provides past travel time patterns.On the other hand, limited observations are integrated into the particle filter framework and make the estimated TTDs further close to the empirical   distributions by using a sampling strategy.Finally, the result is a posterior distribution incorporating historical trends and the real-time travel time information as well.
The results of the case study demonstrate that the proposed approach produces the posterior distribution agreeing well with the empirical distribution.Besides, with different percentages of floating cars, our proposed approach performs equally well, even under condition with no data available.This suggests that, when the number of floating car sample size is less than five, the approach with the steady-state probabilities helps provide a promising result.
Nevertheless, this paper has some limitations.First, we used the historical observations as an a priori estimation.Information from estimated TTD in previous period should be integrated into the particle filter framework to update the parameters of the current importance distribution and provide a better a priori estimation.Second, the spatial TPMs of the Markov chain did not consider the temporal correlation between link travel times.A Markov process with a short memory taking into account the effect of travel time states at the preceding time interval on transition intensities may lead to better TPMs estimates.Finally, a density-based clustering method such as DBSCAN clustering or other DBSCAN variants should be developed to cluster link travel times for representing heterogeneous traffic states so that the ability of the proposed approach could be generalized to a large range of applications from urban signalized arterials to freeway continuous traffic instead of using traffic engineering rules.In addition, the effectiveness of the proposed approach still needs a further verification by conducting more case studies in other road links with different sources of trajectory data (e.g., smartphone and the ride service platform of Uber or DiDi Chuxing).These will be our future works.Last but not least, we should state that the work from this paper is an extended version of the earlier work presented at the 97th Annual Meeting of the TRB, Washington, DC, January 2018 [5].For more details, it can be found in [5].

Figure 1 :
Figure 1: Layout of the study link selected in Nanjing, Jiangsu province, China.

Figure 2 :
Figure 2: Variability by a 15-min interval: (a) mean travel times and (b) number of vehicles.

Figure 3 :
Figure 3: Link travel time distribution estimation framework.This figure is reproduced from a shorter version of this work presented at the 97th Annual Meeting of the Transportation Research Board [5].
(i) State 1 (nonstopped): vehicles traverse the whole link without stopping.(ii) State 2 (stopped): vehicles stop at the end of the link for a red light.(iii) State 3 (stopped with delay): vehicles traverse the whole link with delays caused by the interference among vehicles and signal control.(iv) State 4 (stopped twice or more): vehicles have to stop twice or more due to the overflow queue at the signalized intersection.

Figure 4 :
Figure 4: Four typical travel time states on urban signalized links.

Figure 7 :
Figure 7: The results comparisons for different percentages of floating cars.This figure is reproduced from a shorter version of this work presented at the 97th Annual Meeting of the Transportation Research Board [5].

Table 2 :
[5]k TTD parameters, HD, and KS tests.This table is reproduced from a shorter version of this work presented at the 97th Annual Meeting of the Transportation Research Board[5].indicates that the estimated distribution passes the KS test at a 5% significance level.
a h = 0 (c) Comparison of the percents of passing KS test