Origin-Destination Estimation Using Probe Vehicle Trajectory and Link Counts

This paper presents two origin-destination flow estimation models using sampled GPS positions of probe vehicles and link flow counts. The first model, named the SPP model (scaled probe OD as prior OD), uses a scaled probe vehicle OD matrix as the prior OD matrix and applies the conventional generalized least squares (GLS) framework to conduct OD correction using link counts; the second model, the PRA model (probe ratio assignment), is an extension of SPP in which the observed link probe ratios are also included as additional information in the OD estimation process. For both models, the study explored a new way to construct assignment matrices directly from sampled probe trajectories to avoid a sophisticated traffic assignment process. Then, for performance evaluation, a comprehensive numerical experiment was conducted using a simulation dataset. The results showed that when the distribution of probe vehicle ratios is homogeneous among different OD pairs, both proposed models achieved a similar degree of improvement compared with the prior OD pattern. However, when the distribution of probe vehicle ratios is heterogeneous across different OD pairs, the PRA model achieved a more significant reduction in OD flow estimation error than the SPP model. Grounded on both theoretical derivations and empirical tests, the study provides in-depth discussions regarding the strengths and challenges of probe vehicle based OD estimation models.


Introduction
The origin-destination flow matrix (referred to as OD matrix) is an essential input to many dynamic traffic assignment and traffic simulation systems. Conventionally, a prior OD matrix (or seed OD matrix) is estimated from a large-scale travel survey, which is usually conducted only every several years due to its prohibitively high cost. As a result, both the sampling rate and the update frequency of travel surveys are constrained, which sometimes leads to a biased estimation of the OD matrix. To contend with this problem, researchers developed various OD correction models using link flow counts, which are collected routinely by detectors. According to Cascetta and Nguyen [1], most popular OD estimation models utilizing link flow counts can be formulated as an optimization problem taking the following generalized form:

$$\min_{X} \; F_1(X, \tilde{X}) + F_2(Y, \hat{Y}) \quad \text{s.t.} \quad Y = A(X), \; X \in \Omega, \tag{1}$$

where $X$ is the unknown OD flow matrix and $\tilde{X}$ is the prior OD matrix (or seed matrix) representing the modeler's prior belief regarding the temporal and spatial distribution of the travel pattern. $Y$ and $\hat{Y}$ are the estimated and observed link flow vectors, respectively; $F_1$ ($F_2$) is a distance metric function measuring the discrepancy between $X$ and $\tilde{X}$ ($Y$ and $\hat{Y}$). In the first constraint of (1), the abstract function $A$ essentially represents the traffic assignment process through which OD flows $X$ are mapped to estimated link flows $Y$. $A$ is usually referred to as the assignment matrix. $\Omega$ is the feasible domain of $X$.
Depending on the form of metric functions, previous studies can be categorized into three groups: (1) generalized least squares models (GLS) studied by Bell [2,3], Cascetta [4], Cascetta et al. [5], and Cascetta et al. [6]; (2) maximum likelihood models (ML) such as Spiess [7] and Cascetta and Nguyen [1]; and (3) Bayesian inference models (BI) studied by Maher [8]. In the above methods, the reference time period is divided into a sequence of uniform intervals and OD flows of all intervals are estimated simultaneously. This type of modeling approach is called the simultaneous OD estimation method and is developed primarily for offline applications. Meanwhile, for online applications, sequential OD estimation based on the Kalman filter technique received intensive research attention during the last several decades. Starting from Okutani and Stephanedes [9], subsequent studies along this direction include Nihan and Davis [10], Chang and Wu [11], Ashok and Ben-Akiva [12,13], and Antoniou et al. [14]. The major difference between sequential and simultaneous OD estimation is that sequential estimation performs OD adjustment recursively at each interval based on measurements of the current interval and estimated results from previous intervals. Consequently, sequential estimation methods are more suitable for online traffic state estimation applications due to their higher computational efficiency.
Recently, the application of emerging surveillance technologies in the OD estimation field has drawn increasing attention from the research community. Many researchers developed their own models to combine the information from conventional detectors and emerging sensing technologies. Representative studies along this line include AVI systems (Dixon and Rilett [15], Zhou and Mahmassani [16], and Chen et al. [17]), vehicle plate scanning (Castillo et al. [18]), sporadic routing data (Parry and Hazelton [19]), GPS probe vehicles (Eisenman and List [20], Cao et al. [21]), floating car data (Ásmundsdóttir [22], Yang et al. [23]), and cell phone data (Sohn and Kim [24], Calabrese et al. [25], and Iqbal et al. [26]). The basic idea behind these studies is to improve OD estimation accuracy using extra information which was not available before. The conventional OD estimation model given by formulation (1) faces three major challenges: (1) the underspecified nature of the OD estimation problem; (2) the reliability of the prior OD matrix; and (3) the accuracy of assignment matrix estimation.
First of all, in most transportation networks, the number of OD pairs exceeds the number of sensors. Therefore, without specifying additional constraints, such as a prior OD matrix, the OD estimation problem is an underspecified problem. Through both laboratory experiments and real world datasets, Marzano et al. [27] and Cascetta et al. [6] concluded that "a satisfactory updating, regardless of the quality of the prior estimate, can be obtained generally only when the ratio between the number of equations (independent observed link flows) and the number of unknowns (i.e., OD flows) is close to one." Hence, increasing the amount of observation is a major motivation for exploiting those emerging data sources. A second challenge is the availability and accuracy of the prior OD matrix. Both Marzano et al. [27] and Cascetta et al. [6] demonstrated the importance of the prior OD matrix; also Frederix et al. [28] discussed the possibility of falling into local optimal solutions given an inappropriate seed OD matrix. Besides, depending on the area of analysis, a prior OD matrix with an acceptable level of reliability may not even exist. The third challenge is the computation of the assignment matrix, which usually involves a traffic assignment process. A benchmark study in this regard is the bilevel optimization model proposed by Yang et al. [29,30]. Although the method can to some extent capture the impact of congestion on drivers' route choice, some of the assumptions (such as perfect information) of the user equilibrium condition may not be satisfied in reality. To the best of our knowledge, no systematic way has been proposed to correct such potential bias with observed data due to the low observability of the assignment matrix.
In view of the above challenges and the potential of GPS location data, this study discusses two GLS based estimation frameworks for OD flows. The basic idea is to take full advantage of the sampled probe vehicle trajectory data to tackle all three above-mentioned difficulties. First of all, by aggregating the origin and destination zones of each probe vehicle trace, one can obtain observed probe vehicle OD flows. Then by scaling up the probe OD matrix using a certain set of penetration ratios (the proportion of probe vehicles among the entire vehicle population), a crude estimation of OD flows can be obtained. Such a scaled probe OD matrix can serve as a perfect supplement to or replacement of the target OD matrix. Eisenman and List [20] conducted an exploratory study in which the scaled probe OD matrix is used in conjunction with a prespecified target OD matrix in the GLS formulation with an externally computed assignment matrix. Van Aerde et al. [31] focused on the computation of probe penetration ratios by averaging the observed probe ratios at different sensor locations across the network. Although the model does not require a target OD matrix as input, unsatisfactory estimation results were reported even under a 20% probe penetration rate.
Another relevant study was conducted by Iqbal et al. [26] using mobile phone call records. The phone call records were first used to generate a tower-to-tower transient OD matrix which is then scaled up using an optimization model in conjunction with a microscopic simulation model. Also, Cao et al. [21] proposed a two-step framework to incorporate probe vehicle data: in the first phase, link flows without traffic sensors are estimated based on observed link speed (from probe vehicles) and a precalibrated macroscopic speed-density relationship; in the second phase, a bilevel GLS estimator is formulated to estimate OD flows. Similarly, Tan et al. [32] developed a dynamic OD estimation model using Automatic Vehicle Location information, where DTA is used for obtaining the traffic assignment matrix. An important issue that is overlooked by the above studies is the heterogeneity of probe penetration ratios among different OD pairs. Such a situation may occur especially when probe vehicles are a certain type of commercial vehicle. In practice, both delivery vehicles and taxi vehicles could be used as probes. When using delivery vehicles as probes, their restricted OD distribution in a given network would inevitably be an issue and may affect the accuracy of OD estimation. So taxis would be a better choice. However, the proportion of taxi trips between distant OD pairs may still significantly differ from the other OD pairs. Such probe ratio heterogeneity is considered explicitly in this study. Another key concept proposed in this study is to estimate the assignment matrix directly from sampled trajectory data instead of running a complex traffic assignment process. There are two benefits in doing so: first, by replacing traffic assignment with map matching and data processing, one can avoid sophisticated traffic assignment computation and parameter calibration; moreover, the model does not depend on any theoretical assumption regarding drivers' behavior.
Based on the above discussions, two models are presented in this paper. The first model, the SPP model (standing for scaled probe OD matrix as prior OD), uses the scaled probe vehicle OD matrix as the prior OD matrix and applies the conventional GLS framework to conduct OD correction using link counts; the second model, the PRA model (standing for probe ratio assignment), is an extension of SPP in which the observed link probe ratios are included as additional information in order to explicitly account for the heterogeneity of probe penetration ratios. The remaining part of this paper is organized as follows: Section 2 explains basic concepts and notations. Section 3 contains detailed model specifications and is organized into three subsections: the development of the SPP and PRA models and the computation of assignment matrices. Section 4 discusses the solution algorithm. Section 5 presents the numerical experiment. Finally, Section 6 summarizes the conclusions.

Some Definitions and Notations
Consider a road network represented by a directed graph $\langle N, L \rangle$, where $N$ is the node set and $L$ is the link set. The analysis period is divided into $K$ uniform intervals. Each interval is called a demand interval. Let $D$ denote the origin-destination pair set; then the travel demand pattern of the network during the analysis period is represented by the OD flow matrix $x_{i,k}$, $i = 1, 2, \ldots, I$ and $k = 1, 2, \ldots, K$. Let $S$ denote the collection of links installed with sensors; $S$ is a subset of $L$. $\hat{y}_{j,k}$, $j = 1, 2, \ldots, J$, $k = 1, 2, \ldots, K$, are the observed traffic flow counts of all vehicles at sensor locations.
It is assumed that two types of vehicles travel in the network: probe vehicles and regular vehicles. Each probe vehicle is able to actively report its position in the form of GPS coordinates. Through a map matching algorithm, those GPS coordinates are transformed to corresponding locations in the network. In reality, the actual locations of probe vehicles usually cannot be determined fully due to the measurement error of GPS. Since this study places more emphasis on the theoretical aspects, it is assumed that the exact location information is available whenever a probe vehicle reports its GPS coordinates.
The numbers of probe vehicles traveling between different OD pairs within each interval are called probe vehicle OD flows (simplified as probe OD flows) and are denoted by $\hat{z}_{i,k}$, $i = 1, 2, \ldots, I$ and $k = 1, 2, \ldots, K$; for each OD pair, the proportion of probe vehicles in the total vehicle population within the same interval is called the OD probe vehicle penetration ratio (simplified as OD probe ratio) and is denoted by $\rho_{i,k}$, $i = 1, 2, \ldots, I$ and $k = 1, 2, \ldots, K$. Note that OD probe ratios are both time- and OD-dependent.

Model Specifications
This section introduces two different models for OD estimation. It should be noted that traffic congestion may bring great challenges to the estimation process. For simplicity, this study does not account for the dynamic routing issue that may arise in practice.

Scaled Probe OD Matrix as Prior Matrix (SPP) Method.
The SPP method consists of two steps: first, the prior OD matrix $\tilde{x}_{i,k}$ is estimated by scaling up the probe OD flows $\hat{z}_{i,k}$ with the corresponding OD probe ratios $\rho_{i,k}$, which are estimated by averaging the link probe ratios across the network; then the OD flows $x_{i,k}$ are solved by correcting the prior OD matrix with a GLS formulation.
To make the paper self-contained, the "direct scaling" method proposed by Van Aerde et al. [31] is summarized briefly in this section. To estimate OD probe ratios, the average link probe ratio across the entire network within each interval is computed by the following expression:

$$\rho_k = \frac{\sum_{j=1}^{J} \hat{h}_{j,k}}{\sum_{j=1}^{J} \hat{y}_{j,k}}, \quad k = 1, 2, \ldots, K, \tag{2}$$

where $\hat{h}_{j,k}$ are the observed link probe flows, $\hat{y}_{j,k}$ are the observed link flows, and $\rho_k$ is the common value of $\rho_{i,k}$. Equation (2) computes the average ratio of the total number of probe vehicles to the total number of vehicles observed across the entire network during one interval. Therefore expression (2) implicitly assumes that OD probe ratios are homogeneous among all OD pairs. Then $\rho_k$ are used to compute the prior OD flows $\tilde{x}_{i,k}$ using the following equation:

$$\tilde{x}_{i,k} = \frac{\hat{z}_{i,k}}{\rho_k}, \tag{3}$$

where $\hat{z}_{i,k}$ are the observed probe OD flows.
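The direct scaling step in (2) and (3) can be sketched in a few lines of Python for a single demand interval. This is only an illustration: the function name `direct_scaling`, the argument names, and the sample counts are hypothetical and do not come from the study.

```python
def direct_scaling(probe_od, link_probe, link_total):
    """Scale probe OD flows by the network-wide probe ratio of one interval.

    probe_od:   observed probe OD flows, one value per OD pair
    link_probe: observed probe vehicle counts, one value per sensor
    link_total: observed total vehicle counts, one value per sensor
    """
    total = sum(link_total)
    if total == 0:
        raise ValueError("no vehicles observed at any sensor")
    ratio = sum(link_probe) / total          # equation (2)
    if ratio == 0:
        raise ValueError("no probe vehicles observed, cannot scale")
    prior = [z / ratio for z in probe_od]    # equation (3)
    return ratio, prior


# Hypothetical counts: 45 probes among 300 vehicles seen at two sensors.
ratio, prior = direct_scaling(
    probe_od=[6, 4, 2],
    link_probe=[15, 30],
    link_total=[100, 200],
)
print(ratio, prior)
```

Because a single ratio is applied to every OD pair, any OD pair whose true probe ratio differs from the network average is scaled incorrectly, which is the bias the GLS correction is meant to reduce.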
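It should be noted that the prior OD flows $\tilde{x}_{i,k}$ themselves are an estimator of OD flows. This method is referred to as the direct scaling model (DS) in this study. Since the prior OD matrix is only a crude estimation, it is then adjusted using the following GLS formulation:

SPP model:

$$\min \; \sum_{i=1}^{I} \sum_{k=1}^{K} \frac{(x_{i,k} - \tilde{x}_{i,k})^2}{U_{i,k}^2} + \sum_{j=1}^{J} \sum_{k=1}^{K} \frac{(y_{j,k} - \hat{y}_{j,k})^2}{W_{j,k}^2} \tag{4a}$$

$$\text{s.t.} \quad y_{j,k} = \sum_{i=1}^{I} \sum_{m=0}^{M} a_{i,j}^{m} \, x_{i,k-m} \quad \forall j, k, \tag{4b}$$

$$(1 - \gamma)\, x_{i,k-1} \le x_{i,k} \le (1 + \gamma)\, x_{i,k-1} \quad \forall i, k, \tag{4c}$$

$$x_{i,k} \ge 0 \quad \forall i, k, \tag{4d}$$

where $x_{i,k}$, $\tilde{x}_{i,k}$ are the unknown and prior OD flows, respectively; $y_{j,k}$, $\hat{y}_{j,k}$ are the estimated and observed link flows, respectively; $U_{i,k}^2$, $W_{j,k}^2$ are, respectively, the variances of $\tilde{x}_{i,k}$ and $\hat{y}_{j,k}$; and $\gamma$ is the maximum percentage change of OD flows between two consecutive intervals.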
Constraint (4b) essentially represents the flow assignment process. $a_{i,j}^{m}$ is the proportion (or probability) of vehicles departing during the $k$th interval and traveling between OD pair $i$ that pass link $j$ during the $(k + m)$th interval; $a_{i,j}^{m}$ is referred to as the flow assignment fraction in this study. The estimated link flows, $y_{j,k}$, are then expressed as the weighted sum of all OD flows that departed up to interval $k$. $M$ is the maximum travel time among all journeys converted to the number of demand intervals. In this study, the flow assignment fractions are computed directly from probe vehicle trajectories.
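The assignment relation in (4b) can be illustrated with a small Python sketch. The nested-list layout, the function name `assign_link_flows`, and the numbers are assumptions made for this example only; departures before the first interval are ignored, which is a simplification.

```python
def assign_link_flows(od_flows, fractions):
    """Map OD flows to link flows with lagged assignment fractions.

    od_flows[i][k]     : flow of OD pair i departing in interval k
    fractions[m][i][j] : share of OD pair i's flow that passes sensor j
                         m intervals after departure
    Returns y[j][k], the estimated flow at sensor j during interval k.
    """
    n_od = len(od_flows)
    n_int = len(od_flows[0])
    n_sens = len(fractions[0][0])
    y = [[0.0] * n_int for _ in range(n_sens)]
    for m, frac_m in enumerate(fractions):
        for i in range(n_od):
            for j in range(n_sens):
                # Flows that departed before interval 0 are not modeled here.
                for k in range(m, n_int):
                    y[j][k] += frac_m[i][j] * od_flows[i][k - m]
    return y


# Two OD pairs, three intervals, one sensor, lags m = 0 and m = 1.
od_flows = [[100.0, 120.0, 80.0], [50.0, 50.0, 50.0]]
fractions = [
    [[0.5], [0.0]],   # lag 0: half of OD pair 0 reaches the sensor at once
    [[0.5], [1.0]],   # lag 1: the rest of pair 0 and all of pair 1
]
y = assign_link_flows(od_flows, fractions)
print(y)
```

In the first interval only vehicles with lag 0 contribute, so the sensor count there is lower than in later intervals even though demand is similar.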
The basic idea of the SPP model is to reduce the estimation bias of the direct scaling method using sensor count information. Let $d$ and $n$ be the numbers of OD pairs and traffic sensors, respectively; then the ratio of unknowns to independent observations in the SPP model is $d/(d + n)$. Note that the target OD flows are obtained from GPS probe vehicle data; no additional target OD matrix from other sources is required by the model.
Finally, formulations (4a)∼(4d) constitute a nonlinear optimization problem with a convex objective function and linear constraints. Therefore the global optimal solution exists and any solution satisfying the local optimality condition is also the global optimal solution of the problem. A gradient based searching algorithm as discussed in Appendix A is adopted to solve the SPP model.
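To make the GLS correction concrete, the sketch below minimizes a single-interval version of the SPP objective with plain projected gradient descent. Note the assumptions: this is not the solution algorithm of the study, the consecutive-interval constraint is omitted, the assignment is a simple matrix with no time lag, and all names (`spp_objective`, `spp_gradient`, `solve_spp`) and numbers are hypothetical.

```python
def spp_objective(x, prior, var_x, A, y_obs, var_y):
    """GLS objective: weighted squared deviations from prior and counts."""
    f = sum((xi - pi) ** 2 / vi for xi, pi, vi in zip(x, prior, var_x))
    for row, yj, vj in zip(A, y_obs, var_y):
        est = sum(a * xi for a, xi in zip(row, x))
        f += (est - yj) ** 2 / vj
    return f


def spp_gradient(x, prior, var_x, A, y_obs, var_y):
    """Analytical gradient of the GLS objective with respect to x."""
    g = [2.0 * (xi - pi) / vi for xi, pi, vi in zip(x, prior, var_x)]
    for row, yj, vj in zip(A, y_obs, var_y):
        resid = sum(a * xi for a, xi in zip(row, x)) - yj
        for i, a in enumerate(row):
            g[i] += 2.0 * a * resid / vj
    return g


def solve_spp(prior, var_x, A, y_obs, var_y, step=0.1, iters=2000):
    """Fixed-step gradient descent, projected onto x >= 0 after each step."""
    x = list(prior)
    for _ in range(iters):
        g = spp_gradient(x, prior, var_x, A, y_obs, var_y)
        x = [max(0.0, xi - step * gi) for xi, gi in zip(x, g)]
    return x


# Two OD pairs share one sensor; the prior sums to 60 but 72 were counted.
prior, var_x = [40.0, 20.0], [1.0, 1.0]
A, y_obs, var_y = [[1.0, 1.0]], [72.0], [1.0]
x_hat = solve_spp(prior, var_x, A, y_obs, var_y)
print(x_hat)
print(spp_objective(prior, prior, var_x, A, y_obs, var_y),
      spp_objective(x_hat, prior, var_x, A, y_obs, var_y))
```

With equal variances, the shortfall between the prior and the count is split between moving the OD flows and leaving a residual on the link, which shows how the variances act as trust weights. The fixed step size is chosen for this small example and would need tuning or a line search in general.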

Probe Ratios Assignment Model (PRA).
The second formulation proposed in this study is called the probe ratio assignment model (referred to as PRA).The underlying idea of PRA is to explicitly consider the correlation between OD probe ratios and observed link probe ratios.
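Thus, there exists some function $B$ that links the OD probe ratios $\rho_{i,k}$ and the estimated link probe ratios $\theta_{j,k}$:

$$\theta_{j,k} = B(\rho_{i,k}), \quad j = 1, 2, \ldots, J, \; k = 1, 2, \ldots, K. \tag{5}$$

The structure of $B$ is similar to that of the flow assignment matrix $A$; the difference is that $B$ represents the assignment of probe vehicle ratios instead of flows. In this study, $B$ is called the probe ratio assignment matrix. Based on this discussion, we can now define the following equation:

$$\theta_{j,k} = \sum_{i=1}^{I} \sum_{m=0}^{M} b_{i,j}^{m} \, \frac{\hat{z}_{i,k-m}}{x_{i,k-m}}. \tag{6}$$

Here $b_{i,j}^{m}$ are the probe ratio assignment fractions. $b_{i,j}^{m}$ represents the contribution of the probe ratio of OD pair $i$ during the $k$th interval to the probe ratio of link $j$ during the $(k+m)$th interval. Note that in the above equation, the OD probe ratios are estimated by taking the ratio between $\hat{z}_{i,k}$, the observed probe OD flows, and $x_{i,k}$, the OD flows we want to estimate. Let $\hat{\theta}_{j,k}$ be the observed link probe ratios at all sensor locations. Then the PRA model can be obtained by extending the SPP formulation to consider $\hat{\theta}_{j,k}$ and incorporate equation (6):

PRA model:

$$\min \; \sum_{i=1}^{I} \sum_{k=1}^{K} \frac{(x_{i,k} - \tilde{x}_{i,k})^2}{U_{i,k}^2} + \sum_{j=1}^{J} \sum_{k=1}^{K} \frac{(y_{j,k} - \hat{y}_{j,k})^2}{W_{j,k}^2} + \sum_{j=1}^{J} \sum_{k=1}^{K} \frac{(\theta_{j,k} - \hat{\theta}_{j,k})^2}{V_{j,k}^2} \tag{7a}$$

$$\text{s.t.} \quad y_{j,k} = \sum_{i=1}^{I} \sum_{m=0}^{M} a_{i,j}^{m} \, x_{i,k-m} \quad \forall j, k, \tag{7b}$$

$$\theta_{j,k} = \sum_{i=1}^{I} \sum_{m=0}^{M} b_{i,j}^{m} \, \frac{\hat{z}_{i,k-m}}{x_{i,k-m}} \quad \forall j, k, \tag{7c}$$

$$(1 - \gamma)\, x_{i,k-1} \le x_{i,k} \le (1 + \gamma)\, x_{i,k-1} \quad \forall i, k, \tag{7d}$$

$$x_{i,k} \ge 0 \quad \forall i, k. \tag{7e}$$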
In the above formulation, $V_{j,k}^2$ is the variance of $\hat{\theta}_{j,k}$. The other notations were introduced previously.
The objective function of the PRA model (7a) adds a third term to that of SPP (4a), which is the sum of weighted distances between estimated and observed link probe ratios. The first and second terms of (7a) are identical to those of (4a). Constraints (7b) and (7c) represent the assignment of OD flows and OD probe ratios.
The primary feature of PRA as given by (7a)∼(7e) is the utilization of a new set of field observations: the observed link probe ratios, which combine the information of flow counts and probe vehicle trajectories. The OD probe ratio assignment matrix is also a new concept introduced in this study. Let $d$ and $n$ be the numbers of OD flows and traffic sensors, respectively; then the ratio of unknowns to independent observations in the PRA model is $d/(d + 2n)$, which is lower than that of the SPP model. Essentially, each sensor provides two observations instead of only one after considering the probe vehicle trajectories: the first is the flow count of the entire vehicle population and the second is the proportion of probe vehicles passing the sensor location. The computation of the probe ratio assignment fractions is discussed in the next subsection. Note that the PRA formulation is no longer a convex optimization problem due to the existence of (7c); therefore solving the problem using a gradient based searching algorithm faces the possibility of being trapped in local optima. The solution algorithm of the PRA model is discussed in Section 4.
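The reason link probe ratios carry information about OD-level heterogeneity can be seen in a single-interval toy case. The sketch below weights each OD probe ratio by that OD pair's share of the link flow; this particular weighting is an assumption made for illustration (it follows from counting probes and vehicles on the link) and is not the estimation formula of the study. The function name and numbers are hypothetical.

```python
def link_probe_ratio(od_flows, od_probe_ratios, a_col):
    """Probe ratio on one link for a single interval.

    od_flows[i]        : total flow of OD pair i
    od_probe_ratios[i] : share of probe vehicles within OD pair i
    a_col[i]           : share of OD pair i's flow that uses the link
    """
    link_flow = sum(a * x for a, x in zip(a_col, od_flows))
    if link_flow == 0:
        raise ValueError("no flow on this link")
    # Weight of each OD pair = its share of the link's total flow.
    weights = [a * x / link_flow for a, x in zip(a_col, od_flows)]
    return sum(w * r for w, r in zip(weights, od_probe_ratios))


flows = [100.0, 100.0]
ratios = [0.05, 0.30]          # heterogeneous OD probe ratios
theta_link1 = link_probe_ratio(flows, ratios, [1.0, 0.0])  # only OD pair 0
theta_link2 = link_probe_ratio(flows, ratios, [1.0, 1.0])  # both OD pairs
print(theta_link1, theta_link2)
```

The two links report different probe ratios because they carry different mixes of OD pairs, so matching observed link probe ratios constrains the OD flows in a way that link counts alone do not.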

Computation of Assignment Matrices Using Probe Vehicle Trajectories.
In this study, both the flow assignment fractions $a_{i,j}^{m}$ and the probe ratio assignment fractions $b_{i,j}^{m}$ are estimated by analyzing the GPS trajectories of probe vehicles. Compared with conventional methods (i.e., traffic assignment models), the main feature of the proposed approach is that it replaces the dynamic traffic assignment process with a map matching procedure for GPS coordinates.
The underlying concept is relatively straightforward. The entire vehicle population is divided into two groups: probe vehicles and regular vehicles. All the assignment fractions of the probe vehicle population can be obtained from their time-dependent location information. One can then use those to approximate the assignment fractions of the entire vehicle population, assuming that probe vehicles are randomly sampled.

Define $\hat{z}_{i,j}^{k,k+m}$ as the observed number of probe vehicles traveling between OD pair $i$ that departed during interval $k$ and passed sensor $j$ during interval $k+m$; then the flow assignment fractions of probe vehicles can be computed as

$$a_{i,j}^{m} = \frac{1}{K} \sum_{k=1}^{K} \left( \frac{\hat{z}_{i,j}^{k,k+m}}{\hat{z}_{i,k}} \right). \tag{8}$$

In the above equation, the denominator is essentially the total number of probe vehicles that departed during interval $k$ between OD pair $i$. Therefore the fraction inside the parentheses is the proportion of vehicles passing link $j$ after $m$ intervals counted from their departure time interval. Note that probe vehicles departing at different time intervals are aggregated together to obtain a single estimation of the assignment matrix. The underlying assumption is that there exists some time period during which drivers' route choice behavior and network traffic condition remain approximately stable, and one can use a single flow assignment matrix (or OD probe ratio assignment matrix) to represent the correlation between OD flows and link flows (or OD probe ratios and link probe ratios). The purpose is to increase the estimation accuracy of the assignment matrix by aggregating probe vehicles from multiple demand intervals.

Extending (8) into a time-dependent form is also straightforward. Essentially probe vehicles are grouped according to their departure time. Suppose that the entire period of analysis is divided into $T$ assignment intervals and each assignment interval contains $P$ demand intervals. Let $a_{i,j}^{m,t}$, $t = 1, 2, \ldots, T$, be the assignment matrices; then $a_{i,j}^{m,t}$ can be estimated based on probe data collected during assignment interval $t$ using

$$a_{i,j}^{m,t} = \frac{1}{P} \sum_{p=1}^{P} \left( \frac{\hat{z}_{i,j}^{k_t^p,\,k_t^p+m}}{\hat{z}_{i,k_t^p}} \right), \quad i = 1, 2, \ldots, I, \; j = 1, 2, \ldots, J, \; t = 1, 2, \ldots, T, \tag{9}$$

where $k_t^p$ is the $p$th demand interval during the $t$th assignment interval. It should be noted that the traffic congestion level on the network directly affects the appropriate length of an assignment interval.
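The idea behind (8), estimating flow assignment fractions by counting where probe vehicles are observed after departure, can be sketched as follows. The record layout (tuples of vehicle id, OD pair, interval, sensor), the function name, and the choice to average per-interval proportions while skipping intervals without probe departures are all assumptions made for this illustration.

```python
from collections import defaultdict


def flow_assignment_fractions(departures, passages):
    """Estimate assignment fractions from map-matched probe records.

    departures: list of (vehicle_id, od_pair, depart_interval)
    passages:   list of (vehicle_id, sensor, pass_interval)
    Returns {(lag, od_pair, sensor): fraction}, averaged over the
    departure intervals in which the OD pair had at least one probe.
    """
    trip = {vid: (od, k) for vid, od, k in departures}
    departed = defaultdict(int)        # (od, k) -> probes departed
    for _, od, k in departures:
        departed[(od, k)] += 1
    passed = defaultdict(int)          # (od, k, sensor, lag) -> probes seen
    for vid, sensor, k_pass in passages:
        od, k_dep = trip[vid]
        passed[(od, k_dep, sensor, k_pass - k_dep)] += 1
    n_intervals = defaultdict(int)     # od -> departure intervals with probes
    for od, _ in departed:
        n_intervals[od] += 1
    fractions = defaultdict(float)
    for (od, k, sensor, lag), n in passed.items():
        fractions[(lag, od, sensor)] += n / departed[(od, k)] / n_intervals[od]
    return dict(fractions)


# Six hypothetical probes on OD pair "1-4": four depart in interval 0,
# two in interval 1; each is seen at sensor "a" or "b" one interval later.
departures = [(v, "1-4", 0) for v in (1, 2, 3, 4)] + [(5, "1-4", 1), (6, "1-4", 1)]
passages = [(1, "a", 1), (2, "a", 1), (3, "a", 1), (4, "b", 1),
            (5, "a", 2), (6, "b", 2)]
frac = flow_assignment_fractions(departures, passages)
print(frac)
```

No route choice model is involved: the fractions come entirely from observed sensor passages, so their accuracy depends on how many probes depart per OD pair and interval.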
The probe ratio assignment fractions are computed using equation (10). Similar to (9), the estimation of probe ratio assignment fractions in (10) can be approximated by using the observed number of probe vehicles that departed $m$ intervals earlier and passed sensor $j$ during a given demand interval, together with the link flows detected by the sensors.
This section provides an illustrative example to show how (9) and (10) work. Consider a hypothetical network consisting of four nodes and five links (shown in Figure 1(a)). Among all the nodes, nodes 1, 2, and 3 are demand generation nodes and node 4 is a demand absorption node. Therefore there are three OD pairs, i.e., 1-4, 2-4, and 3-4. It is assumed that traffic sensors are installed at the middle of each link. A total of 12 probe vehicles departed during one demand interval (so there is a single demand interval and a single assignment interval); the distribution over the three OD pairs is 6, 4, and 2 vehicles, respectively.
Note that the initial locations of all probe vehicles are plotted in Figure 1(b).
Suppose that all vehicles finished their trip within the following four time intervals and each vehicle is observed at least once during each interval.The positions of all vehicles at the end of each subsequent interval are visualized by Figures 2(a)∼2(d).Note that vehicles traveling between different OD pairs are painted with different patterns.
According to observed vehicle positions at the end of each interval, one can identify, from individual vehicle's viewpoint, the sensor location passed by each probe vehicle during each interval which is summarized in Table 1.
According to Table 1, the probe vehicle counts $\hat{z}_{i,j}^{k,k+m}$ are computed and summarized in Table 7. Then the flow assignment fractions and probe ratio assignment fractions are computed according to (9) and (10). The results are summarized in Table 8.

Computation Procedures
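This section presents the numerical solution algorithm for the proposed models. The problem to be solved is a nonlinear optimization problem with equality and inequality constraints. Mathematically, such an optimization problem takes the following general form:

$$\min_{X} \; f(X) \quad \text{s.t.} \quad a_i X = b_i, \; i = 1, \ldots, p; \quad a_i X \le b_i, \; i = p + 1, \ldots, q. \tag{11}$$

In formulation (11), the objective function $f$ is an $n$-dimensional scalar function. Since $f$ is continuous and differentiable in this study, the gradient of $f$ can be obtained analytically. There are also $p$ linear equality constraints and $q - p$ linear inequality constraints. The gradient vector of $f$ is denoted by $\nabla f$:

$$\nabla f = \left[ \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_n} \right]. \tag{12}$$

The solution algorithm is summarized as follows.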
Step 1 (initialization). Determine an initial feasible solution that satisfies all constraints. Let $X^0$ be the initial feasible solution and set the current iteration $h = 0$; then enter the main optimization loop consisting of Steps 2∼6.
Step 2 (gradient computation). Based on $X^h$, compute the gradient of the objective function $\nabla f(X^h)$. The specific formulas used for gradient computation of the SPP and PRA models are given by equations (A.3) and (B.3) in Appendices A and B, respectively.
Step 3 (optimal search direction calculation). Based on the current gradient, compute an optimal search direction considering all constraints. Let $d^h = [d^h_1, d^h_2, \ldots, d^h_n]$ be the search direction of the current iteration; $d^h$ is computed by solving an LP model in which $a_i$ is the coefficient vector of the $i$th constraint, $a_i = [a_{i,1}, a_{i,2}, \ldots, a_{i,n}]$, and $E$ and $I_b$ are the equality constraint set and the bounded inequality constraint set.
Step 4 (optimal search step length calculation). According to the optimal search direction $d^h$, perform a line search to determine the optimal step length $\lambda^h$. This algorithm follows an iterative procedure and is quite similar to the traditional Frank-Wolfe algorithm.
Step 6 (check convergence criteria). If $|f(X^{h+1}) - f(X^h)| < \varepsilon$, where $\varepsilon$ is a preset tolerance, then terminate the computation; otherwise repeat the process from Step 2 to Step 6.

Numerical Examples

Simulation Setup and Results.
To evaluate the effectiveness of the proposed models, numerical experiments are conducted using VISSIM as a laboratory experiment tool. A synthetic dataset is used due to the lack of a real world dataset. A road network in the northern part of Maryland State (near the I-495 beltway) is selected as the test site. The network consists of 28 nodes and 74 links. The bird view map of the target area and the network topology constructed in VISSIM are shown by Figures 3 and 4. The simulation period is set to 3 hours, which is divided into 18 demand intervals (each interval is 10 minutes). To simplify the simulation process, 39 major OD pairs are selected in the simulation. As shown in Figure 4, 10 out of 38 links are installed with traffic sensors for traffic data collection. Table 10 summarizes the origin and destination nodes of each OD pair along with its time-dependent demand volumes; Table 11 summarizes the route choice probability of all paths between all OD pairs. For convenience of study, between each OD pair shown in Table 11, we only selected those paths whose lengths are obviously shorter than the others. Given the OD flows and route choice in the simulation network, it shall be noted that the v/c ratios on all links are below 0.4. By running the simulation network in VISSIM, we collected the GPS trajectory of each vehicle and the traffic flow rate on those links with sensors. Notably, the GPS trajectory of each vehicle can directly yield the ground truth of OD flows. For model evaluations and comparisons, this study uses a part of the trajectory dataset (based on the preset probe ratio in each scenario) as model inputs which represent the probe vehicles.
To reveal the model properties under different network conditions, two scenarios are simulated. The first scenario (referred to as scenario A) represents the situation in which the probe vehicle penetration ratios are approximately homogeneous among different OD pairs; the second scenario (referred to as scenario B) represents the case of heterogeneous probe ratios across OD pairs. For both scenarios, the average probe vehicle penetration ratio is set as 15%. The probe ratios across different OD flows in scenario B range from 5% to 30%.
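For each model (DS, SPP, and PRA), the estimation accuracy of four sets of parameters is examined: (1) OD flows, (2) OD probe ratios, (3) link flow counts, and (4) link probe ratios. The performance indicators reported in Tables 2∼5 (including MSE, RMSE, MAPE, MSPE, and RMSPE) are computed from the estimated values and the corresponding ground truth values over the total number of estimates.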
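For readers who want to reproduce such indicators, the sketch below uses the standard textbook definitions of MSE, RMSE, MAPE, MSPE, and RMSPE. These definitions are an assumption: the exact expressions used in the study are not reproduced here, and the function name and sample values are hypothetical. Percentage-based indicators are returned as fractions (0.1 means 10%).

```python
import math


def error_metrics(estimated, truth):
    """Standard error indicators between estimates and ground truth.

    Relative errors divide by the ground truth, so every ground truth
    value must be nonzero.
    """
    n = len(truth)
    err = [e - t for e, t in zip(estimated, truth)]
    rel = [d / t for d, t in zip(err, truth)]
    mse = sum(d * d for d in err) / n
    mspe = sum(r * r for r in rel) / n
    return {
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAPE": sum(abs(r) for r in rel) / n,
        "MSPE": mspe,
        "RMSPE": math.sqrt(mspe),
    }


metrics = error_metrics(estimated=[110.0, 45.0], truth=[100.0, 50.0])
print(metrics)
```

The squared indicators penalize a few large errors more heavily than the absolute ones, which is one reason two models can look nearly identical on one indicator and clearly different on another.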
Tables 2∼5 summarize all the performance indicators for OD flows, OD probe ratios, link flows, and link probe ratios given by the DS, SPP, and PRA models under scenarios A and B. Using the SPP model as the benchmark, the improvement of PRA in the estimations is also presented. With the proposed solution algorithm, the estimations for all models can be completed within 3 minutes.

Results Interpretation.
Several important observations made from the model outputs are discussed in this section. First of all, the DS model tends to produce biased OD estimation in both scenarios, which can be observed from Table 2 and Figures 5(a) and 5(d). Particularly when the probe vehicle ratios are not homogeneous across different OD pairs, the percentage error of the DS model reached as high as 82% in MAPE and 125% in RMSPE. Therefore, directly scaled OD flows based solely on observed link probe ratios are not a reliable estimator of OD flows and require additional adjustment.
Secondly, in scenario A, both the SPP and PRA models offered significantly higher estimation quality compared with the DS model. Using SPP as the benchmark, in scenario A, the PRA model reduced the MAPE and MSPE of the OD matrix by 1.3% and 7.5%, respectively. Based on the performance indicators reported in the tables, one can argue that when the probe ratios are homogeneous across OD pairs, the two models have a similar degree of accuracy.
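As for scenario B where the probe ratios are not homogeneous, results showed that the PRA model is more effective than the SPP model. According to Table 2, the PRA model reduced the MAPE and MSPE of the SPP model by 19.8% and 36.5%, respectively. This indicates that the PRA model offers additional correction capability compared with the SPP model. This conclusion can also be inspected visually through Figures 5(e) and 5(f). Similar results can also be found in Tables 3∼4. In Table 5, both SPP and PRA outperform DS in estimating link probe ratios, while the MSE and RMSE show that the two models are almost identical. However, in terms of MAPE, MSPE, and RMSPE, the PRA model yields significant improvement, which validates the model's effectiveness.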
Another important finding of the experiment is that it is generally more challenging to correct link probe ratios than flow counts. From Figures 6(b), 6(c), 6(e), and 6(f), one can observe that the regression line between estimated and observed link flow counts is very close to the 45-degree line for both models, while, in Figures 7(b), 7(c), 7(e), and 7(f), the dispersions of the link probe ratio scatter plots are obviously larger. One viable explanation is that the computation of the probe ratio assignment matrix is more difficult than that of the flow assignment matrix, and a better correction of link probe ratios can be achieved only when the probe ratio assignment matrix is more precisely specified.
To summarize, as a direct enhancement of the DS method, SPP is very effective when the distribution of OD probe ratios is homogeneous; on the other hand, the PRA model, by considering additional link observations, generally outperforms the SPP model when the OD probe ratios vary significantly among different OD pairs. However, compared with the SPP and DS models, the PRA method also introduces several additional complexities. To maximize the benefit of PRA, one needs to correctly specify the probe ratio assignment matrix and link probe ratio variances and also take some care with the solution algorithm in order to avoid being trapped in local optimal points.
To further evaluate the effectiveness of the PRA model in estimating OD flows with different probe penetration ratios, this study conducted a sensitivity analysis in which the ratio ranges from 5% to 30%. As shown in Table 6, the estimation accuracy of PRA is quite sensitive to the average probe ratio. Higher probe ratios result in more accurate OD flow estimation. In addition, when the probe ratio drops to 5%, the estimation results from PRA are not sufficiently reliable. Under such a condition, increasing the number of link sensors would help improve the model's performance.

Conclusions and Future Research
This paper presented the development of two offline OD estimation models using probe vehicle data: the SPP and PRA models. Both the mathematical formulations and the solution algorithm are discussed in detail. The study also explored the possibility of computing assignment matrices directly from GPS trajectories to avoid a complex traffic assignment process. Then, through a comprehensive numerical experiment, the performances of the proposed models are analyzed. It is shown that the distribution of OD probe ratios can affect the correction power of different models when probe vehicle data is used. When the OD probe ratios are approximately homogeneous across different OD pairs, both SPP and PRA performed equally well by reducing about half of the relative error of the DS method; however, when the OD probe ratios are nonhomogeneous, the PRA model outperformed the SPP model to some extent. The results also implied that when the OD probe ratios are heterogeneous, incorporating observed link probe ratios into the objective function can improve the overall estimation accuracy. However, unlike link flows, the correction of link probe ratios turns out to be much more challenging, and one needs to carefully specify the correlation between OD and link probe ratios, which is the probe ratio assignment matrix in this study.
As for future research, the proposed models can be integrated with the quasidynamic approach proposed by Cascetta et al. [6] to further reduce the unknown-to-observation ratio; sensitivity analysis can also be performed to investigate the impact of different input parameters on the final outcomes; moreover, a numerical experiment using a real world dataset is another important future work in order to better assess the model performance. In addition, one can note that the probe OD ratios are used as approximations of the flow assignment fractions, which may not hold in practice. Hence, it is also critical to develop an advanced model to overcome this limitation and apply a state-of-the-art method to estimate the traffic assignment matrix.
Note that the second derivative is computed based on the correlation between $y_{j,k}$ and $x_{i,k}$ given by constraint (4b). To summarize,

B. Gradient Computation of PRA Model
According to the objective function given by (6), define the following three functions:

C. Assignment Fractions of Example
See Tables 7, 8, and 9.

D. OD Flows and Turning Ratios in the Network
See Tables 10 and 11.

Figure 3: The bird view map of the test network.

Detectors for both directions

Figure 4: Topology of the test road network in VISSIM.
Figures 5, 6, and 7 show the comparison between estimated and ground truth values of different parameters displayed in the form of scatter plots.

Figure 5: Scatter plots of estimated and ground truth OD flows.

Figure 6: Scatter plots of estimated and observed link flow counts.

Figure 7: Scatter plots of estimated and observed link probe ratios.

Table 1: Summary of sensor locations passed by probe vehicles.
5.1.Simulation Setup and Results.To evaluate effectiveness of the proposed models, numerical experiments are conducted using VISSIM as a laboratory experiment tool.The use of synthetic dataset is due to the lack of real world dataset.A road network in the northern part of Maryland State (near I-495 beltway) is selected as the test site.The network consists of 28 nodes and 74 links.The bird view map of the target area and the network topology constructed in VISSIM are shown by Figures

Table 2: Estimation accuracy of OD flows.

Table 3: Estimation accuracy of OD probe ratios.

Table 4: Estimation accuracy of link flow counts.

Table 5: Estimation accuracy of link probe ratios.

Table 6: Estimation accuracy of OD flows with different probe ratios.

Table 2 ,
PRA model reduced the MAPE and MSPE of SPP model by 19.8% and 36.5%,respectively.Therefore it indicates that PRA model offers additional correction capability compared with SPP model.Such conclusion can be also inspected visually through Figures

Table 9: Probe ratio assignment matrix estimated.

Table 11: Turning ratios via different paths.