Route Choice of the Shortest Travel Time Based on Floating Car Data

Finding a route with the shortest travel time according to the traffic condition can help travelers make better route choice decisions. In this paper, a shortest travel time method based on FCD (floating car data), which is used to assess overall traffic conditions, is proposed. To better fit FCD to the road map, a new map matching algorithm which fully considers the distance factor, direction factor, and accessibility factor is designed to map all GPS (Global Positioning System) points to roads. A mixed graph structure is constructed and a route analysis algorithm of shortest travel time which considers the dynamic edge weight is designed. Compared with other map matching algorithms, the proposed method has a higher accuracy. The comparison results show that the shortest travel time path is longer than the shortest distance path, but it costs less traveling time. The implementation of the route choice based on the shortest travel time method can be used to guide people's travel by selecting the space-time dependent optimal path.


Introduction
There is an urgent need to obtain the traffic dynamics in a city for traffic guidance. Effective traffic information can help travelers make better route choice decisions. Queries of the type "how do we get traffic information?" and "which path is the shortest distance between two vertices in a graph?" are widely addressed, while queries of the type "how do we get traffic information efficiently and economically?" and "which path is the shortest travel time between two vertices in a graph?" need further analysis.
Although traffic information on road networks can be collected by induction loops or visual systems, it is difficult to obtain an accurate estimation of the instantaneous travel time from the local traffic speed and flow data [1]. The spatiotemporal distribution of traffic congestion demonstrates a multinuclear structure in urban road networks [2]. With the availability of inexpensive positioning technology, it is possible to use historical navigation data to model the traffic flow at different times on a particular day.
In recent years, an increasing number of cars have been equipped with GPS (Global Positioning System). FCD (floating car data) collects traffic information including real-time position, direction, speed, and other information. If an FCD system achieves a penetration rate of more than 1.5% [3], the service quality in urban traffic would be good enough. The results from a large-scale freeway and arterial experiment have highlighted the significance of FCD for traffic management [4]. So far there is abundant research about the floating car and its applications, such as detecting hot spots [5][6][7], road network updating [8,9], traffic prediction [10,11], experiential optimal paths [12], and spatiotemporal patterns [13]. Having a large number of vehicles collecting data for an urban area creates an accurate picture of the traffic condition in time and space [14]. Nowadays, it is possible to study the spatiotemporal characteristics of the traffic flow by analyzing the FCD.
In comparison to fixed traffic sensors, FCD is capable of providing a robust overview of current road traffic conditions at significantly less cost [15]. A new operational system based on information from a cellular phone service provider for measuring traffic speeds and travel times was evaluated [16]. The main finding is that there is a good match between FCD measurements and dual magnetic loop detectors.
Pfoser et al. showcased a system which facilitates the collection of FCD, produces dynamic travel time information, and provides value-added services based on the dynamic travel times [14]. However, a basic problem was the limited vehicle penetration and insufficient data coverage. Kesting and Treiber considered a vehicle-based approach to collect traffic data and used the data to estimate the upstream and downstream fronts of a traffic jam [17]. However, this estimation did not allow predicting travel time, which became relevant when no probe vehicle passed the roadside units for a while.
Because of GPS measurement errors and road geometric errors in digital maps, the GPS locations of probe vehicles may not appear on network links [18]. Therefore, it is necessary not only to match the GPS points to the road networks, but also to get the route of the consecutive GPS points. There are a number of works that propose methods for map matching. An incremental algorithm that matches consecutive portions of the trajectory to the road network and two global algorithms that compare the entire trajectory to candidate paths in the road network were proposed [19]. Due to limitations in the tracking data and the road network of global algorithms, the matching results need to be evaluated to discard portions of bad matches. An Adaptive Clipping algorithm which takes tracking error estimates into account was introduced to solve this map matching task [20]. However, the quality of this map matching algorithm was not assessed. The procedures and algorithms for the computation and map matching of road segment velocities to a digital road network were presented [21]. Because many adjacent GPS points belong to different roads, considering only the direction and distance factors is not enough for map matching.
Analysis studies of traffic conditions based on FCD are becoming more prominent. Mean link travel time based on the classification for the traffic flow, offset control, and moving direction at downstream signalized intersections in urban traffic networks was studied [22]. The mean origin and destination (OD) travel time was evaluated by summing up the mean travel time of each link in traffic networks. Traffic state was detected with FCD [23]. A statistical analysis showed the high quality of the reconstruction of the actual travel times in the network with an FCD penetration of only 1.5%. The probe-car system was used to predict traffic congestion in the immediate future [24]. A basic model for predicting traffic congestion in the immediate future using pheromone was developed. Traffic quality was provided by the aggregation and evaluation of FCD with a common evaluation scheme [25]. GPS traffic-related data for traffic monitoring and control was presented, and the scope of traffic information was illustrated [26]. However, that study is limited to analyzing the road situation and does not provide an accurate travel time.
In this paper, there are two important problems to be addressed in order to better guide the route choice. The first is the acquisition of the road traffic situation from FCD. The second is to find a route with the shortest travel time by designing an optimal route analysis algorithm.
The two main contributions of this paper are the following.
(1) A new map matching algorithm which fully considers the distance factor, direction factor, and accessibility factor is proposed. This algorithm can be used to acquire the road traffic situation.
(2) A shortest travel time algorithm is designed. An improved Dijkstra algorithm is proposed and travel time is assigned to edges as the dynamic weight of the road.
The paper is organized as follows. In Section 2, FCD are presented and two algorithms, the map matching algorithm and the shortest travel time algorithm, are designed. In Section 3, experiments are implemented to verify the proposed methods. In Section 4, the map matching algorithm and the average driving speed of roads based on historical FCD are discussed. Lastly, Section 5 presents the conclusions and points to the future work.

Data Description.
The original FCD is collected from over 11 thousand taxicabs in Wuhan in September 2009, at regular intervals (20-60 seconds on average) during the course of six days. Wuhan is the capital of Hubei province, China. It is located in the eastern Jianghan Plain at the intersection of the middle reaches of the Yangtze and Han Rivers. It is a major transportation hub, with dozens of railways, roads, and expressways passing through the city. It had a population of 9,100,000 people in 2009 [27]. The number of cars in Wuhan was about 556 thousand in 2008 [28].
The FCD samples achieve a penetration rate of 1.9%, which is sufficient to calculate the traffic information. There are more than 85 million records in total (over 14 million per day) with the attributes timestamp, CarID, longitude, latitude, speed, and angle. A timestamp is a sequence of characters giving the date and time of day. CarID is a unique identification of a taxicab. Longitude and latitude record the location of the taxicab. Speed is the instantaneous speed of the taxicab at a given time. Angle is defined as a horizontal angle measured clockwise from a north base line. Brockfeld et al. concluded that the Taxi-FCD system is able to deliver valuable travel time information for mobility services and that the average travel times can be detected and calculated reliably [29]. Table 1 shows some typical FCD records.

Map Matching Algorithm.
Because of GPS error or digital map measurement error, deviation between FCD and the map often exists. The objective of this section is to develop a map matching algorithm for FCD to assess the traffic condition. The core of this map matching algorithm proceeds as follows (Figure 1). The original GPS points and roads are shown in Figure 1(a). Then, a location is determined for each GPS point. GPS points are matched to the corresponding position on the road in Figure 1(b). Furthermore, the route between adjacent points is determined. A route is given to describe the path of the points in Figure 1(c).

Figure 1: The process of the map matching algorithm. (a) GPS points and roads. (b) GPS points and matching points. (c) Route and matching points.
In the location determination phase, a comprehensive model (formula (1)) which includes the distance factor, direction factor, and accessibility factor is considered. The distance factor denotes the perpendicular distance between the GPS point and the road. The direction factor is the angle between the car driving direction and the road azimuth. The accessibility factor is the spatial accessibility of adjacent GPS points in time. The model is

$$f(\mathrm{MM}) = w_{dis}\, g(\mathrm{Dis}) + w_{dir}\, h(\mathrm{Dir}) + w_{acc}\, k(\mathrm{Acc}), \qquad (1)$$

where $f(\mathrm{MM})$ is the comprehensive result of the distance, direction, and accessibility factors and map matching is abbreviated as MM. The terms $g(\mathrm{Dis})$, $h(\mathrm{Dir})$, and $k(\mathrm{Acc})$ are the distance factor, direction factor, and accessibility factor, respectively. Distance, direction, and accessibility are abbreviated as Dis, Dir, and Acc, respectively. $w_{dis}$, $w_{dir}$, and $w_{acc}$ are the weights of the distance factor, direction factor, and accessibility factor, respectively. The sum of $w_{dis}$, $w_{dir}$, and $w_{acc}$ is 1. Dis and Maxdis represent the distance to the road and the maximum distance of the road buffer, respectively. Dir denotes the angle between the driving direction and the road azimuth. Maxdir is the maximum angle between the driving direction and the road azimuth. Dispp and Maxacc are the distance between the adjacent points and the maximum distance the car can travel within a certain time. According to formula (1), the road with the maximum comprehensive value is selected.
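The weighted scoring in formula (1) can be illustrated with a minimal Python sketch. The linear normalization `1 - value / max_value` used for each factor is an assumption made here for illustration (the exact factor functions are not given in this text), and the names `Candidate`, `factor`, `mm_score`, and `best_road` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    road_id: str
    dis: float       # Dis: perpendicular distance from the GPS point to the road
    dir_diff: float  # Dir: angle between driving direction and road azimuth (degrees)
    dispp: float     # Dispp: distance from the previous matched point


def factor(value: float, max_value: float) -> float:
    """Assumed normalization: 1 at value 0, falling linearly to 0 at max_value."""
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    return max(0.0, 1.0 - value / max_value)


def mm_score(c: Candidate, maxdis: float, maxdir: float, maxacc: float,
             w_dis: float = 1 / 3, w_dir: float = 1 / 3, w_acc: float = 1 / 3) -> float:
    """Weighted sum of the three factors; the weights must sum to 1."""
    if abs(w_dis + w_dir + w_acc - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return (w_dis * factor(c.dis, maxdis)
            + w_dir * factor(c.dir_diff, maxdir)
            + w_acc * factor(c.dispp, maxacc))


def best_road(candidates, **params) -> Candidate:
    """Select the candidate road with the maximum comprehensive value."""
    return max(candidates, key=lambda c: mm_score(c, **params))


a = Candidate("A", dis=5.0, dir_diff=80.0, dispp=900.0)
b = Candidate("B", dis=12.0, dir_diff=5.0, dispp=300.0)
limits = dict(maxdis=30.0, maxdir=90.0, maxacc=1000.0)
chosen = best_road([a, b], **limits)
```

In this toy case road A is nearer to the GPS point, but road B agrees far better with the heading and with the previous matched point, so B receives the higher combined score. With only the distance weight set to 1, A would be chosen instead.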
The shortest path problem is to find a path between two vertices in a graph such that the sum of the weights of its constituent edges is minimized. Dijkstra's algorithm [30], the Floyd-Warshall algorithm [31], the A* search algorithm [32], and their improved variants are widely used to find the shortest paths. One merit of Dijkstra's algorithm is that it can stop once the shortest path to the destination node has been determined. In the route determination phase, Dijkstra's algorithm is used to implement the route analysis. Due to the large amount of data, the shortest path algorithm is time consuming. A Quadtree index is introduced to accelerate the shortest path algorithm. The framework of the proposed algorithm consists of three steps: initialization, searching, and the shortest path calculation.
In Step 1 (initialization), the road network data is preloaded into memory in order to search the shortest path. A Quadtree index is constructed in order to facilitate spatial queries in the road network. Each road segment is represented by a rectangle which is inserted into the Quadtree.
In Step 2 (searching), a searching rectangle is constructed to index the related roads. Two adjacent GPS points are used to construct the bounding box. There is a limit on the distance a taxicab can travel in a given time period. So, the maximum distance in a given time is applied to extend the searching area. After construction of the searching area, the related roads for calculating the shortest path can be selected.
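The construction of the searching rectangle in Step 2 can be sketched as follows. This is a simplified illustration, assuming planar coordinates in meters; a linear scan over road rectangles stands in for the Quadtree query, and the names `Rect`, `search_rect`, and `related_roads` are hypothetical.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def intersects(self, other: "Rect") -> bool:
        """True if the two axis-aligned rectangles overlap or touch."""
        return not (other.xmin > self.xmax or other.xmax < self.xmin
                    or other.ymin > self.ymax or other.ymax < self.ymin)


def search_rect(p, q, max_speed, dt) -> Rect:
    """Bounding box of two adjacent GPS points, extended on every side by
    the maximum distance a taxicab can cover in dt seconds."""
    margin = max_speed * dt
    return Rect(min(p[0], q[0]) - margin, min(p[1], q[1]) - margin,
                max(p[0], q[0]) + margin, max(p[1], q[1]) + margin)


def related_roads(road_boxes, rect):
    """Road ids whose rectangles intersect the searching rectangle.
    A Quadtree would answer this query without scanning every road."""
    return [rid for rid, box in road_boxes.items() if rect.intersects(box)]


rect = search_rect((0, 0), (100, 50), max_speed=20, dt=5)
roads = {"r1": Rect(150, 100, 300, 200), "r2": Rect(250, 0, 400, 50)}
selected = related_roads(roads, rect)
```

Only the roads returned by the query need to be loaded into the graph for the shortest path calculation between the two points.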
In Step 3 (shortest path calculation), Dijkstra's algorithm is used to solve the shortest path problem. Because the road network can be viewed as a sparse graph, an adjacency list structure is used to accelerate the route searching. By connecting all the adjacent points, the taxicab route can be acquired.
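A minimal sketch of Step 3 is given below: Dijkstra's algorithm over an adjacency list, stopping as soon as the destination is settled. The binary heap from Python's `heapq` module is a choice made for this sketch, not a detail stated in the text, and `shortest_path` is a hypothetical name.

```python
import heapq


def shortest_path(adj, source, target):
    """adj maps a node to a list of (neighbor, weight) pairs.
    Returns (distance, path); (inf, []) if target is unreachable."""
    dist = {source: 0.0}
    prev = {}
    done = set()
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        if u == target:
            break  # early stop: the shortest path to target is final
        for v, w in adj.get(u, ()):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in done:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


adj = {
    "a": [("b", 4.0), ("c", 1.0)],
    "c": [("b", 2.0), ("d", 7.0)],
    "b": [("d", 3.0)],
}
length, route = shortest_path(adj, "a", "d")
```

For map matching, `source` and `target` would be the road positions of two adjacent matched GPS points, and the graph would contain only the roads selected in Step 2.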

Route Analysis.
The shortest path problem is to find a path between two vertices (or nodes) in a graph such that the sum of the edge weights is minimized. Route choice of the shortest travel time can also be taken as a shortest path problem. The edge weight is the travel time. The difference from the traditional shortest path problem is that the edge weight changes dynamically over time. This section introduces the calculation method for the route choice of the shortest time.

A road network is constructed to conduct the route analysis. Node Information and Edge Information are first created from road segments for further analysis. The Node Information includes three parts: PointID, Lon, and Lat. PointID is the identification of the point. Lon represents the longitude of the point. Lat denotes the latitude of the point. A new data table named Node Information is created to store the above information. Edge Information records the start and end point identifications of the edge. Meanwhile, two columns named StartID and EndID are added to the attribute table of the road segments. StartID and EndID are foreign keys that are consistent with the PointID of the Node Information table.
According to the Node Information and Edge Information, the network can be constructed. Because some road segments are one-way, sharing a common node does not mean that two segments can be reached from each other. In the following, one-way and two-way road segments are further discussed.
A road network is a typical mixed graph (Figure 2). Some of the road segments are one-way, and others are two-way. In Figure 2, each undirected (two-way) road segment is replaced by two directed edges with opposite directions.
Three classes, Node, Edge, and Graph, are designed for the route analysis. The hierarchical relationship of these three classes is depicted in Figure 3. The road network can be taken as a sparse graph, so an adjacency list structure is used to store the relation between Node and Edge. The Graph Class contains the node collection. The Node Class includes the node identification and the collection of edges leaving this node. The Edge Class contains the edge identification, edge length, bSingleWay, StartID, EndID, Weight, and htTimeRate, where bSingleWay represents whether the edge is one-way or not and htTimeRate is a hash table that records the driving speed of the road at different TimeSlices.
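The three classes and the mixed graph handling might be sketched in Python as follows. The field names mirror those in the text, but the Python layout, the `add_node` and `add_road` helpers, and the choice to copy the speed table onto the reverse edge are assumptions made for illustration.

```python
from dataclasses import dataclass, field, replace


@dataclass
class Edge:
    edge_id: int
    length: float          # edge length
    b_single_way: bool     # bSingleWay: True for a one-way road segment
    start_id: int          # StartID, refers to a PointID
    end_id: int            # EndID, refers to a PointID
    weight: float = 0.0    # travel time, filled in during route analysis
    ht_time_rate: dict = field(default_factory=dict)  # TimeSlice index -> speed


@dataclass
class Node:
    point_id: int
    lon: float
    lat: float
    edges: list = field(default_factory=list)  # edges leaving this node


@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)  # PointID -> Node

    def add_node(self, node: Node) -> None:
        self.nodes[node.point_id] = node

    def add_road(self, edge: Edge) -> None:
        """Store a road segment in the adjacency lists. A two-way segment
        becomes two directed edges with opposite directions."""
        self.nodes[edge.start_id].edges.append(edge)
        if not edge.b_single_way:
            # Assumption: the reverse direction starts with a copy of the
            # same speed table; real data may differ by direction.
            reverse = replace(edge, start_id=edge.end_id, end_id=edge.start_id,
                              ht_time_rate=dict(edge.ht_time_rate))
            self.nodes[edge.end_id].edges.append(reverse)


g = Graph()
for pid in (1, 2, 3):
    g.add_node(Node(pid, lon=114.3, lat=30.6))
g.add_road(Edge(10, 500.0, False, 1, 2))  # two-way segment
g.add_road(Edge(11, 300.0, True, 2, 3))   # one-way segment
```

Node 3 ends up with no outgoing edges, which is how a shared node can exist without the two segments being reachable from each other.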
Based on Figure 3, the route analysis of shortest travel time is designed. Traditionally, the road segment weight is a static variable. In this study, the edge weight changes with time, which gives a dynamic edge weight problem. Figure 4 is taken as a case to explain the process of the shortest travel time, and the results of every step are listed in Table 2. At every step in Table 2, the minimum cost edge is selected and the corresponding point is placed into the set of searched points. Firstly, the original point, destination point, and departure time are set up. In Figure 4, $v_5$, $v_4$, and 8:00 are the original point, destination point, and departure time, respectively. Secondly, the minimum time cost edge from the original point ($v_5$ in Figure 4) is selected. Here it is $\langle v_5, v_6 \rangle$, because it takes 10 minutes from $v_5$ to $v_6$ and the arrival time 8:10 is earlier than the arrival time from $v_5$ to any other point. The following steps show the calculation process of travel time for each edge.
(a) CurRate, the average driving speed on the road at CurrentTime, is acquired from htTimeRate. CurrentTime is the current driving time of the vehicle.
(b) Remainingtime, the time left in the current TimeSlice, is calculated by formula (5). TimeSlice is the smallest unit of time period for the statistics of the driving speed of roads. INT means taking the integer part of a floating-point number.
(c) If the road segment can be completed within the Remainingtime of this TimeSlice at this rate, then the edge weight is calculated by formula (6). e.Length and e.Weight are the length and the weight of the road segment, respectively.
(d) If the car cannot pass through the road segment within the Remainingtime of this TimeSlice at this rate, then the next TimeSlice and its rate are used to compute the driving length, until the car finishes the road segment within a TimeSlice. Once the road segment has been traveled through, the resulting travel time is assigned to the edge as its weight e.Weight.
Thirdly, the point with the minimum edge weight among the other points, which is $v_7$ in Figure 4, is marked. It means that this node has been searched and will not be considered in the following steps. Fourthly, the edge weights are updated in a loop wherever a path through $v_7$ is nearer to the original point. Fifthly, the next minimum weight edge is selected and its point is added, until the destination point is found. Lastly, the path of shortest travel time from the original point to the destination point is acquired. In Figure 4, the path $\langle v_5, v_6, v_7, v_3, v_4 \rangle$ has the shortest travel time from $v_5$ to $v_4$.
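Steps (a) to (d) of the edge weight calculation can be sketched as below. Because formulas (5) and (6) are not reproduced in this text, the sketch assumes that Remainingtime is the time from CurrentTime to the end of the current TimeSlice and that the weight within one slice is length divided by rate. The names `edge_travel_time` and `rate_at` are hypothetical, times are in seconds since midnight, and speeds are in meters per second.

```python
def edge_travel_time(length, depart, rate_at, time_slice=240.0):
    """Travel time over one edge when the speed changes per TimeSlice.

    length     -- edge length in meters
    depart     -- time the vehicle enters the edge (seconds since midnight)
    rate_at    -- function: slice index -> average speed (m/s), as htTimeRate would give
    time_slice -- TimeSlice length in seconds
    """
    t = depart
    remaining_len = length
    while True:
        idx = int(t // time_slice)                  # (a) slice containing CurrentTime
        rate = rate_at(idx)
        if rate <= 0:
            raise ValueError("speed must be positive")
        remaining_time = (idx + 1) * time_slice - t  # (b) time left in this slice
        if rate * remaining_time >= remaining_len:   # (c) edge finished in this slice
            t += remaining_len / rate
            return t - depart
        remaining_len -= rate * remaining_time       # (d) continue in the next slice
        t = (idx + 1) * time_slice


rates = {0: 10.0, 1: 5.0}
rate_at = lambda idx: rates.get(idx, 10.0)
within_one_slice = edge_travel_time(1000.0, 0.0, rate_at)
across_two_slices = edge_travel_time(1000.0, 200.0, rate_at)
```

In the second call the vehicle covers 400 m in the 40 s left of the first slice at 10 m/s, then needs 120 s for the remaining 600 m at 5 m/s.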
Because the road weight changes dynamically over time and is not equal to the road length, edges cannot be sorted by length. This algorithm runs in $O(|V|^2)$ time, where $|V|$ is the number of vertices.
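The overall search can be sketched as a Dijkstra variant that tracks arrival times and picks the next node by a linear scan, which matches the quadratic running time stated above. This is an illustrative sketch rather than the exact implementation: `shortest_travel_time` and `travel_time` are hypothetical names, and the result is only guaranteed optimal under the assumption that departing later never leads to an earlier arrival on any edge.

```python
import math


def shortest_travel_time(adj, source, target, depart, travel_time):
    """adj maps a node to a list of (neighbor, edge) pairs.
    travel_time(edge, t) returns the seconds needed to traverse edge
    when entering it at time t. Returns (total seconds, path)."""
    arrival = {source: depart}
    prev = {}
    searched = set()
    while True:
        # Linear scan for the unsearched node with the earliest arrival.
        u = min((n for n in arrival if n not in searched),
                key=arrival.get, default=None)
        if u is None:
            return math.inf, []      # target unreachable
        searched.add(u)
        if u == target:
            break
        for v, edge in adj.get(u, ()):
            if v in searched:
                continue
            t = arrival[u] + travel_time(edge, arrival[u])
            if t < arrival.get(v, math.inf):
                arrival[v] = t
                prev[v] = u
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return arrival[target] - depart, path


# Toy network: each edge is (seconds before 08:00, seconds from 08:00 on).
adj = {
    "A": [("B", (300, 900)), ("C", (400, 400))],
    "B": [("D", (300, 900))],
    "C": [("D", (400, 400))],
}
cost = lambda edge, t: edge[1] if t >= 8 * 3600 else edge[0]
early = shortest_travel_time(adj, "A", "D", 6 * 3600, cost)
rush = shortest_travel_time(adj, "A", "D", 8 * 3600, cost)
```

The same origin and destination give different routes at the two departure times, because the edges through B become slow at 08:00 while the edges through C do not.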

Map Matching Result.
Continuous trajectories of the taxicabs can be acquired by the proposed map matching method.
Firstly, GPS points are projected to roads. Then, the shortest path algorithm is used to acquire the path between adjacent points. Finally, the spatiotemporal position of the taxicabs is obtained (Figure 5). Figure 5 shows the map matching result of FCD. Figure 5(a) shows five taxicabs randomly selected from the FCD database over 24 hours. The original GPS points are discrete points in space. Figure 5(b) displays the resulting trajectories of the five taxicabs. With the proposed map matching, continuous trajectories of the taxicabs are well acquired.

Spatiotemporal Rate of the Road. The weekend and weekday have different traffic patterns [7]. Therefore, weekdays and weekends are separated to study the road traffic situation. The taxicabs in the study area are driven continuously for more pickups to maximize profits, so the taxicab speed for all 24 hours can be acquired. The driving speed of roads is calculated in every TimeSlice. TimeSlice cannot be too long or too short. Too short a TimeSlice leads to inadequate records, and too long a TimeSlice results in low time accuracy. Various studies indicated that the minimum information rate should be between 3 minutes and 10 minutes [33,34]. After consideration of various factors, TimeSlice is calculated from the following quantities: TaxiNeed is the minimal number of taxis needed to calculate the average speed, TaxiNum denotes the total number of taxicabs in the study area, RoadSegNum is the total number of road segments, TaxiNum ÷ RoadSegNum represents the average number of taxicabs per road segment at every instantaneous moment, RoadLenAvg represents the average length of the road segments, SpeedAvg denotes the average speed of all the taxicabs, and RoadLenAvg ÷ SpeedAvg means the average driving time of a car on a road segment. In this study, the result of TimeSlice is 238.04 seconds. For convenience of calculation, TimeSlice is rounded to 240 seconds. GPS points are selected to compute the average speed of the road for every slice. Most of the taxicabs are concentrated in the city center and nearby. If a road has no GPS points, the speed limit of the road is taken as the mean speed of this road. Three roads are selected randomly to calculate the average driving speed of every weekday from FCD: that is, Dingziqiao Road, Wuhan Yangtze River Bridge, and Xinhua Road (Figure 6(a)). The four weekdays on the same road present a similar pattern in Figures 6(b)-6(d). Generally speaking, all three roads have two rush hours, which appear around 8:00 and 18:00. An obvious decline period from 0:00 to 4:00 and an obvious rising period from 4:00 to 8:00 are shown. Since the average speed of roads can reflect the traffic information of roads, the spatiotemporal distribution of the road speed at every slice is investigated. Figure 7 shows the mean speed of roads over four weekdays at two typical instantaneous moments.
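The per-slice averaging described above can be sketched in Python as follows. The record layout `(road_id, seconds_since_midnight, speed)` and the names `slice_speeds` and `speed_at` are hypothetical. The fallback to the speed limit is applied here to any slice without matched GPS points, which is one possible reading of the rule for roads without GPS points.

```python
from collections import defaultdict


def slice_speeds(records, time_slice=240):
    """Average matched GPS speeds per road and per TimeSlice.

    records -- iterable of (road_id, seconds_since_midnight, speed)
    Returns {road_id: {slice_index: mean speed}}.
    """
    sums = defaultdict(lambda: [0.0, 0])  # (road, slice) -> [speed sum, count]
    for road, t, speed in records:
        acc = sums[(road, int(t // time_slice))]
        acc[0] += speed
        acc[1] += 1
    table = defaultdict(dict)
    for (road, idx), (total, count) in sums.items():
        table[road][idx] = total / count
    return table


def speed_at(table, limit_speed, road, t, time_slice=240):
    """Mean speed of a road at time t; falls back to the road's speed
    limit when no GPS point was matched to it in that slice."""
    return table.get(road, {}).get(int(t // time_slice), limit_speed[road])


records = [("r1", 10, 30.0), ("r1", 200, 50.0), ("r1", 250, 20.0)]
limits = {"r1": 60.0, "r2": 40.0}
table = slice_speeds(records)
```

A table of this shape plays the role of htTimeRate for each edge: the route analysis looks up the speed for the slice that contains the current driving time.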
The following conclusions can be drawn from Figure 7. Generally, the driving speed of all roads changes over time. Specifically, the driving speed of roads in the rush hour (08:00) is lower than that in other hours (06:00). The average driving speed of roads at or near the city center is lower than that in the suburbs, whereas the change of driving speed of roads at or near the city center is greater than that in the suburbs.

The Shortest Travel Time Path Experiment.
According to the improved Dijkstra algorithm, the route choice prototype is developed. Both the shortest distance path and the shortest travel time path functions are implemented. Since they may produce different results, the same starting point and end point are chosen for route analysis in the following three experiments. The experimental results show the different characteristics (Figure 8). Figure 8(a) shows the shortest distance path. In Figure 8(b), the departure time is set to 6:00:00 and the shortest travel time path is illustrated. In Figure 8(c), the departure time is set to 8:00:00 and the shortest travel time path is illustrated.

Discussion
4.1. Analysis of the Map Matching Algorithm. The accuracy of the map matching algorithm has a significant impact on the acquisition of the road situation. Therefore, the accuracy is an important factor in the experiment. All FCD within a day (over 14 million records) are selected to verify the proposed algorithm.
Seven sets of values are assigned to the parameters in formula (1) to evaluate the accuracy of the proposed map matching algorithm. In experiment 1, the values of $w_{dis}$, $w_{dir}$, and $w_{acc}$ are set to 1, 0, and 0, respectively. These values mean that only the distance factor is considered, and the accuracy rate is 81.7%. In experiment 2, the values of $w_{dis}$, $w_{dir}$, and $w_{acc}$ are set to 0, 1, and 0, respectively. These values mean that only the direction factor is considered, and the accuracy rate is 76.3%. In experiment 3, the values of $w_{dis}$, $w_{dir}$, and $w_{acc}$ are set to 0, 0, and 1, respectively. These values mean that only the accessibility factor is considered, and the accuracy rate is 70.8%. In experiment 4, the values of $w_{dis}$, $w_{dir}$, and $w_{acc}$ are set to 1/2, 1/2, and 0, respectively. These values mean that the distance factor and direction factor are considered, and the accuracy rate is 89.9%. In experiment 5, the values of $w_{dis}$, $w_{dir}$, and $w_{acc}$ are set to 1/2, 0, and 1/2, respectively. These values mean that the distance factor and accessibility factor are considered, and the accuracy rate is 87.6%. In experiment 6, the values of $w_{dis}$, $w_{dir}$, and $w_{acc}$ are set to 0, 1/2, and 1/2, respectively. These values mean that the direction factor and accessibility factor are considered, and the accuracy rate is 85.1%. In experiment 7, the values of $w_{dis}$, $w_{dir}$, and $w_{acc}$ are set to 1/3, 1/3, and 1/3, respectively. These values mean that the distance factor, direction factor, and accessibility factor are all considered, and the accuracy rate is 94.9%. The accuracy of the proposed method, which includes distance, direction, and accessibility factors, is better than that of traditional map matching algorithms that include only distance and direction factors.
Therefore, the values of $w_{dis}$, $w_{dir}$, and $w_{acc}$ are set to 1/3, 1/3, and 1/3 in this research. Most of the GPS points have been matched to the correct road in Figure 5. Based on the proposed map matching algorithm, the driving speed of roads can be well acquired.

Average Driving Speed of Roads Based on Historical FCD. The average speed of historical FCD for a certain time may reflect road traffic conditions at a particular moment.
CV (coefficient of variation), the ratio of the standard deviation to the mean, is introduced to express the dispersion degree of the driving speed of roads, where $V_i$ is the average daily speed on weekday $i$ and $\bar{V}$ represents the average speed over the four weekdays. The CV of the driving speed for the three roads (Figure 6(a)) is shown in Figure 9. The mean CV values of the three roads are 0.099244701, 0.079581216, and 0.110617283, respectively. According to the statistical analysis, the percentages of the Dingziqiao Road's CV that are less than 0.1, 0.15, and 0.2 are 55.12%, 83.93%, and 97.23%, respectively. Similarly, the percentages of the Wuhan Yangtze River Bridge Road's CV that are less than 0.1, 0.15, and 0.2 are 73.96%, 82.55%, and 92.24%, respectively. The percentages of the Xinhua Road's CV that are less than 0.1, 0.15, and 0.2 are 49.03%, 76.73%, and 93.63%, respectively. Clearly, for all three roads the large majority of CV values are below the threshold of 0.2. We can conclude that the average speed of historical FCD approximates the mean speed of the road at a particular moment.
The units of the distance and time cost in Table 3 are meters and seconds, respectively. If the traffic condition is good, the planning paths (the shortest distance path and the shortest travel time path) may be close in spatiality and partly overlap (Figures 8(a) and 8(b)). However, the shortest travel time path (Figure 8(b)) usually contains more primary roads. On the contrary, the shortest distance path (Figure 8(a)) usually contains more secondary roads.
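The CV used in the dispersion analysis, and the share of time slices whose CV falls below a threshold, can be computed as in the following sketch. It assumes the population standard deviation over the four weekday speeds (the sample standard deviation is the other common choice), and the names `coefficient_of_variation` and `share_below` are hypothetical.

```python
import statistics


def coefficient_of_variation(speeds):
    """CV of the weekday speeds for one road and one time slice:
    standard deviation divided by mean (population form assumed)."""
    mean = statistics.fmean(speeds)
    if mean == 0:
        raise ValueError("mean speed is zero")
    return statistics.pstdev(speeds) / mean


def share_below(cvs, threshold):
    """Percentage of CV values that are less than the threshold."""
    return 100.0 * sum(cv < threshold for cv in cvs) / len(cvs)


steady = coefficient_of_variation([30.0, 30.0, 30.0, 30.0])
spread = coefficient_of_variation([20.0, 40.0])
pct = share_below([0.05, 0.12, 0.18, 0.25], 0.2)
```

Applying `coefficient_of_variation` to every time slice of a road and then `share_below` with thresholds 0.1, 0.15, and 0.2 gives percentages of the kind reported for the three roads.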

Analysis of Shortest Travel Time Path.
If traffic jams have occurred on some roads in the city center, there is usually a big difference between the shortest travel time path and the shortest distance path, since the traffic condition of highways is usually better than that of other roads. For example, the shortest travel time path in Figure 8(c) contains more highways.

Conclusions
In this study, an effective map matching algorithm is proposed and the results show that the proposed method has a higher accuracy. Based on the CV results of three roads, we conclude that the traffic distribution over four weekdays has a similar pattern for each road. By comparing routes of the shortest travel time and the shortest distance, the results show that the shortest travel time paths cost less traveling time than the shortest distance path.
Although more than 85 million records are collected and analyzed, traffic conditions may be influenced by other factors, such as weather and holidays. However, we have not considered the influence of these factors on traffic in this study. For the proposed map matching algorithm, because some map matching algorithms are not open source, we have not computed the accuracy of these algorithms.
Clearly, the research in this article can be regarded as an initial step in the application of FCD. Because FCD has the characteristics of big data, future research is planned to apply distributed computing technology to speed up the analysis of large-scale GPS records. In addition, multisource data will be used to analyze the traffic condition in future work.

Figure 2: Analysis of a mixed graph.

Figure 3: Hierarchical relationship of Graph, Node, and Edge Classes.

Figure 4: A case study for the shortest travel time.
Dingziqiao Road is in the lower right corner of Figure 6(a), Wuhan Yangtze River Bridge is in the middle of Figure 6(a), and Xinhua Road is in the upper left corner of Figure 6(a). The mean speeds of the Dingziqiao Road, Wuhan Yangtze River Bridge, and Xinhua Road are presented in Figures 6(b), 6(c), and 6(d), respectively.

Figure 5: Map matching of FCD. (a) FCD of the five taxicabs. (b) Trajectory of the taxicabs after map matching.

Figure 7: Illustration of spatiotemporal distribution of the average road speed (weekdays). (a) Average road speed at six AM. (b) Average road speed at eight AM.

Figure 9: CV of the driving speed on Dingziqiao Road, Wuhan Yangtze River Bridge, and Xinhua Road, respectively. (a) CV of the driving speed on Dingziqiao Road. (b) CV of the driving speed on Wuhan Yangtze River Bridge. (c) CV of the driving speed on Xinhua Road.
Distance factor and time factor are used to evaluate the proposed shortest travel time algorithm. The distance and time cost of the planning paths in Figures 8(a), 8(b), and 8(c) are listed in Table 3. The difference in distance between the paths in Figures 8(a) and 8(b) is small (Table 3). Although the planning path distance in Figure 8(c) is longer than the planning path distance in Figure 8(a), it costs less travel time. The planning paths at different times are also different although they apply the same algorithm. The road traffic condition varies over time, so the shortest travel time paths differ for different departure times (Figures 8(b) and 8(c)).

Table 1: Samples of typical FCD records.

Table 2: The process of the shortest travel time for Figure 4.

Table 3: The distance and time cost of planning paths in Figure 8.