An Online Map Matching Algorithm Based on Second-Order Hidden Markov Model

Map matching is a key preprocessing step for trajectory data, which have recently become a major data source for various transport applications and location-based services. In this paper, an online map matching algorithm based on the second-order hidden Markov model (HMM) is proposed for processing trajectory data in complex urban road networks with features such as parallel road segments and various road intersections. Several factors, such as the driver's travel preference, network topology, road level, and vehicle heading, are considered. An extended Viterbi algorithm and a self-adaptive sliding window mechanism are adopted to solve the map matching problem efficiently. To demonstrate the effectiveness of the proposed algorithm, a case study is carried out using a massive taxi trajectory dataset in Nanjing, China. The case study results show that the proposed algorithm outperforms the baseline algorithm built on the first-order HMM in accuracy across various testing experiments.


Introduction
With the development of positioning and wireless communication technologies, floating car data (e.g., trajectories of taxis) have become a major data source for many applications such as location-based services, intelligent transportation systems, and transport policy appraisals [1][2][3][4][5].
The errors of positioning data collected by global positioning system (GPS) equipment on floating vehicles are inevitable and can come from the satellite, the transmission process, and the receiver [6]. Map matching is the process of matching error-prone GPS data onto the road network in order to eliminate the impact of errors and maximize the usefulness of the data. In practical applications, a map matching algorithm plays a vital role; for example, travel time prediction based on floating car data needs to match GPS points to the corresponding road segments accurately. Therefore, the map matching algorithm is the basis for the large-scale application of floating car data.
The algorithms based on the geometric technique utilize geometric information of GPS points and the road network (e.g., distance, angle, and shape) without considering the topology of the road network. These algorithms are highly efficient, but their accuracy is low when matching low-precision GPS data to complex road networks. With the topological technique, both geometric factors and road topology are considered. To some extent, the topological technique improves the matching accuracy, but it is still vulnerable to low sampling frequency and large sampling noise. The probability statistics technique sets an elliptical or rectangular confidence area for each GPS point, so that a probability can be obtained from the distance between the GPS point and a position in the confidence area. Optimal matching paths are then determined according to the probability values. Compared to the geometric and topological techniques, the probability statistics technique is more complex, more difficult to implement, and less time-efficient. By combining geometric, topological, and probability factors, advanced techniques such as the Kalman filter [19], Bayesian filter [20], fuzzy logic model [21], multihypothesis tree [18], and hidden Markov model (HMM) [22] can effectively improve the map matching accuracy and achieve online incremental matching.
Of the advanced techniques, HMM has become popular in map matching studies. HMM is a prevailing paradigm of network-based dynamics modeling, which suits the process of finding the most suitable matching point (i.e., hidden state) on the road network for each GPS point (i.e., observed state) in the map matching problem. Existing map matching algorithms based on HMM can be divided into two categories [20]: offline algorithms and online algorithms (refer to Table 1).
Offline HMM map matching algorithms are applied to historical data, processing the whole input trajectory in a batch to find the optimal matching path in the road network [23][24][25][26][27][28]. Whole trajectories enable offline algorithms to take account of the relationship between earlier and later points to achieve higher accuracy. Offline algorithms are robust to a reduced sampling rate, but their computational efficiency is low. Online algorithms estimate the current segment immediately after obtaining GPS data, and this kind of algorithm can be used for providing online services such as real-time navigation and trajectory monitoring. Because future points are unavailable, online algorithms are more complicated and face higher computational demands in real-time applications. Most studies utilize a sliding window mechanism with a fixed window size to realize online matching [29,30]. As new GPS points arrive, the points in the sliding window change dynamically. However, when data quality is low or the road network is complex, a small window leads to a significant decrease in matching accuracy, while a large window brings a significant decrease in computational efficiency. A few online map matching algorithms adopt variable sliding windows, but they require a lot of extra computation [31]. Considering these issues, in this study, we propose self-adaptive sliding windows to realize online map matching based on HMM, aiming at accuracy and efficiency at the same time.
HMM builds on the stochastic processes of observation and state transition. In the map matching context, two probabilities are important: the observation probability and the transition probability. The observation probability is usually obtained from a Gaussian distribution of the great-circle distance between GPS points and candidate points. In the literature, several factors have been considered in calculating the observation probability. For instance, unsupervised HMM [25] considers the location of the antenna when matching mobile phone data. Other studies, e.g., Quick Matching [26], Multistage Matching [27], and SnapNet [29], consider more factors, including the speed constraint, road level, and vehicle heading. With regard to the transition probability, factors such as the speed constraint and free-flow travel time are considered in several studies to capture the temporal relationship between points [23,25,28,30,31]. To capture the spatial relationship, factors such as the difference between great-circle distance and route distance [24,28,29,31], the difference between the vehicle's heading change and the road segments' heading change [27], and same road priority [29] are included. Based on an analysis of the advantages of each algorithm, this study is a pioneering endeavour devoted to comprehensively considering various factors in online map matching, i.e., road level, driver's travel preference, vehicle heading, and network topology (same/adjacent road priority).
To the best of our knowledge, almost all map matching algorithms based on HMM adopt the first-order HMM. The basic hypothesis of the first-order HMM is that the observation probability is only related to the current state, while the transition probability is only related to the previous state. Because the movement of a vehicle is a continuous process, there is a complex space-time relationship between the current state and the previous states. There is no doubt that the first-order HMM oversimplifies several practical systems. Recently, Salnikov et al. [32] explored possibilities to enrich the system description and exploited empirical pathway information by means of second-order Markov models. Their experiments show that the higher-order model is more effective than the first-order model in dealing with the space-time continuum. Therefore, there is likely a need to solve the map matching problem using a higher-order (e.g., second-order) HMM to achieve better map matching results.
Along the line of previous online studies, this study proposes a new map matching algorithm based on the HMM technique.
The proposed algorithm extends the previous studies in the following aspects. Firstly, the proposed map matching algorithm is based on the second-order HMM, which can better consider the space-time relationship among different states. It can be effectively applied to complex urban road networks with parallel segments using low-frequency GPS data. Secondly, the proposed algorithm comprehensively considers the driver's travel preference towards road segments, road level, vehicle heading, and network topology when calculating the probability matrix of the second-order HMM in order to improve the matching accuracy. Thirdly, the proposed algorithm introduces a self-adaptive sliding window mechanism. Compared to the conventional fixed window size mechanism, the self-adaptive window size can significantly improve the map matching accuracy while keeping a reasonable computational performance.
In summary, the contributions of this work are threefold:
(i) An online map matching algorithm based on the second-order hidden Markov model (HMM) is proposed, which can better consider the spatial-temporal relationship among different states and larger perception fields.
(ii) The proposed algorithm comprehensively considers the driver's travel preference, road level, vehicle heading, and network topology when calculating the probability matrix of the second-order HMM to improve the matching accuracy.
(iii) Experiments on a real-world dataset show that, with the help of the self-adaptive sliding window mechanism and an extended Viterbi algorithm, our second-order HMM-based model can reach a high accuracy while ensuring efficiency.
The rest of this paper is organized as follows: in the next section, we state the problem of map matching. After the problem statement, an online map matching algorithm is proposed based on the second-order HMM. A case study is carried out using a large taxi trajectory dataset in Nanjing, China, to test the validity of the algorithm under various road conditions. Finally, we conclude this study and discuss directions for further research.

Problem Statement
Vehicle trajectory data are a series of GPS points recorded in chronological order. Each GPS point indicates longitude and latitude, vehicle speed, timestamp, etc. Because errors in data collected by GPS equipment are inevitable, map matching is a key process before using vehicle trajectory data. It is the process of matching GPS data onto road segments and obtaining the continuous and specific locations of vehicles on the road. The concepts used in this study are listed as follows:
GPS Point. A GPS point $g_t$ is a record indicating the longitude, latitude, timestamp, and velocity of the vehicle.
GPS Trajectory. A GPS trajectory $T$ is a series of GPS points, shown as $g_1 \to g_2 \to \cdots \to g_n$.
Road Network. The road network $G(V, E)$ is a directed graph, where $V$ is the set of vertexes and $E$ is the set of edges.
Road Segment. A road segment $e$ is a directed edge in the road network with a length, road level, start vertex, and end vertex.
Candidate Point. The candidate point $c_t^n$ is the $n$th candidate point matched with GPS point $g_t$ on the road network.
Route. A route $R$ is a sequence of road segments that best matches a GPS trajectory $T$; each road segment belongs to the edge set $E$ of the road network $G(V, E)$. $R$ is shown as $e_1 \to e_2 \to \cdots \to e_n$.
With the above concepts, the map matching problem solved in this study can be defined as follows: find the candidate points $c_t^1, c_t^2, \ldots, c_t^n$ on each road segment $e$ corresponding to GPS point $g_t$. Select the most likely sequence of candidate points for the GPS trajectory $T$, and connect the matched road segments on network $G$ to get route $R$.

Data Preprocessing.
Generally, there are a lot of "redundancy" and "incompleteness" in floating vehicle GPS data, which may be caused by devices or road environments (e.g., stopping in or passing through tunnels). In order to ensure the efficiency and accuracy of map matching, we first need to preprocess the GPS data, including the removal of redundant data and the interpolation of missing data.
For the currently received data point $g_t$, calculate the great-circle distance [24] between $g_t$ and $g_{t-1}$ (denoted as $D_{t-1,t}$). If $D_{t-1,t}$ is less than a predefined lower bound, the current point $g_t$ is omitted and not matched. If $D_{t-1,t}$ is greater than an upper bound, the two points are interpolated linearly.
With the data preprocessing, the redundant GPS data points can be effectively eliminated to avoid unnecessary matching. At the same time, interpolation of two points with too large intervals helps to process low-frequency GPS data.
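The preprocessing rule above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in the case study: the haversine formula stands in for the great-circle distance, and the function names, the Earth radius constant, and the bounds `lower_m` and `upper_m` are hypothetical values chosen only for the example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # assumed mean Earth radius in metres


def great_circle_m(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def preprocess(prev, cur, lower_m=5.0, upper_m=500.0):
    """Return the list of points to match for the newly received point `cur`.

    prev, cur: (lat, lon, timestamp) tuples.
    lower_m, upper_m: hypothetical bounds, not values from the paper.
    """
    d = great_circle_m(prev[0], prev[1], cur[0], cur[1])
    if d < lower_m:
        return []  # redundant point: omitted and not matched
    if d > upper_m:
        n = math.ceil(d / upper_m)  # number of sub-intervals after interpolation
        pts = []
        for k in range(1, n):
            f = k / n
            # linear interpolation of latitude, longitude, and timestamp
            pts.append(tuple(p + f * (c - p) for p, c in zip(prev, cur)))
        pts.append(cur)
        return pts
    return [cur]
```

Interpolating directly in latitude and longitude is a reasonable approximation only over the short distances between consecutive GPS points.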

Candidate Point Selection.
For the currently received data point $g_t$, we search for its candidate points (refer to Figure 1(a)) with the following steps:
Step 1: using the R-tree index, the road segments within a predefined error circle or nearest to the point $g_t$ are selected as road segment candidates [13,17].
Step 2: project the point $g_t$ perpendicularly onto the candidate road segments; the projection point $c_t^i$ is a candidate point for $g_t$. If the projection point falls outside the segment, choose the closer vertex of the segment as $c_t^i$. As shown in Figure 1(a), the candidate points for $g_t$ are $c_t^1, c_t^2, \ldots, c_t^5$. The distances from $g_t$ to the candidate points are denoted as $d_t^1, d_t^2, \ldots, d_t^5$.
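Step 2 can be illustrated with a short Python sketch. It assumes that the GPS point and the segment vertexes have already been converted to a planar coordinate system in metres, and the function name `project_to_segment` is hypothetical. The R-tree search of Step 1 is not shown.

```python
def project_to_segment(p, a, b):
    """Project point p onto segment a-b; fall back to the closer vertex if outside.

    p, a, b are (x, y) tuples in planar coordinates (an assumption of this sketch).
    """
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0:
        return a  # degenerate segment: both vertexes coincide
    # t is the normalised position of the perpendicular foot along a-b
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len2
    # clamping t to [0, 1] selects the closer vertex when the foot is outside
    t = max(0.0, min(1.0, t))
    return (ax + t * dx, ay + t * dy)
```

For example, projecting `(5, 3)` onto the segment from `(0, 0)` to `(10, 0)` gives the foot `(5.0, 0.0)`, while `(12, 4)` falls beyond the end of the segment and is assigned the vertex `(10.0, 0.0)`.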

Observation Probability.
In the first-order HMM, the observation probability measures the probability of obtaining a given observed value in a hidden state [33]. Map matching algorithms based on HMM usually regard the GPS point $g_t$ as the observed value of state $t$ and the actual position of $g_t$ as the hidden value of state $t$. The observation probability is modeled using a Gaussian distribution for GPS trajectories. The first-order HMM observation probability in this paper is given by equation (1), where $P(g_t \mid c_t^i)$ is the observation probability of the candidate point $c_t^i$ on $g_t$, $d_t^i$ is the great-circle distance between $g_t$ and the candidate point $c_t^i$, and $\sigma_t$ is the standard deviation of a Gaussian random variable, which corresponds to the average great-circle distance between $g_t$ and its candidate points. $\tau$ is a weight on vehicle heading, which is related through equation (2) to the road direction angle $\alpha_{road}$ and the trajectory direction angle $\alpha_{GPS}$. In equation (2), the road direction angle $\alpha_{road}$ is the direction angle between the two vertexes of a segment. The trajectory direction angle $\alpha_{GPS}$ is the direction angle from the last GPS point to the current GPS point. Because roads are bidirectional, there are two results for $|\alpha_{road} - \alpha_{GPS}|$, and the smaller of the two should be used. $\upsilon$ is a parameter that can be estimated from real data.
$\rho$ is a weight reflecting the effect of the road, including the road level (denoted as rlevel) and the driver's travel preference for the road segment (denoted as plevel); it is given by equation (3), where $\mu$ is a parameter to be estimated. In this study, rlevel is within [0, 5], and a high rlevel indicates a high road level. The value of plevel also ranges from 0 to 5. Treating the driver's travel experience as a sigmoid curve [34], plevel can be derived by equation (4), where $\varpi$ is the actual number of times drivers pass the road segment in a certain time period and $\varpi'$ is a predefined expected number. In this way, the observation probability can be obtained. By using the vehicle heading weight $\tau$ and the road weight $\rho$, we can consider the road level, the driver's travel preference, and the heading of the floating vehicle at that time, which are significant in online map matching with limited information.
Take Figure 1(b) as an example to illustrate the merit of the road weight $\rho$. The current GPS point $g_t$ is located in the middle of two parallel road segments, and the distances from $g_t$ to $c_t^1$ and $c_t^2$ are the same. In conventional map matching methods, $c_t^1$ or $c_t^2$ is selected randomly as the real position of the vehicle. However, if road level and travel preference are taken into account using our proposed method, we can consider $c_t^1$ as the real position of the vehicle. Without subsequent GPS points, we must make full use of the information provided by the existing GPS points and the road network in order to improve the matching accuracy.
Figure 1(c) shows the merit of incorporating the vehicle heading weight $\tau$. The GPS point $g_{t+1}$ is located near the intersection, close to the candidate points $c_{t+1}^1$ and $c_{t+1}^2$, and the distance $d_{t+1}^1$ is the same as $d_{t+1}^2$. Connecting $g_t$ and $g_{t+1}$, the vehicle heading weights between the connecting line and the two segments are $\tau_1$ and $\tau_2$. Considering the impact of the vehicle heading weight, $c_{t+1}^2$ has a greater observation probability, and we can suppose that $c_{t+1}^2$ is the real position of the vehicle at time $t + 1$.
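The angle comparison that feeds the vehicle heading weight can be sketched as follows. This is only one plausible reading of the rule that the smaller of the two results of $|\alpha_{road} - \alpha_{GPS}|$ is used for a bidirectional road; it does not compute $\tau$ itself. The function names, the planar coordinates, and the convention of measuring angles clockwise from north are assumptions of this sketch.

```python
import math


def bearing_deg(x1, y1, x2, y2):
    """Direction angle in degrees within [0, 360), clockwise from north (assumed)."""
    return math.degrees(math.atan2(x2 - x1, y2 - y1)) % 360.0


def heading_difference_deg(alpha_road, alpha_gps):
    """Smaller of the two angle differences for a bidirectional road, in [0, 90]."""
    # Reversing the road direction changes the difference d into 180 - d,
    # so the difference is first reduced modulo 180 degrees.
    diff = abs(alpha_road - alpha_gps) % 180.0
    return min(diff, 180.0 - diff)
```

With this definition, a vehicle travelling against the digitised direction of a two-way segment (for instance `alpha_road = 0` and `alpha_gps = 180`) yields a difference of 0, so the segment is not penalised for its stored direction.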

Transition Probability.
In the first-order HMM, the transition probability measures the transition from one hidden state to another [33]. Map matching algorithms based on HMM use the transition probability to measure the probability of moving from a candidate point $c_{t-1}^i$ at time $t-1$ to a candidate point $c_t^j$ at time $t$ [29]. The transition probability of the first-order HMM in this paper is given by equation (5), a piecewise function built on the exponential term $e^{-s_t/\beta}$ whose value depends on whether $c_t^j$ and $c_{t-1}^i$ are on the same/adjacent road segments. With equation (5), we can get the transition probability with explicit consideration of network topology (i.e., considering whether $c_{t-1}^i$ and $c_t^j$ are on the same or adjacent road segments). In this way, the topological relation of road segments is taken into account. $\beta$ is the mean of $s_t$, and $s_t$ is the difference between the great-circle distance from $g_{t-1}$ to $g_t$ (denoted as $\mathrm{dist}(g_{t-1}, g_t)$) and the route length from $c_{t-1}^i$ to $c_t^j$ (denoted as $\mathrm{routeDist}(c_{t-1}^i, c_t^j)$), as given in equation (6).

Self-Adaptive Sliding Window and Second-Order Probability.
Existing first-order HMM online map matching algorithms usually focus on one single GPS point, considering only its local geometric relation and road topology, which leaves the precision of such online map matching algorithms far behind that of the second-order map matching algorithm.
Figure 1(d) shows an example in which the conventional first-order HMM online map matching results in an incorrect match. From GPS point $g_t$ to $g_{t+2}$, the vehicle does not turn, and the correct matching path should be $c_t \to c_{t+1}^2 \to c_{t+2}$. However, in the process of first-order HMM online incremental matching, the incorrect matching result $c_t \to c_{t+1}^1 \to c_{t+2}$ is obtained. The reason for this error is that the first-order HMM only considers the observation probability of a single point and the transition probability between two points. However, the transition probability should be measured on a larger scale. The real location of the current GPS point is related not just to the previous point but to multiple previous points.
The higher-order HMM is an extension of the first-order HMM [35]. The basic assumption of the higher-order HMM is that the current state is related not only to one previous state but also to multiple previous states. In some cases, such as natural language processing and speech recognition, the second-order HMM is more consistent with the real situation [36,37]. For the map matching problem, because the vehicle movement is continuous, the real position of the current point is related not only to the previous point but also to the trajectory formed by two or more points. Therefore, the higher-order HMM is somewhat more suitable for map matching than the traditional first-order HMM. Analogous to human eyes observing things, we should first pay attention to the characteristics of things as a whole. For example, in Figure 1(d), the connection from $g_t$ to $g_{t+2}$ is approximately a straight line, so the GPS point $g_{t+1}$ is more likely to be matched to $c_{t+1}^2$ than to $c_{t+1}^1$.
To overcome the matching errors that may result from the first-order HMM and to improve the accuracy of online map matching, in this study, we extend the first-order HMM map matching to a second-order one. Compared to the first-order HMM, the difficulties in using the second-order HMM lie in the design of the probability matrix and in improving the computational efficiency.
In applications such as real-time navigation and travel time estimation, online map matching is necessary. Existing HMM map matching algorithms usually use a sliding window to realize online matching. Denote the sliding window size as $w$ (i.e., the number of GPS points). If the window overflows after the current point $g_t$ enters the window, the first point in the window, $g_{t-w}$, is removed, and the matching result of $g_{t-w}$ is finally determined. As new points continue to join, the matching results within the window may change continuously. The introduction of the sliding window makes online map matching possible, but it is difficult to determine the window size $w$. If $w$ is too large, the matching speed will be too slow to meet the real-time performance requirement. If $w$ is too small, the matching accuracy will be compromised. To solve this problem, a self-adaptive sliding window is proposed in this study.
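The overflow rule of a fixed-size sliding window can be sketched with a double-ended queue. The class name `SlidingWindow` and its interface are hypothetical and serve only to illustrate when the match of the oldest point becomes final.

```python
from collections import deque


class SlidingWindow:
    """Window of at most w GPS points; the oldest point is finalized on overflow."""

    def __init__(self, w):
        self.w = w
        self.points = deque()

    def push(self, point):
        """Add a point; return the point whose match is now final, or None."""
        self.points.append(point)
        if len(self.points) > self.w:
            # the window overflows: the first point leaves and its match is fixed
            return self.points.popleft()
        return None
```

With `w = 3`, pushing a fourth point returns the first one, whose matching result can then be emitted, while the three points still in the window remain open to revision.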
In this study, we consider different sizes of the self-adaptive sliding window. By calculating the average GPS positioning error in the current window, sliding windows of different sizes are automatically selected to adapt to the current GPS positioning error, which improves the accuracy of online map matching as much as possible. The average GPS positioning error (denoted as $E_{ave}$) is obtained by equation (7), where $c_n$ is the candidate point matched to $g_n$.
The observation probability of the second-order HMM, $P(g_{t-1}, g_t \mid c_{t-1}^i, c_t^j)$, can be obtained from the first-order HMM as the product of the first-order observation probabilities of the two consecutive candidates and the first-order transition probability between them:
$$P(g_{t-1}, g_t \mid c_{t-1}^i, c_t^j) = P(g_{t-1} \mid c_{t-1}^i)\, P(g_t \mid c_t^j)\, P(c_t^j \mid c_{t-1}^i). \qquad (8)$$
The second-order HMM state transition probability (denoted as $P(c_t^i \mid c_{t-2}^j, c_{t-1}^k)$) is defined by equation (9), where $\lambda$ is the mean of $k_t$. $k_t$, given by equation (10), is the difference between the great-circle distance from $g_{t-1}$ to $g_{t+1}$ and the route length from $c_{t-1}^i$ to $c_{t+1}^j$.
The second-order transition probability describes the state transition between three consecutive candidate points; that is, the actual position of the current GPS point is related to the previous two points. In this way, the strong assumption of the first-order HMM is relaxed and the accuracy of map matching is improved. In fact, the proposed method can be extended further to the third-order HMM by defining appropriate observation and transition probabilities to improve accuracy. However, the third-order HMM would make the calculation process more complicated, which is not conducive to online map matching.
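The selection of a window size from the average positioning error can be sketched as a simple threshold lookup. The thresholds below, and the rule that a larger error selects a larger window, are assumptions made for illustration and are not values from this study; the candidate sizes 3, 4, and 5 are the ones reported in the case study. The input distances stand for the distances between each GPS point $g_n$ in the window and its matched candidate $c_n$.

```python
def average_error(distances_m):
    """Mean distance (in metres) between each GPS point and its matched candidate."""
    return sum(distances_m) / len(distances_m)


def choose_window_size(e_ave_m, thresholds_m=(20.0, 40.0), sizes=(3, 4, 5)):
    """Pick a window size for the given average positioning error.

    thresholds_m are hypothetical cut-offs; larger errors map to larger
    windows, which is an assumption of this sketch.
    """
    for bound, size in zip(thresholds_m, sizes):
        if e_ave_m <= bound:
            return size
    return sizes[-1]
```

A lookup of this kind costs almost nothing per point, which is consistent with the aim of adapting the window without a large amount of extra computation.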

Extended Viterbi Algorithm.
In the previous sections, we introduced the second-order HMM to solve the map matching problem. Although we use the sliding window mechanism to reduce the computational complexity of matching a single GPS point, the complexity of exhaustively traversing the second-order HMM is still $O(n^w)$. Such a traversal search seriously degrades the online performance of the matching algorithm.
Thus, a dynamic programming algorithm should be used to reduce the complexity. The objective function of second-order HMM dynamic programming is defined as
$$\max \prod_{n=t-w+3}^{t} P(c_n^i \mid c_{n-2}^j, c_{n-1}^k) \times P(g_{n-2}, g_{n-1} \mid c_{n-2}^j, c_{n-1}^k). \qquad (11)$$
The Viterbi algorithm is an efficient dynamic programming algorithm, which can effectively avoid repeated searches of paths and quickly reach the optimal solution. It is widely used to solve the first-order HMM. For solving the second-order HMM with a complexity of $O(n^2)$, we extend the traditional Viterbi algorithm [38] using an order reduction process as follows:
Step 1: order reduction. In the second-order HMM, $P(g_{t-1}, g_t \mid c_{t-1}^i, c_t^j)$ is regarded as the observation probability, which is equivalent to the observation probability of a single candidate point in the first-order HMM. Equation (8) shows that the observation probability of the second-order HMM is the product of the observation probabilities of two consecutive candidates in the first-order HMM and the state transition probability. Thus, the order of the second-order HMM can be reduced by using equation (8) (refer to Figure 2). If the second-order HMM has two layers with $m$ and $n$ nodes, respectively, it can be reduced to one layer with $m \times n$ nodes.
Step 2: recursive tracing. After Step 1, we can use the traditional Viterbi algorithm for iterative calculation to solve the second-order HMM in the following process (refer to Figure 2):
a. Starting from the first layer's nodes, calculate the observation probability of each layer's nodes after reduction and the transition probability between the nodes of two adjacent layers.
b. Calculate the maximum total probability of each node from the second layer to the last layer. Save the maximum total probability and the precursor node of each node.
c. Select the node with the highest total probability in the last layer, and trace back through its precursor nodes until the first layer.
With the above steps, we can find the optimal matching path $(c_{t-w+1}^i, c_{t-w+2}^j, \ldots, c_t^k)$ in the sliding window.
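The two steps can be sketched in Python as follows. This is a minimal illustration of order reduction followed by an ordinary Viterbi recursion, not the exact implementation evaluated in the case study. The function name `second_order_viterbi` and the callables `obs1`, `trans1`, and `trans2` are hypothetical placeholders for the first-order observation probability, the first-order transition probability, and the second-order transition probability; candidates are identified by their index within each layer.

```python
def second_order_viterbi(n_cands, obs1, trans1, trans2):
    """Most likely candidate index per layer for a window of T >= 2 layers.

    n_cands[t]         : number of candidates in layer t
    obs1(t, i)         : first-order observation probability of candidate i at t
    trans1(t, i, j)    : first-order transition, candidate i at t-1 to j at t
    trans2(t, i, j, k) : second-order transition, (i at t-2, j at t-1) to k at t
    """
    T = len(n_cands)

    def pair_obs(t, i, j):
        # Step 1 (order reduction): the pair (i at t-1, j at t) becomes one node
        # whose observation probability is the product described for equation (8).
        return obs1(t - 1, i) * obs1(t, j) * trans1(t, i, j)

    # score[(i, j)] = best probability of any path ending in the pair (i, j)
    score = {(i, j): pair_obs(1, i, j)
             for i in range(n_cands[0]) for j in range(n_cands[1])}
    back = []  # one dict per step: (j, k) -> best predecessor index i

    # Step 2 (recursive tracing): ordinary Viterbi over the reduced nodes.
    # Two reduced nodes are connected only if they share the middle candidate j.
    for t in range(2, T):
        new_score, pointers = {}, {}
        for j in range(n_cands[t - 1]):
            for k in range(n_cands[t]):
                best_i, best_p = None, -1.0
                for i in range(n_cands[t - 2]):
                    p = score[(i, j)] * trans2(t, i, j, k)
                    if p > best_p:
                        best_i, best_p = i, p
                new_score[(j, k)] = best_p * pair_obs(t, j, k)
                pointers[(j, k)] = best_i
        score = new_score
        back.append(pointers)

    # Select the best final pair and trace back through the saved precursors.
    j, k = max(score, key=score.get)
    path = [j, k]
    for pointers in reversed(back):
        path.insert(0, pointers[(path[0], path[1])])
    return path
```

In a situation like Figure 1(d), where the first-order probabilities slightly favor the wrong middle candidate, a second-order transition probability that rewards the straight path can override that preference. Production code would normally work with log-probabilities to avoid numerical underflow over long windows.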

Case Study
In this section, we make sensitivity analyses of the parameters involved in the algorithm, and use real data to show the merits of the proposed second-order HMM map matching algorithm.

Data Preparation and Evaluation Metric.
We used the road network data of Qinhuai District in Nanjing, China, including 6901 sections and 4647 nodes. Taxi GPS data with a 30 s sampling interval collected in September 2016 were used, including 500 trajectories for 20 taxis. We manually matched these trajectories to the road network as the ground truth. In order to verify the effectiveness of the algorithm under extreme conditions and reflect the advantages of the proposed algorithm, we resampled the original data and added random noise with a Gaussian distribution. The resampling intervals are 60 s to 300 s. Gaussian noise with a standard deviation of 10 m to 80 m (converted to degrees) was added to the longitude and latitude.
The evaluation metric is defined as follows: first, we find the common matching sequence $X$ (the sequence that is matched correctly) between the matched output route $M$ and the real trajectory $T$. Based on this sequence, the precision and the recall of the map matching result (denoted as pcs and rc, respectively) can be calculated as
$$\mathrm{pcs} = \frac{X}{M}, \qquad (12)$$
$$\mathrm{rc} = \frac{X}{T}, \qquad (13)$$
where pcs is defined as the ratio of the length of the matched sequence $X$ to the total length of the matched trajectory $M$, and rc is defined as the ratio of the length of the matched sequence $X$ to the total length of the real trajectory $T$. In this study, the $F_1$-score, which is widely used to evaluate the performance of classification models and prediction models [39], is adopted to evaluate the proposed model:
$$F_1 = \frac{2 \times \mathrm{pcs} \times \mathrm{rc}}{\mathrm{pcs} + \mathrm{rc}}. \qquad (14)$$
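The metric can be computed directly from the three lengths. The sketch below assumes that the lengths of $X$, $M$, and $T$ have already been measured, uses the standard definition of the $F_1$-score as the harmonic mean of precision and recall, and introduces the hypothetical function name `f1_score` for illustration.

```python
def f1_score(len_x, len_m, len_t):
    """Return (precision, recall, F1) from sequence lengths.

    len_x: length of the correctly matched sequence X
    len_m: total length of the matched output route M
    len_t: total length of the real (ground-truth) trajectory T
    """
    pcs = len_x / len_m
    rc = len_x / len_t
    if pcs + rc == 0:
        return pcs, rc, 0.0  # nothing matched correctly
    return pcs, rc, 2 * pcs * rc / (pcs + rc)
```

For instance, with a correctly matched length of 8, a matched route of length 10, and a real trajectory of length 16, the precision is 0.8, the recall is 0.5, and the $F_1$-score is about 0.615.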

Results
The effects of different parameters on map matching accuracy are investigated in this study. In the proposed model, there are three parameters to be estimated, i.e., $\mu$, $\upsilon$, and $p_{same}$. According to previous studies, the approximate range of the three parameters can be obtained. Figure 3 shows the impact of different parameter values on the $F_1$-score, and Table 2 shows the optimal parameter values. It can be seen that when the road weight $\mu$ is around 0.02, the vehicle heading weight $\upsilon$ is around 0.6, and the same/adjacent road priority $p_{same}$ is around 0.6, the final performance becomes optimal and stable.
Journal of Advanced Transportation
[Figure label: Order reduction]
Figure 4(a) shows the effect of the window size $w$ on the accuracy of map matching. It can be seen that when $w = 3$, the value of the $F_1$-score increases significantly. The reason is that when the size of the sliding window is larger than 3, the second-order HMM comes into play. Under different standard deviations of noise (SDNs), when the sliding window size increases from 3 to 10, the matching accuracy remains unchanged. However, as the sliding window's size increases, the computation time of matching a single GPS point increases rapidly. Thus, the optimal self-adaptive sliding window sizes are 3, 4, and 5.
Figure 4(b) shows the effects of the sampling interval and the random SDN on the accuracy of map matching. With the increase in the sampling interval and SDN, the $F_1$-score decreases. It can be seen from Figure 4(b) that when the sampling interval is between 30 s and 90 s and the SDN ranges from 0 to 30 m, the $F_1$-score is kept above 0.9.
With the map matching algorithm proposed in this paper, various factors (i.e., road level, driver's travel preference, vehicle heading, and network topology) are considered. Figure 5 shows some map matching results in a complex urban road network environment. From Figure 5(a), it can be seen that the first-order HMM map matching algorithm may produce mismatches when it deals with parallel road segments. Under the constraints of topological relations, the second-order HMM algorithm gives a greater transition probability to the segment that is adjacent to the previous segment, which effectively reduces errors. When the GPS points are located near a road intersection, the first-order HMM algorithm may match the GPS points to the section that intersects with the current road. The second-order HMM and the sliding window can help solve this problem.
The second-order transition probability can effectively avoid detours of the matched trajectory at intersections and improve the accuracy of map matching. Figure 5(b) shows an overview of the map matching result in the central area of Nanjing, where the road network is dense and complex. The proposed algorithm is found to perform well on parallel segments and intersections. This is because the second-order HMM has a wider field of view and our method considers a variety of factors, which helps map matching in complex conditions.
Figure 6(a) compares the accuracy of the proposed second-order HMM map matching algorithm with the accuracy of our baseline (the first-order HMM map matching algorithm) at different sampling intervals without adding random noise. It can be seen that the $F_1$-score of the proposed algorithm is higher than that of the first-order HMM. As the sampling interval increases, the advantages of the proposed algorithm become more obvious. Taking the 300 s sampling interval as an example, the distance between two consecutive GPS points is about 2500 m at an average speed of 30 km/h on urban roads (30 km/h is about 8.33 m/s, and 8.33 m/s × 300 s ≈ 2500 m). In this situation, the position correlation between two consecutive GPS points is very low. The traditional first-order HMM algorithm only considers the transition probability between two points, so the error tends to be very large. Our proposed algorithm integrates several factors such as road level and driver's travel preference, and the second-order transition probability can match the GPS trajectory on a larger scale, so it shows higher accuracy (the $F_1$-score is about 0.67).
Figure 6(b) compares the accuracy of the proposed second-order HMM map matching algorithm with our baseline (the first-order HMM algorithm) at different SDNs with a 30 s sampling interval. The map matching accuracy of the proposed algorithm is always higher than that of the first-order algorithm. The reason is that the conventional first-order HMM algorithm only considers the difference between the great-circle distance and the route distance when calculating the observation probability of candidate points. When the positioning error of GPS points increases and the road network is dense, matching errors are numerous. In practice, the GPS positioning error is significant in city centres with dense high-rise buildings. As the proposed second-order HMM algorithm outperforms conventional algorithms in accuracy (0.6 compared to 0.5 when the SDN equals 80 m), the proposed algorithm can be adopted to achieve high map matching accuracy across the whole city. When compared on raw GPS data with the state-of-the-art methods most relevant to our proposed method, the results in Table 3 show that our second-order HMM method performs well with regard to accuracy.
Figure 7 compares the efficiency of the proposed second-order HMM map matching algorithm with that of the conventional first-order HMM algorithm. For the first-order HMM algorithm, the sliding window size is set to 5. It can be seen from Figure 7 that the computation time at each point using the second-order HMM algorithm is slightly longer than when using the first-order HMM algorithm, and the average computation time is less than 1 s. In the process of self-adaptation of the sliding window size, a small number of outliers appear. For example, using the second-order HMM algorithm, there are a few points whose computation time is longer than 2 s. However, in this example, the overall matching efficiency is close to that of first-order HMM map matching, which can meet the requirements of online map matching. Moreover, compared to the first-order HMM, the second-order HMM can better consider the spatial-temporal relationship among different states and larger perception fields, which yields high accuracy under complex conditions.

Conclusions
Accurate and efficient matching of GPS data onto the road network is the basis and prerequisite for conducting traffic flow analysis and providing location-based services. An online map matching algorithm based on the second-order HMM is presented in this paper. Various factors (i.e., road level, driver's travel preference, vehicle heading, and network topology) are explicitly considered in the algorithm, which effectively improves the accuracy of map matching in complex urban road network environments. An extended Viterbi algorithm is adopted to solve the map matching problem efficiently. A self-adaptive sliding window mechanism is proposed that adjusts the window size in real time and ensures high accuracy.
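To illustrate what extending Viterbi decoding to a second-order model involves, the following is a minimal generic sketch, not the exact algorithm of this paper. It assumes a fixed set of N states at every time step (in map matching, the candidate road segments usually differ per GPS point), and all names (`viterbi_second_order`, `log_pi`, `log_a1`, `log_a2`, `log_b`) are hypothetical. The sliding window, road level, heading, and travel preference terms are omitted.

```python
def viterbi_second_order(log_pi, log_a1, log_a2, log_b):
    """Most likely state sequence under a second-order HMM (generic sketch).

    log_pi[j]        : log initial probability of state j
    log_a1[j][k]     : log first-order transition, used only for step 0 -> 1
    log_a2[i][j][k]  : log P(s_t = k | s_{t-2} = i, s_{t-1} = j)
    log_b[t][k]      : log observation probability of state k at time t
    """
    T, N = len(log_b), len(log_pi)
    if T == 1:
        return [max(range(N), key=lambda k: log_pi[k] + log_b[0][k])]

    # delta[j][k]: best log-score of any path ending in the state pair
    # (j at time t-1, k at time t). Initialised for t = 1.
    delta = [[log_pi[j] + log_b[0][j] + log_a1[j][k] + log_b[1][k]
              for k in range(N)] for j in range(N)]
    back = []  # back[t-2][j][k]: best state at time t-2 given the pair (j, k)

    for t in range(2, T):
        new_delta = [[0.0] * N for _ in range(N)]
        bp = [[0] * N for _ in range(N)]
        for j in range(N):
            for k in range(N):
                # Maximise over the state i two steps back.
                best_i = max(range(N),
                             key=lambda i: delta[i][j] + log_a2[i][j][k])
                bp[j][k] = best_i
                new_delta[j][k] = (delta[best_i][j] + log_a2[best_i][j][k]
                                   + log_b[t][k])
        delta = new_delta
        back.append(bp)

    # Pick the best final pair, then follow the back-pointers.
    j, k = max(((j, k) for j in range(N) for k in range(N)),
               key=lambda jk: delta[jk[0]][jk[1]])
    path = [k, j]  # built in reverse order
    for bp in reversed(back):
        i = bp[j][k]
        path.append(i)
        j, k = i, j
    path.reverse()
    return path
```

The key difference from first-order Viterbi is that the dynamic-programming table is indexed by pairs of consecutive states, so each step costs on the order of N^3 operations instead of N^2. This is one reason a bounded window over recent points matters for online use.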
We tested the proposed algorithm using a real road network and massive taxi GPS data collected in Nanjing, China. The proposed map matching approach was found to outperform state-of-the-art algorithms built on the first-order HMM in various testing environments. A sliding window with self-adaptive size is shown to be an effective method for online incremental map matching. Some typical types of mismatching can be avoided in complex urban road network environments such as parallel road segments and various road intersections. The map matching accuracy of the proposed algorithm is demonstrated to be higher than that of the conventional first-order HMM algorithm. The efficiency of the proposed algorithm is close to that of the first-order HMM map matching algorithm, which can meet the requirements of online map matching. Therefore, the proposed algorithm is applicable to real-time navigation, trajectory monitoring, traffic flow analysis, and other related fields.
There are other approaches to the map matching problem, such as considering driving direction and turning behaviour. Accounting for users with heterogeneous activity/travel behaviour is suggested as another interesting extension of the proposed method, potentially improving the accuracy of map matching [31,40]. In the case study, the proposed algorithm was tested using a single processor. How to incorporate parallel computing technologies into the proposed algorithm to handle a large number of trajectories needs further investigation [41]. In addition, a comparison of the advantages and disadvantages of the second-order-HMM-based method and other advanced map matching algorithms could also be a focus of future research.

Data Availability
The GPS data used to support the findings of this study have not been made available because of the confidentiality agreement.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Acknowledgments
The work described in this paper was jointly supported by the National Key Research and Development Program of China (2018YFB1600900), National Natural Science

Table 3: Comparison of the accuracy (F1-score) with some state-of-the-art methods.
Method | Accuracy
HMM-DPP [28] | 0.910
SnapNet [29] | 0.909
This study | 0.975