We present a driving route prediction method based on a Hidden Markov Model (HMM). The method can accurately predict a vehicle’s entire route as early in a trip’s lifetime as possible, without requiring origins and destinations to be input beforehand. First, we propose a route recommendation system architecture in which route prediction plays an important role. Second, we define a road network model, normalize each driving route in the rectangular coordinate system, and build the HMM in preparation for route prediction using a method of training set extension based on

Currently, many drivers use navigation software to find better driving routes. The main function of vehicle route recommendation in such software, for example, Google Maps and Baidu Maps, is to find several routes between a given origin and destination by combining path algorithms with historical traffic data. The driver then selects one of the recommended routes according to personal preference, driving distance, and current road congestion information. However, these methods for driving route recommendation have some problems. First, most people prefer routes with many smooth road segments, so roads that were relatively smooth become congested and roads that were congested become smooth. Second, once a route is selected, the software cannot promptly inform the driver to adjust the route according to real-time traffic congestion data as the trip progresses. Finally, most traffic route navigation software programs rely on historical data to predict traffic congestion [

In view of the above problems, we propose a driving route recommendation system and highlight a method for driving route prediction based on a Hidden Markov Model (HMM). Through route prediction, the method can predict which road segments will be congested or smooth. The system also updates near-future traffic information in real time and informs the driver to adjust the driving route as the trip progresses.

At present, several methods of route prediction have been suggested, but there remain some problems. Karbassi and Barth [

This paper is organized as follows. The next section describes the architecture of our route recommendation system and explains each module in the system. Section

The architecture of the driving route recommendation system consists of the following phases (see Figure

The architecture of route recommendation system.

This section details how to build a road network model in the rectangular coordinate system. The model strictly preserves the connection relationships between roads and should reflect the differences between roads as clearly as possible.

Assume that each road

In the rectangular coordinate system, the rules for constructing a road network model composed of different road segments are as follows:

If and only if

If and only if three different roads

Five roads intersect at a point.

Three different roads intersect at three points.

The length of each line segment is defined as follows: the length of the line segment

Therefore, as shown in Figure

An example of the road network model construction.

Suppose that the starting point of the vehicle route is

For example, the line connecting point

A path between points

The HMM must be trained from drivers’ past history. In particular, the larger the set of training examples, the more accurate the HMM is for path prediction. Because the given training examples are limited, the training set cannot contain all of the routes that drivers will take in the future. The paper therefore proposes a method of extending training examples based on

Analysis of the given training examples shows that the starting points and endpoints of vehicle routes are distributed across residential, commercial, and work areas. People usually travel from residential areas to work in the morning and then either return directly from work areas or first go to commercial areas and then go home. Vehicle routes are therefore regular to some extent, so that a path can be regarded as two return paths. It is also found that, when traffic reaches its peak, a driver will generally avoid congested roads and select the route with the shortest travel time to the destination. At other times, drivers select the route with the shortest distance to the destination to save costs. For a given starting point and endpoint, two kinds of routes can therefore be generated according to the time of day.

Last, it is not sure how many clusters the coordinate point set

The algorithm of extending training examples based on

Initialize coordinate point sets

Traverse a given training set

Use the

Traverse each cluster

Input: A training set

Output: The extending training set New


Since a driver’s just-driven path, represented by coordinate points, must be input into an HMM that then outputs the future entire paths, the sequence of coordinate points corresponding to the just-driven path can be regarded as an observation sequence, and the corresponding sequence composed of different route sets can be regarded as a hidden state sequence

In (

In addition, hidden states are more complex to define than observation states. First, assume that each hidden state is defined by

The algorithm for hidden state determinations is as follows (see Algorithm

Initialize a hidden state sequence set QS (Line 1).

Obtain a beginning point

Compute the cosine value of intersection angle between vectors

If

After all of the hidden state sequences have been calculated, insert each hidden state sequence

Input: A training set

Output: A hidden state sequence set QS.

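The hidden state determination above relies on the cosine of the angle between two vectors. As a minimal sketch of that single step, assuming vectors are formed from pairs of 2-D coordinate points (the helper names `vector` and `cosine_between` are hypothetical, and which points the algorithm compares is not reproduced in the text), the computation could look like this:

```python
import math


def vector(p, q):
    """Vector from point p to point q, both given as (x, y) tuples."""
    return (q[0] - p[0], q[1] - p[1])


def cosine_between(u, v):
    """Cosine of the angle between 2-D vectors u and v."""
    dot = u[0] * v[0] + u[1] * v[1]
    norm_u = math.hypot(u[0], u[1])
    norm_v = math.hypot(v[0], v[1])
    if norm_u == 0.0 or norm_v == 0.0:
        # The angle is undefined when either vector has zero length.
        raise ValueError("cosine is undefined for a zero-length vector")
    return dot / (norm_u * norm_v)
```

A value close to 1 means the two vectors point in nearly the same direction, and a value close to -1 means they point in opposite directions.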

After determining observation states and corresponding hidden states in the HMM for route predictions, our method uses the total training dataset Total

The following equation is used for the initial probability distribution:

where

The following equation is used for the hidden state transition matrix:

where

The following equation is used for the confusion matrix:

where

As described above, our method builds the HMM for vehicle route prediction. However, drivers tend to choose different routes between the same starting point and endpoint at different times of day. For example, during rush hour (7:00–9:00 and 17:00–19:00) people want to reach their destination as quickly as possible and try to avoid congested roads, whereas at other times they may choose the shortest route. Therefore, training examples can be classified according to the time of day: one group of training examples is from 7:00–9:00 and 17:00–19:00, and the other is from all other times. Section
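The estimation equations of this section are not reproduced in the text, so the following is only a minimal sketch. It assumes the standard relative-frequency estimates for the initial distribution, the transition matrix, and the confusion matrix, and it may differ from the formulas the method actually uses. All names (`is_rush_hour`, `estimate_hmm`, `split_by_time`) and the input layout are hypothetical. Hidden states and observations only need to be hashable, for example frozensets of routes and coordinate tuples.

```python
from collections import Counter, defaultdict


def is_rush_hour(hour):
    """True for the rush-hour windows named in the text: 7:00-9:00 and 17:00-19:00."""
    return 7 <= hour < 9 or 17 <= hour < 19


def _normalize(counter):
    total = sum(counter.values())
    return {key: count / total for key, count in counter.items()}


def estimate_hmm(examples):
    """Estimate (pi, A, B) by counting (assumed estimator, not taken from the paper).

    examples: list of (hidden_states, observations) pairs of equal length.
    """
    initial = Counter()
    transitions = defaultdict(Counter)
    emissions = defaultdict(Counter)
    for hidden, observed in examples:
        if not hidden or len(hidden) != len(observed):
            raise ValueError("each example needs aligned, non-empty sequences")
        initial[hidden[0]] += 1
        for prev, nxt in zip(hidden, hidden[1:]):
            transitions[prev][nxt] += 1
        for state, obs in zip(hidden, observed):
            emissions[state][obs] += 1
    pi = _normalize(initial)
    A = {state: _normalize(c) for state, c in transitions.items()}
    B = {state: _normalize(c) for state, c in emissions.items()}
    return pi, A, B


def split_by_time(trips):
    """Split (start_hour, hidden_states, observations) trips into two groups."""
    rush, other = [], []
    for hour, hidden, observed in trips:
        (rush if is_rush_hour(hour) else other).append((hidden, observed))
    return rush, other
```

Calling `estimate_hmm` once on each group returned by `split_by_time` yields one parameter set for rush hours and one for all other times.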

This section introduces how to predict upcoming routes from just-driven road segments. This problem corresponds to HMM decoding, that is, finding the hidden state sequence most likely to have produced a given observation sequence. Here, the Viterbi algorithm [

The process of driving route prediction.
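The decoding step can be illustrated with a generic log-space Viterbi implementation. This is a sketch of the textbook algorithm only, not of the route-specific handling described in this section. The function name and the dictionary layout of `pi`, `A`, and `B` are assumptions made for illustration.

```python
import math


def viterbi(observations, states, pi, A, B):
    """Most likely hidden state sequence for `observations`.

    pi[s], A[s][t], and B[s][o] are probabilities; missing entries count as 0.
    Returns None when no state sequence has nonzero probability.
    """
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    if not observations:
        return []
    # best[t][s]: log-probability of the best path that ends in state s at step t.
    best = [{s: log(pi.get(s, 0)) + log(B.get(s, {}).get(observations[0], 0))
             for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states,
                       key=lambda r: best[-1][r] + log(A.get(r, {}).get(s, 0)))
            scores[s] = (best[-1][prev] + log(A.get(prev, {}).get(s, 0))
                         + log(B.get(s, {}).get(obs, 0)))
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    last = max(states, key=lambda s: best[-1][s])
    if best[-1][last] == float("-inf"):
        return None
    # Follow the back-pointers from the final state to the first one.
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

The `None` return value marks the situation in which an observation never occurred in training, so every candidate sequence has zero probability.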

Some problems may arise when applying the Viterbi algorithm. The total training set, including the given and extended training examples, is still too limited to contain all possible upcoming vehicle routes. Assume that the upcoming route does not occur in the total training set, which means

Suppose that

Traverse the observation sequence

Define that

If

If

If the input observation sequence

Suppose that

Suppose that

Suppose that

In addition to the above cases, suppose that

Input: An observation sequence

Output: A set


The function Viterbi_Route is described as follows (see Algorithm

Use Viterbi algorithm to calculate the hidden state sequence

Define that the number of elements in the hidden state sequence

If

Calculate the intersection between

(1) Hidden state sequence

(2) int

(3) if (

(4)

(5) else

(6) for (int

(7) if (

(8)

(9) else

(10)

For example, if two hidden states are separately

Every vehicle should be equipped with a device for collecting vehicle route data. The data collectors use a mobile phone running the software Map Plus. We mainly use one of its functions, path tracking, to record the driven path. It runs in the background, so the user can run other apps or lock the device at the same time, and it can export or send tracked paths as KML files. However, continuous use of GPS in the background can dramatically decrease the battery life of the mobile phone, so the experiment also needs an external large-capacity battery to power the phone continuously. In addition, the researchers install Google Earth on a computer to display each collected vehicle route.

A total of 20 volunteers are selected to collect the experimental data. To facilitate communication with the volunteers, all of them are from our university, including 15 teachers and 5 students. After one month, we acquire a total of 1052 paths, of which 51 are different routes. Two trips count as the same path when the volunteer travels from the same starting point to the same endpoint through the same road segments. However, some problems inevitably arise during data collection.

In tunnels, underground parking lots, and areas dense with high-rise buildings, GPS noise causes parts of paths to be offset [

Volunteers forget to start the software for recording route data, so no route data are collected.

Volunteers forget to turn off the software when they reach the end of the trip, so the recorded path is concentrated in a relatively small area.

Whenever we come across the above problems when checking path data, we manually correct the GPS data. In this way, the influence of GPS noise and human factors is reduced to ensure the accuracy of the collected data.

In the actual process of collecting the GPS data, the collected data do not consist only of longitude and latitude. The GPS data of the starting point, the intermediate points, and the endpoint are also combined with road segments, so that a route is described as a path made up of its starting point, its endpoint, and the driven streets.
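One way to picture this route description is a small record holding the two endpoints and the ordered road segments. The sketch below is illustrative only: the `Route` type, its field names, and the choice to identify "the same path" by its ordered segment names (assuming the start and end lie on the first and last segment) are assumptions, not the data format used in the experiment.

```python
from collections import Counter
from typing import NamedTuple, Tuple


class Route(NamedTuple):
    start: Tuple[float, float]   # (longitude, latitude) of the starting point
    end: Tuple[float, float]     # (longitude, latitude) of the endpoint
    segments: Tuple[str, ...]    # ordered names of the driven road segments


def path_key(route):
    """Key under which two trips count as the same path (assumed: segment order)."""
    return route.segments


def count_paths(routes):
    """Number of recorded trips for each distinct path."""
    return Counter(path_key(route) for route in routes)
```

With this representation, `len(count_paths(routes))` gives the number of different routes in a collection of recorded trips, tolerating small GPS differences in the endpoints.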

To evaluate the performance of HMM-based route prediction, the metric we examine is the rate of correct predictions as a function of trip progress. Suppose that a vehicle has passed through

For example, assume that the total training examples are shown in (

In the experiment, all collected route examples come from Map Plus, where each route is stored in a .KML file composed of a series of GPS data points. The researchers check these data for a given time period in Google Earth. Following the road network model described earlier, routes represented by GPS data points can be converted into routes represented by coordinate points.

In addition, some extended training examples are introduced here. These examples are derived from the originally collected data through the method for enlarging the training set based on
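As a rough illustration of reading such a file, the sketch below extracts longitude and latitude pairs from the `<coordinates>` elements of a KML 2.2 document using the Python standard library. It assumes the track is stored as `lon,lat[,alt]` tuples inside `<coordinates>`. The exact layout of Map Plus exports is not described here, so tracks stored in other elements would need different handling. The function name `read_kml_points` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of KML 2.2 documents, in ElementTree's "{uri}tag" notation.
KML_NS = "{http://www.opengis.net/kml/2.2}"


def read_kml_points(kml_text):
    """Return [(longitude, latitude), ...] from every <coordinates> element."""
    root = ET.fromstring(kml_text)
    points = []
    for element in root.iter(KML_NS + "coordinates"):
        # Tuples are whitespace-separated; altitude, when present, is dropped.
        for token in (element.text or "").split():
            lon, lat = token.split(",")[:2]
            points.append((float(lon), float(lat)))
    return points
```

Converting these GPS points into coordinate points of the road network model is a separate step that depends on the model's construction rules and is not shown.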

Figure

Trip data overlaid on two maps: (a) original data; (b) original data and extended data.

Finally, the composition of the test examples is described in detail. To test the accuracy of our prediction algorithm, we need a portion of real-world vehicle route data. Here we apply a leave-one-out approach [
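The generic leave-one-out procedure can be sketched as follows: each trip is held out once as the test example while all remaining trips form the training set. This shows only the general idea. How the two groups of test examples are composed in this experiment is not reproduced here, and the function name is hypothetical.

```python
def leave_one_out(trips):
    """Yield (training_trips, held_out_trip) once for every trip in the list."""
    for index, held_out in enumerate(trips):
        yield trips[:index] + trips[index + 1:], held_out
```

A model would be trained on `training_trips` in each round and then asked to predict `held_out_trip`.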

Figure

The performance of our prediction algorithm and Jon Froehlich’s algorithm.

Correct prediction rate of all trips by percent of trip completed

Correct prediction rate of repeated trips by percent of trip completed

Correct prediction rate of all trips by miles driven

Correct prediction rate of repeated trips by miles driven

For test examples (i), Figure

For test examples (ii), Figures

Figure

Figure

Our algorithm’s sensitivity to time of day.

This paper first presents a driving route recommendation system in which the prediction module is the core, and then gives details on a method to accurately predict a driver’s entire route very early in a trip. A road network model is defined, and each driving route is normalized in the rectangular coordinate system. The method also builds HMMs in preparation for route prediction using a method of training set extension based on

Future work will proceed in two directions: (i) enhancing the Laplace smoothing technique to suit the HMM for driving route prediction; (ii) applying statistical methods to make the Viterbi algorithm work with unknown coordinate points.

The authors declare that there is no conflict of interests regarding the publication of this paper.

The research is supported by the National Natural Science Foundation of China (nos. 61170065 and 61003039), Peak of Six Major Talent in Jiangsu Province (no. 2010DZXX026), China Postdoctoral Science Foundation (no. 2014M560440), Jiangsu Planned Projects for Postdoctoral Research Funds (no. 1302055C), and Science & Technology Innovation Fund for higher education institutions of Jiangsu Province (no. CXZZ11-0405).