Research on Large-Scale Road Network Partition and Route Search Method Combined with Traveler Preferences

This paper proposes a large-scale route search method that combines an improved Pallottino parallel algorithm with travelers' route choice preferences and decomposes the urban road network into multiple layers. Using generalized travel time as the road impedance function, the method builds a new multilayer, multitasking road network data storage structure with object-oriented class definitions. The proposed path search algorithm is then verified on the real road network of Guangzhou. Through sensitivity experiments, we compare the proposed path search method with current advanced optimal path algorithms. The results demonstrate that the proposed method increases road network search efficiency by more than 16% under different search proportion requests, node numbers, and numbers of computing processes. The method is therefore a significant advance in the field of urban road network guidance.


Introduction
Path optimization is one of the key issues in the traffic flow guidance field, and it is mainly used for vehicle positioning and system navigation. Travelers need route information to help them reach their destinations. Path guidance technology has therefore been developed, which not only provides travel routes but also relieves traffic congestion.
The optimal path search theory [1] is the core of path optimization. There is a large body of research worldwide on path search algorithms and classic calculation methods. For example, the Dijkstra algorithm, the A* algorithm, and the ant colony algorithm are mature algorithms. However, these algorithms are generally applied to small-scale road networks. More recently, advanced path search algorithms have been improved via parallel computing [2] in order to solve the large network path problem (LNPP). Optimized algorithms such as the Dijkstra algorithm with approximation barrel structure, the Pallottino algorithm, and the Dijkstra algorithm [3] with quad heap structure have been proposed. These improved path search algorithms can meet the search speed requirements of large-scale road network path optimization. However, they rarely consider travelers' driving habits and preferences, so the path optimization results are often not accepted by travelers. That is, the travelers' compliance rate drops, and the intended guidance effect is not achieved.
In view of the above shortcomings, we take driving habits and preferences into account while ensuring real-time path optimization for large-scale road networks. We construct a traffic network simulation system and a multilayer, multitasking network data storage structure in which parallel data is shared according to the traffic network data [4,5]. We also analyze the interaction between road network decomposition and shortest path calculation. A Hierarchical Pallottino parallel search algorithm model is then proposed on the basis of the data storage structure and parallel computing technology. The proposed model can dynamically adjust the search level according to travel distance. The multilayer Pallottino search algorithm runs separately in each subnet, and boundary nodes interact in real time through information transmission among adjacent subnets. Consequently, a global optimal solution can be obtained.

Figure 1: Schematic diagram of the deque data structure (insertion points for temporary-label nodes and for unvisited nodes).

A Hierarchical Pallottino Parallel Search Algorithm Model Based on the Data Storage
The Pallottino algorithm adopts a priority deque structure to calculate the optimal path. The deque [6][7][8] is a combination of two first-in first-out (FIFO) queues, Q1 and Q2. Q2 has a higher priority than Q1. For a node whose status label is unvisited, this indicates a higher priority, and the node enters the candidate node set at the end of Q2; for a node whose status label is temporary, this indicates a lower priority, and the node enters at the end of Q1. During route search, the nodes of Q2 are always searched first, which realizes the two-queue search mechanism and the fast iterative correction of node labels [9]. When a node is to be removed from the candidate node set, the deque must first be checked for emptiness. If queue Q1 is not empty, a node is deleted from the front of Q1; if Q1 is empty, queue Q2 is checked. If Q2 is not empty, a node is deleted from the front of Q2. This cycle is repeated after each deletion, and the operation ends only when queue Q2 is empty. The schematic diagram of the deque data structure is shown in Figure 1.
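The two-queue mechanism described above can be sketched in C++ as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `Arc`, `Graph`, `Labels`, and `twoQueueShortestPaths` are hypothetical, arc weights are assumed to be nonnegative, and the sketch follows the removal order given above (front of Q1 first, then Q2), with unvisited nodes appended to Q2 and previously labelled nodes appended to Q1.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <limits>
#include <vector>

// Hypothetical types for illustration; none of these names come from the paper.
struct Arc { int to; double weight; };            // weight: generalized travel time
using Graph = std::vector<std::vector<Arc>>;      // adjacency list indexed by node id

enum class Status { Unvisited, Temporary };

struct Labels {
    std::vector<double> dist;  // length label of each node
    std::vector<int> pred;     // predecessor label (-1 means none)
};

constexpr double kInf = std::numeric_limits<double>::infinity();

// Two-queue label-correcting search from one source (assumes weights >= 0).
Labels twoQueueShortestPaths(const Graph& g, int source) {
    assert(source >= 0 && static_cast<std::size_t>(source) < g.size());
    const std::size_t n = g.size();
    Labels lab{std::vector<double>(n, kInf), std::vector<int>(n, -1)};
    std::vector<Status> status(n, Status::Unvisited);
    std::vector<bool> queued(n, false);
    std::deque<int> q1, q2;

    lab.dist[source] = 0.0;
    status[source] = Status::Temporary;
    q2.push_back(source);
    queued[source] = true;

    while (!q1.empty() || !q2.empty()) {
        // Remove from the front of Q1 if it is nonempty, otherwise from Q2.
        std::deque<int>& q = !q1.empty() ? q1 : q2;
        const int i = q.front();
        q.pop_front();
        queued[i] = false;

        for (const Arc& a : g[i]) {
            const double candidate = lab.dist[i] + a.weight;
            if (candidate < lab.dist[a.to]) {
                lab.dist[a.to] = candidate;   // label correction
                lab.pred[a.to] = i;
                if (!queued[a.to]) {
                    if (status[a.to] == Status::Unvisited) {
                        status[a.to] = Status::Temporary;
                        q2.push_back(a.to);   // first visit: tail of Q2
                    } else {
                        q1.push_back(a.to);   // labelled before: tail of Q1
                    }
                    queued[a.to] = true;
                }
            }
        }
    }
    return lab;
}
```

Calling `twoQueueShortestPaths(g, 0)` returns the length and predecessor labels for every node reachable from node 0; unreachable nodes keep an infinite length label.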
In daily life, a sizable number of drivers prefer to travel on main roads. For short trips, however, they may not choose a main road [10]. Therefore, a multilayer road network data storage scheme is used to improve the Pallottino search algorithm, and a Hierarchical Pallottino parallel search algorithm is put forward. When the driver travels a long distance, the proposed algorithm searches the main road node layer. When the driver travels a short distance (the travel distance is regarded as short when the number of cells between the origin and the destination is less than or equal to 1), it searches both the main road node layer and the secondary road node layer.

Urban Road Network Partition and Optimization
2.1.1. Urban Road Parallel Data Structure. The data structure of the urban road network [11] is key to solving the path problem, and it directly determines the difficulty and timeliness of path algorithm procedures. Network data should be organized so that it can be invoked conveniently and quickly, occupies little memory, and is expressed clearly enough to facilitate data calls and program checks. The data structure of an urban road traffic network has the following three characteristics: (a) the data volume is large, and the program needs to call and access it frequently; (b) the topological relations of the road network are clear, and the relationship between nodes and sections is fixed; (c) the data structure is accessed concurrently by the program.
The basic elements of the traffic network data system are car, section, route, intersection, clock, and so forth. Furthermore, every traffic network element and road network structure can be treated as a single object. This paper adopts C++ to set up the traffic network simulation system and the shared parallel data structure through object-oriented class definitions (shown in Figure 2), to meet the demands of the Hierarchical Pallottino parallel search algorithm that follows.
The above different types of member variables and member properties are shown in Algorithm 1.
After the various entity types in the network structure are defined, each single object of a given type is one unit of the set of traffic network objects. Multitasking concurrent data access can therefore be implemented by establishing shared data structures, which improves data access speed and satisfies the timeliness requirement of large-scale road network guidance.
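One way such a shared, object-oriented structure might look in C++ is sketched below. Every class and member name here (`Intersection`, `Section`, `RoadNetwork`, and their fields) is hypothetical and chosen only for illustration; the sketch does not reproduce the paper's class definitions. It assumes C++17 and uses a reader-writer lock so that many search tasks can read the network concurrently while impedance updates take exclusive access.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>
#include <vector>

// Hypothetical entity classes; not the member lists used in the paper.
struct Intersection {
    int id = 0;
    bool onMainRoad = false;            // main node layer or minor node layer
    std::vector<int> outgoingSections;  // ids of sections leaving this node
};

struct Section {
    int id = 0;
    int from = 0;
    int to = 0;
    double impedance = 0.0;             // generalized travel time
};

// Shared network object: concurrent readers, exclusive writers.
class RoadNetwork {
public:
    void addIntersection(int id, bool onMainRoad) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        intersections_[id] = Intersection{id, onMainRoad, {}};
    }

    void addSection(int id, int from, int to, double impedance) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        assert(intersections_.count(from) == 1 && intersections_.count(to) == 1);
        sections_[id] = Section{id, from, to, impedance};
        intersections_[from].outgoingSections.push_back(id);
    }

    // Writer: replaces the impedance of one section (throws if the id is unknown).
    void updateImpedance(int sectionId, double impedance) {
        std::unique_lock<std::shared_mutex> lock(mutex_);
        sections_.at(sectionId).impedance = impedance;
    }

    // Readers: any number of tasks may hold the shared lock at the same time.
    double impedance(int sectionId) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        return sections_.at(sectionId).impedance;
    }

    std::vector<int> outgoing(int nodeId) const {
        std::shared_lock<std::shared_mutex> lock(mutex_);
        return intersections_.at(nodeId).outgoingSections;
    }

private:
    mutable std::shared_mutex mutex_;
    std::unordered_map<int, Intersection> intersections_;
    std::unordered_map<int, Section> sections_;
};
```

The design choice illustrated is that the fixed topology is read far more often than impedances are written, so shared locking keeps read-heavy search tasks from blocking one another.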

Network Structure Optimization Considering Drivers' Characteristics. Drivers often choose roads they are familiar with, or urban trunk roads, as their preferred travel routes. In view of this, priority should be given to searching and calculating this layer of the road network to ensure calculation speed, which is particularly important for large-scale road networks.
In this paper, the urban road network is divided into multiple levels [12,13]. The roads preferred by drivers are taken as main roads and constitute the main road network, which divides the urban road network into small zones. In addition, the road network inside each zone constitutes a minor road network (such as zones K, M, and L in Figure 3). For route optimization, the final network consists of the main road network together with the minor road networks within the driver's origin zone or destination zone. The network to be searched is thus simplified, and the search speed is improved.
For the road impedance function, five items are commonly used: traveling time, driving distance, congestion degree, road quality, and comprehensive cost [14][15][16]. In view of the above, real-time dynamic generalized travel time is defined as the road impedance (1) in this paper. There are several ways to determine the road impedance data, which reflect different user requirements and system control strategies. Under these circumstances, the final route is not the absolute optimum with respect to generalized road impedance over the whole road network, but the result of quasi-drivers changing direction according to the optimization principle.
Formula (1) is the most common dynamic road impedance form; it gives the average time that vehicles spend on the $i$th road section, comprising travel time and stopping delay time (including intersection delay).
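Read as a sum of two time components, the impedance described above can be written in the following form. The symbols are assumptions chosen for illustration and are not the paper's notation: $T_i(t)$ is the generalized travel time of section $i$ at time $t$, $t_i^{\mathrm{run}}(t)$ is the running time along the section, and $t_i^{\mathrm{delay}}(t)$ is the delay while stopped, including intersection delay.

```latex
T_i(t) = t_i^{\mathrm{run}}(t) + t_i^{\mathrm{delay}}(t)
```

Under this reading, a section with a 40 s running time and a 25 s intersection delay would have an impedance of 65 s.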
To avoid the common problems caused by road network partition, this paper adopts layered optimization. First, the main roads and the minor roads are separated into two layers. The intersections of the main roads are defined as main nodes, and the intersections of the minor roads are defined as minor nodes (Figure 4). The triangle symbol indicates a node of the main node layer, and the number of such nodes equals the number of main road intersections in the whole map. From the layered network perspective, removing the intersections of the main roads with the minor roads greatly reduces the time complexity of route calculation, and the calculation speed is thereby improved.
To handle the case in which drivers may not choose the main roads when the driving distance is relatively short, this paper adopts the method of neighboring zones union. If the origin and the destination are in the same zone or in neighboring zones, the nodes of those zones are taken as a union, and the union is searched during calculation. In principle, an excessive number of nodes in a zone slows route calculation. Thus, during zone division, the scope of each zone should be kept as small as possible in order to reduce the number of nodes. Figure 5 shows the route calculation results obtained with this principle.
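The zone union described above can be illustrated with a short C++ sketch that assembles the set of nodes to be searched. All names (`buildSearchSet`, `NodeId`, `ZoneId`, `zonesBetween`) are hypothetical. The sketch assumes that the main road nodes are always included, that the origin and destination zones are always added, and that the single zone lying between them is added when there is exactly one; how the zones between two points are determined is outside the sketch.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <vector>

using NodeId = int;   // hypothetical identifier types
using ZoneId = int;

// Builds the node set to search for one origin-destination request.
std::set<NodeId> buildSearchSet(
    const std::set<NodeId>& mainRoadNodes,
    const std::map<ZoneId, std::set<NodeId>>& zoneNodes,
    ZoneId originZone,
    ZoneId destinationZone,
    const std::vector<ZoneId>& zonesBetween) {
    assert(zoneNodes.count(originZone) == 1);
    assert(zoneNodes.count(destinationZone) == 1);

    std::set<NodeId> search = mainRoadNodes;  // main node layer
    auto addZone = [&](ZoneId z) {
        const std::set<NodeId>& nodes = zoneNodes.at(z);
        search.insert(nodes.begin(), nodes.end());
    };

    addZone(originZone);
    addZone(destinationZone);                 // no effect if same as origin zone
    if (zonesBetween.size() == 1) {
        addZone(zonesBetween.front());        // union with the single zone in between
    }
    return search;
}
```

Keeping zones small, as the text recommends, directly bounds the size of the set this function returns for short trips.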

The Hierarchical Pallottino Parallel Search Algorithm Model. According to the road network data storage and network decomposition method described above, a large-scale network is decomposed into several subnetworks [17], whose number equals the number of processors. The Hierarchical Pallottino search algorithm is run on every processor.
In this setting, a processor cannot obtain information about the nodes adjacent to its boundary nodes, so the global optimal solution cannot be obtained directly. Therefore, when the Hierarchical Pallottino search algorithm is used for parallel computing, the real-time information of boundary nodes is sent between processors by message passing. Each processor receives the information sent from its adjacent subnets, completes the iterative label correction of boundary nodes, and thereby obtains the global optimal solution. The specific calculation steps are as follows.
Step 1. According to the starting point $s$ and the terminal point $t$ that the user inputs, the processor analyzes the road node layer and generates a road impedance matrix $W$. Define the node collection of cell $c$ as $N(c)$, the node collection of the main road as $M$, and the road network node collection as $V$.
Identify the starting point $s$ and the terminal point $t$ that the user inputs and determine their cells $c_s$ and $c_t$, respectively. If the starting point and the terminal point are located in the same cell, then set $V = N(c_s)$; if they are not in the same cell, then set $V = M$ and incorporate the nodes of the cells $c_s$ and $c_t$ into $V$. If there is only one cell $c_k$ between the starting point and the terminal point, then the nodes of cell $c_k$ are also incorporated into $V$.
Step 2. Label each node $i$ in the search network node collection. Each node carries three labels: the length label $d_i$, the predecessor node label $p_i$, and the current node status label $\sigma_i$. The length label $d_i$ is the accumulated weight from the starting point $s$ to node $i$. The predecessor node label $p_i$ is the node immediately preceding node $i$ on the current shortest path from $s$ to $i$. The status label $\sigma_i$ is one of unvisited, temporary, or permanent.
Step 4. Initialize the queues Q1 and Q2 for calculating the shortest paths from an arbitrary starting point $s$ to all other nodes $j$, and set the initial labels according to (2).
Step 5. Check the queue Q. If it is empty, go to Step 7; otherwise, go to Step 6.
Step 6. For each adjacent node $j$ of node $i$ taken from the nonempty queue Q, check whether
$$d_j > d_i + w_{i,j} \tag{3}$$
holds, where $w_{i,j}$ is the section weight between nodes $i$ and $j$. If (3) holds, set
$$d_j = d_i + w_{i,j}, \qquad p_j = i. \tag{4}$$
If node $j$ is not in Q and $\sigma_j$ is unvisited, $j$ is inserted at the end of Q1 and $\sigma_j$ is updated to temporary. If node $j$ is not in Q and $\sigma_j$ is temporary, $j$ is inserted at the end of Q2.
Step 7. In the current subnet, examine each adjacent node $j$ of node $i$ and judge whether a shorter path exists, that is, whether
$$d_j > d_i + w_{i,j}. \tag{5}$$
If (5) holds, node $j$ is added to the message queue of its subnet, and the message queue is sent to all adjacent subnetworks.
Step 8. Check the message queue. If it is empty, go to Step 10; otherwise, go to Step 9.
Step 9. Each subnet sends its message queue to adjacent subnets and accepts feedback from other subnets. For each received node $j$ that is adjacent to a node $i$ in the current process, check whether
$$d_j > d_i + w_{i,j} \tag{6}$$
holds. If (6) holds, set
$$d_j = d_i + w_{i,j}, \qquad p_j = i. \tag{7}$$
If node $j$ is not in Q and $\sigma_j$ is unvisited, then $j$ is inserted at the end of Q1 and $\sigma_j$ is updated to temporary. If node $j$ is not in Q and $\sigma_j$ is temporary, $j$ is inserted at the end of Q2.
Step 10. Each processor sends the results of its shortest path calculation to the host processor, and the algorithm ends.
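The interplay between local label correction and boundary messages in the steps above can be illustrated with a sequential C++ simulation. This is a sketch under stated assumptions rather than the authors' parallel program: the subnets are processed in turn inside one thread instead of on separate processors, each subnet uses a single FIFO candidate queue instead of the two-queue deque, a sender forwards every candidate label for a node owned by another subnet and leaves the comparison to the owner, and all names (`Arc`, `Graph`, `Message`, `Result`, `partitionedShortestPaths`, `owner`) are hypothetical. Arc weights are assumed to be nonnegative.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <limits>
#include <vector>

struct Arc { int to; double weight; };
using Graph = std::vector<std::vector<Arc>>;
struct Message { int node; double dist; int pred; };  // candidate label for a remote node
struct Result { std::vector<double> dist; std::vector<int> pred; };

constexpr double kInf = std::numeric_limits<double>::infinity();

// owner[v] is the index of the subnet that holds node v.
Result partitionedShortestPaths(const Graph& g, const std::vector<int>& owner,
                                int subnetCount, int source) {
    assert(owner.size() == g.size());
    assert(source >= 0 && static_cast<std::size_t>(source) < g.size());
    Result r{std::vector<double>(g.size(), kInf), std::vector<int>(g.size(), -1)};
    std::vector<std::deque<int>> work(subnetCount);        // local candidate queues
    std::vector<std::vector<Message>> inbox(subnetCount);  // received boundary messages

    r.dist[source] = 0.0;
    work[owner[source]].push_back(source);

    bool active = true;
    while (active) {
        active = false;
        // Local phase: each subnet corrects labels of its own nodes only.
        for (int k = 0; k < subnetCount; ++k) {
            while (!work[k].empty()) {
                const int i = work[k].front();
                work[k].pop_front();
                for (const Arc& a : g[i]) {
                    const double candidate = r.dist[i] + a.weight;
                    if (owner[a.to] != k) {
                        // Arc crosses the boundary: send a message to the owner.
                        inbox[owner[a.to]].push_back({a.to, candidate, i});
                    } else if (candidate < r.dist[a.to]) {
                        r.dist[a.to] = candidate;
                        r.pred[a.to] = i;
                        work[k].push_back(a.to);
                    }
                }
            }
        }
        // Exchange phase: owners apply received labels that improve their own.
        for (int k = 0; k < subnetCount; ++k) {
            for (const Message& m : inbox[k]) {
                if (m.dist < r.dist[m.node]) {
                    r.dist[m.node] = m.dist;
                    r.pred[m.node] = m.pred;
                    work[k].push_back(m.node);
                    active = true;   // another local phase is needed
                }
            }
            inbox[k].clear();
        }
    }
    return r;
}
```

The loop stops when an exchange phase improves no label, which corresponds to every message queue being empty; at that point the labels agree with those of a search on the undecomposed network.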

Experimental Analysis
To verify the optimization results of the proposed Hierarchical Pallottino parallel search algorithm based on data storage, we use C++ and the OpenMP parallel programming method to design and develop a parallel computing program for large-scale road networks, and we use the client/server (C/S) mode to set up the parallel computing platform. We test the proposed method on real road network data of Guangzhou. This traffic network includes 11,489 nodes and 18,364 paths.
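As a small illustration of how C++ and OpenMP can be combined for N : N searches, the sketch below runs one single-source search per origin inside an OpenMP parallel loop. It deliberately uses a simpler decomposition than the paper's method: work is split by origin rather than by subnetwork, and each search uses a single FIFO queue. The names `Arc`, `Graph`, `singleSource`, and `allPairs` are hypothetical, and arc weights are assumed to be nonnegative. Without OpenMP enabled at compile time, the loop simply runs serially.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <limits>
#include <vector>

struct Arc { int to; double weight; };
using Graph = std::vector<std::vector<Arc>>;

// Single-source label-correcting search with one FIFO queue.
std::vector<double> singleSource(const Graph& g, int source) {
    assert(source >= 0 && static_cast<std::size_t>(source) < g.size());
    std::vector<double> dist(g.size(), std::numeric_limits<double>::infinity());
    std::vector<bool> queued(g.size(), false);
    std::deque<int> q{source};
    dist[source] = 0.0;
    queued[source] = true;
    while (!q.empty()) {
        const int i = q.front();
        q.pop_front();
        queued[i] = false;
        for (const Arc& a : g[i]) {
            if (dist[i] + a.weight < dist[a.to]) {
                dist[a.to] = dist[i] + a.weight;
                if (!queued[a.to]) {
                    q.push_back(a.to);
                    queued[a.to] = true;
                }
            }
        }
    }
    return dist;
}

// N : N search: row s holds the shortest travel times from origin s.
std::vector<std::vector<double>> allPairs(const Graph& g) {
    const int n = static_cast<int>(g.size());
    std::vector<std::vector<double>> dist(n);
#ifdef _OPENMP
#pragma omp parallel for schedule(dynamic)
#endif
    for (int s = 0; s < n; ++s) {
        dist[s] = singleSource(g, s);  // each iteration writes only its own row
    }
    return dist;
}
```

Because the graph is only read and every iteration writes a distinct row, the loop needs no locking.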

Road Network Decomposition.
Combined with the characteristics of drivers' route choice, the Guangzhou road network (central urban area shown in Figure 6(a)) is divided into the following areas, and the dynamic topological relations of each subinterval (the corresponding dual connectivity graph) are shown in Figure 6(b).
The corresponding dual connectivity characteristics show that the selected network possesses, to some extent, the small-world property: the road network has both a small characteristic path length and a large clustering coefficient. The small-world property also reflects that the road network has few obstacles and good accessibility. The calculated network characteristic values and the judgment conditions for small-world networks, combined with the characteristics of the network and the connectivity of the intersections, provide a basis for evaluating the road network structure and the optimization strategy. Due to limited space, this paper does not examine this part further. The decomposition of the actual Guangzhou road network is shown in Figure 7. To ensure the effectiveness of the algorithms in execution, the paper analyzes them on both the undecomposed and the decomposed networks. On the undecomposed network, we compare them under different search patterns (i.e., 1 : 1, 1 : N, and N : N), and the search results are shown in Table 1.

Shortest Path
Table 1 shows that the search speeds of the proposed HTWO Q algorithm and the TWO Q algorithm are basically equivalent on the standard network. In 1 : 1 search, the computation times of the DIKQH and DIKBA algorithms are better than that of the proposed algorithm, but in N : N search they are worse than those of the HTWO Q and TWO Q algorithms. Owing to the advantages of the deque in data extraction and storage, the path search times of the HTWO Q and TWO Q algorithms are more than 5% better than those of the other two search algorithms in 1 : N search.
On the decomposed network, we run the parallelized algorithms on a computing cluster with 8 and 10 processes and measure the search time. The results are shown in Table 2.
From Table 2 we can see that the search time of the proposed HTWO Q algorithm is clearly better than those of the other three algorithms in 1 : N and N : N search, but worse than those of the DIKQH and DIKBA algorithms in 1 : 1 search. In an actual urban guidance system, 1 : 1 path search has limitations, while 1 : N and N : N search can satisfy travelers' dynamic requirements and have wider application value.
During path search, the number of network nodes clearly affects the execution time of an algorithm. To illustrate the calculation time of these algorithms for different numbers of vertices, their execution efficiency is compared for 4000 nodes, 8000 nodes, and the whole road network. The results are shown in Figure 8. When road network data is stored in the way described above, the time needed for multiprocessor parallel computation of the optimal path (or global optimal solution) is closely related to the number of processes. We therefore compare the algorithms using 6, 8, and 10 processes, and the results are shown in Figure 9.
Figure 8 shows that the search speeds of HTWO Q and TWO Q are similar on the 4000-node network and are lower than those of DIKBA and DIKQH. However, when searching 8000 nodes and the whole network, the proposed HTWO Q algorithm outperforms the other three algorithms, and its operating efficiency is 16.32% higher than theirs.
As shown in Figure 9, path-searching efficiency does not rise steadily as the number of processes increases. Instead, it first increases and then decreases, which is related to the communication time consumed while the parallel algorithms are running. The result indicates that search efficiency on the road network is greatly enhanced when the number of processes is 8.

Conclusions
This paper aims to solve the large-scale road network path optimization problem while considering travelers' preferences. Using object-oriented class definitions, a simulated traffic network system and parallel data structures have been constructed. Based on an analysis of the relationship between data storage and network decomposition, a multilayer Pallottino parallel search algorithm is proposed. A series of sensitivity experiments is then conducted to verify the algorithm on an actual road network. The results show that the search time of the proposed optimization method is significantly better than those of DIKBA, TWO Q, and DIKQH in the 1 : N and N : N patterns. The calculated results not only meet the real-time demand but also satisfy travelers' preferences, which can effectively improve the compliance rate of the guidance service and benefit the development of intelligent guidance systems.

Figure 2: Traffic network simulation systems and shared parallel data structure.
Parallel Computing Experiment. On this traffic network, the paper separately applies the proposed Hierarchical Pallottino parallel search algorithm (HTWO Q), the Dijkstra algorithm based on approximate barrel structure (DIKBA), the Pallottino algorithm (TWO Q), and the Dijkstra algorithm based on improved quad-stack structure (DIKQH) to measure the shortest path search time over all pairs, using generalized travel time as the weight.

Figure 7: Decomposition of actual road network of Guangzhou.

Figure 8: Comparison of the computation times of the four algorithms under the N : N search pattern. The x-axis represents the number of vertices (unit: thousand) and the y-axis represents the computation time (unit: ms).

Figure 9: Comparison of the computation times of the four algorithms under multiple processes. The x-axis represents the number of processes and the y-axis represents the computation time (unit: ms).

Table 1: Comparison of path search times under initial network environment.

Table 2: Comparison of path search times under Hierarchical network environment.