Solving Large-Scale TSP Using a Fast Wedging Insertion Partitioning Approach

A new partitioning method, called Wedging Insertion, is proposed for solving the large-scale symmetric Traveling Salesman Problem (TSP). The idea of the proposed algorithm is to cut a TSP tour into four segments according to the nodes' coordinates (rather than by rectangles, as in Strip, FRP, and Karp). Every node except four particular corner nodes is assigned to exactly one segment, and no segment crosses another. After partitioning, the algorithm applies a traditional construction method, namely the insertion method, to each segment to improve the quality of the tour, and then connects the starting node and the ending node of each segment to obtain the complete tour. To test the performance of the proposed algorithm, we conduct experiments on various TSPLIB instances. The experimental results show that the proposed algorithm is more efficient for solving large-scale TSPs: it clearly reduces the running time while losing only about 10% in solution quality.

The algorithms for solving the TSP can be categorized into two main paradigms: exact algorithms and heuristic algorithms. Exact algorithms are guaranteed to find the optimal solution, but may need an exponential number of steps. The best-known exact algorithms are branch-and-cut algorithms [20], with which large TSP instances have been solved [21]. The major limitation of these algorithms is that they are quite complex and demand a great deal of computing time [22]. For this reason, it is very difficult to find the optimal solution for the TSP, especially for problems with a very large number of cities. Heuristic algorithms attempt to solve the Traveling Salesman Problem using tour construction methods and tour improvement methods. Tour construction methods build up a solution step by step, while tour improvement methods start with an initial tour and then try to transform it into a shorter one.
In the past decades, a major breakthrough came with the introduction of metaheuristics, such as simulated annealing, tabu search, genetic algorithms, ant colony optimization, particle swarm optimization, variable neighborhood search, and neural networks. These algorithms are able to escape from local optima [23,24].
Since there are so many algorithms, we need a way to evaluate the quality of each. A general approach is to compare the tour obtained by the algorithm with the optimal solution: the closer the tour is to the optimum, the better. However, in many cases the optimal solution of a TSP instance is unknown, especially for instances with a very large number of cities in the era of big data, with a variety of real-world applications ranging from multimedia [25][26][27][28][29][30][31][32][33][34][35] to mobile sensing [36][37][38][39][40][41][42][43][44][45][46][47][48][49]. Therefore, an alternative and commonly used method is, for each specific instance, to compute a lower bound on the optimal solution, called the Held-Karp bound (HK bound) [2][3][4], and then to evaluate the quality of the algorithm relative to this lower bound.
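As a small worked illustration of this evaluation, the percent excess of a tour over the HK bound can be written as follows. The symbols $L_{\text{alg}}$ (length of the tour returned by the algorithm) and $L_{\text{HK}}$ (the HK bound for the same instance) are introduced here for illustration and are not notation from the paper:

$$\text{excess} = 100 \cdot \frac{L_{\text{alg}} - L_{\text{HK}}}{L_{\text{HK}}}\ \%$$

For example, with the hypothetical values $L_{\text{alg}} = 1200$ and $L_{\text{HK}} = 1000$, the excess is 20%.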
There are many real-world applications of the TSP, along with several variations of the problem. We are concerned with the Symmetric TSP (STSP), which is the most widely used type. In this kind of TSP, the distance between any two nodes is the same in both directions: that is, $d(i, j) = d(j, i)$. Another variation is the Asymmetric TSP (ATSP), where $d(i, j) \neq d(j, i)$; we do not discuss it in this paper. We also focus on the geometric TSP in two-dimensional space, because this is currently the most widely applied kind of TSP and a considerable portion of existing algorithms are designed for it; accordingly, the algorithm proposed in this paper addresses this kind of TSP.
The rest of this paper is organized as follows. In Section 2, we briefly introduce the construction methods. Section 3 briefly reviews several types of partitioning methods. In Section 4, we detail the first stage of our method. The second stage of the proposed algorithm is described in Section 5. Section 6 shows the experimental results, and we conclude in Section 7.
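To make the symmetric, two-dimensional setting concrete, the following minimal Python sketch computes the length of a closed tour under Euclidean distance. The function names `dist` and `tour_length` and the sample points are illustrative assumptions, not part of the paper.

```python
import math

def dist(a, b):
    """Euclidean distance between two 2D points; symmetric by construction."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def tour_length(points, tour):
    """Length of the closed tour that visits `points` in the index order `tour`."""
    return sum(
        dist(points[tour[i]], points[tour[(i + 1) % len(tour)]])
        for i in range(len(tour))
    )

# Hypothetical 3-by-4 rectangle: the tour has the same length in both directions.
points = [(0, 0), (3, 0), (3, 4), (0, 4)]
forward = [0, 1, 2, 3]
backward = forward[::-1]
print(tour_length(points, forward))   # 14.0
print(tour_length(points, backward))  # 14.0
```

Because the distance is symmetric, reversing a tour never changes its length, which is why only one orientation of each tour needs to be considered.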

Construction Method
There are a variety of solution methods for the TSP, including several kinds of construction methods. A construction method starts from one city and repeatedly adds unvisited cities to the tour; the process ends when all cities have been inserted.
The insertion method is a typical construction method. First, it selects a city at random and finds the city nearest to it. Next, it inserts unvisited cities into the constructed sequence according to a selection rule. The method stops when all cities have been inserted into the sequence. Insertion methods fall into four categories according to the selection rule: Nearest Insertion (NI), Cheapest Insertion (CI), Farthest Insertion (FI), and Random Insertion (RI). NI selects the unvisited city nearest to the tour as the next city to insert; CI chooses the city whose insertion adds the least distance; FI selects the city whose distance to its nearest tour city is the largest; RI chooses the city to insert at random. Among these insertion methods, FI and RI perform better than the others, with tours averaging about 15% above the HK bound, while NI and CI are somewhat worse at around 20%.
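The four selection rules can be sketched in a single routine. This is a minimal illustration under assumptions, not the implementation used in the paper: the name `insertion_tour`, the `rule` and `seed` parameters, and the choice to always place the selected city at its cheapest position are all introduced here.

```python
import math
import random

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def insertion_tour(points, rule="farthest", seed=0):
    """Build a closed tour by insertion; rule is nearest, cheapest, farthest or random."""
    rng = random.Random(seed)
    n = len(points)
    if n <= 2:
        return list(range(n))
    # Start from a random city and the city nearest to it.
    start = rng.randrange(n)
    nearest = min((j for j in range(n) if j != start),
                  key=lambda j: dist(points[start], points[j]))
    tour = [start, nearest]
    remaining = set(range(n)) - set(tour)
    # d_tour[c]: distance from unvisited city c to its closest city in the tour.
    d_tour = {c: min(dist(points[c], points[t]) for t in tour) for c in remaining}

    def best_position(c):
        """Return (added length, insert index) of the cheapest place for city c."""
        best = None
        for i in range(len(tour)):
            a, b = points[tour[i]], points[tour[(i + 1) % len(tour)]]
            delta = dist(a, points[c]) + dist(points[c], b) - dist(a, b)
            if best is None or delta < best[0]:
                best = (delta, i + 1)
        return best

    while remaining:
        candidates = sorted(remaining)  # sorted for reproducible tie-breaking
        if rule == "nearest":
            c = min(candidates, key=d_tour.get)
        elif rule == "farthest":
            c = max(candidates, key=d_tour.get)
        elif rule == "random":
            c = rng.choice(candidates)
        elif rule == "cheapest":
            c = min(candidates, key=lambda k: best_position(k)[0])
        else:
            raise ValueError(f"unknown rule: {rule}")
        _, pos = best_position(c)
        tour.insert(pos, c)
        remaining.remove(c)
        del d_tour[c]
        # The newly inserted city may now be the closest tour city for others.
        for k in remaining:
            d_tour[k] = min(d_tour[k], dist(points[k], points[c]))
    return tour
```

The sketch favors readability over speed: scanning every tour edge for each insertion is quadratic or worse, which is one motivation for the partitioning methods discussed next.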
Spacefilling Curve [50] is another heuristic construction method for the TSP. It first sorts the nodes in the order of their appearance along a spacefilling curve and then visits the nodes in the plane in that order to obtain a short tour. The time complexity of this method is $O(n \log n)$, and the space complexity is $O(n)$.
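The sort-along-a-curve idea can be sketched as follows. This example uses a Hilbert curve as a stand-in, which is an assumption: the curve used in [50] may differ. The names `hilbert_index` and `spacefilling_tour` and the `order` parameter (grid resolution) are hypothetical.

```python
def hilbert_index(order, x, y):
    """Position of grid cell (x, y) along a Hilbert curve on a 2^order by 2^order grid."""
    n = 1 << order
    d = 0
    s = n >> 1
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has the standard orientation.
        if ry == 0:
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s >>= 1
    return d

def spacefilling_tour(points, order=10):
    """Visit the points in the order in which the curve passes through their grid cells."""
    n = 1 << order
    min_x = min(p[0] for p in points)
    min_y = min(p[1] for p in points)
    # One common scale for both axes keeps the aspect ratio of the instance.
    span = max(max(p[0] for p in points) - min_x,
               max(p[1] for p in points) - min_y) or 1.0

    def key(i):
        gx = min(n - 1, int((points[i][0] - min_x) / span * (n - 1)))
        gy = min(n - 1, int((points[i][1] - min_y) / span * (n - 1)))
        return hilbert_index(order, gx, gy)

    return sorted(range(len(points)), key=key)
```

Computing the keys is linear in the number of points and the sort dominates the cost, which matches the $O(n \log n)$ time and $O(n)$ space stated above.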

Partitioning Method
As the scale of the TSP grows, partitioning methods become necessary to reduce the running time or to work within memory limits. A partitioning method first divides the cities into several segments. Second, it finds a short tour for each segment. Finally, it connects the segments to obtain the final tour. There are two kinds of partitioning methods: Geometric Partitioning and Tour-Based Partitioning [16][17][18].
Karp's Partitioning Heuristic (Karp) is proposed in [10]. As in the construction of a K-d tree, we recursively divide the cities using horizontal and vertical cuts through median cities, except that here the median city is included in both of the subsets created by the cut. The recursion stops as soon as no set in the partition contains more than a preset number of cities, which is a tunable parameter. Using the dynamic programming (DP) approach proposed by Bellman [11], we then optimally solve the subproblems defined by the sets of cities in the final partition. Finally, we recursively patch the solutions together by means of the shared medians. With the parameter held fixed the procedure is efficient, but the running time is sensitive to its value.
In Strip, we first partition the minimal enclosing rectangle into $\sqrt{n/3}$ equal-width vertical strips, then order the cities in each strip from top to bottom, and finally obtain the tour by traversing the strips from the leftmost to the rightmost, alternately going down and up, and closing the tour with an edge from the last city in the rightmost strip to the first city in the leftmost strip. In the worst case, Strip's tours can be as much as $\Omega(\sqrt{n})$ times the optimum. For nodes uniformly distributed in the unit square, the expected Strip tour length is at most $0.93\sqrt{n}$, while the Held-Karp bound for such instances is about $0.71\sqrt{n}$, which indicates that Strip's expected excess over the HK bound on these instances is less than 31% [6][7][8].
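The Strip procedure described above can be sketched in a few lines. The function name `strip_tour`, the rounding of the strip count down to an integer, and the handling of degenerate widths are assumptions made for this illustration.

```python
import math

def strip_tour(points):
    """Strip heuristic: vertical strips traversed left to right, alternating down and up."""
    n = len(points)
    if n == 0:
        return []
    k = max(1, int(math.sqrt(n / 3)))          # number of strips, rounded down
    min_x = min(p[0] for p in points)
    max_x = max(p[0] for p in points)
    width = (max_x - min_x) / k or 1.0         # avoid a zero width if all x are equal
    strips = [[] for _ in range(k)]
    for i, (x, _) in enumerate(points):
        s = min(k - 1, int((x - min_x) / width))
        strips[s].append(i)
    tour = []
    for s, strip in enumerate(strips):
        # Even-numbered strips run top to bottom, odd-numbered strips bottom to top.
        strip.sort(key=lambda i: points[i][1], reverse=(s % 2 == 0))
        tour.extend(strip)
    return tour  # the closing edge from tour[-1] back to tour[0] is implicit
```

The only work beyond a linear pass is the per-strip sorting, so the whole heuristic runs in $O(n \log n)$ time.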
Fast Recursive Partitioning (FRP) has been proposed in [9]. It begins with the minimal enclosing rectangle and recursively partitions each rectangle containing more than 15 cities into two rectangles, each with approximately half of the cities. If the parent rectangle is longer than it is wide, we find the median $x$-coordinate of its cities and make a vertical cut at that $x$ value; otherwise, we find the median $y$-coordinate and make a horizontal cut at that $y$ value. The final rectangles, each containing 15 or fewer cities, are called buckets. For each bucket we construct a nearest neighbor tour, and these tours are then patched together to make an overall tour.
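The recursive bucket construction can be sketched as below. Only the partitioning step is shown; the nearest neighbor tours and their patching are omitted. The name `frp_buckets` and the `max_bucket` parameter are assumptions, and splitting the sorted list in half is used here as a simple way to cut at the median.

```python
def frp_buckets(points, idx=None, max_bucket=15):
    """Recursively split the cities at the median of the longer side into small buckets."""
    if idx is None:
        idx = list(range(len(points)))
    if len(idx) <= max_bucket:
        return [idx]
    xs = [points[i][0] for i in idx]
    ys = [points[i][1] for i in idx]
    # Cut across the longer side of the enclosing rectangle: axis 0 is x, axis 1 is y.
    axis = 0 if (max(xs) - min(xs)) >= (max(ys) - min(ys)) else 1
    ordered = sorted(idx, key=lambda i: points[i][axis])
    mid = len(ordered) // 2
    return (frp_buckets(points, ordered[:mid], max_bucket)
            + frp_buckets(points, ordered[mid:], max_bucket))
```

For 100 cities, for instance, the recursion splits 100 into 50 and 50, then 25 and 25, then 12 and 13, giving eight buckets of at most 15 cities each.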

The First Stage-Partitioning
The first stage of the algorithm partitions the cities under the rule that no partition may cross another.

The Simple Instance.
As shown in Figure 1, suppose the TSP instance contains only 4 cities, no three of which lie on a straight line. The shortest tour is always A-B-C-D-A, no matter how the distances between the cities change.
Now we add one more city, E, to the 4 cities (we suppose that the added city does not replace any of the cities A, B, C, and D; thus, it is not a corner of the new instance). The position of the added city in the tour depends on its location. If it is above the line connecting the upper left corner (city A) and the upper right corner (city B), the order must be A-E-B-C-D-A; otherwise, the tour would contain crossings. Similarly, if the city is to the right of the line connecting the upper right corner (city B) and the bottom right corner (city C), the order should be A-B-E-C-D-A. If the city is below the line connecting the bottom left corner (city D) and the bottom right corner (city C), the order will be A-B-C-E-D-A. If it is to the left of the line connecting the bottom left corner (city D) and the upper left corner (city A), the order will be A-B-C-D-E-A. These four situations all assume that the added city lies outside the convex hull of the four cities. The node could also fall inside the convex hull, which makes things a little more complicated; in that case we determine its position by a simple search and insert it into the existing tour order at the place that adds the least distance to the whole tour. Figure 2 shows all five possible situations.
When there are more than 5 cities, say hundreds or thousands (again supposing that no added city replaces any of the cities A, B, C, and D), the insertion process becomes more complicated. However, all cases can be handled in the same way: every added city is placed in the tour between two adjacent corner cities, no matter how it is inserted.
To put it another way, the whole tour is divided into four segments by the four corner nodes. In clockwise order, the first segment starts from the upper left corner city and ends at the upper right corner city; the second segment starts from the upper right corner city and ends at the bottom right corner city; the third segment starts from the bottom right corner city and ends at the bottom left corner city; the fourth segment starts from the bottom left corner city and returns to the starting city.
This conclusion holds for any TSP instance: any TSP tour is composed of 4 segments, and we obtain the final tour by connecting the starting node and the ending node of the four segments.

Wedging Insertion Partitioning Algorithm.
The details of the proposed algorithm for wedging insertion partitioning are summarized as follows.
Step 1. Find the minimal enclosing rectangle of the instance.
Step 2. Search for the corner nodes.
Step 3. Classify each node as an inside node or an outside node.
Step 4. Sort the inside nodes and the outside nodes.
Step 5. Obtain the four segments.
Step 6. Sort all nodes awaiting insertion, then insert them in turn.
Step 7. Determine the best segment and the position within it for each inserted node.
Suppose the instance to be solved contains $n$ cities. The first step clearly takes $O(n)$ time. In the second step, finding the corner nodes takes $O(n)$ time. In the third step, deciding which nodes are outside nodes takes $O(n)$ time. In the fourth step, sorting the outside nodes and the inside nodes with a fast sorting method takes $O(n \lg n)$ time. In the fifth step, the time complexity is likewise $O(n \lg n)$ if we run FI to arrange the inside nodes and the outside nodes, respectively. In the sixth and seventh steps, we need to find the insertion position for each inside node, which takes $O(n^2 \lg n)$ time. From this analysis, the time complexity of the whole algorithm is $O(n^2 \lg n)$. Because extra storage is needed only for the sorted sequences of inside nodes and outside nodes, the space complexity of the algorithm is $O(n)$.
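Steps 1 to 3 can be sketched as follows. The text does not spell out how corner nodes are chosen or how inside and outside nodes are told apart, so this sketch rests on two loudly stated assumptions: a corner node is taken to be the node nearest to each corner of the enclosing rectangle, and a node counts as an outside node of a side if it lies beyond the line through the two corner nodes of that side. The names `find_corners` and `classify` are hypothetical.

```python
import math

def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive means b is left of the ray o -> a."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def find_corners(points):
    """Steps 1-2: enclosing rectangle, then the node nearest each rectangle corner.

    The nearest-node rule is an assumption of this sketch.
    Returns node indices [A, B, C, D]: upper left, upper right, bottom right, bottom left.
    """
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    rect = [(min(xs), max(ys)), (max(xs), max(ys)),
            (max(xs), min(ys)), (min(xs), min(ys))]
    corners = []
    for r in rect:
        candidates = [i for i in range(len(points)) if i not in corners]
        corners.append(min(
            candidates,
            key=lambda i: math.hypot(points[i][0] - r[0], points[i][1] - r[1])))
    return corners

def classify(points, corners):
    """Step 3: assign each non-corner node to the side it lies outside of, else inside."""
    sides = {0: [], 1: [], 2: [], 3: []}   # 0: A-B, 1: B-C, 2: C-D, 3: D-A
    inside = []
    for i, p in enumerate(points):
        if i in corners:
            continue
        for s in range(4):
            a = points[corners[s]]
            b = points[corners[(s + 1) % 4]]
            # Corners are listed clockwise, so "left of a -> b" means outside.
            if cross(a, b, p) > 0:
                sides[s].append(i)   # a node outside two sides goes to the first one
                break
        else:
            inside.append(i)
    return sides, inside
```

Both functions make a constant number of passes over the nodes, consistent with the $O(n)$ cost stated for the first three steps.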

The Second Stage-Construction
In the first stage, the purpose of the wedging insertion method is to reduce as far as possible the crossings both between segments and inside each segment; in this way we obtain the four segments and, finally, the closed loop. However, the resulting tour is clearly not yet the best one: too many peaks are formed inside the segments. Fortunately, since there are no crossings between the segments, we only need to reduce the peaks within each segment to obtain a better result.
The peaks form because of the wedging-shape insertion method itself. During wedging-shape insertion, we consider only the relationships between coordinates and ignore distances.
This suggests that we only need to change the order of the interior nodes of each segment to shorten the tour, so we reconstruct each segment, keeping its starting node and ending node fixed and changing only the positions of the other nodes. The methods mentioned above can be used to construct each segment, with one difference: previously we searched for the shortest closed tour, whereas now we look for the shortest route through a given set of nodes whose starting and ending nodes are known in advance.
Since the goal is the shortest route for each segment rather than for the whole tour, the known construction algorithms have to be adapted. Taking the insertion method as an example, since the starting node and the ending node are fixed, we only need to select nodes and insert them between the starting node and the ending node according to the chosen rule (e.g., Nearest Insertion or Farthest Insertion), until all nodes in the segment have been inserted. Once every segment has been reconstructed, we obtain the final tour by connecting the starting node and the ending node of each segment.
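A minimal sketch of this adapted insertion, assuming the corner nodes and segment membership are already known, is given below. The names `rebuild_segment` and `join_segments`, the `rule` parameter, and the convention that consecutive segments share a corner node are assumptions of this illustration.

```python
import math

def dist(a, b):
    return math.hypot(a[0] - b[0], a[1] - b[1])

def rebuild_segment(points, start, end, interior, rule="farthest"):
    """Rebuild one segment as an open path with fixed start and end nodes."""
    path = [start, end]
    remaining = set(interior)
    while remaining:
        def d_path(c):
            # Distance from candidate c to its closest node already on the path.
            return min(dist(points[c], points[t]) for t in path)
        candidates = sorted(remaining)  # sorted for reproducible tie-breaking
        if rule == "farthest":
            c = max(candidates, key=d_path)
        else:  # nearest insertion
            c = min(candidates, key=d_path)

        def added(i):
            a, b = points[path[i]], points[path[i + 1]]
            return dist(a, points[c]) + dist(points[c], b) - dist(a, b)
        # Only gaps between consecutive path nodes are considered,
        # so the two endpoints never move.
        best = min(range(len(path) - 1), key=added)
        path.insert(best + 1, c)
        remaining.remove(c)
    return path

def join_segments(segments):
    """Connect segments into one closed tour; each segment ends where the next begins."""
    tour = []
    for seg in segments:
        tour.extend(seg[:-1])  # drop the shared corner to avoid visiting it twice
    return tour
```

Restricting insertion to the gaps of an open path is the only change relative to the closed-tour insertion method, which keeps the four corner nodes in place and so preserves the absence of crossings between segments.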

Experiments
In this section, we present our experimental testbed and setup, followed by the evaluation of our proposed algorithm and the compared methods.
Experimental Testbed and Setup.
The experiments are carried out in MATLAB. The hardware platform is an ASUS portable computer equipped with an Intel Core 2 Duo CPU at 2.2 GHz and 2 GB of memory. We test 12 instances from TSPLIB, each with more than 1000 cities. In the tests, we record two times: one is the total time, including the time for loading data and calculating the distance matrix; the other is the actual running time of the algorithm.

Evaluation.
First of all, we evaluate the partitioning tour for three instances, as shown in Figure 3. From the results, we observe that we obtain four segments in each test instance, and no segment crosses another segment or itself.
Furthermore, as shown in Table 1, we evaluate the running time of our proposed wedging insertion algorithm in comparison with four baselines: FRP, Strip, Spacefilling Curve, and Karp. We choose 12 instances from TSPLIB, each with more than 2000 cities, to test these algorithms. From the results, we can see that our proposed algorithm has an advantage in computational cost.
Besides, we evaluate the performance of our proposed wedging insertion algorithm with different insertion rules, that is, Wedging Insertion-Nearest Insertion (WI-NI), Wedging Insertion-Cheapest Insertion (WI-CI), Wedging Insertion-Farthest Insertion (WI-FI), and Wedging Insertion-Random Insertion (WI-RI). From the results in Table 2, we observe that the worst result of our proposed algorithms with different insertion rules is 42% above the HK bound and the best is 7% above the HK bound, while most results are around 20%. Finally, we evaluate the performance of the compared algorithms (Strip, FRP, and Karp) by Average Percent Excess over the HK bound, as shown in Table 3.
From the results, we conclude that, although the ranges of the algorithms' results differ, our proposed wedging insertion partitioning algorithm with each of the four construction rules outperforms the compared algorithms on the same instance.

Conclusions
In this paper, we propose a novel partitioning approach to solve the large-scale Traveling Salesman Problem (TSP). According to our experimental results, wedging insertion partitioning is an effective partitioning method: combined with insertion construction methods, it reduces the running time considerably while losing only around 10% in solution quality. In the near future, our work can exploit parallel techniques [51][52][53] to accelerate the algorithm and further enhance its performance by combining it with other algorithms, such as 2-Opt, 3-Opt, LKH local search, or other construction algorithms.

Figure 1: The tour of four nodes.

Figure 2: The tour of five nodes.

Table 2: Running results for WI-NI, WI-CI, WI-FI, and WI-RI (average percent excess over the HK bound).