Can an Agent with Limited Information Solve the Travelling Salesman Problem?

Here, we develop a new heuristic algorithm for solving the TSP (Travelling Salesman Problem). In the proposed algorithm, the agent cannot estimate tour lengths and can detect only a few neighbor sites. Under these circumstances, the agent occasionally ignores the NN method (choosing the site nearest to the current site) and chooses another site farther from the current site. Whether it does so depends on the relative distances from the current site to the nearest site and to the other site. On several TSP benchmark datasets from TSPLIB, our algorithm performs well compared with the NN algorithm under both symmetric TSP and asymmetric TSP (time-dependent TSP) conditions. Here, symmetric TSP means the common TSP, in which costs between sites are symmetric and time-homogeneous, whereas asymmetric TSP means a TSP in which costs between sites are time-inhomogeneous. Furthermore, the agent exhibits critical properties on some benchmark data. These results suggest that the agent performs adaptive travel using limited information. Our results might be applicable to nonclairvoyant optimization problems.


Introduction
The Travelling Salesman Problem (TSP) is a well-known combinatorial optimization problem, that is, a problem of finding the best solution out of a finite set of possible solutions. To solve complicated problems such as the TSP, there are heuristic optimization algorithms inspired by real living systems [1]. Such optimization algorithms appear to demand advanced intelligence, such as sophisticated cognitive and predictive abilities [1][2][3]. However, real organisms such as insects and cells appear to solve complicated optimization problems using only simple rules [4,5]. This fact implies that real organisms do not need a high ability of spatial cognition to solve complicated optimization problems. Therefore, agents in heuristics inspired by living systems should likewise not need a high ability of spatial cognition.
The nearest neighbor (NN) method is a well-known method for solving the TSP with limited cognitive abilities, based only on local information. Agents obeying the NN method always select the site nearest to the current site. In this sense, the NN method can be regarded as an algorithm that uses only neighbor sites. The NN method has long been used because it can derive reasonable solutions from local information alone. Indeed, animals appear to use the NN method to solve TSP-like tasks, such as foraging by repeatedly visiting a series of locations [6].
However, the NN method does not necessarily produce good solutions to the TSP [7]. To obtain a good solution, agents might need to occasionally ignore the NN method. Indeed, animals appear to flexibly change movement strategies based on local environmental situations [8][9][10]. For example, bees appear to use an NN strategy at small spatial scales, when neighboring resources are close to each other, but to flexibly use more efficient optimization strategies at larger spatial scales, when neighboring resources are far from each other [10].
It is known that individuals in living systems coordinate their behaviors using local information and behave adaptively in response to various environments [11]. In addition, they appear to adopt negative information and thereby profit over the long term. This mechanism has also been applied to a cognitive model of humans [12]. It seems that each individual generates actions suited not to the short term but to the long term [13].
With reference to such behavior of living organisms, we developed a heuristic algorithm for solving the TSP in which individuals occasionally tune their rules based on the local environment while basically obeying the NN method. In our heuristic algorithm, agents are assumed to be able to estimate topological distances to a few neighbor sites, where topological distances are regarded as neighbor distances [14]. Agents change their site-selection rules with the aim of not local but global efficiency in shortening the total travel length. If two sites are both neighbor sites and their distances from the current site are not much different, agents tend to select the farther site rather than the closer one. Selecting the farther site means that agents adopt a strategy that is negative in the short run.
Because we assumed that agents in the heuristic cannot recognize global tour lengths, agents had to make decisions using only local information. We prepared two experimental conditions: simple symmetric TSP and asymmetric TSP. Here, symmetric TSP means the common TSP, in which costs between sites are symmetric and time-homogeneous, whereas asymmetric TSP means a TSP in which costs between sites are time-inhomogeneous. In particular, we suppose that the energy spent in moving from the current site to the next site increases with the load, that is, with the number of sites the agent has already visited. Such a situation has been investigated for single-machine scheduling problems with sequence-dependent setup times by Bigras et al. [15]. In this situation, the agent needs to select farther sites at the beginning of its tour, because the energy spent increases with the load. Thus, unlike symmetric TSP, which has a time-homogeneous travel cost, asymmetric TSP has an incremental travel cost. The incremental travel cost condition is analogous to the situation in which animals collect cargo at each visit to a location [16]. We confirmed that agents produced effective tours under both the symmetric and asymmetric TSP conditions.

Model Description: Rule Change Algorithm.
In this algorithm, the Rule Change (RC) algorithm, the agent basically follows the NN algorithm at each time step; that is, the agent tends to select the nearest neighbor site. However, the agent selects another site if the following condition is satisfied: if $d_1 > d_{\max} \times rn$, then the agent chooses the farthest site among the n-neighbor sites, where $rn$ is a random number satisfying $rn \in [\mathit{ratio}, 1.00]$, $d_1$ is the distance between the current site and the site nearest to the agent, and $d_{\max}$ is the distance between the current site and the farthest site among the n-neighbor sites.
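The selection rule above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the n-neighbor sites are taken to be the n unvisited sites closest to the current site, rn is drawn uniformly from [ratio, 1.00], and distances are plain Euclidean distances without any rounding. The function names `rc_next_site` and `rc_tour` and the `rng` argument are hypothetical.

```python
import math
import random

def rc_next_site(current, unvisited, coords, n=2, ratio=0.90, rng=random):
    """Choose the next site under the Rule Change rule (illustrative sketch)."""
    here = coords[current]
    # Assumption: the n-neighbors are the n closest unvisited sites.
    neighbors = sorted(unvisited, key=lambda s: math.dist(here, coords[s]))[:n]
    nearest, farthest = neighbors[0], neighbors[-1]
    d_1 = math.dist(here, coords[nearest])
    d_max = math.dist(here, coords[farthest])
    # Assumption: rn is uniform on [ratio, 1.00].
    rn = rng.uniform(ratio, 1.0)
    # Ignore the NN choice only when the two distances are similar enough.
    return farthest if d_1 > d_max * rn else nearest

def rc_tour(start, coords, n=2, ratio=0.90, rng=random):
    """Build one visiting order that starts at `start` and covers every site."""
    tour = [start]
    unvisited = set(coords) - {start}
    while unvisited:
        nxt = rc_next_site(tour[-1], unvisited, coords, n, ratio, rng)
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

# Usage: four sites on a line, starting from site 0.
sites = {0: (0, 0), 1: (1, 0), 2: (3, 0), 3: (7, 0)}
print(rc_tour(0, sites, n=2, ratio=0.90))
```

With ratio = 1.00 the condition can never hold, because $d_1 \le d_{\max}$, so the sketch reduces to the NN method; smaller values of ratio make the farther choice more frequent.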
The aim of this research is to investigate whether agents can end up achieving optimal search by using negative cues. To this end, we allowed agents to choose only either the nearest or the farthest of the neighbor sites.

Simulation Experiment.
We solved symmetric TSP and asymmetric TSP using several datasets from TSPLIB [17]. In this paper, we used only benchmark data containing fewer than one hundred cities, with distances given as two-dimensional Euclidean distances (EDGE_WEIGHT_TYPE: EUC_2D). We adopted two experimental conditions. One was simple symmetric TSP, in which the total cost obtained after one tour depended on the length of each route the agent followed. The other was asymmetric TSP, which takes the agent's crop capacity into account. Here, "the agent's crop capacity" means the load that increases as the agent moves from the current site to the next site. The energy spent in moving from the current site to the next site increased with the load, that is, with the number of sites the agent had already visited. We defined the cost between the current site and the next site as (time steps) × (distance between these two sites, i.e., route length). Therefore, even when the agent followed the same route, the total cost obtained after one tour differed depending on whether the agent travelled it clockwise or anticlockwise.
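The direction dependence of the time-weighted cost can be illustrated with a small worked example. The sketch below makes several assumptions that the text does not fix: the tour is closed (the agent returns to its starting site), the time step of the first leg is 1, and distances are raw Euclidean distances rather than the rounded TSPLIB values. The function name `tour_cost` is hypothetical.

```python
import math

def tour_cost(tour, coords, time_dependent=False):
    """Cost of a closed tour; leg k (1-based) is weighted by k if time_dependent."""
    total = 0.0
    n = len(tour)
    for step in range(n):
        a = coords[tour[step]]
        b = coords[tour[(step + 1) % n]]  # wrap around to the starting site
        weight = step + 1 if time_dependent else 1
        total += weight * math.dist(a, b)
    return total

# A 3-4-5 right triangle travelled in both directions.
coords = {"A": (0, 0), "B": (3, 0), "C": (3, 4)}
forward = ["A", "B", "C"]   # legs 3, 4, 5
backward = ["A", "C", "B"]  # legs 5, 4, 3

print(tour_cost(forward, coords))         # 3 + 4 + 5 = 12.0
print(tour_cost(backward, coords))        # 5 + 4 + 3 = 12.0
print(tour_cost(forward, coords, True))   # 1*3 + 2*4 + 3*5 = 26.0
print(tour_cost(backward, coords, True))  # 1*5 + 2*4 + 3*3 = 22.0
```

Under the time-weighted cost, taking the long leg first is cheaper here (22 versus 26), which matches the remark that the agent benefits from selecting farther sites early in the tour.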
We developed three control algorithms for comparison: the NN algorithm, the FN algorithm, and the FN 0.5 algorithm. In the FN (farthest neighbor) algorithm, the agent always selected the farthest site among the n-neighbor sites. In the FN 0.5 algorithm, the agent selected the nearest or the farthest site among the n-neighbor sites with probability 0.5 each. All algorithms were used under both experimental conditions (symmetric/asymmetric TSP).
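The three control algorithms differ only in how they pick a site from the n-neighbor list, which the following sketch makes explicit. It assumes, as an illustration, that the n-neighbors are the n closest unvisited sites sorted by Euclidean distance; all function names are hypothetical.

```python
import math
import random

def n_neighbors(current, unvisited, coords, n):
    """The n closest unvisited sites, nearest first (assumed neighbor definition)."""
    here = coords[current]
    return sorted(unvisited, key=lambda s: math.dist(here, coords[s]))[:n]

def nn_choice(neighbors, rng=random):
    return neighbors[0]            # NN: always the nearest site

def fn_choice(neighbors, rng=random):
    return neighbors[-1]           # FN: always the farthest n-neighbor

def fn05_choice(neighbors, rng=random):
    # FN 0.5: nearest or farthest n-neighbor with probability 0.5 each
    return neighbors[0] if rng.random() < 0.5 else neighbors[-1]

def build_tour(start, coords, choose, n=2, rng=random):
    """Visit every site once, applying the given choice rule at each step."""
    tour = [start]
    unvisited = set(coords) - {start}
    while unvisited:
        nxt = choose(n_neighbors(tour[-1], unvisited, coords, n), rng)
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

sites = {0: (0, 0), 1: (1, 0), 2: (3, 0), 3: (7, 0)}
print(build_tour(0, sites, nn_choice))  # [0, 1, 2, 3]
print(build_tour(0, sites, fn_choice))  # [0, 2, 3, 1]
```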
We set $N$ agents for $N$-site benchmark problems in all algorithms. Each agent was assigned to a different site as its starting site, so that every site served as a starting site. We conducted one hundred trials with each algorithm.

Simulation Procedure:
The RC Algorithm versus the NN Algorithm. We defined the agent's local neighbors based on topological distances. The parameters ratio and n were initially set to 0.90 and 2, respectively (ratio = 0.90, n = 2). By setting n = 2 as the initial value, we gave the agent a low-capacity detection ability. Also, it is natural for the agent to regard the distances to the nearest and farthest sites among the n-neighbors as not much different when ratio is close to 1.00. We therefore set ratio = 0.90 as the initial value. When the RC algorithm output worse solutions than the NN algorithm, we changed the parameters in the following order until the RC algorithm yielded better solutions than the NN algorithm: (ratio = 0.90, n = 2) → (ratio = 0.80, n = 2) → (ratio = 0.90, n = 3) → (ratio = 0.80, n = 3) → ⋅ ⋅ ⋅ → (ratio = 0.80, n = 5).
When we could not obtain better solutions with the RC algorithm even at (ratio = 0.80, n = 5), we stopped the calculations and concluded that the RC algorithm output worse solutions than the NN algorithm. For the comparison of the RC algorithm with the FN/FN 0.5 algorithms, we used the same parameter set (ratio, n) as for the comparison with the NN algorithm.
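The parameter search described above can be sketched as a short loop. The sketch assumes that the elided middle of the sequence continues the same pattern, alternating ratio = 0.90 and 0.80 for each n from 2 to 5, and that "better" means a lower average cost; the text does not state either point explicitly. The names `parameter_schedule`, `first_improving`, and the `rc_cost` callback are hypothetical.

```python
from itertools import product

def parameter_schedule(ratios=(0.90, 0.80), n_values=(2, 3, 4, 5)):
    """(ratio, n) pairs in search order: both ratios for n = 2, then n = 3, ..."""
    return [(ratio, n) for n, ratio in product(n_values, ratios)]

def first_improving(schedule, rc_cost, nn_cost):
    """Return the first (ratio, n) whose RC cost beats the NN cost, else None.

    rc_cost is a caller-supplied function (ratio, n) -> average cost.
    """
    for ratio, n in schedule:
        if rc_cost(ratio, n) < nn_cost:
            return ratio, n
    return None  # RC never beat NN within the schedule

print(parameter_schedule())
```

A return value of `None` corresponds to the stopping case in the text, where the RC algorithm is judged worse than the NN algorithm.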
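On each trial, the summation of total cost over all agents was calculated as
$$\text{summation} = \sum_{i=1}^{N} (\text{total cost})_i,$$
where $(\text{total cost})_i$ means the total cost of agent $i$, $i = 1, 2, \ldots, N$. Then, we obtained the average summation over 100 trials as
$$\text{average summation} = \frac{1}{100} \sum_{t=1}^{100} \text{summation}_t,$$
where $\text{summation}_t$ indicates the summation of total cost in trial $t$.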
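The two aggregation steps described above amount to a sum over agents followed by a mean over trials. A minimal sketch, with hypothetical function names and made-up toy numbers that are not results from the paper:

```python
def summation(total_costs):
    """Sum of the total costs of all agents in one trial."""
    return sum(total_costs)

def average_summation(trials):
    """Mean of the per-trial summations; `trials` is a list of per-agent cost lists."""
    return sum(summation(costs) for costs in trials) / len(trials)

# Toy input: 2 trials with 2 agents each.
toy_trials = [[1.0, 2.0], [3.0, 4.0]]
print(average_summation(toy_trials))  # (3 + 7) / 2 = 5.0
```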

Results
Tables 1 and 2 show the results. As seen from those tables, the RC algorithm produced better solutions than the NN, FN, and FN 0.5 algorithms in each experimental condition (symmetric/asymmetric). All data were compared using the Mann-Whitney U test. The p value is defined as the probability of obtaining a result equal to or more extreme than what was actually observed when the null hypothesis is true, where the null hypothesis is that the average summation of total cost for the RC algorithm ≥ the average summation of total cost for the other algorithm. Note that the RC algorithm did not produce significantly better solutions than the NN algorithm in the symmetric condition when rat99 was used as benchmark data. However, given that 146101 < 146122 in the average summation of total cost over agents, with p value = 0.35, the RC algorithm still produced slightly better solutions than the NN algorithm on rat99, although the difference was not statistically significant. Overall, it can be concluded that the RC algorithm is superior to the NN, FN, and FN 0.5 algorithms in terms of the summation of total cost.
We also evaluated whether critical behaviors could be achieved in the RC algorithm. It is known that systems with critical properties can adapt to various environments owing to a balance at the phase transition between an ordered phase and a disordered phase [18]. Systems in the ordered phase seem to behave stably but seem unable to adjust correctly to changes in the environment. On the other hand, systems in the disordered phase seem to behave randomly. In our algorithm, the ordered phase corresponds to obeying the NN method, while the disordered phase corresponds to ignoring the NN method. We examined critical values of the parameter ratio with respect to the matching rate of tour sequences [18,19]. The matching rate was defined as the similarity between the tour sequences of two consecutive trials randomly chosen from the 100 trials. For example, the matching rate between tour sequence 123 and tour sequence 312 is 0 for 3-site problems. Here, ratio was initially set to 0.1 and increased in steps of 0.1 until it reached 1.0. Figures 1 and 2 show the average matching rate over all agents for each value of ratio. As can be seen, the critical values roughly corresponded to the selected values of ratio in both experimental conditions (see Tables 1 and 2). Examples of the matching rate for each agent are also shown (Figures 1(f) and 2(f); data from st70/pr76 are shown). Tour sequences sometimes changed dramatically. Note, however, that the average matching rate for rat99 was omitted from Figures 1 and 2, since the RC algorithm did not produce significantly better solutions than the NN algorithm on rat99 in the symmetric condition.
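One reading of the matching rate that is consistent with the 123 versus 312 example is a position-wise comparison: the fraction of tour positions at which both trials visit the same site. The sketch below assumes that reading, since the text does not spell out the formula; `matching_rate` and `ratio_sweep` are hypothetical names.

```python
def matching_rate(tour_a, tour_b):
    """Fraction of positions at which two equal-length tours visit the same site."""
    if len(tour_a) != len(tour_b):
        raise ValueError("tours must have the same length")
    matches = sum(a == b for a, b in zip(tour_a, tour_b))
    return matches / len(tour_a)

def ratio_sweep():
    """Values of ratio from 0.1 to 1.0 in steps of 0.1, as used in the sweep."""
    return [round(0.1 * k, 1) for k in range(1, 11)]

print(matching_rate([1, 2, 3], [3, 1, 2]))        # 0.0, as in the example in the text
print(matching_rate([1, 2, 3, 4], [1, 3, 2, 4]))  # 0.5
print(ratio_sweep())
```

Under this reading, a matching rate near 1 indicates that the agent repeats essentially the same tour from trial to trial, while a low matching rate indicates that the tours differ substantially between trials.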

Discussion
The experimental results for the symmetric TSP and asymmetric TSP (time-dependent TSP) conditions on several TSP datasets show that the RC algorithm performs better than the NN, FN, and FN 0.5 algorithms. In the RC algorithm, the agent sometimes ignores the NN method and selects the farthest site among the n-neighbor sites, depending on the relative distances to the nearest site and to the farthest site. Indeed, performance decreases when the agent always selects the nearest or the farthest site (the NN and FN algorithms) or randomly selects either the nearest or the farthest site (the FN 0.5 algorithm). Interestingly, critical behaviors emerge in some benchmark problems. This implies that the agent in the RC algorithm will continue to exhibit adaptive behaviors even if environmental situations change [18]. The agent in the RC algorithm appears to occasionally follow other routes while frequently following fixed routes, which are the same as the routes in the NN algorithm. The agent in the RC algorithm might perform better across several conditions (experimental conditions and various benchmark problems) owing to its transitions between the ordered phase and the disordered phase.
The TSP can be approximately solved by nonliving and living systems [4-6, 20, 21]. However, global information is often required in order to solve the TSP [1][2][3]. The NN method is a first-person-perspective algorithm and therefore does not require any global information. In that sense, it can be regarded as a simple bioinspired algorithm. Indeed, it has been shown that animals obeying the NN method can solve the TSP [6]. However, it is also true that living systems solving the TSP sometimes ignore the NN method [4]. Considering that such systems tune their decision-making at the agent level, we developed a first-person-perspective algorithm in which individual agents adjust their transition rules using local information. Our proposed algorithm offers solutions via local neighbor information. The RC algorithm also performs better than the NN algorithm for asymmetric TSP. Here, asymmetry is defined based on the agent's crop capacity: the more sites the agent visits, the more load it naturally has to carry. Our algorithm might be suitable for solving optimization problems connected to decision-making problems or transportation problems.
The TSP is applicable to problems such as vehicle routing, clustering, and job-shop scheduling [22]. Using the NN method for these problems might give optimal solutions to comparatively easy problems [23]. However, it should not be applied to strongly NP-hard problems or to tasks with more complex processes [24]. Furthermore, some approximation algorithms for those situations might require global knowledge such as the processing time [25]. In contrast, we assumed that the agent could not estimate global tour lengths. Using local information, the agent tried to produce adaptive solutions. This point will be advantageous for solving nonclairvoyant problems such as evacuation route problems.

Figure 1: Matching rate between two successive trials as a function of ratio (symmetric TSP). (a) Data from eil51. (b) Data from berlin52. (c) Data from st70. (d) Data from eil76. (e) Data from pr76. (f) An example of matching rate distributions for each agent (data from st70 are shown). The smaller the ratio is, the more frequently the agent chooses the farthest site, and the lower the matching rate becomes.

Figure 2: Matching rate between two successive trials as a function of ratio (asymmetric TSP). (a) Data from eil51. (b) Data from berlin52. (c) Data from st70. (d) Data from eil76. (e) Data from pr76. (f) An example of matching rate distributions for each agent (data from pr76 are shown). The smaller the ratio is, the more frequently the agent chooses the farthest site, and the lower the matching rate becomes.

Table 1: Results of the RC algorithm versus the NN, FN, and FN 0.5 algorithms for symmetric TSP. The first column indicates the benchmarks. The parameter n indicates the number of n-neighbors. The parameter ratio determines the similarity between the travel distances to the nearest site and to the farthest site among the n-neighbors. The average summations of total cost over agents are shown for the respective algorithms. Bold p values indicate that the RC algorithm performs better than the control algorithm under the Mann-Whitney U test.

Table 2: Results of the RC algorithm versus the NN, FN, and FN 0.5 algorithms for asymmetric TSP. The first column indicates the benchmarks. The parameter n indicates the number of n-neighbors. The parameter ratio determines the similarity between the travel distances to the nearest site and to the farthest site among the n-neighbors. The average summations of total cost over agents are shown for the respective algorithms. Bold p values indicate that the RC algorithm performs better than the control algorithm under the Mann-Whitney U test.