Coordination of Pheromone Deposition Might Solve Time-Constrained Travelling Salesman Problem

In this study, we develop two Ant Colony Optimization (ACO) models as new metaheuristic models for solving the time-constrained Travelling Salesman Problem (TSP). Here, the time-constrained TSP means a TSP in which several cities carry constraints requiring the agents to visit them within prescribed time limits. In our ACO models, only agents that achieve a tour under certain conditions, defined separately in each model, are allowed to modulate pheromone deposition. The agents in one model are allowed to deposit pheromone only if they achieve a tour that strictly satisfies the time constraints. The agents in the other model are allowed to deposit pheromone not only if they achieve a tour that strictly satisfies the constraints but also if they achieve a tour that satisfies them to some degree. We compare the performance of the two developed ACO models by focusing on pheromone deposition. We confirm that the latter model performs well on some TSP benchmark datasets from TSPLIB in comparison to the former and the traditional AS (Ant System) models. Furthermore, the agents exhibit critical properties; i.e., the system exhibits complex behaviors. These results suggest that the agents perform adaptive travels by coordinating complex pheromone depositions.


Introduction
M. Dorigo proposed Ant Colony Optimization (ACO) as a metaheuristic for solving combinatorial optimization problems [1,2]. ACO was inspired by the indirect, collective communication of real ants, which interact with each other through chemical substances (pheromones).
ACO models have been applied to the Travelling Salesman Problem (TSP), which demands the shortest tour under the condition that travelling agents visit each city only once and return to the start city. ACO might therefore be a powerful tool for solving the TSP and some dynamic manufacturing problems in the real world [3][4][5][6][7]. Further, it is well known that the TSP is related to some scheduling problems [6,8].
The time-dependent/time-constrained TSP is widely studied as an important problem because, in natural conditions, the cost between any two cities can vary over time [8,9]. This problem is related to scheduling time-dependent tasks, such as the scheduling of manufacturing jobs. Bioinspired or metaheuristic models have been proposed for the time-dependent/time-constrained TSP and related problems [10][11][12]. ACO models have also been proposed for such problems [13][14][15]. In fact, advanced ACO models have been proposed for various problems [16][17][18].
Previous ACO models reveal that both exploiting and exploring the solution space can be an effective search strategy for the time-dependent/time-constrained TSP [13,14]. However, those models seem to adopt a transcendental point of view, i.e., they use the global best solution in the pheromone update, which is incompatible with the behavior of real ants. As members of a swarm intelligence system, artificial ants must make decisions individually using limited local information.
To this end, we propose ACO models for the time-constrained TSP in which individual agents judge whether or not to deposit pheromones after each tour. In our time-constrained TSP, several cities have to be visited within individually prescribed time spans; i.e., each agent must find an optimal tour under the constraints of visiting certain cities within their respective specified times. This situation corresponds to a delivery problem with specified delivery-time constraints.
The ACO models imitate the positive feedback of real ants and eventually lead all the ants to a single path. However, further positive feedback might be needed for the above time-constrained TSP. Real ants deposit pheromone more often when they encounter profitable resources [19,20]. We therefore add a similar mechanism to our ACO models and propose two different models.
In the first ACO model, agents are allowed to update pheromone if and only if they achieve a tour in which they visited all time-constrained cities within the specified time and that tour was better than any tour the agent had found until then. Although a system based on this pheromone update rule rapidly attracts the agents to one solution, the diversity of solutions in the system will be lost because the rule prevents the agents from deviating from that solution. Real ants allow multiple food locations to be exploited simultaneously when they encounter an ambiguous situation, by upregulating pheromone deposition [21]. In that sense, ants might coordinate the deposition of pheromones.
With reference to this feature, we construct the second ACO model, in which agents deposit pheromones positively when they finish a tour in which they visited not all but some of the time-constrained cities within the specified time.
The pheromone deposition rule in the first model represents a strict learning procedure, corresponding to the requirement of strict constraint satisfaction in mathematical programming. In contrast, the pheromone deposition rule in the second model corresponds to a tolerant learning procedure in soft computing. We found that the "upregulated pheromone" in the second ACO model could serve as a key to finding better solutions.

Ant Colony Optimization for TSP.
Ant system (AS) is a basic ACO model. Here, we describe the concepts of AS.
First, the city to which an agent is assigned serves as both the start city and the goal city of that agent's circuit tour. Agent $k$, located at city $i$, determines the next city $j$ based on the following probability:

$$p_{ij}^{k}(t) = \frac{[\tau_{ij}(t)]^{\alpha}\,[\eta_{ij}]^{\beta}}{\sum_{l \in N_k(t)} [\tau_{il}(t)]^{\alpha}\,[\eta_{il}]^{\beta}}, \quad j \in N_k(t), \tag{1}$$

where $N_k(t)$ represents the set of cities that agent $k$ has not yet visited at the present time in tour $t$. Here, $i$ and $j$ denote the present city and a candidate next city included in the unvisited city set $N_k(t)$, respectively. Further, $\tau_{ij}$ indicates the pheromone amount between cities $i$ and $j$. The parameter $\eta_{ij}$ indicates the heuristic information, defined as the inverse of the distance between the two cities $i$ and $j$. The parameters $\alpha$ and $\beta$ indicate the weights of the pheromone and the distance function in the city selection, respectively.
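The city-selection rule described above can be sketched in a few lines of Python. This is only an illustrative sketch of the standard AS random-proportional rule, not the authors' implementation: the function names, the nested-list representation of the pheromone matrix `tau` and distance matrix `dist`, and the default values of `alpha` and `beta` are all assumptions made for the example.

```python
import random


def transition_probabilities(current, unvisited, tau, dist, alpha=1.0, beta=2.0):
    """Return {city: probability} for moving from `current` to each unvisited city.

    Each weight is pheromone^alpha * (1 / distance)^beta, normalised over
    the unvisited set (hypothetical helper, names are assumptions).
    """
    weights = {
        j: (tau[current][j] ** alpha) * ((1.0 / dist[current][j]) ** beta)
        for j in unvisited
    }
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def choose_next_city(current, unvisited, tau, dist, alpha=1.0, beta=2.0, rng=random):
    """Sample the next city according to the probabilities above."""
    probs = transition_probabilities(current, unvisited, tau, dist, alpha, beta)
    cities = list(probs)
    return rng.choices(cities, weights=[probs[c] for c in cities], k=1)[0]
```

With uniform pheromone, the rule reduces to a preference for nearby cities whose strength is controlled by `beta`.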
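After all agents finish one tour, the pheromone amounts $\tau_{ij}$ are updated as follows:

$$\tau_{ij}(t+1) = (1-\rho)\,\tau_{ij}(t) + \sum_{k=1}^{m} \Delta\tau_{ij}^{k}(t), \qquad \Delta\tau_{ij}^{k}(t) = \begin{cases} 1/L_k(t), & (i,j) \in T_k(t), \\ 0, & \text{otherwise}. \end{cases} \tag{2}$$

Here, $L_k(t)$ represents the circuit tour length that agent $k$ travelled in tour $t$, and $T_k(t)$ indicates the set of city pairs on the circuit route that agent $k$ travelled in tour $t$. Further, $\rho$ represents the pheromone evaporation rate satisfying $0 < \rho < 1$, and $m$ indicates the number of agents.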
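The pheromone update after one synchronous tour can be sketched as follows. This is a minimal sketch under stated assumptions: every edge first evaporates by a factor of one minus the evaporation rate, and each agent then adds the inverse of its circuit length to the edges it used (a deposit constant of 1 is assumed here). The function names and the value `rho=0.5` are hypothetical.

```python
def tour_length(tour, dist):
    """Length of the closed circuit, including the return to the start city."""
    n = len(tour)
    return sum(dist[tour[i]][tour[(i + 1) % n]] for i in range(n))


def tour_edges(tour):
    """Edges of the closed circuit as unordered pairs (symmetric TSP)."""
    n = len(tour)
    return {frozenset((tour[i], tour[(i + 1) % n])) for i in range(n)}


def update_pheromone(tau, tours, dist, rho=0.5):
    """Evaporate all edges, then deposit 1 / (tour length) on each agent's edges."""
    n = len(tau)
    new_tau = [[(1.0 - rho) * tau[i][j] for j in range(n)] for i in range(n)]
    for tour in tours:
        deposit = 1.0 / tour_length(tour, dist)
        for edge in tour_edges(tour):
            i, j = tuple(edge)
            # Keep the matrix symmetric, since distances are symmetric.
            new_tau[i][j] += deposit
            new_tau[j][i] += deposit
    return new_tau
```

Returning a new matrix rather than mutating `tau` keeps the update synchronous: all agents of one iteration read the same pheromone values.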

Definition of the Time-Constrained TSP.
We apply ACO models to the time-constrained TSP. The classical time-dependent TSP is a version of the TSP in which the transition cost between one city and another depends on the period of the day [8]. In this paper, however, we introduce the time-constrained TSP, a different version of the time-dependent TSP in which several cities have their own time constraints and agents must visit each such city within a prescribed time. For example, delivery workers must arrange a time to deliver if customers specify the delivery time. Here, we introduce time-constrained cities in the following manner.
At the beginning of each trial, the start city is randomly chosen. All agents are placed on the same start city. Thereafter, several cities are randomly chosen as time-constrained cities. Agents must visit those cities within a limited time duration. More specifically, each chosen time-constrained city $c$ is assigned a limited time duration computed from $d_{\#,c}$ and $r$, where $d_{\#,c}$ indicates the distance between the start city of the agents and the constrained city $c$, and "#" means a tentative number for the start city. Further, $r$ indicates a random number in $[0.0, 1.0]$. When agent $k$ visits the time-constrained city $c$, the agent is regarded as having visited that city within the specified time only if its tour length up to that point is smaller than the limited time duration of that city.
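The on-time criterion in the last sentence above can be illustrated with a short sketch. It assumes that the limited time durations are already available as a dictionary `deadlines` (how they are generated is not modelled here) and that elapsed time equals distance travelled; the function name and argument layout are hypothetical.

```python
def count_on_time_visits(tour, dist, deadlines):
    """Count time-constrained cities reached before their limited time duration.

    tour      -- visiting order, beginning with the start city
    dist      -- symmetric distance matrix (nested lists)
    deadlines -- {city: limited time duration}; assumed to be given
    """
    travelled = 0.0
    on_time = 0
    for prev, city in zip(tour, tour[1:]):
        travelled += dist[prev][city]
        # A visit counts only if the tour length so far is strictly smaller
        # than the limit assigned to that city.
        if city in deadlines and travelled < deadlines[city]:
            on_time += 1
    return on_time
```

The returned count can then be compared with the total number of time-constrained cities to decide whether a tour satisfies the constraints fully or only partially.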

Details of Our Models.
Here, we describe the details of our models. The first is named the strict ACO model for the time-constrained TSP. In the strict ACO model, agents are allowed to add pheromone to their paths only if they visit all time-constrained cities within the limited time duration and update their own best-so-far solution. The second is named the tolerant ACO model for the time-constrained TSP. In the tolerant ACO model, agents are permitted to add pheromone positively to their paths if they visit several of the time-constrained cities within the limited time duration and update their own best-so-far solution.
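The difference between the two deposition rules can be summarised in a small decision function. This is a sketch under assumptions, not the authors' code: the threshold `min_on_time`, which stands in for "several" time-constrained cities, is a hypothetical parameter, and the amount of pheromone deposited in each case is not modelled.

```python
def may_deposit(model, on_time, n_constrained, length, best_length, min_on_time=1):
    """Decide whether an agent may deposit pheromone after finishing a tour.

    model         -- "strict" or "tolerant"
    on_time       -- number of time-constrained cities visited within their limits
    n_constrained -- total number of time-constrained cities
    length        -- length of the tour just completed
    best_length   -- the agent's own best-so-far tour length
    min_on_time   -- hypothetical threshold for the tolerant model
    """
    improved = length < best_length  # the agent updates its own best-so-far
    if model == "strict":
        return improved and on_time == n_constrained
    if model == "tolerant":
        return improved and on_time >= min_on_time
    raise ValueError(f"unknown model: {model!r}")
```

Both rules rely only on information held by the individual agent, so no global best solution is consulted.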
We explain the submodels for the tour iteration of each ACO model. Please note that we adopt synchronous updates in both ACO models.
Step 1 (city selection). Agent $k$ chooses a city from the set of unvisited cities using (1).
Step 2 (on time?). If agent $k$ chooses a time-constrained city and succeeds in visiting that city within the limited time duration, then the counter $s_k$ is incremented by one, where $s_k$ describes the number of time-constrained cities that agent $k$ has succeeded in visiting within the limited time duration. The agent then updates its position.

Results
We solved the time-constrained TSP using two symmetric TSP datasets (eil51, berlin52) from TSPLIB. In this paper, we only used benchmark data whose distances are specified as Euclidean 2D (EDGE_WEIGHT_TYPE: EUC_2D). Please also see Table 1 for the parameters in the respective ACO models. Table 2 presents the number of trials in which each model attains the best solution (ties included) among the three models. Here, we focused on tours in which the agents succeeded in visiting all time-constrained cities within the constrained time. These data were obtained from 100 trials. We found that the tolerant ACO model performed better than the other two models (vs. the strict ACO model, chi-squared test, eil51: $\chi^2 = 10.39$, df = 1, $p < 1.00\mathrm{E}{-03}$, berlin52: $\chi^2 = 4.18$, df = 1, $p < 0.05$; vs. the AS model, chi-squared test, eil51: $\chi^2 = 34.22$, df = 1, $p < 1.00\mathrm{E}{-03}$, berlin52: $\chi^2 = 45.79$, df = 1, $p < 1.00\mathrm{E}{-03}$). Please note that we changed one parameter value from 3 to 2 in the case of berlin52 because the former value did not show any significant difference between the tolerant ACO model and the strict ACO model ($\chi^2 = 2.21$, df = 1, $p = 0.14$).
Figure 1 represents the tour interval between two consecutive pheromone depositions of individual agents, which followed a power-law distribution (Akaike Information Criterion (AIC) weight for the power law against the exponential law = 1.00, number of data points = 69, exponent = 1.39). Here, we used the eil51 dataset. This result suggests that the tolerant ACO model differs from regular-transition models in which the agents switch rules of pheromone deposition regularly.
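As a rough illustration of how AIC weights compare a power-law fit with an exponential fit, the sketch below uses continuous maximum-likelihood estimates above a lower cutoff `xmin`. The fitting procedure actually used for Figure 1 is not specified in the text, so the continuous approximation, the shifted exponential form, the cutoff, and all function names are assumptions.

```python
import math


def power_law_loglik(xs, xmin):
    """Log-likelihood and MLE exponent for p(x) = ((mu - 1) / xmin) * (x / xmin)^(-mu)."""
    n = len(xs)
    s = sum(math.log(x / xmin) for x in xs)
    mu = 1.0 + n / s
    return n * math.log((mu - 1.0) / xmin) - mu * s, mu


def exponential_loglik(xs, xmin):
    """Log-likelihood and MLE rate for p(x) = lam * exp(-lam * (x - xmin))."""
    n = len(xs)
    s = sum(x - xmin for x in xs)
    lam = n / s
    return n * math.log(lam) - lam * s, lam


def akaike_weights(log_likelihoods, n_params):
    """Akaike weights from log-likelihoods and parameter counts (AIC = 2k - 2 ln L)."""
    aics = [2 * k - 2 * ll for ll, k in zip(log_likelihoods, n_params)]
    best = min(aics)
    rel = [math.exp(-0.5 * (a - best)) for a in aics]
    total = sum(rel)
    return [r / total for r in rel]
```

A weight close to 1.00 for the power law means that, of the two candidate models, the power law is far better supported by the data; it does not by itself show that the power law is the true distribution.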
Finally, we comment on parameter effects based on an additional analysis using the eil51 dataset. The upper row of Table 3 [...] row of Table 3 represents the number of trials in which each model attains the best solution among the three models when a parameter value of 4 is replaced with 6. The tolerant ACO model is again not inferior to any other model (vs. strict ACO, chi-squared test, $\chi^2 = 0.035$, df = 1, $p = 0.85$; vs. AS, chi-squared test, $\chi^2 = 14.98$, df = 1, $p < 1.00\mathrm{E}{-03}$). These results suggest that the tolerant ACO model solves the time-constrained TSP flexibly to some extent.

Discussion
In this paper, we developed ACO models to deal with the time-constrained TSP. We proposed two different models. One was the strict ACO model, in which the agents deposited pheromone if and only if they found a tour in which all the time-constrained cities were visited within the limited time duration and that tour was better than any tour the individual had achieved until then. The other was the tolerant ACO model, in which the agents sometimes deposited pheromone positively even if they did not strictly achieve such a tour. We found that the latter model produced better solutions than the former model and the AS model.
It is known that ACO models can fall into a local solution [22]. To overcome this problem, our models imitate the system of real ants. Ants seem to deal with overcrowding on a certain path by modulating pheromone deposition [21]. Agents appear to face difficult situations in which they cannot judge whether the information they obtain is profitable for their system. In that sense, agents exploring different possibilities might prevent their system from being attracted to a local solution. We found that the interval between two consecutive pheromone depositions followed a power law in the tolerant model. Complex evolution of the interval between pheromone depositions might be essential to achieving an optimal tour. While obeying a power-law-tailed distribution, agents might often deposit pheromones on certain tours but occasionally stop depositing pheromones. Such a balance in pheromone deposition enables the system both to be attracted to a local solution and to deviate from that solution.
The tolerant ACO model is not inferior to any other model when some parameters are replaced. However, we might be able to improve our proposed model by considering parameter effects. Proposing ACO models in which agents modify their own parameters adaptively would enable the system to perform flexibly under various conditions. This remains an issue for future work.

Figure 1: Tour period between any two consecutive pheromone depositions for agents.

Table 1: Parameters for the calculation.

Table 2: The number of trials of the shortest tour of each model. Here, we focus on the tours in which all time-constrained cities are visited within the limited time duration. One hundred trials are conducted.

Table 3: The number of trials of the shortest tour of each model by replacing a certain equation. Eil51 was used for these analyses. One hundred trials are conducted.