Advanced Phasmatodea Population Evolution Algorithm for Capacitated Vehicle Routing Problem


As a novel population-based algorithm, PPE was introduced in 2020. It has some merits, e.g., the principle of the algorithm is simple and easy to implement. However, it also has some disadvantages, e.g., it is time-consuming, it has low precision, and it falls into local optima easily. Therefore, the improvement of PPE is a challenging and meaningful study.
VRP was proposed by Dantzig and Ramser in 1959 and has been developed for many years [14]. As a branch of VRP, research on CVRP has been very active in transportation. CVRP refers to a set of customers with known demands served by vehicles with a fixed capacity limit.
CVRP is an important issue. On the one hand, it is a practical problem because many real-world problems can be abstracted into CVRP. For example, in a restaurant, a waiter needs to serve multiple tables to meet customers' order needs; when only one waiter is considered, the problem is abstracted into CVRP. Another example is that the heart of the human body supplies blood to other organs, and the blood vessels are regarded as pathways; after simplification and hypothesis, this process can be modeled as a CVRP. On the other hand, CVRP is also a challenging scientific issue. VRP is an NP-hard problem [15,16], so it is difficult for an exact algorithm to obtain the optimal solution in finite time.
Many scholars have studied CVRP. In 2006, Wu et al. used a new real-number encoding method of PSO for solving VRP [17]. Chen et al. proposed a novel hybrid algorithm for CVRP, in which discrete PSO searches for optimal results and simulated annealing is used to jump out of local optima [18]. In 2009, Ai and Kachitvichyanukul proposed two solution representation methods, namely SR-1 and SR-2, for CVRP [19]. In 2015, Zhang and Lee proposed a routing-directed artificial bee colony algorithm to solve CVRP [20]. In 2019, Altabeeb et al. put forward an improved hybrid firefly algorithm for CVRP and tested it on 82 instances [21]. In 2020, Khairy et al. put forward the enhanced group teaching optimization algorithm to solve CVRP and tested it on 14 instances with promising time results [22]. In 2021, Fu et al. proposed the parallel equilibrium optimizer algorithm to solve CVRP [23].
In this paper, the proposed APPE deletes competition, deletes conditional acceptance and the corresponding evolutionary trend update, combines a novel jump mechanism, and adds history-based searching for the population's position update and a population closing moving method for the early search phase. Then, the algorithms are tested on CEC2013 [24]. We also apply APPE to solve CVRP. The remainder of this paper is organized as follows. The CVRP model is described in section 2. Section 3 introduces PPE. APPE is described in section 4. In section 5, the experiments on the CEC2013 functions and CVRP are described. In section 6, a conclusion is given.

CVRP
The objective of CVRP is to minimize the total travel distance. The path taken by each vehicle must satisfy the capacity constraint, and there is only one warehouse (depot). The CVRP model consists of the objective (1) subject to the constraints (2)-(9).
Formula (1) is the objective function. Formulae (2) and (3) describe the decision variables. Formula (4) expresses the vehicle's capacity constraint. Equation (5) guarantees that every customer is served exactly once. Equations (6) and (7) ensure that one vehicle serves each customer. Equation (8) ensures the continuity of the route, so that every vehicle entering a customer point also leaves that point and eventually returns to the depot. Formula (9) eliminates subloops.
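Since the display equations are not reproduced here, the following is a sketch of a standard three-index CVRP formulation consistent with the descriptions above; the symbols (binary x_ijk for vehicle k traveling arc (i, j), binary y_ik for vehicle k serving customer i, distance c_ij, demand d_i, capacity Q, depot node 0, n customers, m vehicles) are our assumptions rather than the paper's exact notation:

```latex
\begin{aligned}
\min \; & Z=\sum_{k=1}^{m}\sum_{i=0}^{n}\sum_{j=0}^{n} c_{ij}\,x_{ijk} && (1)\\
\text{s.t. } & x_{ijk}\in\{0,1\} && (2)\\
& y_{ik}\in\{0,1\} && (3)\\
& \sum_{i=1}^{n} d_i\,y_{ik}\le Q, \quad \forall k && (4)\\
& \sum_{k=1}^{m} y_{ik}=1, \quad \forall i\in\{1,\dots,n\} && (5)\\
& \sum_{i=0}^{n} x_{ijk}=y_{jk}, \quad \forall j,k && (6)\\
& \sum_{j=0}^{n} x_{ijk}=y_{ik}, \quad \forall i,k && (7)\\
& \sum_{i=0}^{n} x_{ihk}-\sum_{j=0}^{n} x_{hjk}=0, \quad \forall h,k && (8)\\
& \sum_{i\in S}\sum_{j\in S} x_{ijk}\le |S|-1, \quad \forall S\subseteq\{1,\dots,n\},\,|S|\ge 2,\ \forall k && (9)
\end{aligned}
```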

PPE
PPE is inspired by the evolutionary process of the stick insect population. Like other population-based algorithms, PPE initializes Np solutions randomly. Every solution x has two properties: the first is the population quantity p, and the second is the population growth rate a. The evolution trend ev reflects the current evolutionary tendency. Then, the fitness values are calculated, and the global optimum, denoted gbest, is found. The best k solutions are stored in Ho, where k equals log(Np) + 1.
Next, in each iteration, the new position is the sum of the old position and the evolution trend, i.e., x^(t+1) = x^t + ev^t (10), where t is the current generation and x^t is the current solution's position. Then, the new solutions' fitness is calculated, and the global optimum gbest and the archive Ho are updated.
When the new solution's fitness is better, the current solution is unconditionally replaced by it. The growth rate a, the population quantity p, and the population evolution trend ev are then updated by equations (11)-(13).

Journal of Advanced Transportation

In equation (13), the population evolution trend ev consists of three parts. In the first part, s(Ho, x^t) denotes the solution in Ho nearest to x^t; this part reflects similar evolution. The second part preserves the inertia of the evolutionary trend. The final part is mutation.
When the new solution's fitness is worse, the current solution is conditionally replaced by the new solution. If a random number generated in [0, 1] is less than the population quantity p, namely the acceptance probability, the worse solution is accepted, and the growth rate a and population quantity p are conditionally updated as in equations (11) and (12). The evolution trend ev changes irrespective of whether the conditional acceptance is met, as given by equations (14) and (15), where st^(t+1) controls the exploration range; c is initialized to 2 and is updated by equation (16) when the algorithm produces a worse solution.
Competition also affects the population evolution trend. When the distance between two solutions is less than G, competition occurs, and the population quantity p and the population evolution trend ev are updated accordingly, where x_j is a randomly selected solution, out of the other Np − 1 solutions, whose distance from x_i is less than G. PPE's pseudocode is shown in Figure 1.
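For intuition, one iteration of the PPE scheme described above can be sketched as follows; the trend coefficients (0.5, 0.3, 0.1) and the sphere fitness are illustrative assumptions, not the paper's equations (11)-(16):

```python
import random

def sphere(x):
    # illustrative benchmark fitness (minimization)
    return sum(v * v for v in x)

def nearest_in_archive(archive, x):
    # s(Ho, x): the archive member closest to x in Euclidean distance
    return min(archive, key=lambda h: sum((a - b) ** 2 for a, b in zip(h, x)))

def ppe_step(pop, ev, p, archive, gbest, f, lo=-100.0, hi=100.0):
    # One iteration: move along the evolution trend, accept better
    # solutions, accept worse ones with probability p[i], then refresh
    # the trend from the nearest archive member (similar evolution),
    # the old trend (inertia), and Gaussian noise (mutation).
    for i in range(len(pop)):
        cand = [min(hi, max(lo, xv + ev[i][j])) for j, xv in enumerate(pop[i])]
        if f(cand) < f(pop[i]) or random.random() < p[i]:
            pop[i] = cand
        near = nearest_in_archive(archive, pop[i])
        ev[i] = [0.5 * (near[j] - pop[i][j]) + 0.3 * ev[i][j]
                 + 0.1 * random.gauss(0.0, 1.0)
                 for j in range(len(pop[i]))]
        if f(pop[i]) < f(gbest):
            gbest = pop[i][:]
    return gbest
```

The competition and conditional-acceptance details removed by APPE are deliberately left out of this sketch.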

Advanced PPE
PPE also has shortcomings, such as low convergence precision, high time consumption, and easily falling into local optima. Given these shortcomings, we propose APPE, which deletes competition, deletes conditional acceptance and the corresponding evolutionary trend update, and adds a jump mechanism, history-based searching, and population closing moving.

Without Competition.
The competition mechanism exists in many algorithms. The imperialist competitive algorithm is based on imperial competition, in which weaker empires collapse and stronger empires take over colonies [25]. In PPE, the competition mechanism is also considered: when the distance between two solutions is too small, the population quantity p and the population evolution trend ev are updated. However, when the algorithm converges to the global or a local optimal solution, this competition mechanism affects its convergence tendency. At the same time, the calculation of particle distances takes time. Therefore, we remove the competition mechanism, effectively reducing the algorithm's running time. This is equivalent to deleting the dist(x_j, x_i) < G part of the PPE pseudocode.

Without Conditional Acceptance and Corresponding Evolutionary Trend Update.

Some algorithms adopt the method of conditionally accepting a worse solution to improve diversity, such as simulated annealing [26,27]. Similarly, PPE adopts a worse solution with some probability, which increases the algorithm's diversity but also increases its running time. Therefore, we delete the conditional acceptance and the corresponding evolutionary trend update of PPE to save time and maintain a certain convergence trend. This is equivalent to deleting the f(newx) > f(x) part of the PPE pseudocode.

Jump Mechanism.
We introduce a jump mechanism to increase the algorithm's probability of jumping out of local optima. When the algorithm enters the late iterations and the optimal solution either remains unchanged for INV generations or the standard deviation of the optimal solution over the INV generations is less than the threshold INVGate, we let the solution jump by equation (19), where x_n^(t+1) is the t + 1 generation solution modified by the jump mechanism to replace the nth solution, x_n^t is a solution selected from the population, gbest is the current generation's optimal solution, and r_0 is a random number obeying a uniform distribution on [0, 1]. In this paper, we use the jump mechanism to modify JumpNum = 5 solutions, i.e., the worst solutions from the second to the sixth. When a generated probability is greater than r_Jump and the current population size is not less than 8, we delete the worst solution. Moreover, when the size of Ho exceeds the current population size, Ho is reset by k = log(Np) + 1, and the redundant poor solutions are deleted as well.
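A sketch of the trigger and the jump move follows; the INV = 4 and INVGate = 10^-6 values match the paper, while the blend in jump_solution is a hypothetical stand-in, since equation (19) itself is not reproduced here:

```python
import random
import statistics

def should_jump(best_history, t, max_gen, INV=4, INV_GATE=1e-6):
    # Trigger when the best value is unchanged for INV generations in
    # the late phase, or when its standard deviation over the last INV
    # generations falls below INV_GATE.
    if len(best_history) < INV:
        return False
    window = best_history[-INV:]
    late = t > max_gen // 2
    stagnant = late and len(set(window)) == 1
    flat = statistics.pstdev(window) < INV_GATE
    return stagnant or flat

def jump_solution(x_n, gbest):
    # Hypothetical jump move: a random convex blend of the selected
    # solution and the current global best, driven by one uniform r0.
    r0 = random.random()
    return [r0 * g + (1.0 - r0) * v for g, v in zip(gbest, x_n)]
```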

History-Based Searching.
Most swarm intelligence algorithms use the current information to interact only with the information of the last generation, so the information from before the last generation is lost; however, this information can also affect the convergence performance of the algorithm. Therefore, we design a history-based container HA, whose capacity is HAnum.
If the container is not full, the optimal new solution generated in each round, i.e., the best particle of x^(t+1) generated by equation (10), is added to the container. Otherwise, the global optimal solution gbest of the previous generation replaces a random solution in the container. Meanwhile, in the second half of the iterations, if the generated random number is less than 0.5, the worst new solution generated in each round, i.e., the worst particle of x^(t+1) generated by equation (10), replaces a random solution in the container.
Based on the above container, we design a history-based searching method to update the population. The history-based searching formula is equation (20), where x_HA^(t+1) is a new solution generated by the history-based searching method, x_HAbest is a solution randomly selected from the top five solutions of HA, and x_HArandom is randomly selected from HA. x_i^t denotes the ith solution of the current iteration, and x_p^t and x_q^t are solutions randomly selected from the population that are different from x_i^t. r_1 is an indirect random number, with r_1 = 0.1 tan(π(r_3 − 0.5)); r_2 and r_3 are direct random numbers obeying a uniform distribution on [0, 1]. In this paper, the population position update of APPE combines PPE's update with history-based searching, as given by equation (21): when a random number rand is less than 0.5 or the current generation t is no more than HAnum, the population position is updated by equation (10); otherwise, it is updated by equation (20).
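The history-based search and the update switch can be sketched as below; the exact combination of terms in equation (20) is simplified to a generic differential form, with only the r1 = 0.1 tan(π(r3 − 0.5)) step taken from the text:

```python
import math
import random

def history_candidate(pop, i, HA, f):
    # Simplified stand-in for equation (20): perturb x_i with a
    # Cauchy-like archive difference and a uniform peer difference.
    ha_sorted = sorted(HA, key=f)
    x_best = random.choice(ha_sorted[:5])   # one of the top-5 archive members
    x_rand = random.choice(HA)              # any archive member
    others = [s for j, s in enumerate(pop) if j != i]
    x_p, x_q = random.sample(others, 2)     # two distinct peers != x_i
    r1 = 0.1 * math.tan(math.pi * (random.random() - 0.5))
    r2 = random.random()
    return [xi + r1 * (xb - xr) + r2 * (xp - xq)
            for xi, xb, xr, xp, xq in zip(pop[i], x_best, x_rand, x_p, x_q)]

def next_position(t, HAnum, ppe_cand, hist_cand):
    # Equation (21)'s switch: PPE update when t <= HAnum or rand < 0.5,
    # otherwise the history-based candidate.
    if t <= HAnum or random.random() < 0.5:
        return ppe_cand
    return hist_cand
```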

Population Closing Moving.

A swarm intelligence algorithm mainly performs exploration in its early stage. Here, we use HA: in the first half of the algorithm's iteration cycle, when the generated random number is greater than 0.5, the whole PPE population moves toward the HA solutions to produce new solutions, as given by the moving formula (22), where x^t and x_HA^(t+1) are the whole population, rdn_1 and rdn_2 are random-number matrices that follow the standard normal distribution, and rand is a random-number matrix that follows a uniform distribution on [0, 1]. The number of rows in the x^t, x_HA^(t+1), rdn_1, rdn_2, and rand matrices is the population size, and the number of columns is the dimension. The * symbol denotes the multiplication of corresponding positional elements of the matrices, which is neither matrix multiplication nor convolution; notice that 2 * rdn_1 means that every element of the matrix rdn_1 is multiplied by 2. Then, we compare the new solutions generated by the moving formula (22) with the new solutions generated by equation (21) and retain the better ones as the new solutions.
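A rough sketch of the population closing move follows; since equation (22) is not reproduced here, the blend coefficients are illustrative assumptions, with the element-wise randomness applied per entry as described:

```python
import random

def closing_move(pop, ha, lo=-100.0, hi=100.0):
    # Each solution drifts toward a randomly chosen history-archive
    # solution; rnd plays the uniform matrix's role and rdn1 the
    # standard-normal matrix's role, applied element-wise (Hadamard).
    new_pop = []
    for x in pop:
        target = random.choice(ha)
        rdn1 = [random.gauss(0.0, 1.0) for _ in x]  # standard-normal row
        rnd = [random.random() for _ in x]          # uniform [0, 1) row
        row = [min(hi, max(lo, xv + rnd[j] * (target[j] - xv) + 0.1 * rdn1[j]))
               for j, xv in enumerate(x)]
        new_pop.append(row)
    return new_pop
```

In the algorithm proper, each moved solution would then be compared against the candidate from the combined update, keeping the better of the two.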
4.6. APPE. In this paper, we combine the above methods to form APPE. The following are the detailed steps of APPE.
(1) Initialization: Np stick insect populations x^t are initialized, and p, a, and ev of each stick insect are initialized to 1/Np, 1.1, and 0, respectively. Initialize the parameter k of Ho with k = log(Np) + 1. Initialize the jump mechanism parameter INV = 4 and the threshold INVGate = 10^-6. Initialize the parameter HAnum of the history-based archive HA with HAnum = Np. Initialize the current iteration t = 1 and the maximum iteration MAXGENS. Calculate the fitness f(x^t). Update the global best solution gbest and the global best value gbestval. If t = 1, initialize Ho with the current best k solutions. The global best solution gbest is put into HA.

(2) Jump mechanism: If the global best solution gbest has remained unchanged for INV generations and the algorithm is in the late iterations, or the standard deviation of the optimal solution over the INV generations is less than the threshold INVGate, the jump mechanism is applied: use equation (19) to modify the worst solutions from the second to the sixth. If a uniform random number is larger than the probability r_Jump and the current population size is not less than 8, delete the worst solution of the PPE population and of the HA solutions. Moreover, when the size of Ho exceeds the current population size, Ho is reset by k = log(Np) + 1, and the redundant poor solutions are deleted.

(3) Position update: New solutions are generated by equation (21).

(4) Population closing moving: Solutions PCM_x^(t+1) are generated by equation (22). Calculate the fitness f(PCM_x^(t+1)). Then, compare the new solutions generated by equation (22) with the new solutions generated by equation (21); the better one is kept as the current iteration's new solution.

(5) HA and Ho update: If HA is not full, HA is filled one by one with the optimal new solution of each round. If t > HAnum, the gbest of each round replaces a random solution of HA. When the number of iterations is more than half, the algorithm has a 50% chance of randomly replacing one of the HA solutions with the worst new solution of each round. Update the global best solution gbest and the global best value gbestval. If t > 1, the current best k solutions are compared with the corresponding solutions of Ho, and the better ones replace the solutions in Ho.

(6) Absolute acceptance: When the new solution's fitness is better than that of the old one, accept it, and update p, a, and ev of each stick insect by equations (11)-(13).

(7) Parameter border check: If a parameter is out of its border, reinitialize the corresponding stick insect.

(8) Termination: Steps 2 to 7 are repeated until the maximum generation is reached. Finally, record the best fitness gbestval and the best solution gbest.

Setting for CVRP.
We adopt the SR-1 method for solution representation, which means that the dimension of the PPE particle is n + 2m [19]. The first n dimensions of the solution are related to the customers, and each dimension represents the corresponding customer. The smaller the value of a dimension, the higher the priority of that customer.
Thus, the priority list of the customers can be obtained. The last 2m dimensions represent the coordinates of the m vehicles. According to the distances to these coordinates, the priority matrix of vehicles for each customer can be obtained. Customer points are arranged into vehicle routes one by one according to the customer priority order. For a customer point, which vehicle path it is arranged into is determined by the vehicle priority matrix: according to the priority order of vehicles, the customer point is arranged into the vehicle path with the highest priority. If the current vehicle path meets the capacity constraint, the customer is placed in this vehicle path; if it would exceed the capacity constraint, the customer is placed in the vehicle path with the next lower priority. Then, the corresponding paths are formed. An example of SR-1 is shown in Figure 2.
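A minimal sketch of SR-1 decoding, under the assumption that vehicle k's coordinates occupy entries n + 2k and n + 2k + 1 of the particle; in this sketch a customer that fits no vehicle is simply left unassigned:

```python
import math

def decode_sr1(particle, cust_xy, demand, capacity, m):
    # First n entries rank customers (smaller value = higher priority);
    # the last 2m entries are the m vehicle coordinate pairs.
    n = len(cust_xy)
    priority = sorted(range(n), key=lambda c: particle[c])
    veh_xy = [(particle[n + 2 * k], particle[n + 2 * k + 1]) for k in range(m)]
    routes = [[] for _ in range(m)]
    load = [0.0] * m
    for c in priority:
        # vehicles ordered by distance from customer c to each vehicle point
        order = sorted(range(m), key=lambda k: math.dist(cust_xy[c], veh_xy[k]))
        for k in order:
            if load[k] + demand[c] <= capacity:
                routes[k].append(c)
                load[k] += demand[c]
                break
    return routes
```

For example, four collinear customers with unit demands and two vehicle points at the line's ends split into two capacity-2 routes.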
We use some local searching methods to optimize the routes.
These methods are divided into inter-route optimization and intra-route optimization. Inter-route optimization mainly consists of three-point communication and deleting-adding communication, and the number of searches is the sum of the instance's points and vehicles. Three-point communication refers to the random selection of three paths in the solution: a point of each path is randomly selected, and the points are exchanged among the paths in the order 3, 1, 2. If the three newly generated paths are better than the original three, the original paths are replaced [28]. Deleting-adding communication refers to the random selection of two paths in the solution, with a point randomly selected in each path; one path deletes its selected point, and that point is added to the other path. If the two new paths are better than the original two, the original paths are replaced [29]. Intra-route optimization consists of path scrambling, path inversion, and path 2-point swapping. The number of executions of these methods is the square of the number of points in each path of the solution.
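As a hedged sketch, the deleting-adding communication described above might look like the following; the improvement test on the summed route costs, depot index 0 at both route ends, and the capacity pre-check are our assumptions:

```python
import random

def route_cost(route, dist):
    # total length of a route given a distance matrix
    return sum(dist[a][b] for a, b in zip(route, route[1:]))

def delete_add_move(r1, r2, dist, demand, capacity):
    # Move one random customer from r1 into a random position of r2,
    # keeping the move only if the combined cost improves and r2's
    # capacity is respected. Routes start and end at depot 0.
    if len(r1) <= 3:
        return r1, r2                     # no interior customer to move
    i = random.randrange(1, len(r1) - 1)
    cust = r1[i]
    if sum(demand[c] for c in r2[1:-1]) + demand[cust] > capacity:
        return r1, r2                     # would violate capacity
    j = random.randrange(1, len(r2))
    n1 = r1[:i] + r1[i + 1:]
    n2 = r2[:j] + [cust] + r2[j:]
    old = route_cost(r1, dist) + route_cost(r2, dist)
    new = route_cost(n1, dist) + route_cost(n2, dist)
    return (n1, n2) if new < old else (r1, r2)
```

Three-point communication follows the same accept-if-better pattern over three routes at once.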
Path scrambling refers to the random rearrangement of a path, except the starting and ending points, using the randperm function of Matlab. If the generated new path is better than the best solution, it replaces the original best solution; the next search still modifies the original path, but the comparison object is the best solution. Path inversion means that two points of the path are selected each time, and the sub-path between the two points is rearranged in reverse order. If the new solution produced is better, it is retained as the best solution for the next search [28]. Path 2-point swapping means that two points of the path are selected each time and exchanged. If the new solution produced is better, it is retained as the best solution for the next search [28].
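The three intra-route operators can be sketched as follows; Python's random.shuffle and random.sample stand in for Matlab's randperm, and the depot at both route ends is assumed:

```python
import random

def path_inversion(route):
    # Reverse the segment between two random interior points (2-opt-like).
    i, j = sorted(random.sample(range(1, len(route) - 1), 2))
    return route[:i] + route[i:j + 1][::-1] + route[j + 1:]

def two_point_swap(route):
    # Exchange two random interior customers.
    i, j = random.sample(range(1, len(route) - 1), 2)
    r = route[:]
    r[i], r[j] = r[j], r[i]
    return r

def path_scramble(route):
    # Randomly rearrange all interior customers, keeping the depots.
    mid = route[1:-1]
    random.shuffle(mid)
    return [route[0]] + mid + [route[-1]]
```

Each operator returns a fresh route, so the caller can compare it against the incumbent best solution as the text describes.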

Experiment and Application
In this section, we use CEC2013 to test the proposed algorithm, as shown in Table 1. The APPE experiment results are shown in Tables 2-13. Then, the CVRP results are shown in Tables 14-15; these compare APPE with DE, SSA, PSO, and PPE. We also compare APPE with some existing work in Tables 16-19. Three types of functions are included in CEC2013, as shown in Table 1. The first type is the unimodal functions, which test the exploitation ability. The second is the basic multimodal functions, which test the exploration ability. The third type is the composition functions, which are composed of the above-mentioned functions and represent challenging problems. The search range is [−100, 100] for each dimension.

Experiment Setting.
In this experiment, we compare APPE with DE, SSA, HHO, and PPE. Each algorithm has 100 solutions, i.e., ps = 100, with Dim = 10, 30, 50. Each algorithm has 51 independent runs on each benchmark, and Maxgen equals 10000 × Dim/ps. The parameter settings are as described above. The qualitative metric uses the convergence curve, and the quantitative measures comprise the best, mean, standard deviation, and average running time on the benchmark functions. Table 2 gives the best experimental results of APPE with Dim = 10, i.e., the best of 51 runs. Table 3 gives the mean results with Dim = 10, i.e., the mean of 51 runs. Table 4 gives the standard deviation results with Dim = 10, i.e., the standard deviation over 51 runs. Table 5 gives the time results with Dim = 10, i.e., the average running time of each algorithm over 51 runs. Similarly, Tables 6-9 give the results for Dim = 30, and Tables 10-13 the results for Dim = 50. We use the fitness error f − f_optimum for simplicity. We also use W/D/L to record each algorithm's win/draw/loss count on the 28 benchmark functions in Tables 2-13. Under a benchmark function test, if the algorithm has the best performance (minimum fitness value), W increases by one; if the algorithm ties other algorithms (the same fitness value), D increases by one; otherwise (the algorithm's fitness value is not the minimum), L increases by one. Figure 3 shows the convergence curves of APPE for dimension 10, compared with DE, SSA, HHO, and PPE. From Tables 2, 6, and 10, APPE's W/D/L for the best results are 20/0/8, 22/0/6, and 18/0/10, respectively, indicating that APPE can find better solutions and is more likely to jump out of local optima. From Tables 3, 7, and 11, APPE's W/D/L for the mean results are 18/0/10, 17/0/11, and 18/0/10, respectively, meaning that APPE has higher convergence accuracy.
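The W/D/L bookkeeping described above can be sketched as follows; the dictionary-based interface is our assumption:

```python
def wdl(results):
    # results: {algorithm: [fitness per benchmark]}. An algorithm scores
    # W on a benchmark if it alone attains the minimum value, D if it
    # ties the minimum with others, and L otherwise.
    algos = list(results)
    n_bench = len(next(iter(results.values())))
    tally = {a: [0, 0, 0] for a in algos}  # [W, D, L]
    for b in range(n_bench):
        vals = {a: results[a][b] for a in algos}
        best = min(vals.values())
        winners = [a for a, v in vals.items() if v == best]
        for a in algos:
            if vals[a] != best:
                tally[a][2] += 1
            elif len(winners) == 1:
                tally[a][0] += 1
            else:
                tally[a][1] += 1
    return tally
```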
As shown in Tables 4, 8, and 12, APPE's W/D/L for the standard deviation results are 8/0/20, 9/0/19, and 7/0/21, respectively. Compared with the other algorithms in these tables, APPE has moderate convergence stability. From Tables 5, 9, and 13, APPE's W/D/L for the time results are 25/0/3, 24/0/4, and 23/0/5, respectively, showing that APPE's running time is relatively short compared with the other algorithms. Therefore, both the convergence precision and the running time of APPE are quite good.

As shown in Figure 3, the convergence accuracy of APPE at the beginning of the iterations is at an average level. In the middle of the iterations, many algorithms reach a stable convergence state, similar to falling into a local optimum, while APPE still continues to search, demonstrating a certain exploration ability and the capability to jump out of local optima. In the latter part of the iterations, it enters a stable convergence stage like the other algorithms. In Figure 3, APPE's convergence curve is mostly at the bottom, i.e., its convergence precision is the best. In addition, APPE's convergence curve has an obviously steep downward slope, i.e., APPE can reach a better solution faster.

Applying for CVRP.
In this section, we apply APPE to solve CVRP. We use instances from [30,31] to test APPE's performance, comparing APPE with DE, SSA, PSO, and PPE. For more reliable data, we run each instance 10 times. The BKV item denotes the best-known value of an instance.
The population size is 50 particles, and the number of iterations is 1000. The settings of DE, PPE, and APPE are the same as aforementioned. For SSA, the producers account for 20%, SD = 20, and ST = 0.8. For PSO, ω = 0.8 and c1 = c2 = 2. We also use W/D/L to record each algorithm's win/draw/loss count on the instances in Tables 14-19, and the compared fitness value here is the sum of distances. The CVRP experiment results are shown in Tables 14 and 15.
In Table 14, the performance of APPE in searching for the global optimum is clearly better than that of the comparison algorithms; most instances obtain solutions close to the BKV. Similarly, in Table 15, APPE has better convergence accuracy for CVRP than the contrast algorithms. Figure 4 shows the best route result of A-n32-k5 solved by APPE: it has 5 routes, and its fitness is 787.0819, which is close to the BKV. This demonstrates that APPE can effectively solve CVRP.
We also compare APPE with some existing results. Table 16 gives the comparison of APPE with the results of Korayem et al. [32]. The setting is consistent with [32]: the population is 20, the maximum number of generations is 500, and each instance runs 10 times. In Table 16, the convergence precision of APPE is better than KmeansFnO, KmeansFnP, and KmeansFnR, which are all cluster-first route-second methods combining k-means with the gray wolf optimizer [32]. Table 17 shows the comparison of APPE with the results of Yan et al. [33]. The setting is consistent with [33], and each instance runs 20 times. In Table 17, the convergence precision of APPE is better than the constraint optimization harmony search (CO-HS) of [33], PSO, and GA; the three comparison algorithms' data come from [33]. Table 18 shows the comparison of APPE with the results of Zhao et al. [34]. The setting is consistent with [34]: the population is 200, the maximum number of generations is 500, and each instance runs 10 times. It can be seen from Table 18 that APPE is superior to quantum DE (QDE), the quantum evolution algorithm (QEA), and DE in both best and mean performance, with data from [34]. Table 19 shows the results of APPE and the work of Khairy et al. [22]. The setting is consistent with [22]: the population is 40, the maximum number of generations is 1000, and each instance runs 10 times. In Table 19, the convergence accuracy of APPE is significantly better than GA, ant colony optimization (ACO), and group teaching optimization (GTO), with data from [22].

Conclusions
We propose APPE, which deletes competition as well as conditional acceptance and the corresponding evolutionary trend update to shorten the algorithm's running time. It also adds a jump mechanism, history-based searching, and population closing moving to make PPE more likely to jump out of local optima and to improve PPE's convergence accuracy. Then, we test APPE on CEC2013 against DE, SSA, HHO, and PPE. The experimental results show that APPE has higher convergence accuracy and shorter running time. Finally, APPE is applied to solve CVRP. The instance test results show that APPE solves CVRP more effectively than DE, SSA, PSO, and PPE. We also compare our algorithm with some existing work; the results show that APPE is able to solve CVRP.