Driving Strategy Using an Improved Ant Colony System for Energy-Efficient Train

Energy-efficient train operation is one of the most widely studied areas in transportation science, since it can significantly reduce energy consumption, which accounts for a large proportion of operating costs. To adapt to complex and changeable railway line conditions such as gradient, slope length, and speed limit, and to avoid the error in tracking a speed curve, an optimal driving strategy decision-making (ODSD) model is proposed in this paper. The model considers a non-fixed sequence of driving regimes, and the regimes are directly selected in discrete micro-subsegments of an equal time-division pattern. To solve this model efficiently, an improved ant colony system algorithm with difference edges (ACSd) is proposed, which takes into account the heuristic effect of the difference between the best solutions of two adjacent iterations, i.e., “the difference edges.” Additionally, an energy-efficient heuristic factor and a speed heuristic factor are presented to balance energy saving and speed. The results demonstrate that ACSd performs better than the basic ant colony system algorithm in solving the traveling salesman problem (TSP) and provides more flexible driving strategies for the ODSD model.


Introduction
Transportation energy consumption in China reached 9.0% of the total energy consumption in 2019 [1], and the railway sector consumed 14 billion kilowatt hours of electricity in 2021 [2]. Because train operation consumes so much energy, improving its energy efficiency has attracted increasing attention in recent years. Many energy-saving strategies have been considered, such as equipment innovation in lightweight vehicle bodies. Although such hardware modifications can promote energy conservation, improving train operation is a highly promising choice for energy saving [3] since it does not require such costs. The improvement may be achieved by solving the train driving regime selection problem, i.e., finding an optimal sequence of driving regimes. An energy-efficient driving model should be developed before solving this problem.
Ichikawa [4] introduced Pontryagin's maximum principle into an energy-efficient driving model and first proposed the analytic method for optimizing driving regimes. Since then, this method has been widely used in energy-efficient models [5][6][7][8][9]. The analytic method obtains the optimal driving regime sequence through strict analytic derivation based on optimal control theory. Specifically, the optimal driving regimes of train operation and their sequence are derived from the analytic equations constructed with Pontryagin's maximum principle, and then the locations of the regime switching points are derived under different constraint conditions.
The derived optimal driving regime sequence differs among the energy-efficient models established in the literature. For gentle slopes and short intervals, Milroy [10] established an optimization model and derived the optimal driving regime consisting of maximum traction, coasting, and maximum braking. On this basis, Lee et al. [11] argued that for longer operating ranges, the optimal driving regime should include the cruising regime. As for the combination of different slopes, Cheng and Howlett [12] derived the optimal driving regime corresponding to different slopes. To establish a model that adapts to any slope and speed limit, Liu and Golovitcher [13] divided cruising into partial traction cruising and partial braking cruising and provided the necessary conditions for the existence of the optimal switching points between two regimes. Furthermore, Albrecht and Howlett et al. [14,15] provided a calculation formula for the optimal regime switching point on a slope while ensuring the minimum local energy. The results of the analytic method are accurate but not suitable for problems with many regime switching points. The models established to solve such problems are often oversimplified to the point where there is a gap between theoretical analysis and practical applications.
To avoid the impact of such simplification on the actual solution, simulation methods based on the actual situation of the railway line are often used. For example, Mao et al. [16] proposed the target-speed-control method, which allows the train to run within a preset target speed range and determines the change of driving regimes according to energy efficiency. Feng [17] analyzed the traction energy cost and transport operation time of trains at different target speeds through computer-aided simulation. However, it is difficult for such methods to determine an accurate target speed. Moreover, the solutions from such simulations are rough, and the results cannot be guaranteed to be globally optimal.
Two typical strategies are usually considered in the process of establishing a train energy-efficient driving model when heuristic search methods are utilized.
One strategy is to preset a relatively fixed sequence of driving regimes and then search for the switching points between the regimes, so that a combination of driving regimes containing the optimal transition points is obtained. Wong and Ho [29] dynamically allocated the number of coasting switching points and searched for the locations of these points with a genetic algorithm. Similarly, Sandidzadeh and Alai [30] determined the switching points of different regimes in a continuous domain with a genetic algorithm and ant colony optimization. To find the optimal traction utilization coefficient, braking utilization coefficient, cruise position, coasting position, and braking position, He et al. [31] proposed a simulation model with an improved differential evolution algorithm. However, such models can usually deal only with a relatively fixed regime sequence, and they are not suitable for railway lines whose gradients, slope lengths, and speed limits change constantly.
Another strategy is to directly construct a speed curve that serves as an auxiliary tool for forming an optimal driving regime combination. The "speed code" model is known for constructing the speed curve on a lattice composed of discrete operating intervals and speeds, and the speed change points of the curve are then obtained through an optimization search. To construct the speed curve, Lu et al. [18] applied an ant colony algorithm, a genetic algorithm, and dynamic programming to search for the speed change points in different lattices of the "speed code" model. Zhan et al. [32] formulated the detailed train speed profile between two stations as a multiple-phase optimal control model, which is solved using a pseudo-spectral method. He et al. [33] optimized the end speed of each discrete subsegment to construct the speed curve based on an improved chicken swarm optimization algorithm, considering both train energy consumption and regenerative braking energy. However, in the "speed code" model, the gradient of the line connecting adjacent preset speed points may not reach the maximum dynamic characteristics of the train (maximum traction or maximum braking), and thus the optimal regimes derived from Pontryagin's maximum principle may not be fully utilized. Furthermore, even if a speed profile has been generated, the train will still not be able to track the profile accurately under the driving regimes determined from it since there are speed errors [34,35].
A railroad line usually involves complex conditions, such as many slopes with different gradients and lengths, which are superimposed with different horizontal curves, as well as different speed limits on different gradients. Models established with the above two strategies have difficulty adapting to such complex conditions. To cope with these conditions, we divide the train running section into many micro-subsegments. Because the line conditions within such a subsegment are unchanged, the choice of driving regimes becomes easier. Hence, with the micro-subsegments, we propose a novel optimal driving strategy decision-making (ODSD) model for minimizing energy consumption. In this model, the required driving regimes, rather than a speed curve, are obtained directly because a driving strategy is represented by a combination of driving regimes, which avoids the speed tracking error.
On the other hand, there are many discretized subsegments in this case, so the driving regime selection problem becomes a discrete optimization problem of obtaining a driving regime combination. For solving discrete optimization problems, the ant colony algorithm [36] and its variants have been used widely in engineering problems [37][38][39][40][41][42]. An important component of this algorithm is a record of pheromone trails that reflect the colony's experience with previously constructed solutions.
Although the algorithm is robust in performance, it also has some inherent defects in updating pheromone trails [43]. Traditionally, there are two different strategies for updating pheromone trails: global-best and iteration-best. However, the former may cause the algorithm to converge too fast toward suboptimal solutions, while the latter may sometimes converge too slowly and lack focus. To improve the performance of the ant colony algorithm, many scholars have studied methods of updating pheromone trails. Afek et al. [44] presented the minimum number of pheromones necessary for a colony of ants to find a food source. Acharya et al. [45] introduced an exponentially increasing pheromone deposition approach by artificial ants. For updating the pheromone value, Myszkowski et al. [46] selected the worst or best solution found by an ant in a given iteration and updated the pheromone value by the worst. Ivkovic et al. [47] analyzed the effect of different pheromone trail reinforcement strategies and confirmed that numerically adjustable strategies can significantly improve algorithmic performance.
A pheromone updating strategy depends on the type of problem, heuristic information, conditions, and parameters [47,48]. For the driving regime selection problem, we exploit the heuristic effect of the difference between the best solutions of two adjacent iterations when updating the pheromone, in order to improve the computing performance of the ant colony system (ACS) algorithm in dealing with the many subsegments in the model.
The main contributions of this study are as follows:
(1) A novel, flexible ODSD model is established that does not depend on a fixed sequence of driving regimes. This model divides the railway section into multiple micro equal time subsegments, so that the energy-efficient driving regime in each subsegment can be obtained directly. It adapts to complex railway line conditions and avoids the speed tracking error since there is no need to generate a speed curve.
(2) An improved ACS algorithm (named the ACSd algorithm) is presented. Because the many discretized subsegments make the problem harder to solve, the difference edge strategy is introduced into the ACS algorithm to improve its global pheromone updating rule, which can avoid premature convergence and further improve the optimization performance.
(3) To balance average speed and energy saving in the selection of driving regimes, heuristic information is presented and introduced into the ACSd framework for solving the ODSD model.
The remainder of this paper is organized as follows. The section "Optimal Train Driving Strategy Decision-Making Model" constructs the ODSD model with the equal time-division pattern. The section "Solving the ODSD Model with ACSd" explains the proposed ACSd algorithm and combines it with the ODSD model. The section "Experiments on TSP with the ACSd Algorithm" tests and analyzes the performance of ACSd in solving TSP. The section "Experiment on the ODSD Model with ACSd" applies the ODSD model and the ACSd algorithm to a case and gives the experimental results. Finally, the section "Conclusions" summarizes the main findings.

Optimal Train Driving Strategy Decision-Making Model
2.1. Mathematical Model Formulation. The shorter the total running time of a train on the same railway line, the greater the energy consumption [49]. This paper considers a scenario with timing requirements, i.e., the train runs with a fixed running time ($T$) between two adjacent stations. Within this fixed running time, the train passes the railway section of length $L$ from the center $O$ of one station (initial speed $v_0 = 0$) to the center $D$ of the next station (final speed $v_N = v(T) = 0$). The objective function of the ODSD model for the energy-efficient train is the total trip energy consumption. Since the energy consumption of train traction systems can account for up to 85% of the total energy consumption [43], we aim to minimize the energy consumption of the train traction system. Therefore, the basic model for the energy consumption to be minimized is the work done by the traction power [3,50]:

$$\min E = \int_{0}^{T} u^{+}(t)\, v(t)\, dt, \qquad (1)$$

subject to the equations of motion of the train and the boundary conditions above, where $v(t)$ is the speed of the train; $x(t)$ is the distance traveled over time; $\dot{v}(t)$ is the derivative of speed with respect to time; $r(v(t))$ is the resistance experienced by a unit-mass train traveling at speed $v(t)$; and $u(t)$ is the traction or braking force, with $u(t) \ge 0$ denoting traction and $u(t) < 0$ denoting braking. The integral considers only the positive traction force, i.e., $u^{+}(t) = \max(u(t), 0)$.

Equal Time-Division Pattern.
Our aim is to find a driving strategy that meets the above conditions and minimizes the energy consumption of the train. This strategy is composed of a driving regime sequence and its switching points.
To allow a free selection of driving regimes instead of a fixed sequence, we divide the total running time into $N$ micro equal time subsegments (Figure 1), represented by $SE_i$ ($i = 1, 2, \ldots, N$).
We name the structure in Figure 1 the equal time-division pattern. According to Figure 1, the total energy is the sum of the energy of each subsegment. With the discrete time subsegments $SE_i$, equation (1) is replaced by

$$E = \sum_{i=1}^{N} E_i, \qquad (2)$$

where $E_i = u^{+}(t)\,v(t)\,\Delta t$ is the energy consumption of the train running in $SE_i$. The energy consumed in $SE_i$ depends on the traction force under the railway line conditions of $SE_i$. In addition, the regime $r_i$, the inlet speed $v_{i-1}$ (on arriving at $SE_i$), and the time step size $\Delta t$ affect the amount of energy consumed in $SE_i$. Let the equal time subsegment size be $\Delta t = T/N$. Thus, the running distance of the train in subsegment $SE_i$ is as follows:

$$S_i = v_{i-1}\,\Delta t + \frac{1}{2}\, a\, \Delta t^{2}, \qquad (3)$$

where $a$ is the acceleration and $v_{i-1}$ is the inlet speed of $SE_i$. The acceleration is determined by the forces acting on the train: $m$ denotes the mass of the train, $F_t$ is the traction force, $F_b$ is the braking force, and $R$ is the rolling resistance. $R_g$ is the resistance caused by the slope, with $R_g = g \cdot \sin\theta$, where $\theta$ is the gradient of the slope and $g$ is the gravitational acceleration. $\Delta t$ is the equal time span corresponding to each subsegment. There is always a certain amount of coasting time for the conversion from traction to braking, and vice versa. Consequently, the equal time subsegment size depends on the least regime-conversion time.
Thus, the sum of the running distances $S_i$ over all equal time subsegments is equal to the total trip length $L$:

$$\sum_{i=1}^{N} S_i = L. \qquad (4)$$

The exit speed of $SE_i$ (expressed as $v_{i,\mathrm{exit}}$) is equal to the inlet speed of $SE_{i+1}$ (expressed as $v_{i+1,\mathrm{inlet}}$):

$$v_{i,\mathrm{exit}} = v_{i+1,\mathrm{inlet}}. \qquad (5)$$

The speed in any subsegment should not exceed the speed limit $v_{\max,i}$ required by $SE_i$:

$$v(t) \le v_{\max,i} \quad \text{for } t \text{ in } SE_i. \qquad (6)$$

The energy consumption of trains is controlled by the regimes and the line conditions (the gradient, slope length, speed limit, etc.). Within a subsegment, the gradient, slope length, and speed limit are constants, so the regime is the only decision variable in the energy consumption function. That is to say, within a subsegment, the selection of regimes does not have to deal with changing line conditions. Therefore, with these subsegments, the equal time-division pattern offers the opportunity to determine the driving strategy for a variety of line conditions, i.e., the advantage of the pattern is that the ODSD model can adapt to changes in line conditions. Furthermore, the pattern is provided to the ACSd algorithm for regime selection: select exactly one regime in each subsegment and join the selected regimes sequentially into a combination of regimes, i.e., a regime strategy (or a solution).
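The discretization in equations (2) to (5) can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function name `step_subsegment`, the constant traction and braking force levels, and the Davis-style resistance coefficients are placeholders introduced here, and forces are expressed in newtons rather than per unit mass.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def step_subsegment(v_in, regime, dt, mass, theta,
                    f_trac=300e3, f_brake=250e3,
                    davis=(2000.0, 30.0, 6.0)):
    """Advance one equal time subsegment SE_i under a single regime.

    regime: 1 = maximum traction, 0 = coasting, -1 = maximum braking.
    theta:  slope angle in radians (positive uphill).
    The force levels and resistance coefficients are placeholder assumptions.
    Returns (exit speed, running distance S_i, traction energy E_i).
    """
    a0, a1, a2 = davis
    resistance = a0 + a1 * v_in + a2 * v_in ** 2   # running resistance, N
    grade = mass * G * math.sin(theta)             # slope resistance, N
    traction = f_trac if regime == 1 else 0.0
    braking = f_brake if regime == -1 else 0.0
    a = (traction - braking - resistance - grade) / mass

    v_out = v_in + a * dt
    if v_out < 0.0:
        # The train would stop inside the subsegment: clamp the speed
        # and use the stopping distance instead of equation (3).
        v_out = 0.0
        s = v_in ** 2 / (-2.0 * a)
    else:
        s = v_in * dt + 0.5 * a * dt ** 2          # equation (3)

    e = traction * s   # only positive traction does work, as in u^+(t)
    return v_out, s, e
```

Chaining calls so that each returned exit speed becomes the next inlet speed mirrors equation (5), and summing the returned energies mirrors equation (2).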

Solving the ODSD Model with ACSd
In this section, we propose an improved ant colony system for the resolution of the ODSD model, which mainly involves the regime representation, the regime choice rule, and the pheromone update rule.

The Regime Representation.
The train operation from one station to the next involves three stages.
Stage 1. Starting acceleration: the traction regime is always adopted, and in particular, the maximum traction force should be used to speed up in a short time.
Stage 2. Energy-efficient driving stage: a combination of maximum traction and coasting is usually adopted at this stage.
Stage 3. Parking brake stage: coasting and maximum braking are adopted successively, so that the speed is zero when the train reaches the station center.
Among the three stages, the choice of regimes at Stage 1 and Stage 3 is clear. As shown in Figure 1, Stage 1 takes the traction regime, and Stage 3 takes the coasting regime and then the braking regime, which are usually certain. There is no need for regime decisions at these two stages. By contrast, regime decisions mainly happen at Stage 2, where three regimes are available for decision making (see Figure 2). According to optimal train control theory, the optimal driving regime consists of maximum acceleration, cruising, coasting, and maximum braking. However, for urban rail trains with short travel distances (generally less than 5,000 m), the cruising regime is generally not included [51]. Besides, according to the study in [31], driving strategies with and without the cruising regime each have advantages over the other under different line environments and train loads. Therefore, to reduce the complexity of train operation optimization, we consider only maximum traction, coasting, and maximum braking in this regime decision and ignore cruising.
Finding an optimal regime sequence is the aim of solving the ODSD model for minimizing energy consumption. A regime sequence is an ordered combination of regimes, and it is regarded here as the route of an ant when we use the ACSd algorithm to solve the model. Let $r_i \in I$ represent one of the three driving regimes of $SE_i$, where the set $I$ includes three regimes: maximum traction, coasting, and maximum braking, denoted as 1, 0, and −1, respectively.
In Figure 3, the edge $R_{i,u}$ ($i = 1, 2, \ldots, N$; $u = 1, 2, 3$) represents the regime $r_i$ of $SE_i$. There are three edges in each subsegment ($SE_i$) of the pattern, and they represent the three possible regimes in a subsegment. When an ant selects one of the three edges, a driving regime is selected, and the regimes selected successively in each subsegment construct a route.
The decision-making process is embedded in the ACSd-based framework, where the driving regimes are selected in each subsegment. A route is composed of a sequence of edges from the start to the end. In other words, a route, as a combination of the regimes on each $SE_i$, is a feasible solution to the energy minimization problem. A route is constructed by an ant passing through each $SE_i$.
The candidate set $allowed_i$ ($allowed_i \subseteq I$) is a component of the ACSd algorithm, which stores the regimes that can be selected in subsegment $SE_i$ as follows:

$$allowed_i = \begin{cases} \{1, 0\}, & r_{i-1} = 1, \\ \{1, 0, -1\}, & r_{i-1} = 0, \\ \{0, -1\}, & r_{i-1} = -1. \end{cases} \qquad (7)$$

$allowed_i$ accounts for the continuity of the regime in adjacent subsegments: there is always a coasting regime in between traction and braking for their conversion.
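The continuity requirement can be expressed as a small lookup. The sketch below assumes the reading that a direct switch between traction and braking is excluded in either direction; the names `allowed_regimes` and `is_continuous` are introduced here for illustration only.

```python
TRACTION, COASTING, BRAKING = 1, 0, -1


def allowed_regimes(prev_regime):
    """Candidate set for SE_i given the regime chosen in SE_{i-1}.

    Assumption: traction and braking may never be adjacent, so a
    coasting subsegment always separates them.
    """
    if prev_regime == TRACTION:
        return [TRACTION, COASTING]
    if prev_regime == BRAKING:
        return [COASTING, BRAKING]
    return [TRACTION, COASTING, BRAKING]


def is_continuous(regimes):
    """Check that every regime is allowed after its predecessor."""
    return all(cur in allowed_regimes(prev)
               for prev, cur in zip(regimes, regimes[1:]))
```

For instance, the sequence traction, coasting, braking passes the check, while traction followed directly by braking does not.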

The Regime Choice Rule in Equal Time Subsegments.
In the process of solution construction, an ant selects a regime in each equal time subsegment ($SE_i$) probabilistically, i.e., the state transition rule of ACS is used when selecting regimes. The rule is based on the heuristic information and the pheromone level of the regime in the route graph.
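In the ACS, an ant uses the transition rule (random proportional rule) to make probabilistic choices for its next node. Similarly, in the ODSD model, an ant iteratively chooses one regime $r_i$ (i.e., the edge $u$ of the ACS algorithm) in $allowed_i$ ($i = 1, 2, \ldots, N$) according to the following formula:

$$r_i = \begin{cases} \arg\max_{u \in allowed_i} \{\tau_{iu}\,\eta_{iu}^{\beta}\}, & q \le q_0, \\ U, & \text{otherwise}, \end{cases} \qquad (8)$$

where $q$ is a random number uniformly distributed in $[0, 1]$, $q_0$ ($0 \le q_0 \le 1$) is a parameter, and $U$ is a selected regime produced by roulette wheel selection for ant $k$ according to the following:

$$p^{k}_{iu} = \frac{\tau_{iu}\,\eta_{iu}^{\beta}}{\sum_{l \in allowed_i} \tau_{il}\,\eta_{il}^{\beta}}, \quad u \in allowed_i, \qquad (9)$$

where $\tau_{iu}$ is the accumulated pheromone (i.e., the level of pheromone deposited) on edge $R^{k}_{i,u}$; $\eta_{iu}$ is the heuristic information of edge $R^{k}_{i,u}$; and $\beta$ is the parameter controlling the relative importance of pheromone $\tau_{iu}$ versus heuristic information $\eta_{iu}$ ($\beta > 0$). $\tau_{iu}$ and $\eta_{iu}$ are the key to solving the ODSD model, and they are discussed in the following subsections.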
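A minimal Python sketch of this choice rule follows, written in the form of the standard ACS state transition rule: exploit the best-scoring regime with probability `q0`, otherwise draw by roulette wheel. The function name `choose_regime`, the dictionary layout, and the default values of `beta` and `q0` are assumptions made for illustration.

```python
import random


def choose_regime(allowed, tau, eta, beta=2.0, q0=0.9, rng=random):
    """Pick one regime for the current subsegment.

    allowed: list of candidate regimes (e.g. [1, 0]).
    tau, eta: dicts mapping regime -> pheromone / heuristic value
              for the current subsegment.
    With probability q0 the best-scoring regime is taken (exploitation);
    otherwise a regime is drawn by roulette wheel selection.
    """
    scores = {u: tau[u] * eta[u] ** beta for u in allowed}
    if rng.random() <= q0:
        return max(scores, key=scores.get)

    # Roulette wheel: probability proportional to tau * eta^beta.
    pick = rng.random() * sum(scores.values())
    acc = 0.0
    for u in allowed:
        acc += scores[u]
        if pick <= acc:
            return u
    return allowed[-1]  # guard against floating-point round-off
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when comparing parameter settings.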

Heuristic Information. Heuristic information on the edge $R^{k}_{i,u}$ helps ants decide which regime to select. Reducing the energy consumption and keeping an appropriate speed are the main considerations of the heuristic information for the ODSD model. Thus, we introduce two heuristic factors $\eta_1$ and $\eta_2$ to express the heuristic information for edge selection.

Energy-Efficient Heuristic Factor. The energy-efficient heuristic factor $\eta_1$ reflects the heuristic and guiding effect on the energy consumption when selecting a regime in $SE_i$ ($i = 1, 2, \ldots, N$). We first considered using $1/E_{ir}$ as the heuristic factor. However, it would malfunction because the energy consumption of coasting is zero. Thus, a modified form is used in equation (10), where $\lambda$ is a regulator for alleviating the excessive influence of some regimes on the regime decision, and $E_{ir}$ with $r = 1, 0, -1$ represents the energy consumption of using traction, coasting, and braking in $SE_i$, respectively.

Speed Heuristic Factor.
Although the energy-efficient heuristic factor $\eta_1$ is the main heuristic factor for reducing the energy consumption, the train speed will fall too low if this factor exerts excessive impact. This is because low consumption requires coasting, which brings about low-speed operation and is prone to violating the timing requirements.
To overcome this problem, the reference speed of the slope is introduced into the heuristic information as another factor ($\eta_2$) to guide the decision of regimes. The reference speed $v_q$ of slope $q$ ($q = 1, 2, \ldots, Q$) is a rough speed that serves as a reference baseline for the running speed (different slopes have different reference speeds).
The reference speed $v_q$ is the ratio of the slope length $L_q$ to the reference running time $t_{\mathrm{ref},q}$ ($q = 1, 2, \ldots, Q$) on slope $q$:

$$v_q = \frac{L_q}{t_{\mathrm{ref},q}}, \qquad (11)$$

where $t_{\mathrm{ref},q}$ is the running time on slope $q$ of the current best solution (i.e., the current best regime combination).
To provide a guiding effect on the running speed and avoid overusing coasting, $v_q$ is introduced into the heuristic factor $\eta_2$:

$$\eta_2 = \frac{1}{|v_{ir} - v_q| + \varepsilon}, \qquad (12)$$

where $v_q$ is the reference speed on slope $q$, which can be calculated by equation (11), and $v_{ir}$ is the speed in $SE_i$ under regime $r$. $|v_{ir} - v_q|$ reflects the difference between the current speed and the reference speed. The closer $v_{ir}$ is to $v_q$, the smaller the denominator and the larger $\eta_2$, which indicates a higher probability of regime $r$ being selected. A small number $\varepsilon$ is added to the denominator (for example, $\varepsilon = 0.001$) so that the denominator is never zero. In other words, the smaller the deviation, the greater the visibility of the regime for an ant, and the introduction of the reference speed $v_q$ provides a tradeoff between the timing requirement and energy-efficient consumption.
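The behavior of the speed heuristic factor can be checked numerically. The sketch below assumes the reciprocal form suggested by the description, with $\varepsilon$ added to the absolute speed deviation in the denominator; the function name `speed_heuristic` is introduced here.

```python
def speed_heuristic(v_ir, v_q, eps=1e-3):
    """Speed heuristic factor: larger when v_ir is closer to v_q.

    Assumed form: 1 / (|v_ir - v_q| + eps), where eps keeps the
    denominator away from zero when the speeds coincide.
    """
    return 1.0 / (abs(v_ir - v_q) + eps)


if __name__ == "__main__":
    v_q = 20.0  # hypothetical reference speed on the current slope, m/s
    for v in (12.0, 18.0, 20.0):
        print(v, speed_heuristic(v, v_q))
```

The factor grows as the candidate speed approaches the reference speed and is capped at $1/\varepsilon$ when the two coincide, so a very small $\varepsilon$ lets an exact match dominate the choice.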
With the above process, we can obtain $\eta_2$. At the first iteration, however, no solution has been constructed yet, so $t_{\mathrm{ref},q}$ is not available. For this case, $t_{\mathrm{ref},q}$ is estimated by equation (13). Suppose that $t_{\mathrm{ref},q}$ at the first iteration is directly proportional to the time cost on slope $q$ at the allowed speed $v_{\max,q}$:

$$t_{\mathrm{ref},q} \propto \frac{L_q}{v_{\max,q}}, \qquad (13)$$

where $L_q$ is the length of slope $q$ ($q = 1, 2, \ldots, Q$). To solve for $t_{\mathrm{ref},q}, t_{\mathrm{ref},q+1}, \ldots, t_{\mathrm{ref},Q}$ in equation (13), we add the following formula. Let $SE_p$ be the last subsegment on slope $q - 1$, and hence

$$\sum_{j=q}^{Q} t_{\mathrm{ref},j} = T - \sum_{i=1}^{p} t_i, \qquad (14)$$

where $\sum_{i=1}^{p} t_i$ is the time that the train has run on reaching the end of $SE_p$. With equations (13) and (14), we can also obtain $v_q$ at the first iteration by equation (11).
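One way to read this first-iteration estimate is as a proportional split of the remaining running time over the remaining slopes. The sketch below follows that reading, which is an interpretation rather than a confirmed formula; the function names and the two-slope example values are hypothetical.

```python
def initial_reference_times(lengths, v_max, remaining_time):
    """Split the remaining running time over the remaining slopes.

    Each slope receives a share proportional to its traversal time at
    the allowed speed (L_q / v_max_q), and the shares sum to the
    remaining time T minus the time already run.
    """
    free_run = [length / v for length, v in zip(lengths, v_max)]
    scale = remaining_time / sum(free_run)
    return [scale * t for t in free_run]


def reference_speeds(lengths, t_ref):
    """Reference speed v_q = L_q / t_ref_q for each slope."""
    return [length / t for length, t in zip(lengths, t_ref)]


if __name__ == "__main__":
    lengths = [1000.0, 2000.0]   # slope lengths in m (hypothetical)
    limits = [20.0, 20.0]        # allowed speeds in m/s (hypothetical)
    t_ref = initial_reference_times(lengths, limits, remaining_time=300.0)
    print(t_ref, reference_speeds(lengths, t_ref))
```

Because the remaining time is usually longer than the time needed at the allowed speed, the resulting reference speeds fall below the speed limits, which leaves room for coasting.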
The energy-efficient heuristic factor $\eta_1$ and the speed heuristic factor $\eta_2$ provide heuristic information from two different views for an ant to choose a regime. To combine their effects, let $\eta_{ir}$ be the heuristic function built from equations (10) and (12):

$$\eta_{ir} = \eta_1 \times \eta_2. \qquad (15)$$

3.4. The Pheromone Update Rule. The local pheromone updating rule of the ACS algorithm can usually avoid the accumulation of pheromone on an edge, so that ants can find other new edges, whereas the global pheromone updating rule causes the edges of the best route to be selected with high probability. However, heuristic information carried by the difference edges between the best routes of two adjacent iterations may be missed. Therefore, based on the global pheromone updating rule, we propose the difference edge strategy.

The Local Pheromone Updating Rule.
The local pheromone updating rule means that whenever an ant moves from one subsegment to another, the pheromone on the edge it has just traversed is partly evaporated, i.e., the pheromone of the edge (regime $r_i$ in $SE_i$) is calculated by equation (16). This local adjustment and evaporation of pheromone can reduce the accumulation of pheromone on an edge, so that ants can find other new edges:

$$\tau_{iu} \leftarrow (1 - \xi)\,\tau_{iu} + \xi\,\tau_0, \qquad (16)$$

where $\xi \in [0, 1]$ is the evaporation rate of the local pheromone and $\tau_0$ is the initial pheromone level on the route.

The Global Pheromone Updating Rule. After all ants have reached the end at iteration $s$ ($s = 1, 2, \ldots, ite_{\max}$), the global pheromone updating rule updates the pheromone of the best route of the current iteration (called the iteration-best route), and the pheromone of each edge is updated by the following equations:

$$\tau_{iu} \leftarrow (1 - \rho)\,\tau_{iu} + \rho\,\Delta\tau^{best}_{iu}, \qquad (17)$$

$$\Delta\tau^{best}_{iu} = \begin{cases} 1 / E_{IRoute_s}, & \text{if edge } R_{i,u} \in IRoute_s, \\ 0, & \text{otherwise}. \end{cases} \qquad (18)$$

In equation (17), $\rho \in [0, 1]$ is the evaporation rate of the global pheromone and $\Delta\tau^{best}_{iu}$ is the pheromone increment on the iteration-best edge. In equation (18), $IRoute_s$ represents the iteration-best route at iteration $s$ and $E_{IRoute_s}$ is the minimum energy consumption under the regimes corresponding to the iteration-best route.
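The two updating rules can be sketched together in Python. The forms below are those of the standard ACS algorithm, and the increment `1 / best_energy` is an assumption about how the iteration-best energy enters the update; the edge keys `(i, u)` and the parameter defaults are likewise illustrative.

```python
def local_update(tau, edge, xi=0.1, tau0=0.01):
    """Local rule: pull the pheromone of a just-used edge toward tau0."""
    tau[edge] = (1.0 - xi) * tau[edge] + xi * tau0


def global_update(tau, best_route, best_energy, rho=0.1):
    """Global rule: reinforce only the edges of the iteration-best route.

    best_route is a list of (subsegment index, regime) pairs.
    The deposit 1 / best_energy is an assumed form: lower energy
    consumption yields a larger increment.
    """
    delta = 1.0 / best_energy
    for edge in best_route:
        tau[edge] = (1.0 - rho) * tau[edge] + rho * delta


if __name__ == "__main__":
    tau = {(1, 1): 1.0, (1, 0): 1.0}
    local_update(tau, (1, 1))
    global_update(tau, [(1, 1)], best_energy=2.0)
    print(tau)
```

Edges that are not on the iteration-best route are left untouched by the global rule in this sketch, which matches the ACS convention of updating only the best route.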

Pheromone Updating Strategy on Difference Edges.
As the complexity of optimization problems increases, the original ACS algorithm should be revised to avoid premature convergence and improve the solution quality.
During the pheromone update, the pheromone of the original ACS algorithm is deposited on the best route [39]. As the iterations proceed, more pheromone accumulates on the best route, which can lead to search stagnation. However, the difference between two successive iteration-best solutions may suggest a new heuristic [52]. Hence, we propose a new pheromone updating strategy for the ACS algorithm and introduce it into the ODSD model to improve the computing performance for the optimal driving strategy.
During the iterations, the iteration-best route keeps changing. When the two iteration-best routes at iteration $s$ and iteration $s - 1$ are compared, the edges in which they differ are defined as difference edges. The difference edges come from the comparison between the current iteration-best route (represented by $IRoute_s$) and the preceding iteration-best route (represented by $IRoute_{s-1}$). To exploit this difference information, we add pheromone reinforcement to these difference edges in addition to the pheromone changes in equations (16) and (17).
For example, suppose that the iteration-best routes $IRoute_{s-1}$ and $IRoute_s$ are given. We add the difference edges between them to the set $S_{Dif}$, where $R_{A-B}$ represents the edge linking node $A$ to node $B$. Thus, the new ACS (ACS with difference edges, i.e., ACSd) performs the additional pheromone updating by adding the extra pheromone $\Delta\tau^{Nbest}_{iu}$ to each edge in $S_{Dif}$. The difference edge strategy is illustrated in Figure 4. We assume that the initial pheromone on each edge is zero and that the pheromone increment on an edge is 1 once an ant passes through it.
Figure 4(a) shows the iteration-best route of iteration $s - 1$, and according to the global pheromone updating rule (see equation (17)), the pheromone amount increases to 1 on each edge. The red route in Figure 4(b) is the iteration-best route of iteration $s$. The two routes share many edges. This is because, under the induction of pheromone aggregation, ants always prefer to pass through edges with high pheromone density; thus, the ants walk over some edges again, which results in many overlapping edges between the two routes. However, some edges do not overlap, and these nonoverlapping edges are precisely the difference edges. Thus, the pheromone increases to 2 on the shared edges, whereas it increases to 1 on the difference edges. The pheromone on the difference edges is therefore lower than that on the shared edges.
In fact, both routes are iteration-best routes, so they are heuristic for further search, and the lower pheromone concentration on difference edges may reduce the heuristic effect. Therefore, as shown in Figure 4(c), pheromone is added to the difference edges. Adding pheromone to the difference edges may intensify the heuristic effect, as the additional pheromone can provide guidance for exploiting the search space. This may help to avoid premature convergence and provide better solutions compared with the original ACS algorithm.
The new pheromone updating strategy for difference edges differs from that of ACS. For the sake of distinction, we call the algorithm ACSd. The experiment in Section 4 shows that ACSd has a significant effect and performs better than ACS.
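The difference edge strategy can be sketched as a set operation on two routes. The sketch below makes two assumptions that the text does not fix: the difference edges are taken as the edges on the current iteration-best route that are absent from the preceding one, and the extra pheromone is simply added to the existing level. The function names and the value of `delta_nbest` are hypothetical.

```python
def difference_edges(current_best, previous_best):
    """Edges (i, u) on the current iteration-best route that do not
    appear on the preceding iteration-best route (assumed reading)."""
    return set(current_best) - set(previous_best)


def reinforce_difference_edges(tau, current_best, previous_best, delta_nbest):
    """Add extra pheromone to every difference edge and return the set."""
    s_dif = difference_edges(current_best, previous_best)
    for edge in s_dif:
        tau[edge] += delta_nbest   # additive reinforcement (assumption)
    return s_dif


if __name__ == "__main__":
    tau = {(i, u): 1.0 for i in range(1, 5) for u in (1, 0, -1)}
    previous = [(1, 1), (2, 1), (3, 0), (4, -1)]
    current = [(1, 1), (2, 0), (3, 0), (4, -1)]
    print(reinforce_difference_edges(tau, current, previous, delta_nbest=0.5))
```

In this toy case the two routes differ only in subsegment 2, so a single edge receives the additional pheromone while the shared edges keep the level set by the global rule.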

Constraint Violation and Repair.
There are some constraints in the problem of the optimal energy-efficient train driving strategy, such as speed limits and reaching the next station at the stipulated time.
The penalty method may be the most common technique for solving constrained optimization problems. However, a large penalty value would result in premature convergence, while a small penalty would produce many infeasible solutions [53]. Hence, we adopt a repair procedure for constraints in place of the penalty method. This procedure attempts to fix infeasible solutions by considering the ODSD model itself.

Speed Constraints.
During the route construction, at some positions such as steep downgrades, the speed of the train would reach the limit. To reduce the speed, the traction regimes of some subsegments should be replaced with coasting or braking, and the replacement is conducted in a trial-and-error manner. For example, assuming the speed violation happens in $SE_i$, replace the traction regime in the preceding subsegment $SE_{i-1}$ with the coasting regime and then recalculate the speed in $SE_i$. Then, check whether the speed limit is still exceeded; if so, change the regime to braking.
If the speed still exceeds the speed limit, the above process is repeated one subsegment further back, i.e., in $SE_{i-2}$.
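The backward trial-and-error repair can be sketched with a toy speed model. Everything numeric here is an assumption: the per-subsegment speed changes in `DV` stand in for the real train dynamics, and the sketch ignores the continuity requirement between adjacent regimes for brevity.

```python
DV = {1: 2.0, 0: -0.5, -1: -3.0}  # toy speed change per subsegment (assumption)


def speeds(regimes, v0=0.0):
    """Exit speed of every subsegment under the toy model."""
    v, out = v0, []
    for r in regimes:
        v = max(v + DV[r], 0.0)
        out.append(v)
    return out


def repair_speed(regimes, i, v_limit):
    """Repair a speed violation at index i by replacing regimes backwards.

    Starting from the subsegment before i, try coasting first and then
    braking; if the limit is still exceeded, move one subsegment back.
    """
    regimes = list(regimes)
    j = i - 1
    while j >= 0 and speeds(regimes)[i] > v_limit:
        for replacement in (0, -1):          # coasting, then braking
            if regimes[j] > replacement:     # only ever reduce the effort
                regimes[j] = replacement
            if speeds(regimes)[i] <= v_limit:
                return regimes
        j -= 1
    return regimes
```

With four traction subsegments and a limit of 6 at the last one, a single switch to coasting in the third subsegment is enough; with a limit of 5, the third subsegment has to brake.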

Distance Constraints.
ACSd selects a driving regime in each subsegment $SE_i$ ($i = 1, 2, \ldots, N$). The subsegments have equal duration, but their lengths $S_i$ are not equal. $S_i$ varies with the regime of $SE_i$, and $\sum_{i=1}^{N} S_i$ is the total length of all subsegments, which is theoretically equal to the total trip distance $L$. Equation (22) represents this relationship within the allowable error $\delta$:

$$\left| L - \sum_{i=1}^{N} S_i \right| \le \delta. \qquad (22)$$

Since the length of each subsegment varies with the regime, we change the regimes used in some subsegments to reduce or increase the subsegment length when equation (22) is not satisfied.
According to the Davis equation [54], for a train in traction, energy consumption increases with speed on an uphill slope. Thus, to reduce the energy consumption, we modify the regimes in the subsegments that produce the maximum speed.
Accordingly, if $\sum_{i=1}^{N} S_i > L$, the traction regimes of some subsegments should be changed to the coasting regime to reduce $\sum_{i=1}^{N} S_i$; if $\sum_{i=1}^{N} S_i < L$, some selected coasting regimes should be replaced with traction regimes to increase $\sum_{i=1}^{N} S_i$. The case of $\sum_{i=1}^{N} S_i > L$ is used as an example to explain the process. In Figure 5(a), $SE_5$ is a switching subsegment between two regimes (i.e., traction is used before point $B$ and coasting after point $B$). Moreover, $SE_5$ is the subsegment immediately preceding point $B$, with the highest speed in the uphill process. We replace the traction in $SE_5$ with coasting, recalculate the sum of the running distances, and check whether it meets (22). While these steps are repeated, a subsegment may be too long to meet a small $\delta$. In this case, the subsegment is further divided into smaller subsegments (see Figure 5(b)). Traction is then replaced with coasting in these smaller subsegments one by one until the results meet constraint (22).
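The overrun case can be sketched with the same kind of toy model. The speed changes in `DV`, the function names, and the example values are assumptions; the sketch covers only the case where the total distance exceeds $L$ and does not implement the finer subdivision of a subsegment.

```python
DV = {1: 2.0, 0: -0.5, -1: -3.0}   # toy speed change per subsegment (assumption)


def simulate(regimes, dt=1.0, v0=0.0):
    """Return (exit speeds, total running distance) under the toy model."""
    v, total, exits = v0, 0.0, []
    for r in regimes:
        v_next = max(v + DV[r] * dt, 0.0)
        total += 0.5 * (v + v_next) * dt   # average speed times duration
        exits.append(v_next)
        v = v_next
    return exits, total


def repair_overrun(regimes, trip_length, delta):
    """While the train overruns L by more than delta, switch the traction
    subsegment with the highest exit speed to coasting."""
    regimes = list(regimes)
    while True:
        exits, total = simulate(regimes)
        if total - trip_length <= delta:
            return regimes
        traction = [i for i, r in enumerate(regimes) if r == 1]
        if not traction:
            return regimes                 # nothing left to replace
        fastest = max(traction, key=lambda i: exits[i])
        regimes[fastest] = 0
```

In a toy run with six subsegments and a 20 m trip, one replacement turns a 3.5 m overrun into a 5 m shortfall. This is the situation in which the text divides the switching subsegment into smaller pieces so that the correction can be applied more finely.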

Te Algorithm Framework for the ODSD Model.
Algorithm 1 lists the process of constructing a feasible solution, which is a route of an ant from the start to the end. A route is composed of N regimes (edges). Ant k at iteration s selects a regime (edge) in $SE_i$ using the state transition rule. To apply the rule, the heuristic information, including the energy-efficient factor and the speed factor, is calculated in advance. In the process, the regime is replaced backwards when the speed exceeds the limit, and the distance constraint violation is checked when the ant reaches the terminal.
Algorithm 2 shows the process of ACSd in solving the ODSD model. The step of constructing a feasible solution is a necessary component. The difference edges are identified by comparing the two adjacent iteration-best routes.
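A small sketch of how the difference edges might be identified for the ODSD route representation. It assumes that a route is stored as a list of regimes, that an edge $R_{i,u}$ is the pair (subsegment index, regime), and that the difference edges are the edges of the current iteration-best route that were absent from the previous one; the function name is hypothetical.

```python
def difference_edges(current_route, previous_route):
    """Return the edges (i, u) of the current iteration-best route
    that do not appear in the previous iteration-best route."""
    current = {(i, u) for i, u in enumerate(current_route, start=1)}
    previous = {(i, u) for i, u in enumerate(previous_route, start=1)}
    return current - previous
```

For example, if only the regime of the second subsegment changes from traction to coasting between two iterations, the set contains the single edge `(2, 0)`.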

Experiments on TSP with the ACSd Algorithm
We add the difference edge strategy to update the global pheromone in Section 3, which is a new operation on the basic structure; therefore, it should be contrasted with the basic ACS to examine the effect of the difference edge strategy.

Procedure of the ACSd Algorithm.
ALGORITHM 1: The process of constructing a feasible solution.
(3) Calculate $v_{ir}$, $S_i$, and $E_{ir}$ in $SE_i$;
(4) Calculate the energy-efficient heuristic factor $\eta_1$;
(5) if $\sum_{j=1}^{i} S_j > \sum_{j=1}^{q} L_j$ then q++; //come into next slope
(6) Update the $v_q$ of the current slope q by using equation (11);
(7) Calculate the speed heuristic factor $\eta_2$ with $v_q$;
(8) Calculate the heuristic information $\eta_{ir} = \eta_1 \times \eta_2$;
(9) Choose a regime $u_i$ in $SE_i$ by using the transition rule, equation (21);
(10) if $v_{ir} > v_{q,max}$ then repair the speed violation by replacing regimes backwards until $v_{ir} < v_{q,max}$; end if
(11) end for
(12) if $\Delta S = |L - \sum_{i=1}^{N} S_i| > \delta$, then repair the distance constraint violation;
(13) return $Route_s$.

ALGORITHM 2: The process of ACSd in solving the ODSD model.
(1) Input: $t_i$, $T$, $L$, $L_q$, $v_{q,max}$, $v_q$, $I$;
(2) Initialize each $\tau_{iu}$ of edge $R_{i,u}$ in $SE_i$ ($i = 1, 2, \ldots, N$; $u = -1, 0, 1$);
(3) Set $allowed_i$ for each $SE_i$;
(4) Set iteration counter $s = 0$;
(5) From m initial routes, choose the optimal route as $BestRoute_s$
(6) for ($s = 1$, $s < ite_{max}$, s++)
(7) for each ant k do
(8) Construct a feasible solution $Route_F$ using Algorithm 1;
$I\_Route_s \leftarrow Route_F$
(11) end for
(12) get $S_{Dif}$ by comparing $I\_Route_s$ with $I\_Route_{s-1}$;
(13) for each edge $R_{i,u}$ in $I\_Route_s$ do
(14) if edge $R_{i,u} \in S_{Dif}$
(15) $\tau_{iu} = (1 - \rho)$ …

The independent procedure of the ACSd is described as follows (suppose there are n cities to be visited).
Step 2. For m ants, randomly select m cities as starting points.
Step 3. For each ant, select the next city j in $allowed_i$ ($i = 1, 2, \ldots, n$) according to equations (23) and (24), where $allowed_i$ is the set of cities that remain to be visited by the ant positioned on city i. Here, $q_0$ is a parameter ($0 < q_0 < 1$), q is a random number uniformly distributed in [0, 1], and U is a city selected randomly according to the probability distribution given in equation (24).
Step 4. After an ant builds its route, update the local pheromone of its tour with the local updating equation, where $\xi \in [0, 1]$ is the evaporation rate of the local pheromone and $\tau_0$ is the initial pheromone level on the route.
Step 6. Update the pheromone of the iteration-best route when it is found at iteration s, where $I\_Route_s$ represents the edge set of the iteration-best route.
Step 7. Update the pheromone of the difference edges: compare the best route of the current iteration with that of the previous iteration to find the difference edges and then update their pheromone. Here, $S_{Dif}$ is the difference edge set, which comes from the comparison between the two successive iteration-best routes.
Step 8. If the iteration counter is less than $ite_{max}$, go to Step 3; otherwise, output the global-best route (i.e., the final solution) and terminate the run.
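The steps above can be sketched for a symmetric TSP as follows. The state transition and local update use the standard ACS forms from the literature, which are assumed (not confirmed by the text here) to match equations (23) and (24) and the local updating rule. The global update is only a guess at the structure of Steps 6 and 7: the deposit `1 / best_len` and the parameter `extra` (a stand-in for the extra pheromone $\Delta\tau^{Nbest}$ on difference edges) are hypothetical, as are all function names.

```python
import random


def select_next_city(i, allowed, tau, eta, beta, q0, rng=random):
    """Pseudo-random proportional rule of standard ACS (assumed form)."""
    weights = {j: tau[i][j] * eta[i][j] ** beta for j in allowed}
    if rng.random() <= q0:  # exploitation: take the best edge
        return max(weights, key=weights.get)
    cities = list(weights)  # exploration: sample proportionally to weight
    return rng.choices(cities, weights=[weights[j] for j in cities])[0]


def local_update(tau, tour, xi, tau0):
    """Standard ACS local rule: tau <- (1 - xi) * tau + xi * tau0."""
    for a, b in zip(tour, tour[1:] + tour[:1]):
        tau[a][b] = tau[b][a] = (1 - xi) * tau[a][b] + xi * tau0


def tour_edges(tour):
    """Undirected edge set of a closed tour."""
    return {frozenset(edge) for edge in zip(tour, tour[1:] + tour[:1])}


def global_update(tau, best_tour, best_len, prev_best_tour, rho, extra):
    """Reinforce the iteration-best tour; difference edges get `extra`.

    ASSUMPTION: the exact update equations are not available in the
    text, so this deposit scheme is illustrative only.
    """
    s_dif = tour_edges(best_tour) - tour_edges(prev_best_tour)
    for edge in tour_edges(best_tour):
        a, b = tuple(edge)
        deposit = 1.0 / best_len + (extra if edge in s_dif else 0.0)
        tau[a][b] = tau[b][a] = (1 - rho) * tau[a][b] + rho * deposit
    return s_dif
```

Treating edges as unordered pairs keeps the comparison between two iteration-best tours independent of the direction in which each tour is traversed.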

Difference Edge Strategy Improves the Performance of the ACSd Algorithm.
TSP is one of the best-known problems for testing discrete optimization algorithms, and ant colony algorithms are usually tested on it; thus, to compare the performance of the new algorithm ACSd with that of the existing algorithm ACS, we test them on several instances from the TSPLIB standard library (https://comopt.if.uni-heidelberg.de/software/TSPLIB95).
Due to the stochastic nature of evolution, measuring the performance of algorithms is a challenging task. Traditionally, the arithmetic mean is used to measure the average performance of algorithms. Ivkovic et al. [55] demonstrated that the arithmetic mean is inadequate for measuring average performance based on the observed number of function evaluations, and they considered quantiles more suitable for this purpose [56]. Therefore, we replace the average value with the median (i.e., the 0.5 quantile, $Q_{0.5}$) to measure the average performance of the algorithm and use the 0.1 quantile ($Q_{0.1}$) and the 0.9 quantile ($Q_{0.9}$) to measure the peak and bad-case performance of the algorithm.
The parameter settings of ACS and ACSd are the same (see Table 1). We performed 20 independent runs with the iteration number $NC_{max} = 3{,}000$. The results include the maximum value, the median value ($Q_{0.5}$), the 0.1 quantile ($Q_{0.1}$), the 0.9 quantile ($Q_{0.9}$), the minimum value, and the standard deviation (S.t.), which are shown in Table 2. Among them, the best results are highlighted in italic.
In Table 2, Opt. represents the best-known route length, and Err. represents the percentage of error between the minimum length and the best-known route length.
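The statistics reported in Table 2 can be computed from the route lengths of the independent runs along the following lines. This is a sketch with assumptions: the quantile interpolation method, the use of the sample standard deviation, and the formula Err. = 100 × (min − Opt.) / Opt. are not specified in the text, and the function name is hypothetical.

```python
import statistics


def summarize(lengths, best_known):
    """Summary statistics for the route lengths of several runs."""
    # Nine cut points: deciles[0] is Q0.1, deciles[4] is Q0.5 (median),
    # and deciles[8] is Q0.9.
    deciles = statistics.quantiles(lengths, n=10, method="inclusive")
    shortest = min(lengths)
    return {
        "max": max(lengths),
        "Q0.9": deciles[8],
        "Q0.5": deciles[4],
        "Q0.1": deciles[0],
        "min": shortest,
        "std": statistics.stdev(lengths),  # sample standard deviation
        "err_percent": 100.0 * (shortest - best_known) / best_known,
    }
```

Because shorter routes are better, $Q_{0.1}$ reflects the peak performance and $Q_{0.9}$ the bad-case performance of an algorithm.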
It can be clearly seen from this table that ACSd is superior to ACS in all results except for the standard deviation of instance d493 (378.12 > 251.03). Also, from the column Err., we see that the minimum route length of the ACSd algorithm is close to the best-known results, even though no other advanced operations are introduced.
Because the only difference between the two algorithms lies in the difference edges and the other structures of the ACS are unchanged, this indicates that the difference edge strategy plays a significant role in improving the performance of the ACS.

Analyzing Convergence of Difference Edges Strategy.
To analyze the difference edge strategy, Figure 6 shows the convergence curves of the instances in Table 2 for the ACS and the ACSd algorithms. It is worth noting that the vertical axis is the length of the iteration-best route, instead of the best-so-far route. Comparing the two convergence curves from the two algorithms for the same instance, we can see some convergence characteristics of the two algorithms, which may help explain the effect of the difference edges.
For ACS, the iteration-best value fluctuates greatly with the number of iterations. On the whole, its oscillation has almost no downward trend, and the best-so-far solution is only occasionally found. In contrast, for ACSd, the iteration-best value has an obvious downward trend in spite of some fluctuations, which shows that ACSd converges faster than ACS.

Journal of Advanced Transportation

The curve of ACSd is generally lower than that of ACS (i.e., the red curve is below the black curve), indicating that ACSd searches among better solutions throughout the run. Even in the beginning phase of the search, ACSd easily finds better solutions than ACS, except for d493.
Therefore, we can draw the following conclusion: the difference edge strategy does improve the performance of the ACS algorithm. That is to say, with the information carried by the difference edges being strengthened, the difference edge strategy has an additional heuristic effect on the subsequent search, and thus ACSd has better exploration ability and faster convergence speed.
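The distinction between the iteration-best curve plotted in Figure 6 and the more common best-so-far curve can be made concrete with a short sketch: the best-so-far curve is the running minimum of the iteration-best values. The function name is hypothetical.

```python
from itertools import accumulate


def best_so_far(iteration_best):
    """Running minimum of the iteration-best route lengths."""
    return list(accumulate(iteration_best, min))
```

A best-so-far curve is non-increasing by construction, so it would hide the oscillation of ACS described above; plotting the iteration-best values exposes it.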

Parameter Experiments.
The values of the parameters in the ACSd algorithm can lead to different optimal results. To find suitable parameters, the number of ants (m), the weight of the heuristic information relative to the pheromone (β), the evaporation rate of the local pheromone (ξ), and the evaporation rate of the global pheromone (ρ) are tested. Before the experiment, based on a preliminary test, we estimate ρ, ξ ∈ [0, 0.05] and β ∈ [1, 5] as the parameter intervals of the ACSd algorithm, and the number of ants is tested in a general range (m = 10 to 200).
In the experiments, only one parameter is changed in a trial to show the effect of this factor. The representative test cases for the parameter setting are carried out with 20 independent runs on a selected instance (d198), and the experimental results are shown in Figure 7.
From Figure 7, we obtain the appropriate value (presented in Table 1) of each of the four parameters, i.e., the value at which the fitness evaluation value is the smallest.
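The one-parameter-at-a-time procedure can be sketched as below. The callable `run` is a hypothetical stand-in for one ACSd run that returns a route length, and aggregating the 20 runs by their median is an assumption (the text only states that the smallest fitness evaluation value is chosen).

```python
import random
import statistics


def sweep(run, defaults, name, values, runs=20, seed=0):
    """Vary one parameter while holding the others at their defaults.

    `run(params, rng)` must return the fitness (route length) of one run.
    Returns the best value of the parameter and the median per value.
    """
    medians = {}
    for value in values:
        params = dict(defaults, **{name: value})  # copy with one change
        scores = [run(params, random.Random(seed + r)) for r in range(runs)]
        medians[value] = statistics.median(scores)
    best = min(medians, key=medians.get)
    return best, medians
```

Each run receives its own seeded generator, so repeated sweeps are reproducible and the defaults dictionary is never modified.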

Experiment on the ODSD Model with ACSd
The proposed ODSD model and ACSd algorithm are applied to a case from a previous study for validation [57]. Most existing methods or models for the energy-efficient train driving strategy problem are tested on different instances with different scenarios or constraints, so a fair comparison among them is not easy. In our work, we use the same real-world instance as [57] because its results are available.
In this case, the train is required to run a total distance of 20 km in a fixed time of 25 minutes. The slope length (m) and the gradient (‰) of each slope section are shown in Figure 8. The speed limit of the train is 80 km/h, and the turnout speed limit is 45 km/h (the distance from the turnout to the end point is 1,600 m). The train is pulled by an electric locomotive with a traction weight of 3,000 t, and the train conversion braking rate is $\theta_h = 0.3$. According to the requirement of model discretization, the trip time is divided into 500 subsegments, i.e., the time subsegment size is set to 3 seconds in the equal time-division pattern. Additionally, the ACSd algorithm is applied to this instance; its parameters ρ, β, ξ, and m are set according to Table 1, and the number of iterations is set to 1,000.
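As a quick check of the discretization, the subsegment size and the required average speed follow directly from the figures above, and the number of candidate regime sequences (before any speed or distance constraint is applied, with three regimes per subsegment) gives a sense of the search space:

```latex
\Delta t = \frac{T}{N} = \frac{25 \times 60\ \text{s}}{500} = 3\ \text{s}, \qquad
\bar{v} = \frac{L}{T} = \frac{20\ \text{km}}{25/60\ \text{h}} = 48\ \text{km/h}, \qquad
|I|^{N} = 3^{500} \approx 3.6 \times 10^{238}.
```

The required average speed of 48 km/h lies well below the 80 km/h line limit, which leaves room for coasting.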
We aim to decide the driving regime in each time subsegment to obtain the final regime sequence and to calculate the speed-time curve and speed-distance curve of the train operation to visually display the optimization results, which are shown in Figures 9(a) and 9(b), respectively. The energy consumption of this strategy is 604.90 kW·h. Additionally, to check the effect of the iteration number on the results, the maximum number of iterations is set to 300, for which the energy consumption is 609.86 kW·h. The difference between the results of the two iteration settings is not significant, which suggests that convergence slows considerably between iteration 300 and iteration 1,000.
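The size of this difference can be quantified from the two reported values, taking the 300-iteration result as the reference:

```latex
\frac{609.86 - 604.90}{609.86} = \frac{4.96}{609.86} \approx 0.0081 \approx 0.81\%.
```

In other words, the additional 700 iterations reduce the energy consumption by less than 1%.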
For comparison, we consider the time-efficient driving strategy model and the target-speed control model [57]. The former minimizes running time, and the latter is an energy-efficient model considering timing requirements, which allows a train to run within a preset target speed range according to energy-saving principles. Our ODSD model also involves the timing requirements; however, it provides the driving strategy with the lowest energy consumption. From Table 3, compared with the time-efficient model and the target-speed control model, the energy consumption of our method is reduced by 13.5% and 7%, respectively.
In addition to the efficiency brought by the improved algorithm, this result may be owing to the equal time-division pattern: selecting regimes in each subsegment by the ACSd may provide more flexible driving strategies for the ODSD model. Our approach adopts a heuristic process with short time subsegments, which provides more opportunities to find better results from more regime combinations. Also, by dividing the section into small subsegments, it can avoid the influence of slope lengths, gradients, and speed limits on regime selection, and thus it adapts easily to different line conditions.
The maximum iteration number is 300 and 1,000, respectively.

In contrast, the other two models require the train to run within a preset target speed range for energy-efficiency concerns, and the regime changes only when the speed reaches the boundary of the target speed range or the train is close to a change point of slope. Therefore, this conversion is not flexible enough to obtain a good solution.

Conclusions
This paper develops a novel ODSD model for the optimization of train energy consumption. In the ODSD model, the driving regimes can be directly selected and applied to control train operation, so as to avoid the speed tracking error. Besides, with the support of the equal time-division pattern, the model may produce the optimal strategy for a train to adapt to a wider range of railway line conditions. This pattern, which is constructed by discretizing the total running time into equal subsegments, provides a basis for selecting flexible regimes.
In addition, an improved ant colony system with a new pheromone updating rule is proposed and used in the ODSD model, which considers the heuristic information of the difference edges that come from the comparison of two adjacent iteration-best solutions. The comparison experiment between the ACS and the ACSd shows that the ACSd has better performance, which indicates that the difference edge strategy provides more exploration to avoid premature convergence and improves the solution quality.
Furthermore, the ACSd has been embedded with new heuristic information, consisting of an energy-efficient heuristic factor and a speed heuristic factor, to reach a compromise between energy consumption and timing in solving the model. A case study also demonstrates that the proposed model, by improving the flexibility of regime selection, reduces the energy consumption compared with the other methods.
In this paper, we focus on the feasibility of the model itself; therefore, only three traditional regimes are considered. In future work, the cruise regime should be included to make the ODSD model suitable for most main line railways. Besides, we will improve the ODSD model so that it meets more constraints of actual operation and considers factors such as regenerative braking, passenger comfort, and passenger flow change. In terms of the algorithm, the difference edge strategy can be integrated not only into the original ACS algorithm but also into other advanced algorithms.

Variables and Functions
$E$: Total energy consumption
$I$: The set of driving regimes, $I = \{r \mid -1, 0, 1\}$
$i$: Equal time subsegment index, $i = 1, 2, \ldots, N$, where $N$ is the total subsegment number
$ite_{max}$: The maximum iteration number
$L$: Total trip length
$L_q$: Length of slope $q$
$m$: Number of ants
$q$: Slope index, $q = 1, 2, \ldots, Q$
$R_{i,u}$: An edge of a route, which represents the regime $u$ in $SE_i$
$r_i$: Driving regime of the train, $r_i \in \{-1, 0, 1\}$, where −1, 0, and 1 represent maximum braking, coasting, and maximum traction in $SE_i$, respectively
$S_{Dif}$: Set of difference edges
$S_i$: Running distance of $SE_i$
$s$: Iteration counter, $s = 1, 2, \ldots, ite_{max}$
$SE_i$: Equal time subsegment $i$
$T$: Total trip time
$v_{ir}$: Running speed at the end of $SE_i$ with regime $r$
$v_q$: The reference speed of slope $q$
$v_{q,max}$: Speed limit of slope $q$
$v_0$, $v_N$: Starting speed and terminal speed
$x_i$: Position at the end of $SE_i$
$x_0$, $x_N$: Starting position and terminal position of the section
$\Delta S$: The error between the total calculated distance and total trip length
$\Delta\tau^{best}_{iu}$: Pheromone increment on edge $R_{i,u}$ of the iteration-best route
$\Delta\tau^{Nbest}_{iu}$: Extra pheromone added to edge $R_{i,u}$ in $S_{Dif}$
$\beta$: Weighted value of heuristic information to pheromone
$\delta$: Allowable error between the total calculated distance and total trip length
$\eta_{iu}$: Heuristic information of $R_{i,u}$
$\eta_1$: Energy-efficient heuristic factor
$\eta_2$: Speed heuristic factor
$\rho$: Evaporation rate of the global pheromone
$\tau_{iu}$: Pheromone accumulation on $R_{i,u}$

to the supervisor Prof. Kun Miao for his great support and guidance in this project. They also thank the research team for their collaboration and help during gathering data for the research.

Figure 3: The route graph of the ants.

Figure 5: Two-stage correction process for an example when $\sum_{i=1}^{N} S_i > L$ (A-D-B-C is the original speed-time curve, and A-D-H-C is the modified one).

Figure 7: The test of parameter sensitivity.

Figure 8: The slope information of the example.

Table 2: Comparison of the results between ACS and ACSd.

Table 1: The set of the parameters.

Table 3: Comparison of results of different methods.