Evolutionary Optimization of Multirendezvous Impulsive Trajectories

This paper investigates the use of evolutionary algorithms for the optimization of time-constrained impulsive multirendezvous missions. The aim is to find the minimum-ΔV trajectory that allows a chaser spacecraft to perform, in a prescribed mission time, a complete tour of a set of targets, such as space debris or artificial satellites, which move on the same orbital plane at slightly different altitudes. For this purpose, a two-level design approach is pursued. First, an outer-level combinatorial problem is defined, dealing with the simultaneous optimization of the sequence of targets and the rendezvous epochs. The suggested approach is first tested by assuming that all transfer legs last exactly the same amount of time; then, the time domain is discretized over a finer grid, allowing a more appropriate sizing of the time window allocated for each leg. The outer-level problem is solved by an in-house genetic algorithm, which features an effective permutation-preserving solution encoding. A simple, but fairly accurate, heuristic, based on a suboptimal four-impulse analytic solution of the single-target rendezvous problem, is used when solving the combinatorial problem for a fast guess at the transfer cost, given the departure and arrival epochs. The outer-level problem solution is used to define an inner-level NLP problem, concerning the optimization of each body-to-body transfer leg. In this phase, the encounter times are further refined. The inner-level problem is tackled through an in-house multipopulation self-adaptive differential evolution algorithm. Numerical results for case studies including up to 20 targets with different time grids are presented.


Introduction
A multirendezvous (MRR) trajectory concerns the motion of an active spacecraft, the chaser, which executes a sequence of maneuvers to rendezvous with multiple passive targets, such as space debris or artificial satellites in Earth's orbit. MRR trajectories are gradually increasing in popularity within the aerospace community, since missions based on a chaser spacecraft visiting multiple artificial bodies are becoming of practical interest. Typical operational scenarios include Active Debris Removal (ADR) and On-Orbit Servicing (OOS) missions. ADR missions are particularly significant, as they are deemed to be the only effective option to ensure the long-term operation of spacecraft, close to be compromised by the huge amount of Earth orbiting wreckage [1]. On the other hand, servicing missions have recently drawn the attention of several private companies [2]. Indeed, the possibility of extending the life span of already operative satellites by sending servicing spacecraft for refueling, maintenance, and orbital relocation appears to be economically advantageous [3]. In both mission scenarios, in order to make the program economically affordable, any single chaser is expected to visit (i.e., remove or service) as many targets as possible in a limited amount of time, making the best use of the on-board propellant. As a result, a minimum-propellant multirendezvous trajectory has to be designed for the chaser spacecraft.
A number of authors addressed the optimization of ADR trajectories in Low Earth Orbit (LEO) or Sun Synchronous Orbit (SSO). Most of these studies focus on long-term or time-free ADR missions, typically aimed at deorbiting the target debris at a rate of three to ten per year [4,5]. A typical mission strategy consists in exploiting the J 2 orbital perturbation for the alignment of the orbital planes of consecutive targets before starting the rendezvous maneuver in order to reduce the total mission cost [6,7], at the expense of a longer flight time. Conversely, when time-constrained or time-fixed missions are considered, the removal sequence and rendezvous epochs of the ADR mission must be optimized simultaneously, making the problem more complex to solve. For this reason, most of the literature deals with very small debris clusters [8,9], usually composed of less than five elements. Notable exceptions are Ref. [10,11], which consider clouds with dozens or hundreds of debris.
The optimal planning of OOS missions in the geosynchronous orbit was also investigated in several research papers; the same approach for the preliminary design of ADR mission, with a single active chaser aimed at multiple passive targets, can be also found in several works about OOS [12]. However, nonstandard, and even more complex, mission scenarios become more frequent in OOS literature, encompassing the presence of multiple servicing spacecraft [13], cooperative targets [14], a fuel depot [15], constellation reconfiguration maneuvers [16], or complex dynamical models [17]. As for ADR missions, the great part of the studies about OOS deals with small constellations of satellites or time-free missions, in order to reduce the problem complexity.
The targets of an MRR mission are usually selected out of a wide set, looking for favourable body positions (e.g., GTOC9 problem [18]); the expenditure to cancel out the phase error is negligible, and suitable solutions [19,20] can be used to estimate the cost of the time-free orbit transfer. However, most of the proposed approaches become impractical in the presence of long target sequences or tight time constraints, as the ones considered in this work. In the present manuscript, the rendezvous cost is mainly due to the recovery of the phase error.
The single-target time-constrained rendezvous is a wellknown problem in spaceflight mechanics. Several optimization methods have been proposed assuming either finite [21] or impulsive thrust [22,23]. In the latter case, four burns permit the achievement of the optimal solution [24]. The extension of the solution method to a series of consecutive rendezvous is straightforward if the sequence of targets is assigned a priori [25]. However, the combinatorial aspect introduced by allowing permutations of the target sequence changes radically the problem nature and increases drastically its complexity. In this respect, the MRR problem presents several analogies with the Traveling Salesman Problem (TSP) [26], a well-known problem in operational research, whose solution is the shortest tour which allows a salesman to visit once a prescribed set of cities, starting and ending the trip in his home city. For the classic TSP, branch-and-bound-based rapid exact solvers are available [27], which allow solving problem instances involving up to several hundred cities in a brief time. However, while similar, the problem here investigated is more complex than a standard TSP, as the cost associated with traveling between any two targets changes with time due to the orbital dynamics.
With this analogy in mind, several attempts have been made to find the optimal solution of the MRR problem by using algorithms developed in the context of operational research. This is the case of exact solution techniques such as brute-force approaches [28] and branch and bound search [29] which consist of a systematic enumeration of candidate solutions of the combinatorial problem. However, these solution methods are generally viable only in the presence of small sets of targets or in time-free scenarios, due to the curse of dimensionality, which causes the search to run out of time and/or memory very soon as the problem dimension increases [30]. For this reason, heuristic and metaheuristic approaches have rapidly gained popularity, as they usually allow to find good-quality solutions in a limited amount of time, even in the case of large-size problems. Greedy heuristics, such as the Series Method [31], are among the fastest algorithms, but they can provide solutions quite far from the global optimum. Heuristic and metaheuristic treesearch procedures, such as Beam Search [32], Ant Colony Optimization (ACO) [33], or A * search [34], have been widely exploited for both ADR/OOS and asteroid exploration mission planning. A hybridization between Beam Search and ACO was also discussed [35]. These methods have proven capable to achieve high-quality results in a reduced computational time, especially when the aim of the mission is to visit the largest number of targets in a wider set, given some propellant and/or time constraints. Indeed, during the search, they allow pruning all those branches of the search tree that lead to unfeasible solutions. On the other hand, the performance of tree-search algorithms tends to worsen when a complete tour of the targets is searched for, as they repeatedly explore the same branches while looking for the optimal target permutation. Conversely, other metaheuristic methods consist of the iterative refinement of a set of (possibly random) initial solutions and guarantee the tour completeness at any point in the optimization process. For this reason, they are particularly suited to plan MRR missions aimed at visiting all the targets of the chosen set. Prominent examples are Tabu Search (TS) [36], Simulated Annealing (SA) [37,38], and Genetic Algorithms (GA) [39,40] Among the aforementioned optimization methods, the genetic algorithm is the most flexible, as, in principle, it may handle any type of decision variable. GA performs a global optimization and, thanks to its stochastic selection and mutation operators, it has greater chances to evade from local optima than greedy methods. In addition, it may adopt an encoding that guarantees the intrinsic completeness of the tour. GA is a population-based method; hence, its execution time can be significantly reduced by an efficient parallel implementation that is not possible for single-solution methods like TS and SA. All these features make it a solid candidate for the solution of the complete tour problem here studied and motivate its choice.
This paper focuses on the optimization of a timeconstrained multi-target rendezvous mission, assuming an impulsive thrust model for the chaser. The aim is to minimize the overall mission ΔV, while performing a complete tour of a set of targets, which move on circular coplanar orbits with similar altitudes, in a prescribed amount of time. The problem is mathematically formulated as a Mixed-Integer Nonlinear Programming (MINLP) problem, involving the simultaneous optimization of both integer variables (that define the sequence of targets) and real-valued variables (which identify the rendezvous epochs and the spacecraft trajectory from a target to the next one). A two-level design procedure is proposed to simplify the MINLP problem solution, by defining (i) an outer-level combinatorial problem that concerns the determination of the optimal rendezvous sequence and epochs, discretized over a time grid, and (ii) an inner-level NLP problem, which optimizes each bodyto-body transfer leg and is also able to refine the encounter times, while keeping the body sequence that solved the outer-level problem. The task of solving the outer-level problem is entrusted to a genetic algorithm, which encodes the solution as a permutation of the target bodies and uses permutation-preserving crossover and mutation operators. Specifically, two variants of the outer-level problem, with different complexities, are considered in this paper, by assuming either equal or different (in a set of discrete values) time-lengths for the target-to-target transfers. In the latter case, the permutation of the targets is efficiently replaced by a permutation of the available rendezvous epochs. The solution of the outer-level problem is given as an input to the inner-level problem that is tackled through EOS, an in-house multi-population self-adaptive Differential Evolution algorithm. As case studies, different ADR missions are considered, including up to 20 targets in close coplanar and circular orbits.

Problem Statement
Let us consider a set of N prescribed target bodies that move on circular coplanar orbits at slightly different altitudes under a Keplerian dynamical model. For each target body i ∈ ½1, N, orbital radius r ðiÞ and initial anomaly θ ðiÞ ðt 0 Þ are known. The velocity of the i-th target is constant and equal to v ðiÞ = ffiffiffiffiffiffiffiffiffiffi μ/r ðiÞ q , while its angular position at any time t is given by θ ðiÞ ðtÞ = θ ðiÞ ðt 0 Þ + ffiffiffiffiffiffiffiffiffiffiffiffiffiffiffi ffi μ/ðr ðiÞ Þ 3 q ðt − t 0 Þ, where μ is the gravitational parameter of the central body. The corresponding position r ðiÞ ðtÞ and velocity v ðiÞ ðtÞ vectors can be easily derived. At the initial time, the chaser spacecraft is assumed to be on a circular orbit of radius r ð0Þ with anomaly θ ð0Þ ðt 0 Þ = 0, on the same orbital plane as all targets. The problem is thus planar. The goal is to design a minimum-ΔV impulsive trajectory that allows the chaser to rendezvous with all the N bodies within a specified mission time T M .
Let us assume that the body encounter sequence p = fp 1 , p 2 , ⋯, p N g and the corresponding set of (monotonically increasing) encounter times t = ft 1 , t 2 , ⋯, t N g are known, where the integer p j ∈ ½1, N identifies the target met at time t j , and t N ≤ T M . The overall trajectory of the chaser can be decomposed into N target-to-target transfer legs. The ðk + 1Þ -th leg departs from body p k (with p 0 = 0 denoting the initial state of the chaser) at time t k and arrives at body p k+1 at time t k+1 , for k = 0, ⋯, N − 1. The rendezvous condition requires that, at time t k , the position r and velocity v of the chaser are the same as the encountered target: where subscript "k" is used to identify the chaser state at time t k .
With four being the maximum number of impulses required for an optimal time-fixed planar rendezvous [24], each body-to-body transfer leg can be thought as a sequence of three ballistic arcs, named A, B, and C, joined by impulsive maneuvers, located at the departure point (k), at two internal points (with labels k + 1/3 and k + 2/3), and at the arrival point (k + 1), for the ðk + 1Þ-th leg, respectively. A sketch of this transfer leg is shown in Figure 1. A position formulation is here considered; that is, the ðk + 1Þ-th transfer leg is parameterized with respect to radii r k+1/3 and r k+2/3 and anomalies θ k+1/3 and θ k+2/3 of the chaser at the internal maneuvering points. Since the chaser state r, v at the endpoints of any trajectory leg is fixed because of the rendezvous conditions in Equations (1) and (2), the chaser position at any impulse is completely defined. Conversely, the unknown chaser velocity immediately before (superscript "-") and after (superscript "+") the impulsive maneuvers can be found by solving either a geometrical problem or a Lambert problem.

Geometrical Problem.
Consider the arc A between the points k and k + 1/3. As shown in Figure 2, for any given semimajor axis a > a m , two elliptical transfer orbits exist that connect r k with r k+1/3 , where a m = ðr k + r k+1/3 + cÞ/4 is the smallest possible semimajor axis and c = kr k − r k+1/3 k is the chord distance between the two endpoints [41]. An ellipse is known as the fast solution, since it is characterized by a transfer time Δt < t m ; the second ellipse, instead, represents the slow solution, since its transfer time is Δt > t m (Figure 2(a)). As a consequence, the knowledge of the semimajor axis is not sufficient to identify the elliptic arc connecting the two points.
A nondimensional parameter y ∈ ½0, 1 is introduced, which is implicitly defined by the relationship Note that the same value of the semimajor axis a is obtained by using either y or 1 − y. So, by pairing the fast solutions with values y < 0:5 and the slow solutions with values y > 0:5, the elliptic arc connecting the two assigned endpoints is uniquely identified for any choice of y. Figure 2(b) depicts the geometric construction that allows for an unambiguous definition of the transfer ellipse, that is, of its semimajor axis a, eccentricity e, and argument of pericenter ω, when y k A , r k , r k+1/3 , and Δθ k+1/3 are known. More precisely, the chord length and the lengths c 1 = 2a − r k and c 2 = 2a − r k+1/3 are evaluated first. Angle γ follows from 3 International Journal of Aerospace Engineering Hence, , if y > 0:5: The eccentricity can be found as Eventually, the initial true anomaly ν k and final true anomaly ν k+1/3 along the transfer orbit are evaluated by means of the following relations: and ambiguities are easily solved as the angle Δθ k+1/3 = ν k+1/3 − ν k is known.
Velocities at both endpoints (v + k and v − k+1/3 ) follow from the standard equations of the two-body problem. The transfer time Δt A can be also evaluated through the Kepler equa-tion, and consequently, the epoch of the intermediate maneuver t k+1/3 = t k + Δt A can be obtained.
The same procedure can be applied to the arc C, which connects point k + 2/3 to point k + 1. If this geometrical construction is defined as a procedure yArcðÞ, one has as a result The cost of the maneuvers performed at the departure and arrival points of the ðk + 1Þth leg are thus evaluated as 2.2. Multirevolution Lambert Problem. The central arc B, which connects points k + 1/3 and k + 2/3, cannot be dealt with in the same fashion. In fact, for a given choice of the parameters y k A and y k C , the maneuvering epochs t k+1/3 and t k+2/3 and, consequently, the travel time along arc B, Δt B = t k+2/3 − t k+1/3 , are assigned. Conversely, a multirevolution Lambert problem can be formulated, being the position vectors r k+1/3 and r k+2/3 and the travel time Δt B known. This problem admits 1 + 2n max solutions, where n max is the maximum possible number of revolutions: one solution for the 0revolution transfer arc and two solutions, namely, left and right branch, for each n-revolution transfer orbit. Let us introduce an integer parameter L ∈ ½−n max , n max , which indicates the solution corresponding to the jLj-revolution transfer orbit,  International Journal of Aerospace Engineering right branch for L > 0 and left branch otherwise. By knowing L, the velocity vector immediately after the second impulse v + k+1/3 and right before the third one v − k+2/3 can be evaluated according to the algorithm by Izzo [42], that is, Hence, the total cost of the two internal maneuvers is Instead of treating L as an optimization variable, an enumeration approach is used; that is, all the 2n max + 1 possible solutions are computed and the one with the lowest total Δ V is chosen. When considering transfers between close orbits, an educated guess safely restricts the analysis to just three scenarios; that is, the chaser performs the same number of revolution as the target, one more, or one less, corresponding to a total of six solutions (three right and three left branches) to be compared.

MINLP Formulation.
According to the proposed problem formulation, the overall multi-target trajectory can be parameterized by using a set of 8N design variables. Hence, the solution vector is where Eventually, the impulsive MRR trajectory optimization problem can be formulated as where the overall cost of the MRR trajectory is and x L and x U are the lower and upper bounds of the design variables, respectively. This problem involves the simultaneous optimization of both integer variables (defining the encounter sequence) and real-valued variables (radii and anomalies at the internal maneuvers, y parameters, and encounter epochs). For this reason, problem (15)

Two-Level Optimization Approach
The MINLP problem in Equation (15) belongs to the class of NP-hard problems [43]; hence, no deterministic algorithm is able to find the optimal solution in polynomial time. For this reason, a variety of stochastic metaheuristic techniques have been developed over the last decades with the aim of attaining (often suboptimal) good-quality solutions to the problem in a short time. However, as the problem dimension increases, also the computational time required to obtain a nearly optimal solution increases and may become prohibitive.
Instead of solving the problem as a whole, an alternative option is to decompose the MINLP into simpler subproblems, which could be, more or less easily, solved sequentially and whose solutions are eventually recomposed into a, hopefully near-optimal, solution to the original problem. For the problem at hand, a two-level approach can be pursued, by isolating (i) an outer-level combinatorial problem, which concerns the definition of the optimal rendezvous sequence and encounter epochs, discretized over an arbitrarily dense time-grid, while neglecting the details of each body-to-body transfer leg and (ii) an inner-level NLP problem, which deals with the accurate optimization of each body-to-body transfer, on the basis of the body sequence and the discrete encounter epochs just computed; these epochs may be left fixed or further refined at this level.
The two layers are interconnected: the outer level requires a measure of the cost associated with each transfer leg to evaluate the quality of any encounter sequence, even though the actual ΔV of each leg can be evaluated only by solving the NLP problem referred to this transfer, that is, the inner-level problem. On the other hand, the inner level requires the definition of the encounter sequence and rendezvous epochs, which, in turn, are the output of the combinatorial, outer-level problem. In practice, the two problems could be solved sequentially provided that an alternative way, that is, a "heuristic", exists for attaining a reasonable and, possibly, inexpensive estimation of any transfer cost during the solution of the outer-level problem, without the need of solving the complete NLP problem. Once the heuristic has been established, the outer-level combinatorial problem, or tour problem, is isolated and solved first; then, its solution is given as input to the inner-level problem, which returns the fullproblem solution as output.

Cost Estimate for a Single Rendezvous
Leg. This section presents an analytical, suboptimal, four-impulse transfer strategy that gives, in a short computational time, a conservative estimate of the optimal ΔV for a planar, time-fixed, impulsive rendezvous between circular orbits. When the allowed travel time is sufficiently large, this strategy fairly approximates the performance of the optimal solution [44]. Furthermore, with respect to other heuristics adopted in similar studies [45], the proposed approach represents a real, feasible, mission scheme.
Let θ 1,0 and θ 2,0 be the true anomaly at time t = 0 of the chaser and target, which fly on circular orbits of radius r 1 and r 2 , respectively. The two orbits are coplanar and have angular velocities ω 1 and ω 2 , respectively. Assuming that the two orbits are not extremely far apart, the minimum-Δ V solution is represented by a Hohmann transfer, of semimajor axis a 12 = ðr 1 + r 2 Þ/2 and duration T H 12 = π ffiffiffiffiffiffiffiffiffiffi ffi a 3 12 /μ p . The transfer may be preceded by a coasting arc on the departure orbit, which allows the two bodies to achieve the correct relative phasing γ = π − ω 2 T H 12 required by this kind of maneuver. After the phase error Δγ = θ 2,0 − θ 1,0 − γ has been evaluated and normalized between 0 and 2π, the time length T wait of the coast arc can be evaluated as By comparing the maximum available transfer time T max with the sum of the coasting time T wait and the time spent on the Hohmann transfer ellipse T H 12 , it is possible to obtain a condition for the viability of the Hohmann transfer Whenever Equation (18) is satisfied, the rendezvous has the cost ΔV H of the Hohmann transfer. Otherwise, the cost is evaluated according to the mission scheme depicted in Figure 3. The chaser spacecraft is injected into a circular (either internal or external) waiting orbit of radius r 3 with a Hohmann transfer "1-3," of semimajor axis a 13 = ðr 1 + r 3 Þ/2 and duration T H 13 = π ffiffiffiffiffiffiffiffiffiffi a 3 13 /μ p . Once the chaser has reached the correct phasing with respect to the target body, a second Hohmann transfer "3-2," of semimajor axis a 32 = ðr 2 + r 3 Þ/2 and duration T H 32 = π ffiffiffiffiffiffiffiffiffiffi a 3 32 /μ p , is performed to complete the rendezvous.
The problem now reduces to finding the radius r 3 that satisfies the rendezvous condition. The phase angle recovered by the chaser is with Θ > 0 for r 3 < r 1 and vice versa. If the initial phase angle Δθ 0 = θ 2,0 − θ 1,0 is normalized between 0 and 2π, the equality between the chaser and target positions at the end of the maneuver requires where k ext = 1 for an external maneuver and k ext = 0 when r 3 < r 1 . The cheaper solution is selected.
An alternate 3-impulse heuristic outlines the rendezvous maneuver as an n-revolution bielliptic transfer split into two legs, each performing 1/2 or ðn − 1/2Þ revolutions. In other words, the intermediate constant-radius arc of the 4impulse heuristic is replaced by adding ðn − 1Þ complete revolutions to either ellipse. The radius of the intermediate impulse is selected in order to eventually achieve the 6 International Journal of Aerospace Engineering rendezvous with the target body. The angular length of the transfer is constrained at an integer number of revolutions. The time length is lower than the allocated time and the maneuver is completed by adding a coasting either on the initial or the final orbit. The 3-impulse heuristic better resembles the optimal rendezvous maneuver than the 4-impulse heuristic, but other features advice against its use.

Outer-Level Problem.
Under the assumption that all bodies fly on coplanar circular orbits and that the allowed mission time is sufficiently large, the optimal transfer between any pair of bodies is a Hohmann transfer. The cost ΔVði, jÞ of the transfer from body i to body j is independent of the specific departure and arrival epochs. The problem reduces to the search of the body sequence that minimizes the total velocity change. Let p = fp 1 , p 2 , ⋯, p N g ∈ P N be a permutation of the N integers f1, 2, ⋯, Ng, which identify the target bodies. So, the set P N is made up of all the N! possible permutations of N integers. The time-free optimization problem can be written as where p 0 = 0 denotes the chaser at the initial time. In order to speed up the solution algorithm, the transfer cost for any pair of arrival and departure bodies can be preliminarily evaluated and collected in a two-dimensional cost tensor ΔV ij of dimension ðN + 1Þ × N. An attentive reader will notice that this problem closely resembles the classical and well-studied Traveling Salesman Problem (TSP). The solution of the previous problem provides a lower bound on the tour overall cost, but such a limited value may be quite far from the optimal solution of the original problem. In fact, due to the existing constraint on the overall mission duration, waiting for the perfect phasing required by the Hohmann solution may not be a viable option for missions involving many transfers.
In planar time-constrained rendezvous maneuvers, the transfer costs are highly sensitive to the initial phasing, that is, to the departure epoch and to the transfer time-length. The following sections present two formulations, of increasing complexity, of the outer-level combinatorial problem, which is aimed at determining the optimal body sequence and the rendezvous epochs, discretized over a grid in the admissible time window.
3.2.1. Uniform Time-Length Tour. Under the simplifying assumption that the duration of all transfers is exactly the same, a relaxed problem can be formulated, where the transfer cost is only dependent on the departure epoch. In this scenario, the epoch of the k-th rendezvous is readily available as t k = k T M /N. The problem reduces again to finding the optimal permutation p ⋆ ∈ P N of the N targets but, in this case, the cost of the transfer from body i to body j also depends on the position k of the leg within the sequence of transfers. The uniform time-length tour optimization problem can be written as where ΔVði, j, kÞ indicates the cost to move from target i to target j, departing at time t k−1 and arriving at time t k . Again, all transfer costs can be preliminarily evaluated and collected in a three-dimensional cost tensor ΔV ijk with dimension ðN + 1Þ × N × N. By extending the analogy between time-free tour and TSP, this novel combinatorial optimization problem is similar to a time-dependent version of TSP [46,47], named TDTSP-A by the authors in a previous work [38], where the cost of any city-to-city transfer also depends on its position in the whole tour.

Discrete Time-Length
Tour. If the possible rendezvous epochs are discretized over a finer time grid, one obtains a combinatorial optimization problem with a solution closer to that of the original time-continuous problem and a formulation which is still close, to a certain extent, to that of the uniform time-length tour defined in the previous section.
Let τ = fτ 1 , τ 2 , ⋯, τ N t g be a time grid composed of N t = ND equidistant points, such that τ k+1 − τ k = Δτ = T M /N t . Let ΔVði, j, τ h , τ h+m Þ denote the cost to move from body i to body j, departing at time τ h and arriving at time τ h+m . In this case, both the target sequence and the location of the rendezvous epochs over the grid are required for the computation of the tour overall cost. However, the problem can still be formulated using a single permutation Π ∈ P N t of the integers f1, 2, ⋯, N t g, i.e., with a number of elements equal to the discrete number of available rendezvous epochs. In fact, the elements of Π with a value lower or equal to N constitute the target sequence p = fp 1 , p 2 , ⋯, p N g, while their position in Π reveals which elements of τ make up the vector of rendezvous epochs t = ft 1 , t 2 , ⋯, t N g. As a result, the discrete time-length tour optimization problem can be written as An example of the adopted permutation encoding is presented in Figure 4 for N = 5 and D = 3, with the corresponding extraction of target sequence p = f1, 3, 2, 4, 5g and rendezvous epochs t = fτ 2 , τ 5 , τ 7 , τ 11 , τ 15 g.
For the sake of brevity, a tour with N targets and N t = ND possible rendezvous epochs will be named N × D tour in the following. Of course, if D = 1, a uniform time-length tour is obtained. For this reason, D is defined as a time division factor.
As in the other tour problems, all transfer costs can be precomputed to speed up the optimization process. A 4order tensor ΔV ijhm of dimension ðN + 1Þ × N × ND × NðD − 1Þ is needed in principle to store all the computed values. However, because of the monotonic, nonincreasing, behavior of the ΔV with the transfer time, one may decide to limit the calculus to transfers of duration lower than or equal to MΔτ, for an arbitrary M ∈ ½1, NðD − 1ÞÞ, and assume the same cost for longer transfers. In this case, a 4-order tensor with dimension ðN + 1Þ × N × ND × M would be sufficient. In addition to reducing the size of the tensor, this treatment has the additional advantage of guiding the solver towards a trajectory with more uniform travel times, which may be preferable from an operational point of view.
It is advisable to keep the time division factor D sufficiently small, because (i) a rough evaluation of the encounter epochs is sufficient, as the attained epochs can be further refined in the inner-level optimization step; (ii) a large number of divisions make the problem too similar to the original MINLP and hence more difficult to solve; and (iii) the proposed heuristic is reasonably accurate if the chaser has enough time to perform several revolutions around the central body; so, reducing the minimum allowed travel time Δτ makes the heuristic less reliable and may undermine our efforts.

Inner-Level
Problem. This step is aimed at improving the provisional trajectory provided by the solution of the outerlevel problem, based on suboptimal maneuvers and rendezvouses located at the nodes of a discrete time grid. Assuming that the optimal target sequence p ⋆ = fp ⋆ 1 , ⋯, p ⋆ N g and the discrete rendezvous epochs t = f t 1 , ⋯, t N g have been determined, the MINLP problem in Equation (15) reduces to an unconstrained NLP problem, where the design variables are with the continuous rendezvous epochs. Two scenarios can be investigated: (i) time-fixed: the encounter epochs t k are kept fixed at their nominal values t k (so, α k = 0, ∀k), and (ii) time-free: the encounter epochs are free to be optimized. In the first case, each body-tobody transfer can be optimized independently from the others, thus reducing the 6N-variable problem to solving N easier subproblems, each one of just 6 variables. In the second case, the encounter epochs may vary in a neighborhood of the reference values, leading to an improved solution, but the entire trajectory must be optimized as a single 7N-variable problem.
The lower and upper bounds of the design variables in Equation (26) The inner-level problem is tackled in this paper by EOS (Evolutionary Optimization at Sapienza) [48]. EOS is an inhouse optimization code for continuous-variable problems, which implements an improved self-adaptive, partially restarted Differential Evolution (DE) algorithm, with a synchronous island model to handle multiple populations in parallel. EOS was developed in the contest of the Global Trajectory Optimization Competitions [49,50] and has proven to be a flexible and high-performance algorithm, able to successfully cope with a broad range of real-world unconstrained and constrained space trajectory optimization problems [51,52]. The values of the hyperparameters used in this work are reported in Table 1. Interested readers are suggested to refer to Ref. [48] for further details about the algorithm.

Genetic Algorithm
A genetic algorithm (GA) is used in this paper for solving the outer-level combinatorial optimization problems described in Section 3.2. Genetic algorithms are well-known population-based metaheuristic optimization techniques inspired by the process of natural selection, firstly introduced by Holland in 1960 [53]. They have been successfully applied to a wide range of real-world problems of significant complexity. A brief overview of the algorithm is here presented. Interested readers can find further details in Ref. [54]. A block diagram of the basic algorithm is proposed in Figure 5, highlighting the presence of three fundamental genetic operators (selection, crossover, and mutation) which form the backbone of any GA implementation. An initial population, that is, a collection of n P candidate solutions to the optimization problem (often referred as individuals or chromosomes), is randomly generated, aiming at covering the search space as uniformly as possible. Each individual is composed of a number of consecutive elements, called genes, which identify specific values of the design variables. At the end of each iteration (or generation), the fitness f (that is, the value of the function to optimize) of each individual in the current population is evaluated. Then, the next generation starts with the creation of a mating population, by selecting a number of individuals from the current population according to a selection operator, which tries to promote individuals with a good fitness, without sacrificing diversity too much. Typical selection operators are Stochastic Roulette Wheel and Stochastic Tournament [54], the latter being the one used in the present application. Next, the individuals (or parents) in the mating population are randomly paired and their genes combined to produce new solutions (or offspring), according to a crossover operator. The underlying idea is that, by combining somehow good solutions, it is possible to create new solutions which hopefully are better than the previous ones. The process is repeated until a new offspring population (usually with the same dimension as the previous one) is created. Then, a few randomly chosen individuals of the offspring population undergo a mutation process, with the aim of increasing the population diversity. The mutation operator changes, in a random way, some genes of the chosen individuals. This new population eventually replaces the previous one, and the process is repeated until some termination criterion (e.g., a maximum number of gen-erations n G ) has been met. As a minor, yet important, tweak, elitism is enforced: at the end of each generation, the best individual of the parent population is copied into the offspring population, replacing the worst individual. In this way, the best solution is prevented from being accidentally lost in the evolution process.
While selection operators are usually problem-independent, crossover and mutation operators are tightly related to the adopted encoding, that is, the way the real-word problem is described in terms of numerical variables (that is, genes). On the basis of the features of the decision variables, several encodings have been defined, with the most common being the binary, the integer-valued, and the real-valued encodings. In order to match the permutation-based formulations developed in Section 3.2, a permutation encoding is here considered; that is, each individual is a permutation p ∈ P L , where L = N for the problems in Equations (21) and (22), while L = N t for the problem in Equation (23). So, each gene takes an integer value in the range f1, ⋯, Lg, and two genes of the same individual cannot have the same value at one time. Standard, general-purpose, crossover operators (such as one-point, two-point, or uniform crossover) cannot be applied to a permutation-encoded individual without producing unfeasible offspring, that is, child individuals whose genotype contains multiple copies of the same integer value. The same goes for classical mutation operators, such as random gene flip. Fortunately, an abundance of permutation-preserving genetic operators, specifically developed in the context of the TSP problem, is documented in the literature [55] and can be employed in solving the problem under investigation.   [56] has been adopted in the present paper, as it is aimed at preserving both the order and the position of as many elements as possible from the parent individuals. A pseudocode of this algorithm is presented in Algorithm 1. The first offspring c ð1Þ is created starting from a swath (or cut) s = fp Mutation operators are also of great relevance in the framework of GAs, as they allow escaping from situations of premature stagnation of the population on a suboptimal solution; thus, their impact on the effectiveness of the algorithm should not be underestimated. Once again, an abundance of permutation-preserving mutation operators is documented in the literature. In the present application, the mutation operator is randomly selected at each generation between three possible rules: insert, swap, and reverse. All operators begin by randomly selecting two indices a and b in ½1, L, with a < b. For the insert operator, the value in the ath position is transferred to the bth position and the other genes of the individual are shifted accordingly; the swap operator exchanges the two genes in position a and b; finally, the order of the genes from a-th to b-th is reverted when considering the reverse operator.
Some control over the population diversity is often needed to avoid an excessive uniformity between individuals during the optimization process, which would result in a very inefficient use of the crossover operator and, consequently, in a premature stagnation of the population in a local optimum. To overcome this issue, an active diversity control strategy is implemented: the fitness of each individual is scaled by a penalty factor, which measures how many other individuals in the current population are similar to it. Let d i,k be the distance in the state space between the i-th individual p ðiÞ and the k-th individual p ðkÞ . Specifically, the Hamming distance [57] is used to measure the difference between permutations, essentially counting the number of positions at which the corresponding elements are different between the two individuals: where ½x is the Iverson bracket function, or indicator function, which returns 1 if x is true, 0 otherwise. The augmented (i.e., penalized) fitness of the i-th individual is evaluated as where d i = ∑ k≠i d i,k is the diversity score of the i-th individual and δ dc is a control radius.
In the presence of a large population, the evaluation of the diversity score of each individual might be too expensive (as it goes with Oðn 2 Þ). Hence, the full distance d i is substituted by an approximationd i evaluated over a number of n dc individuals randomly selected in the population, that is,d where r k is a random integer in ½1, n P \ fig. In this work, the following values of the abovementioned hyperparameters showed the best overall results: n dc = 100 and δ dc = 1.
procedure PMXðp ð1Þ , p ð2Þ Þ ⊳ Input: parents ðp ð1Þ , p ð2Þ Þ choose randomly a, b so that ⊳ Accept the proposal end for returnc ð1Þ ⊳Output: child c ð1Þ end procedure Algorithm 1: Partially Matched Crossover (PMX) algorithm. 10 International Journal of Aerospace Engineering 4.1. GA Hyperparameters. The performance of a GA clearly depends not only on the choice of the selection, mutation, and crossover operators but also on the value of several hyperparameters, which drive the different phases of the evolution process. In addition to the population size n P and the maximum number of generations n G that are common to all population-based metaheuristics, typical GA hyperparameters are the crossover probability p c , which represents the probability that the parents are replaced by the generated offspring at the end of the crossover phase; the mutation probability p m , which is the probability that a (child) individual undergoes a random mutation during the mutation phase; and other operator-specific parameters, such as the depth of the reverse mutation operator. A proper tuning of these hyperparameters may greatly improve the effectiveness of a GA. However, the optimal tuning of a GA is documented to be problem-dependent and represents an NP-hard problem by itself. Therefore, a preliminary trial-and-error tuning is usually done on a simplified, possibly scaled, version of the optimization problem of interest, with the hope of capturing all the relevant features of the original problem. The results of the hyperparameter tuning performed in this study are reported in Section 5.2.

Numerical Results
In this paper, numerical results are presented for different ADR missions aimed at removing all the debris composing a predetermined cluster with up to 20 objects orbiting in close coplanar LEOs. ID and initial orbital parameters assumed for both the chaser and the targets composing the cluster are listed in Table 2. When a N × D tour is analyzed, only the first N debris of the cluster are considered, and the total mission time is set to T M = 7NT 0 , where T 0 is the initial orbital period of the chaser.

Analysis of the Cost Heuristic.
The effectiveness of the suboptimal solution proposed as heuristic in Section 3.1 has been verified by comparison with the solution obtained by EOS in meaningful test cases. In particular, the rendezvous with a target moving on a circular orbit of radius r 2 /r 1 = 1:02 is analyzed in this section. Figure 6(a) shows the ΔV estimated by the 4-impulse heuristic as a function of initial phase angle Δθ 0 ∈ ½0, 2π and available transfer time T max ∈ ½3T 1 , 10T 1 . One should note that the initial phase angle is a function of time, changing at constant but very slow rate for close orbits. As a matter of fact, this figure presents the cost of the rendezvous maneuver as a function of initial time and transfer time length. Depending on the initial time, ΔV may be much greater than the minimum value corresponding to the time-free Hohmann transfer. For the sake of comparison, Figure 6(b) also shows the ΔV provided by the 3-impulse heuristic, which features abrupt changes (i.e., a sudden reduction) when the greater time-length permits an additional revolution around the Earth.

12
International Journal of Aerospace Engineering The cost in this class of mission is ruled more by the propellant needed to recover an incorrect initial phasing than by the basic ΔV required by the passage to a different radius. The outer-level combinatorial problem looks for and privileges target sequences that imply transfers starting with a favourable phase angle, i.e., in a narrow range close to values permitting a Hohmann transfer. Therefore, the initial time of each rendezvous is dictated by the solution of the outer-level problem. When the initial time (i.e., the initial phasing) does not permit a Hohmann transfer, the required ΔV is significantly dependent on the time which is available to complete the maneuver. An analysis of the rendezvous cost is better carried out by investigating the influence of the available time for significant assigned values of the initial phase angle. Figure 7 compares, from this perspective, the performance estimated by the two heuristics with the optimal transfer (as optimized by EOS). Even though the 3-impulse heuristic is closer to the features of the optimal solution, the 4-impulse heuristic provides a sufficient guess at the rendezvous cost. On the other hand, the trend of the 4-impulse heuristic solution is quite regular, without the abrupt changes that make the 3-impulse heuristic less apt to use when the rendezvous epochs are discretized over a coarse time grid. It is worthwhile to remark the quality of the EOS optimal solution:  13 International Journal of Aerospace Engineering performance is never lower than the ones estimated by heuristics. In particular, the same performance is numerically achieved by EOS when either heuristic corresponds to the theoretically best strategy. Figure 8 presents the chaser trajectory in a few significant cases. The circular orbit of the target is shown as a solid black line, starting from its initial position. The 4-impulse heuristic always uses the entire available time. On the contrary, the angular length of the 3-impulse heuristic is constrained to an integer number of revolutions, and the maneuver is completed by a coast arc, which is either initial or final. An internal maneuver is presented in Figure 8(a). The optimal solution does not present the constant-radius arc: nevertheless, by means of four impulses, it is able to make use of all the available time, thus improving performance. The features of the optimal maneuver are better highlighted by the three cases of increasing time length that are illustrated in Figures 8(b)-8(d) for an external maneuver (Δθ 0 = 300 deg). The 3-impulse heuristic solution for T max / T 1 = 3:3 has a very short final coasting and is very close to the optimal solution that, in this case, has also three impulses, but its second impulse is displaced from the apogee to allow using the entire angle range with a lower apogee. For greater values of T max , the optimal rendezvous requires four impulses and gradually approaches the corresponding heuristic solution (Figure 8(c)) that, for just one value of available time, is itself optimal. The optimal maneuver hereafter moves away from the 4-impulse heuristic resembling an impracticable 3-impulse heuristic solution with one more lap. The latter solution can only occur when the available time permits the replacement of the long coast arc with an additional revolution and is, just at that moment, optimal (Figure 8(d)). These considerations explain the trend of the optimal solution in Figure 7 that periodically and alternately coincides with the 3-and 4impulse heuristic solutions.

Tuning of the GA Hyperparameters.
A preliminary tuning of the GA hyperparameters has been carried out by analyzing several instances of either the uniform or discrete time-length tour, keeping the problem size sufficiently small. This allows for an exhaustive search to be completed in a reasonable amount of time; thus, the optimal solution of the considered problems can be easily retrieved and used for comparison with the results of the GA. In order to mitigate the effects due to the intrinsic stochastic nature of the algorithm, 100 independent runs   The first analysis concerned the influence of the population size n P and the number of generations n G on the problem solution. The aim is to identify, if possible, a putative optimal allocation of trials, that is, a pair n P and n G that achieves the best performance for a known, limited, amount of function evaluations FES = n G · n P . Figure 9 presents the success rate (percentage of runs that achieve the optimal solution) as a function of the population size n P , for a few threshold values of FES, for the 8 × 2 (Figure 9(a)) and 8 × 3 (Figure 9(b)) missions. Here, the crossover and mutation probability are set as p c = 0:9 and p m = 0:15, respectively, and the diversity control is active with its default values. The success rate obviously increases when both the population size and the number of generations increase. However, when the number of available FES is limited, the results are not so easy to interpret. An even split between number of individuals and iterations, that is, n P = n G ≈ ffiffiffiffiffiffiffi ffi FES p , seems close to the optimal configuration. It is also worthwhile noting that, for a given number of FES, the success rate is higher in the 8 × 3 mission than in the 8 × 2 mission, even though the total number of variables is larger (24 vs. 16), and reaches 100% with a population of 1024 individuals. The reason is probably related to the fact the the fitness landscape is smoother with a higher D, and this favours the chosen genetic operators.
The effects of crossover and mutation probability on the success rate have been also investigated. Figure 10 reports the success rate for different values of p c (Figure 10(a)) and p m (Figure 10(b)) against the number of generations n G . A 12-target mission is considered for this test, using a population of n P = 64 individuals; the diversity control is active with default values. According to Figure 10(a), the performance of GA seems to improve faster with n G when small crossover probabilities are used (p c = 0:5 and p c = 0:7), but, at the same time, tends to stall earlier, thus showing an apparent premature-convergence behavior. Conversely, higher crossover probabilities (p c = 0:9 and p c = 0:95) lead to slower, yet constant, improvements. Thus, with n G = 2 15 , the best results in terms of success rate are obtained with both p c = 0:7 (96%) and p c = 0:9 (95%). Figure 10(b) shows that when the number of generations is less than or equal to 500, the success rate is almost independent on p m . Only when a very high number of iterations are carried out, the effect of increasing the mutation probability becomes apparent. This is an expected behavior, as mutation is primarily devoted to avoiding stagnation over long time horizons.
Finally, Figure 11 shows the effect of the diversity control (dc) on the GA performance in both a uniform and a discrete time-length tour. In these plots, the solid-line curves represent the median fitness along the evolution process, while the shaded regions around them represent the interquartile range. It is possible to note that, in all reported cases, the use of diversity control leads to an improved solution, preventing GA from a premature stagnation. The beneficial effect of diversity control is more evident in problem instances with a larger number of targets N (Figure 11(a)), since, on average, in these cases, the GA is more likely to get trapped in a local minimum.

Problem Solutions.
For the sake of brevity, only the numerical results concerning two study cases, that is, the 15 -and 20-target missions, are here presented. For each of these scenarios, several independent optimizations have been carried out using the GA code, and the best solution found was elected as putative optimum of the outer-level problem. Then, this solution was given as input to the inner-level optimization algorithm, with the goal of approaching the  Table 3 presents the best solutions, in terms of ΔV tot , obtained by GA at the end of the outer-level optimization and by EOS after the inner-level optimization, with either fixed or free rendezvous times. For either mission scenario (i.e., for assigned values of number of targets and total transfer time), ΔV tot decreases as the time-division factor D increases, whereas the target sequence remains essentially unaltered, with just a few exceptions. The maneuvers of the GA solutions reflect the suboptimal 4-impulse heuristic and can be improved by EOS. From Table 4, it is possible to note that the differences between GA solutions and corresponding EOS time-fixed solutions are always in the order of few percent, which is the expected error of the devised heuristic, thus confirming its effectiveness and accuracy.

16
International Journal of Aerospace Engineering Starting from the GA solution with D = 1, any further improvement is due to the allocation of more time to the legs that cannot be executed according to a Hohmann transfer. This allocation can be achieved either by increasing D or by using EOS in its time-free version. It is a curious coincidence that in both scenarios, the GA solution (D = 4) and the EOS time-free solution (D = 1) show approximately the same Δ V tot . The rendezvous maneuvers of the best GA solution are suboptimal, and the target sequence is slightly different from that of the uniform time-length tour. The application of EOS to this new sequence provides a further reduction of the required propellant.
The solutions for the 15-and 20-target scenarios, obtained after the time-free optimization, are reported in Tables 4 and 5, for each time-division factor D. Several details (i.e., optimal target sequence and, for each rendezvous, arrival time and required ΔV) are presented. The last column, referring to the N × 4 solution, compares the ΔV of each rendezvous with the cost ΔV H of the optimal timefree transfer. The best trajectories are composed, for the greater part, of Hohmann transfers; 3-and 4-impulse legs are anyway flown by the chaser when the target is unfavourably phased during the available time window. This result is evident in Figure 12, which shows the radial position of the chaser during the whole 20 × 3 mission. The Hohmann transfer is usually exploited to move between distant orbits. In these cases, the phase angle changes more rapidly, and it is easier to find a favourable rendezvous opportunity.
The overall improvement with respect to the first GA solution, which assumes uniform time-lengths, is significant (close to 20%). The overall cost ΔV tot of the best missions (see the last row in Tables 4 and 5) is 20 − 25% greater than the value of the time-free trajectory having the same target sequence. These evidences let one argue that this two-step procedure is able to attain a solution very close to that of the original MINLP problem.

Conclusion
A two-level optimization procedure for the design of planar multirendezvous trajectories has been devised and tested. The goal is to minimize the overall propellant consumption, while completing, within a prescribed amount of time, the tour all the bodies in a prescribed set. Satisfactory numerical solutions have been obtained for an active debris removal mission comprising up to 20 target bodies orbiting in coplanar LEOs. The first, outer-level step concerns the definition of the optimal target sequence together with a preliminary evaluation of the rendezvous epochs. This step results in a permutation-optimization problem. Two different formulations, of increasing complexity, have been used, by supposing uniform time-lengths of the transfer legs or by selecting the rendezvous epochs in a discrete time grid. The growth in the problem complexity is rewarded by the attainment of a solution closer to the global optimum of the complete mixed-integer nonlinear programming problem. A genetic algorithm, with a peculiar permutation encoding and effective permutation-preserving genetic operators, is able to solve the combinatorial optimization problem. A simple, but fairly accurate, analytic solution of the single-target rendezvous maneuver is adopted as heuristic for a fast evaluation of the ΔV associated to each leg, without the need for an accurate optimization.
The solution attained at the end of the outer-level optimization procedure is further refined in the inner-level optimization step, by assuming the body sequence fixed and by optimizing the multi-impulse rendezvous maneuvers. Two formulations of increasing complexity are proposed even in this case. The first one maintains the rendezvous epochs found in the first step and only replaces each suboptimal trajectory described by the 4-impulse heuristic with an improved maneuver. More precisely, each body-to-body transfer is described by means of a peculiar parameterization based on the position of the impulses: the total velocity change is minimized. The second formulation keeps the target sequence but can modify the rendezvous epochs to achieve a more appropriate sizing of the time allocated for each leg. The contemporary optimization of the entire trajectory is necessary.
Numerical results provided by this two-level procedure make the authors confident about the attainment of a trajectory which is very close to the solution of the full mixed-integer nonlinear programming problem. The overall computational effort is low enough to be compatible with desktop personal computers.
The two levels of the optimization procedure appeared of equal importance in the investigated scenarios. Instead, if the number of bodies and the average time available for a single rendezvous are large, the correct target phasing occurs frequently, and a solution based on many Hohmann transfers and very few 4-impulse low-ΔV maneuvers can be soon achieved by the outer-lever optimization. The inner-level step becomes more important in problems with few targets and narrow time windows.

Data Availability
The data that support the findings of this study are contained in the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.