Approximate Solutions to Nonlinear Optimal Control Problems in Astrodynamics

A method to solve nonlinear optimal control problems is proposed in this work. The method implements an approximating sequence of time-varying linear quadratic regulators that converge to the solution of the original, nonlinear problem. Each subproblem is solved by manipulating the state transition matrix of the state-costate dynamics. Hard, soft, and mixed boundary conditions are handled. The presented method is a modified version of an algorithm known as “approximating sequence of Riccati equations.” Sample problems in astrodynamics are treated to show the effectiveness of the method, whose limitations are also discussed.


Introduction
Optimal control problems are solved with indirect or direct methods.Indirect methods stem from the calculus of variations [1,2]; direct methods use a nonlinear programming optimization [3,4].Both methods require the solution of a complex set of equations (Euler-Lagrange differential equations or Karush-Kuhn-Tucker algebraic equations) for which iterative numerical methods are used.These iterative procedures implement some form of Newton's method to find the zeros of a nonlinear function.They are initiated by providing an initial guess solution.Guessing an appropriate initial solution is not trivial and requires a deep knowledge of the problem at hand.In indirect methods, the initial value of the Lagrange multiplier has to be provided, whose lack of physical meaning makes it difficult to formulate a good guess.In direct methods, the initial trajectory and control have to be guessed at discrete points over the whole time interval.
This paper presents an approximate method to solve nonlinear optimal control problems.This is a modification of the method known as "approximating sequence of Riccati equations" (ASRE) [5,6].It transforms the nonlinear dynamics and objective function into a pseudolinear and quadraticlike structure, respectively, by using state-and controldependent functions.At each iteration, these functions are evaluated by using the solutions at the previous iteration, and therefore, a series of time-varying linear quadratic regulators is treated.This sequence is solved with a state transition matrix approach, where three different final conditions are handled: final state fully specified, final state not specified, and final state not completely specified.These define hard, soft, and mixed constrained problems, respectively.
The main feature of the presented method is that it does not require guessing any initial solution or Lagrange multiplier.In fact, iterations start by evaluating the state-and control-dependent functions using the initial condition and zero control, respectively.The way the dynamics and objective function are factorized recalls the state-dependent Riccati equations (SDRE) method [7][8][9].These two methods possess some similarities, although the way they solve the optimal control problem is different.As the method is approximated, suboptimal solutions are derived.These could be used as first guess solutions for either indirect or direct methods.

The Nonlinear Optimal Control Problem
The optimal control problem requires that, given a set of  first-order differential equations the  control functions u() must be determined within initial, final time   ,   , such that the performance index is minimized while satisfying  +  two-point conditions The problem consists in finding a solution that represents a stationary point of the augmented performance index where  is the vector of costate and ^is the multiplier of the boundary condition.The necessary conditions for optimality, also referred to as Euler-Lagrange equations, are where , the Hamiltonian, is The differential-algebraic system (5) must be solved together with the final boundary conditions (3) and the transversality conditions which define a differential-algebraic parametric two-point boundary value problem whose solution supplies ^and the functions x(), (), u(),  ∈ [  ,   ].

The Approximating Sequence of Riccati Equations
Let the controlled dynamics (1) be rewritten in the form and let the objective function (2) be rearranged as where the operators , , , , and  have appropriate dimensions.The nonlinear dynamics (8) and the performance index (9) define an optimal control problem.The initial state, x  , is assumed to be given, while the final condition ( in (3)) can assume three different forms (see Section 4).The problem is formulated as an approximating sequence of Riccati equations.This method reduces problem (8)- (9) to a series of time-varying linear quadratic regulators that are defined by evaluating the state-and control-dependent matrices using the solution of the previous iteration (the first iteration considers the initial condition and zero control).The initial step consists in solving problem 0, which is defined as follows: Problem 0 is a standard time-varying linear quadratic regulator (TVLQR), as the arguments of , , , , and  are all given except for the time.This problem is solved to yield where the superscript denotes the problem that the solution refers to.At a generic, subsequent iteration, problem  has to be solved.This is defined as follows: Problem  is again a TVLQR; note that x (−1) and u (−1) are the solutions of problem  − 1 achieved at previous iteration.Solving problem  yields x () () and u Iterations continue until a certain convergence criterion is satisfied.In the present implementation of the algorithm, the convergence is reached when where  is a prescribed tolerance.That is, iterations terminate when the difference between each component of the state, evaluated for all times, changes by less than  between two consecutive iterations.

Solution of the Time-Varying Linear Quadratic Regulator by the State Transition Matrix
With the approach sketched in Section 3, a fully nonlinear optimal control problem is reduced to a sequence of timevarying linear quadratic regulators.These can be solved a number of times to achieve an approximate solution of the original, nonlinear problem.This is done by exploiting the structure of the problem as well as its state transition matrix.This scheme differs from that implemented in [5,6], and, in part, is described in [1].Suppose that the following dynamics are given: together with the quadratic objective function where , , and  are positive semidefinite and positive definite time-varying matrices with appropriate dimensions, respectively.The Hamiltonian of this problem is and the optimality conditions (5) read From (18), it is possible to get which can be substituted into ( 16)-(17) to yield In a compact form, (20) can be arranged as Since (21) is a system of linear differential equations, the exact solution can be written as where the functions   ,   ,   , and   are the components of the state transition matrix, which can be found by integrating the following dynamics: with the initial conditions If both x  and   were given, it would be possible to compute x() and () through ( 22)-( 23), and therefore the optimal control function u() with (19).The initial condition is assumed to be given, whereas the computation of   depends on the final condition, which, in the present algorithm, can be defined in three different ways.

Hard Constrained Problem.
In a hard constrained problem (HCP), the value of the final state is fully specified, x(  ) = x  , and therefore, (14) does not account for .The value of   can be found by writing (22) at final time and by solving for   ; that is, where the dependence of the state transition matrix components on   ,   is omitted for brevity.From the first row of (31), it is possible to get which can be substituted in the second row of (31) to yield Equations ( 33)-( 34), together with the transversality condition (  ) = (  )z(  ), can be substituted in the second row of (32) to compute the component of the initial costate where Once   is know, the remaining part of the initial costate,   , is computed through (33), and therefore, the full initial costate is obtained as a function of the initial condition, given final condition, initial and final time; that is,   (x  , y  ,   ,   ) = (  (x  , y  ,   ,   ),   (x  , y  ,   ,   )).

Numerical Examples
Two simple problems with nonlinear dynamics are considered to apply the developed algorithm.These correspond to the controlled relative spacecraft motion and to the controlled two-body dynamics for low-thrust transfers.

Low-Thrust
Rendezvous.This problem is taken from the literature where a solution is available, for comparison's sake [10,11].Consider the planar, relative motion of two particles in a central gravity field expressed in a rotating frame with normalized units: the length unit is equal to the orbital radius, the time unit is such that the orbital period is 2, and the gravitational parameter is equal to 1.In these dynamics, the state, x = ( 1 ,  2 ,  3 ,  4 ), represents the radial, tangential displacements ( 1 ,  2 ) and the radial, tangential velocity deviations ( 3 ,  4 ), respectively.The control, u = ( 1 ,  2 ), is made up by the radial and tangential accelerations, respectively.
Soft Constrained Rendezvous.The SCP considers the following objective function: with  = diag(25, 15, 10, 10),   = 0 and   = 1 (x  is free).The differential equations (37) are factorized into the form of (8) as with ( 1 ,  2 ) = −1/[( 1 + 1) 2 +  2 2 ] 3/2 + 1.Thus, the problem is put into the pseudo-LQR form ( 8)-( 9) by defining (x) and  as in (40) and by setting  = 0 4×4 and  =  2×2 .The two problems have been solved with the developed method.Table 1 reports the details of the HCP and SCP, whose solutions are shown in Figures 1 and 2, respectively.In Table 1,  is the objective function at the final iteration, "Iter" is the number of iterations, and the "CPU time" is the computational time (this refers to an Intel Core 2 Duo 2 GHz with 4 GB RAM running Mac OS X 10.6).The termination tolerance  in ( 12) is 10 −9 .The optimal solutions found replicate those already known in the literature [10,11], indicating the effectiveness of the developed method.

Low-Thrust Orbital Transfer.
In this problem, the controlled, planar Keplerian motion of a spacecraft in polar coordinates is studied.The dynamics are written in scaled coordinates, where the length unit corresponds to the radius of the initial orbit, the time unit is such that its period is 2, and the gravitational parameter is 1.The state, x = ( 1 ,  2 ,  3 ,  4 ), is made up by the radial distance from the attractor ( 1 ), the phase angle ( 2 ), the radial velocity ( 3 ), and the transversal velocity ( 4 ), whereas the control, u = ( 1 ,  2 ), corresponds to the radial and transversal accelerations, respectively [12,13].The equations of motion are and the objective function is with   = 0 and   = .The initial state corresponds to the conditions at the initial orbit; that is, x  = (1, 0, 0, 1).Two different HCPs are solved, which correspond to the final states  x  = (1.52,, 0, 1.52 −3/2 ) and x  = (1.52,1.5, 0, 1.52 −3/2 ), respectively.This setup mimics an Earth-Mars low-thrust transfer.The dynamics (41) and the objective function (42) are put in the form ( 8)-( 9) by defining  = 0 4×4 ,  =  2×2 , and The two HCPs have been solved with the developed method.The solutions' details are reported in Table 2, whose columns have the same meaning as in Table 1.It can be seen that more iterations and an increased computational burden are required to solve this problem.The solution with  2, = 1.5 is reported in Figure 3.

Conclusion
In this paper, an approximated method to solve nonlinear optimal control problems has been presented, with applications to sample cases in astrodynamics.With this method, the nonlinear dynamics and objective function are factorized in a pseudolinear and quadratic-like forms, which are similar to those used in the state-dependent Riccati equation approach.Once in this form, a number of time-varying linear quadratic regulator problems are solved.A state transition matrix approach is used to deal with the time-varying linear quadratic regulators.The results show the effectiveness of the method, which can be used to either have suboptimal solutions or to provide initial solutions to more accurate optimizers.