ISRN Aerospace Engineering, Volume 2013, Article ID 950912, Hindawi Publishing Corporation, ISSN 2314-6427, doi:10.1155/2013/950912

Research Article

Approximate Solutions to Nonlinear Optimal Control Problems in Astrodynamics

Francesco Topputo and Franco Bernelli-Zazzera
Department of Aerospace Science and Technology, Politecnico di Milano, Via La Masa 34, 20156 Milano, Italy

Received 25 June 2013; Accepted 27 August 2013; Published 1 October 2013
Academic Editors: H. Baoyin, C. Bigelow, and D. Yu

Copyright © 2013 Francesco Topputo and Franco Bernelli-Zazzera. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

A method to solve nonlinear optimal control problems is proposed in this work. The method implements an approximating sequence of time-varying linear quadratic regulators that converge to the solution of the original, nonlinear problem. Each subproblem is solved by manipulating the state transition matrix of the state-costate dynamics. Hard, soft, and mixed boundary conditions are handled. The presented method is a modified version of an algorithm known as “approximating sequence of Riccati equations.” Sample problems in astrodynamics are treated to show the effectiveness of the method, whose limitations are also discussed.

1. Introduction

Optimal control problems are solved with indirect or direct methods. Indirect methods stem from the calculus of variations [1, 2]; direct methods use a nonlinear programming optimization [3, 4]. Both methods require the solution of a complex set of equations (Euler-Lagrange differential equations or Karush-Kuhn-Tucker algebraic equations) for which iterative numerical methods are used. These iterative procedures implement some form of Newton’s method to find the zeros of a nonlinear function. They are initiated by providing an initial guess solution. Guessing an appropriate initial solution is not trivial and requires a deep knowledge of the problem at hand. In indirect methods, the initial value of the Lagrange multiplier has to be provided, whose lack of physical meaning makes it difficult to formulate a good guess. In direct methods, the initial trajectory and control have to be guessed at discrete points over the whole time interval.

This paper presents an approximate method to solve nonlinear optimal control problems. It is a modification of the method known as the "approximating sequence of Riccati equations" (ASRE) [5, 6]. The nonlinear dynamics and the objective function are transformed into pseudolinear and quadratic-like structures, respectively, by means of state- and control-dependent functions. At each iteration, these functions are evaluated using the solution of the previous iteration, so that a sequence of time-varying linear quadratic regulators is treated. This sequence is solved with a state transition matrix approach, in which three different final conditions are handled: final state fully specified, final state not specified, and final state not completely specified. These define hard, soft, and mixed constrained problems, respectively.

The main feature of the presented method is that it does not require guessing any initial solution or Lagrange multiplier. In fact, the iterations start by evaluating the state- and control-dependent functions at the initial condition and at zero control, respectively. The way the dynamics and the objective function are factorized recalls the state-dependent Riccati equations (SDRE) method. The two methods share some similarities, although they solve the optimal control problem in different ways. Because the method is approximate, the solutions it produces are suboptimal. These could be used as first-guess solutions for either indirect or direct methods.

2. The Nonlinear Optimal Control Problem

The optimal control problem requires that, given a set of n first-order differential equations
\[ \dot{x} = f(x, u, t), \tag{1} \]
the m control functions $u(t)$ must be determined within the initial and final times $t_i$, $t_f$, such that the performance index
\[ J = \varphi\big(x(t_f), t_f\big) + \int_{t_i}^{t_f} L(x, u, t)\,\mathrm{d}t \tag{2} \]
is minimized while satisfying the $n + q$ two-point conditions
\[ x(t_i) = x_i, \qquad \psi\big(x(t_f), t_f\big) = 0. \tag{3} \]

The problem consists in finding a solution that represents a stationary point of the augmented performance index
\[ \bar{J} = \varphi\big(x(t_f), t_f\big) + \nu^T \psi\big(x(t_f), t_f\big) + \int_{t_i}^{t_f} \Big[ L(x, u, t) + \lambda^T \big( f(x, u, t) - \dot{x} \big) \Big]\,\mathrm{d}t, \tag{4} \]
where $\lambda$ is the costate vector and $\nu$ is the multiplier of the final boundary condition. The necessary conditions for optimality, also referred to as the Euler-Lagrange equations, are
\[ \dot{x} = H_\lambda, \qquad \dot{\lambda} = -H_x, \qquad H_u = 0, \tag{5} \]
where $H$, the Hamiltonian, is
\[ H(x, \lambda, u, t) = L(x, u, t) + \lambda^T f(x, u, t). \tag{6} \]
The differential-algebraic system (5) must be solved together with the final boundary conditions (3) and the transversality condition
\[ \lambda(t_f) = \left[ \varphi_x + (\psi_x)^T \nu \right]_{t = t_f}, \tag{7} \]
which together define a differential-algebraic parametric two-point boundary value problem whose solution supplies $\nu$ and the functions $x(t)$, $\lambda(t)$, $u(t)$, $t \in [t_i, t_f]$.
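As an illustration of conditions (5)-(6), the Hamiltonian formalism can be exercised symbolically on an assumed scalar example, $\dot{x} = u$ with $L = u^2/2$ (a toy case for illustration only, not a problem treated in this paper); a minimal SymPy sketch:

```python
import sympy as sp

# Scalar toy problem (assumed): x' = u, L = u^2/2
x, u, lam, t = sp.symbols('x u lambda t')
L = u**2 / 2
f = u
H = L + lam * f                      # Hamiltonian, Eq. (6)

print(sp.diff(H, lam))               # x' = H_lambda, recovers the dynamics: u
print(sp.diff(H, x))                 # lambda' = -H_x: 0, so lambda is constant
print(sp.solve(sp.diff(H, u), u))    # H_u = 0 gives the control: [-lambda]
```

For this toy case, $H_u = 0$ yields $u = -\lambda$, which is the scalar instance of the feedback law that reappears in (19).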

3. The Approximating Sequence of Riccati Equations

Let the controlled dynamics (1) be rewritten in the form
\[ \dot{x} = A(x, t)\, x + B(x, u, t)\, u, \tag{8} \]
and let the objective function (2) be rearranged as
\[ J = \frac{1}{2} x^T(t_f)\, S\big(x(t_f), t_f\big)\, x(t_f) + \frac{1}{2} \int_{t_i}^{t_f} \Big[ x^T Q(x, t)\, x + u^T R(x, u, t)\, u \Big]\,\mathrm{d}t, \tag{9} \]
where the operators A, B, S, Q, and R have appropriate dimensions. The nonlinear dynamics (8) and the performance index (9) define an optimal control problem. The initial state, $x_i$, is assumed to be given, while the final condition ($\psi$ in (3)) can assume three different forms (see Section 4). The problem is formulated as an approximating sequence of Riccati equations. This method reduces problem (8)-(9) to a series of time-varying linear quadratic regulators that are defined by evaluating the state- and control-dependent matrices using the solution of the previous iteration (the first iteration considers the initial condition and zero control).

The initial step consists in solving problem 0, which is defined as follows:
\[ \dot{x}^{(0)} = A(x_i, t)\, x^{(0)} + B(x_i, 0, t)\, u^{(0)}, \]
\[ J = \frac{1}{2} x^{(0)T}(t_f)\, S(x_i, t_f)\, x^{(0)}(t_f) + \frac{1}{2} \int_{t_i}^{t_f} \Big[ x^{(0)T} Q(x_i, t)\, x^{(0)} + u^{(0)T} R(x_i, 0, t)\, u^{(0)} \Big]\,\mathrm{d}t. \tag{10} \]
Problem 0 is a standard time-varying linear quadratic regulator (TVLQR), as the arguments of A, B, S, Q, and R are all given except for the time. This problem is solved to yield $x^{(0)}(t)$ and $u^{(0)}(t)$, $t \in [t_i, t_f]$, where the superscript denotes the problem that the solution refers to.

At a generic, subsequent iteration, problem k has to be solved. This is defined as follows:
\[ \dot{x}^{(k)} = A\big(x^{(k-1)}(t), t\big)\, x^{(k)} + B\big(x^{(k-1)}(t), u^{(k-1)}(t), t\big)\, u^{(k)}, \]
\[ J = \frac{1}{2} x^{(k)T}(t_f)\, S\big(x^{(k-1)}(t_f), t_f\big)\, x^{(k)}(t_f) + \frac{1}{2} \int_{t_i}^{t_f} \Big[ x^{(k)T} Q\big(x^{(k-1)}(t), t\big)\, x^{(k)} + u^{(k)T} R\big(x^{(k-1)}(t), u^{(k-1)}(t), t\big)\, u^{(k)} \Big]\,\mathrm{d}t. \tag{11} \]
Problem k is again a TVLQR; note that $x^{(k-1)}$ and $u^{(k-1)}$ are the solutions of problem k−1, achieved at the previous iteration. Solving problem k yields $x^{(k)}(t)$ and $u^{(k)}(t)$, $t \in [t_i, t_f]$.

Iterations continue until a certain convergence criterion is satisfied. In the present implementation of the algorithm, convergence is reached when
\[ \left\| x^{(k)} - x^{(k-1)} \right\| = \max_{t \in [t_i, t_f]} \left\{ \left| x_j^{(k)}(t) - x_j^{(k-1)}(t) \right|,\ j = 1, \dots, n \right\} \le \varepsilon, \tag{12} \]
where $\varepsilon$ is a prescribed tolerance. That is, iterations terminate when each component of the state, evaluated over the whole time interval, changes by less than $\varepsilon$ between two consecutive iterations.
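To make the iteration concrete, the whole ASRE loop can be sketched on an assumed scalar toy problem, $\dot{x} = -x^3 + u$ factored as $A(x) = -x^2$, with a hard constraint $x(0) = 1 \to x(1) = 0$, $Q = 0$, and $R = 1$ (this problem and every name below are illustrative, not taken from the paper). Each pass freezes the coefficients on the previous trajectory, solves the resulting TVLQR through its state transition matrix, and applies the stopping test (12):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Assumed scalar toy problem: x' = -x^3 + u, i.e. x' = A(x) x + u, A(x) = -x^2
ti, tf, x_i, x_f, eps = 0.0, 1.0, 1.0, 0.0, 1e-9
ts = np.linspace(ti, tf, 101)
x_prev = lambda t: x_i + 0.0 * np.asarray(t, dtype=float)  # iteration-0 guess

for k in range(50):
    a = lambda t, xp=x_prev: -xp(t) ** 2   # frozen coefficient A(x^(k-1)(t))

    # State transition matrix of the frozen state-costate system (cf. (24))
    def stm_rhs(t, p, a=a):
        M = np.array([[a(t), -1.0], [0.0, -a(t)]])
        return (M @ p.reshape(2, 2)).ravel()

    stm = solve_ivp(stm_rhs, (ti, tf), np.eye(2).ravel(),
                    rtol=1e-10, atol=1e-12)
    pxx, pxl = stm.y[0, -1], stm.y[1, -1]
    lam_i = (x_f - pxx * x_i) / pxl        # hard-constraint costate, cf. (27)

    # Propagate problem k; the control is u = -lambda, cf. (19)
    traj = solve_ivp(lambda t, y, a=a: [a(t) * y[0] - y[1], -a(t) * y[1]],
                     (ti, tf), [x_i, lam_i], dense_output=True,
                     rtol=1e-10, atol=1e-12)
    x_new = traj.sol(ts)[0]
    diff = np.max(np.abs(x_new - np.asarray(x_prev(ts), dtype=float)))
    x_prev = lambda t, s=traj.sol: s(t)[0]  # coefficients for iteration k+1
    if diff <= eps:                         # stopping test, Eq. (12)
        break

print(k, diff, abs(traj.sol(tf)[0] - x_f))
```

Each subproblem hits the hard constraint to integration accuracy, and the outer loop stops once consecutive state iterates agree to within eps, mirroring the structure described above.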

4. Solution of the Time-Varying Linear Quadratic Regulator by the State Transition Matrix

With the approach sketched in Section 3, a fully nonlinear optimal control problem is reduced to a sequence of time-varying linear quadratic regulators. These can be solved a number of times to achieve an approximate solution of the original, nonlinear problem. This is done by exploiting the structure of the problem as well as its state transition matrix. This scheme differs from the one implemented in [5, 6] and has, in part, been described elsewhere.

Suppose that the following dynamics are given:
\[ \dot{x} = A(t)\, x + B(t)\, u, \tag{13} \]
together with the quadratic objective function
\[ J = \frac{1}{2} x^T(t_f)\, S(t_f)\, x(t_f) + \frac{1}{2} \int_{t_i}^{t_f} \Big[ x^T Q(t)\, x + u^T R(t)\, u \Big]\,\mathrm{d}t, \tag{14} \]
where $Q(t)$ and $S(t_f)$ are positive semidefinite and $R(t)$ is positive definite, all with appropriate dimensions. The Hamiltonian of this problem is
\[ H = \frac{1}{2} \Big[ x^T Q(t)\, x + u^T R(t)\, u \Big] + \lambda^T \Big[ A(t)\, x + B(t)\, u \Big], \tag{15} \]
and the optimality conditions (5) read
\[ \dot{x} = A(t)\, x + B(t)\, u, \tag{16} \]
\[ \dot{\lambda} = -Q(t)\, x - A^T(t)\, \lambda, \tag{17} \]
\[ 0 = R(t)\, u + B^T(t)\, \lambda. \tag{18} \]
From (18), it is possible to get
\[ u = -R^{-1}(t)\, B^T(t)\, \lambda, \tag{19} \]
which can be substituted into (16)-(17) to yield
\[ \dot{x} = A(t)\, x - B(t) R^{-1}(t) B^T(t)\, \lambda, \qquad \dot{\lambda} = -Q(t)\, x - A^T(t)\, \lambda. \tag{20} \]
In compact form, (20) can be arranged as
\[ \begin{bmatrix} \dot{x} \\ \dot{\lambda} \end{bmatrix} = \begin{bmatrix} A(t) & -B(t) R^{-1}(t) B^T(t) \\ -Q(t) & -A^T(t) \end{bmatrix} \begin{bmatrix} x \\ \lambda \end{bmatrix}. \tag{21} \]
Since (21) is a system of linear differential equations, the exact solution can be written as
\[ x(t) = \phi_{xx}(t_i, t)\, x_i + \phi_{x\lambda}(t_i, t)\, \lambda_i, \tag{22} \]
\[ \lambda(t) = \phi_{\lambda x}(t_i, t)\, x_i + \phi_{\lambda\lambda}(t_i, t)\, \lambda_i, \tag{23} \]
where $\phi_{xx}$, $\phi_{x\lambda}$, $\phi_{\lambda x}$, and $\phi_{\lambda\lambda}$ are the blocks of the state transition matrix, which can be found by integrating the following dynamics:
\[ \begin{bmatrix} \dot{\phi}_{xx} & \dot{\phi}_{x\lambda} \\ \dot{\phi}_{\lambda x} & \dot{\phi}_{\lambda\lambda} \end{bmatrix} = \begin{bmatrix} A(t) & -B(t) R^{-1}(t) B^T(t) \\ -Q(t) & -A^T(t) \end{bmatrix} \begin{bmatrix} \phi_{xx} & \phi_{x\lambda} \\ \phi_{\lambda x} & \phi_{\lambda\lambda} \end{bmatrix}, \tag{24} \]
with the initial conditions
\[ \phi_{xx}(t_i, t_i) = \phi_{\lambda\lambda}(t_i, t_i) = I_{n \times n}, \qquad \phi_{x\lambda}(t_i, t_i) = \phi_{\lambda x}(t_i, t_i) = 0_{n \times n}. \tag{25} \]
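A minimal numerical sketch of the propagation (24)-(25), assuming SciPy and a time-invariant double integrator as the example system (not a problem from the paper), so that the integrated STM can be cross-checked against the matrix exponential:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

# Assumed toy system: double integrator, R = I, Q = 0
n = 2
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.zeros((n, n))
R = np.eye(1)

# Hamiltonian coefficient matrix of (21); time-invariant here for simplicity
M = np.block([[A, -B @ np.linalg.inv(R) @ B.T],
              [-Q, -A.T]])

def stm_rhs(t, phi_flat):
    """Right-hand side of (24): Phi' = M(t) Phi, flattened for solve_ivp."""
    Phi = phi_flat.reshape(2 * n, 2 * n)
    return (M @ Phi).ravel()

ti, tf = 0.0, 1.0
sol = solve_ivp(stm_rhs, (ti, tf), np.eye(2 * n).ravel(),
                rtol=1e-10, atol=1e-12)
Phi_tf = sol.y[:, -1].reshape(2 * n, 2 * n)

# For constant M the STM is the matrix exponential; use it as a cross-check
print(np.allclose(Phi_tf, expm(M * (tf - ti)), atol=1e-7))  # True
```

In the ASRE setting the coefficient matrix is genuinely time-varying, so the numerical integration of (24) is what remains; the `expm` comparison is only available in this constant-coefficient toy case.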

If both xi and λi were given, it would be possible to compute x(t) and λ(t) through (22)-(23), and therefore the optimal control function u(t) with (19). The initial condition is assumed to be given, whereas the computation of λi depends on the final condition, which, in the present algorithm, can be defined in three different ways.

4.1. Hard Constrained Problem

In a hard constrained problem (HCP), the value of the final state is fully specified, $x(t_f) = x_f$, and therefore (14) does not account for S. The value of $\lambda_i$ can be found by writing (22) at the final time,
\[ x_f = \phi_{xx}(t_i, t_f)\, x_i + \phi_{x\lambda}(t_i, t_f)\, \lambda_i, \tag{26} \]
and solving for $\lambda_i$; that is,
\[ \lambda_i(x_i, x_f, t_i, t_f) = \phi_{x\lambda}^{-1}(t_i, t_f) \left[ x_f - \phi_{xx}(t_i, t_f)\, x_i \right]. \tag{27} \]
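The computation in (26)-(27) can be sketched numerically on an assumed double-integrator toy problem ($R = I$, $Q = 0$; not the paper's rendezvous case), checking that the recovered costate actually drives the state to $x_f$:

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

# Assumed toy system: double integrator, R = I, Q = 0
n = 2
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
M = np.block([[A, -B @ B.T], [np.zeros((n, n)), -A.T]])  # matrix of (21)

ti, tf = 0.0, 1.0
Phi = expm(M * (tf - ti))          # exact STM: M is constant in this toy case
phi_xx, phi_xl = Phi[:n, :n], Phi[:n, n:]

xi = np.array([1.0, 0.0])
xf = np.array([0.0, 0.0])

# Eq. (27), solved as a linear system instead of forming the inverse
lam_i = np.linalg.solve(phi_xl, xf - phi_xx @ xi)

# Cross-check: propagate (20) from (xi, lam_i); x(tf) must reach xf
sol = solve_ivp(lambda t, y: M @ y, (ti, tf), np.concatenate([xi, lam_i]),
                rtol=1e-10, atol=1e-12)
print(lam_i, np.allclose(sol.y[:n, -1], xf, atol=1e-6))
```

In practice $\phi_{x\lambda}^{-1}$ is never formed explicitly; solving the linear system, as above, is better conditioned.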

4.2. Soft Constrained Problem

In a soft constrained problem (SCP), the final state is not specified, and thus S in (14) is an $n \times n$ positive definite matrix. The transversality condition (7) sets a relation between the state and costate at the final time,
\[ \lambda(t_f) = S(t_f)\, x(t_f), \tag{28} \]
which can be used to find $\lambda_i$. This is done by writing (22)-(23) at the final time and using (28):
\[ x(t_f) = \phi_{xx}(t_i, t_f)\, x_i + \phi_{x\lambda}(t_i, t_f)\, \lambda_i, \qquad S(t_f)\, x(t_f) = \phi_{\lambda x}(t_i, t_f)\, x_i + \phi_{\lambda\lambda}(t_i, t_f)\, \lambda_i. \tag{29} \]
Equations (29) represent a linear algebraic system of 2n equations in the 2n unknowns $\{x(t_f), \lambda_i\}$. The system can be solved by substitution to yield
\[ \lambda_i(x_i, t_i, t_f) = \left[ \phi_{\lambda\lambda}(t_i, t_f) - S(t_f)\, \phi_{x\lambda}(t_i, t_f) \right]^{-1} \left[ S(t_f)\, \phi_{xx}(t_i, t_f) - \phi_{\lambda x}(t_i, t_f) \right] x_i. \tag{30} \]
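Formula (30) admits the same kind of sketch, again on an assumed double integrator with an illustrative diagonal terminal weight S; the natural cross-check is the transversality condition (28):

```python
import numpy as np
from scipy.integrate import solve_ivp
from scipy.linalg import expm

# Assumed toy system: double integrator, R = I, Q = 0
n = 2
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
M = np.block([[A, -B @ B.T], [np.zeros((n, n)), -A.T]])  # matrix of (21)

ti, tf = 0.0, 1.0
Phi = expm(M * (tf - ti))                  # exact STM (M constant here)
pxx, pxl = Phi[:n, :n], Phi[:n, n:]
plx, pll = Phi[n:, :n], Phi[n:, n:]

S = np.diag([25.0, 10.0])                  # illustrative terminal weight
xi = np.array([1.0, 0.0])

# Eq. (30): lambda_i = [pll - S pxl]^{-1} [S pxx - plx] xi
lam_i = np.linalg.solve(pll - S @ pxl, (S @ pxx - plx) @ xi)

# Cross-check the transversality condition (28): lambda(tf) = S x(tf)
sol = solve_ivp(lambda t, y: M @ y, (ti, tf), np.concatenate([xi, lam_i]),
                rtol=1e-10, atol=1e-12)
x_tf, lam_tf = sol.y[:n, -1], sol.y[n:, -1]
print(np.allclose(lam_tf, S @ x_tf, atol=1e-6))
```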

4.3. Mixed Constrained Problem

In a mixed constrained problem (MCP), some components of the final state are specified and some are not. Without any loss of generality, let the state be decomposed as $x = (y, z)$, where $y$ collects the p components known at the final time, $y(t_f) = y_f$, and $z$ collects the remaining $n - p$ components. The costate is decomposed accordingly as $\lambda = (\xi, \eta)$. With this formalism, S in (14) is $(n-p) \times (n-p)$, and it is pre- and postmultiplied by $z(t_f)$. The transversality condition (7) reduces to $\eta(t_f) = S(t_f)\, z(t_f)$.

The MCP is solved by partitioning the state transition matrix in a suitable form such that, at the final time, (22)-(23) read
\[ \begin{bmatrix} y(t_f) \\ z(t_f) \end{bmatrix} = \begin{bmatrix} \phi_{yy} & \phi_{yz} \\ \phi_{zy} & \phi_{zz} \end{bmatrix} \begin{bmatrix} y_i \\ z_i \end{bmatrix} + \begin{bmatrix} \phi_{y\xi} & \phi_{y\eta} \\ \phi_{z\xi} & \phi_{z\eta} \end{bmatrix} \begin{bmatrix} \xi_i \\ \eta_i \end{bmatrix}, \tag{31} \]
\[ \begin{bmatrix} \xi(t_f) \\ \eta(t_f) \end{bmatrix} = \begin{bmatrix} \phi_{\xi y} & \phi_{\xi z} \\ \phi_{\eta y} & \phi_{\eta z} \end{bmatrix} \begin{bmatrix} y_i \\ z_i \end{bmatrix} + \begin{bmatrix} \phi_{\xi\xi} & \phi_{\xi\eta} \\ \phi_{\eta\xi} & \phi_{\eta\eta} \end{bmatrix} \begin{bmatrix} \xi_i \\ \eta_i \end{bmatrix}, \tag{32} \]
where the dependence of the state transition matrix blocks on $t_i$, $t_f$ is omitted for brevity. From the first row of (31), it is possible to get
\[ \xi_i = \phi_{y\xi}^{-1} \left[ y_f - \phi_{yy}\, y_i - \phi_{yz}\, z_i \right] - \phi_{y\xi}^{-1} \phi_{y\eta}\, \eta_i, \tag{33} \]
which can be substituted into the second row of (31) to yield
\[ z(t_f) = \left[ \phi_{zy} - \phi_{z\xi} \phi_{y\xi}^{-1} \phi_{yy} \right] y_i + \left[ \phi_{zz} - \phi_{z\xi} \phi_{y\xi}^{-1} \phi_{yz} \right] z_i + \phi_{z\xi} \phi_{y\xi}^{-1}\, y_f + \left[ \phi_{z\eta} - \phi_{z\xi} \phi_{y\xi}^{-1} \phi_{y\eta} \right] \eta_i. \tag{34} \]
Equations (33)-(34), together with the transversality condition $\eta(t_f) = S(t_f)\, z(t_f)$, can be substituted into the second row of (32) to compute this component of the initial costate:
\[ \eta_i(x_i, y_f, t_i, t_f) = \left[ \tilde{\phi}_{\eta\eta} \right]^{-1} w(x_i, y_f, t_i, t_f), \tag{35} \]
where
\[ \tilde{\phi}_{\eta\eta} = \phi_{\eta\eta} - \phi_{\eta\xi} \phi_{y\xi}^{-1} \phi_{y\eta} - S \left( \phi_{z\eta} - \phi_{z\xi} \phi_{y\xi}^{-1} \phi_{y\eta} \right), \]
\[ w(x_i, y_f, t_i, t_f) = \left[ S \left( \phi_{zy} - \phi_{z\xi} \phi_{y\xi}^{-1} \phi_{yy} \right) - \phi_{\eta y} + \phi_{\eta\xi} \phi_{y\xi}^{-1} \phi_{yy} \right] y_i + \left[ S \left( \phi_{zz} - \phi_{z\xi} \phi_{y\xi}^{-1} \phi_{yz} \right) - \phi_{\eta z} + \phi_{\eta\xi} \phi_{y\xi}^{-1} \phi_{yz} \right] z_i + \left[ S\, \phi_{z\xi} \phi_{y\xi}^{-1} - \phi_{\eta\xi} \phi_{y\xi}^{-1} \right] y_f. \tag{36} \]
Once $\eta_i$ is known, the remaining part of the initial costate, $\xi_i$, is computed through (33), and therefore the full initial costate is obtained as a function of the initial condition, the given final condition, and the initial and final times; that is, $\lambda_i(x_i, y_f, t_i, t_f) = \big( \xi_i(x_i, y_f, t_i, t_f),\ \eta_i(x_i, y_f, t_i, t_f) \big)$.
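With $n = 2$ and $p = 1$ every block in (33)-(36) becomes a scalar, which makes the bookkeeping easy to check on an assumed toy double integrator (illustrative values only, not a problem from the paper): $y = x_1$ is prescribed at $t_f$, $z = x_2$ is left free and weighted by a scalar S.

```python
import numpy as np
from scipy.linalg import expm

# Assumed toy system: double integrator, R = 1, Q = 0; y = x1 fixed, z = x2 free
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
M = np.block([[A, -B @ B.T], [np.zeros((2, 2)), -A.T]])

ti, tf = 0.0, 1.0
P = expm(M * (tf - ti))   # STM blocks, state-costate ordering (y, z, xi, eta)
pyy, pyz, pyx, pyh = P[0, 0], P[0, 1], P[0, 2], P[0, 3]
pzy, pzz, pzx, pzh = P[1, 0], P[1, 1], P[1, 2], P[1, 3]
phy, phz, phx, phh = P[3, 0], P[3, 1], P[3, 2], P[3, 3]

S = 10.0                      # weight on the free final component z(tf)
yi, zi, yf = 1.0, 0.0, 0.0    # initial state and prescribed y(tf)

# Eqs. (35)-(36), written out with scalar blocks
phh_t = phh - phx / pyx * pyh - S * (pzh - pzx / pyx * pyh)
w = ((S * (pzy - pzx / pyx * pyy) - phy + phx / pyx * pyy) * yi
     + (S * (pzz - pzx / pyx * pyz) - phz + phx / pyx * pyz) * zi
     + (S * pzx / pyx - phx / pyx) * yf)
eta_i = w / phh_t
# Eq. (33)
xi_i = (yf - pyy * yi - pyz * zi) / pyx - pyh / pyx * eta_i

# Cross-check through the STM: y(tf) = yf and eta(tf) = S z(tf) must hold
fin = P @ np.array([yi, zi, xi_i, eta_i])
print(np.isclose(fin[0], yf), np.isclose(fin[3], S * fin[1]))
```

Both the prescribed final component and the transversality condition on the free one are recovered, which is the whole point of the partitioned construction.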

5. Numerical Examples

Two simple problems with nonlinear dynamics are considered to apply the developed algorithm. These correspond to the controlled relative spacecraft motion and to the controlled two-body dynamics for low-thrust transfers.

5.1. Low-Thrust Rendezvous

This problem is taken from the literature, where a solution is available for comparison [10, 11]. Consider the planar relative motion of two particles in a central gravity field, expressed in a rotating frame with normalized units: the length unit equals the orbital radius, the time unit is such that the orbital period is $2\pi$, and the gravitational parameter equals 1. In these dynamics, the state, $x = (x_1, x_2, x_3, x_4)$, collects the radial and tangential displacements $(x_1, x_2)$ and the radial and tangential velocity deviations $(x_3, x_4)$, respectively. The control, $u = (u_1, u_2)$, consists of the radial and tangential accelerations, respectively.

The equations of motion are
\[ \dot{x}_1 = x_3, \qquad \dot{x}_2 = x_4, \qquad \dot{x}_3 = 2 x_4 - (1 + x_1) \left( \frac{1}{r^3} - 1 \right) + u_1, \qquad \dot{x}_4 = -2 x_3 - x_2 \left( \frac{1}{r^3} - 1 \right) + u_2, \tag{37} \]
with $r = \sqrt{(x_1 + 1)^2 + x_2^2}$. The initial condition is $x_i = (0.2, 0.2, 0.1, 0.1)$. Two different problems are solved to test the algorithm in both hard and soft constrained conditions.

Hard Constrained Rendezvous. The HCP consists in minimizing
\[ J = \frac{1}{2} \int_{t_i}^{t_f} u^T u\,\mathrm{d}t \tag{38} \]
with the given final condition $x_f = (0, 0, 0, 0)$ and $t_i = 0$, $t_f = 1$.

Soft Constrained Rendezvous. The SCP considers the following objective function:
\[ J = \frac{1}{2} x^T(t_f)\, S\, x(t_f) + \frac{1}{2} \int_{t_i}^{t_f} u^T u\,\mathrm{d}t, \tag{39} \]
with $S = \mathrm{diag}(25, 15, 10, 10)$, $t_i = 0$, and $t_f = 1$ ($x_f$ is free).

The differential equations (37) are factorized into the form of (8) as
\[ \begin{bmatrix} \dot{x}_1 \\ \dot{x}_2 \\ \dot{x}_3 \\ \dot{x}_4 \end{bmatrix} = \underbrace{\begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ f(x_1, x_2) \left( 1 + \dfrac{1}{x_1} \right) & 0 & 0 & 2 \\ 0 & f(x_1, x_2) & -2 & 0 \end{bmatrix}}_{A(x)} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} + B \begin{bmatrix} u_1 \\ u_2 \end{bmatrix}, \tag{40} \]
with $f(x_1, x_2) = -1/[(x_1 + 1)^2 + x_2^2]^{3/2} + 1$. Thus, the problem is put into the pseudo-LQR form (8)-(9) by defining $A(x)$ and $B$ as in (40) and by setting $Q = 0_{4 \times 4}$ and $R = I_{2 \times 2}$.
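A quick numerical consistency check of the factorization (40) against the nonlinear dynamics (37), assuming NumPy and taking B as the constant input map on the velocity components implied by (37) (the helper names below are illustrative, not from the paper):

```python
import numpy as np

def f_nonlinear(x, u):
    """Nonlinear rendezvous dynamics, Eq. (37)."""
    x1, x2, x3, x4 = x
    r = np.sqrt((x1 + 1.0) ** 2 + x2 ** 2)
    g = 1.0 / r ** 3 - 1.0
    return np.array([x3, x4,
                     2.0 * x4 - (1.0 + x1) * g + u[0],
                     -2.0 * x3 - x2 * g + u[1]])

def A_of_x(x):
    """State-dependent matrix of the factorization, Eq. (40)."""
    x1, x2 = x[0], x[1]
    f = -1.0 / ((x1 + 1.0) ** 2 + x2 ** 2) ** 1.5 + 1.0
    return np.array([[0.0, 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0],
                     [f * (1.0 + 1.0 / x1), 0.0, 0.0, 2.0],
                     [0.0, f, -2.0, 0.0]])

# Constant input map acting on the velocity components
B = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

x = np.array([0.2, 0.2, 0.1, 0.1])   # the paper's initial condition
u = np.array([0.05, -0.03])          # arbitrary control sample
print(np.allclose(A_of_x(x) @ x + B @ u, f_nonlinear(x, u)))  # True
```

Note that this factorization requires $x_1 \neq 0$ because of the $1/x_1$ term; the paper's initial condition satisfies this.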

The two problems have been solved with the developed method. Table 1 reports the details of the HCP and SCP, whose solutions are shown in Figures 1 and 2, respectively. In Table 1, J is the objective function at the final iteration, "Iter" is the number of iterations, and the "CPU time" is the computational time (this refers to an Intel Core 2 Duo 2 GHz with 4 GB RAM running Mac OS X 10.6). The termination tolerance $\varepsilon$ in (12) is $10^{-9}$. The optimal solutions found replicate those already known in the literature [10, 11], indicating the effectiveness of the developed method.

Table 1: Rendezvous solutions details.

Problem    J        Iter    CPU time (s)
HCP        0.9586   5       0.375
SCP        0.5660   6       0.426

Figure 1: Hard constrained rendezvous (panels: x1 versus x2; x3 versus x4; u1 versus u2).

Figure 2: Soft constrained rendezvous (panels: x1 versus x2; x3 versus x4; u1 versus u2).

5.2. Low-Thrust Orbital Transfer

In this problem, the controlled, planar Keplerian motion of a spacecraft in polar coordinates is studied. The dynamics are written in scaled coordinates, where the length unit corresponds to the radius of the initial orbit, the time unit is such that its period is $2\pi$, and the gravitational parameter is 1. The state, $x = (x_1, x_2, x_3, x_4)$, is made up of the radial distance from the attractor ($x_1$), the phase angle ($x_2$), the radial velocity ($x_3$), and the transversal velocity ($x_4$), whereas the control, $u = (u_1, u_2)$, corresponds to the radial and transversal accelerations, respectively [12, 13]. The equations of motion are
\[ \dot{x}_1 = x_3, \qquad \dot{x}_2 = x_4, \qquad \dot{x}_3 = x_1 x_4^2 - \frac{1}{x_1^2} + u_1, \qquad \dot{x}_4 = -\frac{2 x_3 x_4}{x_1} + \frac{u_2}{x_1}, \tag{41} \]
and the objective function is
\[ J = \frac{1}{2} \int_{t_i}^{t_f} u^T u\,\mathrm{d}t, \tag{42} \]
with $t_i = 0$ and $t_f = \pi$. The initial state corresponds to the conditions on the initial orbit; that is, $x_i = (1, 0, 0, 1)$. Two different HCPs are solved, which correspond to the final states $x_f = (1.52, \pi, 0, 1.52^{-3/2})$ and $x_f = (1.52, 1.5\pi, 0, 1.52^{-3/2})$, respectively. This setup mimics an Earth-Mars low-thrust transfer. The dynamics (41) and the objective function (42) are put in the form (8)-(9) by defining $Q = 0_{4 \times 4}$, $R = I_{2 \times 2}$, and
\[ A(x) = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ -\dfrac{1}{x_1^3} & 0 & 0 & x_1 x_4 \\ 0 & 0 & -\dfrac{2 x_4}{x_1} & 0 \end{bmatrix}, \qquad B(x) = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & \dfrac{1}{x_1} \end{bmatrix}. \tag{43} \]
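As for the rendezvous case, the factorization (43) can be spot-checked numerically against the nonlinear dynamics (41); the helper code and the sample state below are illustrative, not from the paper:

```python
import numpy as np

def f_nonlinear(x, u):
    """Polar two-body dynamics with control, Eq. (41)."""
    x1, x2, x3, x4 = x
    return np.array([x3, x4,
                     x1 * x4 ** 2 - 1.0 / x1 ** 2 + u[0],
                     -2.0 * x3 * x4 / x1 + u[1] / x1])

def A_of_x(x):
    """State-dependent matrix of the factorization, Eq. (43)."""
    x1, x4 = x[0], x[3]
    return np.array([[0.0, 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0],
                     [-1.0 / x1 ** 3, 0.0, 0.0, x1 * x4],
                     [0.0, 0.0, -2.0 * x4 / x1, 0.0]])

def B_of_x(x):
    """State-dependent input map of the factorization, Eq. (43)."""
    x1 = x[0]
    return np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0 / x1]])

x = np.array([1.2, 0.7, 0.05, 0.9])   # arbitrary off-circular sample state
u = np.array([0.01, -0.02])
print(np.allclose(A_of_x(x) @ x + B_of_x(x) @ u, f_nonlinear(x, u)))  # True
```

Here both A and B depend on the state, so in the ASRE iteration B is also re-evaluated on the previous trajectory, exactly as prescribed in (11).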

The two HCPs have been solved with the developed method. The solutions’ details are reported in Table 2, whose columns have the same meaning as in Table 1. It can be seen that more iterations and an increased computational burden are required to solve this problem. The solution with x2,f=1.5π is reported in Figure 3.

Table 2: Earth-Mars transfer details.

Problem         J        Iter    CPU time (s)
x2,f = π        0.5298   22      5.425
x2,f = 1.5π     4.8665   123     41.831

Figure 3: Orbital transfer with x2,f = 1.5π (panels: transfer trajectory; control profile).

6. Conclusion

In this paper, an approximate method to solve nonlinear optimal control problems has been presented, with applications to sample cases in astrodynamics. With this method, the nonlinear dynamics and the objective function are factorized into pseudolinear and quadratic-like forms, similar to those used in the state-dependent Riccati equation approach. Once in this form, a sequence of time-varying linear quadratic regulator problems is solved, each by means of a state transition matrix approach. The results show the effectiveness of the method, which can be used either to obtain suboptimal solutions directly or to provide initial guesses for more accurate optimizers.

References

[1] A. E. Bryson and Y. C. Ho, Applied Optimal Control, John Wiley & Sons, New York, NY, USA, 1975.
[2] L. S. Pontryagin, V. G. Boltyanskii, R. V. Gamkrelidze, and E. F. Mishchenko, The Mathematical Theory of Optimal Processes, John Wiley & Sons, New York, NY, USA, 1962.
[3] J. T. Betts, Practical Methods for Optimal Control and Estimation Using Nonlinear Programming, SIAM, Philadelphia, Pa, USA, 2010.
[4] B. Conway, "Spacecraft trajectory optimization using direct transcription and nonlinear programming," in Spacecraft Trajectory Optimization, pp. 37-78, Cambridge University Press, Cambridge, UK, 2010.
[5] T. Çimen and S. P. Banks, "Global optimal feedback control for general nonlinear systems with nonquadratic performance criteria," Systems and Control Letters, vol. 53, no. 5, pp. 327-346, 2004.
[6] T. Çimen and S. P. Banks, "Nonlinear optimal tracking control with application to super-tankers for autopilot design," Automatica, vol. 40, no. 11, pp. 1845-1863, 2004.
[7] C. P. Mracek and J. R. Cloutier, "Control designs for the nonlinear benchmark problem via the state-dependent Riccati equation method," International Journal of Robust and Nonlinear Control, vol. 8, no. 4-5, pp. 401-433, 1998.
[8] J. D. Pearson, "Approximation methods in optimal control," Journal of Electronics and Control, vol. 13, pp. 453-469, 1962.
[9] A. Wernli and G. Cook, "Suboptimal control for the nonlinear quadratic regulator problem," Automatica, vol. 11, no. 1, pp. 75-84, 1975.
[10] C. Park, V. Guibout, and D. J. Scheeres, "Solving optimal continuous thrust rendezvous problems with generating functions," Journal of Guidance, Control, and Dynamics, vol. 29, no. 2, pp. 321-331, 2006.
[11] C. Park and D. J. Scheeres, "Determination of optimal feedback terminal controllers for general boundary conditions using generating functions," Automatica, vol. 42, no. 5, pp. 869-875, 2006.
[12] A. Owis, F. Topputo, and F. Bernelli-Zazzera, "Radially accelerated optimal feedback orbits in central gravity field with linear drag," Celestial Mechanics and Dynamical Astronomy, vol. 103, no. 1, pp. 1-16, 2009.
[13] F. Topputo, A. H. Owis, and F. Bernelli-Zazzera, "Analytical solution of optimal feedback control for radially accelerated orbits," Journal of Guidance, Control, and Dynamics, vol. 31, no. 5, pp. 1352-1359, 2008.