Most direct methods solve optimal control problems with a nonlinear programming solver. In this paper we propose a novel feedback control method for affine control systems with a quadratic cost functional that requires only the solution of linear systems. The method is a numerical technique based on the combination of the Haar wavelet collocation method and the successive Generalized Hamilton-Jacobi-Bellman equation. We formulate some new Haar wavelet operational matrices in order to manipulate Haar wavelet series. The proposed method has been applied to linear and nonlinear optimal control problems with infinite time horizon. The simulation results indicate that the accuracy of the control and cost can be improved by increasing the wavelet resolution.
1. Introduction
Optimal control is an important branch of mathematics and has been widely applied in a number of fields, including engineering, science, and economics. Although the necessary and sufficient conditions for optimality have already been derived for $H_2$ and $H_\infty$ optimal controls, they are useful for finding analytical solutions only in quite restricted cases. If we assume full-state knowledge, and if the optimal control problem is linear, then the optimal control is a linear feedback of the state, which is obtained by solving a matrix Riccati equation. However, if the system is nonlinear, then the optimal control is a state feedback function that depends on the solution of a Hamilton-Jacobi-Bellman (HJB) equation or a Hamilton-Jacobi-Isaacs (HJI) equation for the $H_2$ or $H_\infty$ optimal control problem, respectively [1], and these equations are usually difficult to solve analytically. Feng et al. [2] have solved an HJI equation iteratively by solving a sequence of HJB equations. In this paper, we are concerned with approximate solutions of the HJB equation. Among the numerous computational approaches for solving the HJI equation, we refer in particular to [3–5]. Robustness of nonlinear state feedback is discussed in [6].
Broadly speaking, numerical methods for solving optimal control problems are divided into two categories: direct and indirect methods. Direct methods reduce the optimal control problem to a nonlinear programming problem by parameterizing or discretizing the infinite-dimensional problem into a finite-dimensional optimization problem. Indirect methods, on the other hand, solve the HJB equation or the first-order necessary conditions for optimality obtained from Pontryagin's minimum principle. Both classes of methods are important for solving optimal control problems; the difference between them is that indirect methods are believed to yield more accurate results, whereas direct methods tend to have better convergence properties. von Stryk and Bulirsch [7] have used both direct and indirect methods to solve a trajectory optimization problem for the Apollo capsule. Beard et al. [8] have introduced the Generalized Hamilton-Jacobi-Bellman equation to successively approximate the solution of the HJB equation. Given an arbitrary stabilizing control law, their method can be used to improve the performance of the control. Moreover, Jaddu [9] has reported some numerical methods for solving unconstrained and constrained optimal control problems by converting them into quadratic programming problems, using a parameterization technique based on Chebyshev polynomials. Meanwhile, Beeler et al. [10] have performed a comparison study of five different methods for solving nonlinear control systems and studied the performance of the methods on several test problems. Park and Tsiotras [11] have proposed a successive wavelet collocation algorithm that uses interpolating wavelets to iteratively solve the Generalized Hamilton-Jacobi-Bellman equation and obtain the corresponding optimal control law.
A wavelet basis with compact support represents functions with sharp spikes or edges better than other bases. This property is advantageous in many applications in signal or image processing. In addition, the availability of a fast transform makes wavelets attractive as a computational tool. Numerical solutions of integral and differential equations have been discussed in many papers, which basically fall into the class of either spectral Galerkin and collocation methods or finite element and finite difference methods.
The Haar wavelet is the simplest orthogonal wavelet with compact support. Chen and Hsiao [12] have used the Haar operational matrix method to solve lumped and distributed parameter systems. Hsiao and Wang [13] have solved optimal control of linear time-varying systems via Haar wavelets. Dai and Cochran Jr. [14] have considered a Haar wavelet technique that transforms optimal control problems into nonlinear programming (NLP) problems whose parameters are defined at collocation points. The NLP can then be solved using a nonlinear programming solver such as SNOPT.
In the present paper we consider the method of Beard et al. [8] to successively approximate the solution of the HJB equation. Instead of using the Galerkin method with a polynomial basis, we use a collocation method with a Haar wavelet basis to solve the Generalized Hamilton-Jacobi-Bellman equation. The Galerkin method requires the computation of multidimensional integrals, which makes it impractical for higher order systems [15]. The main advantage of the collocation method is that the computational burden of solving the Generalized Hamilton-Jacobi-Bellman equation is reduced to matrix computations only. Our new successive Haar wavelet collocation method is used to solve linear and nonlinear optimal control problems. In the process of establishing the method we define new operational matrices of integration for a chosen stabilizing domain and a new operational matrix for the product of two-dimensional Haar wavelet functions.
2. Haar Wavelets
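The orthogonal set of Haar wavelets $h_i(x)$ is a group of square waves over the interval $x \in [\tau_1, \tau_2)$ defined as follows:
$$h_0(x) = \begin{cases} 1, & \tau_1 \le x < \tau_2, \\ 0, & \text{elsewhere}, \end{cases} \qquad h_1(x) = \begin{cases} 1, & \tau_1 \le x < \tfrac{1}{2}(\tau_1 + \tau_2), \\ -1, & \tfrac{1}{2}(\tau_1 + \tau_2) \le x < \tau_2, \\ 0, & \text{elsewhere}. \end{cases} \tag{1}$$
Other wavelets can be obtained by dilation and translation of the mother wavelet $h_1(x)$. In general, $h_i(x) = h_1(2^j x - k)$, where $i = 2^j + k$, $j, k \in \mathbb{N} \cup \{0\}$, and $0 \le k < 2^j$.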
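Each $f(x) \in L^2([\tau_1, \tau_2))$ can be expanded into a Haar series with infinitely many terms:
$$f(x) = c_0 h_0(x) + c_1 h_1(x) + c_2 h_2(x) + \cdots. \tag{2}$$
If $f(x)$ is approximated as piecewise constant, then it can be decomposed as
$$f(x) = \sum_{i=0}^{m-1} c_i h_i(x), \tag{3}$$
where $i = 2^j + k$, $j = 0, 1, 2, \ldots, \log_2 m - 1$, and $k = 0, 1, 2, \ldots, 2^j - 1$, so that the largest index is $i = m - 1$.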
The Haar coefficients
$$c_i = \frac{2^j}{\tau_2 - \tau_1} \int_{\tau_1}^{\tau_2} f(x) h_i(x)\, dx \tag{4}$$
can be obtained by minimizing the integral square error $\int_{\tau_1}^{\tau_2} \bigl( f(x) - \sum_{i=0}^{m-1} c_i h_i(x) \bigr)^2 dx$.
The sum in (3) can be compactly written in the form
$$f(x) = c_m^T h_m(x), \tag{5}$$
where $c_m^T = [c_0 \;\; c_1 \;\; \cdots \;\; c_{m-1}]$ is called the coefficient vector and $h_m(x) = [h_0(x) \;\; h_1(x) \;\; \cdots \;\; h_{m-1}(x)]^T$ is the Haar function vector.
At the collocation points $x_j = \tau_1 + \frac{\tau_2 - \tau_1}{2m}(2j - 1)$, $j = 1, 2, 3, \ldots, m$, the Haar function vector can be expressed in matrix form as
$$(H_m)_{i,j} = h_i(x_j). \tag{6}$$
For instance, the Haar wavelet matrix of order four, $H_4$, is
$$H_4 = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 1 & -1 & -1 \\ 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -1 \end{bmatrix}. \tag{7}$$
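As an illustration of (6) and (7), the following minimal Python sketch evaluates the Haar functions at the collocation points. It assumes that the interval has been normalized to $[0, 1)$ through $t = (x - \tau_1)/(\tau_2 - \tau_1)$, so that the collocation points become $(2j - 1)/(2m)$. The names `haar_value` and `haar_matrix` are hypothetical and chosen only for this sketch.

```python
def haar_value(i, t):
    """Value of h_i at t in [0, 1), with i = 2**j + k for i >= 1."""
    if i == 0:
        return 1
    j = i.bit_length() - 1      # largest j such that 2**j <= i
    k = i - 2 ** j
    s = 2 ** j * t - k          # map the support of h_i onto [0, 1)
    if 0 <= s < 0.5:
        return 1
    if 0.5 <= s < 1:
        return -1
    return 0


def haar_matrix(m):
    """Matrix (H_m)[i][j] = h_i(x_j) at the m collocation points of (6)."""
    points = [(2 * j - 1) / (2 * m) for j in range(1, m + 1)]
    return [[haar_value(i, t) for t in points] for i in range(m)]


if __name__ == "__main__":
    for row in haar_matrix(4):
        print(row)
```

Calling `haar_matrix(4)` returns the rows of $H_4$ in (7). The sketch assumes that $m$ is a power of two.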
3. Haar Wavelet Operational Matrices
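The integral of $h_m(x)$ over the interval $[0, \tau)$ can also be expanded into a Haar series, that is,
$$\int_0^x h_m(t)\, dt \cong P_m h_m(x), \tag{8}$$
where the $m \times m$ matrix $P_m$ is called the operational matrix of integration and is obtained recursively as
$$P_m = \frac{1}{2m} \begin{bmatrix} 2m P_{m/2} & -\tau H_{m/2} \\ \tau H_{m/2}^{-1} & 0_{m/2} \end{bmatrix}, \qquad P_1 = \left[ \frac{\tau}{2} \right]. \tag{9}$$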
The formula in the interval of [0,1) was first given by Chen and Hsiao [12].
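The recursion (9) can be sketched in a few lines of Python. The sketch below is only an illustration: it rebuilds $H_{m/2}$ on the normalized interval $[0, 1)$, and it inverts $H_{m/2}$ using the fact that the rows of a Haar matrix are mutually orthogonal, so that $H^{-1} = H^T D^{-1}$ with $D$ the diagonal matrix of squared row norms. All function names are hypothetical.

```python
def haar_matrix(m):
    """Haar matrix at the collocation points (2j - 1) / (2m) of [0, 1)."""
    def h(i, t):
        if i == 0:
            return 1
        j = i.bit_length() - 1
        s = 2 ** j * t - (i - 2 ** j)
        return 1 if 0 <= s < 0.5 else (-1 if 0.5 <= s < 1 else 0)
    points = [(2 * j - 1) / (2 * m) for j in range(1, m + 1)]
    return [[h(i, t) for t in points] for i in range(m)]


def haar_inverse(H):
    """Inverse of a Haar matrix via row orthogonality: H^{-1} = H^T D^{-1}."""
    m = len(H)
    norms = [sum(v * v for v in row) for row in H]
    return [[H[i][j] / norms[i] for i in range(m)] for j in range(m)]


def integration_matrix(m, tau=1.0):
    """Operational matrix of integration P_m from the recursion (9)."""
    if m == 1:
        return [[tau / 2]]
    half = m // 2
    P = integration_matrix(half, tau)
    H = haar_matrix(half)
    Hinv = haar_inverse(H)
    # Upper blocks: P_{m/2} and -tau * H_{m/2} / (2m).
    top = [P[r] + [-tau * H[r][c] / (2 * m) for c in range(half)]
           for r in range(half)]
    # Lower blocks: tau * H_{m/2}^{-1} / (2m) and a zero block.
    bottom = [[tau * Hinv[r][c] / (2 * m) for c in range(half)] + [0.0] * half
              for r in range(half)]
    return top + bottom
```

For $\tau = 1$, `integration_matrix(2)` gives $\begin{bmatrix} 1/2 & -1/4 \\ 1/4 & 0 \end{bmatrix}$, which agrees with the $[0, 1)$ formula attributed to [12].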
In order to solve nonlinear optimal control problems, it is essential to have the product of $h(x)$ and $h^T(x)$. The product of two functions $f(x) = c^T h(x)$ and $g(x) = d^T h(x)$ can be expanded into a Haar series with a Haar coefficient matrix $M_m$ as
$$d^T h(x) h^T(x) c = d^T M_m h(x), \tag{10}$$
where $M_m$ is an $m \times m$ matrix referred to as the product operational matrix. It was first given by Hsiao and Wu [16] as
$$M_m = \begin{bmatrix} M_{m/2} & H_{m/2}\, \mathrm{diag}(c_b) \\ \mathrm{diag}(c_b)\, H_{m/2}^{-1} & \mathrm{diag}(c_a^T H_{m/2}) \end{bmatrix}, \tag{11}$$
where $M_1 = c_0$, $c_a = [c_0, \ldots, c_{m/2-1}]^T$, and $c_b = [c_{m/2}, \ldots, c_{m-1}]^T$.
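The recursion (11) can be checked numerically: at every collocation point the identity $h(x) h^T(x) c = M_m h(x)$ should hold. The following Python sketch, written under the assumption of the normalized interval $[0, 1)$ and with hypothetical function names, builds $M_m$ from a coefficient list `c`.

```python
def haar_matrix(m):
    """Haar matrix at the collocation points (2j - 1) / (2m) of [0, 1)."""
    def h(i, t):
        if i == 0:
            return 1
        j = i.bit_length() - 1
        s = 2 ** j * t - (i - 2 ** j)
        return 1 if 0 <= s < 0.5 else (-1 if 0.5 <= s < 1 else 0)
    points = [(2 * j - 1) / (2 * m) for j in range(1, m + 1)]
    return [[h(i, t) for t in points] for i in range(m)]


def haar_inverse(H):
    """Inverse of a Haar matrix via row orthogonality: H^{-1} = H^T D^{-1}."""
    m = len(H)
    norms = [sum(v * v for v in row) for row in H]
    return [[H[i][j] / norms[i] for i in range(m)] for j in range(m)]


def product_matrix(c):
    """Product operational matrix M_m of (11) for the coefficient list c."""
    m = len(c)
    if m == 1:
        return [[c[0]]]
    half = m // 2
    ca, cb = c[:half], c[half:]
    H = haar_matrix(half)
    Hinv = haar_inverse(H)
    M_half = product_matrix(ca)
    # Row vector c_a^T H_{m/2}, placed on the diagonal of the lower-right block.
    d = [sum(ca[i] * H[i][j] for i in range(half)) for j in range(half)]
    top = [M_half[r] + [H[r][k] * cb[k] for k in range(half)]
           for r in range(half)]
    bottom = [[cb[r] * Hinv[r][k] for k in range(half)]
              + [d[r] if k == r else 0.0 for k in range(half)]
              for r in range(half)]
    return top + bottom
```

For `c = [1, 2, 3, 4]` the sketch returns the rows `[1, 2, 3, 4]`, `[2, 1, 3, -4]`, `[1.5, 1.5, 3, 0]`, and `[2, -2, 0, -1]`, which can be confirmed by multiplying the four Haar functions pairwise by hand.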
A two-dimensional Haar wavelet basis can be formed by taking the tensor product of $h_n(x)$ and $h_m(x)$. Let the basis be $\{h_i(x_1) h_j(x_2)\}$, $i = 1, 2, \ldots, n$, $j = 1, 2, \ldots, m$. Then the two-dimensional Haar function vector can be expressed as
$$H(x_1, x_2) = \begin{bmatrix} h_1(x_1) h_1(x_2) \\ \vdots \\ h_1(x_1) h_m(x_2) \\ h_2(x_1) h_1(x_2) \\ \vdots \\ h_n(x_1) h_m(x_2) \end{bmatrix}. \tag{12}$$
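Any function $f \in L^2([-\tau_1, \tau_1) \times [-\tau_2, \tau_2))$ can be written as
$$f(x_1, x_2) = C^T H(x_1, x_2), \tag{13}$$
where $C^T = [c_{11} \cdots c_{1n} \;\; c_{21} \cdots c_{2n} \;\; \cdots \;\; c_{m1} \cdots c_{mn}]$. Subsequently, we assume that $n = m$ and $\tau_1 = \tau_2 = \tau$, so that the operational matrices are square. Let $C = \mathrm{vec}(\check{C})$, where $\check{C}$ is an $m \times m$ matrix. By using the Haar wavelet matrix in (6), the coefficient vector $C^T$ in (13) can be obtained from $\check{C}$, which is computed as
$$\check{C} = (H_m^{-1})^T \cdot f \cdot H_m^{-1}, \tag{14}$$
where $f$ here denotes the $m \times m$ matrix of samples with entries $f_{i,j} = f(x_i, x_j)$, $i, j = 1, 2, \ldots, m$.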
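To make (14) concrete, the sketch below samples a function on the collocation grid, computes $\check{C}$, and reconstructs the samples through $H_m^T \check{C} H_m$. It is a minimal illustration only: the unit square $[0, 1)^2$ is used instead of $[-\tau, \tau)^2$, and the sample function and all helper names are hypothetical.

```python
def haar_matrix(m):
    """Haar matrix at the collocation points (2j - 1) / (2m) of [0, 1)."""
    def h(i, t):
        if i == 0:
            return 1
        j = i.bit_length() - 1
        s = 2 ** j * t - (i - 2 ** j)
        return 1 if 0 <= s < 0.5 else (-1 if 0.5 <= s < 1 else 0)
    points = [(2 * j - 1) / (2 * m) for j in range(1, m + 1)]
    return [[h(i, t) for t in points] for i in range(m)]


def haar_inverse(H):
    """Inverse of a Haar matrix via row orthogonality: H^{-1} = H^T D^{-1}."""
    m = len(H)
    norms = [sum(v * v for v in row) for row in H]
    return [[H[i][j] / norms[i] for i in range(m)] for j in range(m)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def coefficients_2d(f, m):
    """Return (C_check, F), where C_check = (H^{-1})^T F H^{-1} as in (14)."""
    Hinv = haar_inverse(haar_matrix(m))
    points = [(2 * j - 1) / (2 * m) for j in range(1, m + 1)]
    F = [[f(a, b) for b in points] for a in points]
    return matmul(matmul(transpose(Hinv), F), Hinv), F


def reconstruct(C_check, m):
    """Samples recovered from the coefficients: H^T C_check H."""
    H = haar_matrix(m)
    return matmul(matmul(transpose(H), C_check), H)
```

The reconstruction reproduces the sampled values exactly (up to rounding), and the entry $\check{C}_{1,1}$ equals the mean of the samples, since the first Haar function is constant.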
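The integration of the two-dimensional Haar function vector in $[-\tau, \tau) \times [-\tau, \tau)$ is
$$\int_0^{x_i} H(x_1, x_2)\, dx_i = (Q_i - \tau E_i) H(x_1, x_2), \tag{15}$$
where $Q_i$ and $E_i$, $i = 1, 2$, are the $m^2 \times m^2$ operational matrices given by
$$Q_1 = P_m \otimes I_m, \quad Q_2 = I_m \otimes P_m, \quad E_1 = A_m \otimes I_m, \quad E_2 = I_m \otimes A_m, \tag{16}$$
where $\otimes$ denotes the Kronecker product [17], $I_m$ denotes the $m \times m$ identity matrix, and
$$(A_m)_{i,j} = \begin{cases} 1, & i = 1, 2, \; j = 1, \\ 0, & \text{otherwise}. \end{cases} \tag{17}$$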
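The Kronecker structure in (16) follows from the ordering in (12), under which $H(x_1, x_2) = h(x_1) \otimes h(x_2)$: a matrix of the form $P \otimes I$ acts on the $x_1$ factor only, and $I \otimes P$ on the $x_2$ factor only. The short Python sketch below assembles the four matrices from a given $P_m$. It is an illustration with hypothetical helper names, and the matrix $P$ in the usage note is simply the $m = 2$, $\tau = 1$ instance of (9).

```python
def kron(A, B):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def identity(m):
    return [[1.0 if i == j else 0.0 for j in range(m)] for i in range(m)]


def a_matrix(m):
    """A_m of (17): ones in the first column of the first two rows."""
    return [[1.0 if (j == 0 and i < 2) else 0.0 for j in range(m)]
            for i in range(m)]


def operational_matrices(P):
    """Return (Q1, Q2, E1, E2) of (16) for a given m x m matrix P."""
    m = len(P)
    I, A = identity(m), a_matrix(m)
    return kron(P, I), kron(I, P), kron(A, I), kron(I, A)


def apply(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]
```

For example, with `P = [[0.5, -0.25], [0.25, 0.0]]`, `u = [1.0, 1.0]`, and `v = [1.0, -1.0]`, applying `Q1` to the vector `[a * b for a in u for b in v]` gives the same result as forming the Kronecker product of `apply(P, u)` with `v`.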
As in (10), we also require the product of $H(x_1, x_2)$ and $H^T(x_1, x_2)$. Let
$$H(x_1, x_2) H^T(x_1, x_2) C = N_C H(x_1, x_2). \tag{18}$$
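The algorithm to obtain $N_C$ is as follows.
Step 1. Let $\check{C}$ be the matrix form of $C$, or equivalently $C = \mathrm{vec}(\check{C})$.
Step 2. Compute $M_{\check{C}_i}$, $i = 1, 2, \ldots, m$, according to (11), using the column $\check{C}_i$ as the coefficient vector.
Step 3. For $i = 1, 2, \ldots, m$, compute $\mathrm{vec}(M_{\check{C}_i})$.
Step 4. Form a big matrix by concatenating all vectors from Step 3; that is, $S = [\mathrm{vec}(M_{\check{C}_1}) \;\; \mathrm{vec}(M_{\check{C}_2}) \;\; \cdots \;\; \mathrm{vec}(M_{\check{C}_m})]$.
Step 5. For each row $k$ of the matrix $S$, compute $N_{i,j}$ according to (11), using the row $S_k$ as the coefficient vector.
Step 6. Form the matrix $N_{\check{C}}$ as follows:
$$N_{\check{C}} = \begin{bmatrix} N_{11} & N_{12} & \cdots & N_{1m} \\ N_{21} & N_{22} & \cdots & N_{2m} \\ \vdots & \vdots & \ddots & \vdots \\ N_{m1} & N_{m2} & \cdots & N_{mm} \end{bmatrix}. \tag{19}$$
Step 7. End.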
4. Problem Statement
The system to be controlled is given by a nonlinear differential equation of the form
$$\dot{x}(t) = f(x) + g(x) u(x), \qquad x(0) = x_0, \tag{20}$$
where $x(t) \in \Omega \subset \mathbb{R}^n$ is the state vector, $u : \Omega \to \mathbb{R}^m$ is the control, $f : \Omega \to \mathbb{R}^n$ and $g : \Omega \to \mathbb{R}^{n \times m}$ are continuously differentiable with respect to all their arguments, $x_0$ is the initial condition vector, and $\Omega$ is the domain of attraction.
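The problem is to find the optimal control $u^*(x)$ that minimizes the following performance index:
$$J(x_0, u) = \int_0^\infty \left( x^T Q x + u^T R u \right) dt, \tag{21}$$
where $Q \in \mathbb{R}^{n \times n}$ is a positive semidefinite matrix and $R \in \mathbb{R}^{m \times m}$ is a positive definite matrix. Given an arbitrary control $u$, the performance of the control at $x \in \Omega \subset \mathbb{R}^n$ is given by a Lyapunov function for the system [8]
$$V(x, u) = \int_0^\infty \left( l(x(t)) + \| u(x(t)) \|_R^2 \right) dt, \tag{22}$$
where $\| u \|_R^2 = u^T R u$ and $l(x) = x^T Q x$. The optimal controller in feedback form is given by [8]
$$u^*(x) = -\frac{1}{2} R^{-1} g^T(x) \frac{\partial V^*(x)}{\partial x}, \tag{23}$$
where $V^*(x)$ is the solution of the following Hamilton-Jacobi-Bellman (HJB) equation:
$$\frac{\partial V^{*T}(x)}{\partial x} f(x) + l(x) - \frac{1}{4} \frac{\partial V^{*T}(x)}{\partial x} g(x) R^{-1} g^T(x) \frac{\partial V^*(x)}{\partial x} = 0 \tag{24}$$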
with boundary condition $V^*(0) = 0$; that is, $V(x^*, u^*) \le V(x, u)$ for all $u$, where $x^*(t)$ is the solution of $\dot{x} = f(x) + g(x) u^*$. The nonlinear partial differential equation (24) is difficult to solve directly for $V^*(x)$ and hence for $u^*(x)$ from (23); instead, the algorithm proposed in [8] iterates the following two linear equations:
$$\frac{\partial V^{(i)T}(x)}{\partial x} \left( f(x) + g(x) u^{(i)}(x) \right) + l(x) + \| u^{(i)}(x) \|_R^2 = 0 \tag{25}$$
with initial condition $V^{(i)}(0) = 0$ and
$$u^{(i+1)}(x) = -\frac{1}{2} R^{-1} g^T(x) \frac{\partial V^{(i)}(x)}{\partial x}. \tag{26}$$
Equation (25) is called the Generalized Hamilton-Jacobi-Bellman (GHJB) equation in [8]. Under mild assumptions, it has been established in [8] that iterating between the GHJB equation (25) and the control update (26) converges to the solution of the original HJB equation (24). If a stabilizing control $u^{(0)}(x)$ is available to start the iteration, the performance of this controller can be improved iteratively using (25) and (26), so that the optimal controller is approximated in the limit. Moreover, at each iteration step the controller $u^{(i)}$ is a stabilizing control.
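The structure of the iteration (25)-(26) is easiest to see on a scalar linear problem, which is not one of the examples of this paper and is used here only as an illustrative assumption. For $\dot{x} = a x + b u$ with cost $\int_0^\infty (q x^2 + r u^2)\, dt$, the ansatz $V^{(i)}(x) = p_i x^2$ and $u^{(i)}(x) = -k_i x$ reduces (25) to the linear equation $2 p_i (a - b k_i) + q + r k_i^2 = 0$ and (26) to $k_{i+1} = b p_i / r$. In this linear-quadratic setting the scheme coincides with the classical Newton-type (Kleinman) iteration for the Riccati equation. The names and parameter values in the sketch are hypothetical.

```python
import math


def successive_approximation(a, b, q, r, k0, tol=1e-12, max_iter=50):
    """Iterate the scalar forms of (25) and (26), starting from u = -k0 * x."""
    if a - b * k0 >= 0:
        raise ValueError("the initial gain k0 is not stabilizing")
    k = k0
    history = []
    for _ in range(max_iter):
        # Scalar GHJB (25): 2 p (a - b k) + q + r k^2 = 0, linear in p.
        p = (q + r * k * k) / (2.0 * (b * k - a))
        # Control update (26): u = -(1/2) r^{-1} b (2 p x) = -(b p / r) x.
        k_next = b * p / r
        history.append((k, p))
        if abs(k_next - k) < tol:
            break
        k = k_next
    return k_next, p, history


def riccati_root(a, b, q, r):
    """Positive root of the scalar HJB (Riccati) equation 2 a p + q - b^2 p^2 / r = 0."""
    return r * (a + math.sqrt(a * a + b * b * q / r)) / (b * b)


if __name__ == "__main__":
    gain, cost_coeff, steps = successive_approximation(1.0, 1.0, 1.0, 1.0, k0=3.0)
    print(gain, cost_coeff, riccati_root(1.0, 1.0, 1.0, 1.0))
```

With $a = b = q = r = 1$ and $k_0 = 3$, the cost coefficients $p_i$ decrease monotonically towards the positive root $1 + \sqrt{2}$ of the scalar Riccati equation, which mirrors the statement that each iterate improves on the previous control.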
5. The Successive Haar Wavelet Collocation Method
This section describes the successive Haar wavelet collocation method (SHWCM) used to obtain a numerical solution of the HJB equation in two dimensions. At every step of the algorithm, an approximate solution of the GHJB equation (25) is computed; the quantities $\partial V^{(i)} / \partial x$, $V^{(i)}$, and $u^{(i)}$ are all expressed approximately in terms of Haar wavelets. As $i \to \infty$, $V^{(i)}$ and $u^{(i)}$ approach the optimal solutions $V^*$ and $u^*$, respectively.
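Let us consider the following two-dimensional optimal feedback control problem:
$$\min V(x_0, u) = \int_0^\infty \left( x^T Q x + u^T R u \right) dt \tag{27}$$
subject to the dynamics
$$\dot{x} = f(x) + g(x) u(x), \qquad x(0) = x_0, \tag{28}$$
where $x = [x_1 \;\; x_2]^T$, $f(x) = [f_1(x_1, x_2) \;\; f_2(x_1, x_2)]^T$, $g(x) = [g_1(x_1, x_2) \;\; g_2(x_1, x_2)]^T$, and $u : \Omega \to \mathbb{R}$.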
Without loss of generality, the domain of attraction is taken to be $\Omega = [-\tau, \tau] \times [-\tau, \tau]$ for convenience. The GHJB equation and the control law are given by
$$\frac{\partial V^{(i)T}(x)}{\partial x} \left( f(x) + g(x) u^{(i)}(x) \right) + x^T Q x + u^{(i)T} R u^{(i)} = 0, \tag{29}$$
with initial condition $V^{(i)}(0) = 0$ and
$$u^{(i+1)}(x) = -\frac{1}{2} R^{-1} g^T(x) \frac{\partial V^{(i)}(x)}{\partial x}. \tag{30}$$
For (28), if $u^{(0)}$ is a stabilizing control, then the solution of the GHJB equation (29) associated with $u^{(0)}$ is a Lyapunov function for the system and equals the cost associated with $u^{(0)}$:
$$\frac{\partial V^{(0)T}(x)}{\partial x} \left( f(x) + g(x) u^{(0)}(x) \right) + x^T Q x + u^{(0)T} R u^{(0)} = 0. \tag{31}$$
According to (13), the functions $f_1(x) + g_1(x) u^{(0)}(x)$, $f_2(x) + g_2(x) u^{(0)}(x)$, and $x^T Q x + u^{(0)T}(x) R u^{(0)}(x)$ can be approximated as
$$\begin{aligned} f_1(x) + g_1(x) u^{(0)}(x) &= \theta^T H(x_1, x_2), \\ f_2(x) + g_2(x) u^{(0)}(x) &= \mu^T H(x_1, x_2), \\ x^T Q x + u^{(0)T}(x) R u^{(0)}(x) &= k^T H(x_1, x_2), \end{aligned} \tag{32}$$
where the coefficient vectors $\theta^T$, $\mu^T$, and $k^T$ can be calculated from (14). Since Haar functions cannot be differentiated, and since (29) involves only first-order derivatives of $V$, we assume that the mixed second-order partial derivative of $V$ exists and expand it in the Haar basis; that is,
$$\frac{\partial^2 V}{\partial x_1 \partial x_2} = \omega^T H(x_1, x_2) \tag{33}$$
for some coefficient vector $\omega$.
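With the assumption
$$\frac{\partial^2 V}{\partial x_1 \partial x_2} = \frac{\partial^2 V}{\partial x_2 \partial x_1}, \tag{34}$$
the first-order partial derivatives can be obtained by integrating (33) with respect to $x_1$ and $x_2$, respectively:
$$\begin{aligned} \frac{\partial V}{\partial x_1} &= \omega^T (Q_2 - \tau E_2) H(x_1, x_2) + \alpha_1^T H(x_1, x_2), \\ \frac{\partial V}{\partial x_2} &= \omega^T (Q_1 - \tau E_1) H(x_1, x_2) + \alpha_2^T H(x_1, x_2), \end{aligned} \tag{35}$$
where $\alpha_1^T = [\alpha_{11}, \ldots, \alpha_{1m}, 0, \ldots, 0]$ and $\alpha_2^T = [\alpha_{21}, 0, \ldots, \alpha_{22}, 0, \ldots, 0, \ldots, \alpha_{2m}, 0, \ldots, 0]$.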
Note that $\omega^T$ has $m^2$ unknowns, while $\alpha_1^T$ and $\alpha_2^T$ have only $m$ unknowns each. Substituting (32) and (35) into (29), we have
$$\omega^T \left\{ (Q_2 - \tau E_2) N_\theta + (Q_1 - \tau E_1) N_\mu \right\} + \alpha_1^T N_\theta + \alpha_2^T N_\mu = -k^T. \tag{36}$$
Equation (36) is an underdetermined system of linear equations with $m^2$ equations and $m^2 + 2m$ unknowns, which can be solved for the unknown vectors $\omega^T$, $\alpha_1^T$, and $\alpha_2^T$ using the Moore-Penrose pseudoinverse [18]. The system is expected to be underdetermined because the Lyapunov function is not unique. The Moore-Penrose solution is the particular solution whose vector 2-norm is minimal.
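The minimum-norm property can be illustrated on a small underdetermined system $A x = b$ (the row-vector form of (36) is obtained by transposition). The sketch below assumes that $A$ has full row rank, in which case the pseudoinverse solution is $x = A^T (A A^T)^{-1} b$; it does not implement the fast algorithm of [18], and the helper names and the $2 \times 3$ test system are hypothetical.

```python
def solve(A, b):
    """Solve a square system by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[pivot] = M[pivot], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def min_norm_solution(A, b):
    """Minimum 2-norm solution x = A^T (A A^T)^{-1} b (full row rank assumed)."""
    gram = [[sum(u * v for u, v in zip(ri, rj)) for rj in A] for ri in A]
    y = solve(gram, b)
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


if __name__ == "__main__":
    A = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]
    b = [2.0, 2.0]
    print(min_norm_solution(A, b))
```

For this test system both $[1, 1, 1]$ and $[2, 0, 2]$ satisfy $A x = b$, but the pseudoinverse solution $[2/3, 4/3, 2/3]$ has a smaller 2-norm than either. In practice a library routine such as `numpy.linalg.pinv` or `numpy.linalg.lstsq` would normally be used instead of this hand-written solver.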
Using the solution of the GHJB equation (29), a feedback control law $u^{(1)}$ is constructed from (30), which improves on the performance of $u^{(0)}$. The solution of the Hamilton-Jacobi-Bellman equation is uniformly approximated by repeating the above process.
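Knowing that
$$V(x) = \int_0^x \nabla V^T dx \tag{37}$$
depends only on the initial and final points, not on the path followed, we can calculate the Lyapunov function $V(x)$ by integrating along paths parallel to the axes [19] as follows:
$$V(x_1, x_2) = \int_0^{x_1} \frac{\partial V}{\partial x_1}(x_1, 0)\, dx_1 + \int_0^{x_2} \frac{\partial V}{\partial x_2}(x_1, x_2)\, dx_2. \tag{38}$$
This gives
$$V(x_1, x_2) = \left( \beta^T (Q_1 - \tau E_1) + \omega^T (Q_1 - \tau E_1)(Q_2 - \tau E_2) + \alpha_2^T (Q_2 - \tau E_2) \right) H(x_1, x_2), \tag{39}$$
where $\beta^T = \left( \omega^T (Q_2 - \tau E_2) + \alpha_1^T \right) H(x_1, 0)$.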
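The axis-parallel path in (38) can be illustrated independently of the Haar machinery. The sketch below recovers a function with $V(0, 0) = 0$ from its gradient by integrating first along the $x_1$-axis and then parallel to the $x_2$-axis, using the midpoint rule. The quadratic test function $V = x_1^2 + x_1 x_2 + x_2^2$ and all names are hypothetical choices for this illustration.

```python
def potential_from_gradient(grad, x1, x2, n=200):
    """Approximate V(x1, x2) by the two-leg path integral (38), with V(0, 0) = 0.

    grad(a, b) must return the pair (dV/dx1, dV/dx2) evaluated at (a, b).
    """
    total = 0.0
    step1 = x1 / n
    for i in range(n):
        s = (i + 0.5) * step1
        total += grad(s, 0.0)[0] * step1    # first leg: along the x1-axis
    step2 = x2 / n
    for i in range(n):
        s = (i + 0.5) * step2
        total += grad(x1, s)[1] * step2     # second leg: parallel to the x2-axis
    return total


def example_gradient(a, b):
    """Gradient of the test function V = a^2 + a*b + b^2."""
    return (2.0 * a + b, a + 2.0 * b)


if __name__ == "__main__":
    print(potential_from_gradient(example_gradient, 1.0, 2.0))
```

For the test gradient the first leg contributes $\int_0^1 2s\, ds = 1$ and the second $\int_0^2 (1 + 2s)\, ds = 6$, so the sketch returns $V(1, 2) = 7$ up to rounding; the midpoint rule is exact here because the integrands are linear.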
6. Numerical Examples
To show the efficiency of the proposed method, we applied our method to a linear quadratic optimal control problem and two nonlinear quadratic optimal control problems.
Example 1.
Consider the following linear quadratic regulator (LQR):
$$J = \frac{1}{2} \int_0^\infty \left( x_1^2(t) + u^2(t) \right) dt \tag{40}$$
subject to
$$\dot{x} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} x + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u. \tag{41}$$
To solve this problem we take the initial stabilizing control $u^{(0)}(x) = -x_1 - x_2$. Tables 1 and 2 show sample iteration results for $u^{(i)}$ and $V^{(i)}$, respectively, when $m = 8$ and $x_1 = -1/8$. The iteration is terminated when the difference between two successive controls is less than $\epsilon = 0.001$. In order to display two-dimensional plots, we fix $x_1$ at the collocation point $x_1^{[m/2]} = -\tau/m$ and let $x_2 \in [-1, 1)$. Figure 1 shows that, for this particular LQR problem, $m = 16$ is enough to approximate the exact optimal feedback control $u^*(x) = -x_1 - \sqrt{2}\, x_2$; however, approximating the exact cost function requires a higher value of $m$, as shown in Figure 2.
Table 1: Iteration results $u^{(i)}$ for Example 1 when $m = 8$ and $x_1 = -1/8$.

| $x_2$ | $u^{(0)}$ | $u^{(1)}$ | $u^{(2)}$ | $u^{(3)}$ | $u^{(4)}$ | $u_{\text{exact}}$ |
|---|---|---|---|---|---|---|
| -7/8 | 1.0000 | 1.4463 | 1.3772 | 1.3786 | 1.3793 | 1.3624 |
| -5/8 | 0.7500 | 1.0636 | 1.0114 | 1.0130 | 1.0136 | 1.0089 |
| -3/8 | 0.5000 | 0.68889 | 0.6548 | 0.6548 | 0.6550 | 0.6553 |
| -1/8 | 0.2500 | 0.3135 | 0.3027 | 0.3017 | 0.3015 | 0.3018 |
| 1/8 | 0 | -0.0615 | -0.0515 | -0.0519 | -0.0520 | -0.0518 |
| 3/8 | -0.2500 | -0.4397 | -0.4080 | -0.4053 | -0.4049 | -0.4053 |
| 5/8 | -0.5000 | -0.8137 | -0.7584 | -0.7571 | -0.7572 | -0.7589 |
| 7/8 | -0.7500 | -1.1880 | -1.1123 | -1.1130 | -1.1135 | -1.1124 |
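Table 2: Iteration results $V^{(i)}$ for Example 1 when $m = 8$ and $x_1 = -1/8$.

| $x_2$ | $V^{(0)}$ | $V^{(1)}$ | $V^{(2)}$ | $V^{(3)}$ | $V_{\text{exact}}$ |
|---|---|---|---|---|---|
| -7/8 | 0.7051 | 0.6709 | 0.6712 | 0.6714 | 0.6618 |
| -5/8 | 0.3914 | 0.3723 | 0.3722 | 0.3723 | 0.3654 |
| -3/8 | 0.1723 | 0.1640 | 0.1637 | 0.1637 | 0.1574 |
| -1/8 | 0.0470 | 0.0444 | 0.0442 | 0.0441 | 0.0377 |
| 1/8 | 0.0155 | 0.0130 | 0.0130 | 0.0130 | 0.0065 |
| 3/8 | 0.0781 | 0.0704 | 0.0701 | 0.0701 | 0.0636 |
| 5/8 | 0.2348 | 0.2162 | 0.2154 | 0.2153 | 0.2091 |
| 7/8 | 0.4850 | 0.4500 | 0.4492 | 0.4492 | 0.4431 |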
Figure 1: Optimal feedback control for Example 1 via the SHWCM with $m = 8, 16$ and $x_1 = -0.1250, -0.0625$, respectively.
Figure 2: Value cost function for Example 1 via the SHWCM with $m = 8, 16, 32$ and $x_1 = -0.1250, -0.0625, -0.0313$, respectively.
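The $u_{\text{exact}}$ and $V_{\text{exact}}$ columns of Tables 1 and 2 can be cross-checked against the algebraic Riccati equation for (40)-(41). Because the factor $1/2$ in (40) scales the whole cost, one may take $Q = \mathrm{diag}(1, 0)$ and $R = 1$, solve $A^T P + P A - P B B^T P + Q = 0$, and then use $u^* = -B^T P x$ and $V^* = \tfrac{1}{2} x^T P x$. The Python sketch below verifies that $P = \begin{bmatrix} \sqrt{2} & 1 \\ 1 & \sqrt{2} \end{bmatrix}$ satisfies this equation; the function names are hypothetical, and the closed-form residual is specific to this example.

```python
import math

SQRT2 = math.sqrt(2.0)
P = [[SQRT2, 1.0], [1.0, SQRT2]]    # candidate Riccati solution for Example 1


def riccati_residual(P):
    """A^T P + P A - P B B^T P + Q for A = [[0, 1], [0, 0]], B = [0, 1]^T,
    Q = diag(1, 0), R = 1, written out entry by entry."""
    p11, p12, p22 = P[0][0], P[0][1], P[1][1]
    return [[1.0 - p12 * p12, p11 - p12 * p22],
            [p11 - p12 * p22, 2.0 * p12 - p22 * p22]]


def u_exact(x1, x2):
    """Optimal feedback u* = -B^T P x."""
    return -(P[1][0] * x1 + P[1][1] * x2)


def v_exact(x1, x2):
    """Optimal cost V* = (1/2) x^T P x for the cost (40)."""
    return 0.5 * (P[0][0] * x1 * x1 + 2.0 * P[0][1] * x1 * x2 + P[1][1] * x2 * x2)


if __name__ == "__main__":
    print(riccati_residual(P))
    print(u_exact(-1 / 8, -7 / 8), v_exact(-1 / 8, -7 / 8))
```

At $x_1 = -1/8$, $x_2 = -7/8$ the sketch gives values that round to $1.3624$ and $0.6618$, in agreement with the first rows of Tables 1 and 2, and the feedback gain on $x_2$ is $\sqrt{2}$.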
Example 2.
Consider the following nonlinear optimal control problem [15]:
$$J = \int_0^\infty \left( x_2^2 + u^2 \right) dt \tag{42}$$
subject to
$$\dot{x} = \begin{bmatrix} x_2 \\ -x_1 \left( \dfrac{\pi}{2} + \tan^{-1}(5 x_1) \right) - \dfrac{5 x_1^2}{2 (1 + 25 x_1^2)} + 4 x_2 \end{bmatrix} + \begin{bmatrix} 0 \\ 3 \end{bmatrix} u. \tag{43}$$
The optimal solution of this problem is $u^*(x) = -3 x_2$ and $V^*(x) = x_1^2 \left( \pi/2 + \tan^{-1}(5 x_1) \right) + x_2^2$. To solve this nonlinear optimal control problem, we start with the initial stabilizing control $u^{(0)}(x) = -1.8 x_2$. Figure 3 shows the approximate optimal feedback control law $u^*$ for $m = 8$, $16$, and $32$. The graph for $m = 64$ overlaps with the exact optimal feedback control, and Figure 4 shows that the approximate cost function converges to the exact cost function as we increase the resolution. Figure 5 compares the exact state trajectories with the approximate trajectories.
Figure 3: Optimal feedback control for Example 2 via the SHWCM with $m = 8, 16, 32$ and $x_1 = -0.1250, -0.0625, -0.0313$, respectively.
Figure 4: Value cost function for Example 2 via the SHWCM with $m = 8, 16, 32$ and $x_1 = -0.1250, -0.0625, -0.0313$, respectively.
Figure 5: State trajectory comparison for Example 2.
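The closed-form solution quoted for Example 2 can be verified by substituting it into the HJB equation (24) with $l(x) = x_2^2$ and $R = 1$. The Python sketch below evaluates the residual of (24) and the feedback (23) at arbitrary points; it only checks the stated analytical solution and is not part of the SHWCM itself. The function names are hypothetical.

```python
import math


def drift(x1, x2):
    """Drift term f(x) of (43)."""
    f2 = (-x1 * (math.pi / 2 + math.atan(5 * x1))
          - 5 * x1 ** 2 / (2 * (1 + 25 * x1 ** 2)) + 4 * x2)
    return (x2, f2)


G = (0.0, 3.0)      # input vector g(x) of (43)


def grad_v_star(x1, x2):
    """Gradient of V* = x1^2 (pi/2 + atan(5 x1)) + x2^2."""
    dv1 = (2 * x1 * (math.pi / 2 + math.atan(5 * x1))
           + 5 * x1 ** 2 / (1 + 25 * x1 ** 2))
    return (dv1, 2 * x2)


def hjb_residual(x1, x2):
    """Left-hand side of (24) with l(x) = x2^2 and R = 1."""
    f1, f2 = drift(x1, x2)
    v1, v2 = grad_v_star(x1, x2)
    gv = G[0] * v1 + G[1] * v2
    return v1 * f1 + v2 * f2 + x2 ** 2 - 0.25 * gv * gv


def u_star(x1, x2):
    """Feedback (23): u* = -(1/2) R^{-1} g^T dV*/dx."""
    v1, v2 = grad_v_star(x1, x2)
    return -0.5 * (G[0] * v1 + G[1] * v2)


if __name__ == "__main__":
    for point in [(0.3, -0.5), (-0.8, 0.2), (0.0, 0.9)]:
        print(point, hjb_residual(*point), u_star(*point))
```

The residual vanishes up to rounding at every test point, and `u_star` reduces to $-3 x_2$, which is consistent with the optimal solution stated for this example.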
Example 3.
Consider the following optimal control problem [8]:
$$J = \int_0^\infty \left( x_1^2 + x_2^2 + u^2 \right) dt \tag{44}$$
subject to
$$\dot{x} = \begin{bmatrix} -x_1^3 - x_2 \\ x_1 + x_2 \end{bmatrix} + \begin{bmatrix} 0 \\ 1 \end{bmatrix} u. \tag{45}$$
The initial stabilizing control $u^{(0)}(x) = 0.4142 x_1 - 1.3522 x_2$ can be obtained using the feedback linearization method outlined in [20]. The optimal feedback control and cost function obtained using the SHWCM for the resolutions $m = 8$, $16$, and $32$ are illustrated in Figures 6 and 7, respectively. We believe that, by increasing the Haar wavelet resolution, the SHWCM will be capable of yielding more accurate results. Figure 8 shows a simulation of the system trajectories.
Figure 6: Optimal feedback control for Example 3 via the SHWCM with $m = 8, 16, 32$ and $x_1 = -0.1250, -0.0625, -0.0313$, respectively.
Figure 7: Value cost function for Example 3 via the SHWCM with $m = 8, 16, 32$ and $x_1 = -0.1250, -0.0625, -0.0313$, respectively.
Figure 8: Some state trajectories for Example 3.
7. Conclusion
In this paper we have proposed a new numerical method for solving the Hamilton-Jacobi-Bellman equation, which appears in the formulation of optimal control problems. Our approach combines the successive Generalized Hamilton-Jacobi-Bellman equation with Haar wavelet operational matrix methods. The proposed approach is simple and stable and has been tested on linear and nonlinear optimal control problems in a two-dimensional state space. In general, the approximate solutions for the optimal feedback control require a lower resolution than the approximate solutions for the cost function. In both cases, however, the examples indicate that more accurate results can be obtained by increasing the resolution of the Haar wavelets.
Acknowledgments
The authors are very grateful to the referees for their valuable comments and suggestions, which greatly improved the presentation of this paper. This research has been funded by University of Malaya, under Grant No. RG208-11AFR.
References
[1] R. W. Beard and T. W. McLain, "Successive Galerkin approximation algorithms for nonlinear optimal and robust control."
[2] Y. Feng, B. D. O. Anderson, and M. Rotkowitz, "A game theoretic algorithm to compute local stabilizing solutions to HJBI equations in nonlinear H∞ control."
[3] J. Huang and C.-F. Lin, "Numerical approach to computing nonlinear H∞ control laws."
[4] M. D. S. Aliyu, "An approach for solving the Hamilton-Jacobi-Isaacs equation (HJIE) in nonlinear H∞ control."
[5] M. Abu-Khalaf, F. L. Lewis, and J. Huang, "Policy iterations on the Hamilton-Jacobi-Isaacs equation for H∞ state feedback control with input saturation."
[6] S. T. Glad, "Robustness of nonlinear state feedback: a survey."
[7] O. von Stryk and R. Bulirsch, "Direct and indirect methods for trajectory optimization."
[8] R. W. Beard, G. N. Saridis, and J. T. Wen, "Galerkin approximations of the generalized Hamilton-Jacobi-Bellman equation."
[9] H. M. Jaddu.
[10] S. C. Beeler, H. T. Tran, and H. T. Banks, "Feedback control methodologies for nonlinear systems."
[11] C. Park and P. Tsiotras, "Approximations to optimal feedback control using a successive wavelet collocation algorithm," Proceedings of the American Control Conference, vol. 3, June 2003, pp. 1950-1955.
[12] C. F. Chen and C. H. Hsiao, "Haar wavelet method for solving lumped and distributed parameter systems."
[13] C. H. Hsiao and W. J. Wang, "Optimal control of linear time-varying systems via Haar wavelets."
[14] R. Dai and J. E. Cochran Jr., "Wavelet collocation method for optimal control problems."
[15] J. W. Curtis and R. W. Beard, "Successive collocation: an approximation to optimal nonlinear control," Proceeding of the American Control Conference, vol. 5, June 2001, pp. 3481-3485.
[16] C. H. Hsiao and S. P. Wu, "Numerical solution of time-varying functional differential equations via Haar wavelets."
[17] J. W. Brewer, "Kronecker products and matrix calculus in system theory."
[18] P. Courrieu, "Fast computation of Moore-Penrose inverse matrices."
[19] J.-J. Slotine and W. Li.
[20] A. Isidori.