Discrete-Time Indefinite Stochastic LQ Control via SDP and LMI Methods

This paper studies a discrete-time stochastic LQ problem over an infinite time horizon with state- and control-dependent noises, where the weighting matrices in the cost function are allowed to be indefinite. We mainly use semidefinite programming (SDP) and its duality to treat the corresponding problems. Several relations among stability, SDP complementary duality, the existence of a solution to the stochastic algebraic Riccati equation (SARE), and the optimality of the LQ problem are established. Mean square stabilizability can be tested and the SARE can be solved via SDP using LMI methods.


Introduction
The stochastic linear quadratic (LQ) control problem was first studied by Wonham [1] and has become a popular research field of modern control theory that has been extensively studied by many researchers; see, for example, [2-12]. In most of the early literature on the stochastic LQ problem, it is assumed that the control weighting matrix R is positive definite and the state weighting matrix Q is positive semidefinite. A breakthrough came in [9], where it was found that, for a stochastic LQ problem modeled by a stochastic Itô-type differential system, the original LQ optimization may still be well-posed even if the cost-weighting matrices Q and R are indefinite. This finding reveals an essential difference between deterministic and stochastic systems. Follow-up research then produced many important results. In [10-12], the continuous-time stochastic LQ control problem with indefinite weighting matrices was studied. The authors of [10] provided necessary and sufficient conditions for the solvability of the corresponding generalized differential Riccati equation (GDRE). In [11], the authors introduced LMIs whose feasibility is shown to be equivalent to the solvability of the SARE and developed a computational approach to the SARE by SDP. Furthermore, stochastic indefinite LQ problems with jumps over infinite and finite time horizons were studied in [13, 14], respectively. The discrete-time case was also studied in [15-17]. Among these, a central issue is solving the corresponding SARE. A traditional method is to consider the so-called associated Hamiltonian matrix. However, this method does not work when R is indefinite.
In this paper, we use the SDP approach introduced in [11, 18] to discuss the discrete-time indefinite stochastic LQ control problem over an infinite time horizon. Several equivalent relations between the stabilization/optimality of the LQ problem and the duality of SDP are established. We show that stabilization is equivalent to the feasibility of the dual SDP. Furthermore, we prove that the maximal solution to the SARE associated with the LQ problem can be obtained by solving the corresponding SDP. Our results extend those of [11] from the continuous-time case to the discrete-time case and those of [15] from a finite time horizon to an infinite time horizon.
The organization of this paper is as follows. In Section 2, we formulate the discrete-time indefinite stochastic LQ problem over an infinite time horizon and present some preliminaries, including definitions, lemmas, and SDP. Section 3 is devoted to the relations between stabilization and the dual SDP. In Section 4, we develop a computational approach to the SARE via SDP and characterize the optimal LQ control by the maximal solution to the SARE. Some numerical examples are presented in Section 5.

Problem Statement
Consider the following discrete-time stochastic system:

$$x(t+1) = Ax(t) + Bu(t) + [Cx(t) + Du(t)]\,w(t), \quad x(0) = x_0, \quad t = 0, 1, 2, \ldots, \tag{2.1}$$

where $x(t) \in \mathbb{R}^n$ and $u(t) \in \mathbb{R}^m$ are the system state and control input, respectively, $x_0 \in \mathbb{R}^n$ is the initial state, and $w(t) \in \mathbb{R}$ is the noise. $A, C \in \mathbb{R}^{n \times n}$ and $B, D \in \mathbb{R}^{n \times m}$ are constant matrices. $\{w(t),\ t = 0, 1, 2, \ldots\}$ is a sequence of real random variables defined on a filtered probability space $(\Omega, \mathcal{F}, \mathcal{F}_t, P)$ with $\mathcal{F}_t = \sigma\{w(s) : s = 0, 1, 2, \ldots, t\}$, which is a wide-sense stationary, second-order process with $E[w(t)] = 0$ and $E[w(s)w(t)] = \delta_{st}$, where $\delta_{st}$ is the Kronecker delta. $u(t)$ belongs to $L^2_{\mathcal{F}}(\mathbb{R}^m)$, the space of all $\mathbb{R}^m$-valued, $\mathcal{F}_t$-adapted measurable processes satisfying

$$E \sum_{t=0}^{\infty} \|u(t)\|^2 < \infty.$$

We assume that the initial state $x_0$ is independent of the noise $w(t)$.

Journal of Applied Mathematics 3
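We first give the following definitions.

System (2.1) is called mean square stabilizable if there exists a feedback control $u(t) = Kx(t)$, with $K \in \mathbb{R}^{m \times n}$ a constant matrix, such that for any initial state $x_0$ the closed-loop system

$$x(t+1) = (A + BK)x(t) + (C + DK)x(t)\,w(t), \quad x(0) = x_0, \quad t = 0, 1, 2, \ldots, \tag{2.4}$$

is asymptotically mean square stable; that is, the corresponding state $x(\cdot)$ of (2.4) satisfies $\lim_{t \to \infty} E\|x(t)\|^2 = 0$.

For system (2.1), we define the admissible control set

$$U_{ad} = \{u(t) \in L^2_{\mathcal{F}}(\mathbb{R}^m) : u(t) \text{ is a mean square stabilizing control}\}. \tag{2.5}$$

The cost function associated with system (2.1) is

$$J(x_0, u) = \sum_{t=0}^{\infty} E\left[x(t)^{\top} Q x(t) + u(t)^{\top} R u(t)\right],$$

where $Q$ and $R$ are symmetric matrices with appropriate dimensions and may be indefinite. The LQ optimal control problem is to minimize the cost functional $J(x_0, u)$ over $u \in U_{ad}$. We define the optimal value function as

$$V(x_0) = \inf_{u \in U_{ad}} J(x_0, u).$$

Since the weighting matrices $Q$ and $R$ may be indefinite, the LQ problem is called an indefinite LQ control problem.

If there exists an admissible control $u^*$ such that $V(x_0) = J(x_0, u^*)$, the LQ problem is called attainable and $u^*$ is called an optimal control.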
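As a minimal numerical sketch of asymptotic mean square stability, assume in addition that $w(t)$ is independent of $x(t)$ (for instance, an i.i.d. noise sequence; this is stronger than the second-order conditions stated above). For a closed-loop system $x(t+1) = Fx(t) + Gx(t)w(t)$ with $F = A + BK$ and $G = C + DK$, the second moment $X(t) = E[x(t)x(t)^{\top}]$ then satisfies $X(t+1) = FX(t)F^{\top} + GX(t)G^{\top}$, and $E\|x(t)\|^2 = \mathrm{Tr}\,X(t)$. The helper names and the matrices below are hypothetical and chosen only for illustration.

```python
def matmul(X, Y):
    """Product of two matrices stored as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def second_moment_step(F, G, X):
    """One step of X(t+1) = F X(t) F' + G X(t) G', where X(t) = E[x(t) x(t)']."""
    FXF = matmul(matmul(F, X), transpose(F))
    GXG = matmul(matmul(G, X), transpose(G))
    return [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(FXF, GXG)]


def mean_square_norm(F, G, x0, steps):
    """E||x(steps)||^2 = Tr X(steps) for a deterministic initial state x0."""
    X = [[a * b for b in x0] for a in x0]
    for _ in range(steps):
        X = second_moment_step(F, G, X)
    return sum(X[i][i] for i in range(len(X)))


# Hypothetical closed-loop data: F = A + BK, G = C + DK.
F_stable = [[0.5, 0.1], [0.0, 0.4]]
G_stable = [[0.3, 0.0], [0.0, 0.3]]
decayed = mean_square_norm(F_stable, G_stable, [1.0, 1.0], 50)

# Scalar case: F alone is stable (0.9 < 1), but 0.9**2 + 0.5**2 > 1.
grown = mean_square_norm([[0.9]], [[0.5]], [1.0], 100)
```

In the scalar example the deterministic part alone is stable, yet $0.9^2 + 0.5^2 = 1.06 > 1$, so the second moment grows: multiplicative noise can destroy mean square stability.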

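The stochastic algebraic Riccati equation (SARE) is a primary tool in solving LQ control problems. Associated with the above LQ problem, there is a discrete SARE:

$$\mathcal{R}(P) \equiv -P + A^{\top}PA + C^{\top}PC + Q - (A^{\top}PB + C^{\top}PD)(R + B^{\top}PB + D^{\top}PD)^{-1}(B^{\top}PA + D^{\top}PC) = 0, \quad R + B^{\top}PB + D^{\top}PD > 0. \tag{2.9}$$

Definition 2.4. A symmetric matrix $P_{\max}$ is called a maximal solution to (2.9) if $P_{\max}$ is a solution to (2.9) and $P_{\max} \ge P$ for any symmetric solution $P$ to (2.9).

Throughout this paper, we assume that system (2.1) is mean square stabilizable.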
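To make the SARE and Definition 2.4 concrete, the following sketch evaluates the scalar case $n = m = 1$, assuming the Riccati operator has the form $-p + a^2p + c^2p + q - (abp + cdp)^2/(r + b^2p + d^2p)$ with the constraint $r + b^2p + d^2p > 0$. The data $a = b = d = q = r = 1$, $c = 0$ are hypothetical; for them the equation reduces to $p^2 - 2p - 1 = 0$, so both roots $1 \pm \sqrt{2}$ can be checked by hand.

```python
import math


def riccati_residual(p, a, b, c, d, q, r):
    """Scalar Riccati operator: -p + a^2 p + c^2 p + q - (abp + cdp)^2 / (r + b^2 p + d^2 p)."""
    gain_denominator = r + b * b * p + d * d * p
    if gain_denominator <= 0:
        raise ValueError("constraint R + B'PB + D'PD > 0 is violated")
    cross = a * b * p + c * d * p
    return -p + a * a * p + c * c * p + q - cross * cross / gain_denominator


# Hypothetical data (not from the paper): the SARE reads 1 - p^2 / (1 + 2p) = 0.
params = dict(a=1.0, b=1.0, c=0.0, d=1.0, q=1.0, r=1.0)
roots = [1.0 + math.sqrt(2.0), 1.0 - math.sqrt(2.0)]

# Both roots satisfy the equation and the positivity constraint.
residuals = [riccati_residual(p, **params) for p in roots]

# The maximal solution in the sense of Definition 2.4 is the larger root.
p_max = max(roots)
```

Both roots solve the scalar equation, which illustrates why the maximal solution has to be singled out.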

Some Definitions and Lemmas
The following definitions and lemmas will be used frequently in this paper.

Definition 2.6. Suppose that $V$ is a finite-dimensional vector space and $S$ is a space of block diagonal symmetric matrices with given dimensions. $A: V \to S$ is a linear mapping and $A_0 \in S$. Then the inequality

$$A(x) + A_0 \ge (>)\ 0$$

is called a linear matrix inequality (LMI). An LMI is called feasible if there exists at least one $x \in V$ satisfying the above inequality, and $x$ is called a feasible point.

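Lemma 2.7 (Schur's lemma). Let matrices $M = M^{\top}$, $N$, and $R = R^{\top} > 0$ be given with appropriate dimensions. Then the following conditions are equivalent:

(1) $M - NR^{-1}N^{\top} \ge (>)\ 0$;

(2) $\begin{bmatrix} M & N \\ N^{\top} & R \end{bmatrix} \ge (>)\ 0$.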
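A quick sanity check of Schur's lemma in its standard form can be run for scalar blocks, where the block matrix $\begin{bmatrix} m & n \\ n & r \end{bmatrix}$ with $r > 0$ is positive semidefinite exactly when $m - n^2/r \ge 0$. The function names and test values below are hypothetical.

```python
def is_psd_2x2(m11, m12, m22, tol=1e-12):
    """A symmetric 2x2 matrix is positive semidefinite iff trace and determinant are nonnegative."""
    return m11 + m22 >= -tol and m11 * m22 - m12 * m12 >= -tol


def schur_complement_ok(m, n, r, tol=1e-12):
    """Condition M - N R^{-1} N' >= 0 for scalar blocks with R > 0."""
    return m - n * n / r >= -tol


# Triples (m, n, r) with r > 0, including a boundary case (1, 1, 1).
cases = [(2.0, 1.0, 1.0), (1.0, 1.0, 1.0), (0.5, 1.0, 1.0),
         (-1.0, 0.0, 2.0), (4.0, -3.0, 2.0)]

# The two tests should agree on every case.
agreement = [is_psd_2x2(m, r, n) if False else is_psd_2x2(m, n, r) == schur_complement_ok(m, n, r)
             for m, n, r in cases]
```

This equivalence is what allows the nonlinear Riccati inequality to be rewritten as a linear matrix inequality in $P$.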

Semidefinite Programming
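Definition 2.10 (see [19]). Suppose that $V$ is a finite-dimensional vector space with an inner product $\langle \cdot, \cdot \rangle_V$ and $S$ is a space of block diagonal symmetric matrices with an inner product $\langle \cdot, \cdot \rangle_S$. The following optimization problem

$$\text{minimize } \langle c, x \rangle_V \quad \text{subject to } \mathcal{A}(x) = A(x) + A_0 \ge 0, \tag{2.12}$$

where $c \in V$, is called a semidefinite programming (SDP) problem. From convex duality, the dual problem associated with the SDP is defined as

$$\text{maximize } -\langle A_0, Z \rangle_S \quad \text{subject to } A^{\mathrm{adj}}(Z) = c,\ Z \ge 0, \tag{2.13}$$

where $A^{\mathrm{adj}}$ denotes the adjoint of the linear mapping $A$. In the context of duality, we refer to the SDP (2.12) as the primal problem associated with (2.13).

Remark 2.11. Definition 2.10 is more general than Definition 6 in [11].

Let $p^*$ denote the optimal value of SDP (2.12); that is,

$$p^* = \inf\{\langle c, x \rangle_V : \mathcal{A}(x) \ge 0\}, \tag{2.14}$$

and let $d^*$ denote the optimal value of the dual SDP (2.13); that is,

$$d^* = \sup\{-\langle A_0, Z \rangle_S : Z \ge 0,\ A^{\mathrm{adj}}(Z) = c\}.$$

Let $X_{opt}$ and $Z_{opt}$ denote the primal and dual optimal sets; that is,

$$X_{opt} = \{x : \mathcal{A}(x) \ge 0,\ \langle c, x \rangle_V = p^*\}, \quad Z_{opt} = \{Z : Z \ge 0,\ A^{\mathrm{adj}}(Z) = c,\ -\langle A_0, Z \rangle_S = d^*\}.$$

About SDP, we have the following proposition (see [20, Theorem 3.1]).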
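For orientation, weak duality between the primal and dual problems can be derived in one line. The derivation assumes the standard forms "minimize $\langle c, x \rangle_V$ subject to $A(x) + A_0 \ge 0$" and "maximize $-\langle A_0, Z \rangle_S$ subject to $A^{\mathrm{adj}}(Z) = c$, $Z \ge 0$"; for any primal feasible $x$ and dual feasible $Z$ it gives a nonnegative duality gap, and hence $p^* \ge d^*$.

```latex
\begin{aligned}
\langle c, x\rangle_{V} - \bigl(-\langle A_0, Z\rangle_{S}\bigr)
  &= \langle A^{\mathrm{adj}}(Z), x\rangle_{V} + \langle A_0, Z\rangle_{S} \\
  &= \langle A(x), Z\rangle_{S} + \langle A_0, Z\rangle_{S} \\
  &= \langle A(x) + A_0, Z\rangle_{S} \;\ge\; 0 .
\end{aligned}
```

The last inequality holds because the trace inner product of two positive semidefinite matrices is nonnegative. The gap vanishes exactly when $\langle A(x) + A_0, Z \rangle_S = 0$, which is the complementarity condition used later in the paper.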

Mean Square Stabilization
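The stabilizability assumption on system (2.1) is basic for the study of the stochastic LQ problem in the infinite horizon case. We therefore cite some equivalent conditions for verifying stabilizability.

Lemma 3.1. System (2.1) is mean square stabilizable if and only if one of the following conditions holds.

(1) There are a matrix $K$ and a symmetric matrix $P > 0$ such that

$$-P + (A + BK)^{\top}P(A + BK) + (C + DK)^{\top}P(C + DK) < 0.$$

Moreover, the stabilizing feedback control is given by $u(t) = Kx(t)$.

(2) There are a matrix $K$ and a symmetric matrix $P > 0$ such that

$$-P + (A + BK)P(A + BK)^{\top} + (C + DK)P(C + DK)^{\top} < 0.$$

Moreover, the stabilizing feedback control is given by $u(t) = Kx(t)$.

(3) For any matrix $Y > 0$, there is a matrix $K$ such that the matrix equation

$$-P + (A + BK)^{\top}P(A + BK) + (C + DK)^{\top}P(C + DK) + Y = 0$$

has a unique positive definite solution $P > 0$. Moreover, the stabilizing feedback control is given by $u(t) = Kx(t)$.

(4) For any matrix $Y > 0$, there is a matrix $K$ such that the matrix equation

$$-P + (A + BK)P(A + BK)^{\top} + (C + DK)P(C + DK)^{\top} + Y = 0$$

has a unique positive definite solution $P > 0$. Moreover, the stabilizing feedback control is given by $u(t) = Kx(t)$.

(5) There exist matrices $P > 0$ and $U$ such that the following LMI holds:

$$\begin{bmatrix} -P & AP + BU & CP + DU \\ PA^{\top} + U^{\top}B^{\top} & -P & 0 \\ PC^{\top} + U^{\top}D^{\top} & 0 & -P \end{bmatrix} < 0.$$

Moreover, the stabilizing feedback control is given by $u(t) = UP^{-1}x(t)$.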
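In the scalar case the stabilizability test can be worked out in closed form, which gives a hand-checkable illustration. Assuming the noise $w(t)$ is independent of $x(t)$, the closed loop $x(t+1) = (a + bk)x(t) + (c + dk)x(t)w(t)$ is mean square stable if and only if $\rho(k) = (a + bk)^2 + (c + dk)^2 < 1$. Minimizing this quadratic in $k$ gives $k = -(ab + cd)/(b^2 + d^2)$ and $\min_k \rho = (ad - bc)^2/(b^2 + d^2)$. The function names and data below are hypothetical; this is a sketch for $n = m = 1$, not a substitute for the LMI test in the matrix case.

```python
def best_scalar_gain(a, b, c, d):
    """Gain k minimizing rho(k) = (a + b k)^2 + (c + d k)^2, and the minimal rho."""
    weight = b * b + d * d
    if weight == 0:
        return 0.0, a * a + c * c          # the control has no effect
    k = -(a * b + c * d) / weight
    rho = (a * d - b * c) ** 2 / weight    # follows from Lagrange's identity
    return k, rho


def is_mean_square_stabilizable(a, b, c, d):
    """Scalar test: some gain k achieves (a + bk)^2 + (c + dk)^2 < 1."""
    _, rho = best_scalar_gain(a, b, c, d)
    return rho < 1.0


# a = b = d = 1, c = 0: the best gain is k = -0.5 with rho = 0.5.
k1, rho1 = best_scalar_gain(1.0, 1.0, 0.0, 1.0)

# a = 2, b = d = 1, c = 0: rho = 2, so no linear gain stabilizes the system,
# although the noise-free pair (a, b) = (2, 1) is stabilizable.
blocked = is_mean_square_stabilizable(2.0, 1.0, 0.0, 1.0)
```

The second example shows how control-dependent noise can prevent stabilization of a system whose deterministic part is controllable.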

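Below, we establish the relation between stabilization and the dual SDP. First, we assume that the interior of the set $\mathbf{P} = \{P \in \mathcal{S}^n \mid \mathcal{R}(P) \ge 0,\ R + B^{\top}PB + D^{\top}PD > 0\}$ is nonempty; that is, there is a $P_0 \in \mathcal{S}^n$ such that $\mathcal{R}(P_0) > 0$ and $R + B^{\top}P_0B + D^{\top}P_0D > 0$. Here $\mathcal{R}(P)$ denotes the left-hand side of the SARE (2.9) and $\mathcal{S}^n$ the set of $n \times n$ symmetric matrices.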
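Consider the SDP problem (3.6). By the definition of SDP, we can obtain the dual problem of (3.6).

Theorem 3.2. The dual problem of (3.6) can be formulated as (3.7).

Proof. The objective of the primal problem can be rewritten as maximizing $\langle I, P \rangle_{\mathcal{S}^n}$. Define the dual variable $Z \in \mathcal{S}^{2n+m}$ as

$$Z = \begin{bmatrix} \begin{bmatrix} S & U^{\top} \\ U & T \end{bmatrix} & Y^{\top} \\ Y & W \end{bmatrix},$$

where $(S, T, W, U, Y) \in \mathcal{S}^n \times \mathcal{S}^m \times \mathcal{S}^n \times \mathbb{R}^{m \times n} \times \mathbb{R}^{n \times (n+m)}$. The LMI constraint in the primal problem can be represented as $\mathcal{A}(P) = A(P) + A_0 \ge 0$. According to the definition of the adjoint mapping, we have $\langle A(P), Z \rangle_{\mathcal{S}^{2n+m}} = \langle P, A^{\mathrm{adj}}(Z) \rangle_{\mathcal{S}^n}$, that is, $\mathrm{Tr}[A(P)Z] = \mathrm{Tr}[P A^{\mathrm{adj}}(Z)]$. It follows that

$$A^{\mathrm{adj}}(Z) = -S + ASA^{\top} + CSC^{\top} + BUA^{\top} + DUC^{\top} + AU^{\top}B^{\top} + CU^{\top}D^{\top} + BTB^{\top} + DTD^{\top} + W.$$

By Definition 2.10, the objective of the dual problem is to maximize $-\langle A_0, Z \rangle_{\mathcal{S}^{2n+m}}$. On the other hand, we show that the constraints of the dual problem (2.13) are equivalent to the constraints of (3.7). Obviously, $A^{\mathrm{adj}}(Z) = -I$ is equivalent to the equality constraint of (3.7). Furthermore, notice that the matrix variable $Y$ does not appear in the above formulation and can therefore be set to zero. So, the condition $Z \ge 0$ is equivalent to

$$\begin{bmatrix} S & U^{\top} \\ U & T \end{bmatrix} \ge 0, \quad W \ge 0.$$

This ends the proof.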
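The adjoint identity $\mathrm{Tr}[A(P)Z] = \mathrm{Tr}[P A^{\mathrm{adj}}(Z)]$ used in the proof can be checked numerically. The sketch below assumes that the linear part $A(P)$ is block diagonal with blocks $\begin{bmatrix} -P + A^{\top}PA + C^{\top}PC & A^{\top}PB + C^{\top}PD \\ B^{\top}PA + D^{\top}PC & B^{\top}PB + D^{\top}PD \end{bmatrix}$ and $P$, which is the structure implied by the expression for $A^{\mathrm{adj}}(Z)$. All matrices are hypothetical integer data with $n = 2$, $m = 1$, so the comparison is exact.

```python
def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]


def transpose(X):
    return [list(row) for row in zip(*X)]


def add(*Ms):
    """Entrywise sum of matrices of equal shape."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*Ms)]


def neg(X):
    return [[-v for v in row] for row in X]


def trace(X):
    return sum(X[i][i] for i in range(len(X)))


def prod(*Ms):
    """Left-to-right product of several matrices."""
    out = Ms[0]
    for M in Ms[1:]:
        out = matmul(out, M)
    return out


A = [[1, 2], [0, 1]]; B = [[1], [1]]; C = [[0, 1], [1, 0]]; D = [[1], [0]]
P = [[2, 1], [1, 3]]                      # symmetric primal variable
S = [[1, 0], [0, 2]]; T = [[3]]; W = [[1, 1], [1, 1]]; U = [[1, -1]]
At, Bt, Ct, Dt, Ut = map(transpose, (A, B, C, D, U))

# Blocks of the linear part A(P).
M11 = add(neg(P), prod(At, P, A), prod(Ct, P, C))
M12 = add(prod(At, P, B), prod(Ct, P, D))
M21 = transpose(M12)                      # valid because P is symmetric
M22 = add(prod(Bt, P, B), prod(Dt, P, D))

# Tr[A(P) Z], expanded block by block (Y drops out of the trace).
lhs = (trace(matmul(M11, S)) + trace(matmul(M12, U)) + trace(matmul(M21, Ut))
       + trace(matmul(M22, T)) + trace(matmul(P, W)))

# Tr[P A_adj(Z)] with the adjoint expression from the proof.
adj = add(neg(S), prod(A, S, At), prod(C, S, Ct), prod(B, U, At), prod(D, U, Ct),
          prod(A, Ut, Bt), prod(C, Ut, Dt), prod(B, T, Bt), prod(D, T, Dt), W)
rhs = trace(matmul(P, adj))
```

Because the off-diagonal block $Y$ multiplies only zero blocks of $A(P)$, it never enters `lhs`, which is the reason it can be set to zero in the dual problem.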

Remark 3.3. This proof is simpler than the proof in [11] because we use a more general definition of the dual problem.

The following theorem reveals that the stabilizability of the discrete-time stochastic system can also be regarded as a dual concept of optimality. This result is a discrete-time version of Theorem 6 in [11].

Solutions to SARE and SDP
The following theorem establishes the existence of a solution to the SARE (2.9) via the SDP (3.6).

Theorem 4.1. The optimal set of (3.6) is nonempty, and any optimal solution $P^*$ must satisfy the SARE (2.9).

Proof. Since system (2.1) is mean square stabilizable, by Theorem 3.4, (3.7) is strictly feasible. Problem (3.6) is strictly feasible because $P_0$ is an interior point of $\mathbf{P}$. By Proposition 2.12, the optimal set of (3.6) is nonempty and $P^*$ satisfies $\mathcal{A}(P^*)Z = 0$ for a dual optimal solution $Z$. From this equality it follows that $\mathcal{R}(P^*)\mathcal{R}(P^*) \le 0$, which yields $\mathcal{R}(P^*) = 0$ because $\mathcal{R}(P^*) \ge 0$.

The following theorem shows that any optimal solution of the primal SDP yields a stabilizing control for the LQ problem.

By Lemma 3.1, the above inequality is equivalent to the mean square stabilizability of system (2.1) with $u(t) = US^{-1}x(t) = -(R + B^{\top}P^*B + D^{\top}P^*D)^{-1}(B^{\top}P^*A + D^{\top}P^*C)x(t)$. This ends the proof.

Theorem 4.3. There is a unique optimal solution to (3.6), which is the maximal solution to the SARE (2.9).

Proof. The proof is similar to that of Theorem 9 in [11] and is omitted.

Proof of Theorem 4.4. The first assertion is Theorem 4 in [16]. As to the second part, we proceed as follows. By Lemma 2.7, $P_{\max}$ satisfies the constraints in (4.11). $P_{\max}$ is an optimal solution to (4.11) due to its maximality. Next we prove uniqueness. Assume that $\tilde{P}_{\max}$ is another optimal solution to (4.11). Then we have $\mathrm{Tr}(P_{\max}) = \mathrm{Tr}(\tilde{P}_{\max})$. According to Definition 2.4, $P_{\max} \ge \tilde{P}_{\max}$. This yields $P_{\max} = \tilde{P}_{\max}$. Hence, the proof of the theorem is completed.

Remark 4.5. Here we drop the assumption that the interior of $\mathbf{P}$ is nonempty.

Remark 4.6. Theorem 4.4 shows that the maximal solution to the SARE (2.9) can be obtained by solving the SDP (4.11). The result provides a computational approach to the SARE. Furthermore, as shown in [16], the relationship between the LQ value function and the maximal solution to the SARE (2.9) can be established; that is, assuming that $\mathbf{P}$ is nonempty, the value function is $V(x_0) = x_0^{\top}P_{\max}x_0$ and the optimal control can be expressed as

$$u^*(t) = -(R + B^{\top}P_{\max}B + D^{\top}P_{\max}D)^{-1}(B^{\top}P_{\max}A + D^{\top}P_{\max}C)x(t).$$

The above results indicate that the SARE (2.9) may have a solution even if $R$ is indefinite (even negative definite). To describe the allowable degree of negativity, we give the following definition of the solvability margin.

Definition 4.7 (see [11]). The solvability margin $r^*$ is defined as the largest nonnegative scalar $r \ge 0$ such that (2.9) has a solution for any $R > -r^*I$.
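A scalar sketch may help to see the value function, the optimal gain, and a negative control weight at work. It uses hypothetical data $a = b = d = q = 1$, $c = 0$ and assumes that $w(t)$ is independent of $x(t)$. The scalar SARE then reads $p^2 - 2p - r = 0$, its largest root is $1 + \sqrt{1 + r}$, and a linear feedback $u(t) = kx(t)$ with $\rho = (1 + k)^2 + k^2 < 1$ has cost $J = (1 + rk^2)x_0^2/(1 - \rho)$. The function names are illustrative and the comparison is restricted to linear gains.

```python
import math


def max_sare_solution(r):
    """Largest root of p^2 - 2p - r = 0 (scalar SARE for a = b = d = q = 1, c = 0)."""
    return 1.0 + math.sqrt(1.0 + r)     # real only when r >= -1


def optimal_gain(p, r):
    """K = -(R + B'PB + D'PD)^{-1} (B'PA + D'PC) for the same scalar data."""
    return -p / (r + 2.0 * p)


def feedback_cost(k, r):
    """J(x0, u) / x0^2 for u(t) = k x(t): (q + r k^2) / (1 - (a + bk)^2 - (c + dk)^2)."""
    rho = (1.0 + k) ** 2 + k ** 2       # mean square growth factor of the closed loop
    if rho >= 1.0:
        raise ValueError("gain is not mean square stabilizing")
    return (1.0 + r * k * k) / (1.0 - rho)


r = -0.5                                 # a negative control weight
p_max = max_sare_solution(r)
k_opt = optimal_gain(p_max, r)
cost_opt = feedback_cost(k_opt, r)       # coincides with p_max up to rounding
cost_near = [feedback_cost(k_opt + dk, r) for dk in (-0.05, 0.05)]
```

For this particular example the scalar equation has a real solution exactly when $r \ge -1$, which suggests a solvability margin of $1$ for these data; this is a property of the toy example and not a general statement.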
By Theorem 4.4, the following conclusion is obvious.

Mean Square Stabilizability
In order to test the mean square stabilizability of system 2.

Conclusion
In this paper, we have used the SDP approach to study discrete-time indefinite stochastic LQ control. It was shown that the mean square stabilization of system (2.1) is equivalent to the strict feasibility of the SDP (3.7). In addition, the relation between the optimal solution of (3.6) and the maximal solution of the SARE (2.9) has been established. Our results can be viewed as a discrete-time version of [11]. Of course, many open problems remain. For example, $R + B^{\top}PB + D^{\top}PD > 0$ is a basic assumption in this paper. A natural question is whether it can be weakened to $R + B^{\top}PB + D^{\top}PD \ge 0$. This problem merits further study.