Optimal Control with Partial Information for Stochastic Volterra Equations

In the first part of the paper we obtain existence and characterization results for an optimal control of a linear quadratic control problem for linear stochastic Volterra equations. In the second part, using the Malliavin calculus approach, we deduce a general maximum principle for optimal control of general stochastic Volterra equations. The result is applied to solve a stochastic control problem for stochastic delay equations.


Introduction
Let (Ω, F, {F_t}, P) be a filtered probability space and B(t), t ≥ 0, an F_t-adapted real-valued Brownian motion. Let R_0 := R \ {0} and let ν(dz) be a σ-finite measure on (R_0, B(R_0)). Let N(dt, dz) denote a stationary Poisson random measure on R × R_0 with intensity measure dt ν(dz). Denote by Ñ(dt, dz) := N(dt, dz) − dt ν(dz) the compensated Poisson measure. Suppose that we have a cash flow where the amount X(t) at time t is modelled by a stochastic delay equation of the form:

International Journal of Stochastic Analysis
Here h > 0 is a fixed delay and A_1(t), A_2(t), A_0(t, s), C_1(t), C_2(t, z), and η are given bounded deterministic functions. Suppose that we consume at the rate u(t) at time t from this wealth X(t), and that this consumption rate influences the growth rate of X(t) both through its value u(t) at time t and through its former value u(t − h), because of some delay mechanism in the system determining the dynamics of X(t).
With such a consumption rate u(t), the dynamics of the corresponding cash flow X^u(t) is given by (1.2). Suppose that the consumer wants to maximize the combined utility of the consumption up to the terminal time T and the terminal wealth. Then the problem is to find u(·) such that the performance (1.3) is maximal. Here U_1(t, ·) and U_2(·) are given utility functions, possibly stochastic; see Section 4. This is an example of a stochastic control problem with delay. Such problems have been studied by many authors; see, for example, [1-5] and the references therein. The methods used in those papers, however, do not apply to the cases studied here. Moreover, those papers do not consider partial information control (see below).
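To make the delay mechanism concrete, here is a minimal Euler-Maruyama sketch of a cash flow whose growth rate depends on the delayed state and on the current and delayed consumption rate. All specific choices (constant coefficients A1, A2, a constant volatility sigma, a constant consumption rate, and the initial function η ≡ x0) are illustrative assumptions, not the paper's specification, and the jump term is omitted.

```python
import numpy as np

def simulate_delay_cashflow(T=1.0, h=0.25, n=400, x0=1.0, seed=0,
                            A1=0.05, A2=0.02, sigma=0.1,
                            u=lambda t: 0.03):
    """Euler-Maruyama sketch of a controlled cash flow with delay:
    dX(t) = [A1*X(t) + A2*X(t-h) - u(t) - u(t-h)] dt + sigma*X(t) dB(t),
    with X(t) = x0 on [-h, 0].  All coefficients are illustrative."""
    rng = np.random.default_rng(seed)
    dt = T / n
    lag = int(round(h / dt))                     # delay length in grid steps
    X = np.empty(n + 1)
    X[0] = x0
    for k in range(n):
        t = k * dt
        x_lag = X[k - lag] if k >= lag else x0   # eta(t) = x0 on [-h, 0]
        drift = A1 * X[k] + A2 * x_lag - u(t) - u(t - h)
        X[k + 1] = X[k] + drift * dt + sigma * X[k] * rng.normal(0.0, np.sqrt(dt))
    return X
```

Note how the state at step k feeds back only after `lag` further steps, which is exactly why the solution fails to be Markovian in X(t) alone.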
It was shown in [6] that the system (1.2) is equivalent to the following controlled stochastic Volterra equation:

1.6
So the control of the system (1.2) reduces to the control of the system (1.4). For more information about stochastic control of delay equations we refer to [6] and the references therein.
Stochastic Volterra equations are interesting in their own right, and also for applications, for example, to economics or population dynamics. See, for example, Example 1.1 in [7] and the references therein.
In the first part of this paper, we study a linear quadratic control problem for the following controlled stochastic Volterra equation (1.7): where u(t) is our control process and ξ(t) is a given predictable process with E[ξ²(t)] < ∞ for all t ≥ 0, while K_i, D_i are bounded deterministic functions. In reality one often does not have complete information when controlling a system. This means that the control process is required to be predictable with respect to a subfiltration {G_t} with G_t ⊂ F_t. So the space of controls will be a space U of G_t-predictable processes; U is a Hilbert space equipped with an inner product, and ||·|| will denote the norm in U. Let A_G be a closed, convex subset of U, which will be the space of admissible controls. Consider the linear quadratic cost functional (1.10) and the value function (1.11). In Section 2, we prove the existence of an optimal control and provide some characterizations of the control.
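Since the display defining the cost functional (1.10) did not survive extraction, the sketch below evaluates a generic linear quadratic cost of the assumed form J(u) = E[∫_0^T (Q_1 u²(t) + Q_2 X²(t)) dt + a_1 X(T)²] on simulated paths. The weights Q_1, Q_2, a_1 match the symbols used later in Theorem 2.1, but the exact integrand is an assumption.

```python
import numpy as np

def lq_cost(paths, u, dt, Q1=1.0, Q2=0.5, a1=1.0):
    """Monte Carlo estimate of an assumed quadratic cost
    J(u) = E[ int_0^T (Q1*u(t)^2 + Q2*X(t)^2) dt + a1*X(T)^2 ].
    `paths` has shape (n_paths, n_steps+1); `u` is deterministic,
    shape (n_steps,)."""
    running = (Q1 * u**2).sum() * dt \
        + Q2 * (paths[:, :-1]**2).sum(axis=1) * dt   # per-path running cost
    terminal = a1 * paths[:, -1]**2                  # per-path terminal cost
    return float(np.mean(running + terminal))
```

For constant paths X ≡ 1 on two steps of length 0.5 with u ≡ 0, the running cost is 0.5·1·1 = 0.5 and the terminal cost is 1, so J = 1.5.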
In the second part of the paper (from Section 3), we consider the following general controlled stochastic Volterra equation: where ξ(t) is a given predictable process with E[ξ²(t)] < ∞ for all t ≥ 0. The performance functional has the following form: for any u ∈ A_G, the space of admissible controls. The problem is to find u ∈ A_G maximizing this performance functional. Using Malliavin calculus, inspired by the method in [8], we will deduce a general maximum principle for the above control problem.
Remark 1.1. Note that we are outside the Markovian setting, because the solution of the Volterra equation is not Markovian. Therefore the classical method of dynamic programming and the Hamilton-Jacobi-Bellman equation cannot be used here.
Remark 1.2. We emphasize that partial information is different from partial observation, where the control is based on noisy observations of the current state. For example, our discussion includes the case G_t = F_{t−δ} (δ > 0 constant), which corresponds to delayed information flow. This case is not covered by partial observation models. For a comprehensive presentation of the linear quadratic control problem in the classical case with partial observation, see [9]; with partial information, see [10].

Linear Quadratic Control
Consider the controlled stochastic Volterra equation (1.7) and the control problem (1.10), (1.11). We have the following theorem.
Theorem 2.1. Suppose that ∫_{R_0} K_4²(t, s, z) ν(dz) is bounded and that Q_2(s) ≥ 0, a_1 ≥ 0, and Q_1(s) ≥ δ for some δ > 0. Then there exists a unique element u ∈ A_G attaining the infimum in (1.11).

Proof. For simplicity, we assume D_3(t, s, z) = 0 and K_5(t, s, z) = 0 in this proof, because these terms can be estimated similarly to the corresponding terms for the Brownian motion B(·). By (1.7) we obtain the estimate (2.3); similar arguments also lead to the bound (2.4) for some constant C_2. Now, let u_n ∈ A_G be a minimizing sequence for the value function, that is, lim_{n→∞} J(u_n) = J. From the estimate (2.3) we see that there exists a constant c such that ∫_0^T E[Q_1(s) u_n²(s)] ds ≤ J(u_n) + c. Thus, by virtue of the assumption on Q_1, we have, for some constant M, ||u_n|| ≤ M for all n. This implies that {u_n} is bounded in U, hence weakly compact. Let u_{n_k}, k ≥ 1, be a subsequence that converges weakly to some element u_0 in U. Since A_G is closed and convex, the Banach-Saks theorem implies u_0 ∈ A_G. From (2.4) we see that X^{u_{n_k}}(t) converges weakly to X^{u_0}(t) in L²(Ω). The same conclusion also holds for Z^u(t) := X^u(t) − X^0(t). Since Z^u is linear in u, we conclude that the maps u ↦ Z^u(t) and u ↦ X^u(T) are continuous with respect to the weak topologies of U and L²(Ω). Since the functionals of X^u involved in the definition of J(u) in (1.10) are lower semicontinuous with respect to the weak topology, it follows that J(u_0) ≤ lim inf_{k→∞} J(u_{n_k}) = J, which implies that u_0 is an optimal control. The uniqueness is a consequence of the fact that J(u) is strictly convex in u, which in turn follows from the fact that X^u is affine in u and x² is a strictly convex function. The proof is complete.
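The existence argument above is nonconstructive (minimizing sequence, weak compactness, convexity). In a discretized setting the same strict convexity makes the minimizer computable; the following toy sketch minimizes a strictly convex quadratic functional over a closed convex set A_G by projected gradient descent. Both the box constraint and the functional are assumptions chosen only to illustrate uniqueness of the minimizer.

```python
import numpy as np

def projected_gradient(grad, project, u0, step=0.1, iters=500):
    """Minimise a convex functional over a closed convex set by projected
    gradient descent -- a finite-dimensional stand-in for the
    minimising-sequence argument of Theorem 2.1 (illustrative only)."""
    u = project(np.asarray(u0, dtype=float))
    for _ in range(iters):
        u = project(u - step * grad(u))   # gradient step, then project onto A_G
    return u

# Toy instance: J(u) = ||u - c||^2 over the box A_G = [0, 1]^3.
c = np.array([0.3, 1.7, -0.4])
grad = lambda u: 2.0 * (u - c)
project = lambda u: np.clip(u, 0.0, 1.0)
u_star = projected_gradient(grad, project, np.zeros(3))
```

For this instance the unique minimizer is the pointwise projection of c onto the box, u* = (0.3, 1.0, 0.0), mirroring the uniqueness obtained from strict convexity in the theorem.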
To characterize the optimal control, we assume D_1(t, s) = 0 and D_3(t, s, z) = 0; that is, we consider the controlled system:

2.10
For a predictable process h(s), we define ∫_0^t h(s) dF(t, s) by:

2.12
Lemma 2.2. Under our assumptions, the above series converges at least in L¹(Ω). Thus M_i, i = 1, 2, 3, and L are well defined.
Proof. We first note that, for some constant R_T, the kernels satisfy a uniform bound. This implies that the nth term of the series is dominated by a constant multiple of 1/(n − 1)!.

2.16
Thus, we have the claimed convergence. The following theorem is a characterization of the optimal control.

Theorem 2.3. Assume that ∫_{R_0} K_4²(t, s, z) ν(dz) and ∫_{R_0} K_5²(t, s, z) ν(dz) are bounded. Let u be the unique optimal control given in Theorem 2.1. Then u is determined by the following equation:

2.18
almost everywhere with respect to m(ds, dω) := ds × P(dω).
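The L¹-convergence in Lemma 2.2 rests on the factorial decay of iterated Volterra kernels, the same mechanism that drives the repeated-substitution expansions used below for X^u and Y^w. A numerical sketch with the assumed kernel K ≡ 1, for which the nth iterated kernel is exactly (t − s)^{n−1}/(n − 1)!:

```python
import numpy as np

def iterated_kernels(K, T=1.0, n_grid=200, n_terms=6):
    """Discretise K(t,s) on a grid and form the iterated Volterra kernels
    K^(n+1)(t,s) = int_s^t K(t,r) K^(n)(r,s) dr via a Riemann sum.
    For |K| <= R the nth kernel is bounded by R^n (t-s)^{n-1}/(n-1)!,
    the factorial decay behind Lemma 2.2 (numerical sketch only)."""
    t = np.linspace(0.0, T, n_grid)
    dt = t[1] - t[0]
    base = np.tril(K(t[:, None], t[None, :]))   # lower-triangular: s <= t
    Kn = base.copy()
    sups = [np.abs(Kn).max()]
    for _ in range(n_terms - 1):
        Kn = np.tril(base @ Kn) * dt            # next iterated kernel
        sups.append(np.abs(Kn).max())
    return sups

sups = iterated_kernels(lambda t, s: np.ones_like(t * s))
```

For K ≡ 1 on [0, 1] the successive sup-norms are approximately 1, 1, 1/2, 1/6, 1/24, 1/120 (up to O(dt) discretization error), visibly summable.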

Proof. For any w ∈ U, since u is the optimal control, we have a first-order variational inequality. This leads to (2.20) for all w ∈ U. By virtue of (2.9), it is easy to see that the process Y^w satisfies the following equation:

2.22
Note that Y^w is independent of u. Next we will find an explicit expression for X^u. Let dF(t, s) be defined as in (2.10). Repeatedly using (2.9), we have

2.23
Similarly, we have the following expansion for Y^w:

2.24
Interchanging the order of integration, we get

2.25
Now substituting Y^w into (2.20), we obtain (2.26) for all w ∈ U. Interchanging the order of integration and conditioning on G_s, we see that (2.26) is equivalent to

2.27
Since this holds for all w ∈ U, we conclude that

2.28
m-a.e. Note that X^u(t) can be written as

2.31
Suppose G_t = {Ω, ∅}, meaning that the control is deterministic. In this case, we can find the unique optimal control explicitly. Noting that the conditional expectation reduces to the expectation, equation (2.18) for the optimal control u becomes

2.32
where we have used the fact that E[M_2(t)] = 0, M_1(t) = ξ(t), and L(t, s) = K_3(t, s) in this special case. Put b := ∫_0^T u(t) K_3(T, t) dt.

2.34
where the remaining coefficient is given by (2.35). Substituting the expression for u into (2.34), we get

2.37
Together with (2.35), we arrive at the explicit formula for the optimal control, ds-a.e.
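In the deterministic case the optimal control solves a linear integral equation coupled through the scalar b = ∫_0^T u(t) K_3(T, t) dt, and eliminating b gives the explicit formula. The sketch below carries out this elimination for an assumed first-order condition Q_1(t) u(t) + a_1 b K_3(T, t) = c(t); this linear form is a stand-in for the lost displays (2.32)/(2.34), not the paper's exact equation.

```python
import numpy as np

def deterministic_optimal_control(c, K3T, Q1, a1, dt):
    """Solve the assumed first-order condition
        Q1(t) u(t) + a1 * b * K3(T,t) = c(t),  b = int_0^T u(t) K3(T,t) dt,
    for a deterministic control (G_t trivial).  Integrating the pointwise
    equation against K3(T,t)/Q1(t) eliminates the scalar b in closed form."""
    I1 = np.sum(c * K3T / Q1) * dt          # int c K3 / Q1 dt
    I2 = np.sum(K3T**2 / Q1) * dt           # int K3^2 / Q1 dt
    b = I1 / (1.0 + a1 * I2)                # from b = I1 - a1*b*I2
    u = (c - a1 * b * K3T) / Q1
    return u, b
```

With constant data c ≡ 1, K_3 ≡ 1, Q_1 ≡ 1, a_1 = 1 on [0, 1], this gives b = 1/2 and u ≡ 1/2, and one can check the self-consistency b = ∫ u K_3 dt directly.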

A General Maximum Principle
In this section, we consider the following general controlled stochastic Volterra equation (3.1): with a performance functional J(u) of the form (3.2). The purpose of this section is to give a characterization of the critical points of J(u). First, in the following two subsections, we briefly recall some basic properties of the Malliavin calculus for B(·) and Ñ(·, ·), which will be used in the sequel. For more information we refer to [11, 12].

Integration by Parts Formula for B(·)
In this subsection, F_T = σ(B(s), 0 ≤ s ≤ T). Recall that the Wiener-Itô chaos expansion theorem states that any F ∈ L²(F_T, P) admits the representation (3.4) for a unique sequence of symmetric deterministic functions f_n ∈ L²([0, T]^n). Moreover, the following isometry holds: ||F||²_{L²(P)} = Σ_{n≥0} n! ||f_n||²_{L²([0,T]^n)}. Let D_{1,2} be the space of all F ∈ L²(F_T, P) such that its chaos expansion (3.4) satisfies

3.7
For F ∈ D_{1,2} and t ∈ [0, T], the Malliavin derivative of F, D_t F, is defined by D_t F = Σ_{n≥1} n I_{n−1}(f_n(·, t)), where I_{n−1}(f_n(·, t)) is the (n − 1)-times iterated integral with respect to the first n − 1 variables of f_n, keeping the last variable t_n = t as a parameter. We need the following result.
Theorem A (integration by parts (duality) formula for B(·)). Suppose h(t) satisfies (3.9). Then E[F ∫_0^T h(t) dB(t)] = E[∫_0^T h(t) D_t F dt].
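The duality formula of Theorem A can be sanity-checked by Monte Carlo. Take F = B(T)³, for which D_t F = 3B(T)² for t ≤ T, and the deterministic integrand h ≡ 1, so that ∫_0^T h dB = B(T); both sides of the formula then equal 3T · ∫_0^T h dt = 3T², i.e. 3 for T = 1. A sketch:

```python
import numpy as np

# Monte Carlo check of the duality formula
#   E[ F * int_0^T h(t) dB(t) ] = E[ int_0^T D_t F * h(t) dt ]
# for F = B(T)^3 (so D_t F = 3*B(T)^2, t <= T) and h == 1 on [0, T].
rng = np.random.default_rng(42)
T, n_samples = 1.0, 1_000_000
BT = rng.normal(0.0, np.sqrt(T), size=n_samples)   # B(T) ~ N(0, T)
# With h == 1: int h dB = B(T), and int D_t F h dt = 3*B(T)^2 * T.
lhs = np.mean(BT**3 * BT)          # E[F * int h dB] = E[B(T)^4] = 3*T^2
rhs = np.mean(3.0 * BT**2 * T)     # E[int D_t F h dt] = 3*T*E[B(T)^2]
```

Both estimates concentrate around 3, confirming the formula in this special case; no simulation of the whole Brownian path is needed because everything depends on B(T) only.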

Integration by Parts Formula for N
In this subsection, F_T = σ(η(s), 0 ≤ s ≤ T), where η(s) = ∫_0^s ∫_{R_0} z Ñ(dr, dz). Recall that the Wiener-Itô chaos expansion theorem states that any F ∈ L²(F_T, P) admits the representation F = Σ_{n≥0} I_n(f_n) for a unique sequence of functions f_n. Moreover, the following isometry holds:

3.12
Let D_{1,2} be the space of all F ∈ L²(F_T, P) such that its chaos expansion satisfies

3.13
For F ∈ D_{1,2} and t ∈ [0, T], the Malliavin derivative of F, D_{t,z} F, is defined by D_{t,z} F = Σ_{n≥1} n I_{n−1}(f_n(·, t, z)), where I_{n−1}(f_n(·, t, z)) is the (n − 1)-times iterated integral with respect to the first n − 1 pairs of variables of f_n, keeping the last pair (t_n, z_n) = (t, z) as a parameter. We need the following result.

Theorem B (integration by parts (duality) formula for Ñ). Suppose h(t, z) satisfies (3.15). Then E[F ∫_0^T ∫_{R_0} h(t, z) Ñ(dt, dz)] = E[∫_0^T ∫_{R_0} h(t, z) D_{t,z} F ν(dz) dt].
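Analogously, Theorem B can be sanity-checked in the simplest jump setting. Assume ν = λδ_1 (unit jumps at rate λ, an illustrative choice), F = η(T) = N_T, and h(t, z) = z; then D_{t,z} F = z and both sides of the duality formula reduce to λT:

```python
import numpy as np

# Monte Carlo check of the Poisson duality formula
#   E[ F * int int h(t,z) Ntilde(dt,dz) ] = E[ int int D_{t,z}F h(t,z) nu(dz) dt ]
# for nu = lam*delta_1 (unit jumps of rate lam), F = eta(T) = N_T, h(t,z) = z.
rng = np.random.default_rng(7)
lam, T, n_samples = 2.0, 1.0, 1_000_000
NT = rng.poisson(lam * T, size=n_samples)   # N_T ~ Poisson(lam*T)
lhs = np.mean(NT * (NT - lam * T))          # E[F * int z Ntilde] = Var(N_T)
rhs = lam * T                               # int_0^T int z^2 nu(dz) dt
```

Here the left-hand side is just the variance of N_T, so the check reduces to the familiar identity Var(N_T) = λT.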

Maximum Principles
Consider (3.1). We will make the following assumptions throughout this subsection.

3.18
(H.5) For all u ∈ A_G, the Malliavin derivatives D_t g(X^u(T)) and D_{t,z} g(X^u(T)) exist.
In the sequel, we omit the random parameter ω for simplicity. Let J(u) be defined as in (3.2).

Theorem 3.1 (Maximum principle I for optimal control of stochastic Volterra equations). (1) Suppose that u is a critical point for J(u), in the sense that (d/dy) J(u + yβ)|_{y=0} = 0 for all bounded β ∈ A_G. Then (3.19) holds. (2) Suppose that (3.19) holds for some u ∈ A_G. Running the arguments in the proof of (1) backwards, we see that (3.20) holds for all bounded β ∈ A_G of the form β(s) = α χ_{(t,T]}(s). This is sufficient because the set of linear combinations of such β is dense in A_G.
Next we consider the case where the coefficients are independent of x; the maximum principle then simplifies significantly. Fix a control u ∈ A_G with corresponding state process X(t), and define the associated Hamiltonian process H(t, u) by

3.32
Theorem 3.2 (Maximum principle II for optimal control of stochastic Volterra equations). Suppose that f, b, σ, and θ are all independent of x. Then the following are equivalent:
(i) u is a critical point for J(u);
(ii) for each t ∈ [0, T],

3.35
Taking the right derivative with respect to v at the point t, we obtain (3.33).

Applications to Stochastic Delay Control
We now apply the general maximum principle for optimal control of Volterra equations to the stochastic delay problem (1.2)-(1.3) of the Introduction, using the equivalence between (1.2) and (1.4). Note that in this case the dynamics (1.2) contain the delay term A_2(t) X^u(t − h) and the jump term C_2(t, z) Ñ(dt, dz), with initial condition X^u(t) = η(t) for t ∈ [−h, 0], and B_1(t) and B_2(t) are deterministic bounded functions.
Because of the similarity, let us prove only that M_1 is well defined. Repeatedly using (2.13), we obtain the corresponding series expansion. Here f_n is symmetric with respect to the pairs of variables (t_1, z_1), (t_2, z_2), ..., (t_n, z_n), and I_n(f_n) denotes the iterated integral. The standing assumptions are: (H.1) f and g : R × Ω → R are continuously differentiable with respect to x ∈ R and u ∈ R; (H.2) for all t ∈ [0, T] and all bounded G_t-measurable random variables α, the control β(s) = α χ_{(t,T]}(s) belongs to A_G; (H.3) for all u, β ∈ A_G with β bounded, there exists δ > 0 such that u + yβ ∈ A_G for all y ∈ (−δ, δ).
As in the proof of Lemma 2.2, we can check that the above series converges in L¹(Ω) under assumption (H.6). We substitute (3.28) into (3.23), compare (3.1), (3.2) with (1.2), (1.3), and conclude that the system (1.4) satisfies the conditions of Theorem 3.2. By (3.32) we get the Hamiltonian, and therefore, by Theorem 3.2, we get the following condition for an optimal harvesting rate u(t): (4.3). Now suppose that U_1 and U_2 are stochastic utilities of the form (4.4), (4.5), where γ(t, ω) > 0 is F_t-adapted, ζ(ω) is F_T-measurable, and U_1, U_2 are concave C¹-functions on (0, ∞) and R, respectively. Then (4.3) simplifies to

U_1'(t, u(t)) E[γ(t) | G_t] = −K(T, t) E[ζ U_2'(X(T)) | G_t].  (4.6)

This gives a relation between the optimal control u(t) and the corresponding optimal terminal wealth X(T). In particular, if (4.7) holds, then the optimal consumption rate u(t) for the stochastic delay system (1.2), (4.4), (4.5), (4.7) and the performance functional (1.3) is given by (4.8).
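Relations of the type (4.6) determine the optimal consumption rate by inverting the marginal utility U_1'. As an illustrative sketch, assume U_1(t, u) = e^{−δt} ln u (so U_1'(t, u) = e^{−δt}/u) and write the right-hand side of the first-order condition as a given positive process λ(t); then u(t) = e^{−δt}/λ(t). Both the utility choice and the name λ are assumptions for illustration.

```python
import numpy as np

def optimal_consumption(lam, t, delta=0.05):
    """Invert the first-order condition U1'(t, u(t)) = lam(t) for the
    assumed utility U1(t, u) = exp(-delta*t) * log(u), whose marginal
    utility is U1'(t, u) = exp(-delta*t)/u; hence u(t) = exp(-delta*t)/lam(t).
    Here lam(t) stands in for the conditional-expectation term of (4.6)."""
    return np.exp(-delta * np.asarray(t)) / np.asarray(lam, dtype=float)

t = np.linspace(0.0, 1.0, 5)
u = optimal_consumption(np.full(5, 2.0), t, delta=0.0)   # constant lam = 2
```

With δ = 0 and constant λ ≡ 2 this gives the constant rate u ≡ 1/2, while a positive discount rate δ makes the optimal consumption rate decrease over time for constant λ.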