Numerical Optimal Control for Problems with Random Forced SPDE Constraints

A numerical algorithm for solving optimization problems constrained by a stochastic diffusion equation is proposed. First, the random and deterministic variables are separated via the Karhunen-Loeve expansion. The problem is then discretized using the finite element method in the spatial part and the polynomial chaos expansion in the stochastic part. This process leads to an optimal control problem with a large-scale system as its constraint. To overcome these difficulties, the adjoint technique for derivative computation is combined with a preconditioned Newton conjugate gradient method for the optimal control problem. Numerical simulations show that this hybrid approach is efficient and simple to implement.


Introduction
Physical problems can, in many cases, be formulated as optimization problems. These problems are used to gain a deeper understanding of physical systems. Typically, they depend on models which, in many cases, are deterministic. Uncertainty, however, may enter everywhere from the modeling assumptions to the experimental data. To accommodate these uncertainties, many practitioners have developed stochastic models. Making sense of and solving such models requires, in addition to handling the randomness, some additional theory for the resulting optimization problems. This paper proposes an adjoint-based approach for solving optimization problems governed by stochastic diffusion equations. To deal with a stochastic partial differential equation (SPDE) as a constraint in an optimization problem, one may first solve the SPDE. Since the systems provided by this approach are random and nonlinear, such problems are difficult to handle and very challenging. One of the most challenging examples in this area is the control of stochastic diffusion equations with random forcing [1]. Polynomial chaos expansion (PCE) [2] provides a good direction for the numerical solution of nonlinear SPDEs. In the presence of random forcing, PCE appears to be a more accurate and efficient numerical method than Monte Carlo simulation. In fact, the PCE can be interpreted as a Fourier expansion in the probability space. In particular, the aim of this work is the numerical solution of distributed control problems involving stochastic diffusion equations of the form

−∇ · (a(x, ω)∇u(x, ω)) = f(x)  in D × Ω,
u(x, ω) = 0  on ∂D × Ω,  (1)

where D is the spatial domain, ∂D is the boundary of the spatial domain, Ω is the probability space, x ∈ D, ω ∈ Ω, u is the solution of the SPDE, f is the source function, and a(x, ω) is the permeability field of the problem. As an important assumption on the stochastic diffusion equation (1), the random coefficient a(x, ω) is assumed to satisfy the elliptic condition [3]; that is, there exists a constant a_min such that

0 < a_min ≤ a(x, ω)  ∀(x, ω) ∈ D × Ω.
The Karhunen-Loeve expansion (KLE) of correlated random functions is used to separate the random and deterministic parts of the random coefficient [4, 5]. A finite dimensional approximation a_N(x, ω) is therefore obtained by truncating the Karhunen-Loeve expansion of the permeability field a(x, ω). Then, (1) is approximated by

−∇ · (a_N(x, ω)∇u) = f.  (4)

For the approximation of the function u with n Gaussian random variables and polynomials of highest degree p, the number of PCE coefficients is (n + p)!/(n! p!) [6]. Now, the weak formulation of (4) and then the Galerkin finite element method are used to solve the SPDE problem approximately. Optimization with an SPDE constraint is, by nature, infinite dimensional, large, and complex. There are two numerical approaches for solving this problem [2, 7]:
(i) discretize then optimize (DO): for problems that can be trivially discretized first;
(ii) optimize then discretize (OD): for problems that are differentiable.
Our intention is to work with discretize-then-optimize algorithms for smooth functions; thus, we follow the DO approach. The Galerkin finite element method, incorporated with the polynomial chaos expansion, is used to discretize the SPDE. By computing the gradient and Hessian of the augmented Lagrangian function, a preconditioned Newton conjugate gradient method is used to compute the best right-hand-side vector (the control values at the grid points), such that the corresponding solution of the SPDE (the state values) stays as near as possible to the desired function and has the lowest cost in the objective function. The organization of this paper is as follows: in Section 2, the problem formulation together with the KLE, the PCE, and the stochastic Galerkin method is presented. In Section 3, distributed control of the randomly forced diffusion equation with some numerical examples is proposed.

Problem Formulation
We consider optimal control problems of the form

min_{u∈U, z∈Z} J(u, z)  subject to  c(u, z) = 0,  (5)

where c is an operator which stands as the constraint, the state space is U = H^1_0(D) ⊗ L^2(Ω), D is the spatial domain, Ω is the probability space, and ⊗ denotes the tensor product.
To provide a concrete setting for our discussion, the stochastic diffusion equation (6) is considered as the constraint operator, where a(x, ω) is a correlated random field. Hence, the solutions u(x, ω) of the SPDE (6) are also random fields. It is assumed that the control satisfies z(x) ∈ L^2(D). The weak form of the constraint (6) can be defined for all v ∈ U as

∫_Ω ∫_D a(x, ω)∇u(x, ω) · ∇v(x, ω) dx dP(ω) = ∫_Ω ∫_D z(x) v(x, ω) dx dP(ω).  (7)

It is assumed that the objective, or cost, functional is

J(u, z) = (1/2) E[∫_D (u − û)^2 dx] + (α/2) ∫_D z^2 dx,  (8)

where E(·) denotes the expected value, û is a given desired function, and α is the regularization parameter. Obviously, controlling the solution of this problem leads to controlling the statistics of the solution, for example, the expected value of the solution, given by

E[u](x) = ∫_Ω u(x, ω) dP(ω).

So, the optimal control problem can be interpreted as follows: given the random field a(x, ω), minimize the objective function (8) over all z ∈ Z, subject to u ∈ U, such that u for the given random field a(x, ω) satisfies the weak formulation (7) [1].
2.1. Basic Definitions. In (5), J and c are assumed to be continuously Fréchet differentiable and, for each z ∈ Z, the state equation c(u, z) = 0 is assumed to possess a unique corresponding solution u(z) ∈ U. As for deterministic PDEs, existence and uniqueness of the solution of this kind of SPDE can be established using the Lax-Milgram theorem. Thus, there is a solution operator z ∈ Z → u(z) ∈ U. In addition, it is assumed that c_u(u(z), z) is continuously invertible. Then, existence and uniqueness of the solution of the above optimal control problem, as well as the continuous differentiability of u(z), are ensured by the implicit function theorem [8].
Using the KLE (15), the stochastic process can be represented as a series of uncorrelated random variables. Since the basis functions φ_k(x) are deterministic, the spatial dependence of the random process is resolved by them. The convergence of the KLE to the random process a(x, ω) holds in the mean square sense,

lim_{N→∞} E[(a(x, ω) − a_N(x, ω))^2] = 0,

where a_N is the N-term truncated KLE [4, 5].
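A truncated KLE of this kind can be sketched numerically by discretizing the covariance kernel on a grid and solving the resulting eigenproblem. The sketch below is illustrative, not the paper's implementation: the grid, the exponential kernel, and all names are assumptions.

```python
import numpy as np

def truncated_kle(cov, x, n_terms, rng=None):
    """Sample a random field from its covariance kernel via a truncated KLE.

    Discretizes the Fredholm eigenproblem int C(x, y) phi(y) dy = lam phi(x)
    on the uniform grid x (a simple Nystrom scheme) and keeps the n_terms
    largest eigenpairs.
    """
    h = x[1] - x[0]                        # uniform grid spacing
    C = cov(x[:, None], x[None, :])        # covariance matrix C_ij = C(x_i, x_j)
    lam, phi = np.linalg.eigh(h * C)       # symmetric eigenproblem
    idx = np.argsort(lam)[::-1][:n_terms]  # largest eigenvalues first
    lam = lam[idx]
    phi = phi[:, idx] / np.sqrt(h)         # approximately L2-normalized modes
    rng = rng or np.random.default_rng(0)
    xi = rng.standard_normal(n_terms)      # uncorrelated standard normals
    # a_N(x) = sum_k sqrt(lam_k) * phi_k(x) * xi_k  (zero-mean part of the field)
    field = phi @ (np.sqrt(lam) * xi)
    return field, lam

# Exponential covariance C(x, y) = exp(-|x - y|) on [-1, 1]
x = np.linspace(-1.0, 1.0, 200)
field, lam = truncated_kle(lambda s, t: np.exp(-np.abs(s - t)), x, n_terms=3)
```

The rapid decay of the eigenvalues `lam` is what justifies truncating the expansion after a few terms.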

Polynomial Chaos Expansion (PCE).
There are problems, such as the solution of a PDE with random inputs, for which the covariance function of a random process u(x, ω), x ∈ D, is not known. The solution of such problems can be represented using a polynomial chaos expansion (PCE),

u(x, ω) = Σ_{i} u_i(x) Ψ_i(ξ(ω)),

where the functions u_i(x) are deterministic coefficients, ξ is a vector of orthonormal random variables, and Ψ_i(ξ) are multidimensional orthogonal polynomials that satisfy the orthonormality property E[Ψ_i Ψ_j] = δ_ij. The convergence of the PCE for a random quantity in L^2 is ensured by the Cameron-Martin theorem [6, 9]. Hence, this convergence justifies a truncation of the PCE to a finite number of terms P + 1, where the value of P is determined by the highest polynomial degree p used to represent u and by the number n of random variables (the length of ξ) through the formula P + 1 = (n + p)!/(n! p!). Generally, the value of n equals the number of uncorrelated random variables in the system or, equivalently, the truncation length of the truncated KLE.

Typically, the value of p is chosen by some heuristic method. Indeed, in the case of p = 1 and n random variables, the KLE is a special case of the PCE [9, 10].
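The truncation formula P + 1 = (n + p)!/(n! p!) is just the binomial coefficient C(n + p, n), which can be checked directly (the function name here is illustrative):

```python
from math import comb

def pce_terms(n, p):
    """Number of multivariate PCE basis polynomials in n random variables
    of total degree at most p:  P + 1 = (n + p)! / (n! p!) = C(n + p, n)."""
    return comb(n + p, n)

# The two examples later in the paper use (n, p) = (3, 5) and (3, 4):
print(pce_terms(3, 5))  # 56 basis functions, i.e. P = 55
print(pce_terms(3, 4))  # 35 basis functions, i.e. P = 34
```

This growth is the main cost driver of the stochastic Galerkin method: the global system couples all P + 1 chaos modes.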

Distributed Optimal Control Problem
Using the stochastic Galerkin method for general objective functions, computing the derivatives becomes very tedious and rather messy. Consider the following optimal control problem: minimize the objective (8) subject to the weak constraint (7). The solutions of linear elliptic SPDEs live in the space U = H^1_0(D) ⊗ L^2(Ω), and the control space is Z = L^2(D). Denoting the constraint equation by c(u, z) = 0, it is usually possible to invoke the implicit function theorem to find a solution u(z) of the constraint equation. This leads to the implicitly defined objective function Ĵ(z) := J(u(z), z). We now focus on computing derivatives of Ĵ [11]. For any direction d ∈ Z,

⟨Ĵ′(z), d⟩_{Z*,Z} = ⟨J_u(u(z), z), u′(z)d⟩_{U*,U} + ⟨J_z(u(z), z), d⟩_{Z*,Z},

where U* and Z* are the dual spaces of U and Z, respectively. Computing the derivative of the constraint c with respect to z in the direction d and applying it to the implicit solution of c(u(z), z) = 0 yield

c_u(u(z), z)u′(z)d + c_z(u(z), z)d = 0.

The weak form of the constraint function is defined as follows: for all v ∈ U,

⟨c(u, z), v⟩ = ∫_Ω ∫_D a(x, ω)∇u · ∇v dx dP(ω) − ∫_Ω ∫_D z v dx dP(ω).

Thus, the constraint function is c : U × Z → U*. Since the constraint function c acts linearly with respect to u and z, the corresponding derivative of c with respect to z in the direction d ∈ Z, for all v ∈ U, is given by

⟨c_z(u, z)d, v⟩ = −∫_Ω ∫_D d v dx dP(ω).

Similarly, the derivative of c with respect to u, for any direction w ∈ U and for all v ∈ U, is given by

⟨c_u(u, z)w, v⟩ = ∫_Ω ∫_D a(x, ω)∇w · ∇v dx dP(ω).

Both of these derivatives are symmetric, so c_u(u, z)* = c_u(u, z) and c_z(u, z)* = c_z(u, z). From here, the adjoint state λ can be computed from

c_u(u(z), z)* λ = −J_u(u(z), z).

By the chain rule, the derivative of J with respect to u in the direction w is

⟨J_u(u, z), w⟩ = E[∫_D (u − û) w dx],

and, similarly, the derivative of J with respect to z in the direction d ∈ Z is

⟨J_z(u, z), d⟩ = α ∫_D z d dx.

With these computations, one can compute the first derivative of Ĵ(z) via the adjoint approach. The aim is to derive the discrete versions of the objective function, the gradient, and the Hessian-times-vector calculation corresponding to the stochastic Galerkin solution technique for the SPDE constraint c(u, z) = 0.
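The adjoint calculus above can be illustrated on a small finite dimensional analogue. The sketch assumes a generic linear state equation K u = B z (a stand-in for the discretized SPDE) and the quadratic objective of the paper; all matrices and names are illustrative, not the paper's discretization:

```python
import numpy as np

# Objective: J(u, z) = 0.5*||u - u_hat||^2 + 0.5*alpha*||z||^2,
# constraint: c(u, z) = K u - B z = 0.
rng = np.random.default_rng(1)
n, m = 8, 5
K = rng.standard_normal((n, n)) + n * np.eye(n)   # well-conditioned "stiffness" matrix
B = rng.standard_normal((n, m))
u_hat = rng.standard_normal(n)                    # desired state
alpha = 1e-2

def reduced_objective(z):
    u = np.linalg.solve(K, B @ z)                 # state solve u(z)
    return 0.5 * np.sum((u - u_hat) ** 2) + 0.5 * alpha * np.sum(z ** 2)

def reduced_gradient(z):
    u = np.linalg.solve(K, B @ z)                 # forward (state) solve
    lam = np.linalg.solve(K.T, -(u - u_hat))      # adjoint solve: K^T lam = -J_u
    return alpha * z - B.T @ lam                  # J_hat'(z) = J_z + c_z^T lam

z0 = rng.standard_normal(m)
g = reduced_gradient(z0)                          # one adjoint solve, whole gradient
```

The point of the adjoint approach is visible here: one extra linear solve yields the full gradient, independently of the dimension of the control z.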
Suppose V_h ⊂ H^1_0(D) is a finite dimensional subspace of dimension N and S_h ⊂ L^2(Ω) is a finite dimensional subspace of dimension P; then V_h ⊗ S_h is a finite dimensional subspace of the state space U. Similarly, let Z_h ⊂ Z be a finite dimensional subspace of the control space. For the stochastic Galerkin method, S_h is taken to be a space of chaos polynomials of bounded total degree, and V_h is any finite element space; here, V_h is the space of piecewise linear functions built on a given mesh T_h. We choose the system of polynomials that are orthonormal with respect to the underlying probability measure as a basis for S_h; that is, S_h = span{Ψ_1(ξ), . . ., Ψ_P(ξ)}, where E[Ψ_i Ψ_j] = δ_ij. The discretized optimization problem in the stochastic Galerkin framework is a minimization over the control coefficients, where the state vector u⃗ is the stochastic Galerkin solution of the state equation (37). The blocks of the global matrix K have the form

K_ij = E[Ψ_i Ψ_j] A_0 + Σ_{k=1}^{n} E[ξ_k Ψ_i Ψ_j] A_k,

where the A_k are the finite element matrices associated with the terms of the truncated KLE a_N. First, in order to compute the derivatives of the objective function, we compute the adjoint state [8]. Indeed, the adjoint state, in the infinite dimensional formulation, solves the following equations:

−∇ · (a(x, ω)∇λ(x, ω)) = −(u(x, ω) − û(x))  in D × Ω,
λ(x, ω) = 0  on ∂D × Ω.
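The block structure of the stochastic Galerkin matrix is a sum of Kronecker products, K = Σ_k G_k ⊗ A_k, where the A_k are deterministic finite element matrices and [G_k]_ij = E[ξ_k Ψ_i Ψ_j] couples the chaos polynomials. The sketch below assembles such a matrix with random symmetric positive definite stand-ins (the matrices are illustrative; only the Kronecker assembly itself is the point):

```python
import numpy as np

rng = np.random.default_rng(0)
n_fem, n_pc, n_kl = 4, 3, 2          # FEM dofs, chaos modes, KLE terms (toy sizes)

def spd(n):
    """Random symmetric positive definite stand-in matrix."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

A = [spd(n_fem) for _ in range(n_kl + 1)]                # A_0 (mean) + KLE modes
G = [np.eye(n_pc)] + [spd(n_pc) for _ in range(n_kl)]    # G_0 = I for orthonormal chaos

# Global system of order n_pc * n_fem; block (i, j) is sum_k [G_k]_ij * A_k.
K = sum(np.kron(Gk, Ak) for Gk, Ak in zip(G, A))
print(K.shape)
```

In practice K is never formed densely; its Kronecker structure is exploited so that matrix-vector products cost only a few small matrix multiplications.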
Thus, as in (37), the block system for the adjoint equations (58) can be written in the form K λ⃗ = r⃗, where λ⃗ = (λ⃗_1, . . ., λ⃗_P)^T and the right-hand side r⃗ is assembled from the vectors b⃗ = {∫_D û ϕ_l dx} and the finite element mass matrix M = {∫_D ϕ_l ϕ_m dx}, which is of order N × N. Now, the derivative of Ĵ(z) in the direction of the nodal basis function ϕ_i(x), for all i = 1, . . ., N, with the computed adjoint state, is

⟨Ĵ′(z), ϕ_i⟩ = ∫_D (α z − E[λ]) ϕ_i dx.

Therefore, the gradient can be computed from these directional derivatives. Now, for any vector v ∈ Z_h, it is possible to write v as v = Σ_{i=1}^{N} v_i ϕ_i. In order to compute the Hessian times a vector, that is, to multiply the Hessian of Ĵ by some vector v ∈ Z, the equation c_u(u(z), z)w = −c_z(u(z), z)v must be solved for w. Similar to the adjoint computation, w solves a linear system K w⃗ = s⃗, where s⃗ = (s⃗_1, . . ., s⃗_P) and each block s⃗_i, i = 1, . . ., P, is assembled from v. Then, solving c_u(u(z), z)* q = J_uu(u(z), z)w for q requires the solution of the linear system K q⃗ = t⃗, where t⃗_i = M w⃗_i for all i = 1, . . ., P. Hence, in the direction ϕ_i, for i = 1, . . ., N, Ĵ″(z)v can be approximated from the regularization term and the term assembled from q⃗. Therefore, having (62) and (68), we can use iterative solvers for the Newton equation

Ĵ″(z_k) d_k = −Ĵ′(z_k).  (69)

Now, by using the preconditioned conjugate gradient (PCG) method, the Newton equation (69) is solved approximately; the PCG iteration is truncated once the residual of the Newton system is sufficiently small. In real-world computations, it is advisable to employ a globalization technique for Newton's method. Considering ease of implementation and relatively low computational cost, line search techniques are a popular choice. The mission of a line search algorithm is to find a step size s_k and to use it to generate the iterate z_{k+1} = z_k + s_k d_k. The Armijo condition (or sufficient decrease condition) that the step size is required to satisfy is

Ĵ(z_k + s_k d_k) ≤ Ĵ(z_k) + c s_k ⟨Ĵ′(z_k), d_k⟩,

where c ∈ (0, 1) and typically c is quite small, for example, c = 10^{-4} [12, 13].
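The inexact Newton-CG iteration with Armijo backtracking described above can be sketched on a simple smooth convex stand-in for Ĵ(z); in the paper the gradient and the Hessian-times-vector products would come from adjoint solves instead, so everything below is illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 6
Q = rng.standard_normal((n, n))
Q = Q @ Q.T + np.eye(n)                              # SPD quadratic part

f = lambda z: 0.25 * np.sum(z**4) + 0.5 * z @ Q @ z  # convex model objective
grad = lambda z: z**3 + Q @ z
hess_vec = lambda z, v: 3 * z**2 * v + Q @ v         # only H*v products are needed

def cg(hv, b, tol=1e-10, maxit=50):
    """Plain conjugate gradients for H d = b using only H*v products."""
    d, r = np.zeros_like(b), b.copy()
    p, rs = r.copy(), r @ r
    for _ in range(maxit):
        Hp = hv(p)
        a = rs / (p @ Hp)
        d, r = d + a * p, r - a * Hp
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p, rs = r + (rs_new / rs) * p, rs_new
    return d

z = np.ones(n)
for _ in range(20):                                  # Newton iterations
    g = grad(z)
    if np.linalg.norm(g) < 1e-10:
        break
    d = cg(lambda v: hess_vec(z, v), -g)             # inexact Newton direction
    step, c = 1.0, 1e-4                              # Armijo constant c = 1e-4
    while f(z + step * d) > f(z) + c * step * (g @ d):
        step *= 0.5                                  # backtracking line search
    z = z + step * d
```

Only Hessian-vector products appear, which is exactly why the method scales to the large Galerkin systems: the Hessian itself is never formed.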
Algorithm 1: (1) given z⃗_0 and tol > 0, set k = 0; (2) compute K, M, b⃗, u⃗, and λ⃗; … .

The strategy here is to introduce P = Â ⊗ G_0 as a preconditioner, such that Â minimizes ‖K − A ⊗ G_0‖_F over all A, where, as mentioned for the PCE, P is the degree of the polynomial chaos truncation and ‖ ⋅ ‖_F denotes the Frobenius norm. The closed form of the solution can be written as follows [14]:

Â = Σ_k (tr(G_k^T G_0) / tr(G_0^T G_0)) A_k.

Here, tr(G_k^T G_0) = Σ_{i,j} [G_k]_{ij} [G_0]_{ij}, and hence the coefficients in the above equality can be computed straightforwardly. In addition, since Â and G_0 are symmetric and positive definite, the preconditioner P = Â ⊗ G_0 also has these properties [15, 16].

Example 1. Consider the distributed optimal control problem above, where the desired function û is the exponential function shown in Figure 1(a). The random field a(x, ω) is characterized by its mean and covariance function; following Section 2.2, the truncated KLE can be represented accordingly. The aim is to calculate u_opt and z_opt such that, for all u ∈ H^1_0(D) ⊗ L^2(Ω) and all z ∈ L^2(D),

J(u_opt, z_opt) ≤ J(u, z).
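The Frobenius-optimal Kronecker preconditioner described above admits the stated closed form, which can be checked on small random stand-in matrices (sizes and matrices here are illustrative, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(3)
n_fem, n_pc, n_kl = 3, 4, 2

def spd(n):
    """Random symmetric positive definite stand-in matrix."""
    M = rng.standard_normal((n, n))
    return M @ M.T + n * np.eye(n)

A = [spd(n_fem) for _ in range(n_kl + 1)]
G = [np.eye(n_pc)] + [spd(n_pc) for _ in range(n_kl)]
K = sum(np.kron(Gk, Ak) for Gk, Ak in zip(G, A))     # K = sum_k G_k (x) A_k

# Closed-form minimizer of ||K - A (x) G_0||_F over A (cf. [14]):
# A_hat = sum_k tr(G_k^T G_0) / tr(G_0^T G_0) * A_k.
w = [np.trace(Gk @ G[0]) / np.trace(G[0] @ G[0]) for Gk in G]
A_hat = sum(wk * Ak for wk, Ak in zip(w, A))
P = np.kron(G[0], A_hat)                             # preconditioner, never inverted densely
```

Because P is itself a Kronecker product of small SPD factors, applying P^{-1} inside PCG costs only two small triangular solves per block, which is what makes the preconditioner cheap.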
The eigenpairs {λ_n, φ_n} in the truncated KLE solve the integral equation

∫_D C(x, y) φ_n(y) dy = λ_n φ_n(x).

For this special case of the covariance function, we have explicit expressions for λ_n and φ_n [6]. Let ω_even and ω_odd solve the corresponding transcendental equations; then the even- and odd-indexed eigenfunctions are given in closed form in terms of cosines and sines of ω_even x and ω_odd x, respectively.
First, we find numerical approximations of ω_even and ω_odd with the bisection method; then, with the eigenpairs evaluated from these ω_even and ω_odd, choosing dimension n = 3 and order p = 5 (so that P = 55), we construct the KLE. Taking α = 0.00005, Algorithm 1 is used to compute u_opt and z_opt. The computed optimal control and corresponding state, together with the global matrix sparsity, the standard deviation and expected value of the solution, the adjoint values, and the initial state approximation of the desired function, are shown in Figure 1.
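The bisection step can be sketched as follows, assuming the standard closed-form KLE of the exponential covariance C(x, y) = exp(−c|x − y|) on [−a, a] (cf. Ghanem-Spanos and [6]), for which the even-indexed frequencies solve c − ω tan(ωa) = 0, the odd-indexed ones solve ω + c tan(ωa) = 0, and the eigenvalues are λ = 2c/(ω² + c²); the parameter values are illustrative:

```python
import numpy as np

a, c = 1.0, 1.0                      # domain half-width and inverse correlation length

def bisect(g, lo, hi, tol=1e-12):
    """Plain bisection for a sign-changing function g on [lo, hi]."""
    assert g(lo) * g(hi) < 0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

g_even = lambda w: c - w * np.tan(w * a)
# The first even root lies in (0, pi/(2a)), where w*tan(w*a) sweeps (0, +inf).
w1 = bisect(g_even, 1e-9, np.pi / (2 * a) - 1e-9)
lam1 = 2 * c / (w1**2 + c**2)        # largest KLE eigenvalue
```

Subsequent roots are bracketed in the later branches of the tangent, one per interval between its poles, so the same bisection routine yields as many eigenpairs as the truncation requires.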
In Figure 1(a), the desired function û(x, y) is plotted for x_i = y_i = −1 + 0.04i, i = 0, 1, 2, . . ., 50. The initial state is computed and plotted in Figure 1(b) by solving (37) with z⃗ = 0. The approximation u_opt corresponding to the optimal control z_opt is calculated by Algorithm 1, and its graph is depicted in Figure 1(c). As expected, using the proposed Algorithm 1, even from a very distant initial state, the approximated state solution reaches the desired function.
In Figure 1(d), z_opt is depicted. Substantial changes occur during the computation, starting from the initial guess z⃗ = 0, before the optimal control function z_opt is reached. The adjoint state is computed and plotted in Figure 1(e). As expected from (58), the adjoint state corresponding to u_opt must approximate the initial state (see Figures 1(b) and 1(c)). Figures 1(f) and 1(g) present contour plots of the expectation and standard deviation of the optimal state. Finally, Figure 1(h) shows the sparsity pattern of the coefficient matrix together with the number of nonzero components used in the computation.
In Example 1 we considered an exponential desired function with V-shaped sharp points. In the next example, a smooth desired function and boundary conditions are considered. This is the case in which Algorithm 1 is most efficient to use.
Example 2. Consider the optimal control problem of Example 1, in which the right-hand-side function as well as the boundary conditions is replaced by û(x, y) = 2π² sin(πx) sin(πy), with n = 3 and p = 4, so that P = 34. Taking α = 0.00001, the computed optimal control and corresponding state, together with the global matrix sparsity, the standard deviation and expected value of the solution, the adjoint values, and the initial state approximation of the desired function, are shown in Figure 2.

Conclusion
The solution of SPDE-constrained optimization problems is a challenging computational task of recent interest. Here, we considered distributed control problems in which the SPDE is a stochastic diffusion equation. Since the saddle point system obtained from the KKT optimality conditions of the problem is very large and expensive to solve, we used the strategy of the adjoint technique combined with a preconditioned Newton conjugate gradient method, which solves the problem iteratively at low computational cost. After separating the stochastic and deterministic parts of the SPDE using the KLE and discretizing them by the Wiener chaos expansion (WCE) and the Galerkin finite element method, respectively, we used the adjoint technique to compute the gradient and Hessian of the discretized optimization problem. Using the preconditioned Newton conjugate gradient method, the optimal control and state of the problem were calculated numerically. Two numerical examples were given to illustrate the correspondence between the theoretical and numerical approaches.