Optimal control with time-delays via the penalty method

We prove necessary optimality conditions of Euler-Lagrange type for a problem of the calculus of variations with time delays, where the delay in the unknown function is different from the delay in its derivative. Then, a more general optimal control problem with time delays is considered. The main result is a convergence theorem that allows one to obtain a solution to the delayed optimal control problem by considering a sequence of delayed problems of the calculus of variations.


Introduction
Over the past years, there has been increasing interest in time-delay problems of the calculus of variations and control [2,5,7,13]. This interest is explained by their importance in control and engineering [3,4,10,11]. Indeed, time delays are inherent in various real systems, such as control systems and optimal control problems in engineering [8,9].
In this paper we improve recent optimality conditions for time-delay variational problems. In [6] necessary optimality conditions of Euler-Lagrange, DuBois-Reymond and Noether type were obtained for problems of the calculus of variations with a time delay. The results of [6] were then extended to delayed variational problems with higher order derivatives in [5]. Here we model time-delay variational problems in a more realistic way: while in [5,6] the delay on functions and their derivatives (and control variables) is always the same, here we consider different delays for the functions and derivatives/controls.
The text is organized as follows. In Section 2 we formulate the delayed problem of the calculus of variations, where the delay in the unknown functions is different from the delay in their derivatives. The main result of that section is Theorem 2.4, which provides necessary optimality conditions of Euler-Lagrange type. Control strategies via an exterior penalty method are then investigated in Section 3. The idea is to replace the optimal control problem with time delays by a sequence of delayed problems of the calculus of variations. The main result is a convergence theorem (Theorem 3.3) that allows one to obtain a solution to delayed optimal control problems with linear delayed control systems by considering a sequence of variational problems with time delays of the type studied in Section 2. We end with conclusions in Section 4.

Calculus of variations with time delays
We consider the following fundamental problem of the calculus of variations with time delays, where the delay in the function we are looking for is different from the delay in its derivative: the functional (1) subject to the conditions (2) and (3). Here $L$ is the Lagrangian, $T > 0$ is fixed in $\mathbb{R}$, $\tau_1$ and $\tau_2$ are two given positive real numbers such that $\tau_2 < \tau_1 < T$, and $\theta_1(\cdot)$ and $\theta_2(\cdot)$ are given piecewise smooth functions.

Let $I := [0, T]$, let $L^2(I, \mathbb{R}^N)$ be the Lebesgue space of measurable functions $x$ such that $\int_I |x(t)|^2 \, dt < \infty$, and let $H^1(I, \mathbb{R}^N)$ be the Sobolev space of functions having their weak first derivative in $L^2(I, \mathbb{R}^N)$ and represented by $x(t) = x(\tau) + \int_\tau^t \dot{x}(s) \, ds$ for all $\tau$ and $t$ in $I$. We denote by
• $H$ the space of all functions $x : [-\tau_1, T] \to \mathbb{R}^N$ such that $x_{/I_1} \in L^2(I_1, \mathbb{R}^N)$, $x_{/I_2} \in H^1(I_2, \mathbb{R}^N)$ and $x_{/I} \in H^1(I, \mathbb{R}^N)$, which is a Hilbert space with the norm induced by these three restrictions;
• $D := \{ x(\cdot) \in H : x_{/I_1} = \theta_1, \ x_{/I_2} = \theta_2, \text{ and } x(T) = \alpha \}$.

Problem (1)-(3) is then posed over the set $D$; we refer to this formulation as problem (4). We make the following assumptions on the data of problem (4):
$(A_1)$ the Lagrangian $L$ is a $C^1$ Carathéodory mapping, i.e., it is of class $C^1$ in $(a, \bar{a}, b, \bar{b})$ for almost all $t \in [0, T]$ and is measurable in $t$ for every $(a, \bar{a}, b, \bar{b})$;
$(A_2)$ there exist $\gamma_i(\cdot) \in L^2(I, \mathbb{R}_+)$, $i = 1, \ldots, 5$, such that, for a.e. $t \in I$, $|L(t, a, \bar{a}, b, \bar{b})| \le \gamma_1(t)$, with analogous bounds on the partial derivatives $\partial_i L$ in terms of the remaining functions $\gamma_i$, where $\partial_i L$ is the partial derivative of $L$ with respect to its $i$th argument.
Definition 2.1 (Cone of tangents). Let $Z$ be a normed space, $A \subset Z$, and $a \in A$. The cone of tangents $T(A, a)$ is the set of all $z \in Z$ for which there exist a sequence $(a_n)$ in $A$ converging strongly to $a$ and a sequence of non-negative numbers $(\alpha_n)$ such that $\alpha_n (a_n - a) \to z$.
… a function not depending on $\lambda$, and $|\Psi_\lambda(t)| \le g(t) + 1$ for almost all $t \in [0, T]$ and $\lambda$ sufficiently small. Since $[0, T]$ has finite measure, Lebesgue's theorem allows us to pass to the limit under the integral sign as $\lambda \to 0$. The limit is the directional derivative $J'(x(\cdot); v(\cdot))$ of $J$ in the direction $v$.

To finish the proof, we need to show that $J'(x(\cdot); v(\cdot))$ is linear and bounded in $v$ and continuous in $x$. Linearity is obvious. We first prove that $J'(x(\cdot); \cdot)$ is bounded from $H$ to $\mathbb{R}$, and then prove the continuity of $J'(\cdot)$. Let $x_n(\cdot) \to x(\cdot)$ in $H$. The difference between $J'(x_n(\cdot))$ and $J'(x(\cdot))$ is estimated by a sum of four terms $I_1, \ldots, I_4$. Since $L(t, \cdot, \cdot, \cdot, \cdot)$ is $C^1$-Carathéodory, assumption $(A_2)$ and Lebesgue's theorem ensure that $I_1 + I_2 + I_3 + I_4 \to 0$. Hence $J'(x_k(\cdot)) \to J'(x(\cdot))$, and the proof is complete.
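The directional derivative appearing in the proof above can be checked numerically on a discretized functional. The following is only a minimal sketch under stated assumptions: it assumes the functional has the form $J[x] = \int_0^T L(t, x(t), x(t-\tau_1), \dot{x}(t), \dot{x}(t-\tau_2)) \, dt$ (the argument order is an assumption), and the particular Lagrangian, the grid, the values of $T$, $\tau_1$, $\tau_2$, the trajectory and the direction $v$ are all hypothetical choices made for illustration, not data from the paper. It compares the difference quotient $(J(x + \lambda v) - J(x))/\lambda$ with the sum of the partial derivatives of $L$ applied to the variations.

```python
import math

# Hypothetical discretization parameters (not taken from the paper).
T, TAU1, TAU2 = 1.0, 0.5, 0.25
N = 200                      # number of steps on [0, T]
H = T / N                    # step size
M1 = round(TAU1 / H)         # grid steps in the delay tau_1
M2 = round(TAU2 / H)         # grid steps in the delay tau_2


def forward_diff(x):
    """Forward-difference approximation of the derivative on the grid."""
    return [(x[i + 1] - x[i]) / H for i in range(len(x) - 1)]


def lagrangian(a, a_bar, b, b_bar):
    """Illustrative Lagrangian L(a, a_bar, b, b_bar); an assumption."""
    return 0.5 * (b * b + b_bar * b_bar) + a * a_bar


def grad_lagrangian(a, a_bar, b, b_bar):
    """Partial derivatives of the illustrative Lagrangian."""
    return (a_bar, a, b, b_bar)


def delayed_args(x, dx, k):
    """Return (x(t), x(t - tau_1), x'(t), x'(t - tau_2)) at t = k * H."""
    i = M1 + k               # index of time t = k * H (grid starts at -tau_1)
    return x[i], x[i - M1], dx[i], dx[i - M2]


def functional(x):
    """Left Riemann sum of the delayed functional over [0, T]."""
    dx = forward_diff(x)
    return H * sum(lagrangian(*delayed_args(x, dx, k)) for k in range(N))


def directional_derivative(x, v):
    """Discrete analogue of J'(x; v): partial derivatives times variations."""
    dx, dv = forward_diff(x), forward_diff(v)
    total = 0.0
    for k in range(N):
        grads = grad_lagrangian(*delayed_args(x, dx, k))
        variations = delayed_args(v, dv, k)
        total += sum(g * w for g, w in zip(grads, variations))
    return H * total


# Grid on [-tau_1, T]; x is an arbitrary smooth trajectory.
times = [(i - M1) * H for i in range(M1 + N + 1)]
x = [math.cos(t) for t in times]
# The direction v is zero on the history interval and (numerically) at t = T.
v = [math.sin(math.pi * t) if t > 0.0 else 0.0 for t in times]

lam = 1e-6
x_shifted = [xi + lam * vi for xi, vi in zip(x, v)]
fd = (functional(x_shifted) - functional(x)) / lam
exact = directional_derivative(x, v)
print(fd, exact)
```

Because the illustrative Lagrangian is quadratic, the two printed values differ by a term proportional to $\lambda$, so they agree more closely as $\lambda$ decreases until rounding error dominates.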

Optimal control with time delays
Now we prove the existence of an optimal solution to more general problems of optimal control with time delays. The result is obtained via the exterior penalty method [1,12] and Theorem 2.4. The optimal control problem with time delays is defined by (8)-(12). The final time $T > 0$ is fixed in $\mathbb{R}$, $\tau_1$ and $\tau_2$ are two given positive real numbers such that $\tau_2 < \tau_1 < T$ and, as before, $\theta_1(\cdot)$ and $\theta_2(\cdot)$ are given piecewise smooth functions. In the sequel, we denote by $\theta(\cdot)$ the function defined by $\theta(t) = \theta_1(t)$, $t \in I_1$, and $\theta(t) = \theta_2(t)$, $t \in I_2$.

We make the following assumptions on the data of the problem:
$(H_1)$ the mapping $l$ is a $C^1$-Carathéodory mapping, i.e., $l$ is $C^1$ in $(a, \bar{b}, c)$ for almost all $t \in [0, T]$ and is measurable in $t$ for every $(a, \bar{b}, c) \in \mathbb{R}^N \times \mathbb{R}^N \times \mathbb{R}^m$;
$(H_2)$ there exist $\gamma_i(\cdot) \in L^2(I, \mathbb{R}_+)$, $i = 1, \ldots, 5$, such that $|l(t, a, \bar{b}, c)| \le \gamma_1(t)$, together with further bounds involving the remaining functions $\gamma_i$.

Using the exterior penalty function method, we consider a sequence of unconstrained optimal control problems corresponding to (8)-(12), whose formulation includes the condition $x(t) = \theta_1(t)$ for a.e. $t \in I_1$. This sequence of unconstrained optimal control problems is denoted by $(P_n)$.

Lemma 3.1 characterizes the cone of tangents $T(U_0, u(\cdot))$. Its proof is similar to the proof of Lemma 2.2.
It is well known that the penalty function method is a very effective technique for solving constrained optimization problems via unconstrained ones. The main question is whether the solutions of the unconstrained optimal control problems converge to a solution of the original constrained problem. Before giving the convergence theorem, we begin with some preparatory results, which are a direct consequence of the necessary optimality conditions given by Theorem 2.4.

Proposition 3.2. For every $n$, if $(x_n(\cdot), u_n(\cdot)) \in D \times U_0$ is an optimal solution to $(P_n)$, then:
1. the necessary optimality conditions hold, with the data evaluated along $(t, x_n(t), \dot{x}_n(t - \tau_2), u_n(t))$;
2. there exists $M > 0$ such that $\|\varphi_n(t)\|_X \le M$ for all $t \in [0, T]$ and all $n$ sufficiently large.
Proof. 1) Let $(x_n(\cdot), u_n(\cdot)) \in D \times U_0$ be an optimal solution to $(P_n)$. Then, by Lemma 2.2, Lemma 3.1 and Theorem 2.4, we obtain the necessary conditions of item 1 for problem $(P_n)$.
2) Since $R_n \to 0$, there exists $M > 0$ such that the bound of item 2 holds for all $t \in [0, T]$ and for all $n$ sufficiently large.
We are now ready to prove the convergence theorem, which reads as follows.

Theorem 3.3. If problem (8)-(12) has a finite value, then the sequence $(x_n(\cdot), u_n(\cdot))_n$ of solutions to $(P_n)$ contains a subsequence $(x_k(\cdot), u_k(\cdot))_k$ that converges to a solution $(x(\cdot), u(\cdot))$ of problem (8)-(12).
Arguing as before and using (16) and (17), there exists $\eta > 0$ such that $\|\dot{x}_n(\cdot)\|_{L^2(I, \mathbb{R}^N)} \le \eta$ for $n$ sufficiently large. Therefore, there exists a subsequence $(\dot{x}_k(\cdot))_k$ of $(\dot{x}_n(\cdot))_n$ converging weakly to some $\sigma(\cdot) \in L^2(I, \mathbb{R}^N)$. Since $x_n(t) = \theta(0) + \int_0^t \dot{x}_n(\tau) \, d\tau$ for all $t \in I$, by the use of (15), the sequence $(x_n(\cdot))_n$ is equi-bounded and equi-continuous (because $(\dot{x}_n(\cdot))_n$ is bounded in $L^2(I, \mathbb{R}^N)$). Ascoli's theorem then yields a subsequence that converges strongly in $C(I, \mathbb{R}^N)$.
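The exterior penalty idea used in this section can be illustrated on a finite-dimensional toy problem. The sketch below is not the delayed problem $(P_n)$ of the paper: it is a hypothetical example, with made-up data `c`, `a`, `b`, in which the equality constraint $a \cdot x = b$ is replaced by the penalty term $n (a \cdot x - b)^2$. For this quadratic case the penalized minimizer is available in closed form, so one can watch the constraint violation shrink as the penalty parameter $n$ grows.

```python
def penalized_minimizer(c, a, b, n):
    """Minimize |x - c|^2 + n * (a.x - b)^2 in closed form.

    Setting the gradient to zero gives x = c - s * a with
    s = n * (a.c - b) / (1 + n * a.a).
    """
    a_dot_c = sum(ai * ci for ai, ci in zip(a, c))
    a_dot_a = sum(ai * ai for ai in a)
    s = n * (a_dot_c - b) / (1.0 + n * a_dot_a)
    return [ci - s * ai for ai, ci in zip(a, c)]


def constraint_violation(x, a, b):
    """Absolute residual |a.x - b| of the equality constraint."""
    return abs(sum(ai * xi for ai, xi in zip(a, x)) - b)


# Toy data (hypothetical): minimize |x - c|^2 subject to a.x = b.
c, a, b = [2.0, 1.0], [1.0, 1.0], 1.0
# The constrained solution is the projection of c onto the line a.x = b.
constrained_solution = [1.0, 0.0]

penalties = [1.0, 10.0, 100.0, 1e4, 1e6]
iterates = [penalized_minimizer(c, a, b, n) for n in penalties]
violations = [constraint_violation(x, a, b) for x in iterates]
for n, x, viol in zip(penalties, iterates, violations):
    print(n, x, viol)
```

Each penalized minimizer violates the constraint slightly, which is why the method is called exterior, and the violation decreases as $n$ increases. The convergence theorem of this section plays the analogous role in the infinite-dimensional delayed setting, where compactness arguments replace the closed-form solution.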

Conclusion
New optimality conditions for problems of the calculus of variations and optimal control with time delays, where the delay in the unknown function differs from the delay in its derivative/control, were obtained. The proofs are first given in the simpler context of the delayed calculus of variations, and then extended to delayed optimal control problems by using a penalty method. New results include a convergence theorem (see Theorem 3.3), which is of great practical interest because it allows one to obtain a solution to a delayed optimal control problem by considering a sequence of simpler problems of the calculus of variations.

Previous results in the literature [5,6,7] consider the delay in the unknown function to be the same as the delay in its derivative. There is, however, no justification for the delays to be the same. In contrast with those results, here we consider the case of multiple time delays. Moreover, our method of proof is completely different from the one used in the case of a single time delay, which relies on the Lagrange multiplier method. Such an approach introduces a new unknown function, the Lagrange multiplier, for which it is hard to set the interpolation space. Indeed, the Lagrange multiplier must be carefully selected in order to obtain an accurate solution. Otherwise, the resulting system of equations may become singular, in particular if the number of degrees of freedom is too large. Here we use a penalty method, which requires only the choice of one scalar parameter. Large values of this parameter are used in order to impose the boundary conditions properly. Furthermore, in our case the use of the penalty method replaces a constrained optimization problem (the delayed optimal control problem) by a sequence of unconstrained problems of the calculus of variations with time delay whose solutions converge to the solution of the original constrained problem.
As in [6], our results can be easily extended to controls with time delay.