ON THE CONVERGENCE OF MIN / SUP POINTS IN OPTIMAL CONTROL PROBLEMS



Introduction
Let X be a topological space and consider the following problem: find x̄ such that

f_0(x̄) = inf {f_0(x) : x ∈ C}, subject to x̄ ∈ C ⊂ X.    (1.1)

In case f_0 and C are convex, a variety of primal/dual numerical methods can be used to find the saddle points of a Lagrangian L associated with this problem [4, 7]. These methods take advantage of the fact that the search for the saddle points of L is unconstrained, or conducted over sets much simpler than C. Furthermore, we can introduce penalties on f_0 in a way that regularizes L and makes it smoother. This in turn leads to a more convenient optimality condition of the form (0, 0) ∈ ∂L. In the case of a non-convex problem, similar methods can be used to find the saddle points of an augmented Lagrangian, and these points can be used to obtain a solution to the original problem (see [7, 8] and [11, Chapter 10, Sections I and K*]). The primal/dual methods are often combined with approximating L. The notion of epi/hypo convergence introduced by Attouch and Wets in [2] provides a setting for constructing a sequence L_n in such a way that the saddle points (x̄_n, ȳ_n) of L_n converge to the saddle points of L. In this paper, however, we are interested in problems where L has a min/sup point rather than a saddle point, and we are also interested in approximating L with a sequence of Lagrangians of simpler forms.
The notion of lopsided convergence of bivariate functionals defined on X × Y, where X and Y are topological vector spaces, was introduced by Attouch and Wets in [1] in order to study the stability of min/sup points. A very similar notion was also introduced and studied in detail by Lignola, Loridan, and Morgan in [5, 6]. In this paper, we will use the concept of lopsided convergence to approximate control problems. However, the original definition of lopsided convergence in [1] cannot be used directly, due to the lack of compactness of the perturbation space for most control problems. Therefore, we modify the definition of lopsided convergence to better suit our applications.
In Section 3, we review different notions of convergence for bivariate functions. In Section 4, we show how the modified lopsided convergence provides a simple way to recover a number of stability results that already exist in the literature. Section 5 contains the main application; we develop a scheme for approximating non-convex optimal control problems with a sequence of finite dimensional problems. The modified definition of lopsided convergence we introduce in Section 3 has further applications in two-level programming and in problems of existence of Stackelberg equilibria in a noncompact setting. These applications, however, will be left for a subsequent paper.

Preliminaries
In this section, we review some basic definitions of set convergence that we will use in the following sections. Most of these definitions can be found in [4, 11]. Let (X, τ) be a topological vector space and let C_n be subsets of X. We define the limit inferior and the limit superior of the collection {C_n} as

τ-Liminf_n C_n = {x ∈ X : there exist x_n ∈ C_n with x_n →^τ x},
τ-Limsup_n C_n = {x ∈ X : there exist a subsequence {n_k} and x_{n_k} ∈ C_{n_k} with x_{n_k} →^τ x}.

We say the C_n converge to C_0 in the Painlevé-Kuratowski sense, and we write C_n → C_0, when τ-Limsup_n C_n ⊂ C_0 ⊂ τ-Liminf_n C_n.

Let f : X → R̄, where R̄ is the set of extended real numbers. The function f is sequentially lower semi-continuous (lsc) if for all x ∈ X and all x_n →^τ x,

liminf_n f(x_n) ≥ f(x).    (2.4)

Similarly, f is sequentially upper semi-continuous (usc) if for all x ∈ X and all x_n →^τ x, limsup_n f(x_n) ≤ f(x). We say the f_n τ-epi-converge to f on the topological space (X, τ), and we write f_n →^e f, when the following two conditions hold: (i) for every x ∈ X and every x_n →^τ x, liminf_n f_n(x_n) ≥ f(x); (ii) for every x ∈ X, there exists x_n →^τ x such that limsup_n f_n(x_n) ≤ f(x). (2.9) The set of minimizers of f is denoted by argmin f. Theorem 2.1 is the main theorem regarding epi-convergence.
Theorem 2.1 (see [11, Theorem 7.31]). Let (X, τ) be a topological space and let f_n : X → R̄ be a collection of functions that τ-epi-converge to a proper function f_0. Then,

τ-Limsup_n (argmin f_n) ⊂ argmin f_0.

Let X be a topological space, and let Y be a topological vector space placed in duality with a space V via a bilinear pairing ⟨·, ·⟩. (2.11) Consider the functions F_0 : X → R̄ and G : V → R̄, where G is lsc and convex. Let F = F_0 + G(Φ(·)), where Φ : X → V is a continuous map. Then, a Lagrangian of F, with respect to the dual space Y, can be defined by

L(x, y) = F_0(x) + ⟨y, Φ(x)⟩ − G*(y),

where G*(y) = sup_{v∈V} {⟨y, v⟩ − G(v)} is the conjugate of G. The bivariate function L is a Lagrangian in the sense that

F(x) = sup_{y∈Y} L(x, y).    (2.12)

We now consider a certain class of integral functionals on L^p spaces. Consider first a function f : [0, T] × R^n → R̄. We say f is a normal integrand if f is measurable in the first variable and lsc in the second. We define the integral functional I_f over the space L^r([0, T], R^n), for r ∈ [1, +∞], by

I_f(u) = ∫_0^T f(t, u(t)) dt.    (2.15)

Suppose f is a normal integrand that is also convex in the second argument. Then I_f is a proper, convex, lsc (and weakly lsc) function on L^r. Furthermore, if r ∈ [1, +∞), then the conjugate of I_f is defined over (L^r([0, T], R^n))* = L^{r'}([0, T], R^n), 1/r + 1/r' = 1, by

(I_f)*(v) = I_{f*}(v) = ∫_0^T f*(t, v(t)) dt,

where f*(t, ·) is the conjugate of f(t, ·). Note that for r = +∞, and a conjugate space (L^∞)*, the above formula is no longer valid.

We now list the definitions that we need regarding bivariate functions. Let K : X × Y → R̄, where X and Y are topological spaces. We say (x̄, ȳ) ∈ X × Y is a saddle point of K if

K(x̄, y) ≤ K(x̄, ȳ) ≤ K(x, ȳ) for all x ∈ X and all y ∈ Y,

and we say x̄ ∈ X is a min/sup point of K if

sup_{y∈Y} K(x̄, y) = inf_{x∈X} sup_{y∈Y} K(x, y).    (2.17)

We write arg sp K to denote the set of saddle points of K, and we write argminsup K to denote its set of min/sup points.
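A simple finite-dimensional example (our own illustration, not taken from the references) separates the two notions just defined. Take X = Y = [0, 1] and K(x, y) = (x − y)²:

```latex
\sup_{y\in[0,1]} K(x,y) = \max\{x^2,\,(1-x)^2\},
\qquad
\inf_{x\in[0,1]}\ \sup_{y\in[0,1]} K(x,y) = \tfrac14,
\ \text{attained at } \bar x = \tfrac12 .
```

Thus x̄ = 1/2 is a min/sup point of K. On the other hand, inf_x K(x, y) = 0 for every y, so sup_y inf_x K = 0 < 1/4, and K has no saddle point: arg sp K = ∅ while argminsup K = {1/2}.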
Finally, in all that follows, we will use the standard convention ∞ − ∞ = ∞. This will allow us to deal with constraints on the variables x and y in a consistent manner.
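To illustrate Theorem 2.1 with a one-dimensional example of our own, let f_n(x) = (x − 1/n)² on X = R with the usual topology:

```latex
f_n(x) = \bigl(x-\tfrac1n\bigr)^2 \;\xrightarrow{\ e\ }\; f_0(x) = x^2,
\qquad
\operatorname{argmin} f_n = \{\tfrac1n\} \;\longrightarrow\; \{0\} = \operatorname{argmin} f_0 .
```

Condition (i) of epi-convergence holds here because f_n → f_0 uniformly on bounded sets, and condition (ii) holds with the constant sequence x_n = x; the minimizers converge exactly as the theorem predicts.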

Lopsided convergence
We start by reviewing some notions of convergence for bivariate functions. Let (X, τ), (Y, σ) be topological spaces and consider a sequence of bivariate functionals K_n : X × Y → R̄. The sequence {K_n} epi/hypo converges to K_0 [2] when the following two conditions hold: (i) for every y ∈ Y and every x_n →^τ x, there exists y_n →^σ y such that liminf_n K_n(x_n, y_n) ≥ K_0(x, y); (ii) for every x ∈ X and every y_n →^σ y, there exists x_n →^τ x such that limsup_n K_n(x_n, y_n) ≤ K_0(x, y).

Definition 3.2 (see [1]). The sequence {K_n} lopsided converges (lop-converges) to K_0 when (i) for every x_n →^τ x and every y ∈ Y, there exists y_n →^σ y such that liminf_n K_n(x_n, y_n) ≥ K_0(x, y); (ii) for every x ∈ X, there exists x_n →^τ x such that for every y_n →^σ y, limsup_n K_n(x_n, y_n) ≤ K_0(x, y).

Note that in the definition of epi/hypo convergence there is a symmetry between parts (i) and (ii). The term "lopsided" is meant to emphasize the lack of such symmetry in Definition 3.2. Theorem 3.3 is the main theorem regarding lopsided convergence that concerns us.

Theorem 3.3 (see [1]). Suppose Y is compact and {K_n} lop-converges to K_0. Then,

τ-Limsup_n (argminsup K_n) ⊂ argminsup K_0.
It is clear from the definitions that lopsided convergence implies epi/hypo convergence, but the converse is not true. The compactness of Y is essential in the proof of Theorem 3.3, even though the compactness of Y is not required for the existence of a min/sup point [1]. In this paper, however, we will investigate control problems where the space Y is a space of perturbations for the state variables, and this space is not compact in most cases (or not compact in a topology that is compatible with the continuity properties of the problem). Therefore, we introduce the following modification of Definition 3.2.

Definition 3.4. Consider two topologies τ_1 and τ_2 on the space X and assume that τ_1 is stronger than τ_2. Similarly, consider two topologies σ_1 and σ_2 on Y and assume that σ_1 is stronger than σ_2. The sequence {K_n} lop-converges to K_0 in the modified sense when
(i) for every x_n →^{τ_2} x and every y ∈ Y, there exists y_n →^{σ_1} y such that liminf_n K_n(x_n, y_n) ≥ K_0(x, y);
(ii) for every x ∈ X, there exists x_n →^{τ_1} x such that for any sequence {y_n} for which K_n(x_n, y_n) is bounded below, there exist a subsequence {n_k}, points w_{n_k} ∈ Y with w_{n_k} →^{σ_2} w for some w ∈ Y, and ε_{n_k} → 0, such that eventually

K_{n_k}(x_{n_k}, y_{n_k}) ≤ K_0(x, w_{n_k}) + ε_{n_k}.

Note that when τ_1 = τ_2, σ_1 = σ_2, and Y is compact, the two definitions of lopsided convergence are equivalent. Now we can prove the following theorem.

Theorem 3.5. Suppose {K_n} lop-converges to K_0 in the sense of Definition 3.4. Then,

τ_2-Limsup_n (argminsup K_n) ⊂ argminsup K_0.    (3.9)

The proof of Theorem 3.5 follows immediately from the following lemma and Theorem 2.1.

Lemma 3.6. Let X, Y be topological spaces as in Definition 3.4, and suppose that
(i) for every x_n →^{τ_2} x and every y ∈ Y, there exists y_n →^{σ_1} y such that liminf_n K_n(x_n, y_n) ≥ K_0(x, y); (3.11)
(ii) for every x ∈ X, there exists x_n →^{τ_1} x such that for any sequence {y_n} where K_n(x_n, y_n) is bounded below, there exist a subsequence {n_k}, ε_{n_k} → 0, and w_{n_k} ∈ Y such that eventually K_{n_k}(x_{n_k}, y_{n_k}) ≤ K_0(x, w_{n_k}) + ε_{n_k}.
Then, the functions V_n(x) = sup_{y∈Y} K_n(x, y) τ_2-epi-converge to V_0(x) = sup_{y∈Y} K_0(x, y).

Proof. Condition (i) implies that for every x_n →^{τ_2} x, liminf_n V_n(x_n) ≥ V_0(x). We also claim that for every x ∈ X, there exists x_n →^{τ_1} x (and hence x_n →^{τ_2} x) such that limsup_n V_n(x_n) ≤ V_0(x). (3.17) Assume not, and assume first that V_0(x) < +∞. Let x_n →^{τ_1} x be the sequence provided by (ii). Then there exists β > V_0(x) such that limsup_n V_n(x_n) ≥ β, and thus there exists a subsequence of V_n(x_n) that converges to β. To simplify the notation, we will also use n to index this subsequence. Hence, there exist ε > 0, a sequence {y_n} in Y, and n_0 such that for all n ≥ n_0,

K_n(x_n, y_n) > V_0(x) + ε ≥ K_0(x, w) + ε for every w ∈ Y.    (3.18)

In particular, K_n(x_n, y_n) is bounded below. Assumption (ii) then implies that there exist w_{n_k} ∈ Y and n_{k_0} such that eventually K_{n_k}(x_{n_k}, y_{n_k}) ≤ K_0(x, w_{n_k}) + ε_{n_k}. Hence, for n_k large enough that n_k > n_0 and ε_{n_k} < ε, we obtain K_{n_k}(x_{n_k}, y_{n_k}) < K_0(x, w_{n_k}) + ε, contradicting (3.18). The case V_0(x) = +∞ is handled similarly. Since every τ_1-convergent sequence is also τ_2-convergent, V_n τ_2-epi-converges to V_0. □

Remark 3.7. The requirement that the w_{n_k} σ_2-converge in part (ii) was not used in the proof of Lemma 3.6. In fact, this requirement is not needed for the conclusion of Theorem 3.5. However, it is needed to make the notion of lopsided convergence stable with respect to perturbation by some classes of functions (e.g., the class of τ_2 × σ_2 continuous biaffine functions).

Remark 3.8.
There are a number of variations of the definition of epi/hypo convergence. These variations consider more than one topology on the spaces X and Y (see [3, 12]) in a way that is very similar to what we have in Definition 3.4.

Remark 3.9. At this point, it may seem that the original definition of lopsided convergence can be used directly in problems defined over infinite dimensional spaces when the Lagrangian is coercive in the y variable. It may seem that all that is needed in this case is to restrict the Lagrangian to a bounded subset of Y and to choose a weak enough topology to make this bounded set compact. This, however, is not true, as the last remark of Section 5 will illustrate. In fact, in some of our applications, a topology that is weak enough to make the modified space Y compact will make it very difficult to verify condition (ii) of Definition 3.2.
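As a consistency check (our own observation, not from [1] or [2]): when the K_n do not depend on y, the modified lopsided convergence collapses to epi-convergence, so Theorem 3.5 reduces to Theorem 2.1. Indeed, if K_n(x, y) = f_n(x), then

```latex
V_n(x) = \sup_{y\in Y} K_n(x,y) = f_n(x),
\qquad
\operatorname{argminsup} K_n = \operatorname{argmin} f_n ,
```

and condition (i) of Definition 3.4 is exactly the liminf half of epi-convergence with respect to τ_2, while condition (ii), with the w_{n_k} chosen arbitrarily and ε_{n_k} → 0, essentially supplies the limsup half along a subsequence.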

Applications I
In our first application, we will consider problems with Lagrangians of the following form (for examples and details see [4, Chapter VII]):

L(u, v) = J(u) + ⟨v, Φ(u)⟩ + δ_A(u) − δ_B(v),    (4.1)

where L : U × V → R̄, U and V are Hilbert spaces, and J : U → R is a convex, lsc, Gâteaux differentiable function. A and B are nonempty closed convex subsets of U and V, respectively, and δ_A and δ_B are the indicator functions of these sets (0 on the set and +∞ outside it). The map Φ : A → V is Lipschitzian in the sense that ‖Φ(u_1) − Φ(u_2)‖ ≤ c‖u_1 − u_2‖ for all u_1, u_2 ∈ A and some constant c. And for every v ∈ V, the function u ↦ ⟨v, Φ(u)⟩ is convex and lsc on A, where ⟨·, ·⟩ is the inner product of V. Note that our assumptions imply that for all v and all u_n converging weakly to u in A, we have

liminf_n ⟨v, Φ(u_n)⟩ ≥ ⟨v, Φ(u)⟩.    (4.3)

A number of primal/dual methods that take advantage of the special form of the Lagrangian can be used to attack this problem (cf. [4, Chapter VII]). Most of these methods, however, require the following additional conditions on J, A, and B:
(a) B is bounded;
(b) J is coercive in the sense that for all u_1, u_2 ∈ A, ⟨J′(u_1) − J′(u_2), u_1 − u_2⟩ ≥ α‖u_1 − u_2‖² for some α > 0.

In this section, we are interested in problems that do not satisfy the above conditions. Therefore, we perturb the original Lagrangian in such a way that the resulting Lagrangians will be

L_n(u, v) = J(u) + ⟨v, Φ(u)⟩ + δ_A(u) − δ_{B_n}(v),    (4.5)

where B_n = B ∩ r_n𝔹, 𝔹 is the unit ball in V, and r_n increases to infinity. Clearly, L_n satisfies conditions (a) and (b). We now show that L_n lop-converges to L.

Theorem 4.1. Let τ_1 and τ_2 be, respectively, the norm and the weak topology on U. Similarly, let σ_1 and σ_2 be, respectively, the norm and the weak topology on V. Let L_0 and L_n be defined by (4.1) and (4.5). Then L_n lop-converges to L_0 in the sense of Definition 3.4.

Proof. Let u_n converge weakly to u in U and let v ∈ B. Since r_n increases to infinity, eventually v ∈ B_n, and we may take v_n = v. Due to (4.3), and since every convex lsc function on a Banach space is also weakly lsc, we immediately obtain liminf_n L_n(u_n, v_n) ≥ L_0(u, v), which is condition (i) of Definition 3.4.
To verify condition (ii), we let u ∈ U and take x_n = u. For any sequence {v_n} such that L_n(u, v_n) is bounded below, we have v_n ∈ B_n ⊂ B, and hence L_n(u, v_n) = L_0(u, v_n). Taking w_n = v_n and ε_n = 0, we see that condition (ii) of Definition 3.4 is satisfied. □

Remark 4.2. The above approximation method can be used when B is unbounded. Moreover, in some control problems, the set B consists of functions on a bounded open set Ω ⊂ R^n subject to a bound M on their gradients. The set B is then bounded in L^p due to the Poincaré inequality. However, the bound of B (the Poincaré constant for the region Ω) may not be available. In this case, it is convenient numerically to use the above approximation and replace B with an increasing sequence of balls in L^p, even though B is bounded.
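To make the truncation B_n = B ∩ r_n𝔹 concrete, the following toy sketch (our construction; the quadratic cost, the single affine constraint, and the step size ρ are illustrative choices, not from [4]) runs the Uzawa iteration with the dual variable projected onto a truncated set:

```python
# Toy Uzawa iteration in R^2 with a truncated dual set.
# Problem: minimize J(u) = 0.5*||u - a||^2  subject to  u1 + u2 <= b,
# i.e. Phi(u) = u1 + u2 - b, dual set B = [0, +inf), truncated to B_n = [0, r].
# All data below are hypothetical illustrations.

def uzawa(r, rho=0.5, iters=100):
    a = (2.0, 1.0)
    b = 1.5
    v = 0.0                      # dual variable (Lagrange multiplier)
    u = a
    for _ in range(iters):
        # primal step: u minimizes 0.5*||u - a||^2 + v*(u1 + u2 - b)
        u = (a[0] - v, a[1] - v)
        # dual ascent step, projected onto the truncated set B_n = [0, r]
        g = u[0] + u[1] - b      # constraint value Phi(u)
        v = min(max(v + rho * g, 0.0), r)
    return u, v

u_bar, v_bar = uzawa(r=10.0)     # r large enough: recovers the true multiplier
print(u_bar, v_bar)              # u = (1.25, 0.25), v = 0.75
```

If r is taken smaller than the true multiplier (here 0.75), the dual iterate is clipped at r and the primal iterate stops short of feasibility; letting the radii r_n increase to infinity, as in (4.5), removes this artifact.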
Finally, we show how the modified lopsided convergence can be used to recover some stability results regarding some classical control problems.

Consider the following classical problem (see [4, Section 3.1, Chapter VII] for details): minimize, over controls u, a quadratic tracking cost between the state y(t, x; u) and a desired state y_d(t, x), where Ω is a subset of R^n. The Lagrangian of the above problem can be expressed as in [4, Section 3.1], in terms of φ(t, x) = y(t, x; 0), z(t, x; u) = y(t, x; u) − y(t, x; 0), and z_d(t, x) = y_d(t, x) − y(t, x; 0). Solving the original control problem corresponds to finding u where the following infimum is attained:

inf_u sup_v L(u, v).    (4.13)

Note that, despite the fact that the new formulation of the problem involves a supremum and an infimum, the format of L and the simplicity of the set B make the new formulation of the problem easier to solve numerically. The Lagrangian in (4.13), however, does not satisfy requirement (b) for the direct application of numerical methods. Furthermore, (4.13) may not have a saddle point under our current assumptions, and the standard methods of approximating saddle points (via epi/hypo convergence) cannot be used. Therefore, for ε > 0, we introduce a regularized Lagrangian L_ε. For every ε > 0, the Lagrangian L_ε(u, v) satisfies conditions (a) and (b), and a standard numerical method, such as the Uzawa method [4], can be used to find a saddle point (u_ε, v_ε) for L_ε. If we show that, as ε → 0, L_ε lop-converges to L, then we will know that every cluster point of {u_ε} is a min/sup point of L. This is precisely the claim of the following theorem.

Theorem 4.3. As ε → 0, the Lagrangians L_ε lop-converge to L in the sense of Definition 3.4; consequently, every weak cluster point of {u_ε} belongs to argminsup L.

Application II
In this section, we consider the following control problem: minimize, over u ∈ L^p, the functional

I_1(u) = ∫_0^T φ(t, u(t), x(t)) dt,

where x is the state corresponding to the control u through Lx = u, and L is an operator from C([0, T], R^k) to L^1. The states and controls are subject to the constraint

(t, f(t, u(t), x(t))) ∈ gph E a.e.,    (5.3)

where E is a set-valued map from [0, T] to R^j and gph E is the graph of the map E.
Assumptions on the operator L. The operator L has an inverse L^{-1} such that, whenever u_n converges weakly to u in L^p, the corresponding states x_n = L^{-1}u_n converge pointwise to x = L^{-1}u.
The following are examples of operators that satisfy our assumptions.
Example 5.1. L is the linear differential operator given by the state equation

ẋ(t) = A(t)x(t) + B(t)u(t), x(0) = α,    (5.4)

with the standard assumptions on A and B.
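For Example 5.1, the inverse-operator assumption can be checked directly (a standard variation-of-constants argument, sketched here under the assumption that A and B are bounded and measurable). Writing Φ for the fundamental matrix of ẋ = A(t)x,

```latex
x(t) = \Phi(t)\,\alpha
      + \int_0^T \mathbf 1_{[0,t]}(s)\,\Phi(t)\Phi(s)^{-1}B(s)\,u(s)\,ds ,
```

so for each fixed t the map u ↦ x(t) is a continuous linear functional of u ∈ L^p applied coordinatewise. Hence u_n → u weakly in L^p implies x_n(t) → x(t) for every t, which is exactly the pointwise convergence of states required of L^{-1}.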
Example 5.2. L is a differential operator representing an evolution equation ẋ(t) = g(t, x(t), u(t)), under the usual assumptions of growth and Lipschitz continuity on g.
Example 5.3. Any differential operator L whose inverse can be expressed by an integral equation of the form

x(t) = α + ∫_0^t h(s, u(s), x(s)) ds,

where h satisfies the usual growth and continuity conditions.

Assumptions on the constraint (5.3). The constraint (5.3) covers, among others, joint constraints of the form u(t) ∈ U(t), x(t) ∈ V(t), and M(t, x(t), u(t)) ∈ C(t) a.e., where U, V, and C are set-valued maps. In this case, we only need to define

f(t, u(t), x(t)) = (u(t), x(t), M(t, x(t), u(t))), E(t) = U(t) × V(t) × C(t).    (5.9)

Assumptions on the cost function I_1. We will assume that I_1 is finite-valued, weakly lower semi-continuous, and strongly upper semi-continuous over L². These assumptions can be formulated in terms of standard conditions on φ:
(i) φ is measurable in the first argument, convex in the second, and continuous in the third.
(ii) There exists a function α : R_+ → R_+ such that for all t, y, and z, α(|z|) ≤ φ(t, z, y). (5.10)
(iii) For all u ∈ L², there exists a neighborhood V of u and a function h ∈ L¹ such that for every u′ ∈ V,

φ(t, u′(t), x′(t)) ≤ h(t),    (5.11)

where x′ = L^{-1}u′.

Adib Bagh 47
The above conditions imply that I_1 is finite-valued. Note that assuming I_1 is finite-valued does not limit the scope of this model, since we will deal with the constraints on the controls when we deal with constraint (5.3). Furthermore, conditions (i) and (ii) imply the weak lower semi-continuity of I_1 (see [4, Theorem 2.1, Chapter VIII]). The (strong) upper semi-continuity of I_1 follows from (iii), Fatou's lemma, and the Lebesgue dominated convergence theorem.
Our goal in this section is to construct finite dimensional approximations for the above non-convex (partially convex) control problem. In [9, 10], Rockafellar developed the full duality theory for a similar type of control problem. However, he was mainly interested in cases where the existence of saddle points for the Lagrangians is guaranteed, and therefore he required the cost function to be convex (in the control and the state) and the dynamics to be linear. We will be able to relax these conditions, since we are only interested in the partial duality of the problem and since we only assume the existence of min/sup points.
To simplify the notation, we will assume that f takes values in R^j. Hence, for all u and for all x such that Lx = u, we have f(·, u(·), x(·)) ∈ L² because of the growth condition on E. We introduce an exact but finite penalty function.
Let θ : [0, T] × R^j → R be the normal, convex (in the second argument) integrand defined by

θ(t, z) = d(z, E(t)),    (5.13)

where d(z, C) = inf_{c∈C} ‖c − z‖ for any closed subset C. Note that θ(t, ·) is finite over R^j. Now the problem can be expressed as minimizing, over L^p, the functional

F(u) = I_1(u) + I_2(w),    (5.16)

where I_2 : L² → R is defined by I_2(w) = ∫_0^T θ(t, w(t)) dt, with w(t) = f(t, u(t), x(t)) and Lx = u. A Lagrangian associated with the above problem is given by L : L^p × L² → R̄,

L(u, v) = I_1(u) + ∫_0^T [⟨v(t), w(t)⟩ − θ*(t, v(t))] dt,    (5.17)

where θ is still given by (5.13), and where again w(t) = f(t, u(t), x(t)) and Lx = u. A direct calculation of θ*(t, z), when θ is given by (5.13), gives us

θ*(t, z) = σ_{E(t)}(z) + δ_B(z),    (5.18)

where σ_{E(t)}(z) = sup_{y∈E(t)} ⟨y, z⟩ is the support function of the set E(t), and B is the unit ball in R^j (see [11, Example 11.26, Chapter 11] for details). Note that θ*(t, ·) is convex, proper, and lsc. Moreover, it is coercive over R^j since θ(t, ·) is finite everywhere. The fact that F(u) = sup_{v∈L²} L(u, v) follows from the definition of the conjugate function and from the fact that the conjugate of I_2 is actually I_{θ*} (see also (2.12) in the Preliminaries).
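The formula for θ* can be recovered from the infimal-convolution structure of the distance function (a standard computation; cf. [11]). For any closed set C ⊂ R^j,

```latex
d(\cdot,C) = \|\cdot\| \,\square\, \delta_C
\quad\Longrightarrow\quad
d(\cdot,C)^{*} = \bigl(\|\cdot\|\bigr)^{*} + \delta_C^{*}
               = \delta_{\mathbb B} + \sigma_C ,
```

since the conjugate of an infimal convolution is the sum of the conjugates, the conjugate of the norm is the indicator of the unit ball 𝔹, and the conjugate of δ_C is the support function σ_C. Applying this with C = E(t), pointwise in t, gives (5.18).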
Using exact penalties for the joint state and control constraints causes serious computational complications (see [10] for details). Therefore, we introduce a sequence of finite, non-exact penalties θ_n : [0, T] × R^j → R such that θ_n(t, ·) increases continuously to θ(t, ·). More specifically, we will take Moreau envelopes of θ,

θ_n(t, z) = min_{w∈R^j} { θ(t, w) + (n/2)‖z − w‖² }.    (5.22)

These approximating functions are convex and differentiable. Moreover, the conjugates θ_n*(t, ·) are given by (see [11, Chapter 11])

θ_n*(t, v) = θ*(t, v) + (1/(2n))‖v‖²,

and θ_n(t, ·) is differentiable on the interior of its domain (see [11, Theorem 11.13]).
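In one dimension, with E(t) ≡ {0}, the exact penalty is θ(t, z) = |z|, and its Moreau envelope is the classical Huber function. The sketch below (our illustration, using the soft-thresholding formula for the proximal point of |·|) checks the envelope value against the closed form:

```python
# Moreau envelope of theta(z) = |z| with parameter n:
#   theta_n(z) = min_w { |w| + (n/2)*(z - w)^2 },
# whose minimizer is the soft-thresholding of z at level 1/n.

def soft_threshold(z, lam):
    """Proximal point of lam*|.| at z: shrink z toward 0 by lam."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def moreau_abs(z, n):
    """Envelope value computed from its minimizer w = soft_threshold(z, 1/n)."""
    w = soft_threshold(z, 1.0 / n)
    return abs(w) + 0.5 * n * (z - w) ** 2

def huber(z, n):
    """Closed form: quadratic near 0, linear with slope 1 beyond |z| = 1/n."""
    return 0.5 * n * z * z if abs(z) <= 1.0 / n else abs(z) - 0.5 / n

print(moreau_abs(3.0, 1), huber(3.0, 1))   # both 2.5
print(moreau_abs(0.5, 1), huber(0.5, 1))   # both 0.125
```

As n → ∞, these envelopes increase pointwise to |z| while staying differentiable everywhere, and the conjugate of the envelope is δ_{[−1,1]}(v) + v²/(2n), consistent with taking Moreau envelopes of the distance penalty.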
We now use a method developed by Wright [12] to discretize the control problem in the primal and dual variables at the same time. Let {Δ_n} be a sequence of partitions of [0, T], Δ_n = {a_0^n = 0 < a_1^n < ··· < a_{T_n}^n = T}, such that |a_s^n − a_{s−1}^n| → 0, uniformly in s, as n → +∞.

We define the following sets:

U_n = {u ∈ L^p : u is constant on [a_{s−1}^n, a_s^n), s = 1, ..., T_n},
V_n = {v ∈ L² : v is constant on [a_{s−1}^n, a_s^n), s = 1, ..., T_n}.    (5.25)

The sequence of approximating Lagrangians that we will consider is

L_n(u, v) = I_{1,n}(u) + ∫_0^T [⟨v(t), w(t)⟩ − θ_n*(t, v(t))] dt + δ_{U_n}(u) − δ_{V_n}(v),    (5.26)

where δ_{U_n} and δ_{V_n} are the indicator functions of U_n and V_n, respectively, and

I_{1,n} = I_1 + δ_{U_n},    (5.27)

so that the problem of finding the min/sup points of L_n is a finite dimensional problem. Also note that for a step function v, calculating θ_n*(t, v(t)) reduces to solving a number of simple problems of convex minimization (concave maximization) in R^j. Furthermore, the discretized problem will be governed by a difference equation (see [12] for details).
Since L_n is not convex in u (φ is not convex in x and f is not convex in u), L and L_n may not possess saddle points. However, we can use the method of augmented Lagrangians developed by Rockafellar in [8] to find the min/sup points of L_n. The idea of this method is to construct an augmented Lagrangian L̃_n(u, v, r), for some r ∈ (0, +∞), and then use a standard primal/dual numerical method to find (ū_n, v̄_n), the saddle points of L̃_n(u, v, r). For every n, the point ū_n will then be a min/sup point for L_n (see [8] and [11, Section K*, Chapter 11]; also see [7, Chapter 17] for an actual algorithm using augmented Lagrangians). This approach will allow us to approximate, discretize, and numerically solve the original problem despite its lack of convexity.
Finally, we state the main theorem of the section, which shows that our approximation scheme actually works.
Theorem 5.4. Let L_n and L be defined by (5.26) and (5.17). Then

w-Ls argminsup L_n ⊂ argminsup L.    (5.29)

Before we prove Theorem 5.4, we list a number of basic lemmas that we will need in our proof.

Lemma 5.5. Consider the functions I_1 and I_{1,n} defined by (5.16) and (5.27), and consider the sets U_n of (5.25). Let u ∈ L^p. Then there exists a sequence û_n ∈ U_n such that lim_n I_{1,n}(û_n) = I_1(u). (5.30)

Proof. The proof is an immediate result of the fact that step functions are dense in L^p for p ∈ [2, +∞) and of our assumption that I_1 is norm continuous. □

Remark 5.8. Theorem 5.4 is only a stability result, in the sense that argminsup L might be empty. However, standard conditions on φ can be added to ensure that w-Ls argminsup L_n and argminsup L are not empty. In this case, the numerical method we suggested will produce a sequence of pairs (u_n, v_n) ∈ L^p × L², the saddle points of the augmented approximating Lagrangians. The sequence v_n will be bounded in L², which is important for numerical reasons, but its weak cluster points may fail to be solutions of the dual problem. More specifically, a weak × weak cluster point of (u_n, v_n) may fail to be a saddle point for the Lagrangian of the original control problem. However, as Theorem 5.4 shows, any weak cluster point of u_n will be a solution of the original control problem. Finally, if we know a priori that the original control problem has saddle points, then Theorem 5.4 can be easily modified to show stability of the saddle points with respect to the discretization and approximation scheme of this section.
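Lemma 5.5 rests on the density of step functions in L^p. The following small numerical sketch (our illustration, with the hypothetical choice f(t) = t on [0, 1]) shows the L² error of the cell-average step approximation decaying at the expected O(1/n) rate:

```python
# L2 distance between f(t) = t on [0,1] and its cell-average step
# approximation on a uniform partition with n cells.  For this f the
# exact error is 1/(sqrt(12)*n); we verify it by midpoint quadrature.

def l2_error_step_approx(n, m=200):
    err2 = 0.0
    for s in range(n):
        a, b = s / n, (s + 1) / n
        avg = (a + b) / 2                 # cell average of f(t) = t
        h = (b - a) / m
        for i in range(m):                # midpoint rule on each cell
            t = a + (i + 0.5) * h
            err2 += (t - avg) ** 2 * h
    return err2 ** 0.5

print(l2_error_step_approx(8))    # ~ 1/(sqrt(12)*8) ≈ 0.0361
print(l2_error_step_approx(64))   # ~ 0.0045, roughly an 8x reduction
```

Refining the partition, as the sets U_n and V_n of (5.25) do, therefore drives the approximation error of any fixed u ∈ L^p to zero.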
Remark 5.9. The convexity of φ in the second variable was needed only to obtain the weak lower semi-continuity of I_1. In case the cost function is known to be weakly lsc on L^p, no such assumption is needed. A cost function that depends on the state only would be an example of such a case.
Remark 5.10. We finally elaborate further on Remark 3.9 of Section 3. Under the appropriate coercivity conditions, we may attempt to use the original definition of lopsided convergence, with Y taken as a large enough ball in L² and σ as the (relative) weak topology. In this case, Y is compact in σ. However, verifying condition (ii) of Definition 3.2 is then, in essence, equivalent to verifying condition (ii) of Definition 3.4.

Lemma 5.6. Let v ∈ L² be such that v(t) ∈ B a.e., where B is the unit ball in R^j and σ_{E(t)} is the support function of E(t). Then there exists a sequence v_n → v such that v_n ∈ V_n, v_n(t) ∈ B a.e., and

lim_n ∫_0^T σ_{E(t)}(v_n(t)) dt = ∫_0^T σ_{E(t)}(v(t)) dt.

Proof. Using some elementary facts from measure theory (see [12, Lemmas 1, 2] for details) and using the definition of the support function, we can find a sequence v_n → v such that v_n(t) ∈ B, v_n(t) → v(t), and |v_n(t)| ≤ |v(t)| a.e. in [0, T]. Hence, for all n, we have ∫_0^T σ_{E(t)}(v_n(t)) dt ≤ ∫_0^T σ_{E(t)}(v(t)) dt, and the conclusion follows. □

Lemma 5.7. Let v ∈ L² be such that v(t) ∈ B a.e. Then there exists a sequence v_n → v such that v_n ∈ V_n, v_n(t) ∈ B a.e., and

lim_n ∫_0^T θ_n*(t, v_n(t)) dt = ∫_0^T θ*(t, v(t)) dt.

Proof of Theorem 5.4. In light of Theorem 3.5, we only need to show that L_n lop-converges to L when τ_1 and τ_2 are, respectively, the norm and weak topologies on L^p, and σ_1 and σ_2 are, respectively, the norm and weak topologies on L². Conditions (i) and (ii) of Definition 3.4 are then verified with the help of Lemmas 5.5, 5.6, and 5.7.