The Space Decomposition Theory for a Class of Semi-Infinite Maximum Eigenvalue Optimizations



1. Introduction
H∞ output feedback control is an important example of a design problem where the feedback controller has to respond favorably to several performance specifications. Typically, in H∞ synthesis the H∞ channel is used to enhance the robustness of the design. Due to its prominence in practice, H∞ control has been addressed in various ways over the years.
In nominal H∞ synthesis, feedback controllers are computed via semidefinite programming (SDP) [1] or algebraic Riccati equations [2]. When structural constraints on the controller are added, the H∞ synthesis problem is no longer convex. Some of the problems above have even been recognized as NP-hard or as rationally undecidable. These mathematical concepts indicate the inherent difficulty of H∞ synthesis under constraints on the controller. The H∞ synthesis problem involves finding an output feedback control matrix K that minimizes the H∞ norm of a certain transfer function, subject to the constraint that K is stabilizing. This is a challenging problem, and even finding a stabilizing K can be difficult. Indeed, if the entries of K are restricted to lie in prescribed intervals, then finding a stabilizing K is an NP-hard problem [3].
H∞ feedback controller synthesis was one of the motivating applications for the development of our work. We consider a linear time-invariant dynamical system in the standard LFT form, where x is the state, y is the output, u is the command input, and w, z form the performance channel. To cancel direct transmission from input u to output y, the assumption D22 = 0 is made. This is without loss of generality (see [4], Chapter 17).
The application we have in mind is optimizing the H∞-norm (5), which is structurally of the form

f(x) = sup_{ω∈[0,∞]} λ_1(A(x, ω)), (6)

where A : R^m × [0,∞] → S^n is an operator with values in the space S^n of n × n symmetric or Hermitian matrices, equipped with the scalar product A · B = Tr(AB), and λ_1 denotes the maximum eigenvalue function on S^n. The above problem (5) can be recast as a case of (6). The program we wish to solve in this paper is

min_{x∈R^m} f(x), (7)

where the function f has the form (6). f is nonsmooth, with two possible sources of nonsmoothness: (a) the infinite max-operator and (b) the nonsmoothness of λ_1, which may lead to nonsmoothness of λ_1(A(x, ω)) for fixed ω.
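As a small numerical illustration of the object being minimized (not the method of the paper), the H∞-norm of a transfer function is the supremum over frequencies of the largest gain; for a SISO system this is the peak of |G(jω)|, and in the matrix case it would be the square root of λ_1(G(jω)* G(jω)). The helper name and the test system G(s) = 1/(s + 1) are our own choices:

```python
import numpy as np

def hinf_norm_grid(num, den, omegas):
    """Approximate the H-infinity norm of a SISO transfer function
    G(s) = num(s)/den(s) by maximizing |G(j*omega)| over a frequency grid.
    For matrix-valued G this would be the largest singular value, i.e.
    the square root of lambda_1(G(jw)^* G(jw))."""
    best = 0.0
    for w in omegas:
        s = 1j * w
        g = np.polyval(num, s) / np.polyval(den, s)
        best = max(best, abs(g))   # |G(jw)|^2 = lambda_1(G^* G) in the scalar case
    return best

# G(s) = 1/(s+1): the peak gain is 1, attained at omega = 0
omegas = np.linspace(0.0, 10.0, 2001)
print(hinf_norm_grid([1.0], [1.0, 1.0], omegas))   # -> 1.0
```

A grid like this is only a rough lower bound on the true supremum; the paper's point is precisely that the maximizing frequencies must be tracked more carefully.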
Optimization of the H∞-norm is a prominent application in feedback synthesis, pioneered by Polak and coworkers; see, for instance, [7, 8] and the references given there. Existing methods for the H∞ synthesis problem are often based on first reformulating the problem into one involving linear matrix inequalities (LMIs) and an additional nonconvex rank constraint or nonconvex equality constraint. Solution methods for such reformulations include those based on the linearization method [9], the alternating projections method [10], the augmented Lagrangian method [11], and the sequential semidefinite programming method [12]. The H∞ synthesis problem can also be reformulated as a problem involving bilinear matrix inequalities (BMIs); methods dealing with such reformulations include [12, 13] (see also the references therein). A disadvantage of these approaches is that they require the introduction of Lyapunov variables. As the number of Lyapunov variables grows quadratically with the number of state variables, the total number of variables can be quite large, and even problems of moderate size can lead to numerical difficulties [14].
In this paper, the H∞ synthesis problem is posed as an unconstrained, nonsmooth, nonconvex minimization problem, which requires special optimization techniques. Our approach avoids the use of Lyapunov variables; hence it is well suited to our reformulation of the H∞ synthesis problem. We develop a local nonsmooth optimization strategy, a superlinearly convergent space decomposition algorithm, suited for optimizing the H∞-norm. Problem (7) carries smoothness information, so we can adopt a variable space decomposition. Moreover, since problem (7) has the special structure called primal-dual gradient structure (PDG), introduced in [15], it is possible to identify smooth tracks, and we can design a method with a fast convergence rate. The approach taken to solve this problem is based on the recently developed local optimization algorithms presented in [15, 16]. The UV-space decomposition method was introduced in [15] (see also [17, 18]) and has been applied to many problems, such as nonlinear programming and second-order cone programming (see [19–22]). The idea is to decompose R^m into two orthogonal subspaces U and V at a point x such that the nonsmoothness of f is concentrated essentially on V, while the smoothness of f appears on the U-subspace. More precisely, for a given g ∈ ∂f(x), where ∂f(x) denotes the Clarke subdifferential of f at x, R^m can be decomposed as the direct sum of two orthogonal subspaces, R^m = U ⊕ V, where V = lin(∂f(x) − g) and U = V⊥. We then define the primal-dual Lagrangian, an approximation of the original function, and show that along certain manifolds it can be used to create a second-order expansion of a nondifferentiable function. As a result, we can design an algorithm that makes a step in the V-space, followed by a U-Newton step, in order to obtain superlinear convergence; this improves the situation considerably.
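The decomposition R^m = U ⊕ V with V = lin(∂f(x) − g) can be computed numerically once the active gradients spanning the subdifferential are known. A minimal numpy sketch of this construction (our own illustration; the toy data are the two active gradients of f(x) = |x1| on R²):

```python
import numpy as np

# f(x) = max_i (a_i . x + b_i); at a point where several pieces are active,
# the subdifferential is conv{a_i}.  Following the text, V = lin(∂f(x) − g)
# for any subgradient g, and U = V⊥.
def uv_subspaces(active_grads):
    G = np.asarray(active_grads, dtype=float)
    g = G[0]                       # any subgradient will do
    D = (G - g)[1:]                # spanning set of V = lin(∂f(x) − g)
    if D.size == 0:
        V = np.zeros((0, G.shape[1]))
    else:
        _, s, Vt = np.linalg.svd(D, full_matrices=False)
        V = Vt[s > 1e-12]          # orthonormal basis of V
    P_U = np.eye(G.shape[1]) - V.T @ V   # orthogonal projector onto U
    return V, P_U

# f(x) = max(x1, -x1) = |x1| on R^2: active gradients (1,0) and (-1,0)
V, P_U = uv_subspaces([[1.0, 0.0], [-1.0, 0.0]])
print(V.shape[0])                  # dim V = 1
print(P_U @ np.array([0.0, 1.0]))  # e2 lies in U
```

The SVD here is just one convenient way to extract an orthonormal basis of the span; any rank-revealing factorization would serve.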
The rest of the paper is organized as follows. In Section 2, we recall some basic concepts of UV-decomposition theory. In Section 3, we reformulate these problems as unconstrained finite-max eigenvalue optimization problems under the hypothesis that the largest eigenvalue has multiplicity one, and we mention some of the issues involved in trying to solve such problems; using the primal-dual gradient structure (PDG), we give an important conclusion about the second-order expansion of the function. Section 4 outlines the optimization approach of Section 3 in a different way, dealing with the case of multiple largest eigenvalues. The paper ends with some concluding remarks.

2. Preparation and Preliminary Results
We recall the UV-theory developed in [15]. Let f : R^m → R be a finite-valued convex function. For a given x ∈ R^m, we start by defining a decomposition of the space R^m = U(x) ⊕ V(x). The subspaces U(x) and V(x) are equivalently defined as follows:

U(x) := {d ∈ R^m : f′(x; d) = −f′(x; −d)}, V(x) := U(x)⊥. (8)

In other words, U is the subspace where f(x + ·) appears to be differentiable at 0. We have the following result, which is stated in [15].

Proposition 1. Let f : R^m → R be a proper convex function; for a given point x, one has the following.

(1) V(x) is the subspace parallel to aff ∂f(x), and U(x) = V(x)⊥.
(2) For any g ∈ ri ∂f(x), U(x) and V(x) are, respectively, the normal and tangent cones to ∂f(x) at g, where ri C stands for the relative interior with respect to a given set C.
We next give the Clarke generalized gradient for a locally Lipschitz function.

Definition 2 (see [23, 24]). Let f be locally Lipschitz on R^n; the generalized gradient of f at x, denoted by ∂f(x), is defined by

∂f(x) := {ξ ∈ R^n : f∘(x; d) ≥ ⟨ξ, d⟩, ∀d ∈ R^n}, (9)

where f∘(x; d) = lim sup_{y→x, t↓0} (f(y + td) − f(y))/t is the generalized directional derivative of f at x in the direction d.
The following results come from [23]; we will use these properties in later sections and omit their proofs.
Proposition 3. Suppose {f_i} is a finite collection of functions (i = 1, . . ., n), each of which is Lipschitz near x. The function f is defined by

f(x) = max_{i=1,...,n} f_i(x). (10)

Then one has

∂f(x) ⊂ conv{∂f_i(x) : i ∈ I(x)}, (11)

where I(x) := {i : f_i(x) = f(x)}, and if f_i is regular at x for each i ∈ I(x), then equality holds and f is regular at x.
Notation. We introduce the basic notation used in the remainder of the paper. S^n is the space of n × n symmetric matrices, and S^n_+ stands for the cone of n × n positive semidefinite symmetric matrices. A · B := tr AB denotes the Frobenius scalar product of A, B ∈ S^n. Let p ≥ 1 be the multiplicity of the largest eigenvalue λ_1(A) of A; that is, A lies on the submanifold

M_p := {A ∈ S^n : λ_1(A) = · · · = λ_p(A) > λ_{p+1}(A)},

where M_p is a smooth submanifold of S^n. Let E_1(A) be the eigenspace associated with λ_1, let Q_1(A) be an orthonormal basis of E_1(A), let P_1(A) be an orthonormal basis associated with λ_1, . . ., λ_p, and let T_M(A) and N_M(A) be, respectively, the tangent and normal spaces to the submanifold M at A ∈ M. A* : S^n → R^m is the adjoint operator of the linear operator A : R^m → S^n. Much of the additional notation comes from [25, 26].
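The objects just introduced (λ_1, its multiplicity p, an orthonormal eigenspace basis Q_1(A), and the Frobenius product A · B = tr AB) can be made concrete with numpy. The helper name and tolerance below are our own choices:

```python
import numpy as np

def top_eig_info(A, tol=1e-8):
    """Multiplicity p of lambda_1(A) and an orthonormal basis Q1 of the
    associated eigenspace E1(A), for symmetric A (cf. the manifold M_p)."""
    w, Q = np.linalg.eigh(A)          # eigenvalues in ascending order
    lam1 = w[-1]
    mask = w > lam1 - tol             # eigenvalues equal to lambda_1 up to tol
    p = int(mask.sum())
    Q1 = Q[:, mask]                   # orthonormal columns spanning E1(A)
    return lam1, p, Q1

A = np.diag([3.0, 3.0, 1.0])          # lambda_1 = 3 with multiplicity 2
lam1, p, Q1 = top_eig_info(A)
print(lam1, p)                        # 3.0 2

# Frobenius scalar product A . B = tr(A B)
B = np.eye(3)
print(np.trace(A @ B))                # 7.0
```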

3. UV-Space Decomposition for a Single Eigenvalue
3.1. UV-Theory of the Single Eigenvalue Function. In this section we analyse the case where the multiplicity of λ_1(A(x, ω)) is one at all active frequencies ω. This is motivated by practical considerations, because nonsmoothness of type (b) never occurred in our tests; the changes required for the general case are discussed in Section 4.

Lemma 4. For a closed-loop stabilizing controller x, the set of active frequencies Ω(x) := {ω ∈ [0,∞] : f(x) = λ_1(A(x, ω))} is either finite or Ω(x) = [0,∞]; that is, f(x) = λ_1(A(x, ω)) for all ω.

A system where Ω(x) = [0,∞] is called all-pass. This is rarely encountered in practice. For the technical formulas we concentrate on those x where the set of active frequencies, or peaks, Ω(x) is finite. In what follows we analyse the case where the multiplicity of f(x, ω) := λ_1(A(x, ω)) is one at all active frequencies ω.
In [27], three approaches to semi-infinite programming are discussed: exchange of constraints, discretization, and local reduction. We use a local reduction method here. The main ideas are recalled below.
Let x be a local solution of (7). Indexing the active frequencies Ω(x) := {ω_1, . . ., ω_p} at x, we suppose that the following conditions are satisfied.

Assumption 5. Consider

(i) f′_ω(x, ω_i) = 0, i = 1, . . ., p;
(ii) f″_ωω(x, ω_i) < 0, i = 1, . . ., p;
(iii) f(x, ω) < f(x) for every ω ∉ Ω(x) = {ω_1, . . ., ω_p}.

These assumptions define the setting known as the standard case in semi-infinite programming [27]. The three conditions express the fact that the frequencies ω_i ∈ Ω(x) are the strict global maximizers of f(x, ·). Notice that condition (iii) is the finiteness hypothesis already mentioned, justified by Lemma 4.

Lemma 6. Under conditions (i)–(iii), a neighborhood U of x may be chosen such that max_{ω∈[0,∞]} f(x, ω) = max_{i=1,...,p} f(x, ω_i(x)) for every x ∈ U. In particular, Ω(x) ⊂ {ω_1(x), . . ., ω_p(x)} for every x ∈ U.

Hence program (6) is locally equivalent to the following standard nonlinear program:

min_{x∈R^m} max_{i=1,...,p} f(x, ω_i(x)), (12)

where f(x, ω_i(x)) = λ_1(A(x, ω_i(x))); we may then solve (12) via the so-called UV-decomposition method.

Assumption 7. f′_x(x, ω_1), . . ., f′_x(x, ω_p) are linearly independent.

Under the hypothesis of Assumption 7, local convergence of this approach is assured, because it guarantees that (12) satisfies the linear independence constraint qualification.
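Assumption 5 characterizes the active frequencies as the strict global maximizers of f(x, ·). A toy sketch of the bookkeeping behind the active-frequency set Ω(x) on a grid; the function phi and the grid are hypothetical stand-ins for ω ↦ λ_1(A(x, ω)):

```python
import numpy as np

def active_frequencies(phi, omegas, tol=1e-6):
    """Local-reduction bookkeeping: locate the active frequencies
    Omega(x) = {w : phi(w) = max phi} of w -> lambda_1(A(x, w)) on a grid.
    phi is assumed to attain its maximum at finitely many peaks
    (the standard case, condition (iii))."""
    vals = np.array([phi(w) for w in omegas])
    fmax = vals.max()
    return omegas[vals > fmax - tol], fmax

# toy curve with two global peaks of equal height at w = 1 and w = 3
phi = lambda w: -((w - 1.0) ** 2) * ((w - 3.0) ** 2)
omegas = np.linspace(0.0, 4.0, 401)
peaks, fmax = active_frequencies(phi, omegas)
print(peaks)   # the two peaks, near w = 1 and w = 3
```

In an implementation one would refine each grid peak by a local maximization of phi, which is exactly what the implicit functions ω_i(x) of Lemma 6 provide.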
We denote F(x) := max_{i=1,...,p} λ_1(A(x, ω_i(x))), and q_1^i(x), i = 1, . . ., p, stands for the unit eigenvector associated with the largest eigenvalue of A(x, ω_i(x)). Next, a special kind of structure of F(x), called primal-dual gradient structure (PDG), will be exhibited.

Proposition 8. There exist a ball about x, denoted by B(x), and p functions

f_i(x) := f(x, ω_i(x)) = λ_1(A(x, ω_i(x))), for i = 1, . . ., p; (13)

since the multiplicity of λ_1(A(x, ω_i(x))) is one, the f_i are C² on B(x); in addition,

(1) x ∈ B(x) and f_i(x) = F(x) for i = 1, . . ., p;
(2) for each x ∈ B(x), F(x) = max_{i=1,...,p} f_i(x);
(3) △_1 is the unit simplex in R^p given by △_1 := {(α_1, α_2, . . ., α_p) : ∑_{i=1}^p α_i = 1, α_i ≥ 0}.

We have the following result.
Theorem 9. Suppose the set Ω(x) is finite. Then the Clarke subdifferential of F at x is the set

∂F(x) = conv{∇f_i(x) : i ∈ I(x)},

where I(x) := {i : f_i(x) = F(x)} is the set of active indices at x and ∇f_i(x) is the gradient of f_i(x) = λ_1(A(x, ω_i(x))).

Proof. Because F(x) is a finite maximum of functions, we can directly apply the Clarke subdifferential rule for finite maxima (Proposition 3) together with the derivative of the eigenvalue function with multiplicity one, and the proof is done.
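The derivative of a multiplicity-one largest eigenvalue, used in the proof above, satisfies (d/dt) λ_1(A + tB)|_{t=0} = q_1ᵀ B q_1 with q_1 the unit top eigenvector. This is a standard fact, checked here against a central finite difference (our own numerical check, not from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
# random symmetric A with a well-separated simple top eigenvalue
A = rng.standard_normal((n, n)); A = (A + A.T) / 2 + np.diag([10.0, 0, 0, 0, 0])
B = rng.standard_normal((n, n)); B = (B + B.T) / 2

w, Q = np.linalg.eigh(A)
q1 = Q[:, -1]                       # unit eigenvector of lambda_1(A)
analytic = q1 @ B @ q1              # q1^T B q1

t = 1e-6                            # central difference in the direction B
fd = (np.linalg.eigvalsh(A + t * B)[-1] - np.linalg.eigvalsh(A - t * B)[-1]) / (2 * t)
print(abs(analytic - fd) < 1e-4)    # True
```

Composing with a smooth x ↦ A(x, ω_i(x)) then gives the gradients ∇f_i(x) entering ∂F(x).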
Theorem 10. Suppose Assumptions 5 and 7 hold. Then one has the following results at x.
(1) The Clarke subdifferential of F(x) has the following expression:

∂F(x) = conv{∇f_i(x) : i = 1, . . ., p}.

(2) Let V denote the subspace generated by the subdifferential ∂F(x). Then

V = lin{∇f_i(x) − ∇f_1(x) : i = 2, . . ., p}, U = V⊥,

where lin S stands for the linear hull of a set S.
Proof. Conclusion (1) follows from Theorem 9 and Assumption 5.
Let α_1 = 1 and α_i = 0 for i ≠ 1; then ∇f_1(x) ∈ ∂F(x). It follows from the definition of the space V that V = lin(∂F(x) − ∇f_1(x)) = lin{∇f_i(x) − ∇f_1(x) : i = 2, . . ., p}, and U = V⊥ shows that the second formula holds. The proof is completed.
(ii) For any g ∈ ∂F(x), we have the decomposition g = g_U ⊕ g_V with g_U ∈ U and g_V ∈ V. By Theorem 10, the U-component of a subgradient g ∈ ∂F(x) is the same as that of any other subgradient g̃ at x; that is, g_U = g̃_U.
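The invariance of the U-component can be seen directly: any two subgradients in conv{∇f_i(x)} differ by an element of V, so their projections onto U = V⊥ coincide. A small numpy check with three hypothetical gradients:

```python
import numpy as np

# Three (made-up) active gradients; subgradients are their convex combinations.
g1 = np.array([1.0, 2.0, 0.0])
g2 = np.array([1.0, -1.0, 0.0])
g3 = np.array([1.0, 0.5, 3.0])

# V = lin{grad_i - grad_1}; U = V⊥, via an orthonormal basis of V
D = np.stack([g2 - g1, g3 - g1])
_, s, Vt = np.linalg.svd(D, full_matrices=False)
Vb = Vt[s > 1e-12]
P_U = np.eye(3) - Vb.T @ Vb                 # projector onto U

ga = 0.2 * g1 + 0.3 * g2 + 0.5 * g3         # two different subgradients
gb = 0.6 * g1 + 0.1 * g2 + 0.3 * g3
print(np.allclose(P_U @ ga, P_U @ gb))      # True: same U-component
```

This is exactly why a U-Newton step is well defined: the smooth first-order information in U does not depend on which subgradient was picked.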

3.2. Smooth Trajectory and Second-Order Properties.
Given g = g_U ⊕ g_V ∈ ∂F(x), a Lagrangian-like function of F can be formulated.

Theorem 12. Suppose Assumption 7 holds. Then, for all u ∈ U small enough, the following hold.
(i) The nonlinear system in the variables (u, v) has a unique solution v = v(u), where v(·) : U → V is a C² function.
(iii) The conclusion can be obtained directly from (i) and the definition of X(u).
So far we have developed a primal track X(u). We now turn our attention to an associated dual object, which is also a smooth function of u ∈ U: a multiplier vector function α(u), which depends on the structure function gradients, X(u), and an arbitrary subgradient at x.

Lemma 13. Given g ∈ ∂F(x), the system with unknowns {α_i(u)}, i ∈ {1, . . ., p}, has a unique solution α = α(u); in particular, α_i(0) = α_i for all i ∈ {1, . . ., p}.
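Lemma 13 recovers the multipliers from a subgradient: g = ∑_i α_i ∇f_i with ∑_i α_i = 1, uniquely under the linear independence of Assumption 7. A minimal sketch solving the stacked linear system by least squares (the gradients and multipliers below are made-up data):

```python
import numpy as np

grads = np.array([[1.0, 0.0, 0.0],      # rows: gradients grad_1, ..., grad_p
                  [0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])
alpha_true = np.array([0.5, 0.3, 0.2])
g = alpha_true @ grads                  # a subgradient: sum_i alpha_i grad_i

# stack the gradient equations with the simplex constraint sum alpha_i = 1
M = np.vstack([grads.T, np.ones(3)])
rhs = np.append(g, 1.0)
alpha, *_ = np.linalg.lstsq(M, rhs, rcond=None)
print(np.allclose(alpha, alpha_true))   # True: multipliers recovered
```

In the algorithm the same system is solved along the track X(u), giving the smooth dual trajectory α(u).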
Theorem 14. Given g ∈ ∂F(x), at the trajectory X(u) = x + u + v(u) one has the following.

Proof. This follows from the definition of α in (25) and item (iii) of Theorem 12.

Theorem 15. Given g ∈ ∂F(x) and supposing Assumption 7 holds, for u small enough the following assertions are true.
(ii) Using the chain rule, the differential of the Lagrangian-like function with respect to u can be written componentwise. Multiplying each equation by the appropriate α_i(u), summing the results, and using the fact that ∑_i α_i(u) = 1 yield the gradient expression. Using the transpose of the expression of X′(u), we get the formula which, together with (6.11) in [28], yields the desired result.

Theorem 16. Suppose Assumption 7 holds and g ∈ ∂F(x).
Then for u small enough, the second-order expansion of F along the trajectory X(u) = x + u ⊕ v(u) holds:

F(X(u)) = F(x) + ⟨g_U, u⟩ + (1/2)⟨∇²L(0; g_V) u, u⟩ + o(‖u‖²).

Proof. From the definition of F, we have F(X(u)) = L(u; g_V). Since f_i ∈ C², we can expand L to second order at u = 0. Therefore the expansion follows.
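A toy instance of this expansion (our own illustration): for F(x) = max(x1² + x2, x1² − x2) = x1² + |x2| at x = 0, we have V = span{e2}, U = span{e1}, the primal track is X(u) = (u, 0), g_U = 0 for every subgradient, and the U-Hessian is 2; the second-order model is then exact along the track:

```python
# F(X(u)) along the U-track X(u) = (u, 0) is u^2 + |0| = u^2, and the
# expansion F(x) + <g_U, u> + 0.5*H*u^2 reproduces it exactly here.
H = 2.0
for u in (0.1, 0.01, -0.05):
    F_along_track = u ** 2 + abs(0.0)        # F evaluated at X(u)
    model = 0.0 + 0.0 * u + 0.5 * H * u ** 2  # F(x) + <g_U,u> + 0.5 u^T H u
    assert abs(F_along_track - model) < 1e-12
print("second-order model matches along the U-track")
```

The point of the theorem is that this smooth, second-order behaviour survives in U even though F itself is nonsmooth across V.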

4. UV-Decomposition for Multiple Eigenvalues
4.1. UV-Theory of the Multiple Eigenvalue Function. The working hypothesis of the previous section was that the leading eigenvalues λ_1(A(x, ω_i(x))) had multiplicity 1 for all frequencies in the set {ω_1(x), . . ., ω_p(x)} and for all x in a neighborhood of x. This hypothesis is motivated by our numerical experience, where we have never encountered multiple eigenvalues. This is clearly in contrast with experience in pure eigenvalue optimization problems. However, our approach still functions if the hypothesis of single eigenvalues at active frequencies is abandoned. We rely on the weaker assumption that the eigenvalue multiplicities r_i at the limit point x are known for all active frequencies ω_i, i = 1, . . ., p, and that, based on the information at the current iterate, we have a good technique to dependably guess r_i. This situation has been discussed by several authors (see, e.g., [27, 29–31]). Consider A ∈ S^n where λ_1(A) has multiplicity r. We replace the maximum eigenvalue function λ_1 by the average of the first r eigenvalues,

λ̂(A) := (1/r) ∑_{i=1}^r λ_i(A).

This function is smooth and convex in a neighborhood of the smooth manifold M_r of matrices A ∈ S^n whose largest eigenvalue has multiplicity r, and λ_1 = λ̂ on M_r. We may then replace the nonsmooth information contained in λ_1 by the smooth information contained in λ̂ by adding the constraint A(x, ω_i(x)) ∈ M_{r_i}. The manifold has codimension q := r(r + 1)/2 − 1 in S^n, and in a neighborhood of A it may be described by q equations h_1(A) = 0, . . ., h_q(A) = 0, which has been presented independently in [16, 32]. The extension to semi-infinite eigenvalue optimization is clear under the finiteness assumption (iii). We may then approach minimization of the H∞-norm along the same lines and obtain the finite program

min_x max_{i=1,...,p} λ̂_{r_i}(A(x, ω_i(x))),

where r_i stands for the multiplicity of the largest eigenvalue λ_1(A(x, ω_i)); we denote by Q_1^i an orthonormal basis associated with the eigenvectors of λ_1(A(x, ω_i)). According to the foregoing analysis, we can transform the above constrained optimization problem into the corresponding unconstrained finite max form. The trajectory X(u) = x + u ⊕ v(u) is C², and

X′(u) = I + v′(u). (65)

In particular, X(0) = x, v(0) = 0, and X′(0) = I.
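The key smoothing fact of this section, namely that the average of the first r eigenvalues agrees with λ_1 on M_r and varies smoothly through an eigenvalue crossing where λ_1 itself has a kink, can be checked numerically (our own illustration; the helper name and perturbation are made up):

```python
import numpy as np

def avg_top_r(A, r):
    """Average of the r largest eigenvalues of a symmetric matrix."""
    w = np.linalg.eigvalsh(A)[::-1]      # eigenvalues in descending order
    return w[:r].mean()

A = np.diag([2.0, 2.0, -1.0])            # lambda_1 = 2 with multiplicity r = 2
print(np.isclose(avg_top_r(A, 2), np.linalg.eigvalsh(A)[-1]))   # True on M_r

# a symmetric perturbation that splits the double eigenvalue into 2±t:
# lambda_1 has a kink at t = 0, but the average stays smooth (constant here)
B = np.diag([1.0, -1.0, 0.0])
for t in (-1e-3, 1e-3):
    assert abs(avg_top_r(A + t * B, 2) - avg_top_r(A, 2)) < 1e-6
print("average of top-r eigenvalues is smooth through the crossing")
```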
We now turn our attention to an associated dual object, which is also a smooth function of u ∈ U.
Next we consider the following primal-dual function.

Theorem 22. Given g ∈ ∂F(x), at the trajectory X(u) = x + u + v(u) one has the following.

Theorem 23. Given g ∈ ∂F(x) and supposing that Assumption 18 holds, for u small enough the following assertions are true.
(i) L is a C² function of u.

(ii) The gradient of L is given by ∇L(u; g_V), with the stated expression; in particular, when u = 0, one has the corresponding formula.

(iii) The Hessian of L is given by the stated expression, where the p × p matrix function of (u, g_V) is defined accordingly; in particular, when u = 0, one has the corresponding formula.

Proof. (i) Because f_i(X(u)) is C², it follows that L is C². Since Assumption 18 holds, using (68) with u = 0 gives L(u; 0) = f_i(X(u)), i = 1, . . ., p.
Multiplying the above equations by α_i and β_i, respectively, and summing, we get the Lagrangian-like expression in item (i).
(ii) Using the chain rule, the differentials of the Lagrangian-like functions (69) and (77) with respect to u can be written componentwise. Multiplying each equation by the appropriate α_i(u) and β_i(u), summing the results, and using the fact that ∑_i α_i(u) = 1 yield the expression for ∇L(u; g_V). Using the transpose of the expression of X′(u), we get the formula which, together with (6.11) in [28], yields the desired result.

Differentiating the expression of ∇L(u; g_V) once more, we obtain the Hessian expression. We call the corresponding Hessian matrix of L at u = 0 a basic U-Hessian for F at x and denote it by H := ∇²L(0; 0). Using second-order U-derivatives, we can specify second-order expansions for F and give related necessary conditions for the optimization problem.
Theorem 24. Suppose Assumption 18 holds and g ∈ ∂F(x). Then for u small enough, the second-order expansion of F along the trajectory X(u) = x + u ⊕ v(u) holds:

F(X(u)) = F(x) + ⟨g_U, u⟩ + (1/2)⟨H u, u⟩ + o(‖u‖²).

5. Conclusions
In this paper, we mainly study the UV-theory to optimize the H∞-norm and other nonsmooth criteria which are semi-infinite maxima of maximum eigenvalue functions. We use a methodology from semi-infinite programming to obtain a local nonlinear programming model and apply the UV-decomposition method. Exploiting the so-called PDG structure that this problem possesses, Lagrangian-like theory is applied to this class of functions. Under certain hypotheses, we obtain the first- and second-order derivatives of the primal-dual Lagrangian function. This method can work well in practice.
As for further work: in this paper we only give the theoretical analysis for solving this special class of eigenvalue optimization problems. We will continue to study an implementable algorithm, and we will extend the UV algorithm from convex eigenvalue problems to nonconvex cases; the related theory will be investigated in later papers.