This paper presents a global optimization method for solving general nonlinear programming problems subject to box constraints. Regardless of the convexity or nonconvexity of the objective, by introducing a differential flow on the dual feasible space, a set of complete solutions to the original problem is obtained, and criteria for global optimality and existence of solutions are given. Our theorems improve and generalize recent results in the canonical duality theory. Applications to a class of constrained optimal control problems are discussed; in particular, an analytical form of the optimal control is derived. Examples are included to illustrate this new approach.
1. Introduction
In this paper, we consider the following general box-constrained nonlinear programming problem (the primal problem (𝒫) for short): (𝒫): min{P(x) ∣ x ∈ 𝒳a}, (1.1)
where 𝒳a = {x ∈ ℛn ∣ ℓl ≤ x ≤ ℓu} is the feasible space, ℓl, ℓu ∈ ℛn are two given vectors, and P(x) is twice continuously differentiable in ℛn. The objective function of the primal problem (𝒫) may be either convex or nonconvex.
Problem (1.1) appears in many applications, such as engineering design, phase transitions, chaotic dynamics, information theory, and network communication [1, 2]. In particular, if ℓl = {0} and ℓu = {1}, the problem reduces to one of the fundamental problems of combinatorial optimization, namely, the integer programming problem [3]. Since the feasible space 𝒳a is a compact convex subset of ℛn, the primal problem has at least one global minimizer. When (𝒫) is a convex programming problem, a global minimizer can be obtained by many well-developed nonlinear optimization methods based on the Karush-Kuhn-Tucker (KKT) optimality theory [4]. However, for (𝒫) with a nonconvex objective function, traditional KKT theory and direct methods can solve (𝒫) only to local optimality. So our interest in this paper is mainly in the case of P(x) nonconvex on 𝒳a. For the special case of minimizing a nonconvex quadratic function subject to box constraints, much effort and progress have been made on locating the global optimal solution based on the canonical duality theory of Gao (see [5–7] for details). As indicated in [8], the key step of the canonical duality theory is to introduce a canonical dual function, but commonly used methods are not guaranteed to construct one, owing to the general form of the objective function in (1.1). Thus there has been comparatively little work on global optimality in the general case.
Inspired and motivated by these facts, in this paper we introduce a differential flow for constructing the canonical dual function and investigate a new approach to solving the general (especially nonconvex) nonlinear programming problem (𝒫). By means of the canonical dual problem, conditions for global optimality are deduced, and global and local extrema of the primal problem can be identified. An application to the linear-quadratic optimal control problem with constraints is discussed. The results presented in this paper can be viewed as an extension and an improvement of the canonical duality theory [8–10].
The paper is organized as follows. In Section 2, a differential flow is introduced to present a general form of the canonical dual problem for (𝒫), and the relation of this transformation to the classical Lagrangian method is discussed. In Section 3, we present a set of complete solutions to (𝒫) by the approach of Section 2; the existence of the canonical dual solutions is also established. In Section 4, we give an analytic solution to the box-constrained optimal control problem via canonical dual variables. Some examples are used to illustrate our theory.
2. A Differential Flow and Canonical Dual Problem
At the beginning of this paper, we mentioned that our primary goal is to find the global minimizers of a general (mainly nonconvex) box-constrained optimization problem (𝒫). Due to the assumed nonconvexity of the objective function, the classical Lagrangian L(x,σ) is no longer a saddle function, and the Fenchel-Young inequality yields only a weak duality relation: min P ≥ max P*. The nonnegative value θ = min P(x) - max P*(σ) is called the duality gap, where possibly θ = ∞. This duality gap shows that the well-developed Fenchel-Moreau-Rockafellar duality theory can be used mainly for solving convex problems. Also, due to the nonconvexity of the objective function, the problem may possess multiple local solutions, and the identification of a global minimizer is a fundamentally challenging task in global optimization. In order to eliminate the duality gap inherent in the classical Lagrange duality theory, a so-called canonical duality theory has been developed [2, 9]. The main idea of this theory is to introduce a canonical dual transformation, which may convert some nonconvex and/or nonsmooth primal problems into smooth canonical dual problems without generating any duality gap, and thereby deduce global solutions. The key step in the canonical dual transformation is to choose the (nonlinear) geometrical operator Λ(x). Different forms of Λ(x) may lead to different (but equivalent) canonical dual functions and canonical dual problems. So far, in most of the related literature, the canonical dual transformation is discussed and the canonical dual function is formulated for quadratic minimization problems (i.e., objective functions in quadratic form). For the general objective function in (1.1), however, effective strategies for constructing the canonical dual function (or the canonical dual problem) by commonly used methods are lacking.
The novelty of this paper is to introduce a differential flow, generated by the differential equation (2.6), to construct the canonical dual function for the problem (𝒫). Lemma 2.5 guarantees the existence of the differential flow; Theorem 2.3 shows that there is no duality gap between the primal problem (𝒫) and its canonical dual problem (Pd) given via the differential flow in (2.7); and Theorems 3.1–3.4 use the differential flow to produce a global minimizer. In addition, the idea of introducing the set 𝒮 of shift parameters closely follows the work of Floudas et al. [11, 12]. In [12], they developed a global optimization method, αBB, for general twice-differentiable constrained problems, which utilizes an α parameter to generate valid convex underestimators for nonconvex terms of generic structure.
The main idea of constructing the differential flow and the canonical dual problem is as follows. For simplicity and without loss of generality, we assume that ℓu = -ℓl = ℓ^{1/2} = {ℓi^{1/2}}, namely, 𝒳a = {x ∈ ℛn ∣ -ℓi^{1/2} ≤ xi ≤ ℓi^{1/2}} = {x ∈ ℛn ∣ xi² ≤ ℓi}, where ℓi > 0 for all i.
Let 𝒮 denote the dual feasible space 𝒮 = {ρ ∈ ℛ+n ∣ ∇²P(x) + Diag(ρ) > 0, ∀x ∈ 𝒳a}, (2.1)
where ℛ+n:={ρ∈ℛn∣ρ≥0}, and Diag(ρ)∈ℛn×n is a diagonal matrix with ρi,i=1,2,…,n, as its diagonal entries.
Lemma 2.1.
The dual feasible space 𝒮 is an open convex subset of ℛ+n. If ρ̂∈𝒮, then ρ∈𝒮 for any ρ≥ρ̂.
Proof.
Notice that P(x) is twice continuously differentiable in ℛn. For any x ∈ 𝒳a, the Hessian matrix ∇²P(x) is symmetric. For any given Q = Qᵀ ∈ ℛn×n, the set {ρ ∈ ℛ+n ∣ Q + Diag(ρ) > 0} is convex. Since the intersection of any collection of convex sets is convex, the dual feasible space 𝒮 is an open convex subset of ℛ+n. In addition, it follows from the definition of 𝒮 that ρ ∈ 𝒮 for any ρ ≥ ρ̂. This completes the proof.
Suppose that ρ* ∈ 𝒮 and a nonzero vector x* ∈ 𝒳a satisfy ∇P(x*) + Diag(ρ*)x* = 0. (2.2)
A differential flow x(ρ) is introduced over a relatively small neighborhood of ρ* such that ∇P(x(ρ)) + Diag(ρ)x(ρ) = 0, x(ρ*) = x*,
which is equivalent to ∇2P(x(ρ))∇x(ρ)+Diag(ρ)∇x(ρ)+Diag(x(ρ))=0,x(ρ*)=x*,
where ∇x(ρ) is the Jacobian of x, the matrix whose ijth entry equals the partial derivative ∂xi/∂ρj. The neighborhood of ρ* is chosen so that the matrix ∇²P(x(ρ)) + Diag(ρ) remains invertible.
Let H(ρ,x(ρ))=-[∇2P(x(ρ))+Diag(ρ)]-1.
Then a differential flow x(ρ) can be defined by the following differential system: dxi(ρ) = Hi1(ρ)x1(ρ)dρ1 + Hi2(ρ)x2(ρ)dρ2 + ⋯ + Hin(ρ)xn(ρ)dρn, xi(ρ*) = xi*, i = 1,2,…,n, (2.6)
where Hij(ρ) is the ijth entry of H(ρ, x(ρ)). By the extension theory for solutions of differential equations [13, 14], the solution x(ρ) of the differential system (2.6) can be extended to a subset of 𝒮. The canonical dual function is defined as Pd(ρ) = P(x(ρ)) + ∑i=1n (ρi/2)[xi²(ρ) - ℓi]. (2.7)
Thus, the canonical dual problem for our primal problem (𝒫) can be proposed as follows: (Pd): max{Pd(ρ) = P(x(ρ)) + ∑i=1n (ρi/2)[xi²(ρ) - ℓi] ∣ ρ ∈ 𝒮}.
In the following, we show that (Pd) is canonically (i.e., with zero duality gap) dual to (𝒫).
Lemma 2.2.
Let x(ρ) be a given flow defined by (2.6), and Pd(ρ) be the corresponding canonical dual function defined by (2.7). For any ρ∈𝒮, we have
∇Pd(ρ) = ((1/2)[x1²(ρ) - ℓ1], (1/2)[x2²(ρ) - ℓ2], …, (1/2)[xn²(ρ) - ℓn])ᵀ, (2.9)
∇²Pd(ρ) = -Diag(x(ρ))[∇²P(x(ρ)) + Diag(ρ)]⁻¹Diag(x(ρ)).
Proof.
Since Pd(ρ) is differentiable, for any ρ∈𝒮,
∂Pd(ρ)/∂ρi = ∇P(x(ρ))ᵀ ∂x(ρ)/∂ρi + (1/2)[xi²(ρ) - ℓi] + ∑j=1n ρj xj(ρ) ∂xj(ρ)/∂ρi = ∇P(x(ρ))ᵀ ∂x(ρ)/∂ρi + ρᵀDiag(x(ρ)) ∂x(ρ)/∂ρi + (1/2)[xi²(ρ) - ℓi] = (∇P(x(ρ)) + Diag(ρ)x(ρ))ᵀ ∂x(ρ)/∂ρi + (1/2)[xi²(ρ) - ℓi].
It follows from the process (2.2)–(2.6) that
∂Pd(ρ)/∂ρi = (1/2)[xi²(ρ) - ℓi]. (2.11)
From (2.6), we have ∂xi(ρ)/∂ρj=Hij(ρ)xj(ρ). By (2.11), then
∂²Pd(ρ)/∂ρi∂ρj = xi(ρ) ∂xi(ρ)/∂ρj = xi(ρ)Hij(ρ)xj(ρ).
By the definition of H(ρ,x(ρ)), this concludes the proof of Lemma 2.2.
By Lemma 2.2, the canonical dual function Pd(ρ) is concave on 𝒮. Any critical point ρ̂ ∈ 𝒮 is therefore a global maximizer of (Pd), and it can be found by many well-developed nonlinear programming methods. If ρ̂ ∈ 𝒮 and x(ρ̂) ∈ 𝒳a, we have ∂Pd(ρ̂)/∂ρ̂i = (1/2)[xi²(ρ̂) - ℓi] ≤ 0, and for any ρ ≥ ρ̂, by the negative definiteness of H(ρ, x(ρ)), ∂[(1/2)(xi²(ρ) - ℓi)]/∂ρi = xi(ρ) ∂xi(ρ)/∂ρi = Hii(ρ)xi²(ρ) ≤ 0.
Thus for any ρ≥ρ̂, x(ρ) will stay in 𝒳a and Pd(ρ)≤Pd(ρ̂).
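The construction (2.2)–(2.7) is easy to check numerically. The sketch below (a minimal illustration, assuming a hypothetical nonconvex 2×2 quadratic P and SciPy's ODE solver) integrates the flow along a segment in 𝒮 and verifies that it reproduces the solution of the stationarity equation (2.2) at the endpoint:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical nonconvex 2x2 quadratic: P(x) = 0.5 x^T A x - c^T x.
A = np.array([[-0.5, -0.5], [-0.5, -0.3]])
c = np.array([0.3, 0.3])

def flow_rhs(t, x, rho0, rho1):
    # Integrate along the segment rho(t) = (1 - t) rho0 + t rho1, which stays
    # in S because S is convex (Lemma 2.1); dx = H(rho, x) Diag(x) drho.
    rho = (1 - t) * rho0 + t * rho1
    H = -np.linalg.inv(A + np.diag(rho))   # Hessian of P is A for a quadratic
    return H @ (x * (rho1 - rho0))

rho0 = np.array([5.0, 5.0])                 # A + Diag(rho0) > 0, so rho0 lies in S
x0 = np.linalg.solve(A + np.diag(rho0), c)  # solves the stationarity equation at rho0
rho1 = np.array([1.8, 0.7])                 # endpoint, also in S
sol = solve_ivp(flow_rhs, (0.0, 1.0), x0, args=(rho0, rho1),
                rtol=1e-10, atol=1e-12)
x1 = sol.y[:, -1]
# the flow reproduces the stationarity solution at rho1:
print(np.allclose(x1, np.linalg.solve(A + np.diag(rho1), c), atol=1e-6))
```

Because the stationarity equation has the closed-form solution x(ρ) = [A + Diag(ρ)]⁻¹c for a quadratic P, the ODE integration can be checked exactly; for a general P, the flow itself is the computational tool.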
Theorem 2.3.
The canonical dual problem (Pd) is perfectly dual to the primal problem (𝒫) in the sense that if ρ̂∈𝒮 is a KKT point of Pd(ρ), then the vector x̂=x(ρ̂) is a KKT point of (𝒫) and
P(x̂)=Pd(ρ̂).
Proof.
By introducing the Lagrange multiplier vector λ∈ℛn to relax the inequality constraint ρ≥0 in 𝒮, the Lagrangian function associated with (Pd) becomes L(ρ,λ)=Pd(ρ)-λTρ.Then the KKT conditions of (Pd) become
∇L = ∇Pd(ρ̂) - λ = 0, λᵀρ̂ = 0, λ ≤ 0, ρ̂ ≥ 0. (2.15)
Notice that ∇Pd(ρ) = ((1/2)[x1²(ρ) - ℓ1], (1/2)[x2²(ρ) - ℓ2], …, (1/2)[xn²(ρ) - ℓn])ᵀ. It follows from conditions (2.15) that x(ρ̂) satisfies the complementarity conditions of (𝒫). By the definition of the flow x(ρ), the equation ∇P(x(ρ̂)) + Diag(ρ̂)x(ρ̂) = 0 holds. This proves that if ρ̂ ∈ 𝒮 is a KKT point of Pd(ρ), then the vector x̂ = x(ρ̂) defined by (2.6) is a KKT point of the primal problem (𝒫).
In addition, we have
Pd(ρ̂)=P(x(ρ̂))+∑i=1nρ̂i2[xi2(ρ̂)-li]=P(x(ρ̂))=P(x̂).
This completes the proof.
Remark 2.4.
Theorem 2.3 shows that by using the canonical dual function (2.7), there is no duality gap between the primal problem (𝒫) and its canonical dual (Pd), that is, θ = 0. This eliminates the duality gap inherent in the classical Lagrange duality theory and provides necessary conditions for searching for global solutions. Actually, 𝒮 in (Pd) may be replaced with the space 𝒮# := {ρ ≥ 0 ∣ det[∇²P(x) + Diag(ρ)] ≠ 0, for all x ∈ 𝒳a} in the proof of Theorem 2.3. Moreover, the condition det[∇²P(x) + Diag(ρ)] ≠ 0 for all x ∈ 𝒳a in 𝒮# is essentially not a constraint, as indicated in [5].
By introducing a differential flow x(ρ), the constrained nonconvex problem can be converted into the canonical (perfect) dual problem, which can be solved by deterministic methods. In view of the process (2.2)–(2.6), the flow x(ρ) is based on the KKT condition (2.2). In other words, we can solve equation (2.2) backwards from ρ* to obtain the backward flow x(ρ), ρ ∈ 𝒮 ∩ {0 ≤ ρ ≤ ρ*}. It is then of interest to know whether there exists a pair (x*, ρ*) satisfying (2.2).
Lemma 2.5.
Suppose that ∇P(0)≠0. For the primal problem (𝒫), there exist a point ρ*∈𝒮 and a nonzero vector x*∈𝒳a such that ∇P(x*)+Diag(ρ*)x*=0.
Proof.
Since 𝒳a is bounded and P(x) is twice continuously differentiable in ℛn, we can choose a sufficiently large positive real M ∈ ℛ such that ∇²P(x) + Diag(Me) > 0 for all x ∈ 𝒳a and sup𝒳a |(∇P(x))i| < Mℓi^{1/2}, i = 1,2,…,n (e ∈ ℛn is the n-vector of all ones). Then it is easy to verify that ∇P(x) + Diag(Me)x < 0 at the point x = -ℓ^{1/2} and ∇P(x) + Diag(Me)x > 0 at the point x = ℓ^{1/2}.
Notice that the mapping ∇P(x) + Diag(Me)x is continuously differentiable in ℛn. It follows from a fixed-point argument (see Remark 2.6) that there is a nonzero point x* ∈ 𝒳a such that ∇P(x*) + Diag(Me)x* = 0. Let ρ* = Me ∈ 𝒮. Thus there exist a point ρ* ∈ 𝒮 and a nonzero vector x* ∈ 𝒳a satisfying (2.2). This completes the proof.
Remark 2.6.
Actually, Lemma 2.5 tells us how to find the desired parameter ρ*. By Lemma 2.5, we only need to choose a positive real M ∈ ℛ large enough that ∇²P(x) + Diag(Me) > 0 for all x ∈ 𝒳a and sup𝒳a |(∇P(x))i| < Mℓi^{1/2}, i = 1,2,…,n. Since ∇P(0) ≠ 0, it follows from ‖∇²P(x)‖/M < 1 uniformly in 𝒳a that there is a unique nonzero fixed point x* ∈ 𝒳a such that
-∇P(x*)/M = x*,
which is equivalent to ∇P(x*) + Diag(Me)x* = 0, by the Brouwer fixed-point theorem. In [11, 12], algorithms are given to estimate bounds on ‖∇²P(x)‖. If there is a positive real number K such that ‖∇²P(x)‖ ≤ K for all x ∈ 𝒳a, then a suitably large parameter M can be obtained from the inequalities
‖∇²P(x)‖/M ≤ K/M < 1,  M > sup𝒳a |(∇P(x))i/ℓi^{1/2}|, ∀i,
uniformly on 𝒳a, so that the Brouwer fixed-point theorem applies. We should choose
M > max{sup𝒳a ‖∇²P(x)‖, sup𝒳a |(∇P(x))i/ℓi^{1/2}|, ∀i}.
Finally, ρ* = Me is the desired parameter. We will discuss the computation of ρ* in detail, using the results in [15, 16], in future work.
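For a quadratic P, the bounds in Remark 2.6 can be evaluated directly. The following sketch (hypothetical data, with the box written as xi² ≤ ℓi; for a general P one would use interval bounds as in [11, 12]) picks M slightly above the two suprema and confirms that ρ* = Me lies in 𝒮:

```python
import numpy as np

# Hypothetical quadratic P(x) = 0.5 x^T A x - c^T x on {x : x_i^2 <= ell_i}.
A = np.array([[-0.5, -0.5], [-0.5, -0.3]])
c = np.array([0.3, 0.3])
ell = np.array([1.0, 4.0])

hess_bound = np.linalg.norm(A, 2)   # ||hess P(x)|| = ||A|| for a quadratic
# |(A x - c)_i| <= sum_j |A_ij| sqrt(ell_j) + |c_i| on the box
grad_bound = np.abs(A) @ np.sqrt(ell) + np.abs(c)
# M strictly above both suprema in Remark 2.6
M = 1.01 * max(hess_bound, np.max(grad_bound / np.sqrt(ell)))
rho_star = M * np.ones(2)
# A + Diag(M e) is positive definite, so rho* = M e lies in S:
print(np.all(np.linalg.eigvalsh(A + np.diag(rho_star)) > 0))
```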
Remark 2.7.
Moreover, for the chosen parameter ρ*, it is worth investigating how to obtain the solution x* of (2.2) in the interior of 𝒳a. For this issue, when P(x) is a polynomial, we refer to [17], which gives results on bounding the zeros of a polynomial; for given bounds, the parameter may be determined using the results in [17] on the relation between the zeros and the coefficients. We will address this in future work as well. However, for the nonconvex case of (𝒫), the KKT conditions are only necessary conditions for local minimizers. Identifying a global minimizer among all KKT points remains the key task, which we address in the next section.
3. Complete Solutions to Global Optimization Problems
Theorem 3.1.
Suppose that ρ̂ is a KKT point of the canonical dual function Pd(ρ) and that x̂ = x(ρ̂) is defined by (2.6). If ρ̂ ∈ 𝒮, then ρ̂ is a global maximizer of (Pd) on 𝒮, x̂ is a global minimizer of (𝒫) on 𝒳a, and P(x̂) = minx∈𝒳a P(x) = maxρ∈𝒮 Pd(ρ) = Pd(ρ̂).
Proof.
If ρ̂ ∈ 𝒮 is a KKT point of (Pd) on 𝒮, then by (2.15), x(ρ̂) stays in 𝒳a, that is, (1/2)(xi²(ρ̂) - ℓi) ≤ 0 for all i. By Lemmas 2.1 and 2.2, it is easy to verify that x(ρ) ∈ 𝒳a and Pd(ρ̂) ≥ Pd(ρ) for any ρ ≥ ρ̂.
For any given parameter ρ, (ρ≥ρ̂), we define the function fρ(x) as follows:
fρ(x)=P(x)+∑i=1nρi2(xi2-li).
It is obvious that P(x)≥fρ(x) for all x∈𝒳a. Since fρ(x) is twice continuously differentiable in ℛn, there exists a closed convex region ℰ containing 𝒳a such that on ℰ,
∇fρ(x(ρ)) = ∇P(x(ρ)) + Diag(ρ)x(ρ) = 0, ∇²fρ(x) = ∇²P(x) + Diag(ρ) > 0, ∀x ∈ ℰ.
This implies that x(ρ) is the unique global minimizer of fρ(x) over ℰ. By (2.7), we have
fρ(x(ρ))=P(x(ρ))+∑i=1nρi2[xi2(ρ)-li]=Pd(ρ).
Thus, for any ρ≥ρ̂,
P(x)≥fρ(x)≥minx∈Xafρ(x)=fρ(x(ρ))=Pd(ρ).
On the other hand, by the fact that the canonical dual function Pd(ρ) is concave on 𝒮, ρ̂ must be a global maximizer of (Pd) on 𝒮, and we have
maxρ∈SPd(ρ)=Pd(ρ̂)=P(x(ρ̂))+∑i=1nρ̂i2[xi2(ρ̂)-li]=P(x(ρ̂)),
and for all x∈𝒳a,
P(x)≥maxρ≥ρ̂Pd(ρ)=Pd(ρ̂)=P(x̂).
Thus, x̂ is a global minimizer of (𝒫) on 𝒳a and
P(x̂)=minx∈XaP(x)=maxρ∈SPd(ρ)=Pd(ρ̂).
This completes the proof.
Remark 3.2.
Theorem 3.1 shows that a vector x̂=x(ρ̂) is a global minimizer of the problem (𝒫) if ρ̂∈𝒮 is a critical solution of (Pd). However, for certain given P(x), the canonical dual function might have no critical point in 𝒮. For example, the canonical dual solutions could locate on the boundary of 𝒮. In this case, the primal problem (𝒫) may have multiple solutions.
In order to study the existence conditions of the canonical dual solutions, we let ∂𝒮 denote the boundary of 𝒮.
Theorem 3.3.
Suppose that P(x) is a given twice continuously differentiable function, 𝒮≠∅ and ∂𝒮≠∅. If for any given ρ0∈∂𝒮 and ρ∈𝒮,
limα→0+ Pd(ρ0 + αρ) = -∞, (3.9)
then the canonical dual problem (Pd) has a critical point ρ̂∈𝒮, and x(ρ̂) is a global minimizer to (𝒫).
Proof.
We first show that for any given ρ̅≥0∈ℛn,ρ̅≠0,
limα→∞ Pd(αρ̅) = -∞. (3.10)
Notice that there exist a point ρ* ∈ 𝒮 and a nonzero vector x(ρ*) ∈ 𝒳a. For any given ρ̅, the inequality α0ρ̅ > ρ* holds once α0 > 0 is large enough. Then for any α ≥ α0, it follows from Lemmas 2.1 and 2.2 that αρ̅ ∈ 𝒮 and x(αρ̅) stays in 𝒳a, that is, xi²(αρ̅) ≤ ℓi for all i. Hence there exists a large positive real L such that |P(x(αρ̅))| ≤ L for α ≥ α0, since P(x) is twice continuously differentiable in ℛn.
By Lemma 2.2, we have
dPd(αρ̅)/dα = (∇Pd(αρ̅))ᵀρ̅ = ∑i=1n (ρ̅i/2)[xi²(αρ̅) - ℓi], (3.11)
d²Pd(αρ̅)/dα² = -ρ̅ᵀDiag(x(αρ̅))[∇²P(x(αρ̅)) + Diag(αρ̅)]⁻¹Diag(x(αρ̅))ρ̅,
where Diag(x(αρ̅))ρ̅ = (ρ̅1x1(αρ̅), ρ̅2x2(αρ̅), …, ρ̅nxn(αρ̅))ᵀ. For any α ≥ α0, by the definition of 𝒮, it is easy to see that d²Pd(αρ̅)/dα² ≤ 0, namely, dPd(αρ̅)/dα is monotonically decreasing on [α0, +∞). Moreover, since x(αρ̅) ∈ 𝒳a, we have dPd(αρ̅)/dα ≤ 0 by (3.11). Then Pd(αρ̅) is monotonically decreasing on [α0, +∞). Thus, to prove (3.10), it suffices to show that there exists a positive real α̅ ≥ α0 such that
dPd(α̅ρ̅)/dα = ∑i=1n (ρ̅i/2)[xi²(α̅ρ̅) - ℓi] < 0, (3.12)
which implies that Pd(αρ̅) is strictly monotonically decreasing on [α̅, +∞).
Suppose that
∑i=1n (ρ̅i/2)[xi²(α1ρ̅) - ℓi] = 0 (3.13)
at a point α1 ≥ α0. Since ρ̅i[xi²(α1ρ̅) - ℓi] ≤ 0 for all i and ρ̅ ≠ 0, the equation xr²(α1ρ̅) = ℓr holds for some subscript r, which means that ρ̅r xr(α1ρ̅) ≠ 0. By the positive definiteness of ∇²P(x(α1ρ̅)) + Diag(α1ρ̅), we have dPd(α1ρ̅)/dα = 0 and d²Pd(α1ρ̅)/dα² < 0. One can then verify that dPd(αρ̅)/dα < 0 on (α1, +∞) by its monotonicity. Clearly, (3.12) holds immediately if no α satisfies (3.13). Consequently, there always exists a positive real α̅ such that Pd(αρ̅) is strictly monotonically decreasing on [α̅, +∞). This leads to the conclusion (3.10).
Since Pd:𝒮→ℛ is concave and the condition (3.10) holds, if (3.9) holds, then the canonical dual function Pd(ρ) is coercive on the open convex set 𝒮. Therefore, the canonical dual problem (Pd) has one maximizer ρ̂∈𝒮 by the theory of convex analysis [4, 18]. This completes the proof.
Clearly, when ∇²P(x) > 0 on 𝒳a, the dual feasible space 𝒮 coincides with ℛ+n = {ρ ∈ ℛn ∣ ρ ≥ 0} by (2.1). Notice that limα→∞ Pd(αρ̅) = -∞ for any given ρ̅ ≥ 0, ρ̅ ≠ 0. Then Pd(ρ) is concave and coercive on ℛ+n, and (Pd) has at least one maximizer on ℛ+n. In this case, it is of interest to characterize the unique solution of (𝒫) via the dual variable.
Let
I = {i ∈ {1,2,…,n} ∣ xi²(0) < ℓi},  J = {1,2,…,n} ∖ I. (3.14)
Theorem 3.4.
If ∇2P(x)>0 on 𝒳a, the primal problem (𝒫) has a unique global minimizer x(ρ̂) determined by ρ̂∈ℛ+n satisfying
ρ̂i=0,∀i∈I,xi2(ρ̂)=li,∀i∈J.
Proof.
To prove Theorem 3.4, by Theorems 2.3 and 3.1, it suffices to prove that ρ̂ is a KKT point of (Pd) in ℛ+n. By (3.14), the relations xi²(ρ̂) < ℓi for all i ∈ I also hold. Since ρ̂ satisfies the equations xi²(ρ̂) = ℓi for all i ∈ J, we can verify that x(ρ̂) stays in 𝒳a and the complementarity conditions ρ̂i(xi²(ρ̂) - ℓi) = 0 for all i hold. Thus ρ̂ is a KKT point of (Pd) in ℛ+n by (2.9) and (2.15), and x(ρ̂) is the unique global minimizer of (𝒫). This completes the proof.
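In the convex case, the structure asserted by Theorem 3.4 can be checked on a small instance. The sketch below uses hypothetical positive definite data, with a generic bound-constrained solver standing in for solving (Pd); ρ̂ is then recovered from the stationarity condition (2.2), and we verify ρ̂i = 0 on I and xi²(ρ̂) = ℓi on J:

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical convex instance: P(x) = 0.5 x^T A x - c^T x with A > 0,
# box x_i^2 <= ell_i.
A = np.array([[2.0, 0.5], [0.5, 1.0]])
c = np.array([3.0, 0.2])
ell = np.array([1.0, 4.0])

P = lambda x: 0.5 * x @ A @ x - c @ x
res = minimize(P, np.zeros(2), jac=lambda x: A @ x - c,
               bounds=[(-np.sqrt(l), np.sqrt(l)) for l in ell])
x_hat = res.x

# recover rho_hat from the stationarity condition: rho_i = -(grad P)_i / x_i
active = np.abs(np.abs(x_hat) - np.sqrt(ell)) < 1e-6   # indices in J
rho_hat = np.where(active, -(A @ x_hat - c) / x_hat, 0.0)

print(np.all(rho_hat >= -1e-8))                           # dual feasibility
print(np.all(np.abs(rho_hat * (x_hat**2 - ell)) < 1e-6))  # complementarity
```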
Before turning to applications in optimal control, we present two examples of finding global minimizers by differential flows.
Example 3.5.
As a particular example of (𝒫), let us consider the following one dimensional nonconvex minimization problem with a box:
min P(x) = (1/3)x³ + 2x  s.t. x² ≤ 1.
We have P′(x) = x² + 2 and P″(x) = 2x for all x² ≤ 1. Choosing ρ* = 6√2, we solve the following equation in {x² ≤ 1}:
x² + 2 + 6√2 x = 0
to get the solution x* = -2/(4 + 3√2). Next we solve the following boundary value problem for the ordinary differential equation:
dx/dρ = -x/(2x + ρ), x(6√2) = -2/(4 + 3√2), 2 < ρ ≤ 6√2.
To find a parameter such that
x2(ρ)=1,
we get
ρ̂=3,
which satisfies
P′′(x)+ρ̂=2x+3>0,∀x2≤1.
Let x(3) be denoted by x̂. To find the value of x̂, we solve the following algebraic system:
x² + 2 + 3x = 0, x² = 1,
and get x̂=-1. It follows from Theorem 3.1 that x̂=-1 is a global minimizer of Example 3.5.
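The computation in Example 3.5 can be reproduced by integrating the flow numerically (a sketch using SciPy's initial value solver):

```python
import numpy as np
from scipy.integrate import solve_ivp

# Example 3.5: P(x) = x^3/3 + 2x on x^2 <= 1.
# Flow: dx/drho = H(rho) x with H = -1/(P'' + rho) = -1/(2x + rho).
rhs = lambda rho, x: -x / (2 * x + rho)
rho_star = 6 * np.sqrt(2)
x_star = -2 / (4 + 3 * np.sqrt(2))        # solves x^2 + 2 + rho* x = 0
sol = solve_ivp(rhs, (rho_star, 3.0), [x_star], rtol=1e-10, atol=1e-12)
x_hat = sol.y[0, -1]                      # flow reaches the boundary at rho = 3
print(round(x_hat, 6))

P = lambda x: x**3 / 3 + 2 * x
print(abs(P(x_hat) - (-7 / 3)) < 1e-6)    # P(-1) = -7/3 is the global minimum
```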
Remark 3.6.
In this example, we see that a differential flow is useful in solving a nonconvex optimization problem. For the global optimization problem, people usually compute the global minimizer numerically. Even in using canonical duality method, one has to solve a canonical dual problem numerically. Nevertheless, the differential flow directs us to a new way for finding a global minimizer. Particularly, one may expect an exact solution of the problem provided that the corresponding differential equation has an analytic solution.
Example 3.7.
Given a symmetric matrix A ∈ ℛn×n and a vector c ∈ ℛn, let P(x) = (1/2)xᵀAx - cᵀx. We consider the following box-constrained nonconvex global optimization problem:
min P(x) = (1/2)xᵀAx - cᵀx  s.t. -ℓi ≤ xi ≤ ℓi, i = 1,2,…,n.
Suppose that A is indefinite. We choose ρ* ∈ ℛn large enough that A + Diag(ρ*) > 0 and sup-ℓi≤xi≤ℓi |(Ax - c)i| < ρi*ℓi, i = 1,2,…,n. The differential equation is then
dxi(ρ) = ∑j=1n Hij(ρ)xj(ρ)dρj with H(ρ) = -[A + Diag(ρ)]⁻¹, x(ρ*) = [A + Diag(ρ*)]⁻¹c, ρ ∈ 𝒮 ∩ {0 ≤ ρ ≤ ρ*},
where 𝒮 = {ρ ≥ 0 ∣ A + Diag(ρ) > 0}. This leads to the differential flow
x(ρ)=[A+Diag(ρ)]-1c,ρ∈S.
For simplicity, consider n = 2 and assume that A = (a1 a2; a2 a3) and c = (c1, c2)ᵀ. For ρ with det[A + Diag(ρ)] ≠ 0, we have
[A + Diag(ρ)]⁻¹ = (1/D)(a3+ρ2  -a2; -a2  a1+ρ1), where D = (a1+ρ1)(a3+ρ2) - a2².
Then the dual problem can be formulated as
max{Pd(ρ) = -(1/2)cᵀ[A + Diag(ρ)]⁻¹c - ℓᵀρ : ρ ∈ 𝒮}.
If we choose a1 = -0.5, a2 = -0.5, a3 = -0.3, c1 = c2 = 0.3, and ℓ1 = 0.5, ℓ2 = 2, this dual problem has a unique KKT point ρ̂ = (1.8, 0.7)ᵀ ∈ 𝒮. By Theorem 3.1, x̂ = [A + Diag(ρ̂)]⁻¹c = (1, 2)ᵀ is a global minimizer of Example 3.7, and P(1,2) = -2.75 = Pd(1.8, 0.7).
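The numbers in Example 3.7 are easy to verify with a short numerical check:

```python
import numpy as np

# Data of Example 3.7.
A = np.array([[-0.5, -0.5], [-0.5, -0.3]])
c = np.array([0.3, 0.3])
rho_hat = np.array([1.8, 0.7])

M = A + np.diag(rho_hat)
print(np.all(np.linalg.eigvalsh(M) > 0))   # rho_hat lies in S

x_hat = np.linalg.solve(M, c)              # x(rho_hat) = [A + Diag(rho_hat)]^{-1} c
print(np.round(x_hat, 6))                  # the global minimizer (1, 2)

P = lambda x: 0.5 * x @ A @ x - c @ x
print(round(P(x_hat), 6))                  # P(1, 2) = -2.75
```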
4. Applications to Constrained Optimal Control Problems
In this section, we consider the following constrained linear-quadratic optimal control problem: min J(x,u) = (1/2)∫0T [xᵀ(t)Qx(t) + uᵀ(t)Ru(t)] dt  s.t. ẋ(t) = Ax(t) + Bu(t), x(0) = x0, u(t) ∈ U, t ∈ [0,T], (4.1)
where Q ∈ ℛn×n and R ∈ ℛm×m are positive semidefinite and positive definite symmetric matrices, respectively, x(t) ∈ ℛn is the state vector, and u(t) ∈ ℛm is integrable or piecewise continuous on [0,T] with values in U. For simplicity, U = {u ∈ ℛm ∣ -1 ≤ ui ≤ 1} is the unit box. Problems of this type arise naturally in systems science and engineering, with wide applications [19, 20].
It is well known that the central result of optimal control theory is the Pontryagin maximum principle, which provides necessary conditions for optimality in very general optimal control problems.
4.1. Pontryagin Maximum Principle
Define the Hamiltonian
H(x,u,λ) = λᵀ(Ax + Bu) + (1/2)xᵀQx + (1/2)uᵀRu.
If the control û(·) is an optimal solution of problem (4.1), with x̂(·) and λ̂(·) denoting the state and costate corresponding to û(·), respectively, then û is an extremal control; that is, we have
x̂̇ = Hλ(x̂,û,λ̂) = Ax̂ + Bû, x̂(0) = x0, λ̂̇ = -Hx(x̂,û,λ̂) = -Aᵀλ̂ - Qx̂, λ̂(T) = 0, (4.3)
and for a.e. t ∈ [0,T],
H(x̂(t), û(t), λ̂(t)) = minu∈U H(x̂(t), u, λ̂(t)). (4.4)
Unfortunately, the above conditions are, in general, not sufficient for optimality. In that case, one must compare all candidates produced by the necessary conditions and pick out an optimal solution to the problem. Nevertheless, if a solution satisfies the sufficiency conditions of the type considered in this section, then these conditions ensure its optimality (Lemma 4.1).
Lemma 4.1.
Let û(·) be an admissible control, x̂(·) and λ̂(·) be the corresponding state and costate. If x̂(t), û(t), and λ̂(t) satisfy the Pontryagin maximum principle ((4.3)-(4.4)), then û(t) is an optimal control to the problem (4.1).
Proof.
For any given x,λ, let
H*(x,λ) = minu∈U H(x,u,λ). (4.5)
For any u ∈ U, by the definition of H*, H*(x,λ) ≤ H(x,u,λ), and evaluating H*(x,λ) is equivalent to solving the following global optimization problem:
minu∈U (1/2)uᵀRu + λᵀBu. (4.6)
Moreover, we can derive an analytic form of the global minimizer of (4.6) via the costate λ. It is easy to see that the minimizer û of (4.6) does not depend on x, that is, ∂û/∂x = 0, which implies that
Hx*(x,λ)=Hx(x,û,λ)+Hu(x,û,λ)∂û∂x=Hx(x,û,λ).
Since U is a closed convex set, by the classical linear systems theory, the state set X of (4.1) is a convex subset of ℛn. By the fact that the minimizer û does not depend on x and the convexity of the integrand in the cost functional, the function H*(x,λ) is convex with respect to x over X. In other words, for any x∈X, and a.e.t∈[0,T],
H*(x,λ̂(t))≥H*(x̂(t),λ̂(t))+Hx*T(x̂(t),λ̂(t))(x-x̂(t))=H(x̂(t),û(t),λ̂(t))+HxT(x̂(t),û(t),λ̂(t))(x-x̂(t)).
Thus, for any admissible pair (x(·),u(·)), and a.e.t∈[0,T], by (4.5), we have
H(x(t), u(t), λ̂(t)) ≥ H*(x(t), λ̂(t)) ≥ H(x̂(t), û(t), λ̂(t)) + Hxᵀ(x̂(t), û(t), λ̂(t))(x(t) - x̂(t)),
which leads to
H(x̂(t),û(t),λ̂(t))-H(x(t),u(t),λ̂(t))≤λ̂̇T(t)(x(t)-x̂(t)).
Notice that λ̂(T)=0 and x(0)=x̂(0)=x0. We can obtain
J[û(⋅)]-J[u(⋅)]=∫0T[H(x̂(t),û(t),λ̂(t))-H(x(t),u(t),λ̂(t))]dt+∫0Tλ̂T(t)(ẋ(t)-x̂̇(t))dt≤∫0Td[λ̂T(t)(x(t)-x̂(t))]=-λ̂T(0)(x(0)-x̂(0))=0.
This means that J attains its minimum at û. The proof is completed.
Lemma 4.1 reformulates the constrained optimal control problem (4.1) as the global optimization problem (4.6). Based on Theorem 3.4, an analytic solution of (4.1) can be expressed via the costate.
Theorem 4.2.
Suppose that
u = arg minu∈U (1/2)uᵀRu + λᵀBu.
We have the following expression
u=-[R+Diag(ρ(λ))]-1BTλ,
where ρ(λ) with respect to the co-state λ is given by ρ(λ)≥0 satisfying
ρi(λ)=0if(R-1BTλ)i2<1,[(R+Diag(ρ(λ)))-1BTλ]i2=1if(R-1BTλ)i2≥1.
Proof.
The proof of Theorem 4.2 parallels that of Theorem 3.4.
Substituting u=-[R+Diag(ρ(λ))]-1BTλ into (4.3), we have
ẋ = Ax + B(-[R + Diag(ρ(λ))]⁻¹Bᵀλ), x(0) = x0, λ̇ = -Aᵀλ - Qx, λ(T) = 0. (4.15)
If (x̂(·),λ̂(·)) is a solution of the above equations (4.15), let
û(⋅)=-[R+Diag(ρ(λ̂(⋅)))]-1BTλ̂(⋅).
By Lemma 4.1, x̂(·), û(·), and λ̂(·) satisfy the Pontryagin maximum principle, and we obtain an analytic form of the optimal control to (4.1) via the canonical dual variable:
û(t) = -[R + Diag(ρ(λ̂(t)))]⁻¹Bᵀλ̂(t), a.e. t ∈ [0,T]. (4.17)
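For diagonal R, the components of the control map in Theorem 4.2 decouple, giving the closed form u_i = -(Bᵀλ)_i / max(R_ii, |(Bᵀλ)_i|). A sketch (B and the diagonal of R are taken from Example 4.3 below; the costate value is hypothetical):

```python
import numpy as np

def optimal_control(lam, B, R_diag):
    # Componentwise control map for diagonal R:
    # u_i = -(B^T lam)_i / max(R_ii, |(B^T lam)_i|)
    v = B.T @ lam
    return -v / np.maximum(R_diag, np.abs(v))

B = np.array([[2.0, -5.0], [1.5, 7.0]])
R_diag = np.array([2.0, 3.0])
lam = np.array([0.4, -0.2])               # hypothetical costate value
u = optimal_control(lam, B, R_diag)
print(np.all(np.abs(u) <= 1.0))           # the control respects the unit box
```

When |(Bᵀλ)_i| ≤ R_ii, this reduces to the unconstrained minimizer -(Bᵀλ)_i/R_ii; otherwise the component saturates at ±1, exactly as in Theorem 4.2.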
Next, we give an example to illustrate our results.
Example 4.3.
We consider
A = (2  7; -2.5  3), B = (2  -5; 1.5  7), Q = (2.4  3; -2  -1), R = (2  0; 0  3), x(0) = (1, 1)ᵀ,
and T=1 in (4.1). Q≥0 and R>0 satisfy the assumption in (4.1).
Following the idea of Lemma 4.1 and Theorem 4.2, we need to solve the following boundary value problem for differential equations to derive the optimal solution
x̂̇ = Hλ(x̂,û,λ̂) = Ax̂ + Bû, x̂(0) = (1, 1)ᵀ, λ̂̇ = -Hx(x̂,û,λ̂) = -Aᵀλ̂ - Qx̂, λ̂(1) = 0, û1 = -(Bᵀλ̂)1/max[2, |(Bᵀλ̂)1|], û2 = -(Bᵀλ̂)2/max[3, |(Bᵀλ̂)2|], a.e. t ∈ [0,1]. (4.19)
By solving equations (4.19) in MATLAB, we obtain the optimal control û and the dual variable ρ(λ̂) shown in Figures 1 and 2.
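A hedged sketch of the same computation in Python, with SciPy's collocation solver standing in for MATLAB and the matrices as transcribed in Example 4.3:

```python
import numpy as np
from scipy.integrate import solve_bvp

A = np.array([[2.0, 7.0], [-2.5, 3.0]])
B = np.array([[2.0, -5.0], [1.5, 7.0]])
Q = np.array([[2.4, 3.0], [-2.0, -1.0]])
R_diag = np.array([2.0, 3.0])
x0 = np.array([1.0, 1.0])

def control(lam):
    # u_i = -(B^T lam)_i / max(R_ii, |(B^T lam)_i|); lam has shape (2, m)
    v = B.T @ lam
    return -v / np.maximum(R_diag[:, None], np.abs(v))

def rhs(t, y):
    # y stacks state and costate: y = (x1, x2, lam1, lam2)
    x, lam = y[:2], y[2:]
    return np.vstack([A @ x + B @ control(lam), -A.T @ lam - Q @ x])

def bc(ya, yb):
    return np.concatenate([ya[:2] - x0, yb[2:]])   # x(0) = x0, lambda(1) = 0

t = np.linspace(0.0, 1.0, 100)
sol = solve_bvp(rhs, bc, t, np.zeros((4, t.size)), tol=1e-6, max_nodes=50000)
u_hat = control(sol.sol(t)[2:])                    # optimal control on [0, 1]
print(sol.status == 0 and np.all(np.abs(u_hat) <= 1.0 + 1e-9))
```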
Figure 1: The optimal control û in Example 4.3.
Figure 2: The dual variable ρ(λ̂) in Example 4.3.
Acknowledgments
The authors would like to thank the referees for their helpful comments on the early version of this paper. This work was supported by the National Natural Science Foundation of China (no. 10971053).
References
[1] C. A. Floudas, Deterministic Global Optimization: Theory, Methods and Applications, Nonconvex Optimization and Its Applications, vol. 37, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2000.
[2] D. Y. Gao, "Nonconvex semi-linear problems and canonical duality solutions," in Advances in Mechanics and Mathematics, pp. 261–312, Kluwer Academic Publishers, Boston, Mass, USA, 2003.
[3] S.-C. Fang, D. Y. Gao, R.-L. Sheu, and S.-Y. Wu, "Canonical dual approach to solving 0-1 quadratic programming problems," Journal of Industrial and Management Optimization, vol. 4, no. 1, pp. 125–142, 2008.
[4] R. T. Rockafellar, Convex Analysis, Princeton Mathematical Series, no. 28, Princeton University Press, Princeton, NJ, USA, 1970.
[5] D. Y. Gao, "Solutions and optimality criteria to box constrained nonconvex minimization problems," Journal of Industrial and Management Optimization, vol. 3, no. 2, pp. 293–304, 2007.
[6] D. Y. Gao and N. Ruan, "Solutions to quadratic minimization problems with box and integer constraints," Journal of Global Optimization, vol. 47, no. 3, pp. 463–484, 2010.
[7] Z. Wang, S.-C. Fang, D. Y. Gao, and W. Xing, "Global extremal conditions for multi-integer quadratic programming," Journal of Industrial and Management Optimization, vol. 4, no. 2, pp. 213–225, 2008.
[8] D. Y. Gao, "Perfect duality theory and complete solutions to a class of global optimization problems," Optimization, vol. 52, no. 4-5, pp. 467–493, 2003.
[9] D. Y. Gao, Duality Principles in Nonconvex Systems: Theory, Methods and Applications, Nonconvex Optimization and Its Applications, vol. 39, Kluwer Academic Publishers, Dordrecht, The Netherlands, 2000.
[10] D. Y. Gao, "Canonical dual transformation method and generalized triality theory in nonsmooth global optimization," Journal of Global Optimization, vol. 17, no. 1–4, pp. 127–160, 2000.
[11] C. S. Adjiman, S. Dallwig, C. A. Floudas, and A. Neumaier, "A global optimization method, αBB, for general twice-differentiable constrained NLPs—I. Theoretical advances," Computers & Chemical Engineering, vol. 22, no. 9, pp. 1137–1158, 1998.
[12] C. S. Adjiman, I. P. Androulakis, and C. A. Floudas, "A global optimization method, αBB, for general twice-differentiable constrained NLPs—II. Implementation and computational results," Computers & Chemical Engineering, vol. 22, no. 9, pp. 1159–1179, 1998.
[13] H. Grassmann, Extension Theory, History of Mathematics, vol. 19, American Mathematical Society, Providence, RI, USA, 2000.
[14] G. C. Rota, "Extension theory of differential operators. I," Communications on Pure and Applied Mathematics, vol. 11, pp. 23–65, 1958.
[15] C. S. Adjiman, I. P. Androulakis, and C. A. Floudas, "Global optimization of mixed-integer nonlinear problems," AIChE Journal, vol. 46, no. 9, pp. 1769–1797, 2000.
[16] M. Mönnigmann, "Efficient calculation of bounds on spectra of Hessian matrices," SIAM Journal on Scientific Computing, vol. 30, no. 5, pp. 2340–2357, 2008.
[17] D. Hertz, C. S. Adjiman, and C. A. Floudas, "Two results on bounding the roots of interval polynomials," Computers & Chemical Engineering, vol. 23, no. 9, pp. 1333–1339, 1999.
[18] I. Ekeland and R. Temam, Convex Analysis and Variational Problems, North-Holland, Amsterdam, The Netherlands, 1976.
[19] J. Casti, "The linear-quadratic control problem: some recent results and outstanding problems," SIAM Review, vol. 22, no. 4, pp. 459–485, 1980.
[20] C. Robinson, Dynamical Systems: Stability, Symbolic Dynamics, and Chaos, Studies in Advanced Mathematics, CRC Press, Boca Raton, Fla, USA, 1995.