A new parallel variable distribution (PVD) algorithm, based on an interior point SSLE (sequential system of linear equations) technique, is proposed for solving inequality constrained optimization problems whose constraints are block-separable. At each iteration, the algorithm only needs to solve three systems of linear equations with the same coefficient matrix to obtain the search direction. Furthermore, under certain conditions, global convergence is achieved.
1. Introduction
Consider the following inequality constrained optimization problem:
(1) min f(x)   s.t.   gj(x) ≤ 0,   j ∈ I = {1,…,m},
where f: Rn→R and gj: Rn→R (j∈I) are continuously differentiable. We denote
(2) X = {x ∈ Rn ∣ gj(x) ≤ 0, j ∈ I},   X0 = {x ∈ Rn ∣ gj(x) < 0, j ∈ I},   I(x) = {j ∈ I ∣ gj(x) = 0}.
To solve problem (1), there are two types of methods with superlinear convergence: sequential quadratic programming (SQP) algorithms (see [1–4], etc.) and SSLE (sequential system of linear equations) algorithms (see [5–9], etc.). In general, since SQP algorithms must solve one or more quadratic programming subproblems at each iteration, their computational effort is considerable.
SSLE algorithms were proposed to solve problem (1) by considering, at each iteration, a linear system of the following form:
(3) H d0 + ∇xL(x, λ0) = 0,
    μj ∇gj(x)T d0 + λ0j gj(x) = 0,   j = 1,…,m,
where L(x,λ) = f(x) + ∑j=1m λj gj(x) is the Lagrangian function, H is an estimate of the Hessian of L, x is the current estimate of a solution x*, d0 is the search direction, and λ0 is the next estimate of the Kuhn-Tucker multiplier vector associated with x*. Obviously, solving a system of linear equations is simpler than solving a QP (quadratic programming) problem with inequality constraints.
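To make the structure of (3) concrete, its two equations can be assembled into one (n+m) × (n+m) linear system in (d0, λ0). A minimal numerical sketch with illustrative data (a quadratic f and one linear constraint; the names and the instance are ours, not from the paper), assuming numpy:

```python
import numpy as np

def solve_ssle(H, grad_f, N, g, mu):
    """Solve system (3):  H d0 + grad_f + N @ lam0 = 0  and
    mu[j] * (N[:, j] @ d0) + lam0[j] * g[j] = 0  for j = 1..m,
    assembled as one (n+m) x (n+m) linear system."""
    n, m = N.shape
    K = np.block([[H, N],
                  [np.diag(mu) @ N.T, np.diag(g)]])
    sol = np.linalg.solve(K, np.concatenate([-grad_f, np.zeros(m)]))
    return sol[:n], sol[n:]                      # d0, lam0

# Illustrative instance: f(x) = 0.5*||x||^2 + x[0], g1(x) = x[0] - 1,
# current point x = 0 (strictly feasible, since g1(0) = -1 < 0).
x = np.zeros(2)
H = np.eye(2)                                    # estimate of the Hessian of L
grad_f = x + np.array([1.0, 0.0])                # grad f at x
N = np.array([[1.0], [0.0]])                     # column j holds grad g_j
g = np.array([x[0] - 1.0])
mu = np.array([1.0])
d0, lam0 = solve_ssle(H, grad_f, N, g, mu)       # d0 is a descent direction
```

Here d0 = (-1/2, 0) and grad_f @ d0 = -1/2 < 0, consistent with d0 being a descent direction at a strictly feasible point.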
In addition, the parallel variable distribution (PVD) algorithm [10] is a method that distributes the variables among parallel processors. The problem is partitioned into subproblems, and each subproblem is assigned to a different processor. Each processor has primary responsibility for updating its own block of variables while allowing the remaining secondary variables to change in a restricted fashion along some easily computable directions. In 2002, Sagastizábal and Solodov [11] proposed two new variants of PVD for the constrained case. Without assuming convexity of the constraints, but assuming block-separable structure, they showed that the PVD subproblems can be solved inexactly via their quadratic programming approximations. In 2009, Han et al. [12] proposed an asynchronous PVT algorithm for solving large-scale linearly constrained convex minimization problems, based on the idea that such a constrained optimization problem is equivalent to a differentiable unconstrained optimization problem obtained by introducing the Fischer function. In 2011, Zheng et al. [13] gave a parallel SSLE algorithm for large-scale constrained optimization with block-separable structure, in which the PVD subproblems are solved inexactly by serial sequential linear equations. Without assuming convexity of the constraints, their algorithm is proved to be globally convergent to a KKT point.
In this paper, we take Zhu [8] as the main reference for an SSLE-type PVD method for problem (1). Suppose that problem (1) has the following block-separable structure:
(4) x = (x1, x2, …, xp),   xl ∈ Rnl,   l = 1,2,…,p,   ∑l=1p nl = n,
    g(x) = (g1(x1), g2(x2), …, gp(xp)),   gl(xl): Rnl → Rml,   ∑l=1p ml = m,
    I = (I1, I2, …, Ip) = {1,2,…,m},   Il(xl) = {lj ∣ glj(xl) = 0, lj ∈ Il = {1,2,…,ml}}.
Then the problem is distributed into p parallel subproblems, which are computed on p parallel processors. In the algorithm, at each iteration, the search direction is obtained by solving three systems of linear equations with the same coefficient matrix, and every iterate remains feasible. Thereby, the computational effort of the proposed algorithm is reduced further. Furthermore, its global convergence is obtained under suitable conditions.
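The block-separable structure (4) is exactly what allows the distribution: each gl depends only on its own block xl, so the p constraint blocks can be evaluated (and, in the algorithm, the subproblems solved) independently and concatenated afterwards. A small sketch with p = 2 illustrative blocks (all functions and data here are ours, not from the paper):

```python
import numpy as np

# p = 2 blocks: x = (x1, x2) with n1 = 2, n2 = 1; g1 : R^2 -> R^1 and
# g2 : R^1 -> R^2, so m = 3 constraints in total.
blocks = [slice(0, 2), slice(2, 3)]
g_blocks = [
    lambda x1: np.array([x1[0] + x1[1] - 1.0]),        # g1(x1)
    lambda x2: np.array([x2[0] - 2.0, -x2[0]]),        # g2(x2)
]

def g(x):
    # Each piece touches only its own block, so the evaluations are
    # independent and could run on p different processors.
    return np.concatenate([g_l(x[b]) for g_l, b in zip(g_blocks, blocks)])

x = np.array([0.25, 0.25, 1.0])
# g(x) = (-0.5, -1.0, -1.0): every component is negative, so x lies in X0.
```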
The remaining part of this paper is organized as follows. In Section 2, a parallel SSLE algorithm is presented. Global convergence is established under some basic assumptions in Section 3. And concluding remarks are given in the last section.
2. Description of Algorithm
Now we state our algorithm as follows.
Algorithm
Step 0 (initialization). Given a starting point x0 ∈ X0, choose parameters θ, σ ∈ (0,1), α ∈ (0,1/2), 2 < τ < δ < 3, μ¯ > 0, and 0 < μlj0 ≤ μ¯ (lj ∈ Il), together with an initial symmetric positive definite matrix Hl0 ∈ Rnl×nl for each l. Set k = 0.
Step 1 (parallelization). For each processor l∈{1,2,…,p}, let
(5) Nlk = (∇glj(xlk), lj ∈ Il),   Glk = diag(glj(xlk), lj ∈ Il),   Mlk = diag(μljk, lj ∈ Il).
1.1 Computation of the Newton Direction. Solve the following system of linear equations:
(6) [ Hlk          Nlk ] [ d ]        [ ∇lf(xlk) ]
    [ Mlk(Nlk)T    Glk ] [ λ ]  =  -  [ 0        ].
Let (dl0k,πlk) be the solution. If d0k=0, stop.
1.2 Computation of the Descent Direction. Solve the following system of linear equations:
(7) [ Hlk          Nlk ] [ d ]        [ ∇lf(xlk)   ]
    [ Mlk(Nlk)T    Glk ] [ λ ]  =  -  [ ∥dl0k∥δ el ],
where el = (1,…,1)T ∈ Rml. Let (dl1k, π~lk) be the solution.
1.3 Computation of the Main Search Direction. Establish a convex combination of dl0k and dl1k:
(8) dlk = (1-βlk) dl0k + βlk dl1k,   λlk = (1-βlk) πlk + βlk π~lk,
where
(9) βlk = max{β ∈ (0,1] ∣ (1-β) ∇lf(xlk)T dl0k + β ∇lf(xlk)T dl1k ≤ θ ∇lf(xlk)T dl0k}.
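Although (9) is stated as a maximization over β, it has a simple closed form: writing a0 = ∇lf(xlk)T dl0k < 0 and a1 = ∇lf(xlk)T dl1k, the condition reads β(a1 - a0) ≤ (θ - 1)a0, whose right-hand side is positive. A small sketch (the helper name is ours, not the paper's):

```python
def beta_k(a0, a1, theta):
    """Largest beta in (0, 1] with (1-beta)*a0 + beta*a1 <= theta*a0,
    where a0 = grad_f^T d0 < 0 and a1 = grad_f^T d1 (formula (9))."""
    assert a0 < 0 and 0 < theta < 1
    if a1 <= a0:                     # every beta in (0, 1] satisfies the condition
        return 1.0
    return min(1.0, (theta - 1.0) * a0 / (a1 - a0))

print(beta_k(-2.0, -1.0, 0.5))       # 1.0 (the condition holds with equality at beta = 1)
print(beta_k(-2.0, 4.0, 0.5))        # 1/6, the largest admissible beta
```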
1.4 Computation of the High-Order Corrected Direction. Set
(10)Llk={lj∈Il∣glj(xlk)≥-λljk}.
Solve the following system of linear equations:
(11) [ Hlk          Nlk ] [ d ]        [ ∇lf(xlk)              ]
     [ Mlk(Nlk)T    Glk ] [ λ ]  =  -  [ ∥dl0k∥τ el + Mlk g~lk ],
where
(12) g~lk = (g~ljk, lj ∈ Il),   g~ljk = { glj(xlk + dlk) if lj ∈ Llk;   0 if lj ∈ Il∖Llk }.
Let (d~lk+dlk,λ~lk+λlk) be the solution. If ∥d~lk∥>∥dlk∥, set d~lk=0.
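Step 1.4 reuses the coefficient matrix of (6) and (7); only the right-hand side changes. A sketch with illustrative data (the tiny instance below is ours, not from the paper): build the right-hand side of (11) from Llk in (10) and g~lk in (12), solve, recover d~ from the solution (d~ + d, λ~ + λ), and apply the safeguard ∥d~∥ > ∥d∥ ⇒ d~ = 0:

```python
import numpy as np

def high_order_correction(K, grad_f, d0, d, mu, g_shifted, L_mask, tau=2.5):
    """Solve (11) with the same coefficient matrix K as in (6)/(7)."""
    g_tilde = np.where(L_mask, g_shifted, 0.0)              # formula (12)
    rhs = -np.concatenate([grad_f,
                           np.linalg.norm(d0) ** tau * np.ones_like(mu)
                           + mu * g_tilde])
    sol = np.linalg.solve(K, rhs)
    d_tilde = sol[:len(grad_f)] - d      # the solution of (11) is (d~ + d, lam~ + lam)
    if np.linalg.norm(d_tilde) > np.linalg.norm(d):         # safeguard of Step 1.4
        d_tilde = np.zeros_like(d_tilde)
    return d_tilde

# Tiny instance: H = I, one constraint g1(x) = x[0] - 1 at x = 0, with
# d = d0 = (-1/2, 0), mu = 1, g1(x + d) = -1.5, and L empty (mask False).
K = np.array([[1.0, 0.0,  1.0],
              [0.0, 1.0,  0.0],
              [1.0, 0.0, -1.0]])        # block form [[H, N], [M N^T, G]]
d_tilde = high_order_correction(K, np.array([1.0, 0.0]), np.array([-0.5, 0.0]),
                                np.array([-0.5, 0.0]), np.array([1.0]),
                                np.array([-1.5]), np.array([False]))
```

On this instance the correction is small (∥d~∥ < ∥d∥), so the safeguard leaves it unchanged.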
Step 2 (synchronization). Let
(13) dk = (d1k, d2k, …, dpk),   d~k = (d~1k, d~2k, …, d~pk),   λk = (λ1k, λ2k, …, λpk),
     Jk = {j ∈ I ∣ gj(xk) ≥ λjk}.
2.1 The Line Search. Compute tk, the first number t in the sequence {1,1/2,1/4,…} satisfying
(14) f(xk + t dk + t2 d~k) ≤ f(xk) + α t ∇f(xk)T dk,
(15) gj(xk + t dk + t2 d~k) ≤ gj(xk),   j ∈ Jk,
(16) gj(xk + t dk + t2 d~k) < 0,   j ∈ I∖Jk.
Step 3 (update). Obtain Hlk+1 by updating the positive definite matrix Hlk using some quasi-Newton formulas. Set
(17) μljk+1 = min{max{πljk, ∥dl0k∥}, μ¯},   lj ∈ Il,
and xk+1=xk+tkdk+tk2d~k. Let k=k+1. Go back to Step 1.
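Steps 2.1–3 amount to a backtracking loop over t ∈ {1, 1/2, 1/4, …} that checks (14)–(16) and then updates the iterate. A sketch on illustrative data (a quadratic f, one linear constraint, and the high-order correction d~k taken as zero; none of this data is from the paper):

```python
import numpy as np

f      = lambda x: 0.5 * x @ x + x[0]
grad_f = lambda x: x + np.array([1.0, 0.0])
g_list = [lambda x: x[0] - 1.0]                  # one constraint, g1

def line_search(x, d, d_tilde, J, alpha=0.25):
    """Return the first t in {1, 1/2, 1/4, ...} satisfying (14)-(16),
    together with the updated iterate x + t d + t^2 d~."""
    fx = f(x)
    gx = [gj(x) for gj in g_list]
    slope = grad_f(x) @ d                        # grad f(x)^T d < 0 by Lemma 2
    t = 1.0
    while True:
        x_new = x + t * d + t * t * d_tilde
        g_new = [gj(x_new) for gj in g_list]
        armijo  = f(x_new) <= fx + alpha * t * slope                          # (14)
        ok_J    = all(g_new[j] <= gx[j] for j in J)                           # (15)
        ok_rest = all(g_new[j] < 0 for j in range(len(g_list)) if j not in J) # (16)
        if armijo and ok_J and ok_rest:
            return t, x_new
        t *= 0.5

x = np.zeros(2)                                  # strictly feasible start
d = np.array([-0.5, 0.0])                        # main direction from Step 1 (illustrative)
t, x_new = line_search(x, d, np.zeros(2), J=[])  # J plays the role of Jk in (13)
```

On this instance the full step t = 1 already satisfies all three conditions, and the new iterate stays strictly feasible.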
3. Global Convergence of Algorithm
We make the following general assumptions and let them hold throughout the paper.
(H3.1) For l = 1,2,…,p, the sets Xl = {xl ∣ gl(xl) ≤ 0} and Xl0 = {xl ∣ gl(xl) < 0} are nonempty, and the set Xl ∩ {xl ∣ f(xl) ≤ f(xl0)} is compact.
(H3.2) The functions f and gj (j ∈ I) are continuously differentiable.
(H3.3) For all xl ∈ Xl, the set of vectors {∇glj(xl): lj ∈ Il(xl)} is linearly independent.
(H3.4) There exist a, b > 0 such that a∥d∥2 ≤ dT Hlk d ≤ b∥d∥2 for all k and all d ∈ Rnl.
Lemma 1.
For any xl∈Xl, any positive-definite matrix Hl∈Rnl×nl, and nonnegative vector μl=(μlj,lj∈Il)∈Rml such that μlj>0, lj∈Il(xl), the matrix
(18) F(xl, Hl, μl) = [ Hl       Nl ]
                     [ Ml NlT   Gl ]
is nonsingular, where Nl=(∇glj(xl),lj∈Il)∈Rnl×ml, Ml=diag(μlj,lj∈Il)∈Rml×ml, and Gl=diag(glj(xl),lj∈Il)∈Rml×ml.
Proof.
We need only prove that (dl, λl) = (0,0) ∈ Rnl × Rml is the unique solution of the following linear system:
(19) F(xl, Hl, μl) (dl, λl)T = 0,   that is,   Hl dl + Nl λl = 0,   Ml NlT dl + Gl λl = 0.
Now consider the cases lj∈Il(xl) and lj∉Il(xl) separately.
For lj ∈ Il(xl), we have glj(xl) = 0 and μlj > 0, so the second equation of (19) gives
(20)∇glj(xl)Tdl=0.
Moreover, once dl = 0 and λlj = 0 (lj ∉ Il(xl)) are established below, the first equation of (19) yields
(21) ∑lj∈Il(xl) λlj ∇glj(xl) = 0.
Then assumption (H3.3) shows that λlj = 0 (∀lj ∈ Il(xl)).
For lj ∉ Il(xl) with μlj = 0, the second equation of (19) gives λlj glj(xl) = 0, and since glj(xl) < 0 it follows that
(22) λlj = 0.
If μlj>0, it follows from the second equation of (19) that
(23) ∇glj(xl)T dl = -(λlj/μlj) glj(xl).
Hence, combining (20), (22), and (23) with the first equation of (19) and the positive definiteness of Hl, we get
(24) 0 ≤ dlT Hl dl = -λlT NlT dl = -∑lj∉Il(xl), μlj>0 λlj (∇glj(xl)T dl) = ∑lj∉Il(xl), μlj>0 ((λlj)2/μlj) glj(xl) ≤ 0.
It follows that dl = 0, and then the second equation of (19) shows that λlj = 0 (∀lj ∉ Il(xl)).
Lemma 2.
For l=1,2,…,p,
if dl0k=0, then xk is a KKT point of (1);
if dl0k≠0, then dlk computed according to (8) is well defined and
(25) ∇lf(xlk)T dl0k ≤ -(dl0k)T Hlk dl0k < 0,
     ∇lf(xlk)T dlk ≤ θ ∇lf(xlk)T dl0k < 0,
     ∇glj(xlk)T dlk = -(λljk/μljk) glj(xlk) - βlk ∥dl0k∥δ,   lj ∈ Il.
Proof.
(1) It is obvious according to the definition of the KKT point of (1).
(2) If dl0k≠0, from (6), we have
(26) ∇lf(xlk)T dl0k = -(dl0k)T Hlk dl0k - (Nlk πlk)T dl0k = -(dl0k)T Hlk dl0k + ∑lj∈Il ((πljk)2/μljk) glj(xlk) ≤ -(dl0k)T Hlk dl0k < 0,
     ∇glj(xlk)T dl0k = -(πljk/μljk) glj(xlk),   lj ∈ Il.
Thereby, from (9), there exists some β ∈ (0,1] such that βlk = β; that is, dlk is well defined. In addition, from (7), it follows that
(27) ∇glj(xlk)T dl1k = -(π~ljk/μljk) glj(xlk) - ∥dl0k∥δ,   lj ∈ Il.
Thus, from (8), it is clear to see that
(28) ∇glj(xlk)T dlk = (1-βlk) ∇glj(xlk)T dl0k + βlk ∇glj(xlk)T dl1k = -(λljk/μljk) glj(xlk) - βlk ∥dl0k∥δ,   lj ∈ Il,
     ∇lf(xlk)T dlk = (1-βlk) ∇lf(xlk)T dl0k + βlk ∇lf(xlk)T dl1k ≤ θ ∇lf(xlk)T dl0k < 0.
The claim holds.
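The first inequality in (25) can also be checked numerically: on a randomly generated, strictly feasible instance (illustrative data, not from the paper), solve (6) and verify ∇lf(xlk)T dl0k ≤ -(dl0k)T Hlk dl0k, which holds because every glj(xlk) < 0:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
A = rng.standard_normal((n, n))
H = A @ A.T + np.eye(n)                  # symmetric positive definite H_l^k
N = rng.standard_normal((n, m))          # columns are the gradients grad g_lj
g = -rng.random(m) - 0.1                 # g_lj(x) < 0: strictly feasible point
mu = rng.random(m) + 0.5                 # positive mu_lj
grad_f = rng.standard_normal(n)

# Solve (6) for (d0, pi); by Lemma 1 the coefficient matrix is nonsingular here.
K = np.block([[H, N], [np.diag(mu) @ N.T, np.diag(g)]])
sol = np.linalg.solve(K, np.concatenate([-grad_f, np.zeros(m)]))
d0 = sol[:n]

# First inequality of (25): grad_l f^T d0 <= -d0^T H d0 (< 0 when d0 != 0).
assert grad_f @ d0 <= -d0 @ H @ d0 + 1e-10
```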
Lemma 3.
The line search in Step 2 of the algorithm is well defined; that is, there exists tk>0 such that (14)–(16) hold.
Proof.
Firstly, for (14), since f is continuously differentiable, we can see that
(29) ak ≜ f(xk + t dk + t2 d~k) - f(xk) - α t ∇f(xk)T dk = ∇f(xk)T (t dk + t2 d~k) - α t ∇f(xk)T dk + o(t) = (1-α) t ∇f(xk)T dk + o(t).
From (25), we have ∇lf(xlk)Tdlk<0, (l∈{1,2,…,p}). Then we can obtain
(30)∇f(xk)Tdk=∑l=1p∇lf(xlk)Tdlk<0.
Thus, for α∈(0,1/2), there exists some t¯>0, such that ak≤0, ∀t∈[0,t¯].
On the other hand, from (25), for lj ∈ Il (l ∈ {1,2,…,p}), we have
(31) ∇glj(xlk)T dlk = -(λljk/μljk) glj(xlk) - βlk ∥dl0k∥δ,   lj ∈ Il.
So, for each j ∈ I, say j = lj ∈ Il, the block-separable structure gives
(32) ∇gj(xk)T dk = ∇glj(xlk)T dlk = -(λjk/μjk) gj(xk) - βlk ∥dl0k∥δ,   j ∈ I.
Then, from (13), we obtain, for j∈Jk, that
(33) bjk ≜ gj(xk + t dk + t2 d~k) - gj(xk) = t ∇gj(xk)T dk + o(t) = -(λjk/μjk) t gj(xk) - t βlk ∥dl0k∥δ + o(t) ≤ -(1/μjk) t gj2(xk) - t βlk ∥dl0k∥δ + o(t).
So, there exists some t¯j>0, j∈Jk, such that bjk≤0, ∀t∈[0,t¯j]; that is, (15) holds.
Finally, for (16), since g is continuous and gj(xk) < 0, there exists some t¯j > 0, j ∉ Jk, such that
(34) gj(xk + t dk + t2 d~k) ≤ (1/2) gj(xk) < 0.
Taking tk as the first element of {1, 1/2, 1/4, …} not exceeding min{t¯, t¯j, j ∈ I}, the conclusion holds.
According to (H3.1), (H3.2), and (H3.4), we may assume that there exists a subsequence K such that
(35) xk → x*,   Hlk → Hl*,   dl0k → dl0*,   πlk → πl*,   μlk → μl*,   k ∈ K.
In order to obtain the global convergence of the algorithm, we assume the following condition.
(H3.5) The number of stationary points of (1) is finite.
Theorem 4.
The algorithm in Section 2 either stops at a KKT point xk of (1) after finitely many iterations or generates an infinite sequence {xk} all of whose accumulation points are KKT points of (1).
Proof.
The first statement is obvious, since the only stopping point is in Step 1.1. Firstly, we show that
(36){xk}k∈K⟶x*,d0k⟶0,k∈K,
where x*=(x1*,x2*,…,xp*), d0k=(d10k,d20k,…,dp0k).
Since {f(xk)} is monotonically decreasing, the facts {xk}k∈K→x* and continuity of f imply that
(37)f(xk)⟶f(x*),k⟶∞.
Suppose by contradiction that dl0* ≠ 0 for some l ∈ {1,2,…,p}. Then, from (17), we have
(38)μljk⟶μlj*>0,lj∈Il,k∈K.
Hence, it is easy to prove that (dl0*,πl*) is the unique solution of the following linear system:
(39)Hl*dl+∇lf(xl*)+Nl*πl=0,μlj*∇glj(xl*)Tdl+πljglj(xl*)=0,lj∈Il,
where Nl*=(∇glj(xl*), lj∈Il). Then we can obtain
(40)H*d+∇f(x*)+N*π=0,μj*∇gj(x*)Td+πjgj(x*)=0,j∈I.
Thereby,
(41)∇f(x*)Td0*<0.
Similar to (8), we define d*, and by imitating the proof of Lemma 2, it follows that
(42)∇f(x*)Td*<0,∇gj(x*)Td*≤-λj*μj*gj(x*)-β*∥d0*∥δ,j∈I.
From (41), (42), and the proof of Lemma 3, we can conclude that the step-size tk obtained by the line search in Step 2.1 is bounded away from zero on K; that is,
(43)tk≥t*=inf{tk,k∈K}>0,k∈K.
So, from (14), (37), and (42), we get
(44) 0 = limk∈K (f(xk+1) - f(xk)) ≤ limk∈K α tk ∇f(xk)T dk ≤ (1/2) α t* ∇f(x*)T d* < 0.
It is a contradiction, which shows that d0k→0, k∈K.
Furthermore, from (40), we have
(45)∇f(x*)+N*π*=0,πj*gj(x*)=0,j∈I.
If gj(x*)<0, ∀j∈I, then π*=0, ∇f(x*)=0, and it is obvious that x* is a KKT point of (1).
Without loss of generality, suppose that there exists some j0 ∈ I such that gj0(x*) = 0. If πj0* ≥ 0, then it is easy to see that x* is a KKT point of (1). Suppose, then, that πj0* < 0. Since there are only finitely many choices of the set Jk ⊆ I, we may assume that Jk ≡ J for all sufficiently large k ∈ K, where J is a fixed set; obviously, j0 ∈ J ≠ ∅. From condition (H3.5), it holds that xk → x*, k → ∞. Thereby, it holds that
(46)λj0k≤gj0(xk),λj0k⟶λj0*<0,gj0(xk)⟶gj0(x*)=0.
On the other hand, from (15), there exists some k0 such that, for k ≥ k0,
(47) gj0(xk) ≤ gj0(xk-1) ≤ ⋯ ≤ gj0(xk0+1) ≤ gj0(xk0) < 0,
which contradicts (46). This shows that x* is a KKT point of (1).
4. Concluding Remarks
In this paper, combined with the idea of parallel variable distribution, we proposed a new interior point SSLE algorithm for solving constrained optimization problems whose constraints have a block-separable structure. Under some mild conditions, the theoretical analysis shows that the algorithm is globally convergent.
It is noted that some problems are worthy of further study, such as extending the parallel algorithm to problems with both inequality and equality constraints.
Conflict of Interests
The authors have declared that there is no conflict of interests.
Acknowledgments
The authors would also like to thank the anonymous referees for the careful reading and helpful comments and suggestions that led to an improved version of this paper. This research was supported by the Foundation of Hunan Provincial Education Department under Grant (nos. 12A077, 12C0743, and 13C453) and Scientific Research Fund of Hunan University of Humanities, Science and Technology of China (no. 2012QN04).
[1] P. T. Boggs and J. W. Tolle, "A strategy for global convergence in a sequential quadratic programming algorithm," 1989, 26(3), 600–623.
[2] C. T. Lawrence and A. L. Tits, "Nonlinear equality constraints in feasible sequential quadratic programming," 1996, 6(4), 265–282.
[3] Z. Zhu and J. Jian, "An efficient feasible SQP algorithm for inequality constrained optimization," 2009, 10(2), 1220–1228. doi:10.1016/j.nonrwa.2008.01.001.
[4] Z. Luo, G. Chen, and S. Luo, "Improved feasible SQP algorithm for nonlinear programs with equality constrained sub-problems," 2013, 8(6), 1496–1503.
[5] E. R. Panier, A. L. Tits, and J. N. Herskovits, "QP-free, globally convergent, locally superlinearly convergent algorithm for inequality constrained optimization," 1988, 26(4), 788–811.
[6] H. D. Qi and L. Qi, "A new QP-free, globally convergent, locally superlinearly convergent algorithm for inequality constrained optimization," 2001, 11(1), 113–132. doi:10.1137/S1052623499353935.
[7] L. Chen, Y. Wang, and G. He, "A feasible active set QP-free method for nonlinear programming," 2006, 17(2), 401–429. doi:10.1137/040605904.
[8] Z. Zhu, "An interior point type QP-free algorithm with superlinear convergence for inequality constrained optimization," 2007, 31(6), 1201–1212. doi:10.1016/j.apm.2006.04.019.
[9] W. X. Cheng, C. C. Huang, and J. B. Jian, "An improved infeasible SSLE method for constrained optimization without strict complementarity," 2013, 40(5), 1506–1515. doi:10.1016/j.cor.2012.10.012.
[10] M. C. Ferris and O. L. Mangasarian, "Parallel variable distribution," 1994, 4(4), 815–832.
[11] C. A. Sagastizábal and M. V. Solodov, "Parallel variable distribution for constrained optimization," 2002, 22(1), 111–131. doi:10.1023/A:1014890403681.
[12] C. Han, Y. Wang, and G. He, "On the convergence of asynchronous parallel algorithm for large-scale linearly constrained minimization problem," 2009, 211(2), 434–441. doi:10.1016/j.amc.2009.01.081.
[13] F. Zheng, C. Han, and Y. Wang, "Parallel SSLE algorithm for large scale constrained optimization," 2011, 217(12), 5377–5384. doi:10.1016/j.amc.2010.12.005.