Advances in Operations Research, Volume 2012, Article ID 281396, DOI 10.1155/2012/281396
Hindawi Publishing Corporation

Research Article

An Asymmetric Proximal Decomposition Method for Convex Programming with Linearly Coupling Constraints

Xiaoling Fu,1 Xiangfeng Wang,2 Haiyan Wang,1 and Ying Zhai3
1 Institute of System Engineering, Southeast University, Nanjing 210096, China
2 Department of Mathematics, Nanjing University, Nanjing 210093, China
3 Department of Mathematics, Guangxi Normal University, Guilin 541004, China

Academic Editor: Abdellah Bnouhachem

Received 17 November 2011; Accepted 10 January 2012

Copyright © 2012 Xiaoling Fu et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The problems studied are separable variational inequalities with linearly coupling constraints. Some existing decomposition methods are very problem-specific, and their computational cost is quite high. Combining the ideas of the proximal point algorithm (PPA) and the augmented Lagrangian method (ALM), we propose an asymmetric proximal decomposition method (AsPDM) to solve a wide variety of separable problems. By adding an auxiliary quadratic term to the general Lagrangian function, our method can take advantage of the separable structure. We also present an inexact version of AsPDM to reduce the computational load of each iteration. In the computation process, the inexact version only uses function values. Moreover, the inexact criterion and the step size can be implemented in parallel. The convergence of the proposed method is proved, and numerical experiments are employed to show the advantage of AsPDM.

1. Introduction

The original model considered here is the convex minimization problem with linearly coupling constraints:

min Σ_{j=1}^N θ_j(x_j)  s.t.  Σ_{j=1}^N (A_j x_j − b_j) ≤ 0  (or Σ_{j=1}^N (A_j x_j − b_j) = 0),  x_i ∈ X_i, i = 1, …, N,   (1.1)

where X_i ⊂ R^{n_i}, the A_i are given m × n_i matrices, the b_i are given m-vectors, and θ_i : R^{n_i} → R is a convex differentiable function for each i = 1, …, N. This special problem is called a convex separable problem. Problems possessing such separable structure arise in discrete-time deterministic optimal control and in the scheduling of hydroelectric power generation [1]. Since the θ_i are differentiable, setting ∇θ_i(x_i) = f_i(x_i) and applying the well-known minimum principle of nonlinear programming, it is easy to obtain an equivalent form of problem (1.1): find x* = (x_1*, …, x_N*) ∈ Ω such that

(x_i − x_i*)^T f_i(x_i*) ≥ 0,  i = 1, …, N,  ∀x ∈ Ω,   (1.2)

where

Ω = {(x_1, …, x_N) | Σ_{j=1}^N A_j x_j ≤ b (or Σ_{j=1}^N A_j x_j = b), x_i ∈ X_i, i = 1, …, N},  b = Σ_{j=1}^N b_j.   (1.3)

Problems of this type are called separable variational inequalities (VIs). We will utilize this equivalent formulation and provide a method for the solution of separable VIs.

One of the best-known algorithms for solving convex programs or equivalent VIs is the proximal point algorithm (PPA), first proposed by Martinet (see [2]) and further studied by Rockafellar [3, 4]. PPA and its dual version, the method of multipliers, draw on a large volume of prior work by various authors [5–9]. However, the classical PPA and most of its successors cannot take advantage of the separability of the original problem, which makes them inefficient for problems with separable structure. One major direction of PPA research is to develop decomposition methods for separable convex programming and VIs. The motivations for decomposition techniques are to split the problem into isolated smaller or easier subproblems and to parallelize computations on specific parallel computing devices. Decomposition-type methods [10–14] for large-scale problems have been widely studied in optimization as well as in variational problems and are explicitly or implicitly derived from PPA. However, most of those methods can only solve separable problems with special equality constraints:

min θ(x) + ψ(y)  s.t.  Ax + y = 0,  x ∈ X, y ∈ Y.   (1.4)

Two well-known methods for solving equality-constrained convex problems and VIs are the augmented Lagrangian method (ALM) [15, 16] and the alternating direction method (ADM) [17]. The classic ALM has been deeply studied and has many advantages over general Lagrangian methods; see [18] for more detail. However, it cannot preserve separability. ADM is a different but closely related method, which essentially preserves separability for problems with two operators (N = 2). Recently, the separable augmented Lagrangian method (SALM) [19, 20] overcame the nonseparability of ALM. For example, to solve problem (1.1) with equality constraints, Hamdi and Mahey [19] allocated a resource quantity y_i to each block A_i x_i − b_i, leading (1.1) to an enlarged problem in Π_{j=1}^N X_j × R^{mN}:

min Σ_{j=1}^N θ_j(x_j)  s.t.  Σ_{j=1}^N y_j = 0,  A_i x_i − b_i + y_i = 0, i = 1, …, N,  x_i ∈ X_i, y_i ∈ R^m, i = 1, …, N.   (1.5)
It is worth mentioning that (1.5) is equivalent to problem (1.1) only in the case of equality constraints. The augmented Lagrangian function of (1.5) is

L(x, y, u, τ) = Σ_{j=1}^N L_j(x_j, y_j, u_j, τ),  with  L_j(x_j, y_j, u_j, τ) = θ_j(x_j) − ⟨u_j, A_j x_j − b_j + y_j⟩ + (τ/2)‖A_j x_j − b_j + y_j‖².

SALM finds a saddle point of problem (1.5) through the following stages:

x_i^{k+1} = argmin_{x_i ∈ X_i} L_i(x_i, y_i^k, u_i^k, τ);

(y_1^{k+1}, …, y_N^{k+1}) = argmin_{Σ_{j=1}^N y_j = 0} Σ_{i=1}^N L_i(x_i^{k+1}, y_i, u_i^k, τ);

u_i^{k+1} = u_i^k + τ(A_i x_i^{k+1} − b_i + y_i^{k+1}).

Note that the computation of x^{k+1} in SALM allows one to solve N subproblems in parallel, which is of great practical importance from the computational point of view. In fact, SALM belongs to the family of splitting algorithms and coincides with ADM applied to the special convex problem (1.4) with ψ(y) = 0 and A = (A_1; …; A_N). SALM has to introduce an additional variable y to exploit the inner separable structure of the problem, which enlarges the problem. Moreover, SALM is suited to equality-constrained problems and is fraught with difficulties when applied to inequality-constrained problems.

To the best of our knowledge, there are few dedicated methods for solving the inequality-constrained problem (1.1) or VI (1.2)-(1.3), apart from the decomposition method proposed by Tseng [21] and the PPA-based contraction method of He et al. [22]. The decomposition method in [21] decomposes the computation of x^{k+1} at a fine level without introducing the additional variable y. But, in each iteration of that method, the minimization subproblems for x^{k+1} depend on the step size of the multiplier, which greatly restricts the computation of the subproblems. The PPA-based contraction method in [22] has a nice decomposable structure; however, it has to solve the subproblems exactly. To solve (1.1) or VI (1.2)-(1.3), motivated by the PPA-based contraction method and SALM, we propose an asymmetric proximal decomposition method (AsPDM) which preserves the separability of the problem. Moreover, it neither introduces the resource variables y of SALM, nor do its subproblems depend on the step size of the multiplier. In the following, we briefly describe our method for (1.1): we add an auxiliary quadratic term to the general Lagrangian function:

L(x, x^k, λ) = Σ_{j=1}^N L_j(x_j, x_j^k, λ),  with  L_j(x_j, x_j^k, λ) = θ_j(x_j) − ⟨λ, A_j x_j − b_j⟩ + (β_j/2)‖x_j − x_j^k‖².

The general framework of AsPDM is as follows:

Phase I. x̃_i^k = argmin_{x_i ∈ X_i} L_i(x_i, x_i^k, λ^k), i = 1, …, N;

Phase II. λ̃^k = P_Λ[λ^k − μ^{-1} Σ_{j=1}^N (A_j x̃_j^k − b_j)], Λ = R_+^m (or R^m),
w^{k+1} = w^k − α_k G(w^k − w̃^k), w = (x, λ).

Here, β_i > 0, μ > 0, α_k > 0, and G are properly chosen; details are given in the later sections. Note that the first phase consists of N isolated subproblems, each involving only x_i, i = 1, …, N; that is, it can be partitioned into N independent lower-dimensional subproblems. Hence, the method can take advantage of the operators' separability. Since we mainly focus on solving the equivalent separable VI, we present the method in a VI framework and analyze its convergence in the following sections.

2. The Asymmetric Proximal Decomposition Method

2.1. Structured VI

The separable VI (1.2)-(1.3) consists of N partitioned sub-VIs. Introducing a Lagrange multiplier vector λ ∈ Λ (Λ = R_+^m or R^m) associated with the linearly coupling constraint Σ_{j=1}^N A_j x_j ≤ b (or Σ_{j=1}^N A_j x_j = b), we equivalently formulate the separable VI (1.2)-(1.3) as an enlarged VI: find w* = (x*, λ*) ∈ W such that

(x_i − x_i*)^T {f_i(x_i*) − A_i^T λ*} ≥ 0,  i = 1, …, N,   (2.1)
(λ − λ*)^T (Σ_{j=1}^N A_j x_j* − b) ≥ 0,  ∀w ∈ W,   (2.2)

where W = X × Λ, X = Π_{j=1}^N X_j. VI (2.1)-(2.2) is referred to as a structured variational inequality (SVI), denoted by SVI(W, Q). Here,

Q(w) = (f_1(x_1) − A_1^T λ, …, f_N(x_N) − A_N^T λ, Σ_{j=1}^N A_j x_j − b).   (2.3)

2.2. Preliminaries

We summarize some basic properties and related definitions which will be used in the following discussions.

Definition 2.1.

(i) The mapping f is said to be monotone if and only if (w − w̃)^T (f(w) − f(w̃)) ≥ 0, ∀w, w̃.

(ii) A mapping f is said to be Lipschitz continuous if there is a constant L > 0 such that ‖f(w) − f(w̃)‖ ≤ L‖w − w̃‖, ∀w, w̃.

The projection onto a closed convex set is a basic concept in this paper. Let W ⊂ R^n be any closed convex set. We use P_W(w) to denote the projection of w onto W under the Euclidean norm; that is,

P_W(w) = argmin{‖v − w‖ | v ∈ W}.

The following lemmas are useful for the convergence analysis in this paper.
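As a concrete illustration (assuming simple feasible sets where the projection has a closed form, which is the situation exploited throughout the paper), the projection onto the nonnegative orthant or onto a box reduces to componentwise clipping; the sets and sample points below are hypothetical:

```python
import numpy as np

def project_nonneg(w):
    """P_W(w) for W = R^n_+ : componentwise max with 0."""
    return np.maximum(w, 0.0)

def project_box(w, lo, hi):
    """P_W(w) for the box W = [lo, hi]^n : componentwise clipping."""
    return np.clip(w, lo, hi)

w = np.array([-1.5, 0.3, 2.0])
p = project_nonneg(w)             # nearest point of R^3_+ to w
# P_W(w) minimizes ||v - w|| over W: no sampled feasible v gets closer
rng = np.random.default_rng(4)
closest = all(np.linalg.norm(p - w) <= np.linalg.norm(v - w)
              for v in rng.uniform(0.0, 3.0, (100, 3)))
```

The brute-force check `closest` simply witnesses the defining minimization property of P_W on random feasible points.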

Lemma 2.2.

Let W be a closed convex set in R^n; then one has

(w − P_W(w))^T (w′ − P_W(w)) ≤ 0,  ∀w ∈ R^n, ∀w′ ∈ W;   (2.7)

‖P_W(w) − w′‖² ≤ ‖w − w′‖² − ‖w − P_W(w)‖²,  ∀w ∈ R^n, ∀w′ ∈ W.   (2.8)

Proof.

See [23].

Lemma 2.3.

Let W be a closed convex set in R^n; then w* is a solution of VI(W, Q) if and only if w* = P_W[w* − βQ(w*)] for any β > 0.

Proof.

See [10, page 267].

Hence, solving VI(W, Q) is equivalent to finding a zero point of the residue function

e(w, β) = w − P_W[w − βQ(w)],  β > 0.

Generally, the term e(w) (which denotes e(w, 1)) is referred to as the error bound of VI(W, Q), since it measures the distance of w from the solution set.
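As a toy illustration of the residue function (with a hypothetical one-dimensional VI: W = R_+ and Q(w) = w − 1, whose unique solution is w* = 1), e(w, β) vanishes exactly at the solution and is nonzero elsewhere:

```python
import numpy as np

def residual(w, Q, project, beta=1.0):
    """e(w, beta) = w - P_W[w - beta * Q(w)]; zero exactly at solutions of VI(W, Q)."""
    return w - project(w - beta * Q(w))

# hypothetical toy VI: W = R_+, Q(w) = w - 1, solution w* = 1
Q = lambda w: w - 1.0
P = lambda w: np.maximum(w, 0.0)

print(residual(np.array([1.0]), Q, P))   # -> [0.]  at the solution
print(residual(np.array([3.0]), Q, P))   # -> [2.]  away from the solution
```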

2.3. The Presentation of the Exact AsPDM

In each iteration, by proper construction, our method solves N independent sub-VIs, each involving only the individual variable x_i, so that the x_i can be obtained in parallel. In what follows, to illustrate the method's practical significance, we interpret the algorithmic process as a system with a central authority and N local administrators. Each administrator attempts to unilaterally solve a certain problem under the presumption that the instructions given by the authority are parametric inputs and that the responses of the other administrators' actions are not available; namely, the N local administrators act synchronously and independently once they receive the information given by the central authority. We briefly describe our method, which consists of two main phases.

Phase I: For arbitrary (x_1^k, …, x_N^k, λ^k) given by the central authority, each local administrator uses his own way to offer the solution (denoted by x̃_i^k) of his individual problem: find x̃_i^k ∈ X_i such that

(x_i − x̃_i^k)^T {f_i(x̃_i^k) + β_i(x̃_i^k − x_i^k) − A_i^T λ^k} ≥ 0,  ∀x_i ∈ X_i.   (2.11)

Phase II: After the N local administrators accomplish their tasks, the central authority collects the resulting (x̃_1^k, …, x̃_N^k) and, moreover, the corresponding (f_1(x̃_1^k), …, f_N(x̃_N^k)), which can be viewed as feedback from the N local administrators, and sets

λ̃^k = P_Λ[λ^k − μ^{-1}(Σ_{j=1}^N A_j x̃_j^k − b)].   (2.12)

Here, μ is suitably chosen by the central authority. The central authority then aims to employ this feedback effectively to provide (x_1^{k+1}, …, x_N^{k+1}, λ^{k+1}) for the next iteration loop. In this paper, the proposed methods update the new iterate in one of the following two forms:

w_1^{k+1} = w^k − α_k G(w^k − w̃^k),  or  w_2^{k+1} = P_W[w^k − α_k Q(w̃^k)],

where α_k > 0 is a specific step size and G is the asymmetric block matrix with diagonal blocks β_1 I, …, β_N I, μI whose last block column carries the coupling matrices, that is,

G(x_1, …, x_N, λ) = (β_1 x_1 + A_1^T λ, …, β_N x_N + A_N^T λ, μλ).   (2.15)
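To make the two-phase structure concrete, here is a minimal numerical sketch of the exact scheme under simplifying assumptions (all data and sizes hypothetical): quadratic θ_i(x_i) = (1/2)x_i^T H_i x_i − c_i^T x_i with X_i = R^{n_i}, so Phase I has a closed-form solution; equality coupling constraints (Λ = R^m); μ fixed at the safe value Σ_j ‖A_j A_j^T‖/(2β_j) + η justified later by Lemma 2.9; and the update w_1^{k+1} with γ = 1.8. The Fejér monotonicity of ‖w^k − w*‖ guaranteed by Theorem 3.5 can be observed directly:

```python
import numpy as np

rng = np.random.default_rng(0)
N, m, n = 3, 2, 4                                  # toy sizes (assumptions)
H = [(i + 2.0) * np.eye(n) for i in range(N)]      # theta_i(x) = 0.5 x'H_i x - c_i'x
c = [rng.standard_normal(n) for _ in range(N)]
A = [rng.standard_normal((m, n)) for _ in range(N)]
b = rng.standard_normal(m)
beta, eta = [2.0] * N, 0.5
mu = sum(np.linalg.norm(Ai @ Ai.T, 2) / (2.0 * bi) for Ai, bi in zip(A, beta)) + eta

# reference solution from the KKT system: H_i x_i - A_i^T lam = c_i, sum_i A_i x_i = b
Hblk = np.kron(np.diag(np.arange(N) + 2.0), np.eye(n))
Aall = np.hstack(A)
K = np.block([[Hblk, -Aall.T], [Aall, np.zeros((m, m))]])
wstar = np.linalg.solve(K, np.concatenate(c + [b]))

x, lam, dists = [np.zeros(n) for _ in range(N)], np.zeros(m), []
for k in range(400):
    dists.append(np.linalg.norm(np.concatenate(x + [lam]) - wstar))
    # Phase I: N independent proximal subproblems, here solvable in closed form
    xt = [np.linalg.solve(H[i] + beta[i] * np.eye(n),
                          c[i] + A[i].T @ lam + beta[i] * x[i]) for i in range(N)]
    # Phase II: multiplier prediction (no projection needed since Lambda = R^m)
    lamt = lam - (sum(A[i] @ xt[i] for i in range(N)) - b) / mu
    # correction step w^{k+1} = w^k - alpha_k * G (w^k - wtilde^k)
    dx, dlam = [x[i] - xt[i] for i in range(N)], lam - lamt
    gx = [beta[i] * dx[i] + A[i].T @ dlam for i in range(N)]
    glam = mu * dlam
    num = sum(d @ g for d, g in zip(dx, gx)) + dlam @ glam
    den = sum(g @ g for g in gx) + glam @ glam
    if den < 1e-28:                                # essentially converged
        break
    alpha = 1.8 * num / den                        # gamma = 1.8 times alpha_k^*
    x = [x[i] - alpha * gx[i] for i in range(N)]
    lam = lam - alpha * glam
```

This is an illustrative sketch, not the authors' Matlab implementation; the list `dists` is nonincreasing, in line with Theorem 3.5.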

We make the standard assumptions to guarantee that the problem under consideration is solvable and the proposed methods are well defined.

Assumption A.

fi(xi) is monotone and Lipschitz continuous, i=1,,N.

By this assumption, it is easy to get that Q(w) is monotone.

2.4. Presentation of the Inexact AsPDM

In this section, the inexact version of the AsPDM is presented, and some remarks are briefly made.

For later convenience, we denote

D = diag((β_1/2) I, …, (β_N/2) I, ηI).   (2.16)

At each iteration, we solve the N sub-VIs (see (2.11)) independently. No doubt, the computational load of finding an exact solution of (2.11) is usually high. Hence, it is desirable to solve (2.11) inexactly under a relatively relaxed inexact criterion. We now describe and analyze our inexact method. Each iteration consists of two main phases: one provides an inexact solution of (2.11), and the other employs this inexact solution to produce a new iterate for the next iteration.

The first phase of our method works as follows. At the beginning of the kth iteration, an iterate w^k = (x^k, λ^k) is given. If x_i^k = P_{X_i}[x_i^k − (f_i(x_i^k) − A_i^T λ^k)], then x_i^k is the exact solution of the ith sub-VI, and nothing needs to be done for the ith sub-VI. Otherwise, we find x̃_i^k ∈ X_i such that

(x_i − x̃_i^k)^T {f_i(x̃_i^k) + β_i(x̃_i^k − x_i^k) − A_i^T λ^k + ξ_{x_i}^k} ≥ 0,  ∀x_i ∈ X_i,   (2.17)

with

ξ_{x_i}^k = f_i(x_i^k) − f_i(x̃_i^k).

Here, the chosen β_i should satisfy the following two inexact criteria:

(x_i^k − x̃_i^k)^T ξ_{x_i}^k ≤ (νβ_i/2)‖x_i^k − x̃_i^k‖²,  ν ∈ (0, 1),   (2.18*a)
‖ξ_{x_i}^k‖ ≤ √(β_i/2) ‖x_i^k − x̃_i^k‖.   (2.18*b)

Whenever one of the above criteria fails, we increase β_i by β_i := 1.8β_i and solve the ith sub-VI of (2.17) again with the updated β_i. It should be noted that both inexact criteria are quite easy to check, since they do not contain any unknown variables. Another favorable characteristic of these criteria is that they are independent; namely, they involve only x_i^k, x̃_i^k and are independent of x_j^k, x̃_j^k (j ≠ i).
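The backtracking on β_i can be sketched for a single block as follows, assuming X_i = R^n_+ so that the explicit projection formula of Remark 2.4 applies; the operator f, the multiplier term, and all constants are hypothetical, and criterion (2.18*b) is taken in the form ‖ξ‖ ≤ √(β_i/2)‖x_i^k − x̃_i^k‖ consistent with the proof of Lemma 3.4:

```python
import numpy as np

def phase1_block(f, xk, ATlam, beta0, nu=0.2, grow=1.8, max_tries=60):
    """Solve one sub-VI inexactly (X_i = R^n_+ assumed), growing beta_i by 1.8
    until both inexact criteria hold; only function values of f are used."""
    beta = beta0
    for _ in range(max_tries):
        xt = np.maximum(xk - (f(xk) - ATlam) / beta, 0.0)  # explicit projection form
        d = xk - xt
        xi = f(xk) - f(xt)                                 # xi_{x_i}^k
        crit_a = d @ xi <= nu * beta / 2.0 * (d @ d)       # criterion (2.18*a)
        crit_b = np.linalg.norm(xi) <= np.sqrt(beta / 2.0) * np.linalg.norm(d)
        if crit_a and crit_b:
            return xt, xi, beta
        beta *= grow
    raise RuntimeError("no admissible beta_i found")

# hypothetical monotone Lipschitz operator f(x) = 3x (Lipschitz constant L = 3)
f = lambda x: 3.0 * x
xk = np.array([1.0, 2.0])
xt, xi, beta = phase1_block(f, xk, ATlam=np.zeros(2), beta0=1.0)
```

For a linear f with Lipschitz constant L, both criteria are guaranteed to hold once β_i ≥ 2L/ν, so the loop terminates after finitely many 1.8-fold increases (here at β_i = 1.8^6 ≈ 34.0).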

In what follows, we describe the second phase. We require

λ̃^k = P_Λ[λ^k − μ^{-1}(Σ_{j=1}^N A_j x̃_j^k − b)],   (2.19)

where μ ∈ (0, Σ_{j=1}^N ‖A_j A_j^T‖/(2β_j) + η] (here, η > 0) is suitably chosen to satisfy

(w^k − w̃^k)^T G(w^k − w̃^k) ≥ ‖w^k − w̃^k‖_D².   (2.20)

Now we use this w̃^k = (x̃^k, λ̃^k) (or Q(w̃^k)) to construct the new iterate. Here, we provide two simple forms for the new iterate:

w_1^{k+1} = w^k − α_k[G(w^k − w̃^k) − ξ^k],  ξ^k = (ξ_x^k, 0),  ξ_x^k = (ξ_{x_1}^k, …, ξ_{x_N}^k),   (2.21*a)

or

w_2^{k+1} = P_W[w^k − α_k Q(w̃^k)],   (2.21*b)

where

α_k = γα_k*,  α_k* = (w^k − w̃^k)^T(G(w^k − w̃^k) − ξ^k) / ‖G(w^k − w̃^k) − ξ^k‖²,  γ ∈ (0, 2).   (2.21)

In fact, each iteration of the proposed method consists of two main phases. From the point of view that the problem is a system with a central authority and N administrators, the first phase is accomplished by the N administrators based on the instruction given by the authority; that is, the ith sub-VI involves only the ith administrator's activities. The second phase is then implemented by the central authority to give the new instruction for the next iteration.

Remark 2.4.

In the inexact AsPDM, the main task of Phase I is to find a solution of (2.17). From (2.17), it is easy to see that

x̃_i^k = P_{X_i}[x̃_i^k − (f_i(x_i^k) + β_i(x̃_i^k − x_i^k) − A_i^T λ^k)].   (2.22)

It seems that equality (2.22) is an implicit form, since both sides of (2.22) contain x̃_i^k. In fact, we can transform (2.22) into an explicit form. Using the property that the projection fixed point is invariant under positive scaling of the step, we have

x̃_i^k = P_{X_i}{x̃_i^k − [(x̃_i^k − x_i^k) + β_i^{-1}(f_i(x_i^k) − A_i^T λ^k)]} = P_{X_i}[x_i^k − (1/β_i)(f_i(x_i^k) − A_i^T λ^k)].   (2.23)

Consequently, using the above formula, we can compute x̃_i^k quite easily.

Remark 2.5.

Combining (2.22) and (2.19), we find that

w̃^k = P_W[w̃^k − (Q(w̃^k) + G(w̃^k − w^k) + ξ^k)].

If ξ^k = 0, this yields the exact version. In this special case, it is clear that

w̃^k = P_W[w̃^k − (Q(w̃^k) + G(w̃^k − w^k))].

This formula is quite similar to the iterates produced by the classic PPA [22], which employs

w^{k+1} = P_W[w^{k+1} − (Q(w^{k+1}) + S(w^{k+1} − w^k))]

as the new iterate; here, S is a symmetric positive definite matrix. On deeper inspection, however, our method does not fit into any of the known PPA frameworks. It is not equivalent to PPA even if G is positive definite. The reason our method cannot be viewed as PPA lies in the fact that G is asymmetric and, moreover, may not be positive definite. This lack of symmetry makes it impossible to introduce an inner product through G as is done with S. Consequently, if one were to take w̃^k as the new iterate, convergence might fail. Because of the asymmetric feature of G, we call our method the asymmetric proximal decomposition method.

Remark 2.6.

Recalling that λ̃^k is obtained by (2.19), it is easy to see that

(λ − λ̃^k)^T {λ̃^k − λ^k + μ^{-1}(Σ_{j=1}^N A_j x̃_j^k − b)} ≥ 0,  ∀λ ∈ Λ.   (2.27)

Combining (2.17) and (2.27), we have

(w − w̃^k)^T (Q(w̃^k) + G(w̃^k − w^k) + ξ^k) ≥ 0,  ∀w ∈ W.   (2.28)

Since w̃^k = (x̃_1^k, …, x̃_N^k, λ̃^k) ∈ W is generated by (2.17)–(2.20) from a given (x_1^k, …, x_N^k, λ^k), w^k − w̃^k = 0 implies ξ^k = 0 and G(w̃^k − w^k) + ξ^k = 0. According to (2.28), we then have

(w − w̃^k)^T Q(w̃^k) ≥ 0,  ∀w ∈ W.

In other words, w̃^k is a solution of problem (2.1)-(2.2) if x_i^k = x̃_i^k (i = 1, …, N) and λ^k = λ̃^k. Hence, we use ‖w^k − w̃^k‖ ≤ ε as the stopping criterion in the proposed method.

Remark 2.7.

The update form (2.21*a) is based on the fact that −[G(w^k − w̃^k) − ξ^k] = G(w̃^k − w^k) + ξ^k is a descent direction of the unknown distance function (1/2)‖w − w*‖² at the point w^k. This property will be proved in Section 3.1. The quantity α_k* in (2.21) is the "optimal" step length, which will be detailed in Section 3.2. We can also use (2.21*b) to update the new iterate. For fast convergence, the practical step length should be multiplied by a relaxation factor γ ∈ [1, 2).

Remark 2.8.

Note that w^k − w̃^k = 0 if and only if ‖w^k − w̃^k‖_D = 0. In the case ‖w^k − w̃^k‖_D ≠ 0, condition (2.20) can be satisfied by choosing a suitable μ ∈ (0, Σ_{j=1}^N ‖A_j A_j^T‖/(2β_j) + η]. We state this fact in the following lemma.

Lemma 2.9.

Let G and D be defined in (2.15) and (2.16), respectively. If μ = Σ_{j=1}^N ‖A_j A_j^T‖/(2β_j) + η, then, for all w = (x_1, …, x_N, λ) ∈ R^{Σ_j n_j + m}, one has w^T G w ≥ ‖w‖_D².

Proof.

By the definitions (2.15) and (2.16), we have

w^T G w = Σ_{j=1}^N β_j‖x_j‖² + (Σ_{j=1}^N ‖A_j A_j^T‖/(2β_j) + η)‖λ‖² + Σ_{j=1}^N x_j^T A_j^T λ
= Σ_{j=1}^N (β_j/2)‖x_j‖² + η‖λ‖² + (1/2) Σ_{j=1}^N β_j‖x_j + A_j^T λ/β_j‖² + Σ_{j=1}^N (‖A_j A_j^T‖‖λ‖² − ‖A_j^T λ‖²)/(2β_j)
≥ ‖w‖_D²,

where the inequality holds because ‖A_j^T λ‖² = λ^T A_j A_j^T λ ≤ ‖A_j A_j^T‖‖λ‖², and the assertion is obtained.

Setting w = w^k − w̃^k in the above lemma, we get

(w^k − w̃^k)^T G(w^k − w̃^k) ≥ ‖w^k − w̃^k‖_D².   (2.32)

Thus, if one chooses μ = Σ_{j=1}^N ‖A_j A_j^T‖/(2β_j) + η, condition (2.20) is always satisfied; hence, this value can be regarded as a safe upper bound for the condition. Note that an inequality is used in the proof of Lemma 2.9, so there is some slack. As a result, rather than fixing μ at the safe bound, one can let μ be a smaller value and check whether condition (2.20) is satisfied. If not, increase μ by μ := min{Σ_{j=1}^N ‖A_j A_j^T‖/(2β_j) + η, 4μ} and try again. This process yields a suitable μ ∈ (0, Σ_{j=1}^N ‖A_j A_j^T‖/(2β_j) + η] meeting (2.20).
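The safe bound of Lemma 2.9 and the trial-and-increase process for μ can be checked numerically; the sizes, β_j, and η below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)
N, m, n = 3, 4, 5                        # hypothetical sizes
A = [rng.standard_normal((m, n)) for _ in range(N)]
beta, eta = [1.0, 2.0, 0.5], 0.5
mu_safe = sum(np.linalg.norm(Ai @ Ai.T, 2) / (2.0 * bi) for Ai, bi in zip(A, beta)) + eta

def wGw(xs, lam, mu):
    """w'Gw = sum_j beta_j||x_j||^2 + mu||lam||^2 + sum_j x_j'A_j' lam."""
    return (sum(bj * xj @ xj for bj, xj in zip(beta, xs)) + mu * lam @ lam
            + sum(xj @ (Aj.T @ lam) for xj, Aj in zip(xs, A)))

def wDw(xs, lam):
    """||w||_D^2 = sum_j (beta_j/2)||x_j||^2 + eta||lam||^2."""
    return sum(bj / 2.0 * xj @ xj for bj, xj in zip(beta, xs)) + eta * lam @ lam

# Lemma 2.9: with mu = mu_safe the condition holds for every w
ok = True
for _ in range(200):
    xs, lam = [rng.standard_normal(n) for _ in range(N)], rng.standard_normal(m)
    ok = ok and wGw(xs, lam, mu_safe) >= wDw(xs, lam) - 1e-9

# trial-and-increase: start small, set mu := min(mu_safe, 4*mu) until (2.20) holds
dxs, dlam = [rng.standard_normal(n) for _ in range(N)], rng.standard_normal(m)
mu = mu_safe / 16.0
while wGw(dxs, dlam, mu) < wDw(dxs, dlam):
    mu = min(mu_safe, 4.0 * mu)
```

Since the condition holds at mu_safe by the lemma, the while loop terminates after at most a few quadruplings, often with μ strictly below the safe bound.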

Note that, in the proposed method, the sub-VIs (2.17) produce the x̃_i^k in parallel. In addition, instead of taking the solution of the subproblems directly, the new iterate is obtained by a simple update, for example, (2.21*a)-(2.21*b).

3. Convergence of AsPDM

In the proposed methods, the first phase (accomplished by the local administrators) offers a descent direction of the unknown distance function, and the second phase (accomplished by the central authority) determines the "optimal" step length along this direction. This section gives the detailed theoretical analysis.

3.1. The Descent Direction in the Proposed AsPDM

For any w* ∈ W*, (w^k − w*) is the gradient of the unknown distance function (1/2)‖w − w*‖² at the point w^k ∉ W*. A direction d is called a descent direction of (1/2)‖w − w*‖² at w^k if and only if the inner product ⟨w^k − w*, d⟩ < 0. Let w̃^k = (x̃_1^k, …, x̃_N^k, λ̃^k) be generated by (2.17)–(2.20) from a given w^k = (x_1^k, …, x_N^k, λ^k). The goal of this subsection is to show that, for any w* ∈ W*,

(w^k − w*)^T (G(w^k − w̃^k) − ξ^k) ≥ (1 − ν)‖w^k − w̃^k‖_D².   (3.2)

This guarantees that G(w̃^k − w^k) + ξ^k is a descent direction of (1/2)‖w − w*‖² at any point w^k ∉ W*. The above inequality plays an important role in the convergence analysis.

Lemma 3.1.

Assume that w̃^k = (x̃_1^k, …, x̃_N^k, λ̃^k) is generated by (2.17)–(2.20) from a given w^k = (x_1^k, …, x_N^k, λ^k); then, for any w* = (x_1*, …, x_N*, λ*) ∈ W*, one has

(w^k − w*)^T (G(w^k − w̃^k) − ξ^k) ≥ (1 − ν)‖w^k − w̃^k‖_D².

Proof.

Since w* ∈ W, substituting w = w* in (2.28), we obtain

(w* − w̃^k)^T (Q(w̃^k) + G(w̃^k − w^k) + ξ^k) ≥ 0.   (3.3)

Using the monotonicity of Q(w) and applying (2.1) with w = w̃^k, it is easy to get

(w̃^k − w*)^T Q(w̃^k) ≥ (w̃^k − w*)^T Q(w*) ≥ 0.   (3.4)

Combining (3.3) and (3.4), we find

(w^k − w*)^T (G(w^k − w̃^k) − ξ^k) ≥ (w^k − w̃^k)^T (G(w^k − w̃^k) − ξ^k).   (3.5)

Since criterion (2.18*a) holds, we have

(w^k − w̃^k)^T ξ^k ≤ ν Σ_{j=1}^N (β_j/2)‖x_j^k − x̃_j^k‖² ≤ ν‖w^k − w̃^k‖_D² ≤ ν(w^k − w̃^k)^T G(w^k − w̃^k),   (3.6)

where the last inequality follows directly from Lemma 2.9. Consequently,

(w^k − w̃^k)^T (G(w^k − w̃^k) − ξ^k) ≥ (1 − ν)(w^k − w̃^k)^T G(w^k − w̃^k).   (3.7)

Using the preceding inequality and (2.32) in (3.5) yields

(w^k − w*)^T (G(w^k − w̃^k) − ξ^k) ≥ (1 − ν)‖w^k − w̃^k‖_D²,

completing the proof.

Now, we state the main properties of Q(w̃k) in the lemma below.

Lemma 3.2.

Let w̃^k = (x̃_1^k, …, x̃_N^k, λ̃^k) be generated by (2.17)–(2.20) from a given w^k = (x_1^k, …, x_N^k, λ^k). Then, for any w ∈ W, one has

(w − w̃^k)^T Q(w̃^k) ≥ (w − w̃^k)^T [G(w^k − w̃^k) − ξ^k].   (3.9)

Proof.

Recalling that w̃^k = P_W[w̃^k − (Q(w̃^k) + G(w̃^k − w^k) + ξ^k)] ∈ W, and substituting w̃^k − (Q(w̃^k) + G(w̃^k − w^k) + ξ^k) as the point being projected into the projection inequality of Lemma 2.2, we immediately have

(w − w̃^k)^T {w̃^k − [w̃^k − (Q(w̃^k) + G(w̃^k − w^k) + ξ^k)]} ≥ 0,  ∀w ∈ W.

After some simple manipulations, the assertion follows immediately.

3.2. The Step Size and the New Iterate

Since G(w̃^k − w^k) + ξ^k is a descent direction of (1/2)‖w − w*‖² at the point w^k, the new iterate is determined along this direction by choosing a suitable step length. To explain why α_k* as defined in (2.21) is the "optimal" step, we let

w_1^{k+1}(α) = w^k − α[G(w^k − w̃^k) − ξ^k],  w_2^{k+1}(α) = P_W[w^k − αQ(w̃^k)]   (3.11)

be the step-size-dependent new iterates, and let

Θ_{k,i}(α) = ‖w^k − w*‖² − ‖w_i^{k+1}(α) − w*‖²,  i = 1, 2,   (3.12)

be the profit function of the kth iteration. Because Θ_{k,i}(α) includes the unknown vector w*, it cannot be maximized directly. The following lemma offers a lower bound of Θ_{k,i}(α) which is a quadratic function of α.

Lemma 3.3.

Let w̃^k = (x̃_1^k, …, x̃_N^k, λ̃^k) be generated by (2.17)–(2.20) from a given w^k = (x_1^k, …, x_N^k, λ^k). Then one has

Θ_{k,i}(α) ≥ q_k(α),  ∀α > 0, i = 1, 2,   (3.13)

where

q_k(α) = 2α(w^k − w̃^k)^T(G(w^k − w̃^k) − ξ^k) − α²‖G(w^k − w̃^k) − ξ^k‖².   (3.14)

Proof.

It follows from definition (3.12) and inequality (3.5) that

Θ_{k,1}(α) = ‖w^k − w*‖² − ‖w^k − w* − α[G(w^k − w̃^k) − ξ^k]‖²
= 2α(w^k − w*)^T[G(w^k − w̃^k) − ξ^k] − α²‖G(w^k − w̃^k) − ξ^k‖²
≥ q_k(α),

where the last inequality uses (3.5). Next we deal with Θ_{k,2}(α), which is more complicated. Applying (2.8) with w^k − αQ(w̃^k) as the point being projected gives

Θ_{k,2}(α) ≥ ‖w^k − w*‖² − ‖w^k − w* − αQ(w̃^k)‖² + ‖w^k − w_2^{k+1}(α) − αQ(w̃^k)‖²
= 2α(w_2^{k+1}(α) − w*)^T Q(w̃^k) + ‖w^k − w_2^{k+1}(α)‖²
≥ 2α(w_2^{k+1}(α) − w̃^k)^T Q(w̃^k) + ‖w^k − w_2^{k+1}(α)‖²   (by (3.4))
≥ ‖w^k − w_2^{k+1}(α)‖² + 2α(w_2^{k+1}(α) − w̃^k)^T[G(w^k − w̃^k) − ξ^k]   (by (3.9))
= ‖w^k − w_2^{k+1}(α) − α[G(w^k − w̃^k) − ξ^k]‖² + 2α(w^k − w̃^k)^T[G(w^k − w̃^k) − ξ^k] − α²‖G(w^k − w̃^k) − ξ^k‖²
≥ q_k(α).

Since q_k(α) is a quadratic function of α, it attains its maximum at

α_k* = (w^k − w̃^k)^T(G(w^k − w̃^k) − ξ^k) / ‖G(w^k − w̃^k) − ξ^k‖²,

which is just the value defined in (2.21). In practical computation, taking a relaxation factor γ is wise for fast convergence. Note that, for any α_k = γα_k*, it follows from (3.13), (3.14), and (2.21) that

Θ_{k,i}(γα_k*) ≥ q_k(γα_k*) = 2γα_k*(w^k − w̃^k)^T(G(w^k − w̃^k) − ξ^k) − γ²(α_k*)²‖G(w^k − w̃^k) − ξ^k‖²
= γ(2 − γ)α_k*(w^k − w̃^k)^T(G(w^k − w̃^k) − ξ^k).   (3.18)

To guarantee that the right-hand side of (3.18) is positive, we take γ ∈ (0, 2).
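The "optimal" step can be verified on random data; the vectors below merely stand in for G(w^k − w̃^k) − ξ^k and w^k − w̃^k, so the whole setup is hypothetical:

```python
import numpy as np

rng = np.random.default_rng(2)
g = rng.standard_normal(6)        # stands in for G(w^k - w~^k) - xi^k
delta = rng.standard_normal(6)    # stands in for w^k - w~^k
s, t = delta @ g, g @ g           # coefficients of q_k(alpha) = 2*alpha*s - alpha^2*t

q = lambda a: 2.0 * a * s - a ** 2 * t
alpha_star = s / t                # maximizer of q_k: the "optimal" step of (2.21)
gamma = 1.8
profit = q(gamma * alpha_star)    # equals gamma*(2-gamma)*alpha_star*s, as in (3.18)
```

Because q_k is concave (t > 0), alpha_star is a global maximizer, and the algebraic identity q_k(γα*) = γ(2 − γ)α*s is exact.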

In fact, αk* is bounded below by a positive amount which is the subject of the following lemma.

Lemma 3.4.

Assume that w̃^k = (x̃_1^k, …, x̃_N^k, λ̃^k) is generated by (2.17)–(2.20) from a given w^k = (x_1^k, …, x_N^k, λ^k); then one has

α_k* ≥ (1 − ν)/(‖GD^{−1/2}‖ + 1)².   (3.19)

Proof.

Using the fact that the matrix D is symmetric positive definite, we have

‖G(w^k − w̃^k)‖ = ‖GD^{−1/2} D^{1/2}(w^k − w̃^k)‖ ≤ ‖GD^{−1/2}‖ ‖D^{1/2}(w^k − w̃^k)‖.   (3.20)

Moreover, criterion (2.18*b) implies

‖ξ^k‖² ≤ Σ_{j=1}^N (β_j/2)‖x_j^k − x̃_j^k‖² ≤ Σ_{j=1}^N (β_j/2)‖x_j^k − x̃_j^k‖² + η‖λ^k − λ̃^k‖² = ‖w^k − w̃^k‖_D².

Hence, since ‖w^k − w̃^k‖_D² = ‖D^{1/2}(w^k − w̃^k)‖², we get

‖ξ^k‖ ≤ ‖D^{1/2}(w^k − w̃^k)‖.   (3.22)

Combining (3.20) and (3.22), we have

‖G(w^k − w̃^k) − ξ^k‖ ≤ (‖GD^{−1/2}‖ + 1)‖D^{1/2}(w^k − w̃^k)‖.

Consequently, applying the above inequality, (3.7), and (2.32) to α_k* yields

α_k* = (w^k − w̃^k)^T(G(w^k − w̃^k) − ξ^k) / ‖G(w^k − w̃^k) − ξ^k‖² ≥ (1 − ν)‖w^k − w̃^k‖_D² / [(‖GD^{−1/2}‖ + 1)² ‖D^{1/2}(w^k − w̃^k)‖²] = (1 − ν)/(‖GD^{−1/2}‖ + 1)²,

and thus the assertion is proved.

Now, we are in the stage to prove the main convergence theorem of this paper.

Theorem 3.5.

For any w* = (x*, λ*) ∈ W*, the sequence {w^k = (x^k, λ^k)} generated by the proposed method satisfies

‖w^{k+1} − w*‖² ≤ ‖w^k − w*‖² − [γ(2 − γ)(1 − ν)²/(‖GD^{−1/2}‖ + 1)²] ‖w^k − w̃^k‖_D².   (3.25)

Thus one has

lim_{k→∞} ‖w^k − w̃^k‖² = 0,   (3.26)

and the iteration of the proposed method terminates in finitely many loops.

Proof.

First, it follows from (3.12) and (3.18) that

‖w^{k+1} − w*‖² ≤ ‖w^k − w*‖² − γ(2 − γ)α_k*(w^k − w̃^k)^T(G(w^k − w̃^k) − ξ^k).   (3.27)

Using (2.32), (3.7), and (3.19), we have

α_k*(w^k − w̃^k)^T(G(w^k − w̃^k) − ξ^k) ≥ [(1 − ν)²/(‖GD^{−1/2}‖ + 1)²] ‖w^k − w̃^k‖_D².   (3.28)

Substituting (3.28) into (3.27), assertion (3.25) is proved. Therefore,

Σ_{k=0}^∞ [γ(2 − γ)(1 − ν)²/(‖GD^{−1/2}‖ + 1)²] ‖w^k − w̃^k‖_D² ≤ ‖w^0 − w*‖²,

and assertion (3.26) follows immediately.

Since we use max{‖x_1^k − x̃_1^k‖, …, ‖x_N^k − x̃_N^k‖, ‖λ^k − λ̃^k‖} < ε as the stopping criterion, it follows from (3.26) that the iteration terminates in finitely many loops for any given ε > 0.

4. Numerical Examples

This section describes experiments testifying to the good performance of the proposed method. The algorithms were written in Matlab (version 7.0) and run on a notebook with an Intel Core 2 Duo CPU (2.01 GHz) and 0.98 GB of RAM.

To evaluate the behavior of the proposed method, we construct examples of convex separable quadratic programming (CSQP) with linearly coupling constraints:

min {(1/2) Σ_{i=1}^N x_i^T H_i x_i − c_i^T x_i | x ∈ Ω},  Ω = {x ≥ 0 | Σ_{i=1}^N A_i x_i ≤ b},   (4.1)

where H_i ∈ R^{n_i×n_i} is a symmetric positive definite matrix, A_i ∈ R^{m×n_i}, b ∈ R^m, and c_i ∈ R^{n_i}. We construct the matrices A_i and H_i in the test examples as follows. The elements of A_i are randomly generated in (−5, 5), and the matrices H_i are defined by setting

H_i = V_i Σ_i V_i^T,  where  V_i = I_{n_i} − 2 v_i v_i^T / (v_i^T v_i),  Σ_i = diag(σ_{k,i}),  σ_{k,i} = cos(kπ/(n_i + 1)) + 1.

In this way, H_i is positive definite and has prescribed eigenvalues in (0, 2). If x* is the solution of problem (4.1), then, according to the KKT conditions, there is a y* ∈ R^m, y* ≥ 0, such that

H_i x_i* + A_i^T y* − c_i ≥ 0,  x*^T(H_i x_i* + A_i^T y* − c_i) = 0,  x_i* ≥ 0,
Σ_{i=1}^N A_i x_i* ≤ b,  y*^T(Σ_{i=1}^N A_i x_i* − b) = 0,  y* ≥ 0.

Let ξ_i ∈ R^{n_i} and z ∈ R^m be random vectors with elements in (−1, 1). We set

x_i* = max(ξ_i, 0)·τ_1,  ξ_i* = max(−ξ_i, 0)·τ_2,  y* = max(z, 0)·τ_3,  z* = max(−z, 0)·τ_4,

where the τ_i, i = 1, 2, 3, 4, are positive parameters. By setting

c_i = H_i x_i* + A_i^T y* − ξ_i*,  b = Σ_{i=1}^N A_i x_i* + z*,

we obtain a test problem of type (4.1) with known solution point x* and optimal Lagrange multiplier y*. We tested such problems with τ_1 = 0.5, τ_2 = 10, τ_3 = 0.5, τ_4 = 10. Two example sets were considered: the problems in the first set have 3 separable operators (N = 3), and those in the second have 2 (N = 2).
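The construction of H_i can be reproduced directly: the Householder reflector V_i is orthogonal, so the spectrum of H_i is exactly {σ_{k,i}} ⊂ (0, 2). A short sketch (the block size n = 8 is arbitrary):

```python
import numpy as np

def make_H(n, rng):
    """Householder-based H = V Sigma V^T with prescribed eigenvalues
    sigma_k = cos(k*pi/(n+1)) + 1 in (0, 2), as in Section 4."""
    v = rng.uniform(-1.0, 1.0, n)
    V = np.eye(n) - 2.0 * np.outer(v, v) / (v @ v)     # orthogonal reflector
    sigma = np.cos(np.arange(1, n + 1) * np.pi / (n + 1)) + 1.0
    return V @ np.diag(sigma) @ V.T, sigma

rng = np.random.default_rng(3)
H, sigma = make_H(8, rng)
eigs = np.linalg.eigvalsh(H)   # spectrum of H equals {sigma_k}, so H is SPD
```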

In the first experiment, we employ AsPDM with the update w_2^{k+1} to solve CSQP with 3 separable operators. (We choose w_2^{k+1} here because it usually performs better than w_1^{k+1}.) The stopping criterion was ‖e(w)‖ < 10^{-3}; the parameters were set to ν = 0.2, η = 0.5, and γ = 1.8. Table 1 reports the number of iterations (denoted Its.) and the total number of function evaluations (denoted Numf) for different problem sizes. Here, Numf = Σ_{i=1}^3 Numf_i. As observed from Table 1, the solutions are obtained in a moderate number of iterations; thus the proposed method is practically applicable. In addition, the number of evaluations of f_i per iteration is approximately 2. AsPDM is well suited to separable problems.

AsPDM for CSQP with 3 separable operators.

m | n1 | n2 | n3 | Its. | Numf
100 | 50 | 50 | 50 | 674 | 4066
100 | 100 | 100 | 100 | 1581 | 9508
150 | 150 | 150 | 150 | 1775 | 10672
200 | 200 | 200 | 200 | 2108 | 12670

Next, we compared the computational efficiency of AsPDM against the method in [22] (denoted PCM), regarded as a highly efficient PPA-based method well suited to VIs. Iterations were terminated when the criterion ‖e(w)‖ < 10^{-5} was met. Table 2 reports the iterations and the total number of function evaluations for both methods. We observe that both methods find a solution in an acceptable number of iterations. Concerning computational efficiency, AsPDM is comparable to and generally faster than PCM; moreover, it requires fewer function evaluations, except in the case m = 500, n1 = 500, n2 = 500. In some cases, AsPDM reduces the computational cost by about 20% relative to PCM. For m = 100, n1 = 100, n2 = 100, we plot the error versus the iteration number for both AsPDM and PCM in Figure 1. Both methods converge quickly during the first hundred iterations but slow down as the exact solution is approached. AsPDM is faster than PCM.

Its. and function eval. for different problem sizes.

m | n1 | n2 | AsPDM Its. | AsPDM Numf | PCM Its. | PCM Numf
100 | 100 | 100 | 1089 | 4371 | 1343 | 5393
200 | 200 | 200 | 1265 | 5075 | 1649 | 6621
300 | 300 | 300 | 1655 | 6635 | 1765 | 7087
400 | 400 | 400 | 1573 | 6307 | 1834 | 7365
500 | 500 | 500 | 2299 | 9211 | 2218 | 8901
600 | 600 | 600 | 2267 | 9083 | 2289 | 9187
100 | 80 | 80 | 568 | 2287 | 616 | 2485
100 | 50 | 150 | 1192 | 4787 | 1231 | 4945
200 | 150 | 100 | 567 | 2283 | 572 | 2311
200 | 100 | 200 | 744 | 2991 | 846 | 3407

Figure 1: Error versus iteration number for AsPDM and PCM with m = 100, n1 = 100, n2 = 100.

In addition to being fast, AsPDM solves the problem in a decomposed manner; this is its most significant advantage over other methods. Hence, AsPDM is well suited to real-life separable problems.

5. Conclusions

We have proposed AsPDM for solving separable problems. It decomposes the original problem into independent low-dimensional subproblems and solves them in parallel. Only function values are required in the process, and the total computational cost is very small. AsPDM is easy to implement and does not appear to require application-specific tuning. The numerical results also evidence the efficiency of our method. Thus, the new method is applicable and recommended in practice.

Acknowledgment

This work was supported by NSFC Grant 70901018.

References

[1] R. T. Rockafellar and R. J.-B. Wets, "Generalized linear-quadratic problems of deterministic and stochastic optimal control in discrete time," SIAM Journal on Control and Optimization, vol. 28, no. 4, pp. 810–822, 1990.
[2] B. Martinet, "Régularisation d'inéquations variationnelles par approximations successives," Revue Française d'Informatique et de Recherche Opérationnelle, vol. 4, pp. 154–158, 1970.
[3] R. T. Rockafellar, "Monotone operators and the proximal point algorithm," SIAM Journal on Control and Optimization, vol. 14, no. 5, pp. 877–898, 1976.
[4] R. T. Rockafellar, "Augmented Lagrangians and applications of the proximal point algorithm in convex programming," Mathematics of Operations Research, vol. 1, no. 2, pp. 97–116, 1976.
[5] A. Auslender, M. Teboulle, and S. Ben-Tiba, "A logarithmic-quadratic proximal method for variational inequalities," Computational Optimization and Applications, vol. 12, no. 1–3, pp. 31–40, 1999.
[6] Y. Censor and S. A. Zenios, "Proximal minimization algorithm with D-functions," Journal of Optimization Theory and Applications, vol. 73, no. 3, pp. 451–464, 1992.
[7] B. He, X. Yuan, and J. J. Z. Zhang, "Comparison of two kinds of prediction-correction methods for monotone variational inequalities," Computational Optimization and Applications, vol. 27, no. 3, pp. 247–267, 2004.
[8] A. Nemirovsky, "Prox-method with rate of convergence O(1/k) for smooth variational inequalities and saddle point problem," draft of 30/10/2003.
[9] M. Teboulle, "Convergence of proximal-like algorithms," SIAM Journal on Optimization, vol. 7, no. 4, pp. 1069–1083, 1997.
[10] D. P. Bertsekas and J. N. Tsitsiklis, Parallel and Distributed Computation: Numerical Methods, Prentice Hall, Englewood Cliffs, NJ, USA, 1989.
[11] G. Chen and M. Teboulle, "A proximal-based decomposition method for convex minimization problems," Mathematical Programming, vol. 64, no. 1, Ser. A, pp. 81–101, 1994.
[12] B. He, L.-Z. Liao, and S. Wang, "Self-adaptive operator splitting methods for monotone variational inequalities," Numerische Mathematik, vol. 94, no. 4, pp. 715–737, 2003.
[13] P. Mahey, S. Oualibouch, and D. T. Pham, "Proximal decomposition on the graph of a maximal monotone operator," SIAM Journal on Optimization, vol. 5, no. 2, pp. 454–466, 1995.
[14] P. Tseng, "Applications of a splitting algorithm to decomposition in convex programming and variational inequalities," SIAM Journal on Control and Optimization, vol. 29, no. 1, pp. 119–138, 1991.
[15] R. Glowinski and P. Le Tallec, Augmented Lagrangian and Operator-Splitting Methods in Nonlinear Mechanics, vol. 9 of SIAM Studies in Applied Mathematics, SIAM, Philadelphia, Pa, USA, 1989.
[16] B.-S. He, H. Yang, and C.-S. Zhang, "A modified augmented Lagrangian method for a class of monotone variational inequalities," European Journal of Operational Research, vol. 159, no. 1, pp. 35–51, 2004.
[17] M. Fukushima, "Application of the alternating direction method of multipliers to separable convex programming problems," Computational Optimization and Applications, vol. 1, no. 1, pp. 93–111, 1992.
[18] J. Nocedal and S. J. Wright, Numerical Optimization, Springer Series in Operations Research, Springer, New York, NY, USA, 1999.
[19] A. Hamdi and P. Mahey, "Separable diagonalized multiplier method for decomposing nonlinear programs," Computational & Applied Mathematics, vol. 19, no. 1, pp. 1–29, 2000.
[20] A. Hamdi, P. Mahey, and J. P. Dussault, "A new decomposition method in nonconvex programming via a separable augmented Lagrangian," in Recent Advances in Optimization, vol. 452 of Lecture Notes in Economics and Mathematical Systems, pp. 90–104, Springer, Berlin, Germany, 1997.
[21] P. Tseng, "Alternating projection-proximal methods for convex programming and variational inequalities," SIAM Journal on Optimization, vol. 7, no. 4, pp. 951–965, 1997.
[22] B. S. He, X. L. Fu, and Z. K. Jiang, "Proximal-point algorithm using a linear proximal term," Journal of Optimization Theory and Applications, vol. 141, no. 2, pp. 299–319, 2009.
[23] E. H. Zarantonello, "Projections on convex sets in Hilbert space and spectral theory," in Contributions to Nonlinear Functional Analysis, Academic Press, New York, NY, USA, 1971.