This paper is concerned with the study of a penalization-gradient algorithm for solving variational inequalities, namely, find x̅ ∈ C such that 〈Ax̅, y - x̅〉 ≥ 0 for all y ∈ C, where A : H → H is a single-valued operator and C is a closed convex subset of a real Hilbert space H. Given Ψ : H → ℝ∪{+∞}, which acts as a penalization function with respect to the constraint x̅ ∈ C, and a penalization parameter β_k, we consider an algorithm which alternates a proximal step with respect to ∂Ψ and a gradient step with respect to A and reads as x_k = (I + λ_kβ_k∂Ψ)^{-1}(x_{k-1} - λ_kAx_{k-1}). Under mild hypotheses, we obtain weak convergence for an inverse strongly monotone operator and strong convergence for a Lipschitz continuous and strongly monotone operator. Applications to hierarchical minimization and fixed-point problems are also given, and the multivalued case is reached by replacing the multivalued operator by its Yosida approximate, which is always Lipschitz continuous.
1. Introduction
Let H be a real Hilbert space, A : H → H a monotone operator, and C a closed convex set in H. We are interested in the study of a gradient-penalization algorithm for solving the problem of finding x̅ ∈ C such that
〈Ax̅, y - x̅〉 ≥ 0   ∀y ∈ C,
or equivalently
Ax̅+NC(x̅)∋0,
where N_C is the normal cone to C. The above problem is a variational inequality; this field, initiated by Stampacchia [1], is now a well-known branch of pure and applied mathematics, and many important problems can be cast in this framework.
In [2], Attouch et al., building on seminal work by Passty [3], solve this problem with a multivalued operator by using splitting proximal methods. A drawback is that the convergence is in general only ergodic. Motivated by [2, 4] and by [5], where penalty methods for variational inequalities with single-valued monotone maps are given, we will prove that our proposed forward-backward penalization-gradient method (1.9) enjoys good asymptotic convergence properties. We will provide some applications to hierarchical fixed-point and optimization problems and also propose an idea to reach monotone variational inclusions.
To begin with, let us recall (see, for instance, [6]) that an operator T with domain D(T) and range R(T) is said to be monotone if
〈u - v, x - y〉 ≥ 0   whenever u ∈ T(x), v ∈ T(y).
It is said to be maximal monotone if, in addition, its graph, gph T := {(x,y) ∈ H×H : y ∈ T(x)}, is not properly contained in the graph of any other monotone operator. An operator sequence (T_k) is said to be graph convergent to T if (gph(T_k)) converges to gph(T) in the Kuratowski-Painlevé sense, that is, limsup_k gph(T_k) ⊂ gph(T) ⊂ liminf_k gph(T_k). It is well-known that, for a maximal monotone operator T, for each x ∈ H and λ > 0 there is a unique z ∈ H such that x ∈ (I + λT)z. The single-valued operator J_λ^T := (I + λT)^{-1} is called the resolvent of T of parameter λ. It is a nonexpansive mapping which is everywhere defined and is related to the Yosida approximate of T, namely T_λ(x) := (x - J_λ^T(x))/λ, by the relation T_λ(x) ∈ T(J_λ^T(x)). The Yosida approximate is 1/λ-Lipschitz continuous and satisfies (T_λ)_μ = T_{λ+μ}. Recall that the inverse T^{-1} of T is the operator defined by x ∈ T^{-1}(y) ⇔ y ∈ T(x) and that, for all x, y ∈ H, we have the following key inequality:
‖J_λ^T(x) - J_λ^T(y)‖² ≤ ‖x - y‖² - ‖(I - J_λ^T)(x) - (I - J_λ^T)(y)‖².
Observe that the relation (Tλ)μ(x)=Tλ+μ(x) leads to
J_μ^{T_λ}(x) = (λ/(λ+μ)) x + (1 - λ/(λ+μ)) J_{λ+μ}^T(x).
Now, given a proper lower semicontinuous convex function f:H→ℝ∪{+∞}, the subdifferential of f at x is the set
∂f(x) = {u ∈ H : f(y) ≥ f(x) + 〈u, y - x〉 ∀y ∈ H}.
Its Moreau-Yosida approximate and proximal mapping fλ and proxλf are given, respectively, by
f_λ(x) = inf_{y∈H} { f(y) + (1/(2λ))‖y - x‖² },   prox_{λf}(x) = argmin_{y∈H} { f(y) + (1/(2λ))‖y - x‖² }.
We have the following interesting relation: (∂f)_λ = ∇f_λ. Finally, given a nonempty closed convex set C ⊂ H, its indicator function is defined as δ_C(x) = 0 if x ∈ C and +∞ otherwise. The projection onto C of a point u is P_C(u) = argmin_{c∈C} ‖u - c‖. The normal cone to C at x is
N_C(x) = {u ∈ H : 〈u, c - x〉 ≤ 0 ∀c ∈ C}
if x∈C and ∅ otherwise. Observe that ∂δC=NC, proxλf=Jλ∂f, and JλNC=PC.
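As a quick sanity check of these relations, the following sketch (an assumed one-dimensional example with C = [0,1], not taken from the paper) verifies numerically that every resolvent of N_C is the projection P_C and that the resolvent identity (1.5) holds for the Yosida approximate of N_C.

```python
import numpy as np

# Hypothetical numerical check of the resolvent identity
#   J_mu^{T_lambda}(x) = lam/(lam+mu) * x + (1 - lam/(lam+mu)) * J_{lam+mu}^T(x)
# for T = N_C, the normal cone to C = [0, 1] in R, whose resolvent is the
# projection P_C for every parameter value.

def proj_C(x):                       # J_rho^{N_C} = P_C for any rho > 0
    return np.clip(x, 0.0, 1.0)

def yosida_NC(x, lam):               # (N_C)_lam(x) = (x - P_C(x)) / lam
    return (x - proj_C(x)) / lam

lam, mu, x = 0.7, 2.3, 1.9
rhs = lam / (lam + mu) * x + (1 - lam / (lam + mu)) * proj_C(x)  # claimed value of J_mu^{T_lam}(x)
# Check that rhs solves z + mu * (N_C)_lam(z) = x, i.e. rhs = (I + mu*(N_C)_lam)^{-1}(x)
print(abs(rhs + mu * yosida_NC(rhs, lam) - x))   # ~0 up to rounding
```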
Given x_{k-1} ∈ H, the current approximation to a solution of (1.2), we study the penalization-gradient iteration which generates, for parameters λ_k > 0 and β_k → +∞, the next iterate x_k as the solution of the regularized subproblem
(1/λ_k)(x_k - x_{k-1}) + Ax_{k-1} + β_k∂Ψ(x_k) ∋ 0,
which can be rewritten as
x_k = (I + λ_kβ_k∂Ψ)^{-1}(x_{k-1} - λ_kAx_{k-1}).
Having in view a wide range of applications, we shall not assume any particular structure or regularity on the penalization function Ψ. Instead, we just suppose that Ψ is convex and lower semicontinuous and that C = argmin Ψ ≠ ∅. We will denote by VI(A,C) the solution set of (1.2).
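To make the iteration concrete, here is a minimal numerical sketch (toy data assumed for illustration only, not from the paper): A(x) = Mx - b with M symmetric positive definite is inverse strongly monotone, C is a box, and Ψ = (1/2)dist(·,C)², for which the resolvent of λ_kβ_k∂Ψ has the closed relaxed-projection form derived in Section 3.

```python
import numpy as np

# Toy run of iteration (1.9): x_k = (I + lam_k*beta_k*dPsi)^{-1}(x_{k-1} - lam_k*A(x_{k-1})).
# Assumed data: A(x) = M x - b with M symmetric positive definite, so A is inverse
# strongly monotone with constant 1/||M||; C = [0,1]^2; Psi = (1/2) dist(., C)^2,
# whose resolvent is the relaxed projection
#   (I + t*dPsi)^{-1}(z) = z/(1+t) + t/(1+t) * P_C(z),   t = lam_k*beta_k
# (see the classical-penalization application in Section 3).

M = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([3.0, -2.0])
A = lambda x: M @ x - b
P_C = lambda z: np.clip(z, 0.0, 1.0)

x, lam = np.zeros(2), 0.4                    # lam in ]0, 2/L[ with L = ||M|| ~ 2.21
for k in range(1, 2001):
    beta = float(k) ** 2                     # beta_k -> +infinity, with sum 1/beta_k < +infinity
    z = x - lam * A(x)                       # gradient step with respect to A
    t = lam * beta
    x = z / (1 + t) + t / (1 + t) * P_C(z)   # proximal step with respect to beta_k * Psi

print(x)                                          # approaches the solution (1, 0), computed by hand
print(np.linalg.norm(x - P_C(x - lam * A(x))))    # fixed-point residual, near 0 at a solution
```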
The following lemmas will be needed in our analysis; see, for example, [6] and [7], respectively.
Lemma 1.1.
Let T be a maximal monotone operator. Then (β_kT) graph-converges to N_{T^{-1}(0)} as β_k → +∞, provided that T^{-1}(0) ≠ ∅.
Lemma 1.2.
Assume that (α_k) and (δ_k) are two sequences of nonnegative real numbers such that
α_{k+1} ≤ α_k + δ_k.
If lim_{k→+∞} δ_k = 0, then there exists a subsequence of (α_k) which converges. Furthermore, if ∑_{k=0}^{∞} δ_k < +∞, then lim_{k→+∞} α_k exists.
2. Main Results
2.1. Weak Convergence
Theorem 2.1.
Assume that VI(A,C) ≠ ∅ and that A is inverse strongly monotone, namely,
〈Ax - Ay, x - y〉 ≥ (1/L)‖Ax - Ay‖²   ∀x, y ∈ H, for some L > 0.
If
∑_{k=0}^{∞} ‖x̅ - J_{λ_kβ_k}^{∂Ψ}(x̅ - λ_kAx̅)‖ < +∞   ∀x̅ ∈ VI(A,C),
and λ_k ∈ ]ε, 2/L - ε[ (where ε > 0 is a small enough constant), then the sequence (x_k)_{k∈ℕ} generated by algorithm (1.9) converges weakly to a solution of problem (1.2).
Proof.
Let x̅ be a solution of (1.2) and observe that x̅ solves (1.2) if and only if x̅ = (I + λ_kN_C)^{-1}(x̅ - λ_kAx̅) = P_C(x̅ - λ_kAx̅). Set x̅_k = (I + λ_kβ_k∂Ψ)^{-1}(x̅ - λ_kAx̅). By the triangle inequality, we can write
‖x_k - x̅‖ ≤ ‖x_k - x̅_k‖ + ‖x̅_k - x̅‖.
On the other hand, by virtue of (1.4) and (2.1), we successively have
‖x_k - x̅_k‖² ≤ ‖x_{k-1} - x̅ - λ_k(Ax_{k-1} - Ax̅)‖² - ‖x_{k-1} - x_k - λ_k(Ax_{k-1} - Ax̅) + x̅_k - x̅‖²
≤ ‖x_{k-1} - x̅‖² - λ_k(2/L - λ_k)‖Ax_{k-1} - Ax̅‖² - ‖x_{k-1} - x_k - λ_k(Ax_{k-1} - Ax̅) + x̅_k - x̅‖².
Hence
‖x_k - x̅‖ ≤ √(‖x_{k-1} - x̅‖² - ε²‖Ax_{k-1} - Ax̅‖² - ‖x_{k-1} - x_k - λ_k(Ax_{k-1} - Ax̅) + x̅_k - x̅‖²) + ‖x̅ - x̅_k‖.
The latter implies, by Lemma 1.2 and the fact that (2.2) ensures lim_{k→+∞} ‖x̅ - x̅_k‖ = 0, that the positive real sequence (‖x_k - x̅‖²)_{k∈ℕ} converges to some limit l(x̅), that is,
l(x̅) = lim_{k→+∞} ‖x_k - x̅‖² < +∞,
and also assures that
lim_{k→+∞} ‖Ax_{k-1} - Ax̅‖² = 0,   lim_{k→+∞} ‖x_{k-1} - x_k - λ_k(Ax_{k-1} - Ax̅) + x̅_k - x̅‖² = 0.
Combining the last two equalities, we infer that
lim_{k→+∞} ‖x_{k-1} - x_k‖² = 0.
Now, (1.9) can be written equivalently as
(x_{k-1} - x_k)/λ_k + Ax_k - Ax_{k-1} ∈ (A + β_k∂Ψ)(x_k).
By virtue of Lemma 1.1, the sequence (β_k∂Ψ) graph-converges to N_{argmin Ψ} because
(∂Ψ)^{-1}(0) = ∂Ψ*(0) = argmin Ψ.
Furthermore, the Lipschitz continuity of A (see, e.g., [8]) clearly ensures that the sequence (A + β_k∂Ψ) graph-converges in turn to A + N_{argmin Ψ}.
Now, let x* be a weak cluster point of {x_k}. Passing to the limit in (2.9), along a subsequence still denoted by {x_k}, and taking into account the fact that the graph of a maximal monotone operator is weakly-strongly closed in H×H, we then conclude that
0∈(A+NC)x*,
because A is Lipschitz continuous, (xk) is asymptotically regular thanks to (2.8), and (λk) is bounded away from zero.
It remains to prove that there is no more than one weak cluster point; our argument is classical and is presented here for completeness.
Let x̃ be another cluster point of {x_k}; we will show that x̃ = x*. This is a consequence of (2.6). Indeed,
l(x*) = lim_{k→+∞} ‖x_k - x*‖²,   l(x̃) = lim_{k→+∞} ‖x_k - x̃‖²,
from
‖x_k - x̃‖² = ‖x_k - x*‖² + ‖x* - x̃‖² + 2〈x_k - x*, x* - x̃〉,
we see that the limit of 〈x_k - x*, x* - x̃〉 as k → +∞ must exist. This limit has to be zero because x* is a weak cluster point of {x_k}. Hence, at the limit, we obtain
l(x̃) = l(x*) + ‖x* - x̃‖².
Reversing the roles of x̃ and x*, we also have
l(x*) = l(x̃) + ‖x* - x̃‖².
Adding the two identities gives ‖x* - x̃‖² = 0, that is, x̃ = x*, which completes the proof.
Remark 2.2.
(i) Note that we can remove condition (2.2), but in this case we only obtain that there exists a subsequence of (x_k) whose weak cluster points are solutions of problem (1.2). This follows from Lemma 1.2 combined with the fact that x̅ = J_{λ*}^{∂δ_C}(x̅ - λ*Ax̅) and that (β_k∂Ψ) graph-converges to ∂δ_C. The latter is equivalent, see for example [6], to the pointwise convergence of J_{λ_kβ_k}^{∂Ψ} to J_{λ*}^{∂δ_C} and therefore ensures that
lim_{k→+∞} ‖x̅ - J_{λ_kβ_k}^{∂Ψ}(x̅ - λ_kAx̅)‖ = 0.
(ii) In the special case Ψ(x) = (1/2)dist(x,C)², (2.2) reduces to ∑_{k=0}^{∞} 1/β_k < +∞; see Application (2) of Section 3.
Suppose now that Ψ(x) = dist(x,C); it is well-known that prox_{γΨ}(x) = P_C(x) if dist(x,C) ≤ γ. Consequently,
J_{λ_kβ_k}^{∂Ψ}(x) = P_C(x)   if dist(x,C) ≤ λ_kβ_k,
which is the case for all k ≥ κ, for some κ ∈ ℕ, because (λ_k) is bounded and bounded away from zero and lim_{k→+∞} β_k = +∞. Hence ‖x̅ - J_{λ_kβ_k}^{∂Ψ}(x̅ - λ_kAx̅)‖ = 0 for all k ≥ κ, and thus (2.2) is clearly satisfied.
The particular case Ψ=0 corresponds to the unconstrained case, namely, C=H. In this context the resolvent associated to βk∂Ψ is the identity, and condition (2.2) is trivially satisfied.
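For these particular penalizations, the resolvent J_{λ_kβ_k}^{∂Ψ} can be written down explicitly; the sketch below (an assumed one-dimensional example with C = [0,1], not from the paper) collects the three cases side by side. The expression used for Ψ = dist(·,C) when dist(z,C) > λ_kβ_k (a shift of length λ_kβ_k toward C) is a standard fact about the distance function, not stated above.

```python
import numpy as np

# Resolvents J of lam*beta*dPsi for the three penalizations discussed above,
# in the assumed one-dimensional case C = [0, 1] (illustration only).

P_C = lambda z: np.clip(z, 0.0, 1.0)

def resolvent_sq_dist(z, lam, beta):
    # Psi = (1/2) dist(., C)^2: relaxed projection, see Application (2) in Section 3
    t = lam * beta
    return z / (1 + t) + t / (1 + t) * P_C(z)

def resolvent_dist(z, lam, beta):
    # Psi = dist(., C): full projection once dist(z, C) <= lam*beta,
    # otherwise a shift of length lam*beta towards C (standard fact, assumed here)
    d = np.abs(z - P_C(z))
    return np.where(d <= lam * beta, P_C(z), z - lam * beta * np.sign(z - P_C(z)))

def resolvent_zero(z, lam, beta):
    # Psi = 0: unconstrained case, the resolvent is the identity
    return z

z, lam, beta = 3.0, 0.5, 10.0
print(resolvent_sq_dist(z, lam, beta), resolvent_dist(z, lam, beta), resolvent_zero(z, lam, beta))
```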
2.2. Strong Convergence
Now, we would like to stress that we can guarantee strong convergence by reinforcing the assumptions on A.
Proposition 2.3.
Assume that A is strongly monotone with constant α > 0, that is,
〈Ax - Ay, x - y〉 ≥ α‖x - y‖²   ∀x, y ∈ H,
and Lipschitz continuous with constant L > 0, that is,
‖Ax - Ay‖ ≤ L‖x - y‖   ∀x, y ∈ H.
If λ_k ∈ ]ε, 2α/L² - ε[ (where ε > 0 is a small enough constant) and lim_{k→+∞} λ_k = λ* > 0, then the sequence generated by (1.9) strongly converges to the unique solution of (1.2).
Proof.
Indeed, by replacing the inverse strong monotonicity of A by strong monotonicity and Lipschitz continuity, it is easy to see from the first part of the proof of Theorem 2.1 that the operator I - λ_kA satisfies
‖(I - λ_kA)(x) - (I - λ_kA)(y)‖² ≤ (1 - 2λ_kα + λ_k²L²)‖x - y‖².
Following the arguments in the proof of Theorem 2.1, we obtain
‖x_k - x̅‖ ≤ √(1 - 2λ_kα + λ_k²L²) ‖x_{k-1} - x̅‖ + δ_k(x̅),   with δ_k(x̅) := ‖x̅ - J_{λ_kβ_k}^{∂Ψ}(x̅ - λ_kAx̅)‖.
Now, by setting Θ(λ) = √(1 - 2λα + λ²L²), we can check that 0 < Θ(λ) < 1 if and only if λ ∈ ]0, 2α/L²[, and a simple computation shows that 0 < Θ(λ_k) ≤ Θ* < 1 with Θ* = max{Θ(ε), Θ(2α/L² - ε)}. Hence,
‖x_k - x̅‖ ≤ (Θ*)^k‖x_0 - x̅‖ + ∑_{j=0}^{k-1} (Θ*)^j δ_{k-j}(x̅).
The result follows from Ortega and Rheinboldt [9, page 338] and the fact that lim_{k→+∞} δ_k(x̅) = 0. The latter follows thanks to the equivalence between graph convergence of the sequence of operators (β_k∂Ψ) to ∂δ_C and the pointwise convergence of their resolvent operators, combined with the fact that lim_{k→+∞} λ_k = λ*.
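A small numerical illustration of Proposition 2.3 (with assumed toy data, not from the paper) is sketched below: for an affine strongly monotone operator, the error ‖x_k - x̅‖ decays essentially geometrically at rate Θ*, up to the vanishing perturbations δ_k(x̅).

```python
import numpy as np

# Assumed toy data (illustration only): A(x) = M x - b with M = diag(1, 1.5) is
# alpha-strongly monotone (alpha = 1) and L-Lipschitz (L = 1.5); C = [0,1]^2 and
# Psi = (1/2) dist(., C)^2. The solution of the VI is x_bar = (1, 1), checked by hand.

M = np.diag([1.0, 1.5]); b = np.array([2.0, 2.0])
alpha, L = 1.0, 1.5
lam = 0.9 * 2 * alpha / L**2                           # lam in ]0, 2*alpha/L^2[
theta = np.sqrt(1 - 2 * lam * alpha + lam**2 * L**2)   # contraction factor Theta(lam)

P_C = lambda z: np.clip(z, 0.0, 1.0)
x_bar = np.array([1.0, 1.0])
x, errs = np.array([5.0, -5.0]), []
for k in range(1, 151):
    t = lam * k**2                                     # beta_k = k^2, so sum 1/beta_k < +infinity
    z = x - lam * (M @ x - b)
    x = z / (1 + t) + t / (1 + t) * P_C(z)             # relaxed projection = prox of beta_k*Psi
    errs.append(np.linalg.norm(x - x_bar))

print(theta)                        # Theta* < 1
print(errs[9], errs[49], errs[-1])  # errors decay roughly like theta**k, then like 1/beta_k
```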
3. Applications
(1) Hierarchical Minimization
Having in mind the connection between monotone operators and convex functions, we may consider the special case A = ∇Φ, Φ being a proper lower semicontinuous differentiable convex function. Differentiability of Φ ensures that ∇Φ + N_{argmin Ψ} = ∂(Φ + δ_{argmin Ψ}) and (1.2) reads as
min_{x∈argmin Ψ} Φ(x).
Using the definition of the proximal mapping, algorithm (1.9) reads as
x_k = argmin_{y∈H} { β_kΨ(y) + (1/(2λ_k))‖y - (I - λ_kA)x_{k-1}‖² }.
In this case, it is well-known that the assumption (2.1) of inverse strong monotonicity of ∇Φ is equivalent to its L-Lipschitz continuity. If we further assume ∑_{k=1}^{∞} δ_k(x̅) < +∞ for all x̅ ∈ VI(∇Φ,C) and λ_k ∈ ]ε, 2/L - ε[, then by Theorem 2.1 we obtain weak convergence of algorithm (3.2) to a solution of (3.1). The strong convergence is obtained, thanks to Proposition 2.3, if in addition Φ is strongly convex (i.e., there is α > 0 such that
(1 - μ)Φ(x_1) + μΦ(x_2) ≥ Φ((1 - μ)x_1 + μx_2) + (α/2)μ(1 - μ)‖x_1 - x_2‖²
for all μ ∈ [0,1] and all x_1, x_2 ∈ H), and (λ_k) is a convergent sequence with λ_k ∈ ]ε, 2α/L² - ε[. Note that strong convexity of Φ is equivalent to α-strong monotonicity of its gradient. A concrete example in signal recovery is the projected Landweber problem, namely,
min_{x∈C} Φ(x) := (1/2)‖Lx - z‖², L being a bounded linear operator. Set A(x) := ∇Φ(x) = L*(Lx - z). Consequently,
∀x, y ∈ H:   ‖A(x) - A(y)‖ = ‖L*L(x - y)‖ ≤ ‖L‖²‖x - y‖,
and A is therefore Lipschitz continuous with constant ‖L‖². Now, it is well-known that the problem possesses exactly one solution if L is bounded below, that is,
∃κ > 0 such that ∀x ∈ H:   ‖L(x)‖ ≥ κ‖x‖.
In this case, A is strongly monotone. Indeed, it is easily seen that Φ is strongly convex: for x, y ∈ H and μ ∈ ]0,1[, one has
(1/2)‖μ(Lx - z) + (1 - μ)(Ly - z)‖² ≤ (μ/2)‖Lx - z‖² + ((1 - μ)/2)‖Ly - z‖² - (κ²/2)μ(1 - μ)‖x - y‖².
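Both claims are easy to check numerically; the sketch below (assumed random data, not from the paper) verifies the Lipschitz bound with constant ‖L‖² and the strong monotonicity modulus κ², where κ is the smallest singular value of a full-column-rank matrix L.

```python
import numpy as np

# Quick numerical check (assumed random data) of the two claims above:
# A = grad Phi is ||L||^2-Lipschitz, and when L is bounded below (full column rank)
# A is strongly monotone with modulus kappa^2, kappa the smallest singular value of L.

rng = np.random.default_rng(1)
L_op = rng.normal(size=(8, 5))                 # almost surely full column rank, hence bounded below
z = rng.normal(size=8)
A = lambda x: L_op.T @ (L_op @ x - z)

Lip = np.linalg.norm(L_op, 2) ** 2             # ||L||^2
kappa = np.linalg.svd(L_op, compute_uv=False).min()

x, y = rng.normal(size=5), rng.normal(size=5)
print(np.linalg.norm(A(x) - A(y)) <= Lip * np.linalg.norm(x - y) + 1e-12)      # Lipschitz bound
print((A(x) - A(y)) @ (x - y) >= kappa**2 * np.linalg.norm(x - y)**2 - 1e-12)  # strong monotonicity
```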
(2) Classical Penalization
In the special case where Ψ(x) = (1/2)dist(x,C)², we have
∂Ψ(x) = x - P_C(x),
which is nothing but the classical penalization operator, see [10]. In this context, taking into account the fact that
((∂f)_λ)_μ = ∇f_{λ+μ},   J_{λ}^{∂f} = I - λ(∂f)_λ = I - λ∇f_λ,   (δ_C)_λ = (1/λ)Ψ,
and that x̅ solves (1.2), and thus x̅=PC(x̅-λkAx̅), we successively have
‖x̅_k - x̅‖ = ‖J_{λ_kβ_k}^{∂Ψ}(x̅ - λ_kAx̅) - J_{λ_k}^{N_C}(x̅ - λ_kAx̅)‖
= λ_k‖(β_k∂Ψ)_{λ_k}(x̅ - λ_kAx̅) - (N_C)_{λ_k}(x̅ - λ_kAx̅)‖
= λ_k‖β_k(∂Ψ)_{λ_kβ_k}(x̅ - λ_kAx̅) - ∇(δ_C)_{λ_k}(x̅ - λ_kAx̅)‖
= λ_k‖β_k(∂(δ_C)_1)_{λ_kβ_k}(x̅ - λ_kAx̅) - ∇(δ_C)_{λ_k}(x̅ - λ_kAx̅)‖
= λ_k‖β_k∇(δ_C)_{1+λ_kβ_k}(x̅ - λ_kAx̅) - ∇(δ_C)_{λ_k}(x̅ - λ_kAx̅)‖
= λ_k(1/λ_k - β_k/(1 + λ_kβ_k))‖(x̅ - λ_kAx̅) - P_C(x̅ - λ_kAx̅)‖
= (1/(1 + λ_kβ_k))‖λ_kAx̅‖ ≤ (1/β_k)‖Ax̅‖.
So the condition on the parameters reduces to ∑_{k=1}^{∞} 1/β_k < +∞, and algorithm (1.9) is nothing but a relaxed projection-gradient method. Indeed, using (1.5) and the fact that J_{λ}^{N_C} = P_C, we obtain
x_k = ((1/(1 + λ_kβ_k)) I + (λ_kβ_k/(1 + λ_kβ_k)) P_C)(I - λ_kA)x_{k-1}.
An inspection of the proof of Theorem 2.1 shows that weak convergence is assured with λ_k ∈ ]ε, 2/L - ε[.
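As a quick check of the relaxed-projection formula, the following sketch (assumed data, C = [0,1]², not from the paper) verifies that the point p produced by the formula satisfies p + λ_kβ_k(p - P_C(p)) = z, that is, p = (I + λ_kβ_k∂Ψ)^{-1}(z).

```python
import numpy as np

# Check of the relaxed-projection formula above: with Psi = (1/2) dist(., C)^2 and
# dPsi(x) = x - P_C(x), the point p given by the formula solves p + lam*beta*dPsi(p) = z.
# Assumed example: C = [0,1]^2.

P_C = lambda v: np.clip(v, 0.0, 1.0)
lam, beta = 0.3, 25.0
z = np.array([2.0, -0.7])

t = lam * beta
p = z / (1 + t) + t / (1 + t) * P_C(z)             # relaxed projection of z
print(np.linalg.norm(p + t * (p - P_C(p)) - z))    # ~0: p = (I + lam*beta*dPsi)^{-1}(z)
```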
(3) A Hierarchical Fixed-Point Problem
Having in mind the connection between inverse strongly monotone operators and nonexpansive mappings, we may consider the following fixed-point problem:
(I-P)x+NC(x)∋0,
with P a nonexpansive mapping, namely, ∥Px-Py∥≤∥x-y∥.
It is well-known that A=I-P is inverse strongly monotone with L=2. Indeed, by definition of P, we have
‖(I-A)x-(I-A)y‖≤‖x-y‖.
On the other hand
‖(I - A)x - (I - A)y‖² = ‖x - y‖² + ‖Ax - Ay‖² - 2〈x - y, Ax - Ay〉.
Combining the two last inequalities, we obtain
〈x - y, Ax - Ay〉 ≥ (1/2)‖Ax - Ay‖².
Therefore, by Theorem 2.1 we get the weak convergence of the sequence (xk) generated by the following algorithm:
x_k = prox_{λ_kβ_kΨ}((1 - λ_k)x_{k-1} + λ_kPx_{k-1})
to a solution of (3.12), provided that ∑_{k=1}^{∞} δ_k(x̅) < +∞ for all x̅ ∈ VI(I-P, C) and λ_k ∈ ]ε, 1-ε[. The strong convergence of (1.9) is obtained, by applying Proposition 2.3, for P a contraction mapping, namely, ‖Px - Py‖ ≤ γ‖x - y‖ with 0 < γ < 1, which is equivalent to the (1-γ)-strong monotonicity of I - P, and for (λ_k) a convergent sequence with λ_k ∈ ]ε, 2(1-γ)/(1+γ)² - ε[. It is easily seen that in this case I - P is (1+γ)-Lipschitz continuous.
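A toy run of this hierarchical fixed-point iteration is sketched below (assumed data, not from the paper): P(x) = x/2 + (1,1) is a 1/2-contraction, C = [0,1]², and Ψ = (1/2)dist(·,C)², so that the proximal step is again a relaxed projection.

```python
import numpy as np

# Toy run of the hierarchical fixed-point iteration above (assumed data):
# P(x) = 0.5*x + (1, 1) is a 1/2-contraction with fixed point (2, 2), C = [0,1]^2,
# and Psi = (1/2) dist(., C)^2 so that prox of lam*beta*Psi is the relaxed projection.

P   = lambda x: 0.5 * x + np.array([1.0, 1.0])
P_C = lambda v: np.clip(v, 0.0, 1.0)

x, lam = np.zeros(2), 0.4            # lam in ]0, 2(1-gamma)/(1+gamma)^2[ with gamma = 1/2
for k in range(1, 1001):
    t = lam * k ** 2                  # beta_k = k^2
    v = (1 - lam) * x + lam * P(x)    # gradient step on A = I - P
    x = v / (1 + t) + t / (1 + t) * P_C(v)

print(x)                              # approaches (1, 1), the solution of (I-P)x + N_C(x) ∋ 0
```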
4. Towards the Multivalued Case
Now, we are interested in (1.2) when A : H → 2^H is a multivalued maximal monotone operator. With the help of the Yosida approximate, which is always inverse strongly monotone (and thus single-valued), we consider the following partial regularized version of (1.2):
A_γx_γ^* + N_C(x_γ^*) ∋ 0,
where Aγ stands for the Yosida approximate of A.
It is well-known that Aγ is inverse strongly monotone. More precisely, we have
〈A_γx - A_γy, x - y〉 ≥ γ‖A_γx - A_γy‖².
Using the definition of the Yosida approximate, algorithm (1.9) applied to (4.1) reads as
x_k^γ = (I + λ_kβ_k∂Ψ)^{-1}((1 - λ_k/γ)x_{k-1}^γ + (λ_k/γ)J_γ^A(x_{k-1}^γ)).
From Theorem 2.1, we infer that x_k^γ converges weakly to a solution x̅_γ provided that λ_k ∈ ]ε, 2γ - ε[. Furthermore, it is worth mentioning that if A is strongly monotone, then A_γ is also strongly monotone, and thus (4.1) has a unique solution x̅_γ. By a result in [8, page 35], we have the following estimate:
‖x̅ - x̅_γ‖ ≤ o(γ).
Consequently, (4.3) provides approximate solutions to the variational inclusion (1.2) for small values of γ. Furthermore, when A = ∂Φ, we have
(∂Φ)_γ(x̅) + N_C(x̅) = ∇Φ_γ(x̅) + N_C(x̅) = ∂(Φ_γ + δ_C)(x̅),
and thus (4.1) reduces to min_{x∈C} Φ_γ(x).
If (3.1) and (4.1) are solvable, then, by [11, Theorem 3.3], we have for all γ > 0: 0 ≤ min_{x∈C} Φ(x) - min_{x∈C} Φ_γ(x) ≤ γ‖y̅‖²,
where y̅ ∈ ∂Φ(x̅) with y̅ ∈ -N_C(x̅) and x̅ a solution of (3.1). The value of (3.1) is thus close to that of (4.1) for small values of γ, which confirms the pertinence of the proposed approximation idea for reaching the multivalued case. Observe that in this context, algorithm (4.3) reads as
x_k^γ = prox_{λ_kβ_kΨ}((1 - λ_k/γ)x_{k-1}^γ + (λ_k/γ)prox_{γΦ}(x_{k-1}^γ)).
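A toy instance of this last scheme is sketched below (assumed data, not from the paper): Φ = ‖·‖₁, so A = ∂Φ is multivalued and prox_{γΦ} is the soft-thresholding operator, C = [0.5,1]², and Ψ = (1/2)dist(·,C)².

```python
import numpy as np

# Toy instance of the scheme above (assumed data): Phi = ||.||_1, so A = dPhi is
# multivalued and prox of gamma*Phi is soft-thresholding; C = [0.5, 1]^2; and
# Psi = (1/2) dist(., C)^2, so the outer proximal step is again a relaxed projection.

soft = lambda v, g: np.sign(v) * np.maximum(np.abs(v) - g, 0.0)
P_C  = lambda v: np.clip(v, 0.5, 1.0)

gamma = 0.25
x, lam = np.array([2.0, -1.0]), 0.4 * gamma    # lam in ]0, 2*gamma[
for k in range(1, 2001):
    t = lam * k ** 2                            # beta_k = k^2
    v = (1 - lam / gamma) * x + (lam / gamma) * soft(x, gamma)
    x = v / (1 + t) + t / (1 + t) * P_C(v)

print(x)   # approaches (0.5, 0.5), the solution of min over C of the Moreau envelope of Phi
```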
5. Conclusion
We have introduced a forward-backward penalization-gradient algorithm for solving variational inequalities and studied its asymptotic convergence properties. We have provided some applications to hierarchical fixed-point and optimization problems and also proposed an idea to reach monotone variational inclusions.
Acknowledgment
We gratefully acknowledge the constructive comments of the anonymous referees, which helped us to improve the first version of this paper.
References
[1] G. Stampacchia, "Formes bilinéaires coercitives sur les ensembles convexes," Comptes Rendus de l'Académie des Sciences de Paris, vol. 258, pp. 4413–4416, 1964.
[2] H. Attouch, M. O. Czarnecki, and J. Peypouquet, "Prox-penalization and splitting methods for constrained variational problems," SIAM Journal on Optimization, vol. 21, no. 1, pp. 149–173, 2011.
[3] G. B. Passty, "Ergodic convergence to a zero of the sum of monotone operators in Hilbert space," Journal of Mathematical Analysis and Applications, vol. 72, no. 2, pp. 383–390, 1979.
[4] B. Lemaire, "Coupling optimization methods and variational convergence," in International Series of Numerical Mathematics, vol. 84, pp. 163–179, Birkhäuser, Basel, Switzerland, 1988.
[5] J. Gwinner, "On the penalty method for constrained variational inequalities," in Lecture Notes in Pure and Applied Mathematics, vol. 86, pp. 197–211, Dekker, New York, NY, USA, 1983.
[6] R. T. Rockafellar and R. J.-B. Wets, Variational Analysis, vol. 317, Springer, Berlin, Germany, 1998.
[7] P. L. Martinet, Université de Grenoble, Grenoble, France, 1972.
[8] H. Brézis, Opérateurs Maximaux Monotones et Semi-Groupes de Contractions dans les Espaces de Hilbert, North-Holland Mathematics Studies no. 5, North-Holland, Amsterdam, The Netherlands, 1973.
[9] J. M. Ortega and W. C. Rheinboldt, Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York, NY, USA, 1970.
[10] D. Pascali and S. Sburlan, Nonlinear Mappings of Monotone Type, Martinus Nijhoff, The Hague, The Netherlands, 1978.
[11] N. Lehdili, Université de Montpellier, Montpellier, France, 1996.