Convergence of a Proximal Point Algorithm for Solving Minimization Problems

Copyright © 2012 Abdelouahed Hamdi et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

We introduce and consider a proximal point algorithm for solving minimization problems using the technique of Güler. This proximal point algorithm is obtained by replacing the usual quadratic proximal term with a class of convex nonquadratic distance-like functions. It can be seen as an extragradient iterative scheme. We establish the convergence rate of this new proximal point method under mild assumptions. Furthermore, it is shown that this rate estimate is better than the available ones.


Introduction
The purpose of this paper is twofold. Firstly, it proposes an extension of the proximal point method introduced by Güler [1] in 1992, where the usual quadratic proximal term is replaced by a class of strictly convex distance-like functions, called Bregman functions. Secondly, it offers a general framework for the convergence analysis of the proximal point method of Güler. This framework is general enough to apply to different classes of Bregman functions and still yield simple convergence proofs. The methods analyzable in this context are called Güler's generalized proximal point algorithms and are closely related to the Bregman proximal methods [2-5]. The analysis we develop differs from the works in [4, 5], since our method is based on Güler's technique.

Preliminaries
To be more specific, we consider the minimization problem in the following form:
$$\min\{f(x) : x \in \mathbb{R}^n\}. \tag{2.1}$$
We also denote by $\rho(x, X)$ the distance from $x$ to the set $X$, given by $\rho(x, X) = \min_{y \in X} \|x - y\|$. Further notations and definitions used in this paper are standard in convex analysis and may be found in Rockafellar's book [7].
This type of kernel was first introduced by Brègman [8] in 1967. The corresponding algorithm using these Bregman proximal mappings is called the generalized proximal point method (GPPM), also known as the Bregman proximal method. These proximal methods solve (2.1) by considering a sequence of unconstrained minimization problems, which can be summarized as follows.
(2) Compute the solution $x^{k+1}$ by the iterative scheme (2.4), where $\{c_k\}$ is a sequence of positive numbers and $D_h(\cdot, \cdot)$ is defined by (2.3).
For $D_h(x, y) = \tfrac{1}{2}\|x - y\|^2$, Algorithm 2.1 coincides with the classical proximal point algorithm (PPA) introduced by Moreau [9] and Martinet [10].
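To make the scheme concrete, the following Python sketch evaluates a Bregman distance and runs a few generalized proximal steps in one dimension. It assumes the standard form $D_h(x, y) = h(x) - h(y) - h'(y)(x - y)$ and the step $x^{k+1} = \arg\min_x \{ f(x) + c_k^{-1} D_h(x, x^k) \}$. The objective `f`, the two kernels, the search interval, and the helper names `bregman`, `argmin_1d`, and `gppm` are hypothetical choices made for illustration and are not taken from the paper.

```python
import math

def bregman(h, dh, x, y):
    """D_h(x, y) = h(x) - h(y) - h'(y) * (x - y); standard form, assumed here."""
    return h(x) - h(y) - dh(y) * (x - y)

def argmin_1d(phi, lo, hi, iters=200):
    """Ternary search for the minimizer of a convex function phi on [lo, hi]."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if phi(m1) <= phi(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def gppm(f, h, dh, x0, c_seq, lo=1e-9, hi=50.0):
    """Assumed generalized proximal step: x_next = argmin f(u) + D_h(u, x_k) / c_k."""
    xs = [x0]
    for c in c_seq:
        xk = xs[-1]
        xs.append(argmin_1d(lambda u: f(u) + bregman(h, dh, u, xk) / c, lo, hi))
    return xs

# Hypothetical test problem: f(x) = (x - 2)^2, minimized at x = 2.
f = lambda x: (x - 2.0) ** 2
# Quadratic kernel (recovers the classical PPA) and an entropy-type kernel on (0, inf).
quad = (lambda t: 0.5 * t * t, lambda t: t)
entropy = (lambda t: t * math.log(t) - t, lambda t: math.log(t))

xs_quad = gppm(f, *quad, 5.0, [1.0] * 30)
xs_ent = gppm(f, *entropy, 5.0, [1.0] * 30)
print(xs_quad[-1], xs_ent[-1])
```

With the quadratic kernel the first step has the closed form $(4c + x^k)/(2c + 1)$ for this `f`, which gives a simple way to sanity-check the search routine. Both kernels drive the iterates toward the minimizer at 2.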
Under mild assumptions on the data of (2.1), ergodic convergence was proved in [2, 5] when $\sigma_n = \sum_{k=1}^{n} c_k \to \infty$, together with the global rate of convergence estimate (2.5).

Our purpose in this paper is to propose an algorithm of the same type as Algorithm 2.1 that has a better convergence rate. To this end, we propose to combine Güler's scheme [1] and the Bregman proximal method. The main difference concerns the generation of an additional sequence $\{y^k\} \subset \mathbb{R}^n$ in the unconstrained minimization (2.4). We show (see Section 4) that this new proximal method possesses a rate estimate that is faster than (2.5). Further, the convergence in terms of the objective values occurs when $\gamma_n \to \infty$, which is weaker than $\sigma_n \to \infty$.
We briefly recall here the notion of Bregman functions, also called D-functions, introduced by Brègman [8] in 1967. From the definition, we extract the following properties (see, for instance, [6, 13]).

Lemma 2.3. Let $h$ be a Bregman function with zone $S$. Then,
(i) $D_h(x, x) = 0$ and $D_h(x, y) \ge 0$ for $x \in \bar{S}$ and $y \in S$,
(ii) for all $a, b \in S$ and $c \in \bar{S}$,
(iii) for all $a, b \in S$,

then $g$ is a Bregman function.
(ii) If $g$ is a Bregman function, then $g(x) + c^{\top} x + d$, for any $c \in \mathbb{R}^n$, $d \in \mathbb{R}$, is also a Bregman function.
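As a worked check, two identities can be derived for any Bregman distance of the standard form $D_h(x, y) = h(x) - h(y) - \langle \nabla h(y), x - y \rangle$, which is assumed here. They use the same quantifiers as items (ii) and (iii) of Lemma 2.3 and are commonly known as the three-point identity and the symmetrized distance. Whether they coincide verbatim with the statements intended in the lemma is not confirmed by the text.

```latex
\begin{align*}
D_h(c,a) + D_h(a,b) - D_h(c,b)
  &= -\langle \nabla h(a),\, c-a\rangle - \langle \nabla h(b),\, a-b\rangle
     + \langle \nabla h(b),\, c-b\rangle \\
  &= \langle \nabla h(b) - \nabla h(a),\, c-a\rangle, \\[4pt]
D_h(a,b) + D_h(b,a)
  &= -\langle \nabla h(b),\, a-b\rangle - \langle \nabla h(a),\, b-a\rangle \\
  &= \langle \nabla h(a) - \nabla h(b),\, a-b\rangle .
\end{align*}
```

In both lines the values of $h$ cancel, so only the gradient terms remain.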
Remark 2.5. $D_h(\cdot, \cdot)$ cannot be considered a distance, since it lacks both the triangle inequality and the symmetry property. $D_h(\cdot, \cdot)$ is usually called an entropy distance.
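A small numerical illustration of this remark, assuming the standard form $D_h(x, y) = h(x) - h(y) - h'(y)(x - y)$ in one dimension. The entropy-type kernel and the sample points are hypothetical choices made for the example.

```python
import math

def bregman(h, dh, x, y):
    """D_h(x, y) = h(x) - h(y) - h'(y) * (x - y); standard form, assumed here."""
    return h(x) - h(y) - dh(y) * (x - y)

quad = (lambda t: 0.5 * t * t, lambda t: t)
entropy = (lambda t: t * math.log(t) - t, lambda t: math.log(t))

# Lack of symmetry: swapping the arguments changes the value.
d_12 = bregman(*entropy, 1.0, 2.0)   # equals 1 - log 2
d_21 = bregman(*entropy, 2.0, 1.0)   # equals 2 log 2 - 1

# Lack of the triangle inequality, already for the quadratic kernel:
# going from 0 to 2 directly costs more than passing through 1.
direct = bregman(*quad, 0.0, 2.0)
detour = bregman(*quad, 0.0, 1.0) + bregman(*quad, 1.0, 2.0)

print(d_12, d_21, direct, detour)
```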
The paper is organized as follows. In Section 3, we briefly recall the proximal point method of Güler. Section 4 is devoted to the presentation and convergence analysis of the proposed algorithm. Finite convergence is shown in Section 5. Finally, in Section 6 we present an application of this method to variational inequality problems.

Extragradient Algorithm
In 1992, Güler [1] developed a new proximal point approach, similar to the classical PPA, based on an idea of Nesterov [14]. Güler's proximal point algorithm (GPPA) can be summarized as follows.
(iii) Compute the solution $x^{k+1}$ by the iterative scheme (3.1).
For the convergence analysis, see Güler [1].
Remark 3.2. The GPPA can be seen as a suitable conjugate-gradient-type modification of the PPA of Rockafellar applied to (2.1).
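For orientation, the following Python sketch compares the classical PPA with an accelerated scheme of Güler type on the one-dimensional problem $f(x) = |x|$. The update formulas for `alpha`, `y`, and `v` follow the form commonly attributed to Güler's method in the literature; they are an assumption here and are not confirmed by the text. The function names, the test function, the step size, and the constant `A` are likewise hypothetical.

```python
import math

def prox_abs(y, c):
    """Proximal map of f(x) = |x|: argmin_x |x| + (x - y)^2 / (2c) (soft-thresholding)."""
    return math.copysign(max(abs(y) - c, 0.0), y)

def classical_ppa(x0, c, steps):
    """Classical proximal point iteration x_{k+1} = prox(x_k)."""
    x = x0
    for _ in range(steps):
        x = prox_abs(x, c)
    return x

def guler_ppa(x0, c, steps, A=1.0):
    """Assumed Guler-type scheme; returns the last iterate and beta = A_n / A."""
    x, v, a_k = x0, x0, A
    for _ in range(steps):
        t = a_k * c
        # alpha solves alpha^2 = (1 - alpha) * a_k * c, so 0 < alpha < 1.
        alpha = (math.sqrt(t * t + 4.0 * t) - t) / 2.0
        y = (1.0 - alpha) * x + alpha * v      # auxiliary point y_k
        x_next = prox_abs(y, c)                # proximal step taken from y_k
        v = v + (x_next - y) / alpha           # update of the sequence nu_k
        a_k = (1.0 - alpha) * a_k
        x = x_next
    return x, a_k / A

x_plain = classical_ppa(10.0, 0.1, 40)
x_acc, beta = guler_ppa(10.0, 0.1, 40)
# Estimate-sequence bound f(x_n) - f* <= beta * (f(x_0) - f* + (A/2) |x_0 - x*|^2),
# here with A = 1, x* = 0, and f* = 0.
bound = beta * (abs(10.0) + 0.5 * 10.0 ** 2)
print(abs(x_plain), abs(x_acc), bound)
```

On this example the classical iteration moves by a fixed amount `c` per step, while the accelerated iterate is controlled by the product of the factors `1 - alpha`, which decays like the inverse square of the accumulated $\sqrt{c_k}$.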

Introduction
The method that we propose is a modification of Güler's new proximal point approach (GPPA) discussed in Section 3 and can be considered a nonlinear, or nonquadratic, version of GPPA with Bregman kernels. We show that this method, which we call BGPPA, possesses the strong convergence results obtained by Güler [1], and therefore this new scheme provides faster global convergence rates than the classical Bregman proximal point methods (cf. [2, 4-6, 11, 13, 15]). The algorithm, which generalizes Güler's proximal point algorithm, can be summarized as follows.
Algorithm 4.1.
(i) Initialize:
(iii) Compute the solution $x^{k+1}$ by the iterative scheme (4.1).

In this section we develop convergence results for the generalized Güler proximal point algorithm (GGPPA) presented in Section 4.2. Our analysis rests on the following lemma.
Theorem 4.3. For all $x \in S$ such that $f(x) < \infty$, one has the convergence rate estimate (4.3).
Proof. Using the fact that $\varphi_0(x) := f(x^0) + A D_h(x, x^0)$, $x^0 \in S$, and Lemma 4.2, we obtain the stated estimate.
Proof of Theorem 4.4. (a) Follows from [4, Theorem 4]. (b) Follows from assumption (2.3) by taking $x = x^*$ in (4.3). (c) Is obvious. (d) It suffices to observe that if $c_k \ge c > 0$, we have (4.9).

Convergence Rate of GGPPA
If $\{x^k\}$ is a sequence of points, one forms the sequence $\{z^n\}$ of weighted averages given by
$$z^n = \sigma_n^{-1} \sum_{k=1}^{n} c_k x^k,$$
where $c_k > 0$ and $\sigma_n = \sum_{k=1}^{n} c_k$. If the sequence $\{z^n\}$ converges, then $\{x^k\}$ is said to converge ergodically.
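The weighted averages can be computed incrementally, as in the Python sketch below. It assumes the form $z^n = \sigma_n^{-1} \sum_{k=1}^{n} c_k x^k$ with $\sigma_n = \sum_{k=1}^{n} c_k$. The function name `ergodic_averages` and the sample sequences are hypothetical.

```python
def ergodic_averages(xs, cs):
    """Return [z_1, ..., z_n] with z_n = (sum_{k<=n} c_k x_k) / (sum_{k<=n} c_k)."""
    if len(xs) != len(cs):
        raise ValueError("xs and cs must have the same length")
    zs = []
    weighted_sum = 0.0
    sigma = 0.0
    for x, c in zip(xs, cs):
        if c <= 0:
            raise ValueError("weights c_k must be positive")
        weighted_sum += c * x
        sigma += c
        zs.append(weighted_sum / sigma)
    return zs

# An oscillating sequence does not converge, yet its averages settle down.
xs = [(-1.0) ** k for k in range(1000)]
zs = ergodic_averages(xs, [1.0] * len(xs))
print(zs[-1])
```

The example shows why ergodic convergence is weaker than convergence of the iterates themselves: the sequence alternates between 1 and -1, while its averages tend to 0.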
Theorem 5.1. GGPPA possesses the following convergence rate:

Proof. Let $x^*$ be a minimizer of $f$. For brevity, we denote

At optimality in the unconstrained minimization in GGPPA, we can write

and by the convexity of $f$, we have

Setting $x = x^k$ in (5.5), we obtain

and for $x = y^k$, we have

Or again, if we set $x = x^*$ in (5.5) and use the Cauchy-Schwarz inequality, we obtain

that is,

Since $h$ is convex, $\langle x^{k+1} - y^k, \Delta \rangle \le 0$. Then we can write (5.11).
Using the relation $\|\Delta\|^2 \le L \langle \Delta, y^k - x^{k+1} \rangle$ and the inequality (5.7), we get the relation (5.12).
For short, we denote $M_k = \|y^k - x^*\|$, in terms of which (5.12) can be rewritten. Then, dividing both sides by $L M_k^2 > 0$, we get (5.14).
Since the left-hand side is positive, we obtain (5.15).
Now, following Güler [17, page 410], we use the fact that $(1 + x)^{-1} \le 1 - 2x/3$ for all $x \in [0, 1/2]$. To apply this inequality, it suffices to show that $(c_k / (L M_k^2))\, W_{k+1}$ is less than or equal to $1/2$. This can be deduced from the relation of Lemma 2.3(ii), using $D_h(\cdot, \cdot) \ge 0$. The proof of the resulting inequality (5.17) can be found in the proof of Theorem 4.4(b). Therefore, $0 < (c_k / (L M_k^2))\, W_{k+1} \le 1/2$, and we obtain (5.18).
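The elementary inequality borrowed from Güler can be verified directly. The short derivation below is a sketch added for the reader's convenience and uses only the variable $x$ from the statement.

```latex
\[
\Bigl(1 - \tfrac{2x}{3}\Bigr) - \frac{1}{1+x}
  = \frac{(1+x)\bigl(1 - \tfrac{2x}{3}\bigr) - 1}{1+x}
  = \frac{\tfrac{x}{3} - \tfrac{2x^{2}}{3}}{1+x}
  = \frac{x\,(1 - 2x)}{3\,(1+x)} \;\ge\; 0
  \qquad \text{for } 0 \le x \le \tfrac{1}{2}.
\]
```

Equality holds only at the endpoints $x = 0$ and $x = 1/2$, which is why the bound on $(c_k / (L M_k^2))\, W_{k+1}$ by $1/2$ is exactly what is needed.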
To continue the proof, we distinguish several cases.

and, by summation from $k = 0$ to $k = n$, we get (5.22).
Since $x^*$ is an arbitrary solution, we can write (5.23) and, multiplying both sides by $\sigma_{n+1}$, obtain (5.24).
Since $y^k$ and $x^{k+1}$ converge to the same point (this can be seen from the formula giving $\nu^{k+1}$ in the algorithm GGPPA) and $\rho(x^{k+1}, X^*) \to 0$, we have $\rho(x^{k+1}, X^*)^{-2} \to \infty$; hence, we obtain

Thus, using inequality (5.18), we write

and, by summation from $k = 0$ to $k = n$, we get (5.30).
Since $x^*$ is an arbitrary solution, we can write (5.31) and, multiplying both sides by $\sigma_{n+1}$, obtain (5.32).
Since $y^k$ and $x^{k+1}$ converge to the same point (this can be seen from the formula giving $\nu^{k+1}$ in the algorithm GGPPA) and $\rho(x^{k+1}, X^*) \to 0$, we have $\rho(x^{k+1}, X^*)^{-2} \to \infty$; hence, we obtain
$$0 \le \sigma_{n+1} \bigl( f(x^{n+1}) - f^* \bigr) \longrightarrow 0, \tag{5.33}$$
which implies (5.35).
Case 3. If $f(x^k) \le f(x^{k+1}) \le f(y^k)$. In this case, we observe that the sequence $\{f(x^k)\}$ is increasing, which may imply divergence of the approach.
Since $f$ is convex, the following convergence rate estimate can be derived directly.

Conclusion
We have introduced an extragradient method for solving convex minimization problems. The algorithm is based on a generalization of the technique originally proposed by Nesterov [14] and readapted by Güler in [1, 17], where the usual quadratic proximal term is replaced by a class of convex nonquadratic distance-like functions. The new algorithm has a better theoretical convergence rate than the available ones. This naturally motivates the study of the numerical efficiency of the new algorithm and its application to solving variational inequality problems [18, 19]. Further efforts are also needed to extend the present study to nonconvex situations and to apply it to nonconvex equilibrium problems [20].

Corollary 5.2. If one assumes that $c_k \ge c > 0$ for all $k$, then
$$f(z^n) - f^* = o\!\left(\frac{1}{n}\right). \tag{5.36}$$

In (2.1), $f : \mathbb{R}^n \to \mathbb{R} \cup \{\infty\}$ is a closed proper convex function. To solve the problem (2.1), Teboulle [6], Chen and Teboulle [2, 6], Eckstein [4], and Burachik [3] proposed a general scheme using the Bregman proximal mappings of the type $J_c$. Here $\|\cdot\|$ denotes the $\ell_2$-norm and $\langle \cdot, \cdot \rangle$ denotes the Euclidean inner product.

Bregman functions were developed and used in the proximal theory by [4, 6, 11-13]. Let $S$ be an open subset of $\mathbb{R}^n$, let $h : \bar{S} \to \mathbb{R}$ be a finite-valued function that is continuously differentiable on $S$, and let $D_h$ be defined by
$$D_h(x, y) = h(x) - h(y) - \langle x - y, \nabla h(y) \rangle. \tag{2.3}$$
(a) $h$ is continuously differentiable on $S$ and continuous on $\bar{S}$,
(b) $h$ is strictly convex on $S$,
(c) for every $\lambda \in \mathbb{R}$, the partial level sets $L_1(y, \lambda) = \{x \in \bar{S} : D_h(x, y) \le \lambda\}$ and $L_2(x, \lambda) = \{y \in S : D_h(x, y) \le \lambda\}$ are bounded for every $y \in S$ and $x \in \bar{S}$, respectively,

Note that the finite convergence property was established for the classical proximal point algorithm in the case of sharp minima; see, for example, [16]. Recently, Kiwiel [5] extended this property to his generalized Bregman proximal method (BPM). In the following theorem we prove that Algorithm 3.1 has this property. Our proof is based on Kiwiel's [5, Theorem 6.1, page 1151].
Definition 4.5. A closed proper convex function $f : \mathbb{R}^n \to \mathbb{R}$ is said to have a sharp minimum on $\mathbb{R}^n$ if and only if there exists $\tau > 0$ such that
$$f(x) \ge \min_{\mathbb{R}^n} f + \tau \min_{z \in \operatorname{Argmin} f} \|x - z\| \quad \forall x. \tag{4.10}$$
Theorem 4.6. Under the same hypotheses as in Theorem 4.4, consider GGPPA with $f$ having a sharp minimum on $\mathbb{R}^n$ and $c_k$ bounded. Then there exists $k$ such that $0 \in \partial f(x^k)$ and $x^k \in X^*$.
Proof. Straightforward, using Theorem 4.4 and [5, Theorem 6.1, page 1151].
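The role of the sharp minimum can be illustrated with the classical PPA (quadratic kernel) rather than GGPPA itself. In the Python sketch below, $f(x) = |x|$ satisfies (4.10) with $\tau = 1$ and the iteration reaches the minimizer exactly after finitely many steps, whereas $f(x) = x^2/2$ has no sharp minimum and the iterates only approach it. The function names, starting point, and step size are hypothetical.

```python
import math

def prox_abs(y, c):
    """Proximal map of f(x) = |x| (sharp minimum at 0 with tau = 1)."""
    return math.copysign(max(abs(y) - c, 0.0), y)

def prox_half_square(y, c):
    """Proximal map of f(x) = x^2 / 2 (minimum at 0, but not sharp)."""
    return y / (1.0 + c)

def ppa(prox, x0, c, steps):
    """Classical proximal point iteration x_{k+1} = prox(x_k, c)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(prox(xs[-1], c))
    return xs

sharp_path = ppa(prox_abs, 3.5, 1.0, 6)
smooth_path = ppa(prox_half_square, 3.5, 1.0, 6)

# Condition (4.10) fails for x^2 / 2 with any fixed tau near the minimizer,
# e.g. tau = 0.1 at x = 0.01.
violates = 0.5 * 0.01 ** 2 < 0.1 * abs(0.01)
print(sharp_path, smooth_path[-1], violates)
```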