New Convergence Properties of the Primal Augmented Lagrangian Method

Abstract and Applied Analysis


Introduction
In this paper, we consider the following nonlinear programming problem:

$$(P)\qquad \min f(x) \quad \text{s.t.}\quad g_i(x) \le 0,\ i = 1, \dots, m;\quad h_j(x) = 0,\ j = 1, \dots, l;\quad x \in \Omega,$$

where $f, g_i : \mathbb{R}^n \to \mathbb{R}$ for each $i = 1, \dots, m$ and $h_j : \mathbb{R}^n \to \mathbb{R}$ for each $j = 1, \dots, l$ are all continuously differentiable functions, and $\Omega$ is a nonempty closed set in $\mathbb{R}^n$. We denote by $X$ the feasible region and by $X^*$ the solution set.
Augmented Lagrangian algorithms are very popular tools for solving nonlinear programming problems. At each outer iteration of these methods, a simpler optimization problem is solved, for which efficient algorithms can be used, especially when the problems are large. The most famous augmented Lagrangian algorithm, based on the Powell-Hestenes-Rockafellar formula [1-3], has been successfully used for defining practical nonlinear programming algorithms [4-7]. At each iteration, a minimization problem with simple constraints is approximately solved, whereas the Lagrange multipliers and penalty parameters are updated in the master routine. The advantage of the augmented Lagrangian approach over other methods is that the subproblems can be solved using algorithms that can deal with a very large number of variables without making use of matrix factorizations of any kind.
An indispensable assumption in most existing global convergence analyses for augmented Lagrangian methods is that the multiplier sequence generated by the algorithm is bounded. This restrictive assumption confines the application of augmented Lagrangian methods in many practical situations. Important work in this direction includes [8], where global convergence of modified augmented Lagrangian methods for nonconvex optimization with equality constraints was established; Andreani et al. [4] and Birgin et al. [9] investigated augmented Lagrangian methods using safeguarding strategies for nonconvex constrained problems. Recently, for inequality-constrained global optimization, Luo et al. [10] established the convergence properties of the primal-dual method based on four types of augmented Lagrangian functions without the boundedness assumption on the multiplier sequence. More information can be found in [5, 11, 12].

In this paper, for the optimization problem $(P)$ with both equality and inequality constraints, we further study the convergence property of the proximal Lagrangian method without requiring the boundedness of multiplier sequences. The main contribution of this paper lies in the following three aspects. First, more general constraints are considered, without restricting to inequality constraints only as in [10, 13] or requiring boundedness of $X$ as in [9]. Second, an essential assumption on the global convergence properties given in [4-7, 9, 10] is that the iterative sequence $\{x^k\}$ must be convergent in advance; here, we further discuss the case when $\{x^k\}$ is divergent and develop a necessary and sufficient condition for $\{f(x^k)\}$ to converge to the optimal value of the primal problem. Third, the definition of degeneracy in [9, 10] is extended from inequality constraints to both inequality and equality constraints.
This paper is organized as follows. In Section 2, we propose the multiplier algorithm and study its global convergence properties. Preliminary numerical results are reported in Section 3. The conclusion is drawn in Section 4.

Multiplier Algorithms
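The primal augmented Lagrangian function for $(P)$ is

$$L(x, \lambda, \mu, c) := f(x) + \frac{c}{2}\left[\sum_{j=1}^{l}\left(h_j(x) + \frac{\mu_j}{c}\right)^2 + \sum_{i=1}^{m}\max\left\{0,\ g_i(x) + \frac{\lambda_i}{c}\right\}^2\right], \tag{2.1}$$

where $c \in \mathbb{R}_{++}$, and $\mathbb{R}_{++}$ denotes the set of all positive real scalars, that is, $\mathbb{R}_{++} = \{a \in \mathbb{R} \mid a > 0\}$. Given $(\lambda, \mu, c)$, the augmented Lagrangian relaxation problem associated with the augmented Lagrangian $L$ is defined by

$$(L_{\lambda,\mu,c})\qquad \min L(x, \lambda, \mu, c) \quad \text{s.t.}\quad x \in \Omega.$$

Given $\varepsilon \ge 0$, the $\varepsilon$-optimal solution set of $(L_{\lambda,\mu,c})$, denoted by $S^*(\lambda, \mu, c, \varepsilon)$, is defined as

$$\left\{x \in \Omega \ \middle|\ L(x, \lambda, \mu, c) \le \inf_{x \in \Omega} L(x, \lambda, \mu, c) + \varepsilon\right\}. \tag{2.2}$$

If $\Omega$ is closed and bounded, then a global optimal solution of $(L_{\lambda,\mu,c})$ exists. However, if $\Omega$ is unbounded, then $(L_{\lambda,\mu,c})$ may be unsolvable. To overcome this difficulty, we assume throughout this paper that $f$ is bounded from below on $\Omega$, that is,

$$f_* := \inf_{x \in \Omega} f(x) > -\infty. \tag{2.3}$$

This assumption is rather mild in optimization, because otherwise the objective function $f$ can be replaced by $e^{f(x)}$. It ensures that the $\varepsilon$-optimal solution set with $\varepsilon > 0$ is always nonempty, since $L(x, \lambda, \mu, c)$ is bounded from below by (2.1) and (2.3).

Recall that a vector $x^*$ is said to be a KKT point of $(P)$ if there exist $\lambda_i^* \ge 0$ for each $i = 1, \dots, m$ and $\mu_j^*$ for each $j = 1, \dots, l$ such that

$$0 \in \nabla f(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla g_i(x^*) + \sum_{j=1}^{l} \mu_j^* \nabla h_j(x^*) + N_\Omega(x^*), \qquad \lambda_i^* g_i(x^*) = 0,\ \forall i = 1, \dots, m, \tag{2.4}$$

where $N_\Omega(x^*)$ denotes the normal cone of $\Omega$ at $x^*$. The set of all $(\lambda^*, \mu^*)$ satisfying (2.4) is denoted by $\Lambda(x^*)$.

The multiplier algorithm based on the primal augmented Lagrangian $L$ is proposed below. One of its main features is that the Lagrangian multipliers associated with equality and inequality constraints are not required to be bounded, which makes the algorithm applicable to many problems in practice.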
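To make the definition concrete, here is a minimal Python sketch that evaluates the augmented Lagrangian, assuming the form $L = f + \frac{c}{2}\big[\sum_j (h_j + \mu_j/c)^2 + \sum_i \max\{0, g_i + \lambda_i/c\}^2\big]$ that is expanded in (2.19). The helper name `aug_lagrangian` and the one-dimensional toy data are hypothetical and serve only as an illustration.

```python
def aug_lagrangian(x, lam, mu, c, f, gs, hs):
    """Evaluate f(x) + c/2 * [sum_j (h_j(x) + mu_j/c)^2
                              + sum_i max(0, g_i(x) + lam_i/c)^2]."""
    eq_term = sum((h(x) + m / c) ** 2 for h, m in zip(hs, mu))
    ineq_term = sum(max(0.0, g(x) + l / c) ** 2 for g, l in zip(gs, lam))
    return f(x) + 0.5 * c * (eq_term + ineq_term)


# Hypothetical toy data: minimize x^2 subject to 1 - x <= 0 and x - 1 = 0.
f = lambda x: x * x
gs = [lambda x: 1.0 - x]   # inequality constraints g_i(x) <= 0
hs = [lambda x: x - 1.0]   # equality constraints h_j(x) = 0

value_at_0 = aug_lagrangian(0.0, [1.0], [-1.0], 2.0, f, gs, hs)  # infeasible point
value_at_1 = aug_lagrangian(1.0, [1.0], [-1.0], 2.0, f, gs, hs)  # feasible point
print(value_at_0, value_at_1)  # 4.5 1.5
```

At the feasible point $x = 1$ the value $1.5$ equals $f(1) + \frac{1}{2c}(\lambda^2 + \mu^2)$, and in both cases $L \ge f$, which is why a lower bound on $f$ gives a lower bound on $L$.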

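Algorithm 2.1 (Multiplier algorithm based on $L$).

Step 1. Select an initial point $x^0 \in \mathbb{R}^n$, $\lambda^0 \ge 0$, $\mu^0 \in \mathbb{R}^l$, $c_0 > 0$, and $\varepsilon_0 \ge 0$. Set $k := 0$.

Step 2. Compute

$$\lambda_i^{k+1} = \max\{0,\ c_k g_i(x^k) + \lambda_i^k\}, \quad \forall i = 1, \dots, m, \tag{2.5}$$

$$\mu_j^{k+1} = \mu_j^k + c_k h_j(x^k), \quad \forall j = 1, \dots, l, \tag{2.6}$$

$$\varepsilon_{k+1} = \frac{\varepsilon_k}{k+1}, \tag{2.7}$$

$$c_{k+1} \ge (k+1)\max\left\{1,\ \sum_{i=1}^{m} (\lambda_i^{k+1})^2,\ \sum_{j=1}^{l} (\mu_j^{k+1})^2\right\}. \tag{2.8}$$

Step 3. Find $x^{k+1} \in S^*(\lambda^{k+1}, \mu^{k+1}, c_{k+1}, \varepsilon_{k+1})$.

Step 4. If $x^{k+1} \in X$ and $(\lambda^{k+1}, \mu^{k+1}) \in \Lambda(x^{k+1})$, then STOP; otherwise, let $k := k+1$ and go back to Step 2.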
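The steps of Algorithm 2.1 can be sketched in Python for a one-dimensional toy problem. This is an illustration under stated assumptions, not the authors' MATLAB code: $\Omega$ is taken to be an interval `[lo, hi]`, $\varepsilon_k = 0$, the subproblem in Step 3 is solved by a ternary search (valid only when the subproblem is unimodal), equality is chosen in the penalty rule $c_{k+1} \ge (k+1)\max\{1, \sum_i (\lambda_i^{k+1})^2, \sum_j (\mu_j^{k+1})^2\}$, and the KKT test of Step 4 is replaced by a fixed iteration count. All function names are hypothetical.

```python
def aug_lagrangian(x, lam, mu, c, f, gs, hs):
    """f(x) + c/2 * [sum_j (h_j + mu_j/c)^2 + sum_i max(0, g_i + lam_i/c)^2]."""
    eq_term = sum((h(x) + m / c) ** 2 for h, m in zip(hs, mu))
    ineq_term = sum(max(0.0, g(x) + l / c) ** 2 for g, l in zip(gs, lam))
    return f(x) + 0.5 * c * (eq_term + ineq_term)


def argmin_unimodal(phi, lo, hi, iters=200):
    """Ternary search for the minimizer of a unimodal function on [lo, hi]."""
    for _ in range(iters):
        a = lo + (hi - lo) / 3.0
        b = hi - (hi - lo) / 3.0
        if phi(a) <= phi(b):
            hi = b
        else:
            lo = a
    return 0.5 * (lo + hi)


def multiplier_method(f, gs, hs, lo, hi, x0, c0=1.0, iters=25):
    """Sketch of the multiplier iteration on Omega = [lo, hi] with eps_k = 0."""
    x, c = x0, c0
    lam = [0.0] * len(gs)  # lambda^0 >= 0
    mu = [0.0] * len(hs)   # mu^0
    for k in range(iters):
        # Step 2: multiplier updates use the current x^k and c_k.
        lam = [max(0.0, c * g(x) + l) for g, l in zip(gs, lam)]
        mu = [m + c * h(x) for h, m in zip(hs, mu)]
        # Penalty rule with equality (an assumption; only ">=" is required).
        c = (k + 1) * max(1.0, sum(l * l for l in lam), sum(m * m for m in mu))
        # Step 3: minimize the augmented Lagrangian over Omega.
        x = argmin_unimodal(
            lambda t: aug_lagrangian(t, lam, mu, c, f, gs, hs), lo, hi)
    return x, lam, mu, c


# Toy problem: minimize t^2 subject to -t <= 0 and t - 1 = 0 on Omega = [-2, 2].
x, lam, mu, c = multiplier_method(
    lambda t: t * t, [lambda t: -t], [lambda t: t - 1.0], -2.0, 2.0, x0=0.0)
print(x, lam, mu, c)
```

For this toy problem the only feasible point is $t = 1$, with multipliers $\lambda^* = 0$ and $\mu^* = -2$, so the printed iterate and multipliers can be compared against those values. The penalty parameter grows roughly like $4(k+1)$ here because $(\mu^{k+1})^2$ approaches $4$.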
The iterative formula for $\varepsilon_{k+1}$ given in (2.7) is used only to guarantee its convergence to zero. In practical numerical experiments, we can instead choose $\varepsilon_{k+1} = \varepsilon_k / c_k$ to improve the convergence of the algorithm. The following lemma gives the relationship between the penalty parameter $c_k$ and the multipliers $\lambda^k$ and $\mu^k$.

Lemma 2.2. Let $\lambda^k$, $\mu^k$, $c_k$ be given as in Algorithm 2.1. Then the terms

$$\frac{\lambda_i^k}{c_k},\quad \frac{\mu_j^k}{c_k},\quad \frac{(\lambda_i^k)^2}{c_k},\quad \frac{(\mu_j^k)^2}{c_k} \tag{2.9}$$

all approach zero as $k \to \infty$.
Proof. This follows immediately from (2.8).
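One way to spell out this step, written here as a sketch: the penalty rule requires $c_{k+1} \ge (k+1)\max\{1, \sum_i (\lambda_i^{k+1})^2, \sum_j (\mu_j^{k+1})^2\}$, so in particular $c_{k+1} \ge k+1$ and $\sum_i (\lambda_i^{k+1})^2 \le c_{k+1}/(k+1)$. Hence, for each $i$,

$$\frac{(\lambda_i^{k+1})^2}{c_{k+1}} \le \frac{1}{c_{k+1}}\sum_{i=1}^{m} (\lambda_i^{k+1})^2 \le \frac{1}{k+1} \to 0,$$

$$\frac{|\lambda_i^{k+1}|}{c_{k+1}} = \sqrt{\frac{(\lambda_i^{k+1})^2}{c_{k+1}}}\cdot\frac{1}{\sqrt{c_{k+1}}} \le \frac{1}{\sqrt{k+1}}\cdot\frac{1}{\sqrt{k+1}} = \frac{1}{k+1} \to 0.$$

The same two estimates hold with $\mu_j^{k+1}$ in place of $\lambda_i^{k+1}$.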
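To establish the convergence property of Algorithm 2.1, we first consider the perturbation analysis of $(P)$. Given $\alpha \ge 0$, define the perturbation of the feasible region as

$$X(\alpha) = \{x \in \Omega \mid g_i(x) \le \alpha,\ |h_j(x)| \le \alpha,\ i = 1, \dots, m,\ j = 1, \dots, l\}, \tag{2.10}$$

and the perturbation of the level set as

$$L(\alpha) = \{x \in \Omega \mid f(x) \le v(0) + \alpha\}. \tag{2.11}$$

It is clear that $X(0)$ coincides with the feasible set of $(P)$. The corresponding perturbation function is given as

$$v(\alpha) = \inf\{f(x) \mid x \in X(\alpha)\}. \tag{2.12}$$

The following result shows that the perturbation function is upper semicontinuous at zero.

Lemma 2.3. The perturbation function $v$ is upper semicontinuous at zero from the right.

Proof. Since $X(0) \subseteq X(\alpha)$ for any $\alpha \ge 0$, we have $v(\alpha) \le v(0)$ by definition (2.12). This implies that $\limsup_{\alpha \to 0^+} v(\alpha) \le v(0)$.

Lemma 2.4. Let $\lambda^k$, $\mu^k$, $c_k$, $\varepsilon_k$ be given as in Algorithm 2.1. For any $\varepsilon > 0$, one has

$$S^*(\lambda^k, \mu^k, c_k, \varepsilon_k) \subseteq \{x \in \Omega \mid L(x, \lambda^k, \mu^k, c_k) \le v(0) + \varepsilon\} \tag{2.13}$$

whenever $k$ is sufficiently large.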
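Proof. For any given $\varepsilon > 0$, it follows from (2.7) and Lemma 2.2 that, when $k$ is large enough, we have

$$\frac{1}{2c_k}\sum_{i=1}^{m} (\lambda_i^k)^2 + \frac{1}{2c_k}\sum_{j=1}^{l} (\mu_j^k)^2 + \varepsilon_k \le \varepsilon. \tag{2.14}$$

Therefore, for $x \in S^*(\lambda^k, \mu^k, c_k, \varepsilon_k)$,

$$\begin{aligned}
L(x, \lambda^k, \mu^k, c_k) &\le \inf\{L(y, \lambda^k, \mu^k, c_k) \mid y \in \Omega\} + \varepsilon_k \\
&\le \inf\{L(y, \lambda^k, \mu^k, c_k) \mid y \in X(0)\} + \varepsilon_k \\
&\le \inf\{f(y) \mid y \in X(0)\} + \frac{1}{2c_k}\sum_{i=1}^{m} (\lambda_i^k)^2 + \frac{1}{2c_k}\sum_{j=1}^{l} (\mu_j^k)^2 + \varepsilon_k \\
&\le v(0) + \varepsilon,
\end{aligned} \tag{2.15}$$

where the third inequality uses $h_j(y) = 0$ and $g_i(y) \le 0$ for $y \in X(0)$, together with $\lambda_i^k \ge 0$.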
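Lemma 2.5. Let $\lambda^k$, $\mu^k$, $c_k$ be given as in Algorithm 2.1. For any $\varepsilon > 0$, one has

$$\{x \in \Omega \mid L(x, \lambda^k, \mu^k, c_k) \le v(0) + \varepsilon\} \subseteq X(\varepsilon) \tag{2.16}$$

whenever $k$ is sufficiently large.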
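Proof. We prove this result by contradiction. Suppose that we can find an $\varepsilon_0 > 0$ and an infinite subsequence $K \subseteq \{1, 2, \dots\}$ such that

$$z^k \in \{x \in \Omega \mid L(x, \lambda^k, \mu^k, c_k) \le v(0) + \varepsilon_0\}, \quad \forall k \in K, \tag{2.17}$$

but

$$z^k \notin X(\varepsilon_0), \quad \forall k \in K. \tag{2.18}$$

It follows from (2.17) that

$$v(0) + \varepsilon_0 \ge L(z^k, \lambda^k, \mu^k, c_k) = f(z^k) + \frac{c_k}{2}\left[\sum_{j=1}^{l}\left(h_j(z^k) + \frac{\mu_j^k}{c_k}\right)^2 + \sum_{i=1}^{m}\max\left\{0,\ g_i(z^k) + \frac{\lambda_i^k}{c_k}\right\}^2\right]. \tag{2.19}$$

Since $z^k \notin X(\varepsilon_0)$, we need to consider the following two cases.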
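Case 1. There exist an index $j_0$ and an infinite subsequence $K_0 \subseteq K$ such that $|h_{j_0}(z^k)| > \varepsilon_0$ for all $k \in K_0$. It then follows from (2.19) that

$$v(0) + \varepsilon_0 \ge f_* + \frac{c_k}{2}\left[\sum_{j=1}^{l}\left(h_j(z^k) + \frac{\mu_j^k}{c_k}\right)^2 + \sum_{i=1}^{m}\max\left\{0,\ g_i(z^k) + \frac{\lambda_i^k}{c_k}\right\}^2\right] \ge f_* + \frac{c_k}{2}\left(h_{j_0}(z^k) + \frac{\mu_{j_0}^k}{c_k}\right)^2. \tag{2.20}$$

Using Lemma 2.2 and the fact that $|h_{j_0}(z^k)| \ge \varepsilon_0$ gives us

$$\left(h_{j_0}(z^k) + \frac{\mu_{j_0}^k}{c_k}\right)^2 \ge \frac{1}{2}\varepsilon_0^2 \tag{2.21}$$

whenever $k$ is sufficiently large. This, together with (2.20), yields $v(0) = +\infty$ by letting $k \in K_0$ tend to $\infty$, which is a contradiction.

Case 2. There exist an index $i_0$ and an infinite subsequence $K_0 \subseteq K$ such that $g_{i_0}(z^k) > \varepsilon_0$ for all $k \in K_0$. It follows from (2.19) that

$$v(0) + \varepsilon_0 \ge f_* + \frac{c_k}{2}\left[\sum_{j=1}^{l}\left(h_j(z^k) + \frac{\mu_j^k}{c_k}\right)^2 + \sum_{i=1}^{m}\max\left\{0,\ g_i(z^k) + \frac{\lambda_i^k}{c_k}\right\}^2\right] \ge \cdots, \tag{2.22}$$

where the last step is due to Lemma 2.2, since $g_{i_0}(z^k) > \varepsilon_0$ and $\lambda_{i_0}^k / c_k \to 0$. Taking limits in the above inequality yields $v(0) = +\infty$, which is a contradiction. This completes the proof.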

2.24
The proof is complete.
With these preparations, the global convergence property of Algorithm 2.1 can be given: if the algorithm terminates in finitely many steps, then we obtain a KKT point of $(P)$; otherwise, every limit point of $\{x^k\}$ is an optimal solution of $(P)$.

Theorem 2.7. Let $\{x^k\}$ be the iterative sequence generated by Algorithm 2.1. If the algorithm terminates in finitely many steps, then one gets a KKT point of $(P)$; otherwise, every limit point of $\{x^k\}$ belongs to $X^*$.

Proof. According to the construction of Algorithm 2.1, the first part is clear. It remains to prove the second part. Let $\varepsilon > 0$ be given. It follows from Lemmas 2.4-2.6 that when $k$ is large enough, we have

2.25
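Thus,

$$x^k \in X(\varepsilon) \cap L(\varepsilon). \tag{2.26}$$

Note that $X(\varepsilon)$ and $L(\varepsilon)$ are closed, due to the continuity of $f$, of $g_i$ for all $i = 1, \dots, m$, and of $h_j$ for all $j = 1, \dots, l$, and the closedness of $\Omega$. Taking the limit in (2.26) yields $x^* \in X(\varepsilon) \cap L(\varepsilon)$, which further shows that $x^* \in X(0) \cap L(0)$, since $\varepsilon > 0$ is arbitrary; that is, $x^* \in X^*$. The proof is complete.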
The foregoing result applies when $\{x^k\}$ has at least one accumulation point. However, a natural question arises: how does the algorithm perform when $\{x^k\}$ is divergent? The following theorem gives an answer.

Theorem 2.8. Let $\{x^k\}$ be an iterative sequence generated by Algorithm 2.1. Then

$$\lim_{k \to \infty} f(x^k) = v(0) \tag{2.27}$$

if and only if $v(\alpha)$ is lower semicontinuous at $\alpha = 0$ from the right.
Proof. We first show the sufficiency. According to the proof of Theorem 2.7 (recall (2.26)), we know that whenever $k$ is sufficiently large. Since $v(\alpha)$ is lower semicontinuous at $\alpha = 0$ from the right, taking the lower limit in (2.28) yields

2.30
We now show the necessity. Suppose on the contrary that $v$ is not lower semicontinuous at zero from the right; then there exist $\delta_0 > 0$ and $\varepsilon_j \to 0$ as $j \to \infty$ such that

2.31
For any given $k$, since $\varepsilon_j \to 0$, we can choose a subsequence $j_k$ satisfying

2.32
In addition, let where the last step is due to the fact that $|h_j(z^k)| \le \varepsilon_{j_k}$ and $g_i(z^k) \le \varepsilon_{j_k}$, since $z^k \in X(\varepsilon_{j_k})$. Taking limits on both sides of (2.33) and using (2.7), (2.27), and Lemma 2.2, we get which leads to a contradiction. The proof is complete.
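The lower semicontinuity condition in Theorem 2.8 can fail even for smooth data. The following Python sketch uses a hypothetical one-dimensional example (not taken from the paper) with $\Omega = \mathbb{R}$, $f(x) = 1/(1+x^2)$, and a single inequality constraint $g(x) = x^2/(1+x^4) \le 0$. The only feasible point is $x = 0$, so $v(0) = 1$, while for every $\alpha > 0$ the perturbed set contains all sufficiently large $|x|$ and $v(\alpha) = 0$. The infimum is approximated on a finite grid over $[-1000, 1000]$, which is an assumption of the sketch; on such a truncated grid only moderately small $\alpha$ show the effect.

```python
def f(x):
    # Objective: bounded below by 0, tends to 0 as |x| grows.
    return 1.0 / (1.0 + x * x)


def g(x):
    # Constraint function: nonnegative, zero only at x = 0, tends to 0 as |x| grows.
    return x * x / (1.0 + x ** 4)


def v(alpha, grid):
    """Grid approximation of v(alpha) = inf{ f(x) : g(x) <= alpha }."""
    return min(f(x) for x in grid if g(x) <= alpha)


# Grid on [-1000, 1000] with step 0.1; it contains x = 0 exactly.
grid = [i / 10.0 for i in range(-10000, 10001)]

v0 = v(0.0, grid)
for alpha in (1e-2, 1e-4):
    print(alpha, v(alpha, grid))
print(v0)
```

Here $\liminf_{\alpha \to 0^+} v(\alpha) = 0 < 1 = v(0)$, so $v$ is not lower semicontinuous at zero from the right, and the perturbed problems are pulled toward points far from the feasible set.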
Note that in many practical cases, the set $\Omega$ typically stands for a simpler constraint, for example, a box or a bounded polytope [7]. Hence, we conclude this section by considering the case where $\Omega$ is a bounded, closed, and convex subset of $\mathbb{R}^n$. In this case, a global optimal solution of the augmented Lagrangian relaxation problem always exists. Hence, we choose $\varepsilon_0 = 0$ in Step 1 of Algorithm 2.1, which in turn implies that $\varepsilon_k = 0$ for all $k$ by (2.7). First, however, we need to extend the definition of degeneracy from inequality constraints as in [10] to both inequality and equality constraints.

Definition 2.9. A point $x^* \in X$ is said to be degenerate if there exist $\lambda^* \in \mathbb{R}^m$ and $\mu^* \in \mathbb{R}^l$ such that where $P_\Omega(x)$ denotes the projection of $x$ onto $\Omega$ and $I(x^*) = \{i \mid g_i(x^*) = 0,\ i = 1, \dots, m\}$.
Theorem 2.10. Suppose that $\Omega$ is a bounded, closed, and convex subset of $\mathbb{R}^n$. Let $\varepsilon_0 = 0$ and let $\{x^k\}$ be the iterative sequence generated by Algorithm 2.1. Then every accumulation point of $\{x^k\}$, say $x^*$, is either a degenerate point or a KKT point of $(P)$.
Proof. Since $\varepsilon_k = 0$ for all $k$ by $\varepsilon_0 = 0$ and (2.7), each $x^k$ is a global optimal solution of $L(x, \lambda^k, \mu^k, c_k)$ over $\Omega$ by Step 3 in Algorithm 2.1. Applying the well-known optimality condition to the augmented Lagrangian relaxation problem $(L_{\lambda,\mu,c})$ yields where $N_\Omega(x^k)$ is the normal cone of $\Omega$ at $x^k$. This together with (2.5) and (2.6) means that where we have used the basic property of the normal cone of a convex set. Let $K$ be an infinite subsequence of $\{1, 2, \dots\}$ such that $\{x^k\}_K \to x^* \in \Omega$. Consider now the following two cases.
Case 1. Either $\{\lambda^{k+1}\}_K$ or $\{\mu^{k+1}\}_K$ is unbounded. In this case, we must have

2.38
Since $0 \le \lambda_i^{k+1}/T_k \le 1$ and $0 \le |\mu_j^{k+1}|/T_k \le 1$ are bounded, we can assume, by passing to a subsequence if necessary, that

2.39
Clearly, $\lambda_i^*$ and $\mu_j^*$ are not all zero. On the other hand, since $N_\Omega(x^*)$ is a cone, it follows from (2.36) that from which, using the basic property of the normal cone of a convex set, we further have

2.41
Since $x^k \to x^*$ and $T_k \to \infty$ as $k \in K \to \infty$, we obtain from (2.39) and (2.41) where we have used the continuity of the projection operator. If $i \notin I(x^*)$, then $g_i(x^*) < 0$. Since $c_k \to \infty$, we have $c_k g_i(x^k) \to -\infty$ as $k \in K \to \infty$. Using (2.5) and Lemma 2.2, we obtain which, together with (2.39), implies that $\lambda_i^* = 0$ for all $i \notin I(x^*)$. Therefore, we obtain from (2.42) that

2.45
Taking limits in (2.37) gives rise to which is equivalent to

2.47
We claim that $x^*$ is a feasible point. In fact, if $g_i(x^*) > 0$ for some $i$, then Note that (2.6) can be rewritten as

2.48
Taking limits on both sides and using the boundedness of $\{\mu_j^{k+1}\}_{k \in K}$, we obtain that $h_j(x^*) = 0$ for all $j = 1, 2, \dots, l$. Thus, $x^*$ is a feasible solution of $(P)$ as claimed.

2.49
This together with (2.47) implies that $x^*$ is a KKT point of $(P)$ and $\lambda^*$, $\mu^*$ are the corresponding Lagrangian multipliers.

Numerical Reports
To give some insight into the behavior of the proposed algorithm, we solve the following nonlinear programming problems. The tests were run on a Pentium 4 PC with a 2.8 GHz CPU and 1.99 GB of memory, and the codes were written in MATLAB 7.0. Numerical results are reported in Tables 1-4, where $k$ is the number of iterations, $c_k$ is the penalty parameter, $x^k$ is the iterate found by the algorithm, and $f(x^k)$ is the objective value.

Conclusions
Augmented Lagrangian methods are useful tools for solving many practical nonconvex optimization problems. In this paper, new convergence properties of the proximal augmented Lagrangian algorithm are established without requiring the boundedness of multiplier sequences. It is proved that if the algorithm terminates in finitely many steps, then we obtain a KKT point of the primal problem; otherwise, every limit point of the iterative sequence $\{x^k\}$ generated by the algorithm is an optimal solution. For the case where $\{x^k\}$ is divergent, we also present a necessary and sufficient condition for the convergence of $\{f(x^k)\}$ to the optimal value. Moreover, under suitable assumptions, we show that every accumulation point of the iterative sequence generated by the algorithm is either a degenerate point or a KKT point of the primal problem. As future work, an interesting and important topic is whether these properties can be extended to more general cone programming, for example, nonlinear semidefinite programming or second-order cone programming.
Both $\{\lambda^{k+1}\}_K$ and $\{\mu^{k+1}\}_K$ are bounded. In this case, we can assume without loss of