Generalized Quadratic Augmented Lagrangian Methods with Nonmonotone Penalty Parameters

For nonconvex optimization problems with both equality and inequality constraints, we introduce a new augmented Lagrangian function and propose the corresponding multiplier algorithm. A new iterative strategy for the penalty parameter is presented. Different global convergence properties are established depending on whether the penalty parameter is bounded. Even if the iterative sequence {x^k} is divergent, we present a necessary and sufficient condition for the convergence of {f(x^k)} to the optimal value. Finally, preliminary numerical experience is reported.


Introduction
This paper is concerned with the following nonlinear programming problem:

    min  f(x)
    s.t. g_i(x) ≤ 0,  i = 1, ..., m,
         h_j(x) = 0,  j = 1, ..., l,                    (P)
         x ∈ Ω,

where f, g_i : R^n → R for i = 1, ..., m and h_j : R^n → R for j = 1, ..., l are all continuously differentiable functions, and Ω is a nonempty closed subset of R^n. Denote by X the feasible region and by X* the solution set.
The classical Lagrangian function for this problem is

    L_0(x, λ, μ) = f(x) + ∑_{i=1}^m λ_i g_i(x) + ∑_{j=1}^l μ_j h_j(x),

where (λ, μ) ∈ R^m × R^l. It is well known that the classical Lagrangian function may fail to find an optimal solution of the primal problem, because a nonzero duality gap may arise in nonconvex programming. Augmented Lagrangians were proposed to overcome this difficulty. They combine the advantages of Lagrangian duality methods and penalty function methods; for example, they do not require the penalty parameter to become infinitely large, and hence successfully avoid ill-conditioning effects. In fact, if the penalty parameter is very large, the Lagrangian relaxation subproblem becomes very difficult, in the sense that the stabilized Newton step is no longer a good candidate for a descent direction.
Global convergence properties of augmented Lagrangian methods have been studied by many researchers [1-8]. It should be noted that the multiplier sequence generated by the algorithm may be unbounded. Several strategies have been proposed to deal with this issue, such as adding constraint qualifications [5, 6] or using safeguarding projections [1, 2, 9, 10].
In this paper, for the optimization problem (P) with both equality and inequality constraints, we introduce a new class of augmented Lagrangian functions, which includes the well-known quadratic augmented Lagrangian function as a special case [11-13]. It should be noted that this function class is more general, since we do not restrict φ to be convex (see (2.1) below for the definition of φ). Convergence properties of the corresponding multiplier algorithm are studied. In particular, a new update strategy for the penalty parameter is proposed: the penalty parameter c_k is increased only when the multipliers are too large and the iterate is far away from the feasible region. This strategy guarantees that the penalty parameter remains unchanged, provided that the infeasibility measure of the iterate and the bounds on the multipliers make enough progress (see (2.4)). Furthermore, we study the behavior of the multiplier sequences generated by our algorithm. Finally, compared with [4, 9, 10], we further consider the case when {x^k} is divergent, for which a necessary and sufficient condition for {f(x^k)} to converge to the optimal value is given as well.
The organization of this paper is as follows. In the next section, we propose the multiplier algorithm and study its global convergence properties. Preliminary numerical results are reported in Section 3. Conclusions are drawn in Section 4.

Multiplier Algorithms
A new generalized quadratic augmented Lagrangian function L(x, λ, μ, c) for (P) is defined in (2.1). The function φ : R → R involved in (2.1) satisfies the following property:

(A1) φ is continuously differentiable and strictly increasing on R, with φ(0) = 0 and φ(α) ≥ α for α ≥ 0.
In particular, if φ(α) = α for all α ∈ R, then L reduces to the classical quadratic augmented Lagrangian function. Compared with [9, 14, 15], an important point made above is that φ is not required to be convex. Hence, the augmented Lagrangian introduced here is more general.
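To fix ideas, the special case φ(α) = α can be written out explicitly. Below is a minimal sketch of the classical quadratic (Hestenes-Powell-Rockafellar) augmented Lagrangian for problems with both constraint types; the function name and the callable-list interface are assumptions of this illustration, not notation from the paper:

```python
def quadratic_augmented_lagrangian(f, g, h, x, lam, mu, c):
    """Classical quadratic augmented Lagrangian (the phi(a) = a case):

        L(x, lam, mu, c) = f(x)
          + (1/(2c)) * sum_i [ max(0, lam_i + c*g_i(x))^2 - lam_i^2 ]
          + sum_j [ mu_j*h_j(x) + (c/2)*h_j(x)^2 ]

    f: objective; g: list of inequality constraints g_i(x) <= 0;
    h: list of equality constraints h_j(x) = 0;
    lam, mu: multiplier estimates; c > 0: penalty parameter.
    """
    val = f(x)
    # inequality part: smooth, and vanishes at feasible x with lam_i = 0
    for li, gi in zip(lam, g):
        val += (max(0.0, li + c * gi(x)) ** 2 - li ** 2) / (2.0 * c)
    # equality part: linear multiplier term plus quadratic penalty
    for mj, hj in zip(mu, h):
        val += mj * hj(x) + 0.5 * c * hj(x) ** 2
    return val
```

At a feasible point with zero multipliers the function reduces to f(x), which is the sanity check one usually runs first.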
Given (x, λ, μ, c), the Lagrangian relaxation subproblem associated with the augmented Lagrangian L is defined as

    min_{x ∈ Ω} L(x, λ, μ, c),

and its solution set is denoted by S*(λ, μ, c). We assume throughout this paper that S*(λ, μ, c) is nonempty; this is ensured, for instance, if Ω is compact, since f, g_i, and h_j are all continuous. In addition, we assume that f is bounded below on Ω. This is a rather mild assumption, because otherwise the objective function f can be replaced by e^{f(x)}. Recall that a vector x* is said to be a stationary point of (P) if it is feasible and there exist λ_i ∈ R for i = 1, ..., m and μ_j ∈ R for j = 1, ..., l such that the following system holds:

    0 ∈ ∇f(x*) + ∑_{i=1}^m λ_i ∇g_i(x*) + ∑_{j=1}^l μ_j ∇h_j(x*) + N_Ω(x*),   λ_i ≥ 0, i = 1, ..., m,    (2.2)

where N_Ω(x*) denotes the normal cone of Ω at x* [16, Chapter 6]. Let Λ(x*) denote the collection of multipliers satisfying (2.2). The set Λ(x*) is larger than the set of R-multipliers defined in [17], which reduces to the set of well-known Karush-Kuhn-Tucker (KKT) multipliers of (P) when Ω = R^n, since the complementarity condition is additionally required at a KKT point. The following is the multiplier algorithm based on the generalized quadratic augmented Lagrangian L. One of its main features is that the Lagrangian multipliers associated with the equality and inequality constraints are not restricted to be bounded.
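When Ω = R^n, the normal cone in (2.2) reduces to {0}, so stationarity can be checked by computing a residual norm. A minimal sketch follows; the helper name and the convention of passing gradients already evaluated at the candidate point are assumptions of this illustration:

```python
import math

def stationarity_residual(grad_f, grads_g, grads_h, lam, mu):
    """Residual of the stationarity system (2.2) in the case Omega = R^n:

        r = || grad f(x*) + sum_i lam_i grad g_i(x*)
                          + sum_j mu_j grad h_j(x*) ||.

    grad_f: gradient of f at x* (list of floats);
    grads_g, grads_h: lists of constraint gradients at x*;
    lam, mu: candidate multipliers (lam_i should be >= 0).
    A residual near zero means x* satisfies the stationarity inclusion.
    """
    n = len(grad_f)
    r = list(grad_f)
    for li, gg in zip(lam, grads_g):
        for t in range(n):
            r[t] += li * gg[t]
    for mj, gh in zip(mu, grads_h):
        for t in range(n):
            r[t] += mj * gh[t]
    return math.sqrt(sum(v * v for v in r))
```

For example, for min (x-2)^2 subject to x - 1 <= 0, the point x* = 1 with λ = 2 gives gradient -2 + 2·1 = 0, i.e., zero residual.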
Algorithm 2.1 (multiplier algorithm based on L).
Step 1 (initialization). Let M > 0 and let {ε_k}_{k=1}^∞ be an arbitrary positive sequence with zero as its limit (i.e., ε_k → 0). Select an initial point.

Step 2 (estimating multipliers). Compute

(2.3)
Step 3 (updating the penalty parameter). Let c_{k+1} be chosen according to (2.4)-(2.5).

Step 4 (solving the Lagrangian relaxation subproblem). Find x^{k+1} ∈ S*(λ^k, μ^k, c_k).

Step 5 (cycling). If the stopping criterion is not satisfied, set k := k + 1 and return to Step 2.

Consider the perturbation of (P); precisely, for any fixed α ≥ 0, define the perturbation value function

    v(α) = inf { f(x) : x ∈ Ω_α },  where  Ω_α = { x ∈ Ω : g_i(x) ≤ α, i = 1, ..., m; |h_j(x)| ≤ α, j = 1, ..., l }.    (2.6)

It is clear that Ω_0 = X. In a practical implementation of the algorithm, we always choose the constant M large enough to strictly include ∂v(0), the subdifferential of v at the origin, since according to strong duality theory the optimal Lagrangian multipliers belong to ∂v(0); see [18] for a detailed discussion. In [5, 6], the authors require the convergence of {λ^k/c_k} to zero. It follows from (2.5) that this nice property holds automatically under our new strategy. In addition, (2.4) and (2.5) indicate that the Lagrangian relaxation subproblem needs to place more emphasis on decreasing the constraint violations, by increasing the penalty parameter, whenever the current point x^k is far away from the feasible region.

Lemma 2.2. Let {x^k} be the iterative sequence generated by Algorithm 2.1, and let x* be one of its accumulation points. If {c_k} is bounded, then x* is a stationary point of (P).
Proof. If the algorithm terminates finitely, then the result follows immediately from the stopping criterion in Step 5. On the other hand, if condition (2.5) occurred infinitely often, then c_k ≥ k, and hence {c_k} would be unbounded. Therefore, by hypothesis, only condition (2.4) occurs when k is large enough. Because g_i(x^k) ≤ ε_k and |h_j(x^k)| ≤ ε_k by (2.4), taking the limit yields g_i(x*) ≤ 0 and h_j(x*) = 0, since ε_k converges to zero. This shows the feasibility of x*. Meanwhile, we know from (2.4) that {λ^k_i} and {μ^k_j} are bounded. Hence, we may assume without loss of generality that λ^k_i → λ*_i and μ^k_j → μ*_j. Since x^k is an optimal solution of minimizing L(x, λ^k, μ^k, c_k) over Ω by Step 4, the well-known optimality conditions give an inclusion which, together with formulas (2.3) and (2.4), yields the stationarity relation at x^k; taking limits, and using the upper semicontinuity of the normal cone [16, Chapter 6], gives (2.2) at x*. This completes the proof. Now, we turn to the case when {c_k} is unbounded.
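The iteration analyzed above can be sketched in code for the special case φ(α) = α. The exact tests (2.4)-(2.5) are not reproduced; the safeguard below is a simplified stand-in (keep c_k unchanged while the multipliers stay below M and the infeasibility keeps shrinking, otherwise increase c_k), and the subproblem solver, tolerances, and growth factor gamma are all assumptions of this sketch rather than the paper's exact rules:

```python
import numpy as np
from scipy.optimize import minimize

def multiplier_method(f, g, h, x0, lam0, mu0, c0=1.0, M=1e3,
                      gamma=10.0, max_iter=60, tol=1e-6):
    """Sketch of a multiplier loop in the spirit of Algorithm 2.1,
    with phi(alpha) = alpha (classical quadratic augmented Lagrangian)."""
    x = np.asarray(x0, dtype=float)
    lam = [float(v) for v in lam0]
    mu = [float(v) for v in mu0]
    c = c0
    infeas_prev = float("inf")
    for _ in range(max_iter):
        def L(y):  # augmented Lagrangian for the current (lam, mu, c)
            val = f(y)
            for li, gi in zip(lam, g):
                val += (max(0.0, li + c * gi(y)) ** 2 - li ** 2) / (2.0 * c)
            for mj, hj in zip(mu, h):
                val += mj * hj(y) + 0.5 * c * hj(y) ** 2
            return val
        # Step 4: solve the Lagrangian relaxation subproblem (BFGS here)
        x = minimize(L, x, method="BFGS", options={"gtol": 1e-10}).x
        # Step 2: first-order multiplier estimates (phi = identity case)
        lam = [max(0.0, li + c * gi(x)) for li, gi in zip(lam, g)]
        mu = [mj + c * hj(x) for mj, hj in zip(mu, h)]
        infeas = max([max(0.0, gi(x)) for gi in g]
                     + [abs(hj(x)) for hj in h] + [0.0])
        if infeas < tol:
            break
        # Step 3 (stand-in safeguard): enlarge c only if the multipliers
        # are too large or the infeasibility did not decrease enough
        too_big = max(lam + [abs(v) for v in mu] + [0.0]) > M
        if too_big or infeas > 0.5 * infeas_prev:
            c *= gamma
        infeas_prev = infeas
    return x, lam, mu, c
```

On min x_1^2 + x_2^2 subject to x_1 + x_2 = 1 the loop drives the iterates toward (0.5, 0.5) with the equality multiplier tending to -1, while c stays moderate, which is the behavior the safeguarded update is designed to produce.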
Theorem 2.3. Let {x^k} be the iterative sequence generated by Algorithm 2.1, and let x* be one of its accumulation points. If {c_k} is unbounded, then x* is an optimal solution of (P).
Proof. It is clear that if {c_k} is unbounded, then the update (2.5) occurs infinitely often. Hence, for simplicity, we assume that (2.5) always happens for k sufficiently large, and that x^k converges to x* (without passing to a subsequence). Taking (2.5) into account, the terms λ^k_i/c_k, μ^k_j/c_k, (λ^k_i)²/c_k, and (μ^k_j)²/c_k all converge to zero. This fact will be used in the following analysis.
Let us first show that x* is a feasible point of (P). We argue by contradiction. Note first that a chain of estimates holds in which the third and fourth steps are due to the feasibility of x ∈ X; this implies the lim sup bound (2.12). Now, let us consider the following two cases.
Case 1. There exists an index j_0 such that |h_{j_0}(x*)| > 0. Due to the continuity of h_{j_0}, we must have

(2.13)
Noticing that μ^k_{j_0}/c_k → 0, as mentioned above, and |h_{j_0}(x^k)| ≥ (1/2)|h_{j_0}(x*)|, we obtain a lower bound for k large enough which, together with (2.13), yields v(0) = +∞ as c_k approaches ∞; this contradicts (2.12).
Case 2. There exists an index i_0 such that g_{i_0}(x*) > 0. Similarly, due to the continuity of g_{i_0}, we know that g_{i_0}(x^k) ≥ (1/2) g_{i_0}(x*) > 0 when k is sufficiently large. Therefore, an analogous estimate holds, in which the second inequality comes from the fact that φ(a) ≥ a for all a ≥ 0 by (A1), and the last inequality follows from the nonnegativity of λ^k_i by (2.3), since φ′(α) ≥ 0 for all α ∈ R due to the monotonicity of φ. Invoking (2.5) and taking limits in this inequality again yields v(0) = +∞, which contradicts (2.12). So far, we have established the feasibility of x*.
It remains to show that x* is an optimal solution. Indeed, a further estimate holds, whose second inequality is due to (2.11); taking limits, and using the fact that (μ^k_j)²/c_k and (λ^k_i)²/c_k converge to zero (see (2.5)), gives f(x*) ≤ v(0).
This means that x* must be an optimal solution of (P). The proof is complete.
Finally, let us examine the behavior of the multiplier sequences {λ^k} and {μ^k} generated by the proposed algorithm.

Theorem 2.4. Let {x^k} be an iterative sequence generated by Algorithm 2.1, and let x* be one of its accumulation points. The following statements hold: in the first, λ*_i and μ*_j are accumulation points of {λ^k_i} and {μ^k_j}; in the second, λ*_i and μ*_j are accumulation points of {λ^k_i/T_k} and {μ^k_j/T_k}, with T_k given by

(2.21)
Proof. The proof is divided into the following two cases.
Case 1. Both {∑_{i=1}^m λ^k_i} and {∑_{j=1}^l |μ^k_j|} are bounded. In this case, x* is a stationary point of (P), by essentially the same argument as in Lemma 2.2.

Case 2. Either {∑_{i=1}^m λ^k_i} or {∑_{j=1}^l |μ^k_j|} is unbounded.
According to the update strategy for the penalty parameter c_k, we know that (2.5) must occur for all k sufficiently large; hence, {c_k} is unbounded in this case. Since x^k is a global optimal solution of min_{x ∈ Ω} L(x, λ^k, μ^k, c_k) by Step 4 of Algorithm 2.1, applying the optimality conditions to the augmented Lagrangian relaxation subproblem yields

(2.24)
Clearly, λ*_i and μ*_j are not all zero. Dividing both sides of (2.23) by T_{k+1} and using the fact that N_Ω(·) is a cone, we have

(2.25)
Taking limits in the above then establishes the result as desired.
Now a natural question arises: how does the algorithm perform if {x^k} is divergent? The following theorem gives an answer.

Theorem 2.5. Let {x^k} be an iterative sequence generated by Algorithm 2.1. If {c_k} is unbounded, then the following statements are equivalent:

(a) {f(x^k)} converges to the optimal value, that is, lim_{k→∞} f(x^k) = v(0);
(b) the perturbation function v is lower semicontinuous at zero from the right.

Proof. (a) ⇒ (b). Suppose, on the contrary, that v is not lower semicontinuous at zero from the right. Then there exist δ_0 > 0 and ξ_t → 0 (as t → ∞) such that

    v(ξ_t) ≤ v(0) − δ_0,   ∀ t.

(2.28)
For any given k, since ξ_t → 0, we can choose a subsequence {t_k} satisfying (2.29), which together with the continuity of φ implies that

(2.30)
In addition, pick z_k ∈ Ω_{ξ_{t_k}} with f(z_k) close to v(ξ_{t_k}); then the estimate (2.31) holds, whose last step is due to the facts |h_j(z_k)| ≤ ξ_{t_k} and g_i(z_k) ≤ ξ_{t_k} (since z_k ∈ Ω_{ξ_{t_k}}) and to φ being nondecreasing by (A1). Taking limits on both sides of (2.31) and using (2.27) and (2.30) leads to a contradiction.

(b) ⇒ (a). Since x^k converges to x*, it follows from Theorem 2.3 that x* is an optimal solution of (P). Hence, for any ε > 0, we have f(x^k) ≤ f(x*) + ε = v(0) + ε and x^k ∈ Ω_ε (since g_i(x^k) → g_i(x*) ≤ 0 < ε and |h_j(x^k)| → |h_j(x*)| = 0 < ε when k is sufficiently large). The latter further implies that v(ε) ≤ f(x^k) by definition (2.6); this is (2.33). Since v is lower semicontinuous at ε = 0 from the right by hypothesis, taking the lower limit in (2.33) yields (2.34). Invoking the fact that v(α) ≤ v(0), since Ω_0 ⊂ Ω_α for all α ≥ 0, we get lim sup_{α→0+} v(α) ≤ v(0); that is, v is upper semicontinuous at the origin from the right. This, together with the lower semicontinuity of v at the origin by hypothesis, yields the desired result.
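The perturbation function v(α) that drives Theorem 2.5 can be made concrete with a brute-force sketch; the finite candidate grid standing in for Ω is an assumption of this illustration (the paper works with the exact infimum):

```python
def perturbation_value(f, g, h, alpha, candidates):
    """Brute-force estimate of the perturbation function (2.6):

        v(alpha) = inf { f(x) : x in Omega_alpha },
        Omega_alpha = { x in Omega : g_i(x) <= alpha, |h_j(x)| <= alpha },

    evaluated over a finite list of candidate points drawn from Omega.
    Returns +inf if no candidate is alpha-feasible.
    """
    best = float("inf")
    for x in candidates:
        # keep only points in the relaxed feasible set Omega_alpha
        if all(gi(x) <= alpha for gi in g) and all(abs(hj(x)) <= alpha for hj in h):
            best = min(best, f(x))
    return best
```

Since Ω_0 ⊂ Ω_α for α ≥ 0, the sketch also exhibits the monotonicity v(α) ≤ v(0) used at the end of the proof.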

Preliminary Numerical Reports
To give some insight into the behavior of the algorithm proposed in this paper, we solve two problems, letting φ take several different functions. The computer codes were written in MATLAB 7.0, and the tests were carried out on a Pentium 4 PC with a 2.8 GHz CPU and 1.99 GB of memory. We apply our algorithm to the following programming problems, where Example 3.1 is obtained by adding an equality constraint to Example 6.3 of [19], and Example 3.2 comes from [20]. The corresponding numerical results are reported below, where k is the number of iterations, c_k is the penalty parameter, x^k is the iterate, and f(x^k) is the objective function value.
Example 3.1. Consider the following nonconvex programming problem.

Example 3.2. Consider the problem from [20].

In the implementation of our algorithm, we use the BFGS quasi-Newton method with a mixed quadratic and cubic line search procedure to solve the Lagrangian subproblem:

(i) for Example 3.1, the initial data are x^0 = (1, 1)^T, M = 1000, ε_k = (1/2)^k, λ^0 = 1, μ^0 = 1, and c_0 = 1; {c_k} remains unchanged because M is taken large enough to ensure the validity of (2.4); the obtained solution is a stationary point with corresponding multipliers λ* = 1.172 and μ* = 0, and the inequality constraint is active;