A New Augmented Lagrangian Method for Equality Constrained Optimization with Simple Unconstrained Subproblem

We propose a new method for equality constrained optimization based on the augmented Lagrangian method. We construct an unconstrained subproblem by adding an adaptive quadratic term to the quadratic model of the augmented Lagrangian function. In each iteration, we solve this unconstrained subproblem to obtain the trial step. The main feature of this work is that the subproblem is a convex unconstrained quadratic problem and is therefore easy to solve. Numerical results show that this method is effective.

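For (1), we define the Lagrangian function

$$L(x, \lambda) = f(x) - \lambda^T c(x) \tag{2}$$

and the augmented Lagrangian function

$$P(x, \lambda, \sigma) = L(x, \lambda) + \frac{\sigma}{2}\|c(x)\|^2, \tag{3}$$

where $\lambda$ is called the Lagrange multiplier and $\sigma$ is called the penalty parameter. In this paper, $\|\cdot\|$ refers to the Euclidean norm.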
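To make the two definitions concrete, the following minimal sketch evaluates the Lagrangian $f(x) - \lambda^T c(x)$ and the augmented Lagrangian (the Lagrangian plus $\frac{\sigma}{2}\|c(x)\|^2$) on a toy problem. The toy problem, the function names, and the sample values are assumptions made for illustration only; they do not come from the paper.

```python
def lagrangian(f, c, x, lam):
    """Lagrangian value f(x) - lam^T c(x)."""
    return f(x) - sum(l * ci for l, ci in zip(lam, c(x)))


def augmented_lagrangian(f, c, x, lam, sigma):
    """Lagrangian plus the quadratic penalty (sigma / 2) * ||c(x)||^2."""
    cx = c(x)
    penalty = 0.5 * sigma * sum(ci * ci for ci in cx)
    return lagrangian(f, c, x, lam) + penalty


# Hypothetical toy problem: minimize x1^2 + x2^2 subject to x1 + x2 - 1 = 0.
def f(x):
    return x[0] ** 2 + x[1] ** 2


def c(x):
    return [x[0] + x[1] - 1.0]


x = [1.0, 1.0]   # infeasible point: c(x) = 1
lam = [2.0]      # assumed multiplier estimate
sigma = 10.0     # assumed penalty parameter

L_val = lagrangian(f, c, x, lam)                    # 2 - 2 * 1
P_val = augmented_lagrangian(f, c, x, lam, sigma)   # L_val + 5 * 1
```

At this point the constraint violation is 1, so the penalty term contributes $\sigma/2 = 5$ on top of the Lagrangian value.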
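In a typical AL method, at the $k$th step, for a given multiplier $\lambda_k$ and penalty parameter $\sigma_k$, an unconstrained subproblem

$$\min_{x} P(x, \lambda_k, \sigma_k) \tag{4}$$

is solved to find the next iteration point. Then, the multiplier and penalty parameter are updated by some rules. For convenience, for given $\lambda_k$ and $\sigma_k$, we define

$$\Phi_k(x) = P(x, \lambda_k, \sigma_k) = L(x, \lambda_k) + \frac{\sigma_k}{2}\|c(x)\|^2. \tag{5}$$

Motivated by the regularized Newton method for unconstrained optimization (see [16][17][18][19]), we construct a new subproblem of (1). At the $k$th iteration point $x_k$, $\Phi_k(x_k + d)$ is approximated by the following quadratic model:

$$q_k(d) = L(x_k, \lambda_k) + (\nabla_x L(x_k, \lambda_k))^T d + \frac{1}{2} d^T B_k d + \frac{\sigma_k}{2}\|c_k + A_k d\|^2, \tag{6}$$

where $c_k = c(x_k)$, $A_k = A(x_k) = [\nabla c_1(x_k), \ldots, \nabla c_m(x_k)]^T$, and $B_k$ is a positive semidefinite approximation of $\nabla_{xx}^2 L(x_k, \lambda_k)$. Let

$$\tilde g_k = \nabla_x L(x_k, \lambda_k) + \sigma_k A_k^T c_k, \qquad \tilde B_k = B_k + \sigma_k A_k^T A_k; \tag{7}$$

thus we have

$$q_k(d) = \Phi_k(x_k) + \tilde g_k^T d + \frac{1}{2} d^T \tilde B_k d. \tag{8}$$

In [14,15], $q_k(d)$ is minimized within a trust region to find the next iteration point. Motivated by the regularized Newton method, we add a regularization term to the quadratic model $q_k(d)$ and consider

$$q_k(d) + \frac{1}{2}\mu_k \|d\|^2, \tag{9}$$

where $\mu_k$ is called the regularized parameter. At the $k$th step of our algorithm, we solve the following convex unconstrained quadratic subproblem:

$$\min_{d} \; q_k(d) + \frac{1}{2}\mu_k \|d\|^2 \tag{10}$$

for finding the trial step $d_k$. Then, we compute the ratio between the actual reduction $\mathrm{Ared}_k$ of $\Phi_k$ and the reduction $\mathrm{Pred}_k$ predicted by the quadratic model,

$$\rho_k = \frac{\mathrm{Ared}_k}{\mathrm{Pred}_k}. \tag{11}$$

When $\rho_k$ is close to 1, we accept $x_k + d_k$ as the next iteration point. At the same time, we regard the quadratic model $q_k(d)$ as a sufficiently "good" approximation of $\Phi_k(x_k + d)$ and reduce the value of $\mu_k$. Conversely, when $\rho_k$ is close to zero, we set $x_{k+1} = x_k$ and increase the value of $\mu_k$, which is intended to shorten the next trial step. This technique is similar to the update rule of the trust region radius. Indeed, a sufficiently large $\mu_k$ reduces the length of the trial step $d_k$.

However, the regularized parameter is different from the trust region radius. In [14,15], the authors construct a trust region subproblem

$$\min_{d} \; q_k(d) \quad \text{s.t. } \|d\| \le \Delta_k. \tag{12}$$

The exact solution $d_k$ of (12) satisfies the first-order critical conditions if there exists some $\nu_k \ge 0$ such that $\tilde B_k + \nu_k I$ is positive semidefinite and

$$(\tilde B_k + \nu_k I) d_k = -\tilde g_k, \tag{13}$$

$$\nu_k (\Delta_k - \|d_k\|) = 0, \tag{14}$$

while the first-order critical condition of (10) is

$$(\tilde B_k + \mu_k I) d_k = -\tilde g_k. \tag{15}$$

Equations (13) and (15) show the similarities and differences between the regularized subproblem (10) and the trust region subproblem (12). It seems that the parameter $\mu_k$ plays a role similar to the multiplier $\nu_k$ in the trust region subproblem. But, actually, the update rule of $\mu_k$ (see (26)) shows that $\mu_k$ is not an approximation of $\nu_k$. The update of $\mu_k$ depends on the quality of the last trial step $d_{k-1}$ and has no direct relation with system (13).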
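The accept/reject mechanism described above can be sketched as follows. This is a minimal illustration rather than the paper's update rule (26): the thresholds `eta1`, `eta2` and the factors `shrink`, `grow` are assumed values, the predicted reduction is assumed to be measured on the unregularized quadratic model, the regularization is assumed to enter as half the regularized parameter times the squared step length, and all helper names are hypothetical. Under those assumptions the trial step solves a linear system whose matrix is the model Hessian plus the regularized parameter times the identity.

```python
def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(M[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[r][j] -= factor * aug[col][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def regularized_step(B_tilde, g_tilde, mu):
    """Trial step d solving (B_tilde + mu * I) d = -g_tilde."""
    n = len(g_tilde)
    M = [[B_tilde[i][j] + (mu if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    return solve(M, [-gi for gi in g_tilde])


def predicted_reduction(B_tilde, g_tilde, d):
    """Model decrease q(0) - q(d) for q(d) = const + g^T d + 0.5 d^T B d."""
    n = len(d)
    gd = sum(g_tilde[i] * d[i] for i in range(n))
    dBd = sum(d[i] * B_tilde[i][j] * d[j] for i in range(n) for j in range(n))
    return -(gd + 0.5 * dBd)


def update_mu(rho, mu, eta1=0.25, eta2=0.75, shrink=0.5, grow=4.0):
    """Shrink mu after a good step, grow it after a poor one (assumed rule)."""
    if rho >= eta2:
        return mu * shrink
    if rho < eta1:
        return mu * grow
    return mu


B_tilde = [[2.0, 0.0], [0.0, 2.0]]
g_tilde = [2.0, -4.0]
d = regularized_step(B_tilde, g_tilde, mu=2.0)
pred = predicted_reduction(B_tilde, g_tilde, d)
mu_after_good_step = update_mu(rho=1.0, mu=2.0)
mu_after_poor_step = update_mu(rho=0.1, mu=2.0)
```

A larger `mu` shortens the step: with `mu=2.0` the system matrix is `4 * I`, so the step is half as long as the unregularized Newton step `-B_tilde^{-1} g_tilde`.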
To establish the global convergence of an algorithm, some kind of constraint qualification is required. There are many well-known constraint qualifications, such as LICQ, MFCQ, CRCQ, RCR, CPLD, and RCPLD. When there are only equality constraints, LICQ is equivalent to MFCQ, in which $\{\nabla c_i(x) \mid i = 1, \ldots, m\}$ has full rank; CRCQ is equivalent to CPLD, in which any subset of $\{\nabla c_i(x) \mid i = 1, \ldots, m\}$ maintains constant rank in a neighborhood of $x$; RCR is equivalent to RCPLD, in which $\{\nabla c_i(x) \mid i = 1, \ldots, m\}$ maintains constant rank in a neighborhood of $x$. RCPLD is weaker than CRCQ, and CRCQ is weaker than LICQ. In this paper, we use RCPLD, which is defined as follows.

Definition 1. One says that RCPLD holds at a feasible point $x^*$ of (1) if there exists a neighborhood $N(x^*)$ of $x^*$ such that $\{\nabla c_i(x) \mid i = 1, \ldots, m\}$ maintains constant rank for all $x \in N(x^*)$.
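As a small illustration of why a constant-rank condition is weaker than full rank, consider the following two-constraint example. It is an assumption made for exposition and does not appear in the paper.

```latex
% Hypothetical example in R^2 with two dependent equality constraints.
\[
  c_1(x) = x_1, \qquad c_2(x) = 2x_1, \qquad
  \nabla c_1(x) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad
  \nabla c_2(x) = \begin{pmatrix} 2 \\ 0 \end{pmatrix}.
\]
% The two gradients are linearly dependent at every x, so LICQ fails at
% every feasible point (the points with x_1 = 0).
\[
  \operatorname{rank}\,[\nabla c_1(x), \nabla c_2(x)] = 1
  \quad \text{for all } x \in \mathbb{R}^2 .
\]
% The rank is the same in every neighborhood of every feasible point,
% so the constant-rank condition of Definition 1 holds there.
```

In this example the feasible set is the line $x_1 = 0$, and the redundancy of the second constraint is exactly what the full-rank condition rules out but the constant-rank condition tolerates.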
The rest of this paper is organized as follows. In Section 2, we give a detailed description of the presented algorithm. The global convergence is proved in Section 3. In Section 4, we present the numerical experiments. Some conclusions are given in Section 5.

Algorithm
In this section, we give a detailed description of the proposed algorithm.
As mentioned in Section 1, we solve the unconstrained subproblem (10) to obtain the trial step $d_k$. Global convergence does not depend on the exact solution of (15), although the linear system (15) is easy to solve; the trial step is only required to satisfy condition (18). In Section 3, we always suppose that (18) holds.
In a typical AL algorithm, the update rule of $\sigma_k$ depends on the improvement of the constraint violation. A commonly used update rule is the following: if $\|c_{k+1}\| < \tau \|c_k\|$, where $0 < \tau < 1$, the constraint violation is considered to be sufficiently reduced and thus $\sigma_{k+1} = \sigma_k$ is a good choice. Otherwise, if $\|c_{k+1}\| \ge \tau \|c_k\|$, the current penalty parameter is considered unable to reduce the constraint violation sufficiently, and it is increased in the next iteration. In [20], Yuan proposed a different update rule of $\sigma_k$ for a trust region algorithm. Specifically, $\sigma_k$ is increased if condition (19) holds. Condition (19) involves an auxiliary parameter whose product with $\sigma_k$ tends to zero. We slightly modify (19) and increase $\sigma_k$ whenever the modified condition holds.
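The commonly used rule described above (keep the penalty parameter when the constraint violation shrinks by a fixed fraction, otherwise increase it) can be sketched as follows. The fraction `tau`, the growth `factor`, and the function names are assumed values chosen for illustration; this is the classical rule, not the modified rule adopted in the paper.

```python
import math


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(vi * vi for vi in v))


def update_penalty(sigma, c_old, c_new, tau=0.5, factor=10.0):
    """Classical AL penalty update.

    Keep sigma if ||c_new|| < tau * ||c_old||; otherwise multiply it by
    `factor`. Both tau and factor are illustrative assumptions.
    """
    if norm(c_new) < tau * norm(c_old):
        return sigma
    return factor * sigma


# Violation drops from 1.0 to 0.3: sufficient decrease, sigma is kept.
sigma_kept = update_penalty(10.0, [1.0, 0.0], [0.3, 0.0])
# Violation only drops from 1.0 to 0.8: sigma is increased.
sigma_grown = update_penalty(10.0, [1.0, 0.0], [0.8, 0.0])
```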
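In a typical AL method, the next iteration point $x_{k+1}$ is obtained by minimizing $P(x, \lambda_k, \sigma_k)$. In most AL methods, $x_{k+1}$ satisfies $\|\nabla_x P(x_{k+1}, \lambda_k, \sigma_k)\| < \epsilon_k$, where $\epsilon_k$ is a controlling parameter which tends to zero. As

$$\nabla_x P(x_{k+1}, \lambda_k, \sigma_k) = \nabla f(x_{k+1}) - A_{k+1}^T (\lambda_k - \sigma_k c_{k+1}),$$

when $\epsilon_k$ is sufficiently small, $\lambda_k - \sigma_k c_{k+1}$ is a good estimate of the next multiplier $\lambda_{k+1}$. As we obtain $x_{k+1}$ by minimizing the regularized quadratic model in (10), the critical point of this model has no direct relation to $\|\nabla_x P(x_{k+1}, \lambda_k, \sigma_k)\|$. Therefore, the update rule $\lambda_{k+1} = \lambda_k - \sigma_k c_{k+1}$ does not suit our algorithm. We obtain $\lambda_{k+1}$ by approximately solving a least squares problem.

Most AL algorithms require that $\{\lambda_k\}$ is bounded to ensure the global convergence. Hence, all components of $\lambda_k$ are restricted to a certain interval $[\lambda_{\min}, \lambda_{\max}]$. This technique is also used in our algorithm. Now, we give the detailed algorithm in the following.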
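The two ingredients mentioned above, the classical first-order multiplier estimate and the safeguard that keeps every multiplier component inside a fixed interval, can be sketched as follows. The function names, the interval bounds, and the sample numbers are assumptions for illustration. The first-order estimate is shown only for contrast, since the text explains that it does not suit the proposed algorithm.

```python
def first_order_estimate(lam, sigma, c_next):
    """Classical AL estimate: lam - sigma * c(x_next), componentwise."""
    return [l - sigma * ci for l, ci in zip(lam, c_next)]


def project_to_box(lam, lo, hi):
    """Clip every multiplier component to the interval [lo, hi]."""
    return [min(max(l, lo), hi) for l in lam]


# Assumed sample data: the second constraint is badly violated, so the raw
# estimate becomes very large in magnitude and the safeguard truncates it.
raw = first_order_estimate([1.0, -2.0], 10.0, [0.05, 200.0])
safe = project_to_box(raw, -1000.0, 1000.0)
```

The projection leaves well-scaled components untouched and only truncates those that leave the interval, which is what keeps the multiplier sequence bounded.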
Step 2 (determine the trial step). Evaluate the trial step $d_k$ by solving

$$\min_{d} \; q_k(d) + \frac{1}{2}\mu_k \|d\|^2$$

such that (18) holds. Compute the ratio between the actual reduction and the predicted reduction, $\rho_k = \mathrm{Ared}_k / \mathrm{Pred}_k$.

Step 3 (update the penalty parameter).

Remark 3. In practical calculation, it is not required to solve (30) exactly to find $\hat\lambda_{k+1}$. In our implementation of Algorithm 2, we use the Matlab subroutine minres to find an approximate solution of the linear system $A_k A_k^T \lambda = A_k g_k$ and take it as an approximation of $\hat\lambda_{k+1}$.
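Remark 3 describes solving a linear system approximately to estimate the multiplier. The sketch below assumes that this system has the normal-equations form $A A^T \lambda = A g$, with $A$ the $m \times n$ constraint Jacobian and $g$ the objective gradient; that form is an assumption, as are all names and data here. It also swaps in a plain conjugate gradient iteration in place of the minres routine used by the authors, purely to keep the example free of dependencies.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [dot(row, v) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def conjugate_gradient(apply_M, b, tol=1e-12, max_iter=100):
    """Approximately solve M x = b for symmetric positive definite M."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - M x with x = 0
    p = list(r)
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs <= tol * tol:
            break
        Mp = apply_M(p)
        alpha = rs / dot(p, Mp)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * mi for ri, mi in zip(r, Mp)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def least_squares_multiplier(A, g):
    """Approximate solution of A A^T lam = A g (assumed form of the system)."""
    At = transpose(A)
    rhs = matvec(A, g)
    return conjugate_gradient(lambda v: matvec(A, matvec(At, v)), rhs)


# Hypothetical data: constraint x1 + x2 - 1 = 0, objective x1^2 + x2^2,
# evaluated at the solution (0.5, 0.5), where the gradient is (1, 1).
A = [[1.0, 1.0]]
g = [1.0, 1.0]
lam = least_squares_multiplier(A, g)
```

At the solution of this toy problem the objective gradient equals the multiplier times the constraint gradient, so the recovered estimate is the exact multiplier.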

Global Convergence
In this section, we discuss the global convergence of Algorithm 2. We assume that Algorithm 2 generates an infinite sequence $\{x_k\}$ and give some assumptions in the following. Thus we can obtain (32).

Now, we discuss convergence properties in two cases. One is that the penalty parameter $\sigma_k$ tends to $\infty$ and the other is that $\{\sigma_k\}$ is bounded.

Proof. See Lemma 3.1 in Wang and Yuan [15].
In Lemma 5, if the limit of $\|c_k\|$ is positive, then any accumulation point of $\{x_k\}$ is infeasible. Sometimes (1) is naturally infeasible; in other words, the feasible set $\{x \mid c(x) = 0\}$ is empty. In this case, we wish to find a minimizer of the constraint violation. Specifically, we wish to solve

$$\min_{x} \|c(x)\|^2. \tag{34}$$

The solution of this problem is characterized by

$$A(x)^T c(x) = 0. \tag{35}$$

Here $\tilde g_k$ is defined in (7). With the help of Theorem 2 in Andreani et al. [21], (55) implies that $x^*$ is a KKT point or the RCPLD condition does not hold at $x^*$.
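For the infeasible case discussed above, a one-variable illustration may help. It is an assumption made for exposition, with $A(x)$ taken to be the constraint Jacobian, and it is not an example from the paper.

```latex
% Hypothetical infeasible system in one variable: c(x) = 0 has no solution.
\[
  c(x) = \begin{pmatrix} x \\ x - 1 \end{pmatrix}, \qquad
  A(x) = \begin{pmatrix} 1 \\ 1 \end{pmatrix}, \qquad
  \|c(x)\|^2 = x^2 + (x-1)^2 .
\]
% Stationarity of the constraint violation:
\[
  A(x)^T c(x) = x + (x - 1) = 2x - 1 = 0
  \quad \Longleftrightarrow \quad x = \tfrac{1}{2},
\]
% which is the minimizer of the violation, with minimum value
\[
  \|c(\tfrac{1}{2})\|^2 = \tfrac{1}{4} + \tfrac{1}{4} = \tfrac{1}{2} > 0 .
\]
```

The stationary point is infeasible, yet it is the best available point in the sense of least constraint violation, which is the kind of limit the analysis aims for when the feasible set is empty.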

The
holds for all sufficiently large $k$. By the definition of the index set, Equations (59)-(61) imply that $\{x_k\}$ is convergent. By Taylor's theorem, the convergence of $\{x_k\}$ and the boundedness of the model Hessians imply that the gradient of $\Phi_k$ along the trial step approaches $\tilde g_k$, so the quadratic model becomes increasingly accurate. Therefore, for all sufficiently large $k$, $\rho_k$ exceeds the threshold above which the regularized parameter is decreased, and hence $\mu_{k+1} < \mu_k$.
Lemma 10. Suppose that (A1)-(A2) hold and $\sigma_k = \sigma_0$ for all $k \ge 0$; then we have that

Proof. Firstly, we prove that the sum of $\mathrm{Ared}_k$ is bounded. Define the index set $K$ by means of the threshold sequence given in Steps 0 and 4 of Algorithm 2. From Step 4 of Algorithm 2, we know that if $k \notin K$, then $\|c_{k+1}\|$ exceeds this threshold and $\lambda_{k+1} = \lambda_k$. Hence we have a bound in which $\|\lambda_{\max}\|$ is an upper bound on $\{\|\lambda_k\|\}$. From Step 4 and (67), we have Then, we have

With the help of Lemmas 10 and 11, we can easily obtain the following result.
Note that, in Theorem 12, we do not suppose that RCPLD holds.

Numerical Experiment
In this section, we investigate the performance of Algorithm 2. We compare Algorithm 2 with the well-known Fortran package ALGENCAN. In our computer program, the parameters in Algorithm 2 are chosen as follows: (3) ‖  ‖ ≤ 10^{-8}. All test problems are chosen from the CUTEst collection [22].
The numerical results are listed in Table 1, which gives, for each problem, its name, the number of variables $n$, the number of constraints $m$, the number of function evaluations $n_f$, and the number of gradient evaluations $n_g$. In Table 1, we list the results of 38 test problems. In terms of the number of function evaluations ($n_f$), Algorithm 2 is better than ALGENCAN in 30 cases (78.9%). In terms of the number of gradient evaluations ($n_g$), Algorithm 2 is better than ALGENCAN in 31 cases (81.6%).

Conclusions
In this paper, we present a new algorithm for equality constrained optimization. We add an adaptive quadratic term to the quadratic model of the augmented Lagrangian function. In each iteration, we solve a simple unconstrained subproblem to obtain the trial step. The global convergence is established under reasonable assumptions.
From the numerical results and the theoretical analysis, we believe that the new algorithm can efficiently solve equality constrained optimization problems.
In the next theorem, we show that if $\{\|c_k\|\}$ is not convergent to zero, at least one of the accumulation points of $\{x_k\}$ satisfies (35).

As $\{x_k\}$ and $\{\lambda_k\}$ are bounded, we can deduce the boundedness of $\|\nabla_x L(x_k, \lambda_k)\|$ by (A2); that is, it is bounded above by some positive constant. By the update rule of $\sigma_k$ and the fact that $\sigma_k \to \infty$, the inequality on $\mathrm{Pred}_k$ that triggers an increase of $\sigma_k$ holds for all sufficiently large $k \in K_1$. If there exists an infinite subset $K_2 \subseteq K$ such that $\|\tilde g_k\| > 1$ holds for all $k \in K_2$, then by (48) we have that (40) holds for all sufficiently large $k$. By the boundedness of the quantities involved, we can conclude that there exists a positive constant such that (43) holds for infinitely many $k$. As $\|\tilde g_k\| > 1$ holds for all sufficiently large $k$ by (40), it is easy to see that (42) contradicts (43), since the product of the auxiliary parameter and $\sigma_k$ tends to zero and $\{\|c_k\|\}$ is convergent. Thus we can prove the desired result.

Lemma 7. Suppose that (A1)-(A2) hold, $\sigma_k \to \infty$, and $\|c_k\| \to 0$; then

As the product of the auxiliary parameter and $\sigma_k$ tends to zero, $\|c_k\| \to 0$, and $\|\tilde g_k\| > \epsilon$, (50) implies that

Theorem 8. Suppose that (A1)-(A2) hold. If $c_k \to 0$ and $\sigma_k \to \infty$, then there exists one cluster point $x^*$ of $\{x_k\}$ such that $x^*$ is a KKT point of (1) or the RCPLD condition does not hold at $x^*$.

Proof. Under the assumptions of this theorem, Lemma 7 implies that there exists an index set $K$ such that $\{x_k \mid k \in K\}$ converges to some $x^*$.

Case of $\{\sigma_k\}$ Being Bounded. In this subsection, without loss of generality, we assume that $\sigma_k = \sigma_0$ for all $k \ge 0$. Thus, by the update rule (29), the remaining penalty-related quantities also stay fixed.

We prove this lemma by contradiction. We will show that if $\sum_{k \in K} (\|\tilde g_k\| / \mu_k)$ is convergent, then $\mu_{k+1} < \mu_k$ holds for all sufficiently large $k$, which contradicts the fact that $1/\mu_k \to 0$ as $k \to \infty$.

Table 1: Results of Algorithm 2 and ALGENCAN.