A Nonmonotone Trust Region Algorithm Based on the Average of the Successive Penalty Function Values for Nonlinear Optimization

We present a nonmonotone trust region algorithm for nonlinear equality constrained optimization problems. The algorithm uses the average of successive penalty function values to adjust the ratio of the actual reduction to the predicted reduction. Unlike existing nonmonotone trust region methods, our method does not depend on the nonmonotone parameter. We establish the global convergence of the proposed algorithm and report numerical tests that show its efficiency.

Most traditional trust region methods are descent methods; that is, they accept a trial point as the next iterate only if its merit function value is strictly less than that of the current iterate. However, as pointed out by Toint [10], nonmonotone techniques help when the sequence of iterates follows the bottom of a curved narrow valley, a common occurrence in difficult nonlinear problems. Hence many nonmonotone algorithms have been proposed for unconstrained and constrained optimization problems [11-20]. Numerical tests show that the nonmonotone techniques perform better than their monotone counterparts.
Although the nonmonotone technique based on (2) works well in many cases, it has some drawbacks. First, a good function value generated at any iteration is essentially discarded because of the maximum in (2). Second, in some cases, the numerical performance depends heavily on the choice of the nonmonotone parameter (see, e.g., [16,21]). To overcome these drawbacks, Zhang and Hager [21] proposed another nonmonotone algorithm that replaces the maximum function value in (2) with an average of function values. Their numerical tests show that this nonmonotone line search algorithm uses fewer function and gradient evaluations, on average, than either the monotone or the traditional nonmonotone scheme.
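The two drawbacks can be illustrated with a small sketch. It assumes that the reference value in (2) is the usual windowed maximum of recent function values; the function name `max_reference`, the argument `window`, and the sample values are hypothetical and chosen only for illustration.

```python
def max_reference(values, window):
    """Windowed-maximum reference value at each iteration.

    Assumption: the reference at iteration k is the maximum of the
    last `window` + 1 function values, the classical nonmonotone rule.
    """
    refs = []
    for k in range(len(values)):
        start = max(0, k - window)
        refs.append(max(values[start:k + 1]))
    return refs


# Hypothetical merit values: iteration 2 produces a very good value (2.0).
history = [10.0, 9.0, 2.0, 8.0, 7.5]

refs_m1 = max_reference(history, window=1)
refs_m3 = max_reference(history, window=3)

# The good value 2.0 never becomes the reference for either window,
# and the two windows give different reference sequences.
print(refs_m1)
print(refs_m3)
```

In this example the value 2.0 is masked by larger neighbours in both runs, and changing the window length changes the acceptance threshold at several iterations, which is the parameter sensitivity described above.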

ISRN Operations Research
Recently, Mo and Zhang [16] extended Zhang and Hager's nonmonotone technique to unconstrained optimization within a trust region globalization scheme and discussed the global and local convergence of the resulting algorithm.
In this paper, we further extend the nonmonotone technique of [16,21] to equality constrained optimization. To design our algorithm, we first introduce some notation. Denote $g(x) = \nabla f(x)$ and $A(x) = (\nabla c_1(x), \nabla c_2(x), \ldots, \nabla c_m(x)) \in \mathbb{R}^{n \times m}$. Assuming that $A(x)$ has full column rank, we define the projection matrix $P(x)$ and the Lagrange function $L(x, \lambda(x))$, where $\lambda(x)$ is a projective version of the multiplier vector. For convenience, we denote the previous quantities at $x_k$ by $f_k$, $c_k$, $g_k$, $A_k$, $P_k$, and $\lambda_k$.

At each iteration, we calculate the trust region trial step in two stages (see [22]): a preliminary computation is followed by the solution of the trust region subproblem (8), where $B_k$ denotes the Hessian matrix of the Lagrange function $L(x, \lambda)$ and $\Delta_k > 0$ is the trust region radius. The trust region trial step $s_k$ is then formed from the solution of (8).

To test whether the point $x_k + s_k$ can be accepted as the next iterate, we use Fletcher's exact penalty function as the merit function, where $\sigma > 0$ is the penalty parameter.
To define our nonmonotone algorithm, we define the reference value $C_k$ by (12) and (13), where $\eta_{k-1} \in [\eta_{\min}, \eta_{\max}]$, $\eta_{\min} \in [0, 1]$, $\eta_{\max} \in [\eta_{\min}, 1)$, and $\eta_{\min}$, $\eta_{\max}$ are two chosen parameters. From (12) and (13), we observe that $C_k$ is a convex combination of the merit function values, so $C_k$ can be regarded as a weighted average of the merit function values.

The paper is organized as follows. We describe our algorithm in Section 2 and analyze its global convergence in Section 3. The numerical tests are given in Section 4, and the conclusion is presented in Section 5.
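The averaging idea can be sketched as follows. The code assumes the Zhang-Hager form of the recurrence, in which a weight counter is updated as q_next = eta * q + 1 and the average as c_next = (eta * q * c + new_value) / q_next, starting from q = 1 and c equal to the first value; whether (12) and (13) match this form exactly is an assumption. The function name and the sample values are hypothetical, and a fixed weight `eta` is used for simplicity.

```python
def weighted_average_sequence(values, eta):
    """Zhang-Hager style weighted averages of `values`.

    Assumed recurrence (fixed weight eta in [0, 1)):
        q_next = eta * q + 1
        c_next = (eta * q * c + new_value) / q_next
    with q = 1 and c = values[0] at the start.
    """
    c, q = values[0], 1.0
    averages = [c]
    for v in values[1:]:
        q_next = eta * q + 1.0
        c = (eta * q * c + v) / q_next
        q = q_next
        averages.append(c)
    return averages


# Hypothetical merit function values.
history = [10.0, 9.0, 2.0, 8.0, 7.5]

# eta = 0 reproduces the values themselves (the monotone case).
monotone = weighted_average_sequence(history, eta=0.0)

# A larger eta keeps more memory of earlier values.
averages = weighted_average_sequence(history, eta=0.75)
print(averages)
```

Because each average is a convex combination of the values seen so far, it always lies between their minimum and maximum, and a very good value lowers all later reference values instead of being discarded.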

Algorithm
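In this section, we give the details of the nonmonotone trust region algorithm. We first recall the definition of a stationary point of problem (1): a point is called a stationary point of problem (1) if it is feasible and satisfies the first-order optimality conditions.

We denote the actual reduction of the merit function from $x_k$ to $x_k + s_k$ by $\mathrm{Ared}_k$, the nonmonotone actual reduction by $\mathrm{NAred}_k$, and the predicted reduction by $\mathrm{Pred}_k$. Furthermore, we define the monotone ratio by

$$r_k = \frac{\mathrm{Ared}_k}{\mathrm{Pred}_k} \tag{19}$$

and the nonmonotone ratio by

$$Nr_k = \frac{\mathrm{NAred}_k}{\mathrm{Pred}_k}, \tag{20}$$

where the reference value $C_k$ entering $\mathrm{NAred}_k$ is computed by (12) and (13). The description of the algorithm is given as follows.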
Algorithm 1.
Step 2. Compute the trust region trial step $s_k$.
Step 3. ... and then set ...
Step 4. Compute $C_k$ by (12) and (13), and compute $Nr_k$ by (20).
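The role of the two ratios in accepting a trial step can be sketched as follows. The sketch assumes that the nonmonotone actual reduction is the averaged reference value minus the merit value at the trial point. The acceptance threshold `mu`, the factors `shrink` and `grow`, the threshold `good`, and the radius rule itself are hypothetical placeholders for a generic trust region update, not the rule of Algorithm 1.

```python
def nonmonotone_step(phi_k, c_k, phi_trial, pred, radius,
                     mu=0.1, shrink=0.5, grow=2.0, good=0.75):
    """Evaluate one trial step; return (accepted, new_radius, r, nr).

    phi_k     : merit value at the current iterate
    c_k       : averaged reference value (assumed to replace phi_k)
    phi_trial : merit value at the trial point
    pred      : predicted reduction, assumed positive
    """
    if pred <= 0.0:
        raise ValueError("predicted reduction must be positive")
    r = (phi_k - phi_trial) / pred    # monotone ratio, as in (19)
    nr = (c_k - phi_trial) / pred     # nonmonotone ratio, assumed form of (20)
    accepted = nr >= mu
    if not accepted:
        new_radius = shrink * radius  # hypothetical shrink rule
    elif nr >= good:
        new_radius = grow * radius    # hypothetical expansion rule
    else:
        new_radius = radius
    return accepted, new_radius, r, nr


# Hypothetical data: the trial point is slightly worse than the current
# iterate but still well below the averaged reference value.
accepted, new_radius, r, nr = nonmonotone_step(
    phi_k=4.0, c_k=5.0, phi_trial=4.2, pred=2.0, radius=1.0)
print(accepted, new_radius, r, nr)
```

In this example the monotone ratio is negative, so a monotone method would reject the step, while the nonmonotone ratio is positive and above the threshold, so the step is accepted.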

Global Convergence
In this section, we discuss the global convergence of Algorithm 1. The following assumptions are needed in our convergence analysis.

Assumptions
(A1) The sequences $\{x_k\}$ and $\{x_k + s_k\}$ are contained in a compact set $\Omega$.
We define two index sets as follows:

The following lemmas (Lemmas 2-5) are helpful in analyzing the convergence of Algorithm 1; their proofs are similar to those in [4].

Lemma 2. Assume that (A1)-(A3) hold. Then there exists a positive constant 1 such that

The following lemma shows the monotonicity property of the function sequence $\{C_k\}$.

Lemma 6. Suppose that $\{x_k\}$ is generated by Algorithm 1. Then the following inequality holds for all $k$:

Proof. We first prove that (29) holds for all ∈ ; that is, For ∈ I, according to Lemma 2 and Assumptions (A1) and (A2), we obtain According to (8)-(13), we have the following inequality:
We consider two cases.

Theorem 7. Suppose that Assumptions (A1)-(A3) hold and the sequence $\{x_k\}$ is generated by Algorithm 1. Then the algorithm is well defined.
Proof. Since the algorithm does not stop in Step 2, we have either ‖ ‖ ≠ 0 or ‖ ‖ ≠ 0. We prove the conclusion by contradiction: if the conclusion is not true, then by the algorithm we have +1 = , but

Case 1 (‖ ‖ ≠ 0). From Lemmas 2 and 4, we have which means that > for large enough. According to Lemma 6, we have NAred , = ( + ) − ( + ) > Ared , so Nr ≥ > , which contradicts (43).

Case 2 (‖ ‖ = 0). In this case, we have ‖V ‖ = 0 and ‖ ‖ ≠ 0. By Lemma 3, we have Combining with Lemma 4, we have Then, as in Case 1, we get a contradiction. Combining Cases 1 and 2, we obtain the conclusion.
Similar to Lemma 7.11 in [4], we obtain the following property of the penalty parameter.
Without loss of generality, we assume that $\sigma_k = \sigma^\star$ for all $k$. The following theorem gives the convergence property of the constraint sequence $\{\|c_k\|\}$.
Proof. First, we prove that

Assume by contradiction that (48) does not hold. Then there exists a constant > 0 such that ‖ ‖ ≥ for all $k$. According to Lemma 6, we have By using (13), we can prove that Adding all the previous inequalities and using Lemma 2, we have By Assumption (A1), we know that 1 − +1 is bounded; letting $k \to \infty$, we have Since ‖ ‖ ≥ for all $k$, we have $\lim_{k \to \infty} \Delta_k = 0$. But, similar to the proof of Theorem 7, we get Nr > , and therefore $\Delta_{k+1} > \Delta_k$, which contradicts $\lim_{k \to \infty} \Delta_k = 0$. This contradiction shows that (48) holds.

Next we prove (47). Assume that (47) does not hold. Then there exist a subsequence { } and a positive constant 1 such that On the other hand, according to (48), there exists another subsequence { } such that for 2 = 1 /2, we have We define K = { | ≤ ≤ }. According to Lemma 2, we get the following inequality: By Assumption (A1), is bounded, so min{ 2 /2, Δ } = 2 /2 can hold only a finite number of times. Thus there exists 1 > 0 such that for > 1 , we have min{ 2 /2, Δ } = Δ . Hence for > 1 , we have Then we know that Now, for large , Since ( ) is continuous, for large enough we have ‖ − ‖ < 2 , which contradicts the assumption ‖ ‖ ≥ 2 2 . Hence (47) holds.

Proof. Similar to the proof of Theorem 4 in [18].
Based on Theorems 9 and 10, we get the following global convergence result.

Numerical Tests
In this section, we test our algorithm on some typical problems. The program code was written in MATLAB and run in the MATLAB 7.1 environment. The parameters in our algorithm are taken as follows: $\Delta_0 = 0.1$, 0 = 1, = 0.1, 1 = 0.2, 2 = 0.8, 3 = 1.2, ≡ 0.75, and 0 = , and is updated by the BFGS formula as follows: where

We declare convergence and stop the algorithm when ‖ ‖ + ‖ ‖ ≤ $10^{-5}$. We also stop when 500 iterations have been completed without convergence; such runs are denoted by fail. Our test problems are chosen from [23] and are numbered in the same way as in [23]. For example, HS28 is problem 28 in [23]. To test the efficiency of our algorithm, we compare it with the algorithms in [15,18], for which we set the nonmonotone parameter to 5.
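For readers who wish to reproduce the setup, the matrix update can be sketched as follows. This is the standard BFGS update of a Hessian approximation and may differ in detail from the variant used in the experiments. The names `bfgs_update`, `s` (step), and `y` (difference of gradients, here presumably of the Lagrange function, which is an assumption) are hypothetical, and the rule that skips the update when the curvature is not positive is a common safeguard rather than part of the text.

```python
def bfgs_update(B, s, y, eps=1e-12):
    """Standard BFGS update of a Hessian approximation B (list of lists).

    B_new = B - (B s)(B s)^T / (s^T B s) + y y^T / (s^T y)

    The update is skipped when s^T y or s^T B s is not safely positive,
    a common safeguard (an assumption, not taken from the text).
    """
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    sBs = sum(s[i] * Bs[i] for i in range(n))
    sy = sum(s[i] * y[i] for i in range(n))
    if sy <= eps or sBs <= eps:
        return [row[:] for row in B]
    return [[B[i][j] - Bs[i] * Bs[j] / sBs + y[i] * y[j] / sy
             for j in range(n)] for i in range(n)]


# Hypothetical data: start from the identity matrix.
B0 = [[1.0, 0.0], [0.0, 1.0]]
s = [1.0, 0.0]
y = [2.0, 0.0]
B1 = bfgs_update(B0, s, y)

# The updated matrix satisfies the secant equation B1 s = y.
B1s = [sum(B1[i][j] * s[j] for j in range(2)) for i in range(2)]
print(B1, B1s)
```

Skipping the update when the curvature condition fails keeps the approximation positive definite, at the cost of reusing older curvature information for that iteration.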
The test results are given in Table 1. Here No. denotes the number of the test problem, two further columns report the numbers of gradient evaluations and function evaluations, and Time denotes the CPU time at termination.
From Table 1, we see that our algorithm spends more CPU time than the algorithms in [15,18] but uses fewer function and gradient evaluations on most of the test problems. These numerical tests show that our algorithm works quite well.

Conclusion
In this paper, we presented a nonmonotone trust region method based on the weighted average of successive penalty function values for equality constrained optimization. Compared with existing nonmonotone trust region methods for constrained optimization, our method is independent of the nonmonotone parameter. The numerical comparison with some nonmonotone trust region methods shows the efficiency of the proposed method. How to obtain fast local convergence of our method deserves further study, and we leave it as future work.

Table 1: Test results for our method and the methods in [15,18].

No.
Our method The method in [18]