A Nonmonotone Weighting Self-Adaptive Trust Region Algorithm for Unconstrained Nonconvex Optimization

A new trust region method is presented, which combines a nonmonotone line search technique, a self-adaptive update rule for the trust region radius, and a weighting technique for the ratio between the actual reduction and the predicted reduction. Under reasonable assumptions, the global convergence of the method is established for unconstrained nonconvex optimization. Numerical results show that the new method is efficient and robust for solving unconstrained optimization problems.


Introduction
Consider the following unconstrained optimization problem:
$$\min_{x \in \mathbb{R}^n} f(x), \quad (1)$$
where $f: \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable. Trust region methods calculate a trial step $d_k$ at each iteration by solving the subproblem
$$\min_{\|d\| \le \Delta_k} q_k(d) = f(x_k) + g_k^T d + \frac{1}{2} d^T B_k d, \quad (2)$$
where $g_k = \nabla f(x_k)$, $B_k$ is a symmetric matrix approximating the Hessian of $f(x)$ at $x_k$, and $\Delta_k > 0$ is the trust region radius at $x_k$. Throughout this paper, $\|\cdot\|$ denotes the Euclidean norm on $\mathbb{R}^n$. Define the ratio
$$r_k = \frac{f(x_k) - f(x_k + d_k)}{q_k(0) - q_k(d_k)}, \quad (3)$$
whose numerator and denominator are called the actual reduction and the predicted reduction, respectively.

The nonmonotone line search technique was first proposed by Grippo et al. [1] in a line search framework for Newton's method. At each iteration, the selected function value is taken as
$$f(x_{l(k)}) = \max_{0 \le j \le m(k)} f(x_{k-j}), \quad (4)$$
where $m(0) = 0$, $0 \le m(k) \le \min\{m(k-1)+1, M\}$, and $M$ is a positive integer. Although many algorithms based on (4) work well in many cases, a "good" function value generated during the iteration process may not be selected because of the max function in (4), and the numerical behavior is sometimes sensitive to the choice of $M$. To overcome these shortcomings, Zhang and Hager [2] proposed a new nonmonotone line search technique in which the max in (4) is replaced by a weighted average $C_k$ of past function values,
$$C_k = \frac{\eta_{k-1} Q_{k-1} C_{k-1} + f(x_k)}{Q_k}, \quad (5)$$
where $Q_k = \eta_{k-1} Q_{k-1} + 1$, $Q_0 = 1$, $C_0 = f(x_0)$, and $\eta_{k-1} \in [0, 1]$. Numerical tests have shown that the new nonmonotone algorithm is more effective.

Many researchers have proposed trust region methods that solve unconstrained optimization problems effectively by modifying the ratio $r_k$ and the rule for updating the trust region radius. Dai and Xu [3] proposed a weighting formula that replaces $r_k$ by a weighted combination $\hat{r}_k$ of the ratios over the most recent $m$ iterations, where $m$ is some positive integer. Many self-adaptive adjustment strategies have been developed to update the trust region radius, such as [4][5][6][7][8][9][10][11][12][13][14]. In addition, many adaptive nonmonotone trust region methods have been proposed in the literature [15][16][17][18][19][20][21].
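As an illustration, the Zhang-Hager recurrence for $C_k$ and $Q_k$ takes only a few lines of code; the following Python sketch uses variable names of our own choosing, not the paper's:

```python
# A minimal sketch of the Zhang-Hager reference value update:
# Q_k = eta*Q_{k-1} + 1 and C_k = (eta*Q_{k-1}*C_{k-1} + f(x_k)) / Q_k,
# with Q_0 = 1 and C_0 = f(x_0).

def update_reference(C_prev, Q_prev, f_new, eta):
    """Advance the weighted average C_k of past function values."""
    Q_new = eta * Q_prev + 1.0
    C_new = (eta * Q_prev * C_prev + f_new) / Q_new
    return C_new, Q_new

# eta = 0 recovers the monotone case (C_k = f(x_k)); eta = 1 makes C_k
# the arithmetic mean of f(x_0), ..., f(x_k).
f_vals = [10.0, 6.0, 8.0, 5.0]
C, Q = f_vals[0], 1.0
for f in f_vals[1:]:
    C, Q = update_reference(C, Q, f, eta=1.0)
# C is now the mean of f_vals, i.e. 7.25
```

The single scalar pair $(C_k, Q_k)$ replaces the window of past function values needed by the max rule (4), which is what makes the technique cheap to carry inside a trust region loop.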
In this paper, we propose a new self-adaptive weighting trust region method that combines the nonmonotone technique of Zhang and Hager [2], the weighting technique of Dai and Xu [3], and the L-function in [6].
The rest of the paper is organized as follows. In Section 2, we define the L-function to introduce a new update rule, and a new nonmonotone self-adaptive trust region algorithm is presented. In Section 3, the convergence properties of the proposed algorithm are investigated. In Section 4, numerical results are given. In Section 5, conclusions are summarized.

L-Function and the New Nonmonotone Self-Adaptive Trust Region Algorithm
To obtain the new trust region radius update rules, we recall the L-function $L(t)$, $t \in \mathbb{R}$.
Definition 1 (see [6]). A function $L(t)$ is called an L-function if it satisfies the following properties.

Now we describe the new nonmonotone self-adaptive trust region algorithm.
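To illustrate how the pieces above fit together, the following Python sketch assembles a generic nonmonotone self-adaptive trust region iteration. The Cauchy-point step, the acceptance threshold, and the stand-in L-function below are placeholder choices of ours, not the exact rules of Algorithm 2:

```python
import numpy as np

def cauchy_step(g, B, Delta):
    """Minimize the quadratic model along -g within the ball ||d|| <= Delta."""
    gBg = g @ B @ g
    gn = np.linalg.norm(g)
    tau = 1.0 if gBg <= 0 else min(gn**3 / (Delta * gBg), 1.0)
    return -tau * (Delta / gn) * g

def L_func(r):
    """Stand-in L-function: shrink the radius for poor ratios, expand for good ones."""
    return 0.25 if r < 0.25 else (2.5 if r > 0.75 else 1.0)

def nonmonotone_tr(f, grad, hess, x0, eta=0.85, Delta=1.0, tol=1e-8, itmax=500):
    x = x0.astype(float)
    C, Q = f(x), 1.0                       # Zhang-Hager reference value
    for _ in range(itmax):
        g = grad(x)
        if np.linalg.norm(g) <= tol:
            break
        B = hess(x)
        d = cauchy_step(g, B, Delta)
        pred = -(g @ d + 0.5 * d @ B @ d)  # predicted reduction
        r = (C - f(x + d)) / pred          # nonmonotone ratio against C_k
        if r >= 0.1:                       # accept the trial step
            x = x + d
            Q_new = eta * Q + 1.0
            C = (eta * Q * C + f(x)) / Q_new
            Q = Q_new
        Delta = min(max(L_func(r) * Delta, 1e-12), 1e6)  # self-adaptive radius
    return x
```

On a convex quadratic this sketch reduces to steepest descent with an adaptively bounded step, and it converges to the minimizer; the nonmonotone ratio, measured against $C_k$ rather than $f(x_k)$, makes acceptance easier when the recent history is good.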

Convergence of Algorithm 2
In this section, we consider the convergence properties of Algorithm 2. We begin with the following assumption.
Proof. From Taylor's theorem, we have
$$f(x_k + d_k) = f(x_k) + g_k^T d_k + o(\|d_k\|),$$
and it follows from the definition of $q_k$ that
$$|f(x_k) - f(x_k + d_k) - (q_k(0) - q_k(d_k))| \le o(\|d_k\|) + \frac{1}{2} \|B_k\| \|d_k\|^2,$$
where $o(\|d_k\|)$ decreases arbitrarily as $\|d_k\|$ decreases.
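This estimate can be checked numerically: when $B_k$ is bounded, the gap between the actual and the predicted reduction shrinks like $\|d_k\|^2$ as the step shrinks. A small Python sketch, using the Rosenbrock function and $B_k = I$ as assumed examples (neither is taken from the paper):

```python
import numpy as np

def f(x):  # assumed test function (Rosenbrock), not from the paper
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

def grad(x):
    return np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                     200 * (x[1] - x[0]**2)])

x = np.array([0.5, 0.5])
g = grad(x)
B = np.eye(2)                          # any bounded symmetric model Hessian
gaps = []
for t in [1e-1, 1e-2, 1e-3]:
    d = -t * g / np.linalg.norm(g)     # trial step of length t
    ared = f(x) - f(x + d)             # actual reduction
    pred = -(g @ d + 0.5 * d @ B @ d)  # predicted reduction
    gaps.append(abs(ared - pred))
# each tenfold shrink of ||d|| shrinks |ared - pred| roughly a hundredfold
```

The quadratic decay of the gap is what forces the ratio $r_k$ toward 1 for small radii, which is the mechanism behind Lemma 6 below.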
Lemma 5. Assume that the sequence $\{x_k\}$ is generated by Algorithm 2. Then $\{x_k\} \subseteq \Omega$.
The next lemma shows that the loop through Step 3 to Step 5 cannot cycle infinitely and that the sequence $\{x_k\}$ is well defined.

Lemma 6. Suppose that Assumption 3 holds. Assume also that $g_k \ne 0$ and that there exists a sufficiently small constant $\bar{\Delta} > 0$ satisfying (15). Then, whenever $\Delta_k$ falls below $\bar{\Delta}$, we have $\Delta_{k+1} \ge \Delta_k$.
Proof. By Assumption 3 and $g_k \ne 0$, there exist $\varepsilon > 0$ and a positive index $k_0$ such that $\|g_k\| \ge \varepsilon$ for all $k \ge k_0$. Combining (11) with (15), we can choose $\bar{\Delta}$ sufficiently small such that the required bound holds for all sufficiently large $k \ge k_0$. For the above $\varepsilon$, it follows from (6) and (22) that, for sufficiently large $k \ge k_0 + m$, the weighted ratio satisfies $2 - \rho_2 \ge \hat{r}_k \ge \rho_2$ by (23). By Algorithm 2 and the definition of the L-function, we then have $\Delta_{k+1} \ge \Delta_k$ whenever $\Delta_k$ falls below $\bar{\Delta}$.
We will show the global convergence of Algorithm 2.
Theorem 7. Suppose that Assumption 3 holds. Let the sequence $\{x_k\}$ be generated by Algorithm 2. Then
$$\liminf_{k \to \infty} \|g_k\| = 0.$$

Proof. To derive a contradiction, suppose that there exists a positive constant $\varepsilon > 0$ such that $\|g_k\| \ge \varepsilon > 0$ for all $k$.
For convenience, we denote an index set $K = \{k : \hat{r}_k \ge \rho_2\}$. First, assume that the set $K$ has infinitely many elements; that is, for any $k \in K$, $\hat{r}_k \ge \rho_2$ holds. For any $k \in K$, using Algorithm 2 and (12), we obtain (27). Thus, (28) follows from (27), and from (5) and (28) we obtain (29). From Lemma 3.1 in [22] and Assumption 3(i), we know that the sequence $\{C_k\}$ is nonincreasing and convergent. Then (30) holds, which contradicts (16). Next, assume that the set $K$ has finitely many elements. Then, for sufficiently large $k$, we have $\hat{r}_k < \rho_2$. From the definition of the L-function and Steps 5 and 6 in Algorithm 2, the trust region radius $\Delta_k$ decreases as the iteration proceeds. Furthermore, the limit $\lim_{k \to \infty} \Delta_k = 0$ holds, which contradicts (16). The proof is completed.

Numerical Experiments
In this section, we present preliminary numerical results to illustrate the performance of Algorithm 2, denoted by NTRW.
In all algorithms in this paper, the matrix $B_k$ is updated by the BFGS formula [26, 27]. For small-scale problems, the trial step $d_k$ is computed by the trust.m file in the Optimization Toolbox of Matlab; for middle-scale and large-scale problems, it is computed by the CG-Steihaug algorithm in [26]. The iteration is terminated when $\|g_k\| \le \epsilon$, where $\epsilon = 10^{-5}$. In Tables 1, 2, 3, and 4, we give the dimension (Dim) of each test problem (P), the number of iterations, the number of function evaluations, and the CPU time (cpu) for solving each test problem.
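As an illustration, a generic CG-Steihaug step in the style of [26] can be sketched in Python as follows; this is our own sketch, not the Matlab code actually used in the experiments:

```python
import numpy as np

def boundary_tau(d, p, Delta):
    """Positive root tau of ||d + tau*p|| = Delta."""
    a, b, c = p @ p, 2.0 * (d @ p), d @ d - Delta**2
    return (-b + np.sqrt(b * b - 4.0 * a * c)) / (2.0 * a)

def cg_steihaug(g, B, Delta, tol=1e-10, itmax=100):
    """Approximately solve min g^T d + 0.5 d^T B d  s.t. ||d|| <= Delta
    by truncated conjugate gradients."""
    d = np.zeros_like(g)
    r = g.copy()                     # residual of the model gradient
    p = -r
    if np.linalg.norm(r) < tol:
        return d
    for _ in range(itmax):
        Bp = B @ p
        pBp = p @ Bp
        if pBp <= 0.0:               # negative curvature: stop on the boundary
            return d + boundary_tau(d, p, Delta) * p
        alpha = (r @ r) / pBp
        if np.linalg.norm(d + alpha * p) >= Delta:
            return d + boundary_tau(d, p, Delta) * p
        d = d + alpha * p
        r_new = r + alpha * Bp
        if np.linalg.norm(r_new) < tol:
            return d
        p = -r_new + ((r_new @ r_new) / (r @ r)) * p
        r = r_new
    return d
```

For a positive definite $B$ and a large radius the method returns the Newton step $-B^{-1}g$; when the iterates hit the boundary or detect negative curvature, it stops on the sphere $\|d\| = \Delta_k$, which is what makes it suitable for the larger test problems.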
In Table 2, we compare the four algorithms on 43 small-scale problems; the results are summarized as follows:
(i) 19 problems where NTRW was superior to BTR;
(ii) 11 problems where BTR was superior to NTRW;
(iii) 13 problems where NTR2 was superior to NTR1;
(iv) 5 problems where NTR1 was superior to NTR2;
(v) 25 problems where NTRW was superior to NTR2;
(vi) 10 problems where NTR2 was superior to NTRW.
For problems 12, 18, 24, 30, and 36 in particular, the iteration counts of the four algorithms are similar, while the number of function evaluations of NTRW is much smaller than that of the others; that is, NTRW solves far fewer subproblems. Therefore, our self-adaptive technique is efficient. For problems 10, 20, 27, 32, and 38, NTRW is clearly superior to the others, and its number of function evaluations is the smallest. So the performance of our algorithm is better than that of the others.
In Table 3, we compare the four algorithms on 25 middle-scale problems. NTRW is much superior to the others on 12 problems, the four algorithms perform similarly on 7 problems, and NTRW performs worse on only 4 problems.
In Table 4, we compare the four algorithms on 10 large-scale problems. NTRW is much superior to the others on 5 problems, the four algorithms perform similarly on 4 problems, and NTRW performs worse on only 1 problem. Note that, for problems 33 and 35, the iteration count of our algorithm is similar to that of NTR2, while the CPU time is much larger. The exponential function contained in the L-function in (32) is called in the Matlab environment and may consume more time.
Further results are shown in Figures 1 and 2, which characterize the algorithms by means of the performance profile proposed in [28]. The performance profile $\rho_s(\tau)$ is the probability, over the test problems, that the log-scaled performance ratio of solver $s$ is not greater than the factor $\tau$. More details can be found in [28]. As we can see from Figures 1 and 2, NTRW is obviously superior to NTR1 and NTR2 in the number of iterations and function evaluations, and NTRW is superior to the other three algorithms in the number of function evaluations.
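As an illustration, the profile values can be computed as in the following Python sketch, where the cost matrix T is a made-up example rather than the paper's data (the sketch uses raw rather than log-scaled ratios):

```python
import numpy as np

# Performance profile in the sense of [28]. T is a hypothetical cost
# matrix: T[p, s] = cost (iterations, evaluations, or time) of solver s
# on problem p; np.inf marks a failure.

def performance_profile(T, taus):
    """rho[s][i]: fraction of problems on which solver s's ratio to the
    best solver is at most taus[i]."""
    best = T.min(axis=1, keepdims=True)   # best cost per problem
    ratios = T / best
    return [[float(np.mean(ratios[:, s] <= t)) for t in taus]
            for s in range(T.shape[1])]

T = np.array([[2.0, 4.0],
              [3.0, 3.0],
              [5.0, np.inf]])
rho = performance_profile(T, taus=[1.0, 2.0])
# solver 0 is within a factor 1 of the best on every problem: rho[0] = [1.0, 1.0]
```

Reading the curves is then straightforward: $\rho_s(1)$ is the fraction of problems on which solver $s$ is the best, and the height of the curve for large $\tau$ is the fraction of problems the solver eventually solves at all.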

Conclusion
This paper presents a nonmonotone weighting self-adaptive trust region algorithm for unconstrained nonconvex optimization. The new algorithm is very simple and easily implemented. The convergence properties of the method are established under reasonable assumptions. Numerical experiments show that the new algorithm is quite robust and effective, and its numerical performance is comparable to or better than that of other trust region algorithms in the same framework.

Figure 1: Performance profile comparing the number of iterations.

Figure 2: Performance profile comparing the number of function evaluations.

Table 2: Numerical comparisons for some small-scale test problems.
** means that the algorithm reaches 500 iterations.

Table 3: Numerical comparisons for some middle-scale test problems.
** means that the algorithm reaches 5000 iterations.