A Truncated Descent HS Conjugate Gradient Method and Its Global Convergence

Recently, Zhang (2006) proposed a three-term modified HS (TTHS) method for unconstrained optimization problems. An attractive property of the TTHS method is that the direction generated by the method is always descent; this property is independent of the line search used. In order to obtain the global convergence of the TTHS method, Zhang proposed a truncated TTHS method. A drawback is that the numerical performance of the truncated TTHS method is not ideal. In this paper, we prove that the TTHS method with the standard Armijo line search is globally convergent for uniformly convex problems. Moreover, we propose a new truncated TTHS method. Under suitable conditions, global convergence is obtained for the proposed method. Extensive numerical experiments show that the proposed method is very efficient for the test problems from the CUTE library.


Introduction
Consider the unconstrained optimization problem
\[ \min\,\{ f(x) : x \in \mathbb{R}^n \}, \tag{1.1} \]
where $f$ is continuously differentiable. Conjugate gradient methods are very important methods for solving (1.1), especially when the dimension $n$ is large. The methods are of the form
\[ x_{k+1} = x_k + \alpha_k d_k, \tag{1.2} \]
\[ d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{1.3} \]
where $g_k$ denotes the gradient of $f$ at $x_k$, $\alpha_k$ is the step length obtained by a line search, and $\beta_k$ is a scalar. The strong Wolfe line search finds a step length $\alpha_k$ such that
\[ f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \tag{1.4} \]
\[ |g(x_k + \alpha_k d_k)^T d_k| \le -\sigma g_k^T d_k, \tag{1.5} \]
where $\delta \in (0, 1/2)$ and $\sigma \in (\delta, 1)$. In the conjugate gradient field, it is also possible to use the Wolfe line search [1, 2], which calculates an $\alpha_k$ satisfying (1.4) and
\[ g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k. \tag{1.6} \]
In particular, some conjugate gradient methods admit the Armijo line search; namely, the step length $\alpha_k = \max\{\beta\rho^j : j = 0, 1, 2, \dots\}$ is chosen to satisfy
\[ f(x_k + \alpha_k d_k) \le f(x_k) + \delta_1 \alpha_k g_k^T d_k, \tag{1.7} \]
where $0 < \beta \le 1$, $0 < \rho < 1$, and $0 < \delta_1 < 1$. Varieties of conjugate gradient methods differ in the way of selecting $\beta_k$. In this paper, we are interested in the HS method [3], namely,
\[ \beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}. \tag{1.8} \]
Here and throughout the paper, without specification, we always use $\|\cdot\|$ to denote the Euclidean norm of vectors, $y_{k-1} = g_k - g_{k-1}$, and $s_k = \alpha_k d_k$.
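To fix ideas, here is a minimal Python sketch (ours, not part of the paper) of the generic iteration (1.2)-(1.3) with the backtracking Armijo rule (1.7) and the HS formula (1.8); the small quadratic objective, the restart safeguard, and all names are illustrative assumptions.

```python
import numpy as np

# Illustrative smooth objective: f(x) = 0.5 x^T A x - b^T x (our choice).
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
grad = lambda x: A @ x - b

def armijo(x, d, g, beta=1.0, rho=0.5, delta1=1e-4):
    """Armijo rule (1.7): the largest beta * rho^j giving sufficient decrease."""
    alpha = beta
    while f(x + alpha * d) > f(x) + delta1 * alpha * (g @ d):
        alpha *= rho
    return alpha

x = np.zeros(2)
g = grad(x)
d = -g                                   # d_0 = -g_0 in (1.3)
for _ in range(500):
    if np.linalg.norm(g) <= 1e-8:
        break
    alpha = armijo(x, d, g)
    x = x + alpha * d                    # iteration (1.2)
    g_new = grad(x)
    y = g_new - g                        # y_{k-1} = g_k - g_{k-1}
    beta_hs = (g_new @ y) / (d @ y)      # HS formula (1.8)
    d = -g_new + beta_hs * d             # direction (1.3)
    if g_new @ d >= 0:                   # our safeguard: restart if not descent
        d = -g_new
    g = g_new

print(x, np.linalg.norm(g))              # approximately solves Ax = b
```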
We refer to the book [4] and a recent review paper [5] for progress on the global convergence of conjugate gradient methods. The study of the HS method has made great progress. In practical computation, the HS method is generally believed to be one of the most efficient conjugate gradient methods. Theoretically, the HS method has the property that the conjugacy condition $d_k^T y_{k-1} = 0$ always holds, independently of the line search used. Seeking fast convergence of the method, Dai and Liao [6] modified the numerator of the HS method by using the secant condition of quasi-Newton methods, obtaining the DL method. Due to Powell's [7] example, the DL method may not converge for general functions, even with exact line search. Zhang [12] proposed the TTHS method. The sufficient descent property of the TTHS method is also independent of the line search used. In order to obtain the global convergence of the TTHS method, Zhang truncated the search direction of the TTHS method. Numerical experiments in [12] show that the truncated TTHS method is not very effective. In this paper, we study the TTHS method further. We prove that the TTHS method with the standard Armijo line search is globally convergent for uniformly convex problems. To improve the efficiency of the truncated TTHS method, we propose a new truncation strategy for the TTHS method. Under suitable conditions, global convergence is obtained for the proposed method. Numerical experiments show that the proposed method outperforms the well-known CG_DESCENT method. The paper is organized as follows. In Section 2, we propose our algorithm, and convergence analysis is provided under suitable conditions. Preliminary numerical results are presented in Section 3.

Global Convergence Analysis
Recently, Zhang [12] proposed a three-term modified HS method, whose search direction is given by
\[ d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k^{HS} d_{k-1} - \theta_k y_{k-1}, & k \ge 1, \end{cases} \tag{2.1} \]
where
\[ \theta_k = \frac{g_k^T d_{k-1}}{d_{k-1}^T y_{k-1}}. \]
An attractive property of the TTHS method is that the direction always satisfies
\[ g_k^T d_k = -\|g_k\|^2, \tag{2.2} \]
which is independent of the line search used; indeed, the two terms $\beta_k^{HS} g_k^T d_{k-1}$ and $\theta_k g_k^T y_{k-1}$ coincide and cancel in the inner product $g_k^T d_k$. In order to obtain the global convergence of the TTHS method, Zhang truncated the search direction (2.1); the truncation is controlled by two positive constants $\varepsilon_1$ and $r$ (see [12] for details). Zhang proved that the truncated TTHS method converges globally with the Wolfe line search (1.4) and (1.6). However, numerical results show that the truncated TTHS method is not very effective. In this paper, we study the TTHS method again. In the rest of this section, we establish two preliminary convergence results for the TTHS method (a numerical check of the descent property (2.2) is sketched after the following list):
(i) Uniformly convex functions: the TTHS method converges globally with the standard Armijo line search (1.7).
(ii) General functions: the method converges globally with the strong Wolfe line search (1.4) and (1.5), by applying a new truncation strategy to the TTHS method.
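As a quick illustration of (2.1) and (2.2), the following sketch (a minimal example of ours, not from the paper; the random data and all names are illustrative) computes the TTHS direction and verifies the sufficient descent property numerically.

```python
import numpy as np

def tths_direction(g_k, g_prev, d_prev):
    """Three-term modified HS (TTHS) direction (2.1):
    d_k = -g_k + beta_HS * d_prev - theta * y, with y = g_k - g_prev."""
    y = g_k - g_prev
    denom = d_prev @ y
    beta_hs = (g_k @ y) / denom
    theta = (g_k @ d_prev) / denom
    return -g_k + beta_hs * d_prev - theta * y

rng = np.random.default_rng(0)
n = 5
g_prev, g_k = rng.standard_normal(n), rng.standard_normal(n)
d_prev = rng.standard_normal(n)

d_k = tths_direction(g_k, g_prev, d_prev)
# Sufficient descent (2.2): g_k^T d_k = -||g_k||^2, whatever d_prev is.
print(np.isclose(g_k @ d_k, -np.linalg.norm(g_k) ** 2))  # True
```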
In order to establish the global convergence of our method, we need the following assumption.

Assumption 2.1. (i) The level set $\Omega = \{x \in \mathbb{R}^n : f(x) \le f(x_0)\}$ is bounded.
(ii) In some neighborhood $N$ of $\Omega$, $f$ is continuously differentiable and its gradient is Lipschitz continuous; namely, there exists a constant $L > 0$ such that
\[ \|g(x) - g(y)\| \le L\|x - y\|, \quad \forall x, y \in N. \]
Under Assumption 2.1, it is clear that there exist positive constants $B$ and $\gamma$ such that
\[ \|x\| \le B, \quad \forall x \in \Omega, \tag{2.5} \]
\[ \|g(x)\| \le \gamma, \quad \forall x \in N. \tag{2.6} \]

Lemma 2.2. Suppose that Assumption 2.1 holds. Consider the method (1.2), where $d_k$ satisfies the descent condition (2.2) and $\alpha_k$ is obtained by the Armijo line search (1.7) or by a Wolfe-type line search (1.4)-(1.6). Then
\[ \sum_{k \ge 0} \frac{\|g_k\|^4}{\|d_k\|^2} < \infty. \tag{2.7} \]

Proof. Under the Wolfe-type line searches, (2.7) follows from the classical Zoutendijk condition together with (2.2); we prove the Armijo case. By (2.2), $d_k$ is a descent direction, so the line search (1.7) is well defined and $\{f(x_k)\}$ is decreasing. Combining (1.7) with (2.2), we have

\[ f(x_k) - f(x_{k+1}) \ge \delta_1 \alpha_k \|g_k\|^2. \tag{2.10} \]
Note that (2.2) and the Cauchy-Schwarz inequality give $\|d_k\| \ge \|g_k\|$; hence, if $\alpha_k = \beta$, then $\alpha_k \ge \beta \|g_k\|^2/\|d_k\|^2$. On the other hand, if $\alpha_k \ne \beta$, then by the line search rule, $\rho^{-1}\alpha_k$ does not satisfy (1.7). This implies
\[ f(x_k + \rho^{-1}\alpha_k d_k) - f(x_k) > \delta_1 \rho^{-1}\alpha_k g_k^T d_k. \tag{2.11} \]
By the mean-value theorem, there exists $\mu_k \in (0, 1)$ such that
\[ f(x_k + \rho^{-1}\alpha_k d_k) - f(x_k) = \rho^{-1}\alpha_k\, g(x_k + \mu_k \rho^{-1}\alpha_k d_k)^T d_k. \tag{2.12} \]
This together with (2.11) implies
\[ g(x_k + \mu_k \rho^{-1}\alpha_k d_k)^T d_k > \delta_1 g_k^T d_k. \tag{2.13} \]
Since $g$ is Lipschitz continuous, the last inequality shows
\[ L\rho^{-1}\alpha_k \|d_k\|^2 \ge \bigl(g(x_k + \mu_k \rho^{-1}\alpha_k d_k) - g_k\bigr)^T d_k > (1 - \delta_1)\|g_k\|^2. \tag{2.14} \]
That is,

\[ \alpha_k \ge \frac{\rho(1 - \delta_1)}{L} \cdot \frac{\|g_k\|^2}{\|d_k\|^2}. \tag{2.15} \]
In both cases, this implies that there is a constant $M_1 = \min\{\beta,\ \rho(1-\delta_1)/L\} > 0$ such that

\[ \alpha_k \ge M_1 \frac{\|g_k\|^2}{\|d_k\|^2}. \tag{2.16} \]
Inequality (2.10) together with (2.16) shows that
\[ f(x_k) - f(x_{k+1}) \ge M_2 \frac{\|g_k\|^4}{\|d_k\|^2}, \]
with the constant $M_2 = \delta_1 M_1 > 0$. Since $f$ is bounded below on $\Omega$, summing these inequalities over $k$, we obtain (2.7).
The following theorem establishes the global convergence of the TTHS method with the standard Armijo line search (1.7) for uniformly convex problems.

Theorem 2.3. Suppose that Assumption 2.1 holds and $f$ is a uniformly convex function. Consider the TTHS method (1.2) and (2.1), where $\alpha_k$ is obtained by the Armijo line search (1.7). Then
\[ \lim_{k \to \infty} \|g_k\| = 0. \tag{2.18} \]

Proof. We proceed by contradiction. If (2.18) does not hold, there exists a positive constant $\varepsilon$ such that for all $k$,
\[ \|g_k\| \ge \varepsilon. \tag{2.19} \]
From Lemma 2.2, we get
\[ \sum_{k \ge 0} \frac{\|g_k\|^4}{\|d_k\|^2} < \infty. \tag{2.20} \]
Since $f$ is a uniformly convex function, there exists a constant $\mu > 0$ such that
\[ \bigl(g(x) - g(y)\bigr)^T (x - y) \ge \mu \|x - y\|^2, \quad \forall x, y \in N. \tag{2.21} \]
Taking $x = x_k$ and $y = x_{k-1}$ in (2.21) gives
\[ d_{k-1}^T y_{k-1} \ge \mu \alpha_{k-1} \|d_{k-1}\|^2. \tag{2.22} \]
Together with the Lipschitz continuity of $g$, this means
\[ \|d_k\| \le \|g_k\| + |\beta_k^{HS}|\,\|d_{k-1}\| + |\theta_k|\,\|y_{k-1}\| \le \Bigl(1 + \frac{2L}{\mu}\Bigr)\|g_k\|. \tag{2.23} \]
This implies
\[ \frac{\|g_k\|^4}{\|d_k\|^2} \ge \frac{\|g_k\|^2}{(1 + 2L/\mu)^2} \ge \frac{\varepsilon^2}{(1 + 2L/\mu)^2} > 0 \quad \text{for all } k. \tag{2.24} \]
This yields a contradiction with (2.20), which completes the proof.
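To illustrate Theorem 2.3, the following self-contained sketch (ours, not from the paper; the diagonal quadratic is an arbitrary uniformly convex test function) runs the TTHS method with the Armijo backtracking rule (1.7) and drives the gradient norm to numerical zero.

```python
import numpy as np

# Uniformly convex test function: f(x) = 0.5 x^T A x with A positive definite.
A = np.diag([1.0, 10.0, 100.0])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x

def armijo(x, d, g, beta=1.0, rho=0.5, delta1=1e-4):
    """Backtracking Armijo rule (1.7)."""
    alpha = beta
    while f(x + alpha * d) > f(x) + delta1 * alpha * (g @ d):
        alpha *= rho
    return alpha

x = np.ones(3)
g = grad(x)
d = -g
for k in range(500):
    if np.linalg.norm(g) <= 1e-10:
        break
    alpha = armijo(x, d, g)
    x = x + alpha * d
    g_new = grad(x)
    y = g_new - g
    denom = d @ y                # positive here, since f is uniformly convex
    # TTHS direction (2.1); by (2.2) it is always a descent direction,
    # so the Armijo loop above terminates.
    d = -g_new + ((g_new @ y) / denom) * d - ((g_new @ d) / denom) * y
    g = g_new

print(k, np.linalg.norm(g))      # ||g_k|| -> 0, as Theorem 2.3 predicts
```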
We are now going to investigate the global convergence of the TTHS method with the strong Wolfe line search (1.4) and (1.5). Similar to the PRP method [8], we restrict the HS parameter to be nonnegative, that is, we take $\beta_k^+ = \max\{\beta_k^{HS}, 0\}$. In this case, the search direction (2.1) may not be a descent direction. Noting that the search direction (2.1) can be rewritten as
\[ d_k = -g_k + \beta_k \Bigl(d_{k-1} - \frac{g_k^T d_{k-1}}{g_k^T y_{k-1}}\, y_{k-1}\Bigr), \]
where $\beta_k = \beta_k^{HS}$, and since the term $g_k^T y_{k-1}$ may be zero in practical computation, we consider the following search direction:
\[ d_k = \begin{cases} -g_k + \beta_k^+ \Bigl(d_{k-1} - \dfrac{g_k^T d_{k-1}}{g_k^T y_{k-1}}\, y_{k-1}\Bigr), & \text{if } g_k^T y_{k-1} \ge c\|g_k\|^2, \\[4pt] -g_k, & \text{otherwise,} \end{cases} \tag{2.26} \]
where $c > 0$ is a constant and $\beta_k^+ = \max\{\beta_k^{HS}, 0\}$. It is clear that the relation (2.2) always holds for (2.26). For simplicity, we regard the method defined by (1.2) and (2.26) as the method (2.26).
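A minimal Python sketch of the truncated direction (2.26) may clarify the truncation test (the function name and the guard on $d_{k-1}^T y_{k-1}$ are ours; $c = 10^{-8}$ anticipates the choice used in Section 3):

```python
import numpy as np

def truncated_tths_direction(g_k, g_prev, d_prev, c=1e-8):
    """Truncated descent HS direction (2.26).

    Falls back to steepest descent when g_k^T y_{k-1} is not safely positive;
    otherwise uses the rewritten three-term HS step with
    beta_plus = max{beta_HS, 0}."""
    y = g_k - g_prev
    gty = g_k @ y
    if gty < c * (g_k @ g_k):               # truncation test in (2.26)
        return -g_k
    denom = d_prev @ y
    # Under the Wolfe conditions denom > 0 when d_prev is a descent
    # direction; the guard below is only a numerical safeguard of ours.
    beta_plus = max(gty / denom, 0.0) if denom > 0 else 0.0
    d_k = -g_k + beta_plus * (d_prev - ((g_k @ d_prev) / gty) * y)
    # The sufficient descent relation (2.2) holds by construction:
    assert np.isclose(g_k @ d_k, -(g_k @ g_k))
    return d_k
```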
Now we describe a lemma for the search directions, which shows that they change slowly, asymptotically. The lemma is similar to [8, Lemma 3.4].

Lemma 2.4. Suppose that Assumption 2.1 holds. Let $\{x_k\}$ be generated by the method (2.26), where $\alpha_k$ is obtained by the strong Wolfe line search (1.4) and (1.5). If there exists a constant $\varepsilon > 0$ such that
\[ \|g_k\| \ge \varepsilon, \quad \forall k, \tag{2.27} \]
then $d_k \ne 0$ and
\[ \sum_{k \ge 1} \|u_k - u_{k-1}\|^2 < \infty, \tag{2.28} \]
where $u_k = d_k/\|d_k\|$.

Proof. By (2.2) and (2.27), $d_k \ne 0$. The direction (2.26) can be written as
\[ d_k = v_k + \beta_k^+ d_{k-1}, \tag{2.29} \]
where $v_k := -g_k - \beta_k^+ (g_k^T d_{k-1}/g_k^T y_{k-1})\, y_{k-1}$ if $g_k^T y_{k-1} \ge c\|g_k\|^2$, and $v_k := -g_k$ (with $\beta_k^+ := 0$) otherwise. From (2.26), we have
\[ u_k = r_k + \delta_k u_{k-1}, \tag{2.30} \]
where $r_k := v_k/\|d_k\|$ and $\delta_k := \beta_k^+ \|d_{k-1}\|/\|d_k\| \ge 0$.
Since the $u_k$ are unit vectors, we have
\[ \|r_k\| = \|u_k - \delta_k u_{k-1}\| = \|\delta_k u_k - u_{k-1}\|. \]
Then, since $\delta_k \ge 0$ gives $|\delta_k - 1| \le \|\delta_k u_k - u_{k-1}\|$, we have
\[ \|u_k - u_{k-1}\| \le \|u_k - \delta_k u_{k-1}\| + |\delta_k - 1| \le 2\|r_k\|. \tag{2.33} \]
Now, we evaluate the quantity $\|v_k\|$. If $g_k^T y_{k-1} \ge c\|g_k\|^2$, then by (1.5) and (2.2), we have
\[ |g_k^T d_{k-1}| \le -\sigma g_{k-1}^T d_{k-1} = \sigma \|g_{k-1}\|^2. \tag{2.34} \]
By the strong Wolfe condition (1.5) and the relation (2.2), we obtain
\[ d_{k-1}^T y_{k-1} = g_k^T d_{k-1} - g_{k-1}^T d_{k-1} \ge (1 - \sigma)\|g_{k-1}\|^2. \tag{2.35} \]
Inequalities (2.34) and (2.35) yield
\[ \beta_k^+ \frac{|g_k^T d_{k-1}|}{g_k^T y_{k-1}} \le \frac{|g_k^T d_{k-1}|}{d_{k-1}^T y_{k-1}} \le \frac{\sigma}{1-\sigma}. \tag{2.36} \]
This implies
\[ \|v_k\| \le \|g_k\| + \frac{\sigma}{1-\sigma}\|y_{k-1}\| \le \gamma + \frac{2\sigma\gamma}{1-\sigma} = \frac{1+\sigma}{1-\sigma}\,\gamma. \tag{2.37} \]
If $g_k^T y_{k-1} < c\|g_k\|^2$, then $v_k = -g_k$, so the relation (2.37) also holds. It follows from the definition of $r_k$, Lemma 2.2, (2.27), and (2.37) that
\[ \sum_{k \ge 1} \|r_k\|^2 = \sum_{k \ge 1} \frac{\|v_k\|^2}{\|d_k\|^2} \le \Bigl(\frac{(1+\sigma)\gamma}{1-\sigma}\Bigr)^2 \frac{1}{\varepsilon^4} \sum_{k \ge 1} \frac{\|g_k\|^4}{\|d_k\|^2} < \infty. \tag{2.38} \]
By (2.33), we get the conclusion (2.28).
The next theorem establishes the global convergence of the method (2.26) with the strong Wolfe line search (1.4) and (1.5). The proof of the theorem is similar to that of [15, Theorem 3.2].

Theorem 2.5. Suppose that Assumption 2.1 holds. Let $\{x_k\}$ be generated by the method (2.26), where $\alpha_k$ is obtained by the strong Wolfe line search (1.4) and (1.5). Then
\[ \liminf_{k \to \infty} \|g_k\| = 0. \tag{2.39} \]

Proof. We assume that the conclusion (2.39) is not true; then there exists a constant $\varepsilon > 0$ such that for all $k$,
\[ \|g_k\| \ge \varepsilon. \tag{2.40} \]
The proof is divided into the following three steps.

Step 1. A bound on $\beta_l^+$. If $g_l^T y_{l-1} \ge c\|g_l\|^2$, then by (2.35), (2.40), (2.6), and the Lipschitz continuity of $g$,
\[ 0 \le \beta_l^+ \le \frac{\|g_l\|\,\|y_{l-1}\|}{(1-\sigma)\|g_{l-1}\|^2} \le \frac{\gamma L}{(1-\sigma)\varepsilon^2}\,\|s_{l-1}\| =: C_2 \|s_{l-1}\|. \tag{2.41} \]
If $g_l^T y_{l-1} < c\|g_l\|^2$, then $\beta_l^+$ plays no role, since $d_l = -g_l$.
Step 2. A bound on the steps $s_k$. This is a modified version of [8, Theorem 4.3]. Observe that, for any $l \ge k$,
\[ x_l - x_k = \sum_{i=k}^{l-1} s_i = \sum_{i=k}^{l-1} \|s_i\|\, u_k + \sum_{i=k}^{l-1} \|s_i\| (u_i - u_k). \tag{2.42} \]
Taking norms and applying the triangle inequality to the last equality, we get from (2.5) that
\[ \sum_{i=k}^{l-1} \|s_i\| \le \|x_l - x_k\| + \sum_{i=k}^{l-1} \|s_i\|\,\|u_i - u_k\| \le 2B + \sum_{i=k}^{l-1} \|s_i\|\,\|u_i - u_k\|. \tag{2.43} \]
Let $\Delta$ be a positive integer, chosen large enough that
\[ \Delta \ge 4BC_1, \tag{2.44} \]
where $C_1 := \sigma\gamma^2/\varepsilon^2$. By Lemma 2.4, we can choose $k_0$ large enough that
\[ \sum_{i \ge k_0} \|u_i - u_{i-1}\|^2 \le \frac{1}{4\Delta}. \tag{2.45} \]
If $j > k \ge k_0$ and $j - k \le \Delta$, then by (2.45) and the Cauchy-Schwarz inequality, we have
\[ \|u_j - u_k\| \le \sum_{i=k+1}^{j} \|u_i - u_{i-1}\| \le \sqrt{j-k}\,\Bigl(\sum_{i=k+1}^{j} \|u_i - u_{i-1}\|^2\Bigr)^{1/2} \le \sqrt{\Delta}\,\Bigl(\frac{1}{4\Delta}\Bigr)^{1/2} = \frac{1}{2}. \tag{2.46} \]
Combining this with (2.43) yields
\[ \sum_{i=k}^{l-1} \|s_i\| \le 4B \quad \text{whenever } l > k \ge k_0 \text{ and } l - k \le \Delta. \tag{2.47} \]

Step 3. A bound on the direction $d_l$ determined by (2.26). If $g_l^T y_{l-1} \ge c\|g_l\|^2$, then it follows from (2.36), (2.37), and Step 1 that
\[ \|d_l\|^2 \le 2\bar{\gamma}^2 + 2C_2^2 \|s_{l-1}\|^2 \|d_{l-1}\|^2, \tag{2.48} \]
where $\bar{\gamma} := (1+\sigma)\gamma/(1-\sigma)$ is the bound on $\|v_l\|$ from (2.37).
If $g_l^T y_{l-1} < c\|g_l\|^2$, then $d_l = -g_l$, so $\|d_l\| \le \gamma \le \bar{\gamma}$ and the relation (2.48) also holds. Define $S_i := 2C_2^2 \|s_i\|^2$. We conclude that, for $l > k_0$,
\[ \|d_l\|^2 \le 2\bar{\gamma}^2 \sum_{j=k_0+1}^{l} \;\prod_{i=j}^{l-1} S_i \;+\; \|d_{k_0}\|^2 \prod_{i=k_0}^{l-1} S_i, \tag{2.49} \]
where a product over an empty index range equals 1. Proceeding in a similar way to Case III of [15, Theorem 3.2], we get the conclusion.
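To make the whole algorithm concrete, here is a compact sketch (our illustration, not the authors' Fortran implementation) of the method (1.2) with the direction (2.26), coupled with SciPy's Wolfe-type line search and tested on the Rosenbrock function; the restart logic for a failed line search is our own safeguard.

```python
import numpy as np
from scipy.optimize import line_search, rosen, rosen_der

def truncated_hs(x, gtol=1e-6, max_iter=5000, c=1e-8):
    """Method (1.2) with the truncated descent HS direction (2.26)."""
    g = rosen_der(x)
    d = -g
    for k in range(max_iter):
        if np.linalg.norm(g, np.inf) <= gtol:
            return x, k
        alpha = line_search(rosen, rosen_der, x, d, gfk=g)[0]
        if alpha is None:                  # line search failed: restart
            d = -g
            alpha = line_search(rosen, rosen_der, x, d, gfk=g)[0]
            if alpha is None:
                break
        x_new = x + alpha * d
        g_new = rosen_der(x_new)
        y = g_new - g
        gty = g_new @ y
        if gty >= c * (g_new @ g_new) and d @ y > 0:
            beta_plus = max(gty / (d @ y), 0.0)
            d = -g_new + beta_plus * (d - ((g_new @ d) / gty) * y)
        else:
            d = -g_new                     # truncation branch of (2.26)
        x, g = x_new, g_new
    return x, k

x_star, iters = truncated_hs(np.full(10, -1.0))
print(iters, rosen(x_star))                # near-zero value at (1, ..., 1)
```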

Numerical Experiments
In this section, we report some numerical results. We tested 111 problems from the CUTE [13] library. We compared the performance of the method (2.26) with that of the CG_DESCENT method. The CG_DESCENT code can be obtained from Hager's web page at http://www.math.ufl.edu/hager/papers/CG. In the numerical experiments, we used the latest version (Source code, Fortran 77, Version 1.4, November 14, 2005) with default parameters. We implemented the method (2.26) with the approximate Wolfe line search in [5]; namely, the method (2.26) used the same line search and parameters as the CG_DESCENT method. The stopping criterion is that the inequality $\|g(x_k)\|_\infty \le \max\{10^{-8},\ 10^{-12}\|\nabla f(x_0)\|_\infty\}$ is satisfied or the iteration number exceeds $4 \times 10^4$. All codes were written in Fortran 77 and run on a PC with a PIII 866 processor, 192 MB of RAM, and the Linux operating system. Detailed results are posted at the following web site: http://hi.814e.com/wanyoucheng/results.htm.
We adopt the performance profiles of Dolan and Moré [14] to compare the performance of the different methods. That is, for each method, we plot the fraction $P$ of problems for which the method is within a factor $\tau$ of the best time. The left side of each figure gives the percentage of the test problems for which a method is the fastest; the right side gives the percentage of the test problems that are successfully solved by each of the methods. The methods are denoted as follows:
(i) "cg-descent": the CG_DESCENT method [5];
(ii) "mhs": the method (2.26) with the same line search as "cg-descent" and $c = 10^{-8}$.
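A Dolan-Moré profile is straightforward to reproduce. The sketch below (ours; the timing matrix is placeholder data, not the paper's results) plots, for each method, the fraction of problems solved within a factor $\tau$ of the best time.

```python
import numpy as np
import matplotlib.pyplot as plt

def performance_profile(times, labels, tau_max=16.0):
    """Dolan-More profile. times: (n_problems, n_methods); np.inf = failure."""
    ratios = times / times.min(axis=1, keepdims=True)   # ratios r_{p,s}
    taus = np.linspace(1.0, tau_max, 400)
    for s, label in enumerate(labels):
        frac = [(ratios[:, s] <= t).mean() for t in taus]
        plt.plot(taus, frac, label=label)
    plt.xlabel("tau")
    plt.ylabel("P(performance ratio <= tau)")
    plt.legend()
    plt.show()

# Placeholder timings for two solvers on five problems (not the paper's data).
times = np.array([[1.0, 1.2],
                  [2.0, 1.0],
                  [0.5, np.inf],
                  [3.0, 2.5],
                  [1.1, 1.0]])
performance_profile(times, ["mhs", "cg-descent"])
```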
From Figures 1-4, it is clear that the "mhs" method outperforms the "cg-descent" method.

Figure 1: Performance based on the number of iterations.

Figure 2: Performance based on the number of function evaluations.

Figure 3: Performance based on the number of gradient evaluations.

Figure 4: Performance based on CPU time.