A Globally Convergent Hybrid Conjugate Gradient Method and Its Numerical Behaviors

Introduction
Consider the following problem of finding an $x \in R^n$ such that
$$g(x) = 0, \tag{1}$$
where $R^n$ is the $n$-dimensional Euclidean space and $g : R^n \to R^n$ is continuous. Throughout this paper, this problem corresponds to the optimality condition of a certain problem of minimizing $f : R^n \to R$, where $f$ may not be easy to calculate or cannot be expressed in terms of elementary functions.
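When the dimension $n$ is large, conjugate gradient methods can be efficient for solving problem (1). For any given starting point $x_0 \in R^n$, a sequence $\{x_k\}$ is generated by the following recursive relation:
$$x_{k+1} = x_k + \alpha_k d_k, \tag{2}$$
with
$$d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{3}$$
where $\alpha_k$ is a steplength, $d_k$ is a descent direction, $g_k$ stands for $g(x_k)$, and $\beta_k$ is a parameter. Different choices of $\beta_k$ result in different nonlinear conjugate gradient methods. The Dai-Yuan (DY) formula [1] and the Hestenes-Stiefel (HS) formula [2] are two famous ones, and are given by
$$\beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T y_{k-1}}, \qquad \beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \tag{4}$$
respectively, where $\|\cdot\|$ means the 2-norm and $y_{k-1} = g_k - g_{k-1}$. For other well-known formulae for $\beta_k$, such as the Fletcher-Reeves formula [3], the Polak-Ribière-Polyak formula [4,5], and the Hager-Zhang formula [6], please refer to the surveys in [7,8].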
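The DY and HS parameters and the direction update (3) can be illustrated with a few lines of Python. This is a minimal sketch: the function names and the sample vectors are hypothetical and chosen only for illustration, and no safeguard against a vanishing denominator $d_{k-1}^T y_{k-1}$ is included.

```python
import numpy as np

def beta_dy(g_new, g_old, d_old):
    """Dai-Yuan parameter: ||g_k||^2 / (d_{k-1}^T y_{k-1})."""
    y = g_new - g_old
    return float(g_new @ g_new) / float(d_old @ y)

def beta_hs(g_new, g_old, d_old):
    """Hestenes-Stiefel parameter: g_k^T y_{k-1} / (d_{k-1}^T y_{k-1})."""
    y = g_new - g_old
    return float(g_new @ y) / float(d_old @ y)

def next_direction(g_new, d_old, beta):
    """Direction update (3) for k >= 1: d_k = -g_k + beta_k * d_{k-1}."""
    return -g_new + beta * d_old

# Made-up vectors, not data from the paper.
g_old = np.array([1.0, 2.0])
d_old = -g_old                     # first direction is steepest descent
g_new = np.array([0.5, -1.0])

b_dy = beta_dy(g_new, g_old, d_old)
b_hs = beta_hs(g_new, g_old, d_old)
d_new = next_direction(g_new, d_old, b_dy)
```

With the DY choice, the new direction satisfies $g_k^T d_k = \beta_k^{DY} g_{k-1}^T d_{k-1}$, which can be checked numerically on the vectors above.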
In [1], the steplength $\alpha_k$ is obtained by the following weak Wolfe line search:
$$f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \tag{5}$$
$$g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k, \tag{6}$$
where $0 < \delta < \sigma < 1$. With the same line search, two hybrid versions related to the DY method and the HS method were proposed in [9], which generate the parameter $\beta_k$ by
$$\beta_k^{DYHS} = \max\left\{-\frac{1-\sigma}{1+\sigma}\,\beta_k^{DY},\ \min\{\beta_k^{HS}, \beta_k^{DY}\}\right\}, \qquad \beta_k^{DYHS+} = \max\{0,\ \min\{\beta_k^{HS}, \beta_k^{DY}\}\}, \tag{7}$$
respectively. Initial numerical results in [10] suggested that the two hybrid conjugate gradient methods (abbreviated as DYHS and DYHS+, resp.) are efficient, and that the DYHS+ method in particular performed better.

The line search plays an important role in the efficiency of conjugate gradient methods. Hager and Zhang [6] showed that the first condition (5) of the weak Wolfe line search limits the accuracy of a conjugate gradient method to the order of the square root of the machine precision (see also [11,12]); thus, in order to get higher precision, they proposed the approximate Wolfe conditions [6,7]
$$\sigma g_k^T d_k \le g(x_k + \alpha_k d_k)^T d_k \le (2\delta - 1) g_k^T d_k, \tag{8}$$
where $0 < \delta < 1/2$ and $\delta < \sigma < 1$, which are usually used in combination with the weak Wolfe line search. However, there is no theory to guarantee convergence in [6-8]. Following a referee's suggestion, we adapt the approximate Wolfe conditions to a Dai-Yuan hybrid conjugate gradient method and investigate its numerical performance.

More recently, Dong designed a practical Armijo-type steplength rule [13] that uses only gradient information (see [14] for a more conceptual version). The steplength $\alpha_k$ is chosen by the following steps: choose $0 < \beta < 1$, compute some appropriate initial steplength $s_k > 0$, determine a real number $\mu_k$ (see (12)), and find $\alpha_k$ to be the largest $\alpha \in \{\beta^j s_k \mid j = 0, 1, 2, \ldots\}$ such that
$$g(x_k + \alpha d_k)^T d_k + \tfrac{1}{2}\max\{-\mu_k, 0\}\,\alpha \|d_k\|^2 \le \sigma g_k^T d_k, \tag{9}$$
where $j$ is a nonnegative integer and $0 < \sigma < 1$.
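A gradient-only backtracking rule of this Armijo type can be sketched as follows. The sketch assumes the acceptance test $g(x + \alpha d)^T d + \tfrac{1}{2}\max\{-\mu, 0\}\,\alpha\|d\|^2 \le \sigma g(x)^T d$; the initial steplength `s` and the monotonicity estimate `mu` are passed in as plain arguments because their formulas are not part of this illustration. The function name, the quadratic test problem, and the default values are hypothetical.

```python
import numpy as np

def gradient_only_backtracking(grad, x, d, s, mu, sigma=1e-4, beta=0.5, max_trials=30):
    """Return the first alpha in {s, s*beta, s*beta^2, ...} that passes the
    gradient-only test, or None if max_trials steps are all rejected.
    No function values are evaluated, only gradients."""
    g0_d = float(grad(x) @ d)                      # g(x)^T d, negative for descent
    penalty = 0.5 * max(-mu, 0.0) * float(d @ d)   # vanishes when mu >= 0
    alpha = s
    for _ in range(max_trials):
        lhs = float(grad(x + alpha * d) @ d) + penalty * alpha
        if lhs <= sigma * g0_d:
            return alpha
        alpha *= beta
    return None

# Illustration on f(x) = 0.5 * x^T A x with A = diag(1, 10), so g(x) = A x.
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x
x0 = np.array([1.0, 1.0])
d0 = -grad(x0)
alpha0 = gradient_only_backtracking(grad, x0, d0, s=1.0, mu=1.0)
```

On this convex quadratic the penalty term is zero, and the trial steps 1, 0.5, 0.25, and 0.125 are rejected before 0.0625 is accepted.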
The main differences between the practical steplength rule and the weak Wolfe conditions are that the former does not require function evaluations, achieves high accuracy, and has a broader application scope. The feature of high accuracy is supported by the corresponding theoretical analysis in [6,11]. Numerical results reported in [13] also imply that the line search (9) is efficient and highly accurate. So, it is meaningful to embed this line search into the hybrid conjugate gradient method with parameter $\beta_k^{DYHS+}$ and to check its efficiency.

This paper aims to solve problem (1), which corresponds to the optimality condition of a certain problem of minimizing $f : R^n \to R$. If the original function $f$ cannot be expressed in terms of elementary functions, the weak Wolfe conditions cannot be applied directly, while the practical steplength rule (9) and the approximate Wolfe conditions (8) can still be used to solve this kind of nonlinear unconstrained minimization problem. So, in order to investigate the numerical performance of the two modified methods with steplength rules (9) and (8) and to confirm their broader application scope, two classes of test problems are selected. One class is composed of unconstrained nonlinear optimization problems from the CUTEr library, and the other class is composed of some boundary value problems.
The rest of this paper is organized as follows. In Section 2, we give some basic definitions and properties used in this paper. In Section 3, we describe two modified versions of the Dai-Yuan hybrid conjugate gradient method with the line search (9) and the approximate Wolfe conditions (8) and show that if $g$ is Lipschitz continuous, the former version is convergent in the sense that $\liminf_{k \to +\infty} \|g_k\| = 0$. In Section 4, we test the modified hybrid conjugate gradient methods over two classes of standard test problems and compare them with the DYHS method and the DYHS+ method. Finally, some conclusions are given in Section 5.

Preliminaries
In this section, we give some basic definitions and related properties which will be used in the following discussions.
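By using the definition above, we know that if the gradient $g : R^n \to R^n$ is $L$-Lipschitz continuous, then for any given $x_k, d_k \in R^n$, the gradient $g$ must be $\mu_k$-monotone along the ray $\{x_k + \alpha d_k : \alpha \ge 0\}$ for some $\mu_k \in [-L, L]$. How, then, should such a $\mu_k$ be evaluated? In [13], it is suggested to evaluate $\mu_k$ using the approximation formula (12).

Lemma 3. For any given $x, d \in R^n$ and for all $\alpha \ge 0$, if $f : R^n \to R$ is continuously differentiable and the gradient $g : R^n \to R^n$ is $\mu$-monotone along the ray $\{x + \alpha d : \alpha \ge 0\}$, then the following inequality holds:
$$f(x + \alpha d) - f(x) \le \alpha\, g(x + \alpha d)^T d - \tfrac{1}{2}\mu \alpha^2 \|d\|^2.$$

Proof. Please see [13] for a detailed proof.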
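A bound of the kind stated in Lemma 3 can be motivated by a short calculation. The following is only a sketch, and it rests on an assumed reading of the definition: that $g$ being $\mu$-monotone along the ray means $(g(x + \alpha d) - g(x + t d))^T d \ge \mu(\alpha - t)\|d\|^2$ for all $0 \le t \le \alpha$. Under that assumption,

$$
\begin{aligned}
f(x + \alpha d) - f(x) &= \int_0^{\alpha} g(x + t d)^T d \, dt \\
&\le \int_0^{\alpha} \left[ g(x + \alpha d)^T d - \mu(\alpha - t)\|d\|^2 \right] dt \\
&= \alpha\, g(x + \alpha d)^T d - \tfrac{1}{2}\mu \alpha^2 \|d\|^2 .
\end{aligned}
$$

Since $-\mu \le \max\{-\mu, 0\}$, any $\alpha > 0$ that passes a gradient-only test of the form $g(x + \alpha d)^T d + \tfrac{1}{2}\max\{-\mu, 0\}\,\alpha\|d\|^2 \le \sigma g(x)^T d$ then also satisfies $f(x + \alpha d) - f(x) \le \sigma \alpha\, g(x)^T d$, that is, an Armijo-type decrease is obtained without evaluating $f$.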

Algorithm and Its Convergence
In this section, we first formally describe the hybrid conjugate gradient method. Then, we give its two modified versions and show that the modified version with steplength rule (9) is convergent.
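Step 2. Compute the new iterate by (2) and the new search direction by
$$d_{k+1} = -g_{k+1} + \beta_{k+1}^{DYHS+} d_k.$$
Set $k := k + 1$ and go to Step 1.

(i) We abbreviate Algorithm 4 as MDYHS+ if the steplength rule in Step 1 is to find $\alpha_k$ as the largest $\alpha \in \{\beta^j s_k \mid j = 0, 1, 2, \ldots\}$ such that
$$g(x_k + \alpha d_k)^T d_k + \tfrac{1}{2}\max\{-\mu_k, 0\}\,\alpha \|d_k\|^2 \le \sigma g_k^T d_k \tag{16}$$
holds, where $\mu_k$ is defined in (12) and $0 < \sigma < 1$.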
(ii) We abbreviate Algorithm 4 as MDYHS+1 if $\alpha_k$ in Step 1 is chosen to satisfy the approximate Wolfe conditions (8).

It should be noted that, on the face of it, the approximate Wolfe conditions use only gradient information to locate the steplength, yet they require function evaluations in the practical implementations of [6-8]; please refer to [8, pages 120-125] for details. So, the algorithm in [6-8] for generating a steplength that satisfies the approximate Wolfe conditions (8) is not applicable here. Instead, we determine the steplength $\alpha_k$ of the MDYHS+1 method following the inexact line search strategy of [15, Algorithm 2.6]. The detailed steps are described in Algorithm 6. The initial value of $\alpha$ is taken to be $s_k$.
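The direction update in Step 2 can be sketched in Python. The sketch assumes the commonly cited form $\beta^{DYHS+} = \max\{0, \min\{\beta^{HS}, \beta^{DY}\}\}$ for the hybrid parameter; the function names and test vectors are hypothetical, and no safeguard against a zero denominator is included.

```python
import numpy as np

def beta_dyhs_plus(g_new, g_old, d_old):
    """Hybrid parameter, assumed form: max{0, min{beta_HS, beta_DY}}."""
    y = g_new - g_old
    denom = float(d_old @ y)
    b_dy = float(g_new @ g_new) / denom
    b_hs = float(g_new @ y) / denom
    return max(0.0, min(b_hs, b_dy))

def step2_direction(g_new, g_old, d_old):
    """New search direction: -g_{k+1} + beta * d_k."""
    return -g_new + beta_dyhs_plus(g_new, g_old, d_old) * d_old

g_old = np.array([1.0, 2.0])
d_old = -g_old

# Case 1: both parameters are positive, so the smaller one is used.
g_a = np.array([0.5, -1.0])
beta_a = beta_dyhs_plus(g_a, g_old, d_old)

# Case 2: beta_HS is negative, so the parameter is truncated to zero and the
# method restarts along the steepest descent direction.
g_b = np.array([0.9, 1.8])
beta_b = beta_dyhs_plus(g_b, g_old, d_old)
d_b = step2_direction(g_b, g_old, d_old)
```

Truncating at zero means that a negative HS value triggers a restart, which is one reason the nonnegative variant is often preferred in practice.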
Remark 5. The choice of $s_k$ comes from [13]. Since it is important to use current information about the algorithm and the problem to make an initial guess of $\alpha_k^*$, the author of [13] uses the relation $g(x_k + \alpha_k^* d_k)^T d_k = 0$ to derive an approximation to the optimal steplength.

Now, we describe the line search algorithm in [15], which is very close to the one suggested in [16].

Algorithm 6. Step 0. Set $u = 0$ and $v = +\infty$. Choose $\alpha > 0$. Set $j := 0$.
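Step 1. If $\alpha$ does not satisfy
$$g(x_k + \alpha d_k)^T d_k \le (2\delta - 1) g_k^T d_k,$$
then set $j := j + 1$ and go to Step 2. If $\alpha$ does not satisfy
$$g(x_k + \alpha d_k)^T d_k \ge \sigma g_k^T d_k,$$
then set $j := j + 1$ and go to Step 3. Otherwise, set $\alpha_k := \alpha$ and return.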
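A bracketing line search of this type can be sketched as follows. This is an illustration under stated assumptions rather than a transcription of Algorithm 6: the interval updates (bisection once an upper bound is known, doubling otherwise) follow the usual bracketing pattern, and the names `lo` and `hi`, the default parameters, and the quadratic test problem are hypothetical.

```python
import math
import numpy as np

def approx_wolfe_bracketing(grad, x, d, alpha, delta=0.1, sigma=0.9, max_trials=30):
    """Search for alpha with
        sigma * g0_d <= grad(x + alpha d)^T d <= (2*delta - 1) * g0_d,
    using gradient evaluations only. Returns None after max_trials failures."""
    g0_d = float(grad(x) @ d)          # assumed negative (descent direction)
    lo, hi = 0.0, math.inf             # current bracket for the steplength
    for _ in range(max_trials):
        gd = float(grad(x + alpha * d) @ d)
        if gd > (2.0 * delta - 1.0) * g0_d:
            hi = alpha                 # upper condition fails: step too long
        elif gd < sigma * g0_d:
            lo = alpha                 # lower condition fails: step too short
        else:
            return alpha
        # Bisect when the bracket is finite, otherwise keep doubling.
        alpha = 0.5 * (lo + hi) if math.isfinite(hi) else 2.0 * alpha
    return None

# Illustration on f(x) = 0.5 * x^T A x with A = diag(1, 10).
A = np.diag([1.0, 10.0])
grad = lambda x: A @ x
x0 = np.array([1.0, 1.0])
d0 = -grad(x0)
alpha_shrink = approx_wolfe_bracketing(grad, x0, d0, alpha=1.0)    # starts too long
alpha_grow = approx_wolfe_bracketing(grad, x0, d0, alpha=0.001)    # starts too short
```

Starting from 1.0 the step is halved three times, and starting from 0.001 it is doubled four times, before both conditions hold.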
Next, we analyze the convergence properties of the MDYHS+ method.

Lemma 7. Consider the MDYHS+ method described above. If $g_k \ne 0$ for all $k \ge 0$, then the steplength $\alpha_k$ is well-defined; namely, $\alpha_k$ can be found to satisfy (16) after a finite number of trials. The search direction $d_k$ satisfies the sufficient descent condition (21). Furthermore, $g_{k+1}^T d_k < 0$.
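Proof. We prove the desired results by induction. When $k = 0$, by using $d_0 = -g_0$, we have $g_0^T d_0 = -\|g_0\|^2 < 0$. We now show that the steplength $\alpha_0$ will be determined within a finite number of trials. Otherwise, for any $j = 0, 1, 2, \ldots$, the following inequality holds:
$$g(x_0 + \beta^j s_0 d_0)^T d_0 + \tfrac{1}{2}\max\{-\mu_0, 0\}\,\beta^j s_0 \|d_0\|^2 > \sigma g_0^T d_0, \tag{23}$$
where $0 < \sigma < 1$. Since $g$ is continuous, taking limits with respect to $j$ on both sides of (23) yields $g_0^T d_0 \ge \sigma g_0^T d_0$, which contradicts $g_0^T d_0 < 0$ and $\sigma < 1$. Then, $\alpha_0$ is well-defined and $g_1^T d_0 \le \sigma g_0^T d_0 < 0$.

Assume that (21) holds for $k - 1$ and that $g_k^T d_{k-1} < 0$. A discussion similar to the case $k = 0$ shows that $\alpha_k$ is well-defined. Multiplying $g_k^T$ by the direction update, we obtain (33). Combining (33) with (16), together with (21), yields the estimates used in what follows. Next, we follow [13] to consider two possible cases.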

Numerical Experiments
In this section, we report numerical experiments that test the performance of the MDYHS+ method and the MDYHS+1 method. One purpose of this section is to compare them with the DYHS method and the DYHS+ method. The other purpose is to confirm their broader application scope by solving boundary value problems. So, two classes of test problems were selected here. One class was drawn from the CUTEr library [17,18], and the other class came from [19]. More information is given in the following subsections.
For the MDYHS+ method, we set $\sigma = 10^{-4}$ and $\beta = 0.5$. For the MDYHS+1 method, we followed [8] and chose $\sigma = 0.9$ and $\delta = 0.1$. For the hybrid conjugate gradient methods DYHS and DYHS+, the values of $\delta$ and $\sigma$ in (5) and (6) were taken to be 0.01 and 0.1, respectively. The initial value of $\alpha_k$ was taken to be $1/\|g_0\|$ for the first iteration and $\alpha_{k-1} g_{k-1}^T d_{k-1} / (g_k^T d_k)$ for $k \ge 1$ (see [10]). For all the methods, the number of steplength trials at each iteration was limited to a maximum of 30, and the stopping criterion used was $\|g_k\|_\infty \le \epsilon$. In order to understand the numerical performance of each method in more depth, we ran numerical experiments with $\epsilon = 10^{-i}$, $i = 3, 6, 9, 12$. Our computations were carried out using Matlab R2012a on a desktop computer with an Intel(R) Xeon(R) 2.40 GHz CPU and 6.00 GB of RAM. The operating system was Linux (Ubuntu 8.04).
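The initial trial steplength and the stopping test described above are simple to express in code. The following Python sketch is illustrative only (the experiments themselves were run in Matlab); the function names and sample vectors are hypothetical.

```python
import numpy as np

def initial_trial_step(k, g, d, prev=None):
    """Initial trial steplength:
    1/||g_0|| for k = 0, and alpha_{k-1} * g_{k-1}^T d_{k-1} / (g_k^T d_k) for k >= 1.
    prev is the tuple (alpha_{k-1}, g_{k-1}, d_{k-1}) and is required when k >= 1."""
    if k == 0:
        return 1.0 / float(np.linalg.norm(g))
    alpha_prev, g_prev, d_prev = prev
    return alpha_prev * float(g_prev @ d_prev) / float(g @ d)

def converged(g, eps):
    """Stopping test on the infinity norm of the gradient."""
    return bool(np.linalg.norm(g, ord=np.inf) <= eps)

# Made-up vectors for illustration.
g0 = np.array([3.0, 4.0])
d0 = -g0
a0 = initial_trial_step(0, g0, d0)
g1 = np.array([1.0, 0.0])
d1 = -g1
a1 = initial_trial_step(1, g1, d1, prev=(a0, g0, d0))
```

The tolerance `eps` would take the values $10^{-3}$, $10^{-6}$, $10^{-9}$, and $10^{-12}$ in the setting described above.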

Tested by Some Problems in the CUTEr Library.
In this subsection, we implemented four different hybrid conjugate gradient methods and compared their numerical performance. The test problems are listed in Table 1. The first column "Prob." denotes the problem number, and the columns "Name" and "$n$" denote the name and the dimension of the problem, respectively. Since we were interested in large-scale problems, we only considered problems of size at least 100. The largest dimension was set to 10,000. Moreover, we accessed CUTEr functions from within Matlab R2012a through the Matlab interface.
Our numerical results are reported in Tables 2-6 as triples giving the number of iterations, the total number of line search trials, and the elapsed CPU time, respectively. For the DYHS+ and DYHS methods, we let $n_f$ and $n_g$ be the number of function evaluations and the number of gradient evaluations, respectively, and counted the line search trials as $n_f/3 + n_g$, following the cost model of automatic differentiation (see [10,20] for details). Moreover, "-" means that the method failed to achieve the prescribed accuracy before the number of iterations exceeded 50,000, and the test problems are represented in the form #Pro.($n$).
The performance of the four methods, relative to CPU time, was evaluated using the profiles of Dolan and Moré [21]. That is, for the four methods, we plotted the fraction $P$ of problems for which each of the methods was within a factor $\tau$ of the best time. Figures 1, 2, and 3 show the performance profiles with respect to CPU time, the number of iterations, and the total number of line search trials, respectively. These figures reveal that the MDYHS+ method and the MDYHS+1 method performed better than the DYHS method and the DYHS+ method. The performance profiles also show that the MDYHS+ method and the MDYHS+1 method were comparable and solved almost all of the test problems up to $\epsilon = 10^{-9}$. However, no convergence guarantee is available for the latter.
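For readers unfamiliar with performance profiles, the quantity plotted can be computed as in the following Python sketch. The function name and the timing data are hypothetical (the numbers are made up, not results from the tables), a failed run is marked by `np.inf`, and the sketch assumes every problem is solved by at least one method.

```python
import numpy as np

def performance_profile(times, taus):
    """times: array of shape (n_problems, n_solvers); np.inf marks a failure.
    Returns an array of shape (len(taus), n_solvers) whose entry [i, s] is the
    fraction of problems on which solver s is within a factor taus[i] of the
    best solver for that problem."""
    times = np.asarray(times, dtype=float)
    best = times.min(axis=1, keepdims=True)   # best time per problem
    ratios = times / best                     # performance ratios, >= 1
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])

# Made-up CPU times for three problems and two solvers.
times = [[1.0, 2.0],
         [4.0, 2.0],
         [3.0, np.inf]]     # the second solver fails on the third problem
profile = performance_profile(times, taus=[1.0, 2.0])
```

The first row of `profile` (factor 1) gives the fraction of problems on which each solver was fastest, and each column approaches the fraction of problems that solver eventually solves as the factor grows.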

Tested by Some Boundary Value Problems.
In this subsection, we implemented the MDYHS+ method and the MDYHS+1 method to solve some boundary value problems. See [22, Chapter 1] for the background of the boundary value problems.
In order to confirm the efficiency of the MDYHS+ method and the MDYHS+1 method in solving this class of problems, we drew a set of 11 boundary value problems from [19] and listed them in Table 7, where the test problems are expressed as #Pro.($n$) (#Pro. denotes the problem number in [19] and $n$ denotes the dimension), and the test results are listed as triples giving the number of iterations, the total number of line search trials, and the CPU time.
From Table 7, we can see that both the MDYHS+ method and the MDYHS+1 method are efficient in solving boundary value problems. The MDYHS+1 method seems a little better, but it has no convergence guarantee.

Conclusions
This paper has studied two modified versions of a Dai-Yuan hybrid conjugate gradient method with two different line searches that use only gradient information and has proven that, with the line search (9), the method is convergent in the sense that $\liminf_{k \to \infty} \|g_k\| = 0$. Then, we investigated the numerical behavior of the two modified versions over two classes of standard test problems. From the numerical results, we can conclude that the two modified hybrid conjugate gradient methods are more efficient (especially at high precision) in solving large-scale nonlinear unconstrained minimization problems and have a broader application scope. For example, they can be used to solve some boundary value problems, where the functions are not explicit.

This work was supported by National Science Foundation of China, no. 60974082.

Figure 1: Performance profiles based on CPU time.

Figure 2: Performance profiles based on the number of iterations.

Figure 3: Performance profiles based on the number of inner iterations.

Table 1: List of test problems from the CUTEr library.

Table 2: Numerical results for test problems from the CUTEr library.