Two Modified Three-Term Type Conjugate Gradient Methods and Their Global Convergence for Unconstrained Optimization

Two modified three-term type conjugate gradient algorithms that satisfy both the descent condition and the Dai-Liao type conjugacy condition are presented for unconstrained optimization. The first algorithm is a modification of the Hager and Zhang type algorithm such that the search direction is a descent direction and satisfies the Dai-Liao type conjugacy condition. The second, simpler three-term type conjugate gradient method generates sufficient descent directions at every iteration; moreover, this property is independent of the step-length line search. The algorithms can also be regarded as modifications of the MBFGS method, but with a different z. Under some mild conditions, the given methods are globally convergent, which is independent of the Wolfe line search for general functions. The numerical experiments show that the proposed methods are very robust and efficient.

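The conjugate gradient method is very efficient for large-scale optimization problems. Generally, this method generates a sequence of iterates

$$x_{k+1} = x_k + \alpha_k d_k, \tag{2}$$

where the step size $\alpha_k$ is obtained by carrying out some line search rule. The line search in conjugate gradient algorithms is often based on the following Wolfe conditions:

$$f(x_k + \alpha_k d_k) - f(x_k) \le \rho \alpha_k g_k^T d_k, \qquad g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k,$$

where $0 < \rho \le \sigma < 1$. The direction $d_k$ is defined by

$$d_0 = -g_0, \qquad d_{k+1} = -g_{k+1} + \beta_k d_k,$$

where $\beta_k$ is the conjugate gradient parameter. Some conjugate gradient methods include the Fletcher-Reeves method (FR), the Hestenes-Stiefel method (HS), the Polak-Ribiere-Polyak method (PRP), the conjugate descent method (CD), the Liu-Storey method (LS), and the Dai-Yuan method (DY) [2-8]. These methods differ only in the parameter $\beta_k$, which is specified as follows:

$$\beta_k^{FR} = \frac{\|g_{k+1}\|^2}{\|g_k\|^2}, \quad \beta_k^{HS} = \frac{g_{k+1}^T y_k}{d_k^T y_k}, \quad \beta_k^{PRP} = \frac{g_{k+1}^T y_k}{\|g_k\|^2},$$

$$\beta_k^{CD} = -\frac{\|g_{k+1}\|^2}{d_k^T g_k}, \quad \beta_k^{LS} = -\frac{g_{k+1}^T y_k}{d_k^T g_k}, \quad \beta_k^{DY} = \frac{\|g_{k+1}\|^2}{d_k^T y_k},$$

where $s_k = x_{k+1} - x_k = \alpha_k d_k$, $g_k = \nabla f(x_k)$, and $y_k = g_{k+1} - g_k$. Throughout this paper, we always use $\|\cdot\|$ to mean the Euclidean norm.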
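As an illustration of the quantities named above, the following minimal sketch evaluates the six classical parameters and tests the two Wolfe conditions for a trial step. It assumes the standard textbook forms of these formulas; the function names `cg_betas` and `satisfies_wolfe`, the constant names `rho` and `sigma`, and their default values are hypothetical choices made for this example only.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def cg_betas(g_new, g_old, d_old):
    """Classical conjugate gradient parameters (standard textbook forms).

    g_new, g_old: gradients at x_{k+1} and x_k; d_old: direction d_k.
    """
    y = [a - b for a, b in zip(g_new, g_old)]  # y_k = g_{k+1} - g_k
    gn2 = dot(g_new, g_new)
    go2 = dot(g_old, g_old)
    return {
        "FR": gn2 / go2,
        "HS": dot(g_new, y) / dot(d_old, y),
        "PRP": dot(g_new, y) / go2,
        "CD": -gn2 / dot(d_old, g_old),
        "LS": -dot(g_new, y) / dot(d_old, g_old),
        "DY": gn2 / dot(d_old, y),
    }


def satisfies_wolfe(f, grad, x, d, alpha, rho=1e-4, sigma=0.9):
    """Check the two Wolfe conditions for step size alpha along d.

    rho and sigma are assumed example values with 0 < rho <= sigma < 1.
    """
    x_new = [xi + alpha * di for xi, di in zip(x, d)]
    slope = dot(grad(x), d)  # g_k^T d_k, negative for a descent direction
    sufficient_decrease = f(x_new) - f(x) <= rho * alpha * slope
    curvature = dot(grad(x_new), d) >= sigma * slope
    return sufficient_decrease and curvature
```

On a strictly convex quadratic with an exact line search and $d_k = -g_k$, all six parameters coincide, which gives a quick sanity check of the formulas.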

Motivation
The motivation of this paper is as follows.
Firstly, we introduce some modified three-term conjugate gradient methods. One of the first three-term conjugate gradient methods was proposed by Beale [9] as

$$d_{k+1} = -g_{k+1} + \beta_k d_k + \gamma_k d_t,$$

where $\beta_k = \beta_k^{HS}$ ($\beta_k^{FR}$, $\beta_k^{DY}$, etc.) and $d_t$ is a restart direction. McGuire and Wolfe [10] and Powell [11] further studied Beale's three-term conjugate gradient method. Dai and Yuan [12] studied the general three-term conjugate gradient method

$$d_{k+1} = -g_{k+1} + \beta_k d_k + \gamma_k d_{t(p)},$$

where $t(p)$ is the number of the $p$th restart iteration, and showed that under some mild conditions the algorithm is globally convergent. As we know, Dai and Liao [13] extended the classical conjugacy condition $y_k^T d_{k+1} = 0$, suggesting the following one:

$$y_k^T d_{k+1} = -t\, g_{k+1}^T s_k,$$

where $t \ge 0$ is a scalar. Recently, Andrei [14,15] developed two simple three-term conjugate gradient methods for unconstrained optimization problems. In [14], the three-term conjugate gradient algorithm satisfied both the descent condition and the Dai-Liao type conjugacy condition at every step. The direction $d_{k+1}$ was computed as

$$d_{k+1} = -g_{k+1} - \delta_k s_k - \eta_k y_k,$$

where $\delta_k$ and $\eta_k$ were parameters. Similarly, Andrei [15] presented another three-term conjugate gradient algorithm. The search direction of the algorithms from this class had three terms and was computed as a modification of the classical conjugate gradient algorithms. The search direction also satisfied both the descent property and the Dai-Liao type conjugacy condition. Yao and Qin [16] proposed a hybrid of the DL and WYL conjugate gradient methods. The given method [17] possessed the sufficient descent condition under the Wolfe-Powell line search and is globally convergent for general functions. Thereafter, they proposed a new conjugacy condition and a nonlinear conjugate gradient method; the given method [17] was globally convergent under the strong Wolfe-Powell line search. The above methods, which satisfy a conjugacy condition of this type, were obtained by modifying the HS method or the Hager and Zhang method [18].
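The Dai-Liao conjugacy condition discussed above can be checked numerically. The sketch below uses the classical two-term Dai-Liao direction, with $\beta_k = g_{k+1}^T (y_k - t s_k) / (d_k^T y_k)$, as a stand-in; it is not one of the three-term directions of this paper. The function name `dai_liao_direction` and the default value of `t` are assumptions made for illustration.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def dai_liao_direction(g_new, d_old, s, y, t=0.1):
    """Classical two-term Dai-Liao direction (illustrative stand-in).

    By construction, dot(d_new, y) == -t * dot(g_new, s) whenever
    dot(d_old, y) is nonzero.
    """
    y_minus_ts = [yi - t * si for yi, si in zip(y, s)]
    beta = dot(g_new, y_minus_ts) / dot(d_old, y)
    return [-gi + beta * di for gi, di in zip(g_new, d_old)]
```

Expanding $y_k^T d_{k+1} = -g_{k+1}^T y_k + \beta_k d_k^T y_k$ with this $\beta_k$ leaves exactly $-t\, g_{k+1}^T s_k$, which is the identity the code relies on.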
In this paper, we present two modified simple three-term type conjugate gradient methods, which are obtained from a modified BFGS (MBFGS) updating scheme of the inverse approximation of the Hessian of the function $f(x)$, restarted as the identity matrix at every step. Firstly, in order to explain the idea of this paper, it is necessary to introduce the MBFGS method [1]. If the objective function $f(x)$ is nonconvex, the classical Newton direction may not be a descent direction of $f(x)$ at $x_k$, since the Hessian matrix $G_k$ is not necessarily positive definite. To overcome this drawback, Li and Fukushima [1] generated a direction from

$$(G_k + r_k I) d + g_k = 0,$$

where $I$ is the unit matrix and the positive constant $r_k$ is chosen so that $G_k + r_k I$ is a positive definite matrix. To obtain global convergence, the sequence $\{r_k\}$ must be bounded above. For superlinear convergence, the positive constants $r_k$ should satisfy $r_k \to 0$ as $k \to \infty$. For quadratic convergence of the MBFGS method, the positive constants $r_k$ should satisfy $r_k \le C\|g_k\|$ for all $k$.
In other words, the global and superlinear convergence of the MBFGS method depends on the choice of $\{r_k\}$. Therefore, it is important to select $\{r_k\}$ appropriately so that it is practicable and satisfies the above conditions. Selecting the optimal $\{r_k\}$ is very difficult. The MBFGS method is a modified quasi-Newton method for nonconvex unconstrained optimization problems. To better introduce our method, let us briefly recall it. The direction $d_k$ in the MBFGS method is given by

$$B_k d_k + g_k = 0,$$

where $B_k$ is obtained by the MBFGS formula

$$B_{k+1} = B_k - \frac{B_k s_k s_k^T B_k}{s_k^T B_k s_k} + \frac{z_k z_k^T}{z_k^T s_k}.$$

In the above MBFGS formula, $z_k = y_k + r_k s_k$ is computed in one of the following two ways. On the one hand, $y_k = g_{k+1} - g_k$ and $r_k \in [0, C]$, where $C$ is a constant. On the other hand, $r_k = t_k \|g_k\|$ and $t_k = 1 + \max\{0, -y_k^T s_k / \|s_k\|^2\}$.
The MBFGS update of the inverse approximation of the Hessian of the function $f(x)$ is given by (16). To achieve our objective, we make a small modification to the inverse MBFGS matrix $H_k$, namely, $H_k = I$; then (16) is reformulated accordingly, but with a different $z_k$. In this paper, $z_k$ is computed by a formula that does not include $r_k$, so we need not choose the parameters $\{r_k\}$. Furthermore, $z_k$ carries more information, for instance, the gradient $g_k$ of $f(x)$ at $x_k$ and two constants. The most important property of $z_k$ is that the resulting direction satisfies the Dai-Liao condition. Moreover, $z_k$ also satisfies an inequality that differs from the one that Li and Fukushima [1] proved for their $z_k$. The purpose of this paper is to overcome these drawbacks.
Observe that the direction $d_{k+1}$ can be written as a combination of three terms; therefore, the three-term type conjugate gradient algorithm is given by (2) together with this direction.

We organize the paper as follows. In the next section, we describe the three-term type conjugate gradient method and its global convergence. In Section 4, we discuss another modified sufficient descent three-term type conjugate gradient method for unconstrained optimization problems and its global convergence with the Wolfe line search. In Section 5, some numerical results are given. Some conclusions and future work are presented in the last section.

Three-Term Conjugate Gradient Method and Its Global Convergence
In this section, we will introduce the three-term type conjugate gradient method.
Step 2. Test a criterion for stopping the iterations. If the test criterion $\|g_k\| \le \varepsilon$ is satisfied, then stop; otherwise, continue with Step 3.
Step 3. Compute the direction $d_k$ by (25), in which the two parameters are positive constants.
Step 4. Use the Wolfe line search to find a step size $\alpha_k$ satisfying the Wolfe conditions (28), whose parameters are constants in $(0, 1)$.
Step 5. Update the variables and then go to Step 2.
(25); then one has

Global
Proof. We divide (25) into two cases as follows.
Case 2. If  > 0, combined with Lemma 2, then we have The proof of Lemma 3 is completed.
Lemma 4. Let assumptions (H1) and (H2) hold, let the line search satisfy the Wolfe conditions (28), and let the search direction be computed by (25); then the Dai-Liao type conjugacy condition holds with $t_k > 0$.
Proof. By direct computation, we get the Dai-Liao type conjugacy condition, where $t_k > 0$ since $y_k^T s_k > 0$.
Theorem 6.Let assumptions (H1) and (H2) hold; the line search satisfies the Wolfe conditions (28); the search direction is computed by (25); then one has where where  = 1.

Another Modified Three-Term Type Conjugate Gradient Method and Its Global Convergence
Recently, Zhang et al. proposed a sufficient descent modified PRP conjugate gradient method with three terms [20] as

$$d_{k+1} = -g_{k+1} + \beta_k^{PRP} d_k - \theta_k y_k, \qquad \theta_k = \frac{g_{k+1}^T d_k}{\|g_k\|^2},$$

and a sufficient descent modified HS conjugate gradient method with three terms [21] as

$$d_{k+1} = -g_{k+1} + \beta_k^{HS} d_k - \theta_k y_k, \qquad \theta_k = \frac{g_{k+1}^T d_k}{d_k^T y_k}.$$

A property of these methods is that they produce sufficient descent directions; that is,

$$g_k^T d_k = -\|g_k\|^2.$$

In the same context, Sun et al. proposed another sufficient descent conjugate gradient method [22,23]. Similar to the methods of Zhang and Sun, in order to obtain the sufficient descent property, we propose a modified three-term type conjugate gradient method.

Algorithm 9 (a sufficient descent three-term conjugate gradient method).
Step 1. Initialization and data.
Step 3. Use the Wolfe line search to find a step size $\alpha_k$.
Lemma 10. Suppose that $x_0$ is a starting point for which the assumptions hold. Let $\{x_k\}$ be generated by Algorithm 9; then one has
Proof. According to Algorithm 9, we divide (54) into two cases as follows.
Case 2. If  > 0, we get Lemma 10 is true.The sufficient descent property is independent of the line search.
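To illustrate how a sufficient descent property can hold independently of the line search, the sketch below uses the three-term PRP direction of Zhang et al. [20] in its commonly stated form, not the direction of Algorithm 9. For this stand-in direction, $g_{k+1}^T d_{k+1} = -\|g_{k+1}\|^2$ holds for any previous direction and any step size. The function name `three_term_prp_direction` is hypothetical.

```python
def dot(u, v):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def three_term_prp_direction(g_new, g_old, d_old):
    """Three-term PRP direction in the commonly stated form of Zhang et al.

    d_{k+1} = -g_{k+1} + beta_PRP * d_k - theta * y_k, with
    beta_PRP = g_{k+1}^T y_k / ||g_k||^2 and
    theta    = g_{k+1}^T d_k / ||g_k||^2.
    """
    y = [a - b for a, b in zip(g_new, g_old)]
    go2 = dot(g_old, g_old)
    beta = dot(g_new, y) / go2
    theta = dot(g_new, d_old) / go2
    return [-g + beta * d - theta * yi for g, d, yi in zip(g_new, d_old, y)]
```

The second and third terms cancel in the inner product with $g_{k+1}$, because $\beta_k\, g_{k+1}^T d_k = \theta_k\, g_{k+1}^T y_k$; no property of the step size is used, which is why the descent identity does not depend on the line search.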
Lemma 11. Suppose that $x_0$ is a starting point for which the assumptions hold. Let $\{x_k\}$ be generated by Algorithm 9. In addition, there are constants $c_1$, $c_2$, such that where
Theorem 12. Let assumption (H1) hold and let $\{x_k\}$ and $\{d_k\}$ be generated by the three-term conjugate gradient Algorithm 9. If the step size $\alpha_k$ is obtained by the Wolfe line search, then one has
Proof. The result follows from Lemmas 10 and 11 and the same argument as Theorem 4.1 in [19]; we omit the proof here.

Numerical Experiments
We now report some numerical results obtained with our sufficient descent conjugate gradient methods. We compare the performance of Algorithms 1 and 9 with the MBFGS method. The algorithms are implemented in MATLAB 7.0 in double precision arithmetic. The tests are performed on a PC with a Pentium 4 2.40 GHz CPU running Windows XP. On the one hand, the type of objective function and the character of the problems being tested are listed in Tables 1, 2, and 3. In the experiments, for ease of comparison with other codes, we use the gradient error to measure the quality of the solutions; that is, we stop the iteration when the norm of $g_k$, the gradient of the objective function, falls below a prescribed tolerance. The tables report the number of dimensions, the number of functions, the number of function evaluations (Nf), the number of gradient evaluations (Ng), and the actual CPU time spent. An asterisk (*) marks problems whose approximate solutions are within the allowed error range, and a separate mark denotes the problems taken from [24].

First, we compare Algorithm 1, Algorithm 9, and the MBFGS method; the numerical results and the comparison are listed in Tables 1-3. The test problems with the given initial points, collected by Neculai Andrei, can be found at http://camo.ici.ro/neculai/ansoft.htm. We can see that the problems of [22] are solved by our method. Table 1 shows the numerical results of the three-term conjugate gradient Algorithm 1. In Table 2, obtained with Algorithm 9, these problems have better solutions. Table 3 shows the numerical results of the MBFGS method in terms of CPU time, the number of iterations, the number of function evaluations, and the number of gradient evaluations. In Table 2, the CPU time is less than 130 seconds, and for many of the problems it is less than 70 seconds. In Table 1, the Penalty function II and Power singular problems require more than 110 iterations, whereas in Table 2 they are solved in fewer than 86 iterations and in less than 72 seconds. With respect to the number of iterations and CPU time, the MBFGS results in Table 3 are better than those in Table 1 but worse than those in Table 2. Tables 1 and 2 show that the three-term type conjugate gradient methods also perform better with respect to the number of iterations and gradient evaluations. From Tables 1, 2, and 3, we conclude that Algorithm 9 is better than Algorithm 1 and the MBFGS method; that is, the sufficient descent direction is most important for unconstrained optimization.

On the other hand, some of the test problems are from the CUTE collection established by Bongartz, Conn, Gould, and Toint. In Figure 1, we adopt the performance profiles proposed by Dolan and Moré [25] to compare the CPU time of Algorithm 1, Algorithm 9, the MBFGS method, Dai's method, and the MPRP method. That is, for each method, we plot the fraction P of problems for which the method is within a factor $\tau$ of the best time. The left side of the figure gives the percentage of the test problems for which a method is the fastest. The right side gives the percentage of the test problems that are successfully solved by each of the methods. The top curve corresponds to the method that solved the most problems in a time within a factor $\tau$ of the best time. From Figure 1, we can see that Algorithm 1 and Algorithm 9 perform better than the MBFGS method, Dai's method, and the MPRP method in [1,19,20]. Hence, the proposed methods not only possess better global convergence but also are superior to the three-term type conjugate gradient methods [1,19,20] in numerical performance.
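The performance profile described above can be computed as in the following minimal sketch. The input layout (a dictionary from solver name to a list of strictly positive CPU times, with `math.inf` marking a failure) and the function name `performance_profile` are assumptions made for this example; the timing data of the paper are not reproduced here.

```python
import math


def performance_profile(times, taus):
    """Dolan-More performance profile.

    times: dict mapping solver name -> list of positive CPU times,
           one per problem (math.inf marks a failed run).
    taus:  factors at which the profile is evaluated.
    Returns a dict mapping solver name -> list of fractions P(tau).
    """
    solvers = list(times)
    n_problems = len(times[solvers[0]])
    # Best (smallest) time achieved on each problem by any solver.
    best = [min(times[s][p] for s in solvers) for p in range(n_problems)]
    profile = {}
    for s in solvers:
        ratios = [times[s][p] / best[p] for p in range(n_problems)]
        profile[s] = [
            sum(1 for r in ratios if r <= tau) / n_problems for tau in taus
        ]
    return profile
```

The value at $\tau = 1$ is the fraction of problems on which a solver is the fastest, and the value for large $\tau$ approaches the fraction of problems that the solver solves at all, matching the reading of the left and right sides of the figure.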

Conclusions
In this paper, on the one hand, we improve a three-term type conjugate gradient method that is obtained from the MBFGS update. On the other hand, we present another sufficient descent three-term type conjugate gradient method. In addition, under appropriate conditions, we show that the two methods are globally convergent. Finally, numerical experiments demonstrate the efficiency of the proposed methods. Further work should investigate more useful, powerful, and practical algorithms for solving large-scale unconstrained optimization problems, for instance, hybrid conjugate gradient-GA and conjugate gradient-PSO methods.

Table 1: The numerical results of Algorithm 1.

Table 2: The numerical results of Algorithm 9.