A New Modified Three-Term Hestenes–Stiefel Conjugate Gradient Method with Sufficient Descent Property and Its Global Convergence

This paper describes a modified three-term Hestenes–Stiefel (HS) method. The original HS method is the earliest conjugate gradient method. Although the HS method achieves global convergence with an exact line search, this is not guaranteed for an inexact line search. In addition, the HS method does not, in general, satisfy the descent property. Our modified three-term conjugate gradient method possesses the sufficient descent property regardless of the type of line search and is globally convergent under the inexact Wolfe–Powell line search. The numerical efficiency of the modified three-term HS method is checked on 75 standard test functions. Three-term conjugate gradient methods are known to be numerically more efficient than two-term conjugate gradient methods; importantly, this paper quantifies how much better the three-term performance is. Thus, in the numerical results, we compare our new modification with an efficient two-term conjugate gradient method, as well as with a state-of-the-art three-term HS method. We conclude that the proposed modification is globally convergent and numerically efficient.


Introduction
In the field of optimization, conjugate gradient (CG) methods are a well-known approach for solving large-scale unconstrained optimization problems. CG methods are simple and have relatively modest storage requirements. This class of methods has a vast number of applications in different areas, especially in the field of engineering [1][2][3].
Consider the unconstrained optimization problem

min f(x), x ∈ R^n, (1)

where f : R^n → R is continuously differentiable with gradient g(x) = ∇f(x). CG methods generate iterates

x_{k+1} = x_k + α_k d_k. (2)

In (2), α_k > 0 is a step size obtained by a line search and d_k is a search direction given by

d_0 = −g_0, d_k = −g_k + β_k d_{k−1} for k ≥ 1,

where β_k is a parameter of the CG method and g_k = g(x_k). The six pioneering forms of β_k are defined in [4][5][6][7][8][9][10]. Line searches may be exact or inexact. Exact line searches are time consuming, computationally expensive, and difficult, and they require large amounts of storage [11][12][13]. Thus, inexact line search techniques are often adopted because of their efficiency and global convergence properties. A well-known inexact line search is the Wolfe–Powell line search

f(x_k + α_k d_k) ≤ f(x_k) + δ α_k g_k^T d_k, g(x_k + α_k d_k)^T d_k ≥ σ g_k^T d_k, (4)

with 0 < δ < σ < 1. The HS parameter is

β_k^HS = g_k^T y_{k−1} / (d_{k−1}^T y_{k−1}), where y_{k−1} = g_k − g_{k−1}.

This is known to be the first of all the CG parameters. The HS method ensures global convergence under an exact line search. A nice property of the HS method is that it satisfies the conjugacy condition regardless of whether the line search is exact or inexact [38]. However, this method does not satisfy the global convergence property when used with an inexact line search.
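The generic CG iteration and Wolfe line search described above can be sketched in Python. This is a minimal illustration, not the paper's implementation: the bisection line search, all function names, and the test problem are our own.

```python
import numpy as np

def strong_wolfe(f, grad, x, d, delta=1e-4, sigma=0.1, max_iter=80):
    """Strong Wolfe line search by bisection: find alpha with
    f(x+a*d) <= f(x) + delta*a*g'd  and  |grad(x+a*d)'d| <= sigma*|g'd|."""
    lo, hi, alpha = 0.0, np.inf, 1.0
    fx, gd = f(x), grad(x) @ d           # gd < 0 for a descent direction
    for _ in range(max_iter):
        dphi = grad(x + alpha * d) @ d
        if f(x + alpha * d) > fx + delta * alpha * gd or dphi > -sigma * gd:
            hi = alpha                   # step too long: shrink
        elif dphi < sigma * gd:
            lo = alpha                   # step too short: grow
        else:
            return alpha
        alpha = (lo + hi) / 2 if np.isfinite(hi) else 2 * alpha
    return alpha

def cg_hs(f, grad, x0, tol=1e-8, max_iter=200):
    """Two-term CG iteration x_{k+1} = x_k + alpha_k d_k with the HS parameter."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                               # d_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        alpha = strong_wolfe(f, grad, x, d)
        x = x + alpha * d
        g_new = grad(x)
        y = g_new - g                    # y_{k-1} = g_k - g_{k-1}
        beta = (g_new @ y) / (d @ y)     # Hestenes-Stiefel parameter
        d = -g_new + beta * d            # d_k = -g_k + beta_k d_{k-1}
        g = g_new
    return x

# Minimize the convex quadratic f(x) = 0.5 x'Ax - b'x (minimizer solves Ax = b)
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
x_star = cg_hs(lambda x: 0.5 * x @ A @ x - b @ x, lambda x: A @ x - b, np.zeros(2))
```

On a strongly convex quadratic with a near-exact line search, this recovers the classical linear CG behavior; with a looser line search, the descent property of the HS direction is no longer guaranteed, which is the motivation for the three-term modification studied in this paper.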
In this paper, the method of Zhang et al. [25] is modified with the help of another efficient CG parameter proposed by Wei et al. [39]. An attractive feature of the new three-term HS method is that it satisfies the sufficient descent condition regardless of the line search used. Furthermore, our modification is globally convergent for both convex and nonconvex functions when using an inexact line search. Numerical experiments show that the new modification is more efficient and robust than the MTTHS algorithm proposed by Zhang et al. [25]. The second aim of this paper is to quantify the improvement of three-term CG methods over two-term approaches. To do this, we consider the efficient two-term DHS method of Dai and Wen [40]. The DHS method is one of the more efficient CG techniques, as it possesses the sufficient descent property and offers global convergence under the Wolfe–Powell line search conditions, and the numerical results given by this method are also convincing. Therefore, this two-term CG method is compared with our new modification to quantify the improvement offered by three-term CG methods. The remainder of this paper is organized as follows. In Section 2, the motivation for and construction of the three-term HS CG method are discussed, and the general form is presented in Algorithm A. Section 3 is divided into two subsections: Section 3.1 covers the sufficient descent condition and the global convergence properties for convex and nonconvex functions, and Section 3.2 presents detailed numerical results to evaluate the proposed method. Finally, Section 4 concludes this paper.

Motivation and Formulas
Zhang et al. [25] proposed the first three-term HS (TTHS) method, whose search direction can be written as

d_k = −g_k + β_k^HS d_{k−1} − θ_k y_{k−1}, θ_k = g_k^T d_{k−1} / (d_{k−1}^T y_{k−1}). (8)

TTHS satisfies the descent property, and if an exact line search is used (so that g_k^T d_{k−1} = 0), it reduces to the original HS method. Further, to guarantee the global convergence properties of the search direction given by (8), a modified (MTTHS) algorithm was introduced with a truncated search direction (see [25] for its precise form). As MTTHS was introduced to prove the global convergence properties of the search direction in (8), the question arises as to why (8) itself is not used to prove these properties. Instead of discarding (8), it should be made efficient and globally convergent. Thus, there is room to modify (8) so that it satisfies the global convergence properties, and such a modification can be expected to outperform the MTTHS algorithm numerically.
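Assuming the Zhang et al. direction takes the commonly cited form d_k = −g_k + β_k^HS d_{k−1} − θ_k y_{k−1} with θ_k = g_k^T d_{k−1} / (d_{k−1}^T y_{k−1}) (our reading of [25], not a verbatim transcription), the descent identity g_k^T d_k = −‖g_k‖^2 can be checked numerically in a few lines:

```python
import numpy as np

def tths_direction(g, d_prev, y_prev):
    """Three-term HS direction (assumed form of Zhang et al. [25]):
    d = -g + beta*d_prev - theta*y_prev, where
    beta = g'y_prev / d_prev'y_prev and theta = g'd_prev / d_prev'y_prev."""
    denom = d_prev @ y_prev
    beta = (g @ y_prev) / denom
    theta = (g @ d_prev) / denom
    return -g + beta * d_prev - theta * y_prev

rng = np.random.default_rng(0)
g, d_prev, y_prev = rng.normal(size=(3, 5))   # arbitrary vectors
d = tths_direction(g, d_prev, y_prev)

# The beta and theta terms cancel in g'd, so the sufficient descent
# condition g'd = -||g||^2 holds for ANY line search; by Cauchy-Schwarz
# this also implies ||g|| <= ||d||.
print(np.isclose(g @ d, -g @ g))  # True
```

The cancellation is purely algebraic: β_k (g_k^T d_{k−1}) and θ_k (g_k^T y_{k−1}) share the same numerator product, so no property of the step size is used.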
Wei et al. [39] proposed an efficient CG parameter. It is known that the HS method does not converge globally when the objective function is nonconvex. Further, Gilbert and Nocedal [41] showed that the parameter β_k must be nonnegative to achieve convergence for nonconvex or nonlinear functions, i.e., β_k^+ = max{β_k, 0}. Applying the same nonnegativity restriction to our parameter gives the new three-term parameters, which involve a constant μ > 1. If the line search is exact, these parameters reduce to the original parameters β_k^HS [4], β_k^HS+ [41], and TTHS [25], respectively. The procedure of our proposed three-term CG method is described in Algorithm A.
Algorithm A
Step 1. Given an initial point x_0 and a tolerance ε > 0, compute g_0 and set d_0 = −g_0, k = 0.
Step 2. If ‖g_k‖ ≤ ε, stop.
Step 3. Determine the step size α_k > 0 by the Wolfe line search (4).
Step 4. Set x_{k+1} = x_k + α_k d_k, compute g_{k+1} and the new search direction d_{k+1} from the proposed three-term formula, set k = k + 1, and go to Step 2.

Results and Discussion
This section contains a theoretical discussion and numerical results. The first subsection considers the global convergence properties of our proposed method, and the second presents the results of numerical computations.

Global Convergence Properties

We make the following standard assumptions.
(A1) The level set Ω = {x ∈ R^n : f(x) ≤ f(x_0)} is bounded.
(A2) In some neighborhood N of Ω, f is continuously differentiable and its gradient g(x) is Lipschitz continuous on an open convex set that contains Ω; that is, there exists a constant L > 0 such that ‖g(x) − g(y)‖ ≤ L‖x − y‖ for all x, y ∈ N.

Assumptions (A1) and (A2) imply that there exist positive constants B and γ such that ‖x‖ ≤ B and ‖g(x)‖ ≤ γ for all x ∈ N. We now prove that the sufficient descent condition g_k^T d_k = −‖g_k‖^2 holds independently of the line search, and also that ‖g_k‖ ≤ ‖d_k‖. From (15), (11), and (14), multiplying the search direction by g_k^T shows that the two correction terms cancel; that is, g_k^T d_k = −‖g_k‖^2. Hence, the sufficient descent condition holds regardless of the line search. Moreover, by the Cauchy–Schwarz inequality, ‖g_k‖^2 = |g_k^T d_k| ≤ ‖g_k‖‖d_k‖, so ‖g_k‖ ≤ ‖d_k‖.
The HS method is well known for satisfying the conjugacy condition

d_k^T y_{k−1} = 0. (27)

By [15], CG methods that inherit (27) are more efficient than those whose parameters do not inherit this property. Dai and Liao [42] proposed the following conjugacy condition for an inexact line search:

d_k^T y_{k−1} = −t g_k^T s_{k−1}, t ≥ 0, (28)

where s_{k−1} = x_k − x_{k−1}. Under an exact line search, g_k^T s_{k−1} = 0, so (28) reduces to the conjugacy condition in (27).
Lemma 1 (see [43]). Suppose there is an initial point x_0 for which Assumptions (A1) and (A2) hold. Consider any method of the form (2) in which d_k is a descent direction and α_k satisfies the Wolfe line search conditions (4). Then

∑_{k≥0} (g_k^T d_k)^2 / ‖d_k‖^2 < ∞.

This is known as Zoutendijk's condition and is used for proving the global convergence of CG methods. Together with the sufficient descent condition (26), it shows that

∑_{k≥0} ‖g_k‖^4 / ‖d_k‖^2 < ∞. (30)

Definition 2. The function f is called uniformly convex [36] on R^n if there exists a positive constant μ such that

(g(x) − g(y))^T (x − y) ≥ μ‖x − y‖^2 for all x, y ∈ R^n. (31)

We now show the global convergence of Algorithm A for uniformly convex functions.
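As a concrete instance of Definition 2, a strictly convex quadratic f(x) = 0.5 x^T A x is uniformly convex with modulus μ = λ_min(A), since (g(x) − g(y))^T (x − y) = (x − y)^T A (x − y) ≥ λ_min(A) ‖x − y‖^2 by the Rayleigh-quotient bound. A quick numerical check (our own sketch, not from the paper):

```python
import numpy as np

# f(x) = 0.5 x'Ax with A symmetric positive definite is uniformly convex:
# (grad f(x) - grad f(y))'(x - y) = (x - y)'A(x - y) >= lambda_min(A)*||x - y||^2.
A = np.array([[4.0, 1.0], [1.0, 3.0]])
grad = lambda x: A @ x
mu = np.linalg.eigvalsh(A).min()        # modulus of uniform convexity

rng = np.random.default_rng(1)
x, y = rng.normal(size=(2, 2))          # two random points
lhs = (grad(x) - grad(y)) @ (x - y)
print(lhs >= mu * np.linalg.norm(x - y) ** 2)  # True
```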
Lemma 3. Let the sequences {g_k} and {d_k} be generated by Algorithm A, and suppose that (31) holds. Then the step sizes α_k are bounded away from zero, as stated in (32). Proof. For details, see Lemma 2.1 of [44].

Using the second Wolfe condition in (4) together with the sufficient descent condition, and then combining (11), (32), and (36) with Assumption (A2), the global convergence of Algorithm A for uniformly convex functions follows. We now prove the global convergence of Algorithm A for nonconvex functions.
The proof has two parts. First, suppose, to derive a contradiction, that lim inf_{k→∞} ‖g_k‖ ≠ 0. Then there exists a positive constant ε such that ‖g_k‖ > ε > 0 for all k. The resulting bound, labeled (52), contradicts Assumption (A2) and (30). Therefore, lim inf_{k→∞} ‖g_k‖ = 0.

Numerical Discussion

We now report the results of several numerical experiments. Zhang et al. [25] demonstrated the superior numerical efficiency of the MTTHS algorithm with respect to PRP+ [41], CG_DESCENT [45], and L-BFGS [46] under the Wolfe line search, while Dai and Wen [40] reported the numerical efficiency of the DHS method. Thus, we compare the efficient three-term HS method proposed in this paper (named the Bakhtawar–Zabidin–Ahmad method, BZA) with MTTHS [25] and DHS [40]. The BZA method was implemented with the Wolfe–Powell line search (4) using δ = 0.1, σ = 0.5, and μ = 2. All codes were written in MATLAB 7.1 and run on an Intel Core i5 system with 8.0 GB of RAM and a 2.60 GHz processor. Table 1 lists the numerical results given by BZA, MTTHS, and DHS for a number of test functions. In Table 1, NI/CT/GE/FE denote the number of iterations, the CPU time, the number of gradient evaluations, and the number of function evaluations, respectively.
According to Moré et al. [47], the efficiency of a method can be determined by its performance on a number of test functions. The number of test functions should be neither too large nor too small, with 75 considered ideal for testing the efficiency of a method. The test functions in Table 1 were taken from Andrei's test function collection [48], with standard initial points and dimensions ranging from 2 to 10000.
If the solution had not converged after 500 seconds, the program was terminated. Convergence was generally achieved within this time limit; functions for which the limit was exceeded are denoted by "F" (Fail) in Table 1.
The Sigma plotting software was used to graph the data. We adopt the performance profiles of Dolan and Moré [49]. Thus, MTTHS, DHS, and BZA are compared in terms of NI/CT/GE/FE in Figures 1-4. For each method, we plot the fraction ρ of problems solved within a factor τ of the best performance; in the figures, the uppermost curve corresponds to the method that solves the most problems within a factor τ of the best. From Table 1 and Figures 1-4, the BZA method outperforms the MTTHS algorithm and the DHS method in terms of NI, CT, GE, and FE.
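The Dolan–Moré profile values themselves are easy to compute. The sketch below (with made-up costs, not the data of Table 1) shows how the fraction ρ_s(τ) of problems solved by solver s within a factor τ of the best is obtained:

```python
import numpy as np

def performance_profile(costs, taus):
    """Dolan-More performance profile. `costs` is an (n_problems, n_solvers)
    array of a metric such as NI, CT, GE, or FE; np.inf marks a failure.
    Returns rho[t, s] = fraction of problems on which solver s is within a
    factor taus[t] of the best solver for that problem."""
    best = costs.min(axis=1, keepdims=True)   # best cost per problem
    ratios = costs / best                     # performance ratios r_{p,s}
    return np.array([[np.mean(ratios[:, s] <= tau)
                      for s in range(costs.shape[1])] for tau in taus])

# Hypothetical iteration counts: 4 problems, 2 solvers; solver 1 fails once
costs = np.array([[10.0, 12.0],
                  [20.0, 18.0],
                  [5.0, 15.0],
                  [np.inf, 30.0]])
rho = performance_profile(costs, taus=[1.0, 2.0, 3.0])
print(rho[0])  # fraction of problems on which each solver is best -> [0.5 0.5]
```

Plotting rho against τ for each solver reproduces the curves of Figures 1-4: the higher and further left a curve, the more efficient and robust the solver.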
The BZA method solves around 99.5% of the problems, and its performance is 85% better than that of DHS and 77% better than that of MTTHS. We can also conclude that, on this test set, three-term conjugate gradient methods are on average 85% better than two-term conjugate gradient methods (represented by DHS).

Conclusion
We have proposed a modified three-term HS conjugate gradient method. An attractive property of the proposed method is that it produces the sufficient descent condition g_k^T d_k = −‖g_k‖^2 regardless of the line search. The global convergence of the proposed method has been established under the Wolfe line search conditions. Numerical results show that the proposed method is more efficient and robust than state-of-the-art three-term (MTTHS) and two-term (DHS) CG methods.

Figure 1: Performance profiles based on number of iterations.

Figure 2: Performance profiles based on CPU time.
Figure 3: Performance profiles based on gradient evaluations.

Figure 4: Performance profiles based on function evaluations.