Modification of Nonlinear Conjugate Gradient Method with Weak Wolfe-Powell Line Search

Abstract and Applied Analysis


Introduction
The nonlinear CG method is a useful tool for finding the minimum of a function in unconstrained optimization problems. Consider problems of the form
\[ \min f(x), \quad x \in \mathbb{R}^n, \tag{1} \]
where $f : \mathbb{R}^n \to \mathbb{R}$ is continuously differentiable and its gradient is denoted by $g(x) = \nabla f(x)$. The method generates a sequence of points $\{x_k\}$, starting from an initial point $x_0 \in \mathbb{R}^n$, by the iterative formula
\[ x_{k+1} = x_k + \alpha_k d_k, \tag{2} \]
where $x_k$ is the current iterate and $\alpha_k > 0$ is the step size obtained by some line search. The search direction $d_k$ is defined by
\[ d_k = \begin{cases} -g_k, & k = 0, \\ -g_k + \beta_k d_{k-1}, & k \ge 1, \end{cases} \tag{3} \]
where $g_k = g(x_k)$ and $\beta_k$ is known as the conjugate gradient coefficient.
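The iteration (2)-(3) can be sketched as follows. This is a minimal illustration only: the PRP coefficient, the Armijo backtracking line search, and the restart safeguard are all assumed choices for the sketch, not the method proposed in this paper.

```python
import numpy as np

def backtracking(f, g, x, d, delta=1e-3, rho=0.5):
    """Armijo backtracking: shrink alpha until
    f(x + alpha*d) <= f(x) + delta*alpha*g(x)^T d."""
    alpha, fx, gtd = 1.0, f(x), g(x) @ d
    while f(x + alpha * d) > fx + delta * alpha * gtd and alpha > 1e-16:
        alpha *= rho
    return alpha

def cg_prp(f, g, x0, tol=1e-6, max_iter=500):
    """Nonlinear CG iteration (2)-(3) with the PRP coefficient
    (an illustrative choice; not the coefficient of this paper)."""
    x = np.asarray(x0, dtype=float)
    gk = g(x)
    d = -gk                                       # d_0 = -g_0
    for _ in range(max_iter):
        if np.linalg.norm(gk) <= tol:
            break
        alpha = backtracking(f, g, x, d)
        x = x + alpha * d                         # x_{k+1} = x_k + alpha_k d_k
        g_new = g(x)
        beta = (g_new @ (g_new - gk)) / (gk @ gk)  # beta^PRP
        d = -g_new + beta * d                     # d_{k+1} = -g_{k+1} + beta d_k
        if g_new @ d >= 0.0:                      # safeguard: keep d a descent direction
            d = -g_new
        gk = g_new
    return x

# Example: minimize the convex quadratic f(x) = 0.5 x^T A x - b^T x
A = np.array([[3.0, 1.0], [1.0, 2.0]])
b = np.array([1.0, 1.0])
f = lambda x: 0.5 * x @ A @ x - b @ x
g = lambda x: A @ x - b
x_star = cg_prp(f, g, np.zeros(2))                # minimizer solves A x = b
```

For the quadratic above the minimizer is the solution of $Ax = b$, which the sketch recovers to the requested gradient tolerance.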
The strong Wolfe-Powell (SWP) line search is the most popular inexact line search; it requires a sufficient reduction of the function value and narrows the search interval for the step length. In addition, it forces the step length to be close to a stationary point or local minimizer of the function, which makes it a useful way to determine the step size. The SWP conditions are
\[ f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k, \tag{4} \]
\[ |g(x_k + \alpha_k d_k)^T d_k| \le \sigma |g_k^T d_k|, \tag{5} \]
where $0 < \delta < \sigma < 1$. In fact, the SWP line search is a strengthening of the weak Wolfe-Powell (WWP) line search, in which the step length satisfies (4) and
\[ g(x_k + \alpha_k d_k)^T d_k \ge \sigma g_k^T d_k. \tag{6} \]
However, the WWP line search may accept a step length far from a stationary point or local minimizer of the function. Dai [1] proposed two Armijo-type line searches; the first guarantees the global convergence of methods (2) and (3) for any $\beta_k \ge 0$. With this line search, the global convergence of the FR, nonnegative PRP, and CD methods has been established.
To recover the global convergence of the original PRP method, he designed a second line search, given as follows.
Given constants $\tau \in (0, 1)$, $\rho > 0$, and $\delta \in (0, 1)$, determine the smallest integer $i \ge 0$ such that, with $\alpha_k = \rho \tau^i$, the vectors $x_{k+1}$ and $d_{k+1}$ given by (2) and (3) satisfy (4), where $\delta \in (0, 1/2)$ and $\sigma \in (\delta, 1)$ are two constants. The most popular formulas for $\beta_k$ are as follows: Hestenes-Stiefel (HS) [2], Fletcher-Reeves (FR) [3], Polak-Ribière-Polyak (PRP) [4], Conjugate Descent (CD) [5], Liu-Storey (LS) [6], Dai-Yuan (DY) [7], Wei et al. (WYL) [8], and Hager and Zhang (HZ) [9]:
\[ \beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \qquad \beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \qquad \beta_k^{PRP} = \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}, \qquad \beta_k^{CD} = -\frac{\|g_k\|^2}{d_{k-1}^T g_{k-1}}, \]
\[ \beta_k^{LS} = -\frac{g_k^T y_{k-1}}{d_{k-1}^T g_{k-1}}, \qquad \beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T y_{k-1}}, \qquad \beta_k^{WYL} = \frac{g_k^T \left( g_k - \frac{\|g_k\|}{\|g_{k-1}\|} g_{k-1} \right)}{\|g_{k-1}\|^2}, \]
\[ \beta_k^{HZ} = \frac{1}{d_{k-1}^T y_{k-1}} \left( y_{k-1} - 2 d_{k-1} \frac{\|y_{k-1}\|^2}{d_{k-1}^T y_{k-1}} \right)^T g_k, \]
where $y_{k-1} = g_k - g_{k-1}$. Zoutendijk [10] established the global convergence of the FR method with exact line search; Al-Baali [11] proved that the FR method is globally convergent under the strong Wolfe condition when $\sigma < 1/2$, and Liu et al. [12] later extended this result to $\sigma \le 1/2$. The numerical behavior of the FR method is unpredictable: in a few cases it is as efficient as the PRP method, but in general it is very slow. The DY and CD methods perform like the FR method under exact line search, with strong global convergence. Polak and Ribière proved the global convergence of the PRP method for convex objective functions under exact line search in 1969 [4]. Later, Powell gave a counterexample showing that there exist nonconvex functions on which the PRP method does not converge globally, even when the exact line search is used. Powell stressed the importance of achieving global convergence for the PRP method and suggested that its coefficient should not be negative. Gilbert and Nocedal [13] proved that the nonnegative PRP method is globally convergent with the Wolfe-Powell line search. The HS and LS methods perform like the PRP method under exact line search. The PRP method is therefore among the most efficient conjugate gradient methods. For further reading, see [14-19].
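The classical coefficients listed above can be transcribed compactly; the following is a plain transcription of these well-known formulas for illustration, not code from the paper.

```python
import numpy as np

def beta_coefficients(g_new, g_old, d_old):
    """Classical CG coefficients beta_k, with y = g_k - g_{k-1}.
    Arguments: g_new = g_k, g_old = g_{k-1}, d_old = d_{k-1}."""
    y = g_new - g_old
    return {
        "HS":  (g_new @ y) / (d_old @ y),
        "FR":  (g_new @ g_new) / (g_old @ g_old),
        "PRP": (g_new @ y) / (g_old @ g_old),
        "CD":  -(g_new @ g_new) / (d_old @ g_old),
        "LS":  -(g_new @ y) / (d_old @ g_old),
        "DY":  (g_new @ g_new) / (d_old @ y),
        "WYL": (g_new @ (g_new
                         - np.linalg.norm(g_new) / np.linalg.norm(g_old)
                         * g_old)) / (g_old @ g_old),
        "HZ":  ((y - 2.0 * d_old * (y @ y) / (d_old @ y)) @ g_new)
               / (d_old @ y),
    }

# Sample evaluation at arbitrary illustrative vectors
betas = beta_coefficients(np.array([0.0, 1.0]),
                          np.array([1.0, 0.0]),
                          np.array([-1.0, 0.0]))
```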
In 2006, Wei et al. [8] proposed a new nonnegative CG coefficient resembling the original PRP formula; it has been studied under both exact and inexact line searches, and many modifications have since appeared, such as [20-23].
With a small modification of $\beta_k^{WYL}$, Zhang [21] presented the following CG coefficient:
\[ \beta_k^{NPRP} = \frac{\|g_k\|^2 - \frac{\|g_k\|}{\|g_{k-1}\|} |g_k^T g_{k-1}|}{\|g_{k-1}\|^2}. \]
In the same manner, $\beta_k^{DPRP}$ is constructed by modifying the denominator of $\beta_k^{NPRP}$:
\[ \beta_k^{DPRP} = \frac{\|g_k\|^2 - \frac{\|g_k\|}{\|g_{k-1}\|} |g_k^T g_{k-1}|}{m \, |g_k^T d_{k-1}| + \|g_{k-1}\|^2}. \]
In addition, $\beta_k^{MLS*}$ is constructed in a similar way from the numerator of $\beta_k^{WYL}$, where $\mu \ge 0$ and $m \ge 1$.
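As a quick numerical check of the NPRP and DPRP forms given above (as reconstructed here; the parameter value m = 2 is the one used later in the numerical section), note that the shared numerator is nonnegative by the Cauchy-Schwarz inequality, and the enlarged denominator makes the DPRP value no larger than the NPRP value.

```python
import numpy as np

def beta_nprp(g_new, g_old):
    """NPRP coefficient, in the form reconstructed above (assumed form)."""
    return (g_new @ g_new
            - np.linalg.norm(g_new) / np.linalg.norm(g_old)
            * abs(g_new @ g_old)) / (g_old @ g_old)

def beta_dprp(g_new, g_old, d_old, m=2.0):
    """DPRP coefficient: same numerator, denominator enlarged by m*|g_k^T d_{k-1}|."""
    num = (g_new @ g_new
           - np.linalg.norm(g_new) / np.linalg.norm(g_old) * abs(g_new @ g_old))
    return num / (m * abs(g_new @ d_old) + g_old @ g_old)

# Illustrative vectors
g_new = np.array([1.0, 2.0])
g_old = np.array([2.0, 1.0])
d_old = np.array([-1.0, 0.0])
b1 = beta_nprp(g_new, g_old)          # nonnegative by Cauchy-Schwarz
b2 = beta_dprp(g_new, g_old, d_old)   # no larger than b1
```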
The descent condition plays an important role in CG methods; it is given by
\[ g_k^T d_k < 0. \tag{12} \]
If we strengthen (12) to the form
\[ g_k^T d_k \le -c \|g_k\|^2, \quad c > 0, \]
then the search direction satisfies the sufficient descent condition.
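The distinction between the two conditions can be checked numerically; the helper functions and the constant c = 0.1 below are hypothetical, for illustration only.

```python
import numpy as np

def is_descent(g, d):
    """Descent condition (12): g_k^T d_k < 0."""
    return g @ d < 0.0

def is_sufficient_descent(g, d, c=0.1):
    """Sufficient descent condition: g_k^T d_k <= -c * ||g_k||^2, c > 0."""
    return g @ d <= -c * (g @ g)

g = np.array([1.0, -2.0])
# Steepest descent satisfies both conditions (with any c <= 1).
d_sd = -g
# A direction nearly orthogonal to -g is descent but not sufficiently descent.
d_weak = np.array([-0.05, 0.0])
```

The second direction shows why the stronger condition matters: it makes an angle close to 90 degrees with the negative gradient, so the guaranteed decrease per step can be arbitrarily small even though (12) holds.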
In this paper, we present the new formula and the algorithm in Section 2. Furthermore, we establish the global convergence of our method under several line searches in Section 3. Numerical results and the conclusion are presented in Sections 4 and 5, respectively.

The Modified Formula
In this section, we present the new CG coefficient, where $\| \cdot \|$ denotes the Euclidean norm and $\mu > 1$.
Algorithm 1.
Step 4. Compute $\alpha_k$ by some line search; in the numerical section we use the WWP line search with $\sigma = 0.1$ and $\delta = 0.001$.
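Step 4 can be implemented, for example, with a standard bracketing/bisection scheme for the weak Wolfe-Powell conditions (4) and (6). The sketch below uses the parameter values above; it is an illustration of one common scheme, not necessarily the paper's exact routine.

```python
import numpy as np

def weak_wolfe(f, g, x, d, delta=0.001, sigma=0.1, max_iter=50):
    """Find alpha satisfying the weak Wolfe-Powell conditions:
      (4) f(x + a*d) <= f(x) + delta * a * g(x)^T d
      (6) g(x + a*d)^T d >= sigma * g(x)^T d
    via a simple bracketing/bisection scheme."""
    lo, hi = 0.0, np.inf
    alpha = 1.0
    fx, gtd = f(x), g(x) @ d
    for _ in range(max_iter):
        if f(x + alpha * d) > fx + delta * alpha * gtd:
            hi = alpha                              # (4) fails: step too long
        elif g(x + alpha * d) @ d < sigma * gtd:
            lo = alpha                              # (6) fails: step too short
        else:
            return alpha                            # both conditions hold
        alpha = 2.0 * lo if np.isinf(hi) else 0.5 * (lo + hi)
    return alpha

# Example on the 1-D quadratic f(x) = x^2, starting at x = 1 with d = -grad
f = lambda x: float(x[0] ** 2)
g = lambda x: np.array([2.0 * x[0]])
x = np.array([1.0])
d = -g(x)
alpha = weak_wolfe(f, g, x, d)
```

On this example the scheme halves the initial trial step once and accepts alpha = 0.5, which lands exactly on the minimizer.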

Method
The following assumption is needed in the subsequent theorems.
(II) In some neighborhood $N$ of the level set $\Omega$, $f$ is continuously differentiable and its gradient is Lipschitz continuous; that is, for any $x, y \in N$, there exists a constant $L > 0$ such that $\|g(x) - g(y)\| \le L \|x - y\|$.

Lemma 3. Suppose that the sequences $\{x_k\}$ and $\{d_k\}$ are generated by (2) and (3), and that $\alpha_k$ satisfies the WWP line search (4) and (6), in which the search direction is descent. Then the following condition holds:
\[ g_k^T d_k \le -\left( 1 - \frac{1}{\mu} \right) \|g_k\|^2. \]
Proof. We use proof by induction. From (3), the condition holds for $k = 0$. Suppose that it holds up to $k - 1$, where $\mu > 1$. Taking $c = 1 - 1/\mu$ completes the proof.

Global Convergence under the WWP Line Search

Gilbert and Nocedal [13] presented an important theorem for establishing the global convergence of the nonnegative PRP method; it is summarized as Theorem 5.

Theorem 5 (see [13]). Consider any CG method of the form (2) and (3) for which the following conditions hold:

Since $\beta_k^{NPRP}$ satisfies Property (*), $\beta_k^{HZ*}$ also satisfies Property (*); for more detail, we refer the reader to Lemma 3.6 of [24]. The proof is complete.
The following corollary follows from Theorem 5 and Lemma 3.

Numerical Results and Discussions
To analyze the efficiency of the new method, we selected the test functions in Table 1 from CUTEr [25], Andrei [26], and Adorio and Diliman [24]. We compared the new method with other CG methods, including the NPRP and DPRP methods, using the weak Wolfe-Powell line search with $\delta = 0.001$. The tolerance $\epsilon$ is set to $10^{-6}$ for all algorithms to investigate how rapidly each method approaches the optimum; the gradient norm is used as the stopping criterion, namely $\|g_k\| \le 10^{-6}$. The NPRP and DPRP parameters are tested with the weak Wolfe-Powell line search, and the modified parameter HZ* is likewise tested with the weak Wolfe line search with $\sigma = 0.1$ and $\delta = 0.001$. In addition, the values $\mu = 2$ and $m = 2$ are used for the HZ* and DPRP parameters, respectively. We used a Matlab 7.9 subroutine program on an Intel(R) Core(TM) i3 CPU with 2 GB DDR2 RAM. The performance results are shown in Figures 1, 2, 3, and 4, using the performance profile introduced by Dolan and Moré [27]. This performance measure compares a set of solvers $S$ on a set of problems $P$. Assuming $n_s$ solvers and $n_p$ problems in $S$ and $P$, respectively, the measure $t_{p,s}$ is defined as the computation cost (e.g., the number of iterations or the CPU time) required for solver $s$ to solve problem $p$.
To create a baseline for comparison, the performance of solver $s$ on problem $p$ is scaled by the best performance of any solver in $S$ on that problem, using the ratio
\[ r_{p,s} = \frac{t_{p,s}}{\min \{ t_{p,s} : s \in S \}}. \]
Let a parameter $r_M \ge r_{p,s}$ for all $p, s$ be chosen, and assume that $r_{p,s} = r_M$ if and only if solver $s$ does not solve problem $p$. To obtain an overall assessment of the performance of a solver, we define the measure
\[ \rho_s(\tau) = \frac{1}{n_p} \, \mathrm{size} \{ p \in P : r_{p,s} \le \tau \}. \]
Thus, $\rho_s(\tau)$ is the probability for solver $s \in S$ that the performance ratio $r_{p,s}$ is within a factor $\tau \in \mathbb{R}$ of the best possible ratio. The function $\rho_s$ is the cumulative distribution function of the performance ratio; the performance measure $\rho_s : \mathbb{R} \to [0, 1]$ is nondecreasing, piecewise constant, and continuous from the right. The value $\rho_s(1)$ is the probability that the solver achieves the best performance among all solvers. In general, a solver with high values of $\rho_s(\tau)$, which appears in the upper right corner of the figure, is preferable. It is clear that the HZ* parameter is strongly competitive with the NPRP parameter, and slightly better in some cases, across all the graphs in Figures 1, 2, 3, and 4, which report the number of iterations, the CPU time, the number of gradient evaluations, and the number of function evaluations. On the other hand, the HZ* parameter clearly outperforms the DPRP parameter in all performance profiles.
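The profile construction above can be sketched in a few lines; the data below are synthetic, purely for illustration, and np.inf is used to mark a failure (playing the role of the parameter r_M).

```python
import numpy as np

def performance_profile(T, taus):
    """Dolan-More performance profile.
    T: (n_p, n_s) array of costs t_{p,s}; np.inf marks a failure.
    Returns rho of shape (len(taus), n_s): for each tau, the fraction
    of problems on which each solver's ratio r_{p,s} is <= tau."""
    best = T.min(axis=1, keepdims=True)      # best cost per problem
    R = T / best                             # ratios r_{p,s}
    return np.array([(R <= tau).mean(axis=0) for tau in taus])

# Synthetic example: 3 problems, 2 solvers (illustrative data only)
T = np.array([[2.0, 4.0],
              [3.0, 3.0],
              [np.inf, 5.0]])               # solver 1 fails on problem 3
rho = performance_profile(T, taus=[1.0, 2.0])
# rho[0] gives, per solver, the fraction of problems where it is (tied-)best
```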

Conclusion
In this paper, we proposed a new modification of the conjugate gradient method, extended from the NPRP method. Our numerical results show that the new coefficient is competitive with other conventional CG methods. The method converges globally under several line searches with a descent direction. In future work, we will focus on improving speed using hybrid methods, and we will compare several line searches with modern CG methods.

Figure 1: Performance profile based on the CPU time with the weak Wolfe-Powell line search.

Figure 3: Performance profile based on the number of gradient evaluations with the weak Wolfe-Powell line search.

Table 1: The test functions.