An Efficient Modified AZPRP Conjugate Gradient Method for Large-Scale Unconstrained Optimization Problem

Division of Computational Mathematics and Engineering, Institute for Computational Science, Ton Duc Thang University, Ho Chi Minh City, Vietnam
Faculty of Mathematics and Statistics, Ton Duc Thang University, Ho Chi Minh City, Vietnam
Faculty of Civil Engineering, Ton Duc Thang University, Ho Chi Minh City, Vietnam
Department of Mathematics, Faculty of Science, Jazan University, Jazan, Saudi Arabia
Department of Mathematics, Faculty of Ocean Engineering Technology and Informatics, Universiti Malaysia Terengganu, Kuala Nerus 21030, Terengganu, Malaysia


Introduction
The conjugate gradient (CG) method aims to find a solution of optimization problems without constraints. Suppose that the following optimization problem is considered:

min f(x), x ∈ R^n,   (1)

where f: R^n → R is a continuously differentiable function whose gradient ∇f(x) is available. The iterative method is given by the following sequence:

x_{k+1} = x_k + α_k d_k, k = 1, 2, . . .,   (2)

where x_1 is the starting point and α_k > 0 is the step length. The search direction d_k of the CG method is defined as follows:

d_k = −g_k for k = 1, and d_k = −g_k + β_k d_{k−1} for k ≥ 2,   (3)

where g_k = g(x_k) = ∇f(x_k) and β_k is a parameter.
To obtain the step length, we normally use an inexact line search, since the exact line search, which is defined as

f(x_k + α_k d_k) = min_{α ≥ 0} f(x_k + α d_k),   (4)

requires many iterations to obtain the step length. Normally, we use the strong version of the Wolfe-Powell (SWP) [1, 2] line search, which is given by

f(x_k + α_k d_k) ≤ f(x_k) + δ α_k g_k^T d_k,   (5)
|g(x_k + α_k d_k)^T d_k| ≤ σ |g_k^T d_k|,   (6)

where 0 < δ < σ < 1. The weak Wolfe-Powell (WWP) line search is defined by (5) and

g(x_k + α_k d_k)^T d_k ≥ σ g_k^T d_k.   (7)

The famous choices of the parameter β_k are the Hestenes-Stiefel (HS) [3], Fletcher-Reeves (FR) [4], and Polak-Ribière-Polyak (PRP) [5] formulas, which are given by

β_k^HS = g_k^T y_{k−1} / d_{k−1}^T y_{k−1},  β_k^FR = ‖g_k‖² / ‖g_{k−1}‖²,  β_k^PRP = g_k^T y_{k−1} / ‖g_{k−1}‖²,

where y_{k−1} = g_k − g_{k−1}.
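As an illustration of conditions (5) and (6), the following minimal Python sketch checks whether a candidate step length satisfies the SWP conditions; the helper name and the parameter values are our own illustrative choices, not part of the paper:

```python
import numpy as np

def satisfies_swp(f, grad, x, d, alpha, delta=1e-4, sigma=0.1):
    """Check the strong Wolfe-Powell conditions (5) and (6) for a step alpha."""
    g = grad(x)
    gTd = g @ d                      # directional derivative at x (negative for a descent direction)
    x_new = x + alpha * d
    sufficient_decrease = f(x_new) <= f(x) + delta * alpha * gTd   # condition (5)
    curvature = abs(grad(x_new) @ d) <= sigma * abs(gTd)           # condition (6)
    return sufficient_decrease and curvature
```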
Powell [6] showed that there exists a nonconvex function on which the PRP method does not converge globally. Gilbert and Nocedal [7] showed that if β_k^{PRP+} = max{0, β_k^{PRP}} is used with the WWP line search and the descent property is satisfied, then the method is globally convergent.
The demands on speed, memory requirements, number of iterations, function evaluations, gradient evaluations, and robustness in solving unconstrained optimization problems have prompted the continued development of the CG method; the reader is referred to [10-15] for more information on these new formulas.

The New Formula and the Algorithm
Alhawarat et al. [15] presented the following simple formula:

β_k^AZPRP = (‖g_k‖² − μ_k |g_k^T g_{k−1}|) / ‖g_{k−1}‖²  if ‖g_k‖² > μ_k |g_k^T g_{k−1}|,  and β_k^AZPRP = 0 otherwise.

Dai and Liao [12] presented the following formula:

β_k^DL = g_k^T y_{k−1} / d_{k−1}^T y_{k−1} − t g_k^T s_{k−1} / d_{k−1}^T y_{k−1},

where s_{k−1} = x_k − x_{k−1} and t ≥ 0, together with its nonnegative restriction

β_k^{DL+} = max{0, g_k^T y_{k−1} / d_{k−1}^T y_{k−1}} − t g_k^T s_{k−1} / d_{k−1}^T y_{k−1}.

The new formula is a modification of β_k^AZPRP and β_k^{DL+} and is defined as follows:

β_k^A = (‖g_k‖² − μ_k |g_k^T g_{k−1}|) / ‖g_{k−1}‖²  if ‖g_k‖² > μ_k |g_k^T g_{k−1}|,  and β_k^A = −t g_k^T s_{k−1} / ‖g_{k−1}‖² otherwise,   (12)

where μ_k = ‖s_{k−1}‖ / ‖y_{k−1}‖ and t > 0. We obtain the following relations: when the first branch of (12) is active, 0 < β_k^A ≤ ‖g_k‖² / ‖g_{k−1}‖² = β_k^FR; when the second branch is active, β_k^A g_k^T d_{k−1} = −t α_{k−1} (g_k^T d_{k−1})² / ‖g_{k−1}‖² ≤ 0, since s_{k−1} = α_{k−1} d_{k−1}. The resulting method is summarized in Algorithm 1.
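To make the update concrete, here is a small Python sketch of the coefficient (12); the function name and the safeguard against tiny denominators are our own additions:

```python
import numpy as np

def beta_A(g, g_prev, s_prev, t=0.1, eps=1e-12):
    """Compute the modified AZPRP coefficient beta_k^A of equation (12).

    g, g_prev : current and previous gradients g_k, g_{k-1}
    s_prev    : previous step s_{k-1} = x_k - x_{k-1}
    t         : Dai-Liao-type parameter, t > 0
    """
    y_prev = g - g_prev                                   # y_{k-1} = g_k - g_{k-1}
    mu = np.linalg.norm(s_prev) / max(np.linalg.norm(y_prev), eps)
    gnorm2_prev = max(g_prev @ g_prev, eps)
    if g @ g > mu * abs(g @ g_prev):                      # first branch of (12)
        return (g @ g - mu * abs(g @ g_prev)) / gnorm2_prev
    return -t * (g @ s_prev) / gnorm2_prev                # restart (second) branch
```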

Convergence Analysis of the Coefficient β_k^A with the CG Method
Assumption 1. (A) The level set Ψ = {x : f(x) ≤ f(x_1)} is bounded; that is, a positive constant T exists such that ‖x‖ ≤ T for all x ∈ Ψ. (B) In some neighbourhood N of Ψ, f is continuously differentiable and its gradient is Lipschitz continuous; that is, for all x, y ∈ N, there exists a constant L > 0 such that

‖g(x) − g(y)‖ ≤ L‖x − y‖.

This assumption implies that there exists a positive constant B such that ‖g(x)‖ ≤ B for all x ∈ N. The descent condition

g_k^T d_k < 0   (17)

plays an important role in the CG method. The sufficient descent condition proposed by Al-Baali [8] is a strengthening of (17), as follows:

g_k^T d_k ≤ −c ‖g_k‖²,   (18)

where c ∈ (0, 1). Note that the general form of the sufficient descent condition is (18) with c > 0, and it is satisfied with the SWP line search.

Algorithm 1 shows the steps to obtain the solution of the optimization problem (1) with the new coefficient using the strong Wolfe-Powell line search.

Algorithm 1 (steps of the CG method with the new modification to obtain the stationary point of a function):
Step 1. Set a starting point x_1, the initial search direction d_1 = −g_1, and k := 1.
Step 2. If a stopping criterion (e.g., ‖g_k‖ ≤ ε) is satisfied, stop.
Step 3. Compute the step length α_k by the SWP line search (5) and (6).
Step 4. Set x_{k+1} = x_k + α_k d_k, compute g_{k+1} and β_{k+1}^A by (12), and set d_{k+1} = −g_{k+1} + β_{k+1}^A d_k.
Step 5. Set k := k + 1 and go to Step 2.
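A compact Python sketch of Algorithm 1 follows. It relies on scipy.optimize.line_search, which computes a step satisfying the strong Wolfe conditions, and on the beta_A helper sketched earlier; the tolerance, parameter values, and fallback restart are illustrative assumptions rather than the paper's implementation:

```python
import numpy as np
from scipy.optimize import line_search

def cg_A(f, grad, x0, t=0.1, tol=1e-6, max_iter=10_000):
    """CG method (Algorithm 1) with the modified AZPRP coefficient beta_k^A."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                                        # Step 1: d_1 = -g_1
    for k in range(max_iter):
        if np.linalg.norm(g) <= tol:              # Step 2: stopping criterion
            break
        # Step 3: strong Wolfe line search with delta = 1e-4 < sigma = 0.1 <= 1/2
        alpha = line_search(f, grad, x, d, gfk=g, c1=1e-4, c2=0.1)[0]
        if alpha is None:                         # line search failed: steepest-descent restart
            alpha, d = 1e-8, -g
        x_new = x + alpha * d                     # Step 4: update the iterate
        g_new = grad(x_new)
        beta = beta_A(g_new, g, x_new - x, t=t)   # coefficient (12)
        d = -g_new + beta * d                     # new search direction
        x, g = x_new, g_new
    return x
```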
The following theorem shows that β_k^A satisfies the descent condition. The proof is similar to that presented in [8].

Theorem 2. Let g_k and d_k be obtained by using (2) and (3) with β_k = β_k^A given by (12), where α_k is computed by the SWP line search (5) and (6) with σ ≤ 1/2. Then the descent condition (17) holds for all k.

Proof. For k = 1, g_1^T d_1 = −‖g_1‖² < 0. For k ≥ 2, multiplying (3) by g_k^T and substituting β_k^A from (12), we obtain

g_k^T d_k = −‖g_k‖² + β_k^A g_k^T d_{k−1}.

If the second (restart) branch of (12) is active, then, since s_{k−1} = α_{k−1} d_{k−1},

β_k^A g_k^T d_{k−1} = −t α_{k−1} (g_k^T d_{k−1})² / ‖g_{k−1}‖² ≤ 0,

hence g_k^T d_k ≤ −‖g_k‖² < 0. If the first branch is active, then 0 < β_k^A ≤ ‖g_k‖² / ‖g_{k−1}‖², and the SWP condition (6) gives |g_k^T d_{k−1}| ≤ −σ g_{k−1}^T d_{k−1}. Arguing by induction as in [8] yields

−∑_{j=0}^{k−1} σ^j ≤ g_k^T d_k / ‖g_k‖² ≤ −2 + ∑_{j=0}^{k−1} σ^j,

and since σ ≤ 1/2 implies ∑_{j=0}^{k−1} σ^j < 1/(1−σ) ≤ 2, it follows that g_k^T d_k < 0. The proof is complete. □


Zoutendijk [16] presented a useful lemma for the global convergence property of the CG method. The condition is given as follows.

Lemma 1. Let Assumption 1 hold and consider any method of the form (2) and (3), where α_k is obtained by the WWP line search (5) and (7) and the search direction d_k is a descent direction. Then the following condition holds:

∑_{k≥1} (g_k^T d_k)² / ‖d_k‖² < ∞.

Theorem 3. Suppose Assumption 1 holds. Consider any method of the form (2) and (3) with the new formula (12), in which α_k is obtained from the SWP line search (5) and (6) with σ ≤ 1/2. Then

liminf_{k→∞} ‖g_k‖ = 0.

The proof is similar to that presented in [8].

Proof. We prove the theorem by contradiction. Assume that the conclusion is not true; then a constant ε > 0 exists such that

‖g_k‖ ≥ ε for all k ≥ 1.

Squaring both sides of equation (3), we obtain

‖d_k‖² = ‖g_k‖² − 2 β_k^A g_k^T d_{k−1} + (β_k^A)² ‖d_{k−1}‖².

Dividing by ‖g_k‖⁴ and using (6), (12), and the descent property of Theorem 2, we obtain

‖d_k‖² / ‖g_k‖⁴ ≤ ‖d_{k−1}‖² / ‖g_{k−1}‖⁴ + c₁ / ‖g_k‖²

for a constant c₁ > 0. Repeating the process and using d_1 = −g_1, we obtain

‖d_k‖² / ‖g_k‖⁴ ≤ c₁ ∑_{j=1}^{k} 1/‖g_j‖² ≤ c₁ k / ε²,

and therefore

∑_{k≥1} ‖g_k‖⁴ / ‖d_k‖² ≥ (ε² / c₁) ∑_{k≥1} 1/k = ∞.

Together with the descent condition, this contradicts the Zoutendijk condition of Lemma 1; thus liminf_{k→∞} ‖g_k‖ = 0. The proof is complete. □
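As an informal numerical illustration of Theorem 3 (our own check, not part of the paper), one can run the cg_A sketch of Algorithm 1 on a strictly convex quadratic and observe the gradient norm driven below the tolerance:

```python
import numpy as np

# Strictly convex quadratic f(x) = 0.5 x^T Q x with Q symmetric positive definite
rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
Q = M @ M.T + 50 * np.eye(50)

f = lambda x: 0.5 * x @ Q @ x
grad = lambda x: Q @ x

x_final = cg_A(f, grad, x0=np.ones(50))
print(np.linalg.norm(grad(x_final)))   # should be below the tolerance 1e-6
```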

Numerical Results
To investigate the effectiveness of the new parameter, the method was run on the test problems listed in Table 1, and the results are compared with CG_Descent in Figures 1 and 2, in which the performance measure introduced by Dolan and Moré [18] was employed. As shown in Figure 1, formula A strongly outperforms CG_Descent in the number of iterations. In Figure 2, we notice that the new CG formula A is strongly competitive with CG_Descent.
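For readers who wish to reproduce such comparisons, the following Python sketch computes a Dolan-Moré performance profile from a matrix of solver costs (e.g., iteration counts); the data layout and function name are our own assumptions:

```python
import numpy as np

def performance_profile(costs, taus):
    """Dolan-More performance profile.

    costs : (n_problems, n_solvers) array of costs (np.inf for failures)
    taus  : performance ratios at which to evaluate the profile
    Returns an (len(taus), n_solvers) array: the fraction of problems each
    solver solves within tau times the best solver's cost.
    """
    costs = np.asarray(costs, dtype=float)
    best = costs.min(axis=1, keepdims=True)          # best cost per problem
    ratios = costs / best                            # performance ratios r_{p,s}
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])

# Example: 3 problems, 2 solvers (iteration counts; inf marks a failure)
profile = performance_profile([[10, 12], [50, np.inf], [8, 8]],
                              taus=[1.0, 1.5, 2.0])
```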

Multimodal Function with Its Graph.
In this section, we present the six-hump camel back function, a multimodal function used to test the efficiency of optimization algorithms. The function is defined as follows:

f(x) = (4 − 2.1 x_1² + x_1⁴/3) x_1² + x_1 x_2 + (−4 + 4 x_2²) x_2².   (37)

The number of variables n equals 2. This function has six local minima, two of which are global. Thus, this function is a multimodal function usually used to test for global minima. The global minima are x*_1 = (−0.0898, 0.7126) and x*_2 = (0.0898, −0.7126), with function value f(x*) = −1.0316. As its name describes, this function looks like the back of an upside-down camel with six humps (see Figure 3 for a three-dimensional graph); for more information about two-dimensional functions, the reader can refer to [19].
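As a quick check, the function (37) and its analytic gradient can be coded and minimized with the cg_A sketch given earlier (or any CG solver); the starting point below is an arbitrary illustrative choice:

```python
import numpy as np

def six_hump_camel(x):
    """Six-hump camel back function of equation (37), n = 2."""
    x1, x2 = x
    return (4 - 2.1 * x1**2 + x1**4 / 3) * x1**2 + x1 * x2 + (-4 + 4 * x2**2) * x2**2

def six_hump_camel_grad(x):
    """Analytic gradient of the six-hump camel back function."""
    x1, x2 = x
    df1 = 8 * x1 - 8.4 * x1**3 + 2 * x1**5 + x2
    df2 = x1 - 8 * x2 + 16 * x2**3
    return np.array([df1, df2])

# Starting near the basin of one global minimum (illustrative choice)
x_star = cg_A(six_hump_camel, six_hump_camel_grad, x0=[0.5, -0.5])
print(x_star, six_hump_camel(x_star))   # expect approx (0.0898, -0.7126), -1.0316
```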
Finally, note that the CG method can be applied to image restoration problems, neural networks, and other applications. For more information, the reader can refer to [20, 21].

Conclusions
In this study, a modified version of the CG algorithm (A) is suggested and its performance is investigated. The modified formula is restarted based on a quotient related to the Lipschitz constant. Global convergence is established under the SWP line search. Our numerical results show that the new coefficient produces efficient and competitive results compared with other methods, such as CG_Descent 5.3. In the future, the new version of the CG method will be combined with a feed-forward neural network (back-propagation (BP) algorithm) to improve the training process and produce a fast multilayer training algorithm. This will help reduce the time needed to train a neural network when the number of training samples is massive.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest regarding the publication of this paper.