A Descent Four-Term Conjugate Gradient Method with Global Convergence Properties for Large-Scale Unconstrained Optimisation Problems

The conjugate gradient method is a useful method for solving large-scale unconstrained optimisation problems and has applications in several fields such as engineering, medical science, image restoration, neural networks, and many others. The main benefit of the conjugate gradient method is that it does not use the second derivative or its approximation, unlike Newton's method and its quasi-Newton approximations. Moreover, the algorithm of the conjugate gradient method is simple and easy to implement. This study proposes a new modified conjugate gradient method that contains four terms and builds on popular two- and three-term conjugate gradient methods. The new algorithm satisfies the descent condition and possesses the global convergence property. In the numerical results section, we compare the new algorithm with well-known methods such as CG-Descent. We conclude from the numerical results that the new algorithm is more efficient than other popular CG methods such as CG-Descent 6.8 in terms of number of function evaluations, number of gradient evaluations, number of iterations, and CPU time.


Introduction
To solve large-scale unconstrained optimisation problems, we prefer the conjugate gradient (CG) method since it is efficient and robust and does not use the second derivative. Consider the following problem:

min f(x), x ∈ R^n,

where f: R^n → R is a continuously differentiable function whose gradient is available. The CG method generates a sequence {x_k} given by

x_{k+1} = x_k + α_k d_k,

where the step length α_k is obtained using a line search. The search direction d_k of the CG method is defined as follows:

d_k = −g_k + β_k d_{k−1}, with d_0 = −g_0,

where g_k = g(x_k) = ∇f(x_k) and β_k is known as the CG formula (coefficient).
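As a concrete illustration of this iteration, the following sketch implements a generic CG loop in Python. The Fletcher–Reeves coefficient and the backtracking Armijo line search used here are simple stand-ins chosen for brevity, not the coefficient or the strong Wolfe–Powell search proposed later in this paper.

```python
import numpy as np

def cg_minimize(f, grad, x0, tol=1e-6, max_iter=1000):
    """Generic CG iteration x_{k+1} = x_k + alpha_k d_k, sketched with the
    Fletcher-Reeves coefficient and a backtracking Armijo line search."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g                               # first direction: steepest descent
    for _ in range(max_iter):
        if np.linalg.norm(g) <= tol:
            break
        # Backtracking Armijo line search (a simple stand-in for SWP).
        alpha, c1 = 1.0, 1e-4
        while f(x + alpha * d) > f(x) + c1 * alpha * (g @ d):
            alpha *= 0.5
        x_new = x + alpha * d
        g_new = grad(x_new)
        beta = (g_new @ g_new) / (g @ g)  # Fletcher-Reeves formula
        d = -g_new + beta * d
        x, g = x_new, g_new
    return x

# Example: minimise the convex quadratic f(x) = ||x||^2 / 2.
x_star = cg_minimize(lambda x: 0.5 * (x @ x), lambda x: x,
                     np.array([3.0, -4.0]))
```

For this quadratic the first step already lands on the minimiser, so the loop stops at the stationarity test.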
Powell [9] proposed an example showing that the PRP formula may fail to converge even when the exact line search is employed. Gilbert and Nocedal [10] used Powell's [9] suggestion for the PRP formula. They proved that if β_k^{PRP+} = max{0, β_k^{PRP}}, the weak Wolfe-Powell (WWP) line search is employed, and the sufficient descent condition is satisfied, then the method with β_k^{PRP+} is globally convergent. Moreover, Zoutendijk [11] showed that the CG method with the FR formula and the exact line search is globally convergent. Finally, Al-Baali [12] proved that the CG method with the FR coefficient is globally convergent when the strong Wolfe-Powell (SWP) line search with σ ≤ 1/2 is employed.
In 2006, Wei et al. [13] presented a nonnegative CG formula that satisfies the descent property and the global convergence property:

β_k^{WYL} = g_k^T (g_k − (‖g_k‖/‖g_{k−1}‖) g_{k−1}) / ‖g_{k−1}‖²,

where y_{k−1} = g_k − g_{k−1} and ‖·‖ denotes the Euclidean norm. In 2016, Alhawarat et al. [14] presented a nonnegative CG formula, β_k^{AZPRP}, with a new restart property governed by a parameter μ_k (see [14] for the precise definitions). In 2020, Kaelo et al. [15] proposed another nonnegative CG formula and compared it with β_k^{AZPRP}. In 2001, Dai and Liao [16] proposed the following conjugacy condition:

d_k^T y_{k−1} = −t g_k^T s_{k−1},

where s_{k−1} = x_k − x_{k−1} and t ≥ 0. In the case t = 0, this becomes the classical conjugacy condition d_k^T y_{k−1} = 0. Based on this condition, Dai and Liao [16] proposed the following CG formula:

β_k^{DL} = (g_k^T y_{k−1})/(d_{k−1}^T y_{k−1}) − t (g_k^T s_{k−1})/(d_{k−1}^T y_{k−1}).

However, β_k^{DL} inherits the same problem as β_k^{PRP} and β_k^{HS}; that is, β_k^{DL} is not nonnegative in general. Thus, following Gilbert and Nocedal [10], the first term is truncated:

β_k^{DL+} = max{(g_k^T y_{k−1})/(d_{k−1}^T y_{k−1}), 0} − t (g_k^T s_{k−1})/(d_{k−1}^T y_{k−1}).

In 2006, Hager and Zhang [17, 18] presented the following CG formula, also related to the Dai-Liao conjugacy condition:

β_k^{HZ} = (y_{k−1} − 2 d_{k−1} (‖y_{k−1}‖²/(d_{k−1}^T y_{k−1})))^T g_k / (d_{k−1}^T y_{k−1}).

Based on the same conjugacy condition, Andrei [19, 20] proposed two three-term CG (TTCG) methods in 2013.
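For reference, the classical coefficients discussed above can be computed directly from the current and previous gradients, the previous direction, and the previous step. This is a sketch using the standard textbook forms of FR, PRP, PRP+, HS, DL, and DL+; the default value of t is an arbitrary assumption.

```python
import numpy as np

def cg_coefficients(g_new, g_old, d_old, s_old, t=0.1):
    """Classical CG coefficients (textbook forms); t >= 0 is the
    Dai-Liao parameter. s_old = x_k - x_{k-1}."""
    y = g_new - g_old                          # y_{k-1} = g_k - g_{k-1}
    beta_fr = (g_new @ g_new) / (g_old @ g_old)
    beta_prp = (g_new @ y) / (g_old @ g_old)
    beta_prp_plus = max(beta_prp, 0.0)         # Gilbert-Nocedal truncation
    denom = d_old @ y                          # HS / DL denominator
    beta_hs = (g_new @ y) / denom
    beta_dl = (g_new @ y - t * (g_new @ s_old)) / denom
    beta_dl_plus = max((g_new @ y) / denom, 0.0) - t * (g_new @ s_old) / denom
    return dict(FR=beta_fr, PRP=beta_prp, PRP_plus=beta_prp_plus,
                HS=beta_hs, DL=beta_dl, DL_plus=beta_dl_plus)
```

Note that HS, DL, and DL+ are undefined when d_{k−1}^T y_{k−1} = 0; production code would guard that denominator.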

Mathematical Problems in Engineering
Also, based on the Dai-Liao conjugacy condition, Babaie-Kafaki and Ghanbari [21] proposed a CG method involving a parameter ς > 0. Similar to [19-21], Deng and Wan [22] in 2015 presented another CG method of this type. In 2019, Liu et al. [23] proposed a method under an additional assumption and proved the sufficient descent condition for nonconvex functions when solving nonlinear monotone equations. To avoid using that condition, Liu et al. [24] constructed a three-term CG method and used it to solve the unconstrained optimisation problem (1); they also proposed a wide range of parameter choices for it. In 2020, Yao et al. [25] proposed a three-term CG method with a new choice of t. Based on the SWP line search, Yao et al. [25] selected t_k to satisfy the descent condition. They also proposed a theorem stating that if t_k is close to ‖y_k‖²/(y_k^T s_k), then the search direction produces a zigzag search path; therefore, they selected a choice of t_k away from this quantity. As an application of the CG method, we can restore actual images from damaged images by minimising a suitable objective function with an efficient number of iterations and CPU time. Moreover, we can assess the restoration quality using the root mean square error (RMSE) between the actual (original) image ζ and the restored image ζ_k:

RMSE = sqrt((1/n) Σ_{i=1}^{n} (ζ_i − (ζ_k)_i)²),

where n is the number of pixels. The CG method can also be applied in several fields: neural networks, image restoration, medical science, machine learning, finance and economics, and many other fields. The reader can refer to [26-36] for more about the CG method and its applications.
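The RMSE quality measure above can be computed in a few lines; this sketch flattens both images to pixel vectors first.

```python
import numpy as np

def rmse(original, restored):
    """Root mean square error between the original and restored images,
    computed over the flattened pixel vectors."""
    a = np.asarray(original, dtype=float).ravel()
    b = np.asarray(restored, dtype=float).ravel()
    return float(np.sqrt(np.mean((a - b) ** 2)))
```

A lower RMSE indicates that the restored image is closer to the original.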

The New Search Direction
In the literature review, we note that the general form of the three-term CG method can be written as follows:

d_k = −g_k + η_k d_{k−1} + θ_k y_{k−1},

where θ_k and η_k are CG parameters. Based on the Dai-Liao conjugacy condition, we extend this three-term form (27) to the four-term search direction in equation (28), where t_k = ‖s_k‖/‖y_{k−1}‖.
In this paper, we make particular choices of θ_k and η_k in equation (28). Note that equation (28) reduces to the HS CG method when an exact line search is used.
By using an exact line search, we obtain g_k^T d_{k−1} = 0, so the term containing g_k^T d_{k−1} vanishes. Note that η_k is similar to β_k^{DL} with a different value of t_k, where β_k^{DL} inherits the conjugacy condition under an inexact line search. For θ_k, it is worth noting that it contains the term g_k^T d_{k−1} and is divided by the denominator of the HS method. Meanwhile, for the term (y_{k−1} + s_{k−1}), the direction y_{k−1} is essential in obtaining the descent condition when multiplied by the negative gradient. It is also useful in terms of efficiency: if this term goes to zero, the search direction restarts with the steepest descent direction, which prevents equation (28) from cycling without reaching the solution. Figure 1 describes the steps of the CG method to obtain the stationary point using the SWP line search and equation (28), with the stopping criterion ‖g_k‖ ≤ 10^{−6}.
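Since the displayed formula of equation (28) is not reproduced in this text, the following sketch shows one plausible reading consistent with the description above: a DL-type coefficient η_k with t_k = ‖s_k‖/‖y_{k−1}‖, a coefficient θ_k containing g_k^T d_{k−1} over the HS denominator, and a steepest-descent restart when (y_{k−1} + s_{k−1}) vanishes. The precise coefficients here are illustrative assumptions, not the authors' exact formula.

```python
import numpy as np

def four_term_direction(g, g_prev, d_prev, s_prev):
    """Hedged sketch of a four-term direction in the spirit of eq. (28):
    d_k = -g_k + eta_k d_{k-1} - theta_k (y_{k-1} + s_{k-1}).
    The coefficient choices below are assumptions for illustration."""
    y = g - g_prev
    denom = d_prev @ y                        # HS / DL denominator
    if abs(denom) < 1e-12 or np.linalg.norm(y + s_prev) < 1e-12:
        return -g                             # restart: steepest descent
    t = np.linalg.norm(s_prev) / np.linalg.norm(y)  # t_k = ||s_k||/||y_{k-1}||
    eta = (g @ y - t * (g @ s_prev)) / denom        # DL-type coefficient
    theta = (g @ d_prev) / denom                    # contains g_k^T d_{k-1}
    return -g + eta * d_prev - theta * (y + s_prev)
```

Under an exact line search g_k^T d_{k−1} = 0, so theta vanishes, matching the reduction described above.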

Global Convergence Analysis of the CG Method with Figure 1

The following assumption is crucial to establish the convergence analysis of CG methods.
(B) In some neighbourhood Q of the level set Ω, f is continuously differentiable, and its gradient is Lipschitz continuous; that is, there exists a constant L > 0 such that

‖g(x) − g(y)‖ ≤ L‖x − y‖ for all x, y ∈ Q.

From this assumption, we can conclude that there exists a positive constant B such that ‖g(x)‖ ≤ B for all x ∈ Q. The descent condition (downhill condition),

g_k^T d_k < 0,  (35)

is useful in studying the CG method and plays an important role in the proof of global convergence. Al-Baali [12] modified (35) to the following form and used it to prove the convergence of the FR method:

g_k^T d_k ≤ −c‖g_k‖²,  (36)

where c ∈ (0, 1). Condition (36) is the sufficient descent condition. Moreover, using (36) is stronger than (35), since it controls the quantity g_k^T d_k through ‖g_k‖².
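The sufficient descent condition (36) is easy to check numerically along a run; a small helper follows, where the default value of c is an arbitrary assumption.

```python
import numpy as np

def satisfies_sufficient_descent(g, d, c=1e-3):
    """Check the sufficient descent condition g_k^T d_k <= -c * ||g_k||^2."""
    return float(g @ d) <= -c * float(g @ g)

# The steepest descent direction d = -g satisfies (36) for any c <= 1.
g = np.array([1.0, -2.0])
print(satisfies_sufficient_descent(g, -g))   # True
```

Such a check is useful as a safeguard in experimental CG implementations: when it fails, the direction can be reset to −g_k.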

The Descent Property of the New Search Direction

The following theorem shows that the search direction in equation (28), computed with the SWP line search, satisfies the sufficient descent condition (36).

Theorem 1. Let the sequences {x_k} and {d_k} be generated by (1) and (28), where α_k is computed by the SWP line search. Then the sufficient descent condition holds.

Proof. Multiplying (28) by g_k^T yields an expression for g_k^T d_k. From the Wolfe-Powell line search, we obtain a bound on g_k^T d_{k−1}, and thus the sufficient descent condition follows.

Zoutendijk [11] presented a useful lemma for analysing the global convergence property of the CG method. The lemma is given as follows.

Lemma 1. Suppose that Assumption 1 holds. Consider a CG method of the forms (1) and (2), where α_k satisfies the WWP line search of (5) and (6) and the search direction is descent. Then the following condition holds:

Σ_{k≥0} (g_k^T d_k)² / ‖d_k‖² < ∞.

[Figure 1: The steps of the CG method to obtain the optimum point. The flowchart: compute the step size α_k by the line search conditions (5) and (6); set x_{k+1} as given by equation (1); if ‖g_k‖ ≤ 10^{−6}, stop and return the stationary point; otherwise, calculate d_k from the new search direction and repeat.]

Global Convergence of Figure 1 with Convex Functions

Dai and Liao [16] presented a useful theorem for obtaining the global convergence of the CG method, as follows.

Theorem 2. Suppose that Assumption 1 holds. Consider any conjugate gradient method in the form of equations (1) and (2), where d_k is a descent direction and α_k is obtained from the strong Wolfe line search. If Σ_{k≥1} 1/‖d_k‖² = ∞, then liminf_{k→∞} ‖g_k‖ = 0.

The following theorem shows that the new search direction satisfies the convergence analysis for uniformly convex functions.

Theorem 3. Suppose that Assumption 1 holds. Consider the CG method in the forms of equations (2) and (28), with d_k a descent direction, where α_k is obtained using equations (3) and (4). If f(x) is a uniformly convex function, then liminf_{k→∞} ‖g_k‖ = 0.

Proof. Because the function f(x) is uniformly convex, there exists a positive constant ϖ such that

(g(x) − g(y))^T (x − y) ≥ ϖ‖x − y‖² for all x, y ∈ P.

By using equation (28) and the triangular inequality, we obtain an upper bound on ‖d_k‖. Then, using the triangular inequality and Assumption 1 again, ‖d_k‖ is bounded above by a constant, so Σ_{k≥1} 1/‖d_k‖² = ∞. By Theorem 2, liminf_{k→∞} ‖g_k‖ = 0.

Global Convergence of Figure 1 with General Nonlinear Functions

The following restriction on η_k is very important to establish the convergence analysis for our new search direction. The main purpose of this restriction is to ensure that the CG multiplier remains nonnegative.

Lemma 2. Assume that Assumption 1 holds and that the sequences {g_k} and {d_k} are generated using Figure 1, where the step size α_k is computed via the SWP line search of (4) and (5) or the weak Wolfe-Powell line search of (4) and (6), such that the sufficient descent condition holds. If β_k ≥ 0 and there exists a constant c > 0 such that ‖g_k‖ > c for all k ≥ 1, then d_k ≠ 0 and

Σ_{k≥1} ‖u_k − u_{k−1}‖² < ∞,

where u_k = d_k/‖d_k‖.
Proof. First, if d_k = 0, then from the sufficient descent condition we obtain g_k = 0. Thus, we suppose that d_k ≠ 0 for all k. We split the search direction into two parts and define u_k accordingly. Since u_k is a unit vector, using the triangular inequality and δ_k ≥ 0 we obtain a bound relating ‖u_k − u_{k−1}‖ to these two parts. Now, utilising the SWP conditions (4) and (5), we can conclude two inequalities bounding the quantities involved; likewise, the weak Wolfe-Powell line search of (4) and (6) yields analogous bounds. Suppose that α_k ≥ 1; then |θ_k| ≤ ζ, where ζ is some positive constant. Since |θ_k| ≤ ζ holds for α_k ≥ 1 and also for α_k < 1, it follows that |θ_k| ≤ ζ for all α_k > 0. By using the triangular inequality and Assumption 1, we bound the remaining terms. Let M = max{ξ, ζ}. Thus, the inequality in (60) can be rewritten accordingly, and from equation (54) together with (53) we obtain

Σ_{k≥1} ‖u_k − u_{k−1}‖² < ∞,

which completes the proof. □

The following property, referred to as Property 1, was presented by Gilbert and Nocedal in [10].

Property 1. Consider a method of the form of (2) and (3), and assume that (54) is satisfied for all k ≥ 1.
Using the SWP conditions (4) and (5) with equation (54), we obtain the required bound. By using Lemma 4.1 and Lemma 4.2 in [10], we obtain the following result. □

Theorem 4. Let the sequences {x_k} and {d_k} be generated by (2) and (3) with the CG method in equation (28), where the step size satisfies (4) and (5). Then, by Lemmas 2 and 3, together with Lemmas 4.1 and 4.2 in [10], liminf_{k→∞} ‖g_k‖ = 0.

Numerical Results and Discussion
To test the new search direction (28), we selected the test functions in Table 1 from CUTEr [37]. A comparison with other popular and strong CG coefficients is performed. The comparison includes CG-Descent 6.8 and the DL+ CG formula, based on CPU time, number of iterations, number of function evaluations, and number of gradient evaluations. We use the SWP line search to obtain the step size. Figure 2 shows that FTCGHS strongly outperformed all other methods in the number of iterations. Meanwhile, Figure 3 presents the number of function evaluations, and we note that the FTCGHS method also strongly outperforms the CG-Descent 6.8 and DL+ methods. From Figures 4 and 5, we note that the FTCGHS method slightly outperforms CG-Descent and DL+ in gradient evaluations and CPU time, since we use the SWP line search. However, the SWP condition (5) can be extended as follows:

σ_1 g_k^T d_k ≤ g(x_k + α_k d_k)^T d_k ≤ −σ_2 g_k^T d_k,  0 < δ < σ_1 ≤ σ_2 < 1.

For more details about this extension, the reader can refer to [39].
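Figures 2-5 are solver comparisons of the kind usually drawn as Dolan-Moré performance profiles; the following sketch computes such a profile from a matrix of solver costs. The sample data are illustrative, not the paper's results.

```python
import numpy as np

def performance_profile(costs, taus):
    """Dolan-More performance profile.
    costs: (n_problems, n_solvers) array of a cost metric (e.g. iteration
    counts), with np.inf marking failures.
    Returns rho of shape (len(taus), n_solvers): the fraction of problems
    each solver finishes within a factor tau of the best solver."""
    costs = np.asarray(costs, dtype=float)
    best = costs.min(axis=1, keepdims=True)   # best cost per problem
    ratios = costs / best                      # performance ratios r_{p,s}
    return np.array([(ratios <= tau).mean(axis=0) for tau in taus])

# Illustrative data: 3 problems, 2 solvers; solver 2 fails on problem 3.
costs = [[10, 12], [20, 18], [5, np.inf]]
rho = performance_profile(costs, taus=[1.0, 2.0])
```

The curve rho(tau) rises toward the fraction of problems a solver eventually solves; a higher curve at tau = 1 means the solver is most often the fastest.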
In addition, we present an example for the two-dimensional three-hump camel back function, which is given by the following formula:

f(x) = 2x_1² − 1.05x_1⁴ + x_1⁶/6 + x_1 x_2 + x_2².

Here, we have the following:
Number of variables (n): 2.
Initial points: (1, 1), (5, 5), (10, 10), and (15, 15).
This function has three local minima, one of which is global; therefore, it is a multimodal function usually used to test for global minima. The global minimum is x* = (0, 0) with function value f(x*) = 0. The function looks like the back of an upside-down camel. See Figure 6 for the three-dimensional graph.
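The three-hump camel back function and its gradient, as used in this test, can be coded as follows; the gradient is obtained by differentiating the formula above term by term.

```python
import numpy as np

def three_hump_camel(x):
    """Two-dimensional three-hump camel back function."""
    x1, x2 = x
    return 2 * x1**2 - 1.05 * x1**4 + x1**6 / 6 + x1 * x2 + x2**2

def three_hump_camel_grad(x):
    """Analytic gradient of the three-hump camel back function."""
    x1, x2 = x
    return np.array([4 * x1 - 4.2 * x1**3 + x1**5 + x2,
                     x1 + 2 * x2])

# The global minimiser is x* = (0, 0) with f(x*) = 0 and zero gradient.
print(three_hump_camel(np.zeros(2)))   # 0.0
```

Any of the four initial points listed above can be passed to a CG routine together with this pair of functions.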

Conclusions
In this paper, we proposed a new four-term CG method as presented in equation (28). The new search direction has the following properties:
(1) It is a four-term CG method.
(2) It satisfies the sufficient descent condition.
(3) The convergence analyses for convex and nonconvex functions are obtained.
(4) The numerical results show that the new modification outperforms popular methods such as the CG-Descent and DL+ methods.
(5) The new search direction can be developed further by using different values of t.
In future work, we will attempt to extend the SWP line search with different values of sigma to reduce the number of gradient evaluations and the CPU time.

Data Availability
The data used to support the findings of this study are available within the article.