A Convex Combination between Two Different Search Directions of the Conjugate Gradient Method and an Application in Image Restoration

The conjugate gradient method is a useful tool for solving large- and small-scale unconstrained optimization problems. In addition, the conjugate gradient method can be applied in many fields, such as engineering, medical research, and computer science. In this paper, a convex combination of two different search directions is proposed. The new combination satisfies the sufficient descent condition and the convergence analysis. Moreover, a new conjugate gradient formula is proposed. The new formula satisfies the convergence properties and has a descent property related to the Hestenes-Stiefel conjugate gradient formula. The numerical results show that the new search direction, constructed as a convex combination of the two directions, outperforms both of them. The numerical results include the number of iterations, the number of function evaluations, and the central processing unit (CPU) time. Finally, we present some examples of image restoration as an application of the proposed conjugate gradient method.


Introduction
The conjugate gradient (CG) method is widely used to solve large-scale unconstrained optimization problems. Moreover, it can be applied in many fields, such as neural networks, medical science, engineering, and image restoration. We consider the following problem:

$$\min_{x \in \mathbb{R}^n} f(x), \qquad (1)$$

where $f: \mathbb{R}^n \rightarrow \mathbb{R}$ is continuous and its derivative is available. A sequence of iterates $x_k$ generated by the CG method is given as follows:

$$x_{k+1} = x_k + \alpha_k d_k, \quad k = 1, 2, \ldots, \qquad (2)$$

where $x_k$ is the current point and $\alpha_k > 0$ is a step size obtained by a line search. The search direction $d_k$ is given by the following equation:

$$d_k = \begin{cases} -g_k, & k = 1, \\ -g_k + \beta_k d_{k-1}, & k \geq 2, \end{cases} \qquad (3)$$

where $g_k = g(x_k)$ and $\beta_k$ is known as the CG formula. To find the step size $\alpha_k$, we usually employ the Wolfe or strong Wolfe-Powell (SWP) line search [1,2], where the latter is defined as follows:

$$f(x_k + \alpha_k d_k) \leq f(x_k) + \delta \alpha_k g_k^T d_k \qquad (4)$$

and

$$|g(x_k + \alpha_k d_k)^T d_k| \leq \sigma |g_k^T d_k|, \qquad (5)$$

with $0 < \delta < \sigma < 1$. The weak Wolfe-Powell (WWP) line search is defined by equation (4) and the following equation:

$$g(x_k + \alpha_k d_k)^T d_k \geq \sigma g_k^T d_k. \qquad (6)$$

The most well-known formulas for $\beta_k$, for example, Hestenes-Stiefel (HS) [3], Polak-Ribiere-Polyak (PRP) [4], Liu and Storey (LS) [5], Fletcher-Reeves (FR) [6], conjugate descent of Fletcher (CD) [7], and Dai and Yuan (DY) [8], are given as follows, respectively:

$$\beta_k^{HS} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}}, \quad \beta_k^{PRP} = \frac{g_k^T y_{k-1}}{\|g_{k-1}\|^2}, \quad \beta_k^{LS} = -\frac{g_k^T y_{k-1}}{d_{k-1}^T g_{k-1}}, \quad \beta_k^{FR} = \frac{\|g_k\|^2}{\|g_{k-1}\|^2}, \quad \beta_k^{CD} = -\frac{\|g_k\|^2}{d_{k-1}^T g_{k-1}}, \quad \beta_k^{DY} = \frac{\|g_k\|^2}{d_{k-1}^T y_{k-1}}, \qquad (7)$$

where $g_k = g(x_k)$ and $y_{k-1} = g(x_k) - g(x_{k-1})$. The global convergence properties of the FR method were investigated in [9]. The global convergence of the PRP method for convex objective functions under the exact line search was proved in [4]. Later, Powell [10] gave a counterexample showing the existence of a nonconvex function on which the PRP and HS methods can cycle infinitely without reaching a solution. Powell therefore suggested that, to achieve global convergence of the PRP and HS methods, the CG coefficient should not be negative. Moreover, Gilbert and Nocedal [11] proved that the nonnegative PRP method, i.e., $\beta_k = \max\{\beta_k^{PRP}, 0\}$, is globally convergent under complicated line searches.
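To make the scheme above concrete, the following is a minimal Python sketch of the generic CG iteration (2)-(3) with the HS coefficient and a Wolfe step size. The function names and the use of SciPy's `line_search` helper are our own illustrative choices, not code from the paper.

```python
# A minimal sketch of the nonlinear CG iteration (2)-(3) with the
# Hestenes-Stiefel coefficient; illustrative, not the authors' code.
import numpy as np
from scipy.optimize import line_search

def cg_hs(f, grad, x0, tol=1e-6, max_iter=1000):
    x = x0.astype(float)
    g = grad(x)
    d = -g                                   # d_1 = -g_1
    for _ in range(max_iter):
        if np.linalg.norm(g, np.inf) <= tol:
            break
        # step size from a Wolfe line search, 0 < delta < sigma < 1
        alpha = line_search(f, grad, x, d, gfk=g, c1=1e-4, c2=0.1)[0]
        if alpha is None:                    # line search failed: restart
            d, alpha = -g, 1e-4
        x_new = x + alpha * d                # x_{k+1} = x_k + alpha_k d_k
        g_new = grad(x_new)
        y = g_new - g                        # y_{k-1} = g_k - g_{k-1}
        beta = (g_new @ y) / (d @ y)         # Hestenes-Stiefel formula
        d = -g_new + beta * d                # d_k = -g_k + beta_k d_{k-1}
        x, g = x_new, g_new
    return x

# usage: minimize the Rosenbrock function
f = lambda x: (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2
grad = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                           200 * (x[1] - x[0]**2)])
print(cg_hs(f, grad, np.array([-1.2, 1.0])))
```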
Besides, Hager and Zhang [12] suggested a new CG parameter satisfying the descent property (refer to equation (43)) for any inexact line search, with $g_k^T d_k \leq -(7/8)\|g_k\|^2$. This new CG parameter satisfies the global convergence properties when the Wolfe or approximate Wolfe line search is employed. The formula is given as follows:

$$\beta_k^{HZ} = \max\{\beta_k^N, \eta_k\}, \qquad (8)$$

where $\beta_k^N = \dfrac{1}{d_k^T y_k}\left(y_k - 2 d_k \dfrac{\|y_k\|^2}{d_k^T y_k}\right)^T g_k$, $\eta_k = \dfrac{-1}{\|d_k\| \min\{\eta, \|g_k\|\}}$, and $\eta > 0$ is a constant. In the numerical experiments, $\eta = 0.01$ was fixed in equation (8). Note that equation (8) is known as CG-Descent 5.3. Al-Baali et al. [13] compared $\beta_k^{HZ}$ with the G3TCG method using extensive choices of CG parameters.
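As a quick illustration, a hedged sketch of computing the coefficient in equation (8) is given below; the argument names are ours, and we follow the indexing used in the text.

```python
# A sketch of the Hager-Zhang (CG-Descent 5.3) coefficient in equation (8);
# variable names are illustrative, and eta = 0.01 follows the value fixed
# in the text.
import numpy as np

def beta_hz(g, d, y, eta=0.01):
    dy = d @ y                                   # d_k^T y_k
    beta_n = (y - 2.0 * d * (y @ y) / dy) @ g / dy
    eta_k = -1.0 / (np.linalg.norm(d) * min(eta, np.linalg.norm(g)))
    return max(beta_n, eta_k)                    # beta_k^HZ = max{beta_N, eta_k}
```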
Apart from that, Wei et al. [14] proposed a nonnegative formula, referred to as the WYL coefficient, defined as follows:

$$\beta_k^{WYL} = \frac{g_k^T \left(g_k - \frac{\|g_k\|}{\|g_{k-1}\|} g_{k-1}\right)}{\|g_{k-1}\|^2}.$$

This parameter is similar to the PRP coefficient, possesses global convergence with both exact and inexact line searches, and provides a sufficient descent condition. Many modifications of the WYL coefficient have been suggested, including those presented in [15,16]. In addition, Alhawarat et al. [17] presented the following simple formula:

$$\beta_k^{AZPRP} = \begin{cases} \dfrac{\|g_k\|^2 - \mu_k |g_k^T g_{k-1}|}{\|g_{k-1}\|^2}, & \text{if } \|g_k\|^2 > \mu_k |g_k^T g_{k-1}|, \\ 0, & \text{otherwise}, \end{cases}$$

where $\|\cdot\|$ represents the Euclidean norm, while $\mu_k$ is defined as follows:

$$\mu_k = \frac{\|x_k - x_{k-1}\|}{\|y_{k-1}\|}.$$

Using the quasi-Newton method, the BFGS method, the limited-memory BFGS (LBFGS) method, and equation (3), Dai and Liao [18] proposed the following conjugacy condition:

$$d_k^T y_{k-1} = -t\, g_k^T s_{k-1}, \qquad (13)$$

where $s_{k-1} = x_k - x_{k-1}$ and $t \geq 0$. In the case of $t = 0$, equation (13) becomes the classical conjugacy condition. Moreover, using equations (3) and (13), Dai and Liao [18] proposed the following CG formula:

$$\beta_k^{DL} = \frac{g_k^T y_{k-1}}{d_{k-1}^T y_{k-1}} - t \frac{g_k^T s_{k-1}}{d_{k-1}^T y_{k-1}}. \qquad (14)$$

However, $\beta_k^{DL}$ is not a nonnegative parameter. Dai and Liao [18] expressed $\beta_k^{DL}$ as follows:

$$\beta_k^{DL} = \beta_k^{HS} - t \frac{g_k^T s_{k-1}}{d_{k-1}^T y_{k-1}}.$$

Also, Dai and Liao proposed the following nonnegative CG formula:

$$\beta_k^{DL+} = \max\{\beta_k^{HS}, 0\} - t \frac{g_k^T s_{k-1}}{d_{k-1}^T y_{k-1}},$$

where $t > 0$. Andrei [19] wrote a useful book about nonlinear conjugate gradient methods for unconstrained optimization problems. Its main chapters cover linear conjugate gradient methods, standard conjugate gradient methods, acceleration of conjugate gradient methods, hybrid methods, modifications of the standard scheme, memoryless BFGS preconditioning, and three-term methods.
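Since $\beta_k^{DL+}$ reappears in the image restoration comparison of Section 4, a short sketch of computing it may help; the argument names and the default $t = 0.1$ are illustrative assumptions, not values prescribed by the paper.

```python
# A sketch of the Dai-Liao DL+ coefficient; t >= 0 is the conjugacy
# parameter from equation (13), and t = 0.1 is an illustrative default.
import numpy as np

def beta_dl_plus(g_new, d_old, s_old, y_old, t=0.1):
    dy = d_old @ y_old                            # d_{k-1}^T y_{k-1}
    beta_hs = (g_new @ y_old) / dy                # Hestenes-Stiefel part
    return max(beta_hs, 0.0) - t * (g_new @ s_old) / dy
```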
Mishra et al. [20] presented a q-Polak-Ribiere-Polyak conjugate gradient algorithm for unconstrained optimization problems. A transformation of the accelerated double step size model for unconstrained optimization, which supports our idea in Section 2, was presented in [21,22]. Petrović et al. [23] presented a note on the hybridization process applied to a transformed double step size model based on one specific accelerated gradient iteration. Alhawarat et al. [24] presented a very useful hybrid conjugate gradient method with the strong Wolfe-Powell line search. Finally, Ahmad et al. [25] presented a developed version of $\beta_k^{AZPRP}$ for large-scale unconstrained optimization problems.

The paper is organized as follows. Section 2 describes the search direction and the algorithm, while Section 3 introduces the convergence analysis of the CG algorithm with the new coefficient $\beta_k^{(1)}$. Moreover, Section 4 presents the numerical results and a discussion, together with an application in image restoration.

The New Search Direction and the Algorithm
The motivation for proposing equation (17) is to determine the search direction as a convex combination of two directions, gaining the advantages of both directions in the convergence analysis and in the numerical experiments. Based on Assumption 1, we propose a convex combination of two different search directions, where each search direction has its own CG formula. The new search direction is defined as follows:

$$d_k = \lambda d_k^{(1)} + (1 - \lambda) d_k^{(2)}, \qquad (17)$$

where $0 \leq \lambda \leq 1$,

$$d_k^{(1)} = \begin{cases} -g_k, & k = 1, \\ -g_k + \beta_k^{(1)} d_{k-1}, & k \geq 2, \end{cases} \qquad (18)$$

and

$$d_k^{(2)} = \begin{cases} -g_k, & k = 1, \\ -g_k + \beta_k^{(2)} d_{k-1}, & k \geq 2. \end{cases} \qquad (19)$$

We choose $\beta_k^{(1)}$ and $\beta_k^{(2)}$ for equations (18) and (19) as follows:

$$\beta_k^{(1)} = \begin{cases} \dfrac{\|g_k\|^2 - \mu_k |g_k^T g_{k-1}|}{d_{k-1}^T y_{k-1}}, & \text{if } \|g_k\|^2 > \mu_k |g_k^T g_{k-1}|, \\ 0, & \text{otherwise}, \end{cases} \qquad (20)$$

and $\beta_k^{(2)} = \beta_k^{CG\text{-}Descent\,5.3} = \beta_k^{HZ}$, where $\|\cdot\|$ represents the Euclidean norm. Note that $\mu_k$ is defined as follows:

$$\mu_k = \frac{\|x_k - x_{k-1}\|}{\|y_{k-1}\|}. \qquad (21)$$

We choose $\beta_k^{CG\text{-}Descent\,5.3}$ because it is a well-known CG formula satisfying the descent condition $g_k^T d_k^{(2)} \leq -(7/8)\|g_k\|^2$ without any line search. This will be shown later in Theorem 3. Note that the reader can use any CG formula that satisfies the convergence properties and the descent condition.
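A minimal sketch of assembling the direction (17) from equations (18)-(21) follows; it reuses the `beta_hz` helper sketched in the introduction, and the form of $\beta_k^{(1)}$ mirrors equation (20) as reconstructed above, so treat it as illustrative rather than as the authors' code.

```python
# A minimal sketch of the convex-combination direction (17); lam is the
# weight lambda in [0, 1], and beta_hz is the helper sketched earlier.
import numpy as np

def convex_direction(g_new, g_old, d_old, s_old, lam=0.5, eta=0.01):
    y = g_new - g_old                               # y_{k-1}
    mu = np.linalg.norm(s_old) / np.linalg.norm(y)  # mu_k, equation (21)
    # beta^(1), equation (20): restart (beta = 0) unless the criterion holds
    if g_new @ g_new > mu * abs(g_new @ g_old):
        beta1 = (g_new @ g_new - mu * abs(g_new @ g_old)) / (d_old @ y)
    else:
        beta1 = 0.0
    d1 = -g_new + beta1 * d_old                     # d_k^(1), equation (18)
    d2 = -g_new + beta_hz(g_new, d_old, y, eta) * d_old  # d_k^(2), equation (19)
    return lam * d1 + (1.0 - lam) * d2              # equation (17)
```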
In addition, we can extend equation (17) to a convex combination of $m$ search directions, given by

$$d_k = \sum_{i=1}^{m} \lambda_i d_k^{(i)}, \quad \lambda_i \geq 0, \quad \sum_{i=1}^{m} \lambda_i = 1.$$

The following lemma corresponds to Lemma 3.1 in [17]; thus, we omit the proof.

Lemma 1. Let Assumption 1 hold, and suppose the Beale-Powell restart condition is violated for a nonrestart search direction; then, $\|g_k\|^2 > \mu_k |g_k^T g_{k-1}|$ holds.

The following algorithm (Figure 1) illustrates the steps of the CG method using equation (17) to obtain the solution of an optimization problem.
In Figure 1, note that, after the step $k := k + 1$, the update $x_k := x_{k+1}$ takes place at every iteration. The other quantities are updated in a similar manner to $x_k$.
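Putting the pieces together, the following sketch mirrors the loop of Figure 1 with the convex direction (17); it reuses the `convex_direction` helper above and substitutes SciPy's Wolfe line search for the SWP rule (with $\sigma = 0.1$ and $\delta = 0.01$ as in Section 4), so it approximates the algorithm rather than reproducing the authors' implementation.

```python
# A minimal sketch of the algorithm in Figure 1 with the convex
# direction (17); SciPy's Wolfe line search stands in for the SWP rule.
import numpy as np
from scipy.optimize import line_search

def cg_convex(f, grad, x0, lam=0.5, tol=1e-6, max_iter=1000):
    x = x0.astype(float)
    g = grad(x)
    d = -g                                       # d_1 = -g_1
    for _ in range(max_iter):
        if np.linalg.norm(g, np.inf) <= tol:     # stopping criterion
            break
        alpha = line_search(f, grad, x, d, gfk=g, c1=0.01, c2=0.1)[0]
        if alpha is None:
            d, alpha = -g, 1e-4                  # steepest-descent restart
        x_new = x + alpha * d
        g_new = grad(x_new)
        d = convex_direction(g_new, g, d, x_new - x, lam=lam)
        x, g = x_new, g_new
    return x
```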

The Convergence Analysis of the CG Algorithm with the New Coefficient $\beta_k^{(1)}$ and the New Search Direction

Assumption 1.
(a) The level set $\Omega = \{x : f(x) \leq f(x_1)\}$ is bounded, implying that a positive constant $M$ exists such that $\|x\| \leq M$, $\forall x \in \Omega$.
(b) In some neighbourhood $N$ of $\Omega$, $f$ is continuous and its derivative is available. In addition, its gradient is Lipschitz continuous; that is, for all $x, y \in N$, there exists a constant $L > 0$ such that

$$\|g(x) - g(y)\| \leq L \|x - y\|.$$

This assumption implies that there exists a positive constant $B$ such that

$$\|g(x)\| \leq B, \quad \forall x \in \Omega.$$

The descent condition

$$g_k^T d_k < 0 \qquad (26)$$

plays an important role in the CG method. Al-Baali [9] modified inequality (26) to the following form and used it to prove the convergence of the FR method:

$$g_k^T d_k \leq -c \|g_k\|^2, \qquad (27)$$

where $c \in (0, 1)$. Inequality (27) is the sufficient descent condition. Note that the general form of the sufficient descent condition is given by inequality (27) with $c > 0$.
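In code, condition (27) can be verified directly at each iteration; the helper below is a trivial sketch in which the constant c is an illustrative choice and g, d are NumPy vectors.

```python
# A tiny numerical check of the sufficient descent condition (27),
# g_k^T d_k <= -c * ||g_k||^2 with c in (0, 1); c here is illustrative.
def satisfies_sufficient_descent(g, d, c=1e-3):
    return g @ d <= -c * (g @ g)
```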
We then divide both sides of equation (28) by $\|g_k\|^2$. Now, using Lemma 2 to substitute, and by using equation (5), we obtain the required bound. Then, we have the following two cases.

Theorem 2.
Let the sequences $x_k$ and $d_k$ be obtained by using equations (2) and (18), where the step size is computed by the SWP line search in equations (4) and (5); then, the descent condition $g_k^T d_k^{(1)} < 0$ holds.

Figure 1: The algorithm of the CG method. (Flowchart: start by setting a starting point $x_1$, the initial search direction $d_1 = -g_1$, and $k := 1$; if a stopping criterion is satisfied, end; otherwise, calculate $d_k$, calculate $\alpha_k$, set $k := k + 1$, and repeat.)
Proof. By multiplying equation (3) by $g_k^T$ and substituting $\beta_k^{(1)}$, we obtain

$$g_k^T d_k^{(1)} = -\|g_k\|^2 + \beta_k^{(1)} g_k^T d_{k-1}.$$

Let $c = 1$, and we obtain the descent condition, which completes the proof.
□

Theorem 3. Suppose the sequences $x_k$ and $d_k$ are generated using equations (2) and (17), where $\alpha_k$ is generated by the SWP line search in equations (4) and (5) or any line search. If $g_k^T d_k^{(1)}$ and $g_k^T d_k^{(2)}$ satisfy the descent condition, then the sufficient descent condition (27) holds with equation (17).
Proof. From Theorem 1, we obtain the sufficient descent property of $d_k^{(1)}$, and from [12], we obtain $g_k^T d_k^{(2)} \leq -(7/8)\|g_k\|^2$. Using equation (37), we now have

$$g_k^T d_k = \lambda g_k^T d_k^{(1)} + (1 - \lambda) g_k^T d_k^{(2)}.$$

Since both terms on the right-hand side satisfy the sufficient descent condition, we obtain the sufficient descent condition (27) for $d_k$.

The following lemma, which is referred to as the Zoutendijk condition [24], is useful for analyzing the global convergence property of the CG method. □

Lemma 3. Let Assumption 1 hold. Consider any method in the form of equations (2) and (3), where $\alpha_k$ satisfies the WWP line search and the search direction is descent. Then, the following condition holds:

$$\sum_{k=1}^{\infty} \frac{(g_k^T d_k)^2}{\|d_k\|^2} < \infty. \qquad (43)$$

In addition, equation (43) holds for the exact and SWP line searches. The proof is shown in [24]. From (43), together with the sufficient descent condition, we conclude that the Zoutendijk condition (43) implies the following form:

$$\sum_{k=1}^{\infty} \frac{\|g_k\|^4}{\|d_k\|^2} < \infty. \qquad (44)$$

Dai et al. [18] presented the following theorem, which is also useful to prove the global convergence properties of the CG method.

Theorem 4. Consider any CG method in the forms of equations (2) and (3), where $d_k$ is a descent direction and $\alpha_k$ is obtained from the strong Wolfe line search. If

$$\sum_{k \geq 1} \frac{1}{\|d_k\|^2} = \infty,$$

then $\liminf_{k \to \infty} \|g_k\| = 0$.

The following property, called Property 1, is presented in [26]. This property is useful to obtain the convergence property of the CG method.

Using equations (20) and (47), we obtain the required bound on $\beta_k^{(1)}$. By using Assumption 1, the restart criterion in (20) can be bounded accordingly; thus, we can bound the numerator of $\beta_k^{(1)}$. Also, using equation (48) in [14], and since $\|y_{k-1}\| \leq L \|s_{k-1}\|$ by the Lipschitz condition, we now have $(1/L) \leq \mu_k < 1$ in this case.
Using (52) and (54), and then using Assumption 1 and the definition of $\lambda$, we obtain the required bound in this case.

Case 2: $\mu_k \geq 1$. By using Assumption 1 and equation (19), we obtain the bound directly. Thus, in all cases, the bound on $\beta_k^{(1)}$ holds. The proof is complete.
To obtain $|\beta_k^{(1)}| \leq 1/(2b)$, we consider the following cases. □

Lemma 5. Consider the CG method in the forms of (2) and (18), where (47) and Assumption 1 hold. Then there exist constants $b > 1$ and $\lambda > 0$ such that $|\beta_k^{(1)}| \leq b$, and if $\|s_k\| \leq \lambda$, we obtain $|\beta_k^{(1)}| \leq 1/(2b)$.

The following lemmas correspond to Lemma 4.1 and Lemma 4.2 in [11]. □

Lemma 6. Let Assumption 1 hold. Let the sequences $x_k$ and $d_k$ be generated by Figure 1 with equation (18), where $\alpha_k$ is computed by the WWP line search and the sufficient descent condition (27) holds. We assume that the method has Property 2. Suppose that equation (47) holds. Then, there exists $\lambda > 0$ such that, for any $\Delta \in \mathbb{N}$ and any index $k_0$, there exists an index $k > k_0$ satisfying the conclusion of Lemma 4.2 in [11], where $u_k = d_k / \|d_k\|$.

By Lemmas 4-7, the global convergence of Figure 1 with equation (18) and the SWP line search can be established in a manner similar to that of Theorem 4.3 in [10]. Therefore, the proof of the following theorem is omitted.

Theorem 5. Let the sequences $g_k$ and $d_k$ be generated by (2) and (18) with the CG formula $\beta_k^{(1)}$ and a step size satisfying (4) and (5). If Lemmas 4-7 hold, then $\liminf_{k \to \infty} \|g_k\| = 0$.

Theorem 6. Let the sequences $x_k$ and $d_k$ be generated using (2) and (17), where $\alpha_k$ is computed by the SWP line search (4) and (5) or any line search. If the CG method with $d_k^{(1)}$ and with $d_k^{(2)}$ attains $\liminf_{k \to \infty} \|g_k\| = 0$, then the method with equation (17) yields the same result, $\liminf_{k \to \infty} \|g_k\| = 0$.

Numerical Results and Discussion
To demonstrate the efficiency of the proposed method, we employ the standard test functions in Table 1, similar to those used in CG_Descent 6.8, giving a strong comparison for the proposed method. The test functions can be obtained from the CUTEst library [27]. The numerical results of $d_k^{(2)}$ (CG_Descent 5.3), $d_k^{(1)}$, and equation (17) are obtained by executing the modified code of [28]. This was performed with memory equal to zero, with the search direction changed to the convex combination of $d_k^{(1)}$ and $d_k^{(2)}$, and with the SWP line search with $\sigma = 0.1$ and $\delta = 0.01$. Here, $\lambda = 0.5$, since we want to use both search directions $d_k^{(1)}$ and $d_k^{(2)}$ in equal proportion. Note that the proposed method is limited to the SWP line search with $\sigma \in (0, 1/2)$. The comparisons are made based on the number of iterations that the algorithm requires to reach the stationary point, the CPU time required for each algorithm to obtain the solution, and the number of function evaluations, i.e., how many times the algorithm evaluates the function. Note that the numbers of function and gradient evaluations are strongly related to the line search. For a fair comparison between the proposed methods and CG_Descent 5.3, we use the same test functions for all algorithms, similar to those used by CG_Descent 5.3. Moreover, we use the approximate line search with CG_Descent 5.3, as introduced by its authors, to obtain the results. In addition, the minimum time was 0.2 seconds for all algorithms. Also, we use Ubuntu 20.04 as the operating system to run the code for all algorithms, which is more accurate than Windows for scientific calculations.
The results are shown in Figures 2-4, representing the performance measure introduced by [29]. This performance measure was introduced to compare a set of solvers $S$ on a set of problems $\rho$. Assuming $n_s$ solvers and $n_p$ problems in $S$ and $\rho$, respectively, the measure $t_{p,s}$ is defined as the computation cost (e.g., the number of iterations or the CPU time) required for solver $s$ to solve problem $p$.
To create a baseline for comparison, the performance of solver $s$ on problem $p$ is scaled by the best performance of any solver in $S$ on that problem, using the ratio

$$r_{p,s} = \frac{t_{p,s}}{\min\{t_{p,s} : s \in S\}}.$$

Let the parameter $r_M$ satisfy $r_M \geq r_{p,s}$ for all selected $p, s$, and assume that $r_{p,s} = r_M$ if and only if solver $s$ does not solve problem $p$. Since we would like to obtain an overall assessment of the performance of a solver, we define the following measure:

$$P_s(t) = \frac{1}{n_p} \left|\{p \in \rho : r_{p,s} \leq t\}\right|.$$

Thus, $P_s(t)$ is the probability for a solver $s \in S$ that the performance ratio $r_{p,s}$ is within a factor $t \in \mathbb{R}$ of the best possible ratio. If we regard $P_s$ as the cumulative distribution function for the performance ratio, then the performance measure for a solver is nondecreasing, piecewise constant, and continuous from the right. The value of $P_s(1)$ is the probability that the solver achieves the best performance of all of the solvers. In general, a solver with high values of $P_s(t)$, which would appear in the upper right corner of the figure, is preferable.
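For reference, the measure $P_s(t)$ above can be computed and plotted in a few lines; the array layout, labels, and example data below are our own assumptions, not the paper's data.

```python
# A hedged sketch of the Dolan-More performance profile P_s(t) described
# above. `times` is an (n_p, n_s) array of measured costs (iterations,
# function evaluations, or CPU time), with np.inf marking a failure.
import numpy as np
import matplotlib.pyplot as plt

def performance_profile(times, labels, t_max=10.0):
    best = times.min(axis=1, keepdims=True)   # best cost per problem
    ratios = times / best                     # r_{p,s}; failures stay inf
    ts = np.linspace(1.0, t_max, 200)
    for s, label in enumerate(labels):
        # P_s(t): fraction of problems with performance ratio r_{p,s} <= t
        profile = [np.mean(ratios[:, s] <= t) for t in ts]
        plt.step(ts, profile, where="post", label=label)
    plt.xlabel("t")
    plt.ylabel("P_s(t)")
    plt.legend()
    plt.show()

# usage with made-up costs for three solvers (A1, A2, A3) on four problems
performance_profile(np.array([[12.0, 15.0, 10.0],
                              [80.0, 70.0, 60.0],
                              [np.inf, 40.0, 35.0],
                              [25.0, 25.0, 20.0]]), ["A1", "A2", "A3"])
```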
In all figures, we see that the convex search direction outperforms both individual search directions in terms of the number of iterations, the number of function evaluations, and the CPU time. In the figures, we rename the methods as follows:

A1 denotes the CG method with search direction $d_k^{(1)}$.
A2 denotes the CG method with search direction $d_k^{(2)}$.
A3 denotes the CG method with search direction $d_k$ in equation (17).
We note from Figure 2 that method A3 slightly outperforms methods A1 and A2 in the number of iterations, while A1 outperforms A2. Figure 3 shows that the new method A3 strongly outperforms method A1 and slightly outperforms method A2 in CPU time, while A1 is strongly competitive with A2. Moreover, from Figure 4, we can see that method A3 outperforms methods A1 and A2 in the number of function evaluations, where A1 and A2 are strongly competitive. It is worth noting that functions with large dimensions, i.e., large unconstrained optimization problems, cost more CPU time, iterations, and gradient evaluations. Thus, we advise readers to use high-performance machines, such as workstation computers with the Ubuntu operating system.

Image Restoration.
In this section, we use the algorithm in Figure 1 to restore corrupted images. We use well-known images such as Cameraman, Boat, Coins, Moon, and Baboon. We corrupted the original images with Gaussian noise with a standard deviation equal to 25%. We then restore these images using the $\beta_k^{DL+}$ CG algorithm and the algorithm in Figure 1. Note that we use $\beta_k^{DL+}$ if the descent condition holds; otherwise, we restart the algorithm by using the steepest descent method. This procedure is similar to that used by Dai and Liao [18] when they obtained their numerical results. For both algorithms, we use the SWP line search to obtain the step size, with $\sigma = 0.1$ and $\delta = 0.01$.
To evaluate the quality of the restored image, we use the root-mean-square error (rmse) between the true (original) image and the restored image, defined as follows:

$$\mathrm{rmse} = \frac{\|\zeta_k - \zeta\|}{\sqrt{N}},$$

where $\zeta$ and $\zeta_k$ are the true and restored images, respectively, and $N$ is the number of pixels. The restored image quality depends on the value of rmse; i.e., a small rmse indicates good quality. The stopping criterion is

$$\frac{|f(x_{k+1}) - f(x_k)|}{|f(x_k)|} \leq \omega,$$

where $\omega = 10^{-3}$. Note that if $\omega = 10^{-4}$ or $\omega = 10^{-6}$, then rmse has the same value, i.e., more iterations with a fixed rmse. Table 2 presents a comparison between the DL+ CG algorithm and the algorithm in Figure 1 using the numerical experiments mentioned before. The comparison includes the number of iterations, the CPU time, and the rmse. The experiments were run in MATLAB (R2010a) on a host computer with an AMD A4-7210 CPU and 2 GB of RAM; thus, the CPU time becomes longer than usual. We note that our new algorithm is efficient for restoring images based on the rmse, the number of iterations, and the CPU time. It is worth noting that if we increase the Gaussian noise, the cost of restoring the image increases. In addition, images of large size need more CPU time than images of small size.
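As an illustration of the setup described above, the sketch below corrupts an image with Gaussian noise and scores a restoration by rmse; interpreting the 25% standard deviation relative to the [0, 255] gray-level range is our assumption.

```python
# An illustrative sketch: add Gaussian noise to an image and compute the
# root-mean-square error between a restored image and the original.
import numpy as np

def add_gaussian_noise(image, sigma=0.25 * 255, seed=0):
    rng = np.random.default_rng(seed)
    noisy = image.astype(float) + rng.normal(0.0, sigma, image.shape)
    return np.clip(noisy, 0.0, 255.0)      # keep valid gray levels

def rmse(restored, original):
    # rmse = ||zeta_k - zeta|| / sqrt(N), N = number of pixels
    diff = restored.astype(float) - original.astype(float)
    return np.sqrt(np.mean(diff ** 2))
```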
In Table 3, we present some numerical results of the algorithm in Figure 1 for restoring images corrupted by Gaussian noise with a standard deviation of 25%. We note that the algorithm in Figure 1 is an efficient method for restoring corrupted images.

Conclusions
In this study, we investigated a convex combination of two different search directions. The new combination satisfies the global convergence properties and the descent property. Our numerical results show that the new combination produces results that are more efficient than both individual search directions and is sometimes competitive with them. In addition, we proposed a new CG method satisfying the descent property and the convergence analysis. Finally, we presented an example of image restoration using the new algorithm.

Data Availability
All data generated and analyzed during this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.