It is well known that the nonlinear conjugate gradient algorithm is one of the most effective algorithms for optimization problems, owing to its low storage requirement and simple structure. This motivates us to design a modified conjugate gradient formula for the optimization model. The proposed conjugate gradient algorithm possesses several properties: (1) the search direction uses not only the gradient value but also the function value; (2) the presented direction has both the sufficient descent property and the trust region feature; (3) the proposed algorithm is globally convergent for nonconvex functions; (4) experiments on image restoration problems and compressive sensing are done to demonstrate the performance of the new algorithm.
Funding: National Natural Science Foundation of China (61772006); Science and Technology Program of Guangxi (AB17129012); Science and Technology Major Project of Guangxi (AA17204096); Special Fund for Scientific and Technological Bases and Talents of Guangxi (2016AD05050); Special Fund for Bagui Scholars of Guangxi.

1. Introduction
Consider the following model defined by
(1) \min_{x \in \mathbb{R}^n} f(x),
where f: \mathbb{R}^n \to \mathbb{R} is a continuous function. Problem (1) has many practical applications in fields such as economics, biology, and engineering. It is well known that the nonlinear conjugate gradient (CG) method is one of the most effective methods for (1). The CG algorithm generates iterates by
(2) x_{k+1} = x_k + \alpha_k d_k, \quad k = 0, 1, 2, \ldots,
where \alpha_k denotes the steplength, x_k is the kth iterate, and d_k is the search direction defined by
(3) d_k = \begin{cases} -g_k + \beta_k d_{k-1}, & \text{if } k \ge 1, \\ -g_k, & \text{if } k = 0, \end{cases}
where g_k = \nabla f(x_k) is the gradient and \beta_k is a scalar that determines the particular CG algorithm ([1–7], etc.). The Polak–Ribière–Polyak (PRP) formula [6, 7] is one of the best-known nonlinear CG formulas:
(4) \beta_k^{PRP} = \frac{g_k^T (g_k - g_{k-1})}{\|g_{k-1}\|^2},
where g_{k-1} = \nabla f(x_{k-1}) and \|\cdot\| is the Euclidean norm. The PRP method has been studied by many scholars and many results have been obtained (see [7–12], etc.), since the PRP algorithm has superior numerical performance despite its incomplete convergence theory. At present, the global convergence of the PRP algorithm for nonconvex functions under the weak Wolfe–Powell (WWP) inexact line search is still open, and it is one of the well-known open problems in the optimization field. Based on the PRP formula, many modified nonlinear CG formulas have been proposed ([13–16], etc.) in order to exploit its excellent numerical behavior. Recently, Yuan et al. [17] opened up a new way by modifying the WWP line search technique and partly proved the global convergence of the PRP algorithm. Further results have been obtained with this technique (see [18–20], etc.). It has been shown that nonlinear CG algorithms can be applied to nonlinear equations, nonsmooth optimization, and image restoration problems (see [21–24], etc.).
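As a minimal illustration of iteration (2)–(4), the following Python sketch (the function names are ours, not from any particular library) computes the PRP scalar and the corresponding search direction:

```python
import numpy as np

def beta_prp(g_new, g_old):
    # PRP scalar (4): beta_k = g_k^T (g_k - g_{k-1}) / ||g_{k-1}||^2
    return float(g_new @ (g_new - g_old)) / float(g_old @ g_old)

def cg_direction(g_new, d_old, beta):
    # search direction (3): d_k = -g_k + beta_k d_{k-1}
    return -g_new + beta * d_old

# tiny worked example
g_old = np.array([1.0, 0.0])
g_new = np.array([0.0, 2.0])
beta = beta_prp(g_new, g_old)          # = 4.0
d = cg_direction(g_new, -g_old, beta)  # = [-4, -2]
```

A full CG solver would combine these two helpers with a line search for the steplength \alpha_k in (2).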
We all know that the sufficient descent property
(5) d_k^T g_k \le -c \|g_k\|^2, \quad c > 0,
plays an important role in the convergence analysis of CG methods (see [13, 14, 24], etc.), where c is a constant. Another crucial condition is the nonnegativity of the scalar, \beta_k \ge 0, which was pointed out by Powell [10] and further emphasized in the global convergence analyses [11, 12]. Thus, under the sufficient descent condition and the WWP technique, a modified PRP formula \beta_k^{PRP+} = \max\{0, \beta_k^{PRP}\} was presented by Gilbert and Nocedal [13], and its global convergence for nonconvex functions was established. All of these observations tell us that both property (5) and \beta_k \ge 0 are very important in CG algorithms. To obtain one or both of these conditions, many scholars have made further studies and obtained many interesting results. Yu [25] presented a modified PRP nonlinear CG formula defined by
(6) \beta_k^{mPRP1} = \frac{g_{k+1}^T y_k}{\|g_k\|^2} - \mu \frac{\|y_k\|^2 g_{k+1}^T d_k}{\|g_k\|^4},
where \mu > 1/4 is a constant and y_k = g_{k+1} - g_k; the resulting direction has property (5) with c = 1 - 1/(4\mu). Yuan [12] proposed a further formula defined by
(7) \beta_k^{mPRP2} = \frac{g_{k+1}^T y_k}{\|g_k\|^2} - \min\left\{ \frac{g_{k+1}^T y_k}{\|g_k\|^2}, \ \mu \frac{\|y_k\|^2 g_{k+1}^T d_k}{\|g_k\|^4} \right\},
which possesses not only property (5) with c = 1 - 1/(4\mu) but also the nonnegativity \beta_k^{mPRP2} \ge 0. To obtain a larger descent, a three-term FR CG formula was given by Zhang et al. [26]:
(8) d_{k+1} = -\theta_k g_{k+1} + \beta_k^{FR} d_k, \quad \theta_k = \frac{d_k^T y_k}{\|g_k\|^2},
which satisfies (5) with c = 1. Dai and Tian [27] gave another CG direction defined by
(9) d_{k+1} = \begin{cases} -\left(1 + \beta_k \dfrac{d_k^T g_{k+1}}{\|g_{k+1}\|^2}\right) g_{k+1} + \beta_k d_k, & \text{if } k \ge 0, \\ -g_k, & \text{if } k = 0, \end{cases}
which also possesses (5) with c = 1. The global convergence of this CG method was proved by Dai and Tian [27] for \beta_k = \beta_k^{mPRP1} and \beta_k = \beta_k^{mPRP2}. For nonconvex functions under the effective Armijo line search, however, they did not analyze convergence; one of the main reasons lies in the lack of the trust region feature. To overcome this, we [28] proposed a CG direction defined by
(10) d_0 = -g_0, \quad d_{k+1} = -g_{k+1} + \frac{\beta_k d_k - \left(\beta_k d_k^T g_{k+1}/\|g_{k+1}\|^2\right) g_{k+1}}{\gamma_k}, \quad k \ge 0,
where \gamma_k = \|\beta_k d_k\|/\|g_{k+1}\|; this direction possesses not only (5) with c = 1 but also the trust region property.
It has been shown that a CG formula tends to have better numerical performance if it uses not only gradient information but also function value information [29]. This motivates us to present a CG formula based on (10):
(11) d_0 = -g_0, \quad d_{k+1} = -g_{k+1} + \frac{\beta_k^* d_k - \left(\beta_k^* d_k^T g_{k+1}/\|g_{k+1}\|^2\right) g_{k+1}}{\gamma_k^*}, \quad k \ge 0,
where \gamma_k^* = \|\beta_k^* d_k\|/\|g_{k+1}\|,
\beta_k^* = \beta_k^{mPRP2*} = \frac{g_{k+1}^T y_k^*}{\|g_k\|^2} - \min\left\{ \frac{g_{k+1}^T y_k^*}{\|g_k\|^2}, \ \mu \frac{\|y_k^*\|^2 g_{k+1}^T d_k}{\|g_k\|^4} \right\},
and y_k^* = y_k + \rho_k s_k with \rho_k = \max\{\varrho_k, 0\}/\|s_k\|^2, \varrho_k = 2[f(x_k) - f(x_{k+1})] + (g_{k+1} + g_k)^T s_k, and s_k = x_{k+1} - x_k. The modified vector y_k^* [30] has been shown to possess good theoretical and numerical properties. Yuan et al. [29] used it in a CG formula and obtained good results. These achievements inspire us to propose the new CG direction (11). This paper has the following features:
The sufficient descent property and the trust region feature are obtained
The new direction uses not only gradient values but also function values
The given algorithm is globally convergent under the Armijo line search for nonconvex functions
Experiments on image restoration problems and compressive sensing are done to test the performance of the new algorithm
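Under the definitions above, direction (11) can be sketched numerically as follows; this is a hedged Python sketch, the function and variable names are ours, and the fallback to steepest descent when \beta_k^* d_k = 0 is our assumption (in that case \gamma_k^* is undefined):

```python
import numpy as np

def direction_11(g_new, g_old, d_old, f_old, f_new, s, mu=300.0):
    """Sketch of direction (11): d_{k+1} from g_{k+1}, g_k, d_k, f values, and s_k."""
    y = g_new - g_old
    # y* = y + rho*s with rho = max(varrho, 0)/||s||^2,
    # varrho = 2(f_k - f_{k+1}) + (g_{k+1} + g_k)^T s
    varrho = 2.0 * (f_old - f_new) + (g_new + g_old) @ s
    y_star = y + (max(varrho, 0.0) / (s @ s)) * s
    gk2 = g_old @ g_old
    t = (g_new @ y_star) / gk2
    # beta* = t - min(t, mu ||y*||^2 g_{k+1}^T d_k / ||g_k||^4)  (nonnegative)
    beta = t - min(t, mu * (y_star @ y_star) * (g_new @ d_old) / gk2**2)
    v = beta * d_old
    nv = np.linalg.norm(v)
    if nv == 0.0:
        return -g_new                       # assumed fallback: steepest descent
    gamma = nv / np.linalg.norm(g_new)      # gamma* = ||beta* d_k|| / ||g_{k+1}||
    return -g_new + (v - (v @ g_new) / (g_new @ g_new) * g_new) / gamma

# random example: the direction satisfies (12) and (13) by construction
rng = np.random.default_rng(1)
g_old, g_new = rng.standard_normal(5), rng.standard_normal(5)
d_old = -g_old
d_new = direction_11(g_new, g_old, d_old, 1.0, 0.9, 0.1 * d_old)
```

Regardless of the random data, d_new^T g_{k+1} = -\|g_{k+1}\|^2 and \|d_new\| \le 3\|g_{k+1}\| hold, which is exactly what Theorem 1 below establishes.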
The next section states the given algorithm. Section 3 gives the convergence analysis, Section 4 reports the experiments, and the last section presents the conclusions.
2. Algorithm
Based on the discussion in the previous section, the new CG algorithm is stated in Algorithm 1.
Initial step: given an initial point x_0 \in \mathbb{R}^n and positive constants \epsilon \in (0, 1), \sigma_0 > 0, \delta \in (0, 1), \mu > 1/4, \sigma \in (0, 1), set d_0 = -g_0 = -\nabla f(x_0) and k := 0.
Step 1: stop if \|g_k\| \le \epsilon holds.
Step 2: find \alpha_k = \sigma_0 \sigma^{i_k} such that
f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k,
where i_k is the smallest integer in \{0, 1, 2, \ldots\} satisfying this inequality.
Step 3: set x_{k+1} = x_k + \alpha_k d_k.
Step 4: stop if \|g_{k+1}\| \le \epsilon holds.
Step 5: compute d_{k+1} by (11).
Step 6: set k := k + 1 and go to Step 2.
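A hedged Python sketch of Algorithm 1 follows; the helper names and the safeguard when \beta_k^* d_k = 0 are our assumptions, and the parameter defaults follow the values used in Section 4:

```python
import numpy as np

def algorithm1(f, grad, x0, eps=1e-6, sigma0=0.1, sigma=0.5,
               delta=0.9, mu=300.0, max_iter=5000):
    """Sketch of Algorithm 1: Armijo backtracking (Step 2) + direction (11) (Step 5)."""
    x = np.asarray(x0, dtype=float)
    g = grad(x)
    d = -g.copy()
    for _ in range(max_iter):
        if np.linalg.norm(g) <= eps:                       # Steps 1/4: stopping test
            break
        alpha = sigma0                                     # Step 2: alpha = sigma0*sigma^i
        while f(x + alpha * d) > f(x) + delta * alpha * (g @ d):
            alpha *= sigma
        s = alpha * d                                      # Step 3
        x_new = x + s
        g_new = grad(x_new)
        # Step 5: direction (11) with y* = y + rho*s
        y = g_new - g
        varrho = 2.0 * (f(x) - f(x_new)) + (g_new + g) @ s
        y_star = y + (max(varrho, 0.0) / (s @ s)) * s
        gk2 = g @ g
        t = (g_new @ y_star) / gk2
        beta = t - min(t, mu * (y_star @ y_star) * (g_new @ d) / gk2**2)
        v = beta * d
        nv = np.linalg.norm(v)
        if nv == 0.0:
            d = -g_new                                     # assumed fallback
        else:
            gamma = nv / np.linalg.norm(g_new)
            d = -g_new + (v - (v @ g_new) / (g_new @ g_new) * g_new) / gamma
        x, g = x_new, g_new
    return x

# usage: minimize a simple smooth function
f = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2
grad = lambda x: np.array([2.0 * (x[0] - 1.0), 2.0 * (x[1] + 2.0)])
x_star = algorithm1(f, grad, np.array([5.0, 5.0]))
```

Since d_k^T g_k = -\|g_k\|^2 < 0 whenever g_k \ne 0, the Armijo backtracking loop in Step 2 always terminates after finitely many reductions.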
Theorem 1.
Let the direction d_k be defined by (11); then there exists a constant \beta > 0 satisfying
(12) d_k^T g_k = -\|g_k\|^2, \quad \forall k \ge 0,
(13) \|d_k\| \le \beta \|g_k\|, \quad \forall k \ge 0.
Proof.
By (11), we directly get (12) and (13) for k = 0 with any \beta \ge 1. For k \ge 0, using (11) again, we have
(14) g_{k+1}^T d_{k+1} = g_{k+1}^T \left[ -g_{k+1} + \frac{\beta_k^{mPRP2*} d_k - \left(\beta_k^{mPRP2*} d_k^T g_{k+1}/\|g_{k+1}\|^2\right) g_{k+1}}{\gamma_k^*} \right] = -\|g_{k+1}\|^2 + \frac{\beta_k^{mPRP2*} g_{k+1}^T d_k - \beta_k^{mPRP2*} d_k^T g_{k+1}}{\|\beta_k^{mPRP2*} d_k\|/\|g_{k+1}\|} = -\|g_{k+1}\|^2,
so (12) is true. By (11) again, we can get
(15) \|d_{k+1}\| = \left\| -g_{k+1} + \frac{\beta_k^{mPRP2*} d_k - \left(\beta_k^{mPRP2*} d_k^T g_{k+1}/\|g_{k+1}\|^2\right) g_{k+1}}{\gamma_k^*} \right\| \le \|g_{k+1}\| + \frac{\|\beta_k^{mPRP2*} d_k\| + \|\beta_k^{mPRP2*} d_k\|\|g_{k+1}\|^2/\|g_{k+1}\|^2}{\|\beta_k^{mPRP2*} d_k\|/\|g_{k+1}\|} = \|g_{k+1}\| + 2\|g_{k+1}\| = 3\|g_{k+1}\|,
which implies that (13) holds by choosing \beta \in [3, +\infty). We complete the proof.
Remark 1.
The relation (13) is the so-called trust region feature, and the above theorem tells us that direction (11) has not only the sufficient descent property but also the trust region feature. Both relations (12) and (13) make the proof of the global convergence of Algorithm 1 easy to establish.
3. Global Convergence
For nonconvex functions, the global convergence of Algorithm 1 is established under the following assumptions.
Assumption 1.
Assume that the function f(x) has at least one stationary point x^*, namely, g(x^*) = 0 holds. Suppose that the level set L_0 = \{x \mid f(x) \le f(x_0)\} is bounded.
Assumption 2.
The function f(x) is twice continuously differentiable and bounded below, and its gradient g(x) is Lipschitz continuous; that is, there exists a constant L > 0 such that
(16) \|g(x) - g(y)\| \le L \|x - y\|, \quad x, y \in \mathbb{R}^n.
Now, we prove the global convergence of Algorithm 1.
Theorem 2.
Let Assumptions 1 and 2 hold. Then, we get
(17) \lim_{k \to \infty} \|g_k\| = 0.
Proof.
Using (12) and Step 2 of Algorithm 1, we obtain
(18) f(x_k + \alpha_k d_k) \le f(x_k) + \delta \alpha_k g_k^T d_k = f(x_k) - \delta \alpha_k \|g_k\|^2,
which means that the sequence \{f(x_k)\} is decreasing and that the relation
(19) \delta \alpha_k \|g_k\|^2 \le f(x_k) - f(x_k + \alpha_k d_k)
is true. Summing these inequalities for k from 0 to \infty and using the fact that f is bounded below (Assumption 2), we deduce that
(20) \sum_{k=0}^{\infty} \delta \alpha_k \|g_k\|^2 \le f(x_0) - f_\infty < +\infty
holds. Thus, we have
(21) \lim_{k \to \infty} \alpha_k \|g_k\|^2 = 0.
This implies that
(22) \lim_{k \to \infty} \|g_k\| = 0,
or
(23) \lim_{k \to \infty} \alpha_k = 0.
If (22) holds, the proof of the theorem is complete. Assuming instead that (23) is true, we aim to derive (17). Since the stepsize \alpha_k satisfies the inequality in Step 2 of Algorithm 1, the larger trial stepsize \alpha_k^* = \alpha_k/\sigma fails it; that is,
(24) f(x_k + \alpha_k^* d_k) > f(x_k) + \delta \alpha_k^* d_k^T g_k.
By (12), (13), and the well-known mean value theorem, we obtain
(25) f(x_k + \alpha_k^* d_k) - f(x_k) = \alpha_k^* d_k^T g_k + O\!\left(\alpha_k^{*2} \|d_k\|^2\right) = -\frac{\alpha_k}{\sigma} \|g_k\|^2 + O\!\left(\frac{\alpha_k^2}{\sigma^2} \|d_k\|^2\right) > \delta \alpha_k^* d_k^T g_k = -\frac{\delta \alpha_k}{\sigma} \|g_k\|^2,
which implies that
(26) \alpha_k > \frac{\sigma (1 - \delta) \|g_k\|^2}{O(\|d_k\|^2)} \ge \frac{\sigma (1 - \delta)}{O(1)\beta^2}
is true, where (13) is used in the last inequality. This contradicts (23), so only relation (22) holds. We complete the proof.
Remark 2.
We can see that the proof of global convergence is very simple, since the defined direction (11) has not only the good sufficient descent property (12) but also the trust region feature (13).
4. Numerical Results
The numerical experiments for image restoration problems and compressive sensing are done with Algorithm 1 and the normal PRP algorithm, respectively. All codes are written in MATLAB R2014a and run on a PC with an Intel(R) Core(TM) i7-7700T CPU @ 2.9 GHz, 16.00 GB of RAM, and the Windows 10 operating system. The parameters are chosen as \sigma = 0.5, \sigma_0 = 0.1, \delta = 0.9, and \mu = 300.
4.1. Image Restoration Problems
Let x be the true image with M \times N pixels, and let A = \{1, 2, \ldots, M\} \times \{1, 2, \ldots, N\}. At a pixel location (i, j) \in A, x_{i,j} denotes the gray level of x. Suppose that \zeta is the observed noisy image of x corrupted by salt-and-pepper noise, where s_{\min} and s_{\max} denote the minimum and maximum values of a noisy pixel, and let \bar{\zeta} be the image obtained by applying an adaptive median filter to \zeta. Then the index set of noise candidates is defined by
(27) \mathcal{N} := \{ (i, j) \in A : \bar{\zeta}_{i,j} \ne \zeta_{i,j}, \ \zeta_{i,j} = s_{\min} \text{ or } s_{\max} \}.
Let \phi_{i,j} = \{(i, j-1), (i, j+1), (i-1, j), (i+1, j)\} be the neighborhood of (i, j). In the two-phase method, the following holds: (i) if (i, j) \notin \mathcal{N}, the pixel is identified as uncorrupted and its observed value is kept, that is, w^*_{i,j} = \zeta_{i,j} for the element w^*_{i,j} of the denoised image w^*; (ii) if (i, j) \in \mathcal{N}, the value \zeta_{i,j} must be restored. Chan et al. [31] presented a smooth function f_\alpha, without a nonsmooth term, and minimized it to obtain the restored images:
(28) f_\alpha(w) = \sum_{(i,j) \in \mathcal{N}} \left[ \sum_{(m,n) \in \phi_{i,j} \setminus \mathcal{N}} \psi_\alpha(w_{i,j} - \zeta_{m,n}) + \frac{1}{2} \sum_{(m,n) \in \phi_{i,j} \cap \mathcal{N}} \psi_\alpha(w_{i,j} - w_{m,n}) \right],
where \alpha is a constant and \psi_\alpha is an even edge-preserving potential function. The numerical performance of f_\alpha is noteworthy [32, 33].
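As an illustration of functional (28), the sketch below evaluates f_\alpha over the noise-candidate set; the concrete potential \psi_\alpha(t) = \sqrt{t^2 + \alpha} is our assumption for the example, since the paper leaves \psi_\alpha generic:

```python
import numpy as np

def psi(t, alpha=1e-2):
    # one common choice of even edge-preserving potential (our assumption)
    return np.sqrt(t * t + alpha)

def f_alpha(w, zeta, noisy, alpha=1e-2):
    """Evaluate (28). w: candidate image, zeta: observed noisy image,
    noisy: boolean mask of the noise-candidate set N."""
    M, N = zeta.shape
    total = 0.0
    for i, j in zip(*np.nonzero(noisy)):
        # 4-neighborhood phi_{i,j}
        for m, n in ((i, j - 1), (i, j + 1), (i - 1, j), (i + 1, j)):
            if not (0 <= m < M and 0 <= n < N):
                continue
            if noisy[m, n]:
                # noisy neighbor: 1/2 factor avoids double counting
                total += 0.5 * psi(w[i, j] - w[m, n], alpha)
            else:
                # clean neighbor: compare against the observed value
                total += psi(w[i, j] - zeta[m, n], alpha)
    return total
```

Minimizing this functional over the pixels in \mathcal{N} (e.g., with Algorithm 1) yields the restored image.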
We choose Barbara 512×512, Man 256×256, Baboon 512×512, and Lena 256×256 as the test images. The well-known PRP CG algorithm (PRP algorithm) is also run for comparison with Algorithm 1. The detailed results are shown in Figures 1 and 2.
Restoration of Barbara, Man, Baboon, and Lena by Algorithm 1 and the PRP algorithm. From left to right: a noisy image with 20% salt-and-pepper noise and restorations obtained by minimizing (28) with the PRP algorithm and Algorithm 1.
Restoration of Barbara, Man, Baboon, and Lena by Algorithm 1 and the PRP algorithm. From left to right: a noisy image with 40% salt-and-pepper noise and restorations obtained by minimizing (28) with the PRP algorithm and Algorithm 1.
Figures 1 and 2 show that both algorithms (Algorithm 1 and the PRP algorithm) successfully solve these image restoration problems with good results. To compare their performances directly, the restoration quality is assessed by the peak signal-to-noise ratio (PSNR) defined in [34–36], which is computed and listed in Table 1. From the values in Table 1, we can see that Algorithm 1 is competitive with the PRP algorithm, since its average PSNR values are close to those of the PRP algorithm.
Table 1: PSNR of Algorithm 1 and the PRP algorithm.

                 Barbara    Man       Baboon    Lena      Average
20% noise
Algorithm 1      31.115     38.0355   29.4393   41.0674   34.9143
PRP algorithm    31.1118    37.9583   29.4534   41.356    34.969
40% noise
Algorithm 1      27.5415    34.0063   25.8947   36.6496   31.0230
PRP algorithm    27.6153    34.5375   25.8571   36.701    31.1777
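The PSNR values in Table 1 can be computed with the standard definition below; the peak value of 255 for 8-bit images is our assumption:

```python
import numpy as np

def psnr(restored, original, peak=255.0):
    # peak signal-to-noise ratio in dB: 10 * log10(peak^2 / MSE)
    mse = np.mean((restored.astype(float) - original.astype(float)) ** 2)
    return 10.0 * np.log10(peak * peak / mse)
```

A higher PSNR indicates a restoration closer to the original image, which is why the averages in Table 1 allow a direct comparison of the two algorithms.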
4.2. Compressive Sensing
In this section, the following compressive sensing images are tested: Phantom 256×256, Fruits 256×256, and Boat 256×256. Each image is processed as 256-dimensional vectors, and the size of the observation matrix is 100×256. The Fourier transform is used, and the measurements are taken in the Fourier domain.
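A minimal sketch of the measurement setup described above follows; only the 100 × 256 partial-Fourier dimensions come from the text, while the sparse test signal, the random frequency subset, and the seed are our assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 256, 100                      # signal length and number of measurements

# a sparse test signal (8 nonzero entries, our assumption)
x = np.zeros(n)
x[rng.choice(n, 8, replace=False)] = rng.standard_normal(8)

# 100 x 256 partial-Fourier observation matrix: random rows of the unitary DFT
F = np.fft.fft(np.eye(n)) / np.sqrt(n)
rows = rng.choice(n, m, replace=False)
A = F[rows, :]

# measurements in the Fourier domain
b = A @ x
```

Recovering x from the underdetermined system A x = b then amounts to a sparsity-regularized optimization problem, which a CG-type algorithm such as Algorithm 1 can be applied to through a smooth reformulation.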
Figures 3–5 show that both algorithms work well on these images and can successfully recover them.
Phantom: (a) the general images, (b) the recovered images by Algorithm 1, and (c) the recovered images by the PRP algorithm.
Fruits: (a) the general images, (b) the recovered images by Algorithm 1, and the (c) recovered images by the PRP algorithm.
Boat: (a) the general images, (b) the recovered images by Algorithm 1, and (c) the recovered images by the PRP algorithm.
5. Conclusion
In this paper, a CG algorithm is designed for unconstrained optimization problems. The given method possesses not only the sufficient descent property but also the trust region feature, and its global convergence is proved in a simple way. Image restoration and compressive sensing problems are tested to show that the proposed algorithm outperforms the normal PRP algorithm. In the future, we will pay attention to the following aspects: (i) we believe that many effective CG algorithms can be successfully applied to image restoration problems and compressive sensing; (ii) more experiments will be done to test the performance of the new algorithm.
Data Availability
All data are included in the paper.
Conflicts of Interest
The authors declare that there are no conflicts of interest.
Acknowledgments
The authors would like to thank the funding agencies for their support. This work was supported by the National Natural Science Foundation of China under Grant no. 61772006, the Science and Technology Program of Guangxi under Grant no. AB17129012, the Science and Technology Major Project of Guangxi under Grant no. AA17204096, the Special Fund for Scientific and Technological Bases and Talents of Guangxi under Grant no. 2016AD05050, and the Special Fund for Bagui Scholars of Guangxi.
References
[1] Y. Dai and Y. Yuan, "A nonlinear conjugate gradient method with a strong global convergence property," SIAM Journal on Optimization, vol. 10, pp. 177–182, 2000.
[2] R. Fletcher, Practical Methods of Optimization, 2nd edition, John Wiley and Sons, New York, NY, USA, 1987.
[3] R. Fletcher and C. M. Reeves, "Function minimization by conjugate gradients," The Computer Journal, vol. 7, no. 2, pp. 149–154, 1964.
[4] M. R. Hestenes and E. Stiefel, "Methods of conjugate gradients for solving linear systems," Journal of Research of the National Bureau of Standards, vol. 49, no. 6, pp. 409–436, 1952.
[5] Y. Liu and C. Storey, "Efficient generalized conjugate gradient algorithms, part 1: theory," Journal of Optimization Theory and Applications, vol. 69, no. 1, pp. 129–137, 1991.
[6] B. T. Polyak, "The conjugate gradient method in extreme problems," USSR Computational Mathematics and Mathematical Physics, vol. 9, no. 4, pp. 94–112, 1969.
[7] E. Polak and G. Ribière, "Note sur la convergence de méthodes de directions conjuguées," Revue Française d'Informatique et de Recherche Opérationnelle, vol. 3, no. 16, pp. 35–43, 1969.
[8] Y. Dai, "Convergence properties of the BFGS algorithm," SIAM Journal on Optimization, vol. 13, no. 3, pp. 693–701, 2002.
[9] Y. Dai, Analysis of Conjugate Gradient Methods, Ph.D. thesis, Institute of Computational Mathematics and Scientific/Engineering Computing, Chinese Academy of Sciences, Beijing, China, 1997.
[10] M. J. D. Powell, "Nonconvex minimization calculations and the conjugate gradient method," Lecture Notes in Mathematics, vol. 1066, pp. 122–141, Springer-Verlag, Berlin, Germany, 1984.
[11] M. J. D. Powell, "Convergence properties of algorithms for nonlinear optimization," SIAM Review, vol. 28, no. 4, pp. 487–500, 1986.
[12] G. Yuan, "Modified nonlinear conjugate gradient methods with sufficient descent property for large-scale optimization problems," Optimization Letters, vol. 3, no. 1, pp. 11–21, 2009.
[13] J. C. Gilbert and J. Nocedal, "Global convergence properties of conjugate gradient methods for optimization," SIAM Journal on Optimization, vol. 2, no. 1, pp. 21–42, 1992.
[14] W. W. Hager and H. Zhang, "A new conjugate gradient method with guaranteed descent and an efficient line search," SIAM Journal on Optimization, vol. 16, no. 1, pp. 170–192, 2005.
[15] W. W. Hager and H. Zhang, "Algorithm 851: CG_DESCENT, a conjugate gradient method with guaranteed descent," ACM Transactions on Mathematical Software, vol. 32, no. 1, pp. 113–137, 2006.
[16] Z. Wei, S. Yao, and L. Liu, "The convergence properties of some new conjugate gradient methods," Applied Mathematics and Computation, vol. 183, no. 2, pp. 1341–1350, 2006.
[17] G. Yuan, Z. Wei, and X. Lu, "Global convergence of BFGS and PRP methods under a modified weak Wolfe–Powell line search," Applied Mathematical Modelling, vol. 47, pp. 811–825, 2017.
[18] X. Li, S. Wang, Z. Jin, and H. Pham, "A conjugate gradient algorithm under Yuan-Wei-Lu line search technique for large-scale minimization optimization models," Mathematical Problems in Engineering, vol. 2018, Article ID 4729318, 2018.
[19] G. Yuan, Z. Sheng, B. Wang, W. Hu, and C. Li, "The global convergence of a modified BFGS method for nonconvex functions," Journal of Computational and Applied Mathematics, vol. 327, pp. 274–294, 2018.
[20] G. Yuan, Z. Wei, and Y. Yang, "The global convergence of the Polak–Ribière–Polyak conjugate gradient algorithm under inexact line search for nonconvex functions," Journal of Computational and Applied Mathematics, vol. 362, pp. 262–275, 2019.
[21] J. Cao and J. Wu, "A conjugate gradient algorithm and its applications in image restoration," Applied Numerical Mathematics, vol. 152, pp. 243–252, 2020.
[22] G. Yuan, T. Li, and W. Hu, "A conjugate gradient algorithm for large-scale nonlinear equations and image restoration problems," Applied Numerical Mathematics, vol. 147, pp. 129–141, 2020.
[23] G. Yuan, J. Lu, and Z. Wang, "The PRP conjugate gradient algorithm with a modified WWP line search and its application in the image restoration problems," Applied Numerical Mathematics, vol. 152, pp. 1–11, 2020.
[24] G. Yuan, Z. Meng, and Y. Li, "A modified Hestenes and Stiefel conjugate gradient algorithm for large-scale nonsmooth minimizations and nonlinear equations," Journal of Optimization Theory and Applications, vol. 168, no. 1, pp. 129–152, 2016.
[25] G. Yu, Nonlinear Self-Scaling Conjugate Gradient Methods for Large-Scale Optimization Problems, Ph.D. thesis, Sun Yat-Sen University, Guangzhou, China, 2007.
[26] L. Zhang, W. Zhou, and D. Li, "Global convergence of a modified Fletcher–Reeves conjugate gradient method with Armijo-type line search," Numerische Mathematik, vol. 104, no. 4, pp. 561–572, 2006.
[27] Z.-F. Dai and B.-S. Tian, "Global convergence of some modified PRP nonlinear conjugate gradient methods," Optimization Letters, vol. 5, pp. 615–630, 2011.
[28] J. Cao and J. Wu, "A conjugate gradient algorithm and its applications in image restoration," Applied Numerical Mathematics, vol. 152, pp. 243–252, 2019.
[29] G. Yuan, Z. Wei, and Q. Zhao, "A modified Polak–Ribière–Polyak conjugate gradient algorithm for large-scale optimization problems," IIE Transactions, vol. 46, no. 4, pp. 397–413, 2014.
[30] G. Yuan and Z. Wei, "Convergence analysis of a modified BFGS method on convex minimizations," Computational Optimization and Applications, vol. 47, no. 2, pp. 237–255, 2010.
[31] R. H. Chan, C. W. Ho, C. Y. Leung, and M. Nikolova, "Minimization of detail-preserving regularization functional by Newton's method with continuation," in Proceedings of the IEEE International Conference on Image Processing, pp. 125–128, Genova, Italy, September 2005.
[32] J. F. Cai, R. H. Chan, and B. Morini, "Minimization of an edge-preserving regularization functional by conjugate gradient type methods," pp. 109–122, Springer, Berlin, Germany, 2007.
[33] Y. Dong, R. H. Chan, and S. Xu, "A detection statistic for random-valued impulse noise," IEEE Transactions on Image Processing, vol. 16, no. 4, pp. 1112–1120, 2007.
[34] A. Bovik, Handbook of Image and Video Processing, Academic Press, New York, NY, USA, 2000.
[35] F. Rahpeymaii, K. Amini, T. Allahviranloo, and M. R. Malkhalifeh, "A new class of conjugate gradient methods for unconstrained smooth optimization and absolute value equations," Calcolo, vol. 56, 2019.
[36] G. Yu, J. Huang, and Y. Zhou, "A descent spectral conjugate gradient method for impulse noise removal," Applied Mathematics Letters, vol. 23, no. 5, pp. 555–560, 2010.