A Nonmonotone Gradient Algorithm for Total Variation Image Denoising Problems

The total variation (TV) model has been studied extensively because it preserves sharp features and captures sparse critical information in images. However, the TV denoising problem is usually so ill-conditioned that the classical monotone projected gradient method cannot solve it efficiently. Therefore, a new strategy based on a nonmonotone approach, the accelerated spectral projected gradient (ASPG) method, is developed for solving TV. Furthermore, traditional TV is handled by vectorization, which makes the design of algorithms far more complicated. To simplify the computation, a new technique is developed that works with matrices rather than vectors. Numerical results show that the ASPG algorithm outperforms some state-of-the-art algorithms in both accuracy and convergence speed.


Introduction
Blur and noise are unavoidable during the acquisition and transmission of images; they may be introduced by poor lighting conditions, faulty cameras, imperfect transmission channels, and so on. It is interesting and challenging to restore the ideal image from its degraded version. Several effective mathematical models and algorithms have emerged, such as partial differential equations [1], sparse-land models [2,3], and regularization [2].
This paper focuses on one of the most popular and effective tools for image denoising, TV [3-5], which has been shown both experimentally and theoretically to preserve sharp edges. Owing to its good performance, TV has been extensively studied in many image processing fields such as denoising [3,4], deblurring [6], and inpainting [5,7]. Meanwhile, various algorithms [3,5-8] for solving TV have also been proposed. Despite the number of algorithms, most of them rely on vectorization [3,6]. As images are two-dimensional signals, it is more intuitive and suitable to process them as matrices than as vectors. Furthermore, image processing in a matrix view is simpler and more efficient. In this work, some general and useful properties of TV are given in matrix form; they also apply to other algorithms without any modification.
Over the last few decades, many variations of the projected gradient (PG) method have been proposed to solve TV denoising problems [9-11]. One drawback of the classical PG method is its complexity of $O(1/k)$ [12], where $k$ is the iteration counter. Many authors have worked on enhancing PG. Birgin et al. [13,14] added a nonmonotone technique to classical PG and developed the spectral projected gradient (SPG) method. Nesterov [15] gave a concise and efficient strategy which improved the convergence rate of PG to $O(1/k^2)$. Recently, similar techniques were applied to different traditional algorithms [8,10,11,16,17], and the corresponding optimal complexity results were obtained: Beck and Teboulle [10] proposed fast duality-based gradient projection methods to solve TV; Chambolle and Pock [11] and Goldfarb et al. [16] accelerated the classical primal-dual algorithm and alternating direction augmented Lagrangian methods, respectively; more recently, Ouyang et al. [17] adopted the same technique to enhance the convergence of the alternating direction method of multipliers (ADMM). Although many studies build on Nesterov's theory, to the best of our knowledge no work has explored combining Nesterov's scheme [15] with the spectral projected gradient algorithm. Although the SPG approach is interesting, a strong disturbance always occurs before its convergence, which detracts from the approach; meanwhile, the key idea in Nesterov's work [15] is that a combination of past iterates is used to compute the next iterate. This work aims to reduce the disturbance of SPG by Nesterov's method and proposes an accelerated spectral projected gradient (ASPG) method for the TV denoising problem.

Primal-Dual Framework of Total Variation Denoising
In this section, we consider the TV denoising problem (1), which recovers an ideal image from its noised version. The box region Ω, practically selected as [0, 1] or [0, 255], constrains the pixel values of the recovered image, and $\|\cdot\|_{\mathrm{TV}}$ is the TV seminorm, which is usually divided into isotropic and anisotropic types, defined as in the paper of Chambolle [9]. It is worth mentioning that our approach can be applied to both types. However, for the sake of compactness, we will put emphasis on one of them.
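To make the two types of TV seminorm concrete, the following is a minimal sketch in matrix form. It assumes forward differences with periodic wrap-around (the boundary condition adopted later in the paper); the function names are hypothetical and chosen only for illustration.

```python
import numpy as np

def forward_differences(X):
    """Forward differences of a 2-D image with periodic wrap-around (assumption)."""
    d_vert = np.roll(X, -1, axis=0) - X   # X[i+1, j] - X[i, j]
    d_horz = np.roll(X, -1, axis=1) - X   # X[i, j+1] - X[i, j]
    return d_vert, d_horz

def tv_isotropic(X):
    """Sum over pixels of the Euclidean norm of the discrete gradient."""
    d_vert, d_horz = forward_differences(X)
    return float(np.sum(np.sqrt(d_vert**2 + d_horz**2)))

def tv_anisotropic(X):
    """Sum over pixels of the l1 norm of the discrete gradient."""
    d_vert, d_horz = forward_differences(X)
    return float(np.sum(np.abs(d_vert) + np.abs(d_horz)))

# A single bright pixel on a dark 4 x 4 image.
X = np.zeros((4, 4))
X[1, 1] = 1.0
print(tv_isotropic(X), tv_anisotropic(X))
```

For this single-pixel image the isotropic value is 2 + √2 and the anisotropic value is 4, which shows that the two seminorms differ only at pixels where both differences are nonzero.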
For convenience, we write $\mathcal{C} = \mathbb{R}^{(m-1)\times n} \times \mathbb{R}^{m\times(n-1)}$. Thus, we can define the inner product (2) on $\mathcal{C}$ through the two components $P_i$ ($i = 1, 2$) of each element.

The rest of this paper is organized as follows. In the following subsections, some important operators in matrix form for solving TV are explained, followed by the algorithm for TV-based image denoising. In Section 3, the spectral projected gradient algorithm is first recapped, and the accelerated version ASPG is then derived from it. To show the effectiveness of the proposed approach, numerical comparisons with existing state-of-the-art methods are carried out in Section 4. Finally, Section 5 contains the conclusion.

Some Preliminaries for TV Denoising.
In this section, we define some operators that are necessary for our algorithm. The operators are defined in matrix form instead of the traditional vector form [3,6], which makes TV more understandable and concise.
(a) Subset. The constraint set is a subset of the inner product space $\mathcal{C}$ whose definition is stated in terms of $P_{i,(j,k)}$, the $(j,k)$th entry of $P_i$ (for $i = 1, 2$).
It should be mentioned that the operators (b), (c), and (d) are defined under the assumption of periodic boundary conditions, which is adopted in the rest of this paper.
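As an illustration of working with matrix-form operators under periodic boundary conditions, the sketch below defines a discrete gradient operator, its adjoint, and an inner product on pairs of matrices, and then checks the adjoint relation numerically. The names `grad_op`, `grad_adjoint`, and `inner_pair` are hypothetical, and the forward-difference convention is an assumption rather than the paper's exact definition. With periodic boundaries both components have the same size as the image.

```python
import numpy as np

def grad_op(X):
    """Map an image X to a pair (P1, P2) of periodic forward differences."""
    return np.roll(X, -1, axis=0) - X, np.roll(X, -1, axis=1) - X

def grad_adjoint(P):
    """Adjoint of grad_op with respect to the inner products used below."""
    P1, P2 = P
    return (np.roll(P1, 1, axis=0) - P1) + (np.roll(P2, 1, axis=1) - P2)

def inner_pair(P, Q):
    """Inner product on pairs of matrices: sum of entrywise products."""
    return sum(float(np.sum(Pi * Qi)) for Pi, Qi in zip(P, Q))

# Numerical check of <grad_op(X), P> = <X, grad_adjoint(P)>.
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 7))
P = (rng.standard_normal((5, 7)), rng.standard_normal((5, 7)))
lhs = inner_pair(grad_op(X), P)
rhs = float(np.sum(X * grad_adjoint(P)))
print(abs(lhs - rhs))
```

No vectorization is needed at any point: every operation acts directly on the image matrix, which is the convenience the matrix view is meant to provide.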

The Primal Dual Solution of TV.
As the denoising problem (1) is not differentiable, traditional optimization methods are not applicable. Although there are many techniques to overcome this difficulty, we employ the scheme of Chambolle [9,18] for its effectiveness and simplicity. We now give the TV-based image denoising algorithm in the primal-dual view, which mainly contains two steps.

Theorem 2. The TVID algorithm can obtain an analytic solution of the TV image denoising problem (1).
Proof. According to the primal-dual approach [9,18], the TV seminorm can be written in the dual form (9). Substituting (9) into (1) yields a min-max problem. By the min-max theorem, this problem is equivalent to the max-min problem obtained by exchanging the order of minimization and maximization. Obviously, the inner minimization has an analytic solution, given by (13), which is expressed through $P_\Omega(\cdot)$, the projection onto the convex set Ω.
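The steps of the proof can be sketched in explicit form as follows. The notation is an assumption made for illustration, since the extracted text does not fix it: $B$ is the noised image, $\lambda > 0$ the regularization parameter, $G$ the gradient operator with adjoint $G^{T}$, $\mathcal{P}$ the subset in (a), and the problem is taken in the scaling used by Beck and Teboulle [10], which may differ from (1) by constants.

```latex
\begin{align*}
\|X\|_{\mathrm{TV}}
  &= \max_{P \in \mathcal{P}} \langle G(X), P \rangle
   = \max_{P \in \mathcal{P}} \langle X, G^{T}(P) \rangle, \\
\min_{X \in \Omega} \|X - B\|_F^2 + 2\lambda \|X\|_{\mathrm{TV}}
  &= \min_{X \in \Omega} \max_{P \in \mathcal{P}}
     \|X - B\|_F^2 + 2\lambda \langle X, G^{T}(P) \rangle \\
  &= \max_{P \in \mathcal{P}} \min_{X \in \Omega}
     \bigl\| X - \bigl(B - \lambda G^{T}(P)\bigr) \bigr\|_F^2
     - \bigl\| B - \lambda G^{T}(P) \bigr\|_F^2 + \|B\|_F^2, \\
X &= P_{\Omega}\bigl(B - \lambda G^{T}(P)\bigr).
\end{align*}
```

The third line follows by completing the square in $X$, so the inner minimization is a projection of $B - \lambda G^{T}(P)$ onto Ω, and only the outer maximization over $P$ requires an iterative method.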

Accelerated Spectral Projected Gradient Based TV Denoising
In this section, we first briefly review the spectral projected gradient method [14]. An acceleration strategy is then used to enhance SPG.

Introduction to Spectral Projected Gradient Method.
By combining the nonmonotone Barzilai-Borwein ideas [19] with the traditional projected gradient algorithm, Birgin et al. [13,14] developed a new gradient algorithm, SPG, which is suitable for convex optimization problems whose feasible set admits an easily computed projection. By incorporating a spectral step length and a nonmonotone globalization strategy, SPG speeds up the traditional PG while remaining simple and easy to code. The SPG method aims to minimize $f$ on a closed and convex set Ω; that is,
$$\min_{x \in \Omega} f(x). \tag{14}$$
Like the traditional projection method, the SPG method has the form
$$x_{k+1} = x_k + \lambda_k d_k, \tag{15}$$
where $\lambda_k$ is the step length, and the search direction $d_k$ is defined as [14]
$$d_k = P_\Omega\bigl(x_k - \alpha_k \nabla f(x_k)\bigr) - x_k, \tag{16}$$
with $\alpha_k$ the spectral step length. In the classical SPG method, the nonmonotone decrease criterion is realized by an integer parameter $M \ge 1$, which guarantees that the objective function decreases every $M$ iterations.
In the line search stage, a sufficient decrease parameter $\gamma \in (0, 1)$ is used to satisfy the Armijo criterion. Some other parameters are also required: safeguarding parameters $0 < \alpha_{\min} < \alpha_{\max} < \infty$ for the spectral step length and safeguarding parameters $0 < \sigma_1 < \sigma_2 < 1$ for the quadratic interpolation. The convergence of classical SPG, shown by Birgin et al. [14], is recalled in Theorem 3 for later reference.
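The classical SPG iteration described above can be sketched as follows. This is a minimal illustration in the spirit of Birgin et al. [14], not the authors' implementation: the function name `spg`, the callables `f`, `grad`, and `project`, the initial spectral step of 1, and the stopping rule are assumptions, and the default parameter values simply mirror those reported in the experiments. The acceleration step of ASPG is deliberately not included.

```python
import numpy as np

def spg(f, grad, project, x0, max_iter=200, tol=1e-8, M=3, gamma=1e-4,
        alpha_min=1e-30, alpha_max=1e30, sigma1=0.1, sigma2=0.9):
    """Spectral projected gradient with a nonmonotone Armijo line search."""
    x = project(np.asarray(x0, dtype=float))
    g, fx = grad(x), f(x)
    history = [fx]            # objective values of the last M iterates
    alpha = 1.0               # initial spectral step length (assumption)
    for _ in range(max_iter):
        # Stop when a unit projected-gradient step no longer moves x.
        if np.linalg.norm(project(x - g) - x) <= tol:
            break
        d = project(x - alpha * g) - x          # search direction
        delta = float(np.sum(d * g))            # <d, grad f(x)>, nonpositive
        f_ref = max(history)                    # nonmonotone reference value
        lam = 1.0
        x_new = x + d
        f_new = f(x_new)
        while f_new > f_ref + gamma * lam * delta:
            # Safeguarded quadratic interpolation for the next trial step.
            lam_tmp = -0.5 * lam**2 * delta / (f_new - fx - lam * delta)
            if sigma1 * lam <= lam_tmp <= sigma2 * lam:
                lam = lam_tmp
            else:
                lam = 0.5 * lam
            x_new = x + lam * d
            f_new = f(x_new)
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sy = float(np.sum(s * y))
        # Barzilai-Borwein (spectral) step, safeguarded to [alpha_min, alpha_max].
        if sy <= 0.0:
            alpha = alpha_max
        else:
            alpha = min(alpha_max, max(alpha_min, float(np.sum(s * s)) / sy))
        x, g, fx = x_new, g_new, f_new
        history = (history + [fx])[-M:]
    return x

# Toy usage: project c onto the box [0, 1] by minimizing ||x - c||^2.
c = np.array([-0.5, 0.3, 1.7])
x_star = spg(lambda x: float(np.sum((x - c)**2)),
             lambda x: 2.0 * (x - c),
             lambda x: np.clip(x, 0.0, 1.0),
             x0=np.full(3, 0.5))
print(x_star)
```

Because all inner products are written as sums of entrywise products, the same routine accepts matrices as well as vectors, which matches the matrix view taken in this paper.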
Theorem 3 (see [14]). The spectral projected gradient algorithm is well defined, and any accumulation point of the sequence $\{x_k\}$ that it generates is a constrained stationary point of (14).

Accelerated Spectral Projected Gradient.
The gradient of (17) is given by (18), which can be computed with the gradient operator in Section 2.2. Substituting (18) into (16) and simplifying together with (13), we get the search direction (19) of ASPG, in which the projection operator is the one defined in Section 2.2. The nonmonotone line search and the spectral step length are updated just as in the classical SPG [14]. Following the nonmonotone line search, ASPG adopts Nesterov's technique [15] to improve the performance of classical SPG. Finally, it should be mentioned that the ASPG algorithm is used for the dual step in the TVID algorithm; that is, the variable P in ASPG is an element of the inner product space $\mathcal{C}$. Consequently, the inner products involved in ASPG should be computed as in (2).
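The quantities needed in the dual step can be illustrated with the sketch below. It is written under stated assumptions and is not the paper's exact formulation: the dual objective and its gradient follow the scaling of Beck and Teboulle [10], the isotropic constraint set is used, periodic forward differences are assumed, and all function names are hypothetical.

```python
import numpy as np

def grad_op(X):
    """Periodic forward differences of an image (assumed convention)."""
    return np.roll(X, -1, axis=0) - X, np.roll(X, -1, axis=1) - X

def grad_adjoint(P):
    """Adjoint of grad_op."""
    P1, P2 = P
    return (np.roll(P1, 1, axis=0) - P1) + (np.roll(P2, 1, axis=1) - P2)

def primal_from_dual(P, B, lam, lo=0.0, hi=1.0):
    """Analytic primal solution: projection of B - lam * G^T(P) onto the box."""
    return np.clip(B - lam * grad_adjoint(P), lo, hi)

def dual_objective(P, B, lam, lo=0.0, hi=1.0):
    """h(P) = ||Z||^2 - ||Z - P_Omega(Z)||^2 with Z = B - lam * G^T(P)."""
    Z = B - lam * grad_adjoint(P)
    H = Z - np.clip(Z, lo, hi)
    return float(np.sum(Z**2) - np.sum(H**2))

def dual_gradient(P, B, lam, lo=0.0, hi=1.0):
    """Gradient of h: -2 * lam * G(P_Omega(B - lam * G^T(P)))."""
    G1, G2 = grad_op(primal_from_dual(P, B, lam, lo, hi))
    return -2.0 * lam * G1, -2.0 * lam * G2

def project_dual_isotropic(P):
    """Project onto pairs with P1^2 + P2^2 <= 1 at every pixel."""
    P1, P2 = P
    scale = np.maximum(1.0, np.sqrt(P1**2 + P2**2))
    return P1 / scale, P2 / scale

# With a zero dual variable the primal estimate is the clipped noisy image.
rng = np.random.default_rng(0)
B = rng.random((6, 5))
P0 = (np.zeros_like(B), np.zeros_like(B))
X0 = primal_from_dual(P0, B, lam=0.3)
print(np.allclose(X0, B))
```

A dual solver such as SPG would minimize `dual_objective` over the constraint set using `dual_gradient` and `project_dual_isotropic`, and the denoised image would then be recovered once with `primal_from_dual`.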

Convergence Analysis of ASPG.
Since our method can be seen as an accelerated version of classical SPG in the dual space, its convergence can be deduced from Theorem 3. Now, we summarize the convergence of ASPG in the following theorem.
Theorem 5. The ASPG for TVID is well defined, and any accumulation point of the sequence $\{P_k\}$ is a constrained stationary point of (17); furthermore, the output of ASPG is a stationary point of the TVID problem (1).
Proof. By (13), the output generated by ASPG is a stationary point of (1) if and only if P is a stationary point of (17).
Equation (18) shows that the gradient of (17) at the $k$th iterate can be obtained from the $k$th primal estimate through the projection onto Ω and the gradient operator. Substituting it into (16), we get the search direction of ASPG, that is, (19). This means that the search direction of ASPG is the same as the SPG direction except that (19) lies in the dual space $\mathcal{C}$. Furthermore, the definition of the gradient operator shows that its output is an element of $\mathcal{C}$. Therefore, the inner product between the search direction and the gradient can be computed by (2). Moreover, both the nonmonotone line search and the spectral step length mainly involve inner products of elements of $\mathcal{C}$. Consequently, the nonmonotone line search and the spectral step length of ASPG differ from those of the classical SPG only in the inner product.
Step 3 in ASPG is an additional step relative to the classical SPG. This step is a linear combination of the previous two iterates, which does not change the convergence of the whole algorithm. Therefore, from Theorem 3, it follows that P is a stationary point of (17). This implies the conclusion.

Numerical Experiments
In this section, the proposed ASPG algorithm (ASPG Proposed) is employed to solve the TVID problem and is compared with some state-of-the-art algorithms: the classical projected gradient given by Chambolle (PG Chambolle), the fast projected gradient of Beck and Teboulle (FPG Beck), and the classical SPG of Birgin et al. (SPG Birgin).
In the following, the applicability of the proposed algorithm to image denoising problems is illustrated. The degraded images are generated by adding Gaussian noise with zero mean and standard deviation 0.05 to the original images (Figure 1). Throughout the experiments, the parameters involved in SPG and ASPG are set as follows: $\alpha_{\min} = 10^{-30}$, $\alpha_{\max} = 10^{30}$, $\sigma_1 = 0.1$, $\sigma_2 = 0.9$, $\gamma = 10^{-4}$, and $M = 3$. The regularization parameter is tuned manually for all schemes, and the value that gives the most satisfactory results for all four images is chosen.
The PSNR and the relative error of the four images with different algorithms are shown in Table 1. To reduce the influence of randomness on the results, the last column of Table 1 shows the average metrics over the four tested images, and the metrics of the best method for each test case are highlighted in boldface. Just as FPG Beck improves PG Chambolle by Nesterov's strategy, the proposed ASPG improves the classical SPG Birgin. Table 1 illustrates that ASPG and FPG Beck are better than SPG Birgin and PG Chambolle, respectively. This fact shows that the acceleration scheme is valid for ASPG as well as for FPG Beck. Furthermore, ASPG is superior to FPG Beck, which we attribute to the former being based on the nonmonotone line search. Table 1 shows that, among all the algorithms, the proposed ASPG algorithm performs the best in both PSNR and relative error.
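For reference, the experimental setup and the two tabulated metrics can be sketched as below. The definitions are the common ones and are assumptions here, since the paper's exact formulas are not reproduced in this text: images are taken to lie in [0, 1], PSNR uses a peak value of 1, and the relative error is measured in the Frobenius norm. The function names are hypothetical.

```python
import numpy as np

def add_gaussian_noise(X, sigma=0.05, seed=0):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    rng = np.random.default_rng(seed)
    return X + sigma * rng.standard_normal(X.shape)

def psnr(X, X_ref, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with the given peak value."""
    mse = np.mean((X - X_ref)**2)
    return float(10.0 * np.log10(peak**2 / mse))

def relative_error(X, X_ref):
    """Frobenius-norm error relative to the reference image."""
    return float(np.linalg.norm(X - X_ref) / np.linalg.norm(X_ref))

# Metrics of a noisy synthetic image before any denoising.
X_ref = np.full((64, 64), 0.5)
X_noisy = add_gaussian_noise(X_ref)
print(psnr(X_noisy, X_ref), relative_error(X_noisy, X_ref))
```

Evaluating the noisy input first gives a baseline against which the PSNR gain of each denoising algorithm can be read.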
The following gives a deeper comparison of the mentioned algorithms. To illustrate this more clearly, Figure 2 plots the metrics against the number of iterations for the noised Barbara image (generated from Figure 1(a)). From Figure 2, we can see that the classical SPG has a strong disturbance before its convergence, while ASPG reduces the disturbance and accelerates the convergence. Meanwhile, Figure 2 also illustrates that the nonmonotone line search improves all the metrics: PSNR, relative error, and SSIM. One more point to mention is that here we only show the relative error, PSNR, and SSIM values generated from the degraded version of Figure 1(a).
Remote sensing images are widely used in many fields [20-23]. However, noise is unavoidable during the acquisition and transmission of images. Restoring the ideal image from its degraded version is an interesting topic in remote sensing image processing. Here, we employ the proposed ASPG algorithm to denoise a remote sensing image (Figure 4). The results are shown in Figures 4(a)-4(f), and a selected area is magnified to show the details more clearly. Both the original and magnified areas are highlighted in green.

Conclusions
In this paper, an accelerated scheme for the spectral projected gradient method is proposed for TV denoising problems. To handle TV in matrix space, a concise matrix version of the total variation model is introduced. Numerical examples have been carried out to verify the performance of the algorithm; in the reported experiments, it outperforms some existing state-of-the-art methods in both PSNR and relative error.