An Adaptive Total Generalized Variation Model with Augmented Lagrangian Method for Image Denoising

We propose an adaptive total generalized variation (TGV) based model that aims to balance edge preservation and region smoothness in image denoising. Variable splitting (VS) and the classical augmented Lagrangian method (ALM) are used to solve the proposed model. With the proposed adaptive model and ALM, the regularization parameter, which balances the data fidelity term and the regularizer, is updated in closed form at each iteration, so that denoising is accomplished without manual intervention. Numerical results indicate that our method effectively suppresses the staircasing effect and outperforms several state-of-the-art methods in both quantitative and qualitative assessment.


Introduction
In the past few decades, many variational or partial differential equation (PDE) based restoration models [1-7] have been proposed to recover images from degraded observations, owing to their ability to preserve significant image features such as edges and textures. Among these models, the total variation (TV) model, also known as the Rudin-Osher-Fatemi (ROF) model [1], is distinguished by its excellent edge-preserving ability and has become one of the most widely used regularizers in image restoration [1, 2, 8-10]. In particular, the TV denoising problem takes the following form:
$$\min_{u} \int_\Omega |\nabla u|\,d\mathbf{x} + \frac{\lambda}{2}\int_\Omega (u - u_0)^2\,d\mathbf{x}, \tag{1}$$
where $\Omega$ is an open bounded domain in two dimensions, $u$ is the image to be restored, $u_0$ is the observation containing Gaussian white noise, and $\lambda$ is the regularization parameter which balances the regularization term and the data fidelity term. $\int_\Omega |\nabla u|\,d\mathbf{x}$ is the TV seminorm of the bounded variation (BV) space $\mathrm{BV}(\Omega)$. Compared with the quadratic Tikhonov model, the TV model is highly effective in preserving edges and corners. However, the TV model is provably optimal only when the original image is piecewise constant. In fact, the staircasing effect usually appears because most natural images are not piecewise constant. The staircasing effect is visually unpleasant because it introduces artificial edges that do not exist in the original image.
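The staircasing behavior can be illustrated with a one-dimensional toy computation. The following Python sketch is only an illustration under stated assumptions: the helper names `tv_1d` and `second_order_tv_1d` and the two sample signals are hypothetical and not part of the proposed method. It shows that a discrete TV seminorm assigns the same cost to a linear ramp and to a staircase with the same end values, so TV alone has no preference for the smooth transition, whereas a penalty on second differences does distinguish the two.

```python
import numpy as np

def tv_1d(x):
    """Discrete 1D total variation: sum of absolute first differences."""
    return float(np.sum(np.abs(np.diff(x))))

def second_order_tv_1d(x):
    """Sum of absolute second differences (a simple higher-order penalty)."""
    return float(np.sum(np.abs(np.diff(x, n=2))))

# Illustrative signals: a linear ramp and a three-step staircase,
# both rising from 0 to 1 over nine samples.
ramp = np.linspace(0.0, 1.0, 9)
stairs = np.repeat([0.0, 0.5, 1.0], 3)

# TV depends only on the total rise for monotone signals.
print(tv_1d(ramp), tv_1d(stairs))
# The second-order penalty vanishes on the ramp but not on the staircase.
print(second_order_tv_1d(ramp), second_order_tv_1d(stairs))
```

This is the motivation for regularizers that also involve higher-order derivatives.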
To overcome this drawback of the TV model, researchers have suggested introducing higher-order derivatives of the image function [3-7, 12-16]. To eliminate the staircasing effect of the TV model, Chambolle and Lions [14] proposed the following infimal-convolution minimization functional:
$$\min_{u,v} \int_\Omega |\nabla u - \nabla v|\,d\mathbf{x} + \alpha\int_\Omega |\nabla(\nabla v)|\,d\mathbf{x} + \frac{\lambda}{2}\int_\Omega (u - u_0)^2\,d\mathbf{x}, \tag{2}$$
where $\alpha > 0$ is a weight, discontinuous components of the image are allotted to $u - v$, and regions of moderate slope are assigned to $v$. The above model was shown to be efficient in practice. Later, a modified form of (2) was proposed in [5], whose regularizer is of the following form:
$$\int_\Omega |\nabla u - \nabla v|\,d\mathbf{x} + \alpha\int_\Omega |\Delta v|\,d\mathbf{x}. \tag{3}$$
That is, the second-order derivative in (2) is replaced by the Laplacian in (3). A similar use of the Laplacian operator can also be seen in some PDE-based methods [3].
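Since the classical TV model cannot distinguish jumps from smooth transitions, Chan et al. [12] considered an additional penalization of the discontinuities in images. Precisely, they adopt
$$\int_\Omega |\nabla u|\,d\mathbf{x} + \alpha\int_\Omega \varphi(|\nabla u|)(\Delta u)^2\,d\mathbf{x} \tag{4}$$
as the regularization term, where $\varphi$ is a real-valued function whose value approaches 0 as $|\nabla u|$ approaches infinity. The absence of the staircasing effect for this choice was verified in [15]. Bredies et al. [17] proposed the concept of total generalized variation (TGV), which is regarded as a generalization of TV. The TGV of order $k$ is defined as
$$\mathrm{TGV}_\alpha^k(u) = \sup\left\{ \int_\Omega u\,\mathrm{div}^k v\,d\mathbf{x} \;\middle|\; v \in C_c^k(\Omega, \mathrm{Sym}^k(\mathbb{R}^d)),\ \|\mathrm{div}^l v\|_\infty \le \alpha_l,\ l = 0, \ldots, k-1 \right\}, \tag{5}$$
where $d \ge 1$ denotes the image dimension (throughout this paper, we assume $d = 2$); $\mathrm{Sym}^k(\mathbb{R}^d)$ is the space of symmetric $k$-tensors on $\mathbb{R}^d$; $C_c^k(\Omega, \mathrm{Sym}^k(\mathbb{R}^d))$ is the space of compactly supported symmetric tensor fields; and the $\alpha_l$ are fixed positive parameters. From the definition of $\mathrm{TGV}_\alpha^k$, we see that it involves the derivatives of $u$ of order one to $k$. When $k = 1$ and $\alpha_0 = 1$, $\mathrm{TGV}_\alpha^k$ reduces to the classical TV. Thus TGV can be seen as a generalization of TV.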
TGV involves and balances higher-order derivatives of $u$. Image reconstruction with TGV regularization usually leads to results with piecewise polynomial intensities and sharp edges. Therefore, TGV can effectively suppress the staircasing effect. In [17], an accelerated first-order method of Nesterov [18] was proposed to solve the TGV-regularized denoising problem.
In this paper, we propose an adaptive second-order TGV-regularized model for denoising and derive an augmented Lagrangian approach to handle the suggested model. Our denoising model is as follows:
$$\min_{u} \mathrm{TGV}_\alpha^2(u) \quad \text{s.t.} \quad \|u - u_0\|_2^2 \le c, \tag{6}$$
where $c$ bounds the squared residual and thereby defines the feasible set. According to standard Lagrange duality, for a given $c$, there exists a nonnegative $\lambda$ such that
$$\arg\min_{u}\ \mathrm{TGV}_\alpha^2(u) + \frac{\lambda}{2}\|u - u_0\|_2^2 \tag{7}$$
is equivalent to (6). However, with (6), we can automatically estimate the regularization parameter $\lambda$. We first use an indicator function of the feasible set to transform problem (6) into an unconstrained one; then the variable splitting technique is applied to transform the resulting unconstrained problem into a problem with linear constraints; finally, the obtained constrained problem is solved by the alternating direction method of multipliers (ADMM) [19-22], which is an instance of the classical ALM. Owing to the second-order TGV regularizer, the resulting image denoising algorithm suppresses the staircasing effect more effectively than some TV-based denoising methods. Besides, it estimates the regularization parameter adaptively without any inner iterative scheme. It is worth noting that the idea of this paper can be extended to TGV models of order higher than two. However, for simplicity, we only treat the second-order model, which is adequate for a large class of natural images.
Our method differs from previous works in at least two aspects. On one hand, compared with [17], which adopted the accelerated first-order method of Nesterov [18] to handle the unconstrained TGV-based denoising problem (7), we apply ALM to the constrained TGV-based denoising problem (6) and achieve automatic estimation of the regularization parameter $\lambda$. Our strategy avoids the extra cost of selecting $\lambda$ manually by trial and error. On the other hand, compared with the existing TV-based adaptive methods [10, 23, 24], we propose a more sophisticated adaptive method based on TGV, which tends to achieve better results than the TV-based methods.
The rest of the paper is organized as follows. Section 2 describes the adaptive second-order TGV-based model for image denoising; based on Lagrange duality, an equivalent form of $\mathrm{TGV}_\alpha^2$ is given. The derivation of the proposed method is presented in Section 3. Section 4 gives numerical results that demonstrate the effectiveness of the proposed method. Finally, Section 5 ends this paper with a brief conclusion.

Adaptive Second-Order TGV-Based Model for Image Denoising
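The space of bounded generalized variation (BGV) functions of order $k$ with weight $\alpha$ is defined as
$$\mathrm{BGV}_\alpha^k(\Omega) = \left\{ u \in L^1(\Omega) \mid \mathrm{TGV}_\alpha^k(u) < \infty \right\}. \tag{8}$$
Correspondingly, the BGV norm is defined as
$$\|u\|_{\mathrm{BGV}_\alpha^k} = \|u\|_1 + \mathrm{TGV}_\alpha^k(u). \tag{9}$$
The TGV seminorm rather than the BGV norm is usually used as a regularizer.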
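In this paper, we consider only $k = 2$ for simplicity. The second-order TGV can be written as
$$\mathrm{TGV}_\alpha^2(u) = \sup\left\{ \int_\Omega u\,\mathrm{div}^2 v\,d\mathbf{x} \;\middle|\; v \in C_c^2(\Omega, \mathrm{Sym}^2(\mathbb{R}^d)),\ \|v\|_\infty \le \alpha_0,\ \|\mathrm{div}\,v\|_\infty \le \alpha_1 \right\}, \tag{10}$$
where the divergences are defined as
$$(\mathrm{div}\,v)_i = \sum_{j=1}^{d} \frac{\partial v_{ij}}{\partial x_j}, \qquad \mathrm{div}^2 v = \sum_{i,j=1}^{d} \frac{\partial^2 v_{ij}}{\partial x_i\,\partial x_j}. \tag{11}$$
In fact, $\mathrm{Sym}^2(\mathbb{R}^d)$ is equivalent to the space of all symmetric $d \times d$ matrices. The infinity norms in (10) are given by
$$\|v\|_\infty = \sup_{\mathbf{x}\in\Omega} \Big( \sum_{i,j=1}^{d} |v_{ij}(\mathbf{x})|^2 \Big)^{1/2}, \qquad \|\mathrm{div}\,v\|_\infty = \sup_{\mathbf{x}\in\Omega} \Big( \sum_{i=1}^{d} |(\mathrm{div}\,v)_i(\mathbf{x})|^2 \Big)^{1/2}. \tag{12}$$
For convenience in deriving our algorithm, we use the discrete form in the following, and tensors and vectors are denoted in bold. To make use of ADMM, we apply an equivalent definition of $\mathrm{TGV}_\alpha^2$ [17, 22] based on Lagrange duality. With this definition, we have
$$\mathrm{TGV}_\alpha^2(\mathbf{u}) = \min_{\mathbf{p}}\ \alpha_1\|\nabla\mathbf{u} - \mathbf{p}\|_1 + \alpha_0\|\varepsilon(\mathbf{p})\|_1, \tag{13}$$
where $\mathbf{u} \in \mathbb{R}^{mn}$ denotes an $m \times n$ image, $\mathbf{p} \in \mathbb{R}^{mn} \times \mathbb{R}^{mn}$ belongs to the two-dimensional 1-tensor field, and $\varepsilon$ denotes the symmetrized derivative operator. Suppose that $u_{i,j}$ and $\mathbf{p}_{i,j}$ denote the $(i,j)$th components of $\mathbf{u}$ and $\mathbf{p}$, respectively. Then we have $\mathbf{p}_{i,j} = [p_{i,j,1}, p_{i,j,2}] \in \mathrm{Sym}^1(\mathbb{R}^2)$, and the $(i,j)$th component of $\varepsilon(\mathbf{p})$ is given by
$$\varepsilon(\mathbf{p})_{i,j} = \begin{bmatrix} \nabla_1 p_{i,j,1} & \tfrac{1}{2}\left(\nabla_2 p_{i,j,1} + \nabla_1 p_{i,j,2}\right) \\ \tfrac{1}{2}\left(\nabla_2 p_{i,j,1} + \nabla_1 p_{i,j,2}\right) & \nabla_2 p_{i,j,2} \end{bmatrix}, \tag{14}$$
where $\nabla_1$ and $\nabla_2$ denote the first-order difference operators along the two image axes. In discrete form, the denoising model (6) reads
$$\min_{\mathbf{u},\mathbf{p}}\ \alpha_1\|\nabla\mathbf{u} - \mathbf{p}\|_1 + \alpha_0\|\varepsilon(\mathbf{p})\|_1 \quad \text{s.t.} \quad \|\mathbf{u} - \mathbf{u}_0\|_2^2 \le c, \tag{15}$$
which can be rewritten as the unconstrained problem
$$\min_{\mathbf{u},\mathbf{p}}\ \alpha_1\|\nabla\mathbf{u} - \mathbf{p}\|_1 + \alpha_0\|\varepsilon(\mathbf{p})\|_1 + \iota_\Phi(\mathbf{u}), \tag{16}$$
where $\iota_\Phi(\mathbf{u})$ is the indicator function of the feasible set $\Phi$, defined by
$$\Phi = \left\{ \mathbf{u} : \|\mathbf{u} - \mathbf{u}_0\|_2^2 \le c \right\}, \qquad \iota_\Phi(\mathbf{u}) = \begin{cases} 0, & \mathbf{u} \in \Phi, \\ +\infty, & \text{otherwise}. \end{cases} \tag{17}$$
Note that $\Phi$ is a closed Euclidean ball centered at $\mathbf{u}_0$ with radius $\sqrt{c}$.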
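To make the discrete definition concrete, the following NumPy sketch evaluates the objective $\alpha_1\|\nabla\mathbf{u} - \mathbf{p}\|_1 + \alpha_0\|\varepsilon(\mathbf{p})\|_1$ for a given pair $(\mathbf{u}, \mathbf{p})$. It is a minimal sketch under assumptions that the text does not fix: forward differences with periodic boundaries, isotropic (pointwise Euclidean or Frobenius) $\ell_1$ norms, and the hypothetical function names `grad`, `sym_grad`, and `tgv2_objective`.

```python
import numpy as np

def grad(u):
    """Forward differences with periodic boundary; returns shape (2, m, n)."""
    return np.stack([np.roll(u, -1, axis=1) - u,   # along columns (direction 1)
                     np.roll(u, -1, axis=0) - u])  # along rows (direction 2)

def sym_grad(p):
    """Symmetrized derivative of p = (p1, p2); returns (e11, e22, e12)."""
    d1p1, d2p1 = grad(p[0])
    d1p2, d2p2 = grad(p[1])
    return np.stack([d1p1, d2p2, 0.5 * (d2p1 + d1p2)])

def tgv2_objective(u, p, alpha0, alpha1):
    """alpha1*||grad(u) - p||_1 + alpha0*||sym_grad(p)||_1 (isotropic norms)."""
    r = grad(u) - p
    term1 = np.sum(np.sqrt(r[0] ** 2 + r[1] ** 2))
    e11, e22, e12 = sym_grad(p)
    # The off-diagonal entry appears twice in the symmetric 2x2 matrix.
    term2 = np.sum(np.sqrt(e11 ** 2 + e22 ** 2 + 2.0 * e12 ** 2))
    return float(alpha1 * term1 + alpha0 * term2)

# With p = 0 the objective reduces to alpha1 times the discrete TV of u,
# so the minimum over p can never exceed alpha1 * TV(u).
rng = np.random.default_rng(0)
u = rng.random((6, 5))
print(tgv2_objective(u, np.zeros((2, 6, 5)), alpha0=3.0, alpha1=1.0))
```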
Problem (16) is difficult to solve directly because it is nonlinear and nondifferentiable. Following the variable splitting technique, we introduce three auxiliary variables to simplify the solution of (16): a variable $\mathbf{w}$ to decouple $\mathbf{u}$ from the feasible set constraint, and variables $\mathbf{y}$ and $\mathbf{z}$ to decouple $\varepsilon(\mathbf{p})$ and $\nabla\mathbf{u} - \mathbf{p}$, respectively, from the nondifferentiable $\ell_1$-norms. Then problem (16) can be transformed into the following equivalent constrained problem:
$$\min_{\mathbf{u},\mathbf{p},\mathbf{w},\mathbf{y},\mathbf{z}}\ \alpha_0\|\mathbf{y}\|_1 + \alpha_1\|\mathbf{z}\|_1 + \iota_\Phi(\mathbf{w}) \quad \text{s.t.} \quad \mathbf{w} = \mathbf{u},\ \ \mathbf{y} = \varepsilon(\mathbf{p}),\ \ \mathbf{z} = \nabla\mathbf{u} - \mathbf{p}. \tag{18}$$
A similar use of an auxiliary variable for the feasible set constraint can be found in [10]. Without it, we would have to resort to an inner iterative scheme to update the regularization parameter.
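The role of $\mathbf{w}$ can be made concrete. Once $\mathbf{u}$ is decoupled from the feasible set, the update of $\mathbf{w}$ amounts to a Euclidean projection onto the ball $\Phi$, and a multiplier for the ball constraint follows in closed form rather than from an inner iteration. The sketch below is one standard way to write such a projection. The function name `project_onto_ball`, the argument `a` (the point to be projected), the penalty weight `beta1`, and the specific formula for `lam` are assumptions made for illustration and are not taken from the text.

```python
import numpy as np

def project_onto_ball(a, u0, c, beta1):
    """Solve min_w (beta1/2)*||w - a||^2 subject to ||w - u0||^2 <= c.

    Returns the minimizer w and a multiplier lam >= 0 for which w also
    minimizes (lam/2)*||w - u0||^2 + (beta1/2)*||w - a||^2.
    """
    dist = np.linalg.norm(a - u0)
    radius = np.sqrt(c)
    if dist <= radius:
        # Constraint inactive: the point is already feasible.
        return a.copy(), 0.0
    # Constraint active: choose lam so that w lands on the sphere.
    lam = beta1 * (dist / radius - 1.0)
    w = (lam * u0 + beta1 * a) / (lam + beta1)
    return w, lam

u0 = np.zeros(4)
a = np.array([3.0, 4.0, 0.0, 0.0])
w, lam = project_onto_ball(a, u0, c=1.0, beta1=2.0)
print(np.linalg.norm(w - u0), lam)
```

In this sketch the multiplier is zero whenever the point is already feasible and grows with the distance to the ball otherwise.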
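The corresponding augmented Lagrangian functional of (18) is defined as
$$\begin{aligned} L_A(\mathbf{u},\mathbf{p},\mathbf{w},\mathbf{y},\mathbf{z};\mu,\xi,\eta) ={}& \alpha_0\|\mathbf{y}\|_1 + \alpha_1\|\mathbf{z}\|_1 + \iota_\Phi(\mathbf{w}) \\ & - \langle \mu, \mathbf{w} - \mathbf{u} \rangle + \frac{\beta_1}{2}\|\mathbf{w} - \mathbf{u}\|_2^2 \\ & - \langle \xi, \mathbf{y} - \varepsilon(\mathbf{p}) \rangle + \frac{\beta_2}{2}\|\mathbf{y} - \varepsilon(\mathbf{p})\|_2^2 \\ & - \langle \eta, \mathbf{z} - \nabla\mathbf{u} + \mathbf{p} \rangle + \frac{\beta_3}{2}\|\mathbf{z} - \nabla\mathbf{u} + \mathbf{p}\|_2^2, \end{aligned} \tag{19}$$
where $\mu$, $\xi$, and $\eta$ are Lagrange multipliers and $\beta_1$, $\beta_2$, and $\beta_3$ are penalty parameters, which should be positive.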
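According to the classical ADMM, we solve the following iterative scheme, in which $L_A$ denotes the augmented Lagrangian functional and $\mu$, $\xi$, $\eta$ are the multipliers of the constraints $\mathbf{w} = \mathbf{u}$, $\mathbf{y} = \varepsilon(\mathbf{p})$, and $\mathbf{z} = \nabla\mathbf{u} - \mathbf{p}$:
$$\begin{aligned} \mathbf{u}^{k+1} &= \arg\min_{\mathbf{u}} L_A(\mathbf{u},\mathbf{p}^{k},\mathbf{w}^{k},\mathbf{y}^{k},\mathbf{z}^{k};\mu^{k},\xi^{k},\eta^{k}), \\ \mathbf{p}^{k+1} &= \arg\min_{\mathbf{p}} L_A(\mathbf{u}^{k+1},\mathbf{p},\mathbf{w}^{k},\mathbf{y}^{k},\mathbf{z}^{k};\mu^{k},\xi^{k},\eta^{k}), \\ (\mathbf{w}^{k+1},\mathbf{y}^{k+1},\mathbf{z}^{k+1}) &= \arg\min_{\mathbf{w},\mathbf{y},\mathbf{z}} L_A(\mathbf{u}^{k+1},\mathbf{p}^{k+1},\mathbf{w},\mathbf{y},\mathbf{z};\mu^{k},\xi^{k},\eta^{k}), \\ \mu^{k+1} &= \mu^{k} - \beta_1(\mathbf{w}^{k+1} - \mathbf{u}^{k+1}), \\ \xi^{k+1} &= \xi^{k} - \beta_2(\mathbf{y}^{k+1} - \varepsilon(\mathbf{p}^{k+1})), \\ \eta^{k+1} &= \eta^{k} - \beta_3(\mathbf{z}^{k+1} - \nabla\mathbf{u}^{k+1} + \mathbf{p}^{k+1}). \end{aligned} \tag{20}$$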

Solution of the Subproblems.
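With the auxiliary variable $\mathbf{w}$, the $\mathbf{u}$ subproblem becomes quadratic and independent of the feasible set constraint. With $\mu$ and $\eta$ the multipliers of the constraints $\mathbf{w} = \mathbf{u}$ and $\mathbf{z} = \nabla\mathbf{u} - \mathbf{p}$, its objective is
$$\min_{\mathbf{u}}\ \frac{\beta_1}{2}\left\|\mathbf{u} - \mathbf{w}^{k} + \frac{\mu^{k}}{\beta_1}\right\|_2^2 + \frac{\beta_3}{2}\left\|\nabla\mathbf{u} - \mathbf{p}^{k} - \mathbf{z}^{k} + \frac{\eta^{k}}{\beta_3}\right\|_2^2. \tag{21}$$
The minimization problem (21) can be solved through the following linear equation:
$$\left(\beta_1 \mathbf{I} + \beta_3 \nabla^{T}\nabla\right)\mathbf{u}^{k+1} = \beta_1\mathbf{w}^{k} - \mu^{k} + \nabla^{T}\left(\beta_3(\mathbf{p}^{k} + \mathbf{z}^{k}) - \eta^{k}\right). \tag{22}$$
Under the circulant boundary condition for images, we can solve (22) with several FFTs and IFFTs [8, 10].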
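The FFT-based solve can be sketched as follows. The code assumes a linear system of the form $(\beta_1\mathbf{I} + \beta_3\nabla^{T}\nabla)\mathbf{u} = \mathbf{r}$ with forward differences and periodic (circulant) boundaries, in which case $\nabla^{T}\nabla$ is diagonalized by the two-dimensional DFT. The function names `grad`, `grad_adjoint`, and `solve_u` and the choice of forward differences are assumptions for illustration.

```python
import numpy as np

def grad(u):
    """Forward differences with periodic boundary; returns shape (2, m, n)."""
    return np.stack([np.roll(u, -1, axis=1) - u,
                     np.roll(u, -1, axis=0) - u])

def grad_adjoint(g):
    """Adjoint of grad (the negative discrete divergence)."""
    return (np.roll(g[0], 1, axis=1) - g[0]) + (np.roll(g[1], 1, axis=0) - g[1])

def solve_u(rhs, beta1, beta3):
    """Solve (beta1*I + beta3*grad^T grad) u = rhs with one FFT and one IFFT."""
    m, n = rhs.shape
    # Eigenvalues of the 1D periodic operators D^T D: 2 - 2*cos(2*pi*k/N).
    wx = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(n) / n)
    wy = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(m) / m)
    denom = beta1 + beta3 * (wy[:, None] + wx[None, :])
    return np.real(np.fft.ifft2(np.fft.fft2(rhs) / denom))

# Consistency check: build a right-hand side from a known u and recover it.
rng = np.random.default_rng(1)
u_true = rng.random((8, 6))
rhs = 0.5 * u_true + 0.3 * grad_adjoint(grad(u_true))
u_rec = solve_u(rhs, beta1=0.5, beta3=0.3)
print(np.max(np.abs(u_rec - u_true)))
```

Because `beta1` is positive, the denominator never vanishes, so the division in the Fourier domain is always well defined.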
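In the same way, the subproblem with respect to $\mathbf{p}$ is also quadratic, and, with $\xi$ and $\eta$ the multipliers of the constraints $\mathbf{y} = \varepsilon(\mathbf{p})$ and $\mathbf{z} = \nabla\mathbf{u} - \mathbf{p}$, its objective functional is
$$\min_{\mathbf{p}}\ \frac{\beta_2}{2}\left\|\varepsilon(\mathbf{p}) - \mathbf{y}^{k} + \frac{\xi^{k}}{\beta_2}\right\|_2^2 + \frac{\beta_3}{2}\left\|\mathbf{p} - \nabla\mathbf{u}^{k+1} + \mathbf{z}^{k} - \frac{\eta^{k}}{\beta_3}\right\|_2^2. \tag{23}$$
Setting the derivatives with respect to $\mathbf{p}_1$ and $\mathbf{p}_2$ to zero yields one linear system (24) for $\mathbf{p}_1^{k+1}$ and one linear system (25) for $\mathbf{p}_2^{k+1}$, where $\mathbf{p}_1$ and $\mathbf{p}_2$ collect the components $p_{i,j,1}$ and $p_{i,j,2}$ ($0 \le i \le m$, $0 \le j \le n$), respectively. Similar to the solution of (22), problems (24) and (25) can also be solved conveniently through several FFTs and IFFTs under the circulant boundary condition. The subproblem for $\mathbf{y}$ can be written as
$$\min_{\mathbf{y}}\ \alpha_0\|\mathbf{y}\|_1 + \frac{\beta_2}{2}\left\|\mathbf{y} - \varepsilon(\mathbf{p}^{k+1}) - \frac{\xi^{k}}{\beta_2}\right\|_2^2. \tag{26}$$
The $\mathbf{z}$ subproblem is given by
$$\min_{\mathbf{z}}\ \alpha_1\|\mathbf{z}\|_1 + \frac{\beta_3}{2}\left\|\mathbf{z} - \nabla\mathbf{u}^{k+1} + \mathbf{p}^{k+1} - \frac{\eta^{k}}{\beta_3}\right\|_2^2. \tag{27}$$
Algorithm TGV$^2$ID-ADMM. Input: $\mathbf{u}_0$, $c$.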
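Subproblems of the form $\min_{\mathbf{z}} \tau\|\mathbf{z}\|_1 + \tfrac{1}{2}\|\mathbf{z} - \mathbf{v}\|_2^2$, with an isotropic $\ell_1$ norm, are commonly solved by pointwise shrinkage (soft thresholding). The following sketch shows that operator. Treating the $\mathbf{y}$ and $\mathbf{z}$ updates this way, the function name `shrink`, and the small constant guarding against division by zero are assumptions for illustration.

```python
import numpy as np

def shrink(v, tau):
    """Pointwise isotropic shrinkage.

    v has shape (c, m, n) or (c, N): c components per pixel. Each pixel
    vector is shortened by tau in Euclidean length, or set to zero if its
    length is below tau. This minimizes tau*||z||_1 + 0.5*||z - v||^2.
    """
    norm = np.sqrt(np.sum(v ** 2, axis=0, keepdims=True))
    scale = np.maximum(norm - tau, 0.0) / np.maximum(norm, 1e-12)
    return scale * v

# One pixel with vector (3, 4): length 5 is reduced to length 4.
v = np.array([[3.0], [4.0]])
print(shrink(v, 1.0).ravel())
```

For the $\mathbf{z}$ update the threshold would be $\alpha_1/\beta_3$ and for the $\mathbf{y}$ update $\alpha_0/\beta_2$; for $\mathbf{y}$, all four entries of each symmetric $2 \times 2$ matrix should be stacked along the first axis so that the pointwise norm is the Frobenius norm.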
The convergence of Algorithm TGV$^2$ID-ADMM follows from the convergence analysis for the TV-based ADMM in [11, 25], owing to the convexity of $\mathrm{TGV}_\alpha^2$. In this paper, we do not repeat the lengthy analysis. However, we have the following essential convergence theorem for the proposed method.
Theorem 1. For fixed $\beta_1, \beta_2, \beta_3 > 0$, the sequence $\{\mathbf{u}^k, \mathbf{p}^k, \mathbf{w}^k, \mathbf{y}^k, \mathbf{z}^k, \mu^k, \xi^k, \eta^k, \lambda^k\}$ generated by Algorithm TGV$^2$ID-ADMM from any initial point $(\mathbf{u}^0, \mathbf{p}^0, \mathbf{w}^0, \mathbf{y}^0, \mathbf{z}^0, \mu^0, \xi^0, \eta^0)$ converges to $(\mathbf{u}^*, \mathbf{p}^*, \mathbf{w}^*, \mathbf{y}^*, \mathbf{z}^*, \mu^*, \xi^*, \eta^*, \lambda^*)$, where $\mathbf{u}^*$ is the solution of functional (15) and $\lambda^*$ is the regularization parameter corresponding to the feasible set constraint $\mathbf{u} \in \Phi$.

Experiment Results
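In this section, we illustrate the effectiveness of the proposed algorithm in suppressing the staircasing effect and removing Gaussian noise from images. Besides, we also show the robustness of the results with respect to the noise level and the penalty parameters. Restoration quality is assessed by the root-mean-square error (RMSE) and the peak signal-to-noise ratio (PSNR), both computed against $\mathbf{u}_{\text{clean}}$, the original image that contains no noise. In subsections 4.1 and 4.2, we set the penalty parameters as $\beta_1 = 10^{(\mathrm{BSNR}/10-1)} \times \beta$ and $\beta_2 = \beta_3 = \beta = 0.3$ for TGV$^2$ID-ADMM ($\beta_2 = 0$ for TGV$^1$ID-ADMM) to achieve consistently good results at high speed, where BSNR is the blurred signal-to-noise ratio defined by $\mathrm{BSNR} = 10\log_{10}(\mathrm{var}(\mathbf{u}_0)/\sigma^2)$, with $\mathrm{var}(\mathbf{u}_0)$ the variance of $\mathbf{u}_0$ and $\sigma$ the noise standard deviation.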
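The parameter rule and the quality measures can be sketched in a few lines. The BSNR formula and the rule $\beta_1 = 10^{(\mathrm{BSNR}/10-1)} \times \beta$ are taken from the text; the RMSE and PSNR formulas below are the common definitions for images with peak value 255 and are an assumption here, as are all function names.

```python
import numpy as np

def bsnr(u0, sigma):
    """BSNR = 10*log10(var(u0) / sigma^2), in dB."""
    return 10.0 * np.log10(np.var(u0) / sigma ** 2)

def beta1_from_bsnr(u0, sigma, beta=0.3):
    """beta1 = 10^(BSNR/10 - 1) * beta, i.e. var(u0) / (10*sigma^2) * beta."""
    return 10.0 ** (bsnr(u0, sigma) / 10.0 - 1.0) * beta

def rmse(u, u_clean):
    """Root-mean-square error between a restored and a clean image."""
    return float(np.sqrt(np.mean((u - u_clean) ** 2)))

def psnr(u, u_clean, peak=255.0):
    """PSNR in dB, assuming the stated peak intensity (255 for 8-bit images)."""
    return float(20.0 * np.log10(peak / rmse(u, u_clean)))

# Tiny example: an observation with unit variance and unit noise level.
u0 = np.array([0.0, 2.0])
print(bsnr(u0, 1.0), beta1_from_bsnr(u0, 1.0))
```

Note that the rule makes $\beta_1$ proportional to $\mathrm{var}(\mathbf{u}_0)/\sigma^2$, so the penalty on the constraint $\mathbf{w} = \mathbf{u}$ becomes weaker as the noise level grows.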

Staircasing Effect Reduction by the Proposed Method.
We first compare Algorithm TGV$^2$ID-ADMM with Algorithm TGV$^1$ID-ADMM to illustrate the effectiveness of the $\mathrm{TGV}_\alpha^2$ model in reducing the staircasing effect. We use $\|\mathbf{u}^{k+1} - \mathbf{u}^{k}\|_2^2 / \|\mathbf{u}^{k}\|_2^2 \le 10^{-6}$ as the stopping criterion for both algorithms, where $\mathbf{u}^{k}$ denotes the restored result in the $k$th iteration. For the second-order case, we set $(\alpha_0, \alpha_1) = (3, 1)$, whereas for the first-order case, we set $(\alpha_0, \alpha_1) = (0, 1)$.
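The stopping rule is a relative change test on successive iterates and can be written as a small helper. The function name `has_converged` and the guard for an all-zero previous iterate are assumptions for illustration.

```python
import numpy as np

def has_converged(u_new, u_old, tol=1e-6):
    """Return True if ||u_new - u_old||^2 / ||u_old||^2 <= tol."""
    denom = float(np.sum(u_old ** 2))
    if denom == 0.0:
        # Relative change is undefined for a zero iterate; keep iterating.
        return False
    return bool(np.sum((u_new - u_old) ** 2) / denom <= tol)

u_old = np.ones(4)
print(has_converged(1.0001 * u_old, u_old), has_converged(2.0 * u_old, u_old))
```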
In this experiment, we use the synthetic piecewise affine image shown in Figure 1 as the test image. The original image is first contaminated by Gaussian noise with standard deviation $\sigma = 15$. Then we apply TGV$^2$ID-ADMM and TGV$^1$ID-ADMM to remove the noise. Table 1 shows the results in terms of RMSE, PSNR, total iterations, and CPU time. The ground truth, the noisy image, and the images restored by the two algorithms are displayed in Figure 1. Furthermore, for better visualization, we additionally provide three-dimensional close-ups of the marked regions of the two restored images in Figure 1. From Table 1 we observe that TGV$^2$ID-ADMM does better than TGV$^1$ID-ADMM in terms of both RMSE and PSNR. Figure 1 shows that the image denoised by TGV$^2$ID-ADMM contains almost no artificial edges in affine regions. In contrast, the result restored by TGV$^1$ID-ADMM exhibits an obvious staircasing effect in affine regions. The three-dimensional close-ups clearly demonstrate this phenomenon. This illustrates that our TGV-based algorithm is effective in reducing the staircasing effect.
Table 1 also shows that, to accomplish the denoising task, TGV$^2$ID-ADMM usually costs more CPU time than TGV$^1$ID-ADMM, since the $\mathrm{TGV}_\alpha^2$ model involves considerably more computation. However, the cost is worthwhile given the marked improvement in both quantitative and qualitative restoration quality. Figure 2 displays the evolution of $\lambda$ and PSNR for the two algorithms. The regularization parameters of both algorithms eventually converge, which allows the two algorithms to run automatically.

Comparison in Accuracy.
In this subsection, we compare TGV$^2$ID-ADMM with two other well-known adaptive TV-based denoising algorithms: Chambolle's projection algorithm [23] and the Split Bregman algorithm [24], both of which have public online implementations at "http://www.ipol.im/". Two natural images, Lena and Peppers, both of size 512 × 512 and shown in Figure 3, are used for comparison. The parameter setting for TGV$^2$ID-ADMM is the same as that in the previous subsection. We obtain the test results of the two competitors through their online demonstrations.
We add Gaussian noise with standard deviations of 20, 30, and 40 to Lena and Peppers to obtain the noisy observations. Then we apply the three algorithms to restore the noisy images. Table 2 shows the comparison results in terms of RMSE and PSNR. The best result for each comparison item is highlighted in bold. Table 2 shows that TGV$^2$ID-ADMM is superior in both RMSE and PSNR for all the tested cases. Figure 4 displays the noisy Lena under Gaussian noise of $\sigma = 30$ and the restorations by the three algorithms, whereas Figure 5 exhibits the noisy Peppers under Gaussian noise of $\sigma = 40$ and the corresponding restorations. Figures 4 and 5 demonstrate that TGV$^2$ID-ADMM obtains results with better visual quality and efficiently suppresses the staircasing effect. In contrast, both the TV-based Chambolle's projection algorithm and the TV-based Split Bregman algorithm produce results with an obvious staircasing effect. Since the test images carry different levels of noise, the robustness of our algorithm with respect to the noise level is verified to a certain extent.
In practice, the results of ADMM are commonly influenced by the choice of the penalty parameters to a certain extent. As suggested by a referee, we add an experiment to show the robustness of the results of TGV$^2$ID-ADMM with respect to the penalty parameters, using the two denoising problems mentioned above, that is, the Lena denoising problem under Gaussian noise of $\sigma = 30$ and the Peppers denoising problem under Gaussian noise of $\sigma = 40$. We still set $\beta_1 = 10^{(\mathrm{BSNR}/10-1)} \times \beta$ and $\beta_2 = \beta_3 = \beta$ but vary $\beta$ from 0.01 to 1 with a step size of 0.01. In Figure 6, we plot PSNR versus $\beta$ for the denoised Lena and Peppers. Figure 6 demonstrates that the optimal $\beta$ lies in [0.2, 0.3] and that its location is robust to changes in image and noise level. The results of our method are sufficiently robust with respect to the penalty parameters, since the absolute difference between the maximum and the minimum PSNR is less than 0.18 dB in the experiment, which introduces no obvious distinction in visual quality. In the former two experiments, the setting $\beta = 0.3$ is approximately optimal for the proposed algorithm.

Concluding Remarks
We propose an adaptive TGV-based model for noise removal in this paper. Variable splitting (VS) and the classical augmented Lagrangian method are used to handle the proposed model. From the experimental results, we observe that the proposed algorithm is effective in suppressing the staircasing effect and preserving edges in images, and that it outperforms some other well-known adaptive denoising methods in both quantitative and qualitative assessment. Besides, our work can be readily generalized to image deblurring problems.

Figure 1: First row: ground truth and Gaussian-noised ($\sigma = 15$) piecewise affine images; second row: restored images by TGV$^2$ID-ADMM and TGV$^1$ID-ADMM; third row: three-dimensional close-ups of the marked regions in the two restored images.

Figure 6: PSNR versus $\beta$ for (a) Lena image under Gaussian noise of $\sigma = 30$ and (b) Peppers image under Gaussian noise of $\sigma = 40$, respectively. The absolute error between the maximum and the minimum of PSNR for each image is less than 0.18 dB.

Table 1: Results of the staircasing effect reduction experiment.

Table 2: Comparison results in accuracy.