A New Fast Algorithm for Constrained Four-Directional Total Variation Image Denoising Problem

A new four-directional total variation (4-TV) model, applicable to both isotropic and anisotropic TV functions, is proposed for image denoising. A dual-based fast gradient projection algorithm for the constrained 4-TV image denoising problem is also reported; it builds on the well-known gradient projection and fast gradient projection methods. Experimental results show that, in most cases, this model provides a better signal-to-noise ratio than previous models such as the reference TV, the total generalized variation, and the nonlocal total variation.


Introduction
Variational models have found a wide variety of applications in image processing and computer vision, in particular in restoration tasks such as denoising, deblurring, and blind deconvolution. One of the major concerns in machine vision remains preserving important image features (edges, lines, and textures) while removing noise. Total variation (TV) based image restoration models were first introduced by Rudin et al. (ROF) in their pioneering work [1] on edge-preserving image denoising. The ROF model has since been extended to many other problems and modified in a variety of ways to improve its performance. The unconstrained TV based denoising model (1) minimizes, over $x$, a data-fidelity term $\|x - b\|^2$ plus a penalty on $\|x\|_{\mathrm{TV}}$ scaled by $\lambda$, where $\|\cdot\|$ is the Euclidean norm, $x$ is the image to be recovered, $b$ is the observed image, and $\|\cdot\|_{\mathrm{TV}}$ stands for the discrete TV norm. The positive parameter $\lambda$ balances the measurements and the noise sensitivity.
The methods of most interest to this paper are the dual approach proposed by Chambolle [20-22], the dual-approach methods for the constrained denoising problem reported by Beck and Teboulle [23,24], and the four-directional total variation (4-TV) proposed by Sakurai et al. [16]. Chambolle developed a globally convergent first-order primal-dual algorithm for the denoising problem, which is much faster than the conventional gradient descent algorithm. Beck and Teboulle used the gradient projection (GP) and the fast gradient projection (FGP) methods. Sakurai et al. [16] first proposed the four-directional total variation for the anisotropic case, but did not provide a complete mathematical proof, as acknowledged in [16]. The FGP method relies on papers published by Nesterov [26,27], in which a fast first-order method is derived by coupling the gradient-based method with smoothing techniques. The rate of convergence of FGP is of the order $O(1/k^2)$, as opposed to the slower $O(1/k)$ rate of GP, where $k$ is the number of iterations.
In this paper, we provide the complete mathematical proof for both the anisotropic and isotropic cases, together with a proper definition of the 4-TV model. Moreover, we propose a fast constrained 4-TV algorithm for the image denoising problem. To our knowledge, this is the first time that double information is jointly used in both the time domain (the two most recent iterates) and the space domain (four directional components instead of two).
The paper is organized as follows. In Section 2, the theoretical derivation of the 4-TV model is presented, and a new fast algorithm for this model is described. Section 3 reports our experimental results, including a comparison with the total generalized variation (TGV) and the nonlocal total variation (NLTV). A discussion is provided in Section 4. Finally, a short conclusion is given in Section 5.

2.1. The Discrete Four-Directional TV Model. We first consider the discrete ROF model, a convex but nonsmooth minimization problem in which the TV norm of (1) is represented by the discrete total variation TV(⋅).
Here, we only consider images defined on a rectangular domain, so we have $x \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^{m \times n}$.
The isotropic TV sums, over all pixels, the Euclidean norm of the horizontal and vertical differences, whereas the anisotropic TV sums the absolute values of these differences. The discrete 4-TV model is an extended version of the conventional TV, proposed by Sakurai et al. [16]. It relies on four directional components (horizontal, vertical, and two diagonal components), whereas the conventional 2-directional TV variants adopt only two (horizontal and vertical), as shown in Figure 1 (extracted from [16]).
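As an illustration of the two conventional definitions, the following minimal Python sketch computes both TV values for a small image stored as nested lists. It assumes the usual forward differences, with a difference dropped when it would leave the image at the last row or column (the convention of Beck and Teboulle [23]). The function names `tv_aniso` and `tv_iso` are illustrative and are not taken from the paper.

```python
import math

def tv_aniso(x):
    """Anisotropic TV: sum of absolute vertical and horizontal differences."""
    m, n = len(x), len(x[0])
    total = 0.0
    for i in range(m):
        for j in range(n):
            if i + 1 < m:
                total += abs(x[i][j] - x[i + 1][j])
            if j + 1 < n:
                total += abs(x[i][j] - x[i][j + 1])
    return total

def tv_iso(x):
    """Isotropic TV: sum of Euclidean norms of the (vertical, horizontal) pair."""
    m, n = len(x), len(x[0])
    total = 0.0
    for i in range(m):
        for j in range(n):
            dv = x[i][j] - x[i + 1][j] if i + 1 < m else 0.0
            dh = x[i][j] - x[i][j + 1] if j + 1 < n else 0.0
            total += math.hypot(dv, dh)
    return total

# A single bright corner pixel: the two definitions differ.
corner = [[1.0, 0.0], [0.0, 0.0]]
print(tv_aniso(corner), tv_iso(corner))
```

For the corner image, the anisotropic value counts the two unit jumps separately, while the isotropic value measures them jointly as one gradient of length $\sqrt{2}$.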
Sakurai et al. [16] defined the anisotropic 4-TV as the sum of the absolute differences along these four directions. Their formula, however, did not consider the boundary conditions. Moreover, the isotropic case was not taken into consideration in their paper.
In order to get a more accurate 4-TV based model, we construct a new image of size $(m+2) \times (n+2)$ whose boundary is expanded with respect to the corrupted image $x$. The entries of the two images are tied together by boundary relations.

Figure 1: Total variation components [16]. (a) Two-directional TV. (b) Four-directional TV.

With this construction, both the anisotropic 4-TV and the isotropic 4-TV can be defined with the boundary taken into account. Moreover, the domain $\mathbb{R}^{(m+2) \times (n+2)}$ of the new image plays an important role in the 4-directional TV based dual method. Consequently, the discrete 4-directional TV model is defined by

$$\min_{x} \|x - b\|^2 + \lambda\, \mathrm{TV}_4(x).$$

By adopting the above 4-directional TV model, each iteration step uses double information in the space domain, that is, four directional components instead of two. For this reason, the new model can be expected to have more general and effective properties than the standard one in the image denoising problem. In addition, both the anisotropic and isotropic TV cases can be dealt with.
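The four-directional TV can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two diagonal directions are taken as down-right and down-left, and a difference that would leave the image is set to zero, which stands in for the boundary construction of the paper rather than reproducing it. The names `OFFSETS`, `directional_differences`, and `tv4` are hypothetical.

```python
import math

# Assumed neighbour offsets (row, column): down, right, down-right, down-left.
OFFSETS = [(1, 0), (0, 1), (1, 1), (1, -1)]

def directional_differences(x, i, j):
    """Return the four differences at pixel (i, j); out-of-image ones are 0."""
    m, n = len(x), len(x[0])
    diffs = []
    for di, dj in OFFSETS:
        i2, j2 = i + di, j + dj
        inside = 0 <= i2 < m and 0 <= j2 < n
        diffs.append(x[i][j] - x[i2][j2] if inside else 0.0)
    return diffs

def tv4(x, isotropic=True):
    """Four-directional TV: Euclidean norm (isotropic) or l1 norm (anisotropic)."""
    total = 0.0
    for i in range(len(x)):
        for j in range(len(x[0])):
            d = directional_differences(x, i, j)
            if isotropic:
                total += math.sqrt(sum(v * v for v in d))
            else:
                total += sum(abs(v) for v in d)
    return total

# A single bright corner pixel now also contributes a diagonal jump.
corner = [[1.0, 0.0], [0.0, 0.0]]
print(tv4(corner, isotropic=False), tv4(corner))
```

Compared with the two-directional case, the diagonal neighbour of the bright pixel adds one more unit jump, so the penalty reacts to edges that run along the diagonals as well.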

Constrained 4-Directional Total Variation Based Denoising
2.2.1. The Dual Approach. We consider the constrained 4-directional TV based denoising problem (12), in which the objective of the 4-directional TV model is minimized over $x \in C$, where $C$ is a closed convex subset of $\mathbb{R}^{m \times n}$ and the nonsmooth functional 4-directional TV is either the anisotropic TV or the isotropic TV. In order to avoid repetition, we mainly consider the isotropic TV and only briefly present the results for the anisotropic TV.
The TV function is nonsmooth, and this nonsmoothness is the key difficulty in problem (2). Chambolle [21] proposed a dual approach to overcome this difficulty. Beck and Teboulle [23] and Sakurai et al. [16] followed the same approach, and we also follow it to construct our constrained 4-directional dual method.
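First, we introduce the notation needed on the new image space. Here, we do not assume reflexive boundary conditions, because the boundary is already handled by the construction of the new image space. Now, let $\mathcal{P}$ be the set of matrix-groups $(p^1, p^2, p^3, p^4)$ that satisfy pointwise unit-norm constraints. The 4-directional TV can then be expressed as a maximum over $\mathcal{P}$. We introduce the linear operator $\Lambda$, which maps a matrix-group $(p^1, p^2, p^3, p^4)$ to an image in $\mathbb{R}^{m \times n}$, so that problem (12) can be written as a min-max problem. Because its objective function is concave in $(p^1, p^2, p^3, p^4)$ and convex in $x$, we can exchange max and min [28].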
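Let $P_{[\min,\max]}$ be the orthogonal projection operator onto the set $[\min, \max]$, which clips every entry of its argument to this interval. The optimal solution $x$ of the constrained 4-directional TV based denoising model (12) is obtained by applying this projection to the observed image $b$ shifted by a multiple of $\Lambda(p^1, p^2, p^3, p^4)$. By neglecting the constant term $\|b\|^2$ in (18), we obtain a maximization problem over $\mathcal{P}$, whose optimal solution $(p^1, p^2, p^3, p^4) \in \mathcal{P}$ also solves the equivalent minimization problem (22). The operator $\Lambda^T$, which is the adjoint of $\Lambda$, maps an image in $\mathbb{R}^{m \times n}$ back to a matrix-group. We consider an auxiliary function on $\mathbb{R}^{m \times n}$ in terms of which (22) can be rewritten, and the gradient of the resulting objective function is obtained in closed form. From the above derivation, we see that the dual problem (22) is a convex minimization problem whose objective function is continuously differentiable on the constraint set. Thus, first-order gradient-based algorithms can be applied.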
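To make the exchange of min and max concrete, the following worked derivation completes the square in $x$. It is a sketch under stated assumptions, not a transcription of the paper's equations: the objective is taken as $\|x-b\|^2 + \lambda\,\mathrm{TV}_4(x)$, the 4-directional TV is written as $\max_{p \in \mathcal{P}} \langle \Lambda(p), x \rangle$, $p$ abbreviates the group $(p^1, p^2, p^3, p^4)$, and $h$ denotes the dual objective obtained after dropping the constant $\|b\|^2$.

```latex
\begin{align*}
\|x-b\|^2 + \lambda\langle \Lambda(p), x\rangle
  &= \Bigl\|x - \bigl(b - \tfrac{\lambda}{2}\Lambda(p)\bigr)\Bigr\|^2
     - \Bigl\|b - \tfrac{\lambda}{2}\Lambda(p)\Bigr\|^2 + \|b\|^2, \\
x(p) &= P_C\bigl(b - \tfrac{\lambda}{2}\Lambda(p)\bigr), \\
h(p) &= \|v\|^2 - \|v - P_C(v)\|^2,
        \qquad v = b - \tfrac{\lambda}{2}\Lambda(p), \\
\nabla h(p) &= -\lambda\,\Lambda^{T} P_C\bigl(b - \tfrac{\lambda}{2}\Lambda(p)\bigr).
\end{align*}
```

If the TV term is instead weighted by $2\lambda$, as in Beck and Teboulle [23], every $\lambda/2$ above becomes $\lambda$ and the factor in front of $\Lambda^{T}$ becomes $2\lambda$.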
The only difference between the isotropic TV and anisotropic TV cases lies in the relation that replaces (15). The gradient of (24) has been obtained. In order to solve the dual problem (22), we also need to calculate the Lipschitz constant of the gradient of the objective function [23]. Let $L(h)$ be the Lipschitz constant of the gradient of the objective function given by (22).
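The Euclidean norm of a matrix-group $(p^1, p^2, p^3, p^4)$ is taken over all the entries of its four matrices. For every two groups of matrices in $\mathcal{P}$, the difference between the corresponding gradients can be bounded in this norm, which gives the Lipschitz constant in (32). The overall procedure to implement this constrained 4-directional gradient projection (4-GP) algorithm is summarized in Algorithm 1.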
In the constrained case, a group

Algorithm 1: 4-GP. Output: $x^*$, the optimal solution of (12) up to a tolerance.
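The role of the Lipschitz constant in the gradient projection step can be illustrated by a short worked bound. The following is a sketch under assumptions that are not taken from the paper: the objective is $\|x-b\|^2 + \lambda\,\mathrm{TV}_4(x)$, the dual gradient is $\nabla h(p) = -\lambda\,\Lambda^{T} P_C(v_p)$ with $v_p = b - \tfrac{\lambda}{2}\Lambda(p)$, and every pixel takes part in at most eight directional differences. It uses only the nonexpansiveness of $P_C$ and the inequality $(a-b)^2 \le 2(a^2+b^2)$.

```latex
\begin{align*}
\|\nabla h(p) - \nabla h(q)\|
  &= \lambda\,\bigl\|\Lambda^{T}\bigl(P_C(v_p) - P_C(v_q)\bigr)\bigr\|
   \le \lambda\,\|\Lambda^{T}\|\,\|v_p - v_q\|
   \le \tfrac{\lambda^2}{2}\,\|\Lambda\|^2\,\|p - q\|, \\
\|\Lambda^{T}x\|^2
  &= \sum_{\text{differences}} (x_a - x_b)^2
   \le 2\sum_{\text{differences}} \bigl(x_a^2 + x_b^2\bigr)
   \le 16\,\|x\|^2, \\
L(h) &\le 8\lambda^2, \qquad
p_k = P_{\mathcal{P}}\Bigl[\,p_{k-1} + \tfrac{1}{8\lambda}\,
      \Lambda^{T} P_C\bigl(b - \tfrac{\lambda}{2}\Lambda(p_{k-1})\bigr)\Bigr].
\end{align*}
```

With a TV weight of $2\lambda$ instead of $\lambda$, the same argument gives $L(h) \le 32\lambda^2$ and a step of $1/(16\lambda)$. In both cases the bound is twice the one obtained with two directions, because each pixel enters twice as many differences.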

The Fast Dual Approach.
The above derivations show that the dual problem (22) is a convex minimization problem whose objective function is continuously differentiable on the constraint set. The original fast gradient projection algorithm can be traced back to the gradient mapping approach proposed by Nesterov [26]. Since then, a number of new algorithms inspired by Nesterov's work have been reported [17,20,23,24,27,29].
Here, we used the constrained 4-directional fast gradient projection (4-FGP) algorithm for the denoising problem. The 4-FGP algorithm has a convergence rate of $O(1/k^2)$, obtained by utilizing double information in the time domain (the two most recent iterates), which is better than the $O(1/k)$ convergence rate of the 4-GP algorithm. Following the FGP algorithm described in [23], the 4-FGP algorithm is described in Algorithm 2.
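The structure of such a fast dual scheme can be sketched in plain Python on a tiny image. This is a minimal illustration under several assumptions that are not taken from the paper: the objective is $\|x-b\|^2 + \lambda\,\mathrm{TV}_4(x)$, the diagonal directions are down-right and down-left, differences leaving the image are dropped, $C$ is the box $[0, 1]$, and the step $1/(8\lambda)$ comes from the bound $L(h) \le 8\lambda^2$ valid under these assumptions. The momentum rule is the one of the FGP method in [23]. All function names are hypothetical.

```python
import math

# Assumed neighbour offsets (row, column): down, right, down-right, down-left.
OFFSETS = [(1, 0), (0, 1), (1, 1), (1, -1)]

def lt_op(x):
    """Adjoint operator: the four directional difference images of x."""
    m, n = len(x), len(x[0])
    out = []
    for di, dj in OFFSETS:
        d = [[0.0] * n for _ in range(m)]
        for i in range(m):
            for j in range(n):
                if 0 <= i + di < m and 0 <= j + dj < n:
                    d[i][j] = x[i][j] - x[i + di][j + dj]
        out.append(d)
    return out

def l_op(p):
    """Linear operator: maps a dual group (four matrices) to an image."""
    m, n = len(p[0]), len(p[0][0])
    x = [[0.0] * n for _ in range(m)]
    for (di, dj), pk in zip(OFFSETS, p):
        for i in range(m):
            for j in range(n):
                if 0 <= i + di < m and 0 <= j + dj < n:
                    x[i][j] += pk[i][j]
                    x[i + di][j + dj] -= pk[i][j]
    return x

def project_dual(p, isotropic=True):
    """Project onto the dual set: pixelwise unit ball (isotropic) or box."""
    m, n = len(p[0]), len(p[0][0])
    q = [[row[:] for row in pk] for pk in p]
    for i in range(m):
        for j in range(n):
            if isotropic:
                s = max(1.0, math.sqrt(sum(pk[i][j] ** 2 for pk in p)))
                for qk in q:
                    qk[i][j] /= s
            else:
                for qk in q:
                    qk[i][j] = max(-1.0, min(1.0, qk[i][j]))
    return q

def primal(b, p, lam, lo, hi):
    """x(p) = P_C(b - (lam / 2) * L(p)) with C the box [lo, hi]."""
    lp = l_op(p)
    return [[min(hi, max(lo, bv - 0.5 * lam * lv)) for bv, lv in zip(br, lr)]
            for br, lr in zip(b, lp)]

def fgp4(b, lam, n_iter, lo=0.0, hi=1.0, isotropic=True, fast=True):
    """Dual (fast) gradient projection; fast=False gives the plain GP scheme."""
    m, n = len(b), len(b[0])
    p = [[[0.0] * n for _ in range(m)] for _ in OFFSETS]
    r, t = p, 1.0
    step = 1.0 / (8.0 * lam)  # from the bound L(h) <= 8 * lam**2 assumed here
    for _ in range(n_iter):
        g = lt_op(primal(b, r, lam, lo, hi))
        moved = [[[rv + step * gv for rv, gv in zip(rr, gr)]
                  for rr, gr in zip(rk, gk)] for rk, gk in zip(r, g)]
        p_new = project_dual(moved, isotropic)
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0 if fast else 1.0
        w = (t - 1.0) / t_new  # momentum weight, zero for plain GP
        r = [[[pn + w * (pn - po) for pn, po in zip(nr, orow)]
              for nr, orow in zip(nk, ok)] for nk, ok in zip(p_new, p)]
        p, t = p_new, t_new
    return primal(b, p, lam, lo, hi)

def objective(x, b, lam):
    """Isotropic objective ||x - b||^2 + lam * TV4(x) used by this sketch."""
    m, n = len(x), len(x[0])
    d = lt_op(x)
    tv = sum(math.sqrt(sum(dk[i][j] ** 2 for dk in d))
             for i in range(m) for j in range(n))
    fid = sum((xv - bv) ** 2 for xr, br in zip(x, b) for xv, bv in zip(xr, br))
    return fid + lam * tv

# Tiny checkerboard "image" as a stand-in for a noisy picture.
b = [[float((i + j) % 2) for j in range(6)] for i in range(6)]
x_fgp = fgp4(b, lam=0.5, n_iter=60)
x_gp = fgp4(b, lam=0.5, n_iter=60, fast=False)
print(objective(b, b, 0.5), objective(x_gp, b, 0.5), objective(x_fgp, b, 0.5))
```

The only difference between the two schemes in this sketch is the extrapolation point `r`: the plain scheme takes the gradient step from the latest iterate, while the fast scheme takes it from a combination of the two most recent iterates.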

Experimental Results
The experiments were conducted on images widely used in the computer vision literature. We selected two samples from this set, the "Cameraman" and the "Moon" pictures, to illustrate the effectiveness of the proposed method. These two images, by their different contents, are representative of the large spectrum of data sets that can be considered. Our methods (4-GP and 4-FGP) were compared with the GP, FGP [23], TGV [11], and NLTV [12] methods. The peak signal-to-noise ratio (PSNR), the convergence rate, and the robustness to noise were used to evaluate the quality of the denoised images. All algorithms were implemented in MATLAB R2011b on a PC with an Intel Core 2 Duo E8400 CPU at 3 GHz and 8 GB of RAM.
The 300 × 300 gray-scale test image "Cameraman" and the 348 × 300 gray-scale test image "Moon" (Figures 2 and 3) were scaled in intensity to [0, 1]. Zero-mean Gaussian noise was then added, with standard deviations of 0.11 and 0.07 for the "Cameraman" image and the "Moon" image, respectively.
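The noise model and the quality measure used in this protocol can be sketched as follows. The snippet assumes the usual definition of the PSNR, $10 \log_{10}(\text{peak}^2 / \text{MSE})$ with a peak value of 1 for images scaled to [0, 1]; the paper does not spell out its formula, and the function names and the flat test image are illustrative.

```python
import math
import random

def add_gaussian_noise(img, sigma, seed=0):
    """Return img plus zero-mean Gaussian noise of standard deviation sigma."""
    rng = random.Random(seed)
    return [[v + rng.gauss(0.0, sigma) for v in row] for row in img]

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for intensities in [0, peak]."""
    count = sum(len(row) for row in reference)
    mse = sum((r - e) ** 2 for rr, er in zip(reference, estimate)
              for r, e in zip(rr, er)) / count
    return math.inf if mse == 0.0 else 10.0 * math.log10(peak ** 2 / mse)

clean = [[0.5] * 8 for _ in range(8)]          # flat test image in [0, 1]
noisy = add_gaussian_noise(clean, sigma=0.11)  # noise level quoted for "Cameraman"
print(round(psnr(clean, noisy), 2))
```

A fixed seed keeps the noisy input identical across methods, so that PSNR differences reflect the denoisers rather than the noise realization.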
The regularization parameter was set to $\lambda = 0.1$ for the "Cameraman" image and to $\lambda = 0.05$ for the "Moon" image in all experiments. The tolerance value for the convergence test was set to 0.0001 dB.
For the TGV method [11], the parameters were set to  0 = 1,  1 = 2,  = 0.1, and  = 0.2 throughout this paper.For the NLTV method [12], we set the patch size as 5 × 5, the number of neighbors as  = 20, and the searching window as 11 × 11.
The PSNR values obtained in the above cases for the different methods are given in the captions of Figures 2 and 3. They show that the values reached at convergence by 4-GP and 4-FGP are almost identical. They also show that the 4-GP and 4-FGP methods lead to a PSNR gain ranging, for instance, from 0.3 dB to 1 dB for the "Cameraman" image.
From Figures 2 and 3, we can see that all methods have their own advantages and drawbacks. The GP and FGP methods remove the noise but still produce a staircasing effect in the flat and smooth regions. The TGV method clears up this effect while preserving the edges when the prior information is very close to the original image, but it leads to false image features when the prior information has been corrupted by a high level of noise. The NLTV method reduces the staircasing effect, but some details are lost. The 4-GP and 4-FGP methods preserve the edges well and capture more details because the four directional components are taken into consideration.

Algorithm 2: 4-FGP.
Input: observed image; regularization parameter; number of iterations.
Output: $x^*$, the optimal solution of (12) up to a tolerance.
Step 0. Take $(r_1^1, r_1^2, r_1^3, r_1^4) = (p_0^1, p_0^2, p_0^3, p_0^4) = (0, 0, 0, 0)$.
As expected, the convergence is much faster when using 4-FGP instead of 4-GP. A detailed analysis of the convergence process makes it clear that the number of iterations is image dependent and much higher for GP than for FGP: for the "Cameraman" image, it is equal to 124 for 4-GP and to 41 for 4-FGP; for the "Moon" image, the corresponding numbers are 103 and 51. The computation time varies accordingly: for the "Cameraman" image, it goes from about 55 seconds for 4-GP to approximately 18 seconds for 4-FGP, and for the "Moon" image from about 53 seconds for 4-GP to approximately 27 seconds for 4-FGP.
Figure 4 shows that the values reached at convergence by 4-GP and 4-FGP are almost the same. They are similar to those obtained by GP and FGP. We therefore consider only FGP and 4-FGP in the next experiments in order to avoid repetition.
In the examples described above, the noise level and $\lambda$ were set a priori. Let us now examine the sensitivity to these parameters to explain the rationale behind our choices. We varied $\lambda$ from 0 to 0.1 for the "Cameraman" image and from 0 to 0.05 for the "Moon" image.
The results are provided in Figure 5. They show that the 4-FGP method always performs better than the GP and NLTV methods in terms of PSNR for the two images, the only exception being the TGV method when the value of $\lambda$ is small.
Taking the previous results into account, the effect of noise was then analyzed. The selected parameters were $\lambda = 0.1$ for the "Cameraman" image and $\lambda = 0.05$ for the "Moon" image. The noise level was varied from 0 to 0.3 in the first case and from 0 to 0.15 in the second case. Figure 6 depicts the evolution of the PSNR.
The FGP, TGV, and NLTV methods perform worse than the 4-FGP method when the noise level is high, and better than it when the noise level is rather low.

Discussion
The methods were compared on three other images, and additional comments are provided in this section. The objective was to see whether they are stable enough to be generalized to any type of image. Two images, "Lena" and "Woman," were used by Chambolle in [21], and the third one, the "Louvre" image, was used in [20]. The experimental conditions were a Gaussian noise level of 0.1 and $\lambda = 0.1$ for all three images. Table 1 presents the PSNR obtained by the different methods. These results confirm the interest of 4-TV, even if the benefits remain modest.

Conclusion
In this paper, the 4-GP and 4-FGP methods were proposed for image denoising. We added the diagonal components to the conventional TV model and provided the complete mathematical proof of the relevance of the new model. Moreover, the 4-FGP algorithm makes use, for the first time, of double information in both the time and space domains. Experimental results show that the 4-GP and 4-FGP methods lead to better denoising results in most cases. In future work, we will address the weighted 4-GP and 4-FGP frames.