A Fast Alternating Minimization Algorithm for Nonlocal Vectorial Total Variational Multichannel Image Denoising

The variational models with nonlocal regularization offer superior image restoration quality over traditional methods, but the processing speed remains a bottleneck due to the computational cost of the existing iterative algorithms. In this paper, a fast algorithm is proposed to restore multichannel images corrupted by additive Gaussian noise by minimizing an energy function consisting of an ℓ2-norm fidelity term and a nonlocal vectorial total variation (NL-VTV) regularization term. The algorithm is based on the variable splitting and penalty techniques in optimization. Following our previous work on the existence and uniqueness of the solution of the model, we establish and prove the convergence properties of this algorithm: finite convergence for some variables and q-linear convergence for the rest. Experiments show that the model has an excellent texture-preserving property in restoring color images. Both the computational complexity analysis and the experimental results show that the proposed algorithm compares favorably with the widely used fixed point algorithm.


Introduction
The restoration of multichannel images is one of the most important inverse problems in image processing. Among the restoration methods, vectorial total variation (VTV) regularization is popular for preserving discontinuities in restored images. VTV was introduced in [1] for multichannel image restoration as a vectorial extension of the total variation (TV) of gray-scale image processing.
The well-known TV methods have drawbacks due to the local nature of the image gradient: images that contain many small structures or textures cannot be restored satisfactorily. As a remedy, TV regularization based on nonlocal (NL) operators is widely used in gray-scale image processing because of its structure-preserving property. This approach generalizes the Yaroslavsky filter and block-based methods; the main idea is to use similar blocks in the whole image to estimate the current pixel value. It was introduced by Efros and Leung [2] in the study of texture synthesis. Buades et al. [3] then generalized the nonlocal idea and proposed the well-known neighborhood denoising filter, nonlocal means (NLM). Afterwards, Gilboa and Osher [4] incorporated the nonlocal approach into the regularization framework by defining nonlocal TV models, including the NL-TV-ℓ1, NL-TV-ℓ2, and NL-TV-G models, and proposed algorithms for each. The NL-TV model preserves both edges and other small structures in images and performs very well in applications.
Our previous work extends NL-TV regularization to NL-VTV regularization for multichannel image restoration (see [5]). The corresponding algorithms are based on the calculus of variations, solving the Euler-Lagrange equations by fixed point iterations. These methods are effective but have a low convergence rate.
Recently, optimization algorithms based on variable-splitting and penalty techniques have attracted much attention in image and signal processing, such as dual methods, Bregman methods, and the alternating direction method of multipliers (ADMM) (see [6][7][8]). Correspondingly, these methods have been extended to multichannel images. Bresson and Chan [6] extended Chambolle's dual algorithm for the TV problem to vectorial images by giving the standard dual definition of VTV. Wang et al. [9, 10] applied the alternating minimization algorithm to the VTV model for multichannel image deblurring.
In this paper, we propose a fast alternating algorithm for the NL-VTV model to restore color images from noisy observations. The paper is organized as follows. Section 2 briefly introduces the NL-VTV model and the classical fixed point algorithm. In Section 3, we propose the alternating algorithm for the NL-VTV model. In Section 4, we analyze the convergence of this algorithm. In Section 5, we apply the algorithm to color image denoising to demonstrate the texture-preserving effect of the NL-VTV model and the high convergence rate of the proposed algorithm.

Nonlocal Vectorial Total Variation Model
In this section, we briefly introduce the nonlocal vectorial total variation model for multichannel image processing, together with the commonly used iterative algorithm, which will be compared with our algorithm in Section 5.

Vectorial Total Variation Model
Consider a vector-valued function u = (u_1, ..., u_M) (e.g., a multichannel image) defined on a bounded open domain Ω ⊂ R². The regularized VTV model for multichannel image reconstruction can be written as

min_u ∫_Ω |∇u| dx + (λ/2) ∫_Ω |u − u_0|² dx,

where u_0 is the observed noisy multichannel image and λ > 0 is the regularization parameter. The vectorial gradient of u is defined as ∇u = (∇u_1, ..., ∇u_M), with |∇u| = (Σ_{i=1}^{M} |∇u_i|²)^{1/2}. We refer to [6] for the well-posedness analysis of the VTV model.
The similarity of two pixels x, y ∈ Ω of a reference image v can be measured by the weight function

w(x, y) = exp( −(G_a * |v(x + ·) − v(y + ·)|²)(0) / h² ),   (4)

where G_a is the Gaussian kernel with standard deviation a and h is the filter parameter. Generally, h corresponds to the noise level and is usually set to the standard deviation of the noise. The weight function is computed from the image blocks centered at x and y, and its value reflects the similarity of the two blocks. The nonlocal means estimate at pixel x is

NL(v)(x) = (1/C(x)) ∫_Ω w(x, y) v(y) dy,

where the normalizing factor is C(x) = ∫_Ω w(x, y) dy. This method is thus a special weighted filter: the estimate at pixel x is a weighted average of the values at all pixels y ∈ Ω of the image. Let the two-dimensional image domain be Ω ⊂ R², let x, y ∈ Ω, and let u : Ω → R be a real function. The nonlocal partial derivative is

∂_y u(x) = (u(y) − u(x)) √w(x, y),   (7)

where w : Ω × Ω → R is the nonnegative symmetric weight function defined in (4), computed from a reference image.
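For concreteness, the patch-based weight (4) can be sketched in NumPy as follows. This is an illustrative sketch, not the implementation used in our experiments; the patch size, search radius, and the function name `nlm_weights` are our own choices.

```python
import numpy as np

def nlm_weights(v, x, patch=3, search=10, h=20.0, a=1.5):
    """Weights w(x, y) between pixel x and every pixel y of a search
    window, from Gaussian-weighted patch distances (cf. Eq. (4))."""
    r = patch // 2
    # Gaussian kernel G_a over the patch, normalized to sum 1
    g = np.exp(-(np.arange(-r, r + 1) ** 2) / (2 * a ** 2))
    G = np.outer(g, g)
    G /= G.sum()
    vp = np.pad(v, r, mode="reflect")
    px = vp[x[0]:x[0] + patch, x[1]:x[1] + patch]   # patch centered at x
    H, W = v.shape
    w = {}
    for i in range(max(0, x[0] - search), min(H, x[0] + search + 1)):
        for j in range(max(0, x[1] - search), min(W, x[1] + search + 1)):
            py = vp[i:i + patch, j:j + patch]        # patch centered at y
            d2 = (G * (px - py) ** 2).sum()          # weighted patch distance
            w[(i, j)] = np.exp(-d2 / h ** 2)
    return w
```

Identical patches give weight 1, and the weight decays with the patch distance, so flat regions and repeated textures anywhere in the search window contribute strongly to the estimate.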
The nonlocal gradient ∇_w u(x, ·) is defined as the vector of all nonlocal partial derivatives at position x, ∇_w u(x, y) = (u(y) − u(x)) √w(x, y), where ∇_w u : Ω → Ω × Ω. The nonlocal divergence div_w p : Ω × Ω → Ω of a vector p is defined through its adjoint relation with the nonlocal gradient, ⟨∇_w u, p⟩ = −⟨u, div_w p⟩, which gives

div_w p(x) = ∫_Ω (p(x, y) − p(y, x)) √w(x, y) dy.   (9)

Thus the nonlocal Laplacian is defined as

Δ_w u(x) = (1/2) div_w(∇_w u)(x) = ∫_Ω (u(y) − u(x)) w(x, y) dy.   (11)

The NL-TV norm of u ∈ L¹(Ω), based on the weighted gradient ∇_w, is

∫_Ω |∇_w u| dx = ∫_Ω ( ∫_Ω (u(y) − u(x))² w(x, y) dy )^{1/2} dx,

where the function w(x, y) is defined in (4). The NL-TV model can then be written as

min_u ∫_Ω |∇_w u| dx + (λ/2) ∫_Ω (u − u_0)² dx.

The well-posedness of the NL-TV model is guaranteed in [11].
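The discrete nonlocal operators can be sketched with a dense weight matrix (real implementations keep W sparse over the neighbor sets); the function names are illustrative:

```python
import numpy as np

def nl_gradient(u, W):
    """(grad_w u)[x, y] = (u[y] - u[x]) * sqrt(w[x, y]) for all pixel pairs."""
    return (u[None, :] - u[:, None]) * np.sqrt(W)

def nl_divergence(p, W):
    """(div_w p)[x] = sum_y (p[x, y] - p[y, x]) * sqrt(w[x, y])."""
    return ((p - p.T) * np.sqrt(W)).sum(axis=1)

def nl_laplacian(u, W):
    """(lap_w u)[x] = sum_y (u[y] - u[x]) * w[x, y] = 0.5 * div_w(grad_w u)."""
    return W @ u - W.sum(axis=1) * u
```

For symmetric W, these operators satisfy div_w(∇_w u) = 2Δ_w u and the adjoint relation ⟨∇_w u, p⟩ = −⟨u, div_w p⟩, matching the definitions above.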

Nonlocal Vectorial Total Variation Model and the Fixed Point Iterative Algorithm

In this subsection, the nonlocal gradient is used to reconstruct the variation norm in the vectorial total variation model, deducing the NL-VTV model.
The vectorial nonlocal gradient is derived by generalizing the nonlocal gradient defined previously: ∇_w u = (∇_{w_1} u_1, ..., ∇_{w_M} u_M), where ∇_{w_i} u_i(x, y) = (u_i(y) − u_i(x)) √w_i(x, y), i = 1, ..., M, represents the gradient of each channel separately, computed from the image of its own channel.
Replacing the gradient ∇u in the VTV norm by the vectorial nonlocal gradient ∇_w u, we obtain the NL-VTV model

min_u ∫_Ω |∇_w u| dx + (λ/2) ∫_Ω |u − u_0|² dx,   (15)

where ∫_Ω |∇_w u| dx is the VTV regularization based on the nonlocal gradient, which can be written as

∫_Ω |∇_w u| dx = ∫_Ω ( Σ_{i=1}^{M} ∫_Ω (u_i(y) − u_i(x))² w_i(x, y) dy )^{1/2} dx.

The existence and uniqueness of the solution to (15) were proven in our previous work, where a fixed point iterative algorithm for (15) was also proposed. We give a brief introduction as follows.
The Euler-Lagrange equation of the functional in (15) is

div_{w_i}( ∇_{w_i} u_i / |∇_w u| ) − λ (u_i − u_{0,i}) = 0,   i = 1, ..., M.

The nonlocal curvature is computed channel by channel, that is,

κ_i(x) = div_{w_i}( ∇_{w_i} u_i / |∇_w u| )(x) = ∫_Ω (u_i(y) − u_i(x)) w_i(x, y) ( 1/|∇_w u(x)| + 1/|∇_w u(y)| ) dy,   (19)

where |∇_w u(x)| = ( Σ_{i=1}^{M} ∫_Ω (u_i(y) − u_i(x))² w_i(x, y) dy )^{1/2}. According to (19), when the weight function is fixed, the Euler-Lagrange equation is linear and can be solved by a gradient-descent-type algorithm as follows.
First of all, we discretize (19). Denote by u_{i,α} the pixel value at position α of the i-th channel image, where 1 ≤ i ≤ M and 1 ≤ α ≤ Q (if the size of the image is m × n, then Q = m × n). Let w_{α,β} be the discrete value of the weight function w(x, y), that is, the weight between pixel α and pixel β; the pixel β traverses the image channel in which pixel α lies. Denote the neighbor set N_α := {β : w_{α,β} > τ}, where τ is a threshold judging the similarity of the two blocks centered at pixels α and β, respectively. Let ∇_w be the discrete nonlocal gradient,

(∇_w u)_{α,β} := (u_β − u_α) √w_{α,β},   β ∈ N_α,

and let div_w be the discrete nonlocal divergence,

(div_w p)_α := Σ_{β ∈ N_α} (p_{α,β} − p_{β,α}) √w_{α,β}.

For functions, the discrete inner product is ⟨u, v⟩ := Σ_α u_α v_α, while, for vectors, the discrete dot product is (p, q)_α := Σ_β p_{α,β} q_{α,β} and the discrete inner product is ⟨p, q⟩ := Σ_α Σ_β p_{α,β} q_{α,β}. The norm of a vector is |p|_α := √((p, p)_α). Then for channel i, 1 ≤ i ≤ M, the discrete nonlocal curvature is

κ_{i,α} = Σ_{β ∈ N_α} (u_{i,β} − u_{i,α}) w_{α,β} ( 1/|∇_w u|_α + 1/|∇_w u|_β ).

According to the fixed point idea, we use the multichannel image u(k) at step k to update the image u(k+1) at step k+1, with the weights calculated from the original observed image. Setting κ_{i,α} = λ(u_{i,α} − u_{0,i,α}) and solving the resulting linear relation for u_{i,α}, we obtain the iteration formula

u_{i,α}(k+1) = ( λ u_{0,i,α} + Σ_{β ∈ N_α} c_{α,β}(k) u_{i,β}(k) ) / ( λ + Σ_{β ∈ N_α} c_{α,β}(k) ),   c_{α,β}(k) := w_{α,β} ( 1/|∇_w u(k)|_α + 1/|∇_w u(k)|_β ),

where α ∈ Ω and β ∈ N_α.
The convergence of this fixed point algorithm has been proved in our in-press work. However, when |∇_w u| = 0 the NL-TV functional is nondifferentiable, as in the classical TV model, so a regularization √(|∇_w u|² + ε) is used in the calculation of κ_{i,α} to prevent the denominator from vanishing. Consequently, this algorithm takes many iteration steps to obtain the best restoration result, which motivates us to design a fast alternating minimization algorithm instead.
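The ε-regularized fixed point iteration can be sketched as follows; this is a minimal dense-matrix version for flattened channels, and the names `lam`, `eps`, and `fixed_point_nlvtv` are our own choices.

```python
import numpy as np

def fixed_point_nlvtv(u0, W, lam=0.05, eps=1e-8, iters=50):
    """Lagged-diffusivity fixed point iteration for NL-VTV denoising.
    u0: (Q, M) noisy channels (pixels flattened); W: (Q, Q) symmetric
    nonnegative nonlocal weights. eps regularizes the gradient norm to
    avoid a zero denominator, as discussed in the text."""
    u = u0.copy()
    for _ in range(iters):
        # |grad_w u|_a over all channels: sqrt(sum_i sum_b (u_ib-u_ia)^2 w_ab + eps)
        d2 = sum(W * (u[:, i][None, :] - u[:, i][:, None]) ** 2
                 for i in range(u.shape[1]))
        norm = np.sqrt(d2.sum(axis=1) + eps)                  # per-pixel norm
        C = W * (1.0 / norm[:, None] + 1.0 / norm[None, :])   # c_ab coefficients
        s = C.sum(axis=1)
        # u_ia <- (lam*u0_ia + sum_b c_ab u_ib) / (lam + sum_b c_ab)
        u = (lam * u0 + C @ u) / (lam + s)[:, None]
    return u
```

Each sweep is a weighted averaging whose coefficients are recomputed from the current gradient norms, which is why many iterations are needed when ε is small.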

The Alternating Minimization Algorithm
To overcome the nondifferentiability of the VTV functional, in this section we design an alternating minimization algorithm to solve problem (15). The main idea is to introduce an auxiliary variable so that the original problem splits into two subproblems: one differentiable with respect to the original variable and the other convex with respect to the auxiliary variable.
Yang et al. [10] have already studied the alternating minimization algorithm for a general model

min_u Σ_α ‖D_α u‖ + (μ/2) ‖K u − u_0‖².   (26)

Model (26) reduces to (15) when D = I_M ⊗ ∇_w, K = I, and μ = λ, where I_M is the identity matrix of order M and ⊗ represents the Kronecker product. Yang et al. pointed out in [10] that if D is a finite difference operator, then, under suitable boundary conditions on u and some assumptions on the blurring kernel K, problem (26) can be transformed efficiently by the fast Fourier transform, which diagonalizes the block circulant matrices D, K, and their transposes, and then solved by a fast method such as Gaussian elimination. However, the nonlocal gradient operator ∇_w in (15) is not block circulant, so we need another way to speed up the algorithm. We briefly state the algorithm in this section; its convergence and other properties are analyzed in Section 4.

Variable Splitting (Half-Quadratic Formulation of (15)).
We introduce an auxiliary variable d to approximate the nonlocal gradient ∇_w u of u. The minimization problem (15) is then transformed into the equivalent constrained problem

min_{u,d} ∫_Ω |d| dx + (λ/2) ∫_Ω |u − u_0|² dx   s.t.   d = ∇_w u.   (27)

In order to solve problem (27), we use the penalty method to reformulate it as

min_{u,d} ∫_Ω |d| dx + (β/2) ∫_Ω |d − ∇_w u|² dx + (λ/2) ∫_Ω |u − u_0|² dx,   (28)

where β > 0 is the penalty parameter. It is easily proved that, as β → ∞, the solution of problem (28) converges to the solution of problem (15).

Alternating Minimization Algorithm.
For fixed u, minimizing (28) with respect to d becomes

min_d ∫_Ω |d| dx + (β/2) ∫_Ω |d − ∇_w u|² dx.   (29)

The functional in (29) is convex with respect to d. According to Lemma 3.3 in [10], problem (29) has the unique solution

d = shrink( ∇_w u, 1/β ),   (30)

where the multidimensional shrinkage operator is shrink(x, ρ) = (x/|x|) · max{|x| − ρ, 0}, with the convention 0 · (0/0) = 0. For fixed d, minimizing (28) with respect to u becomes

min_u (β/2) ∫_Ω |d − ∇_w u|² dx + (λ/2) ∫_Ω |u − u_0|² dx.   (31)

The functional in (31) is differentiable with respect to u, so its minimizer satisfies the first order optimality condition, that is, the Euler-Lagrange equation of (31):

λ (u − u_0) = β div_w( ∇_w u − d ).

After some simple calculation, using (11), we get

λ (u − u_0) = 2β Δ_w u − β div_w d,

that is,

( I − (2β/λ) Δ_w ) u = u_0 − (β/λ) div_w d.   (35)

Finally, the multichannel image u is calculated as

u^{k+1} = ( I − (2β/λ) Δ_w )^{-1} ( u_0 − (β/λ) div_w d^{k+1} ).

Since the graph Laplacian Δ_w is negative semidefinite, the operator I − (2β/λ)Δ_w is diagonally dominant, so u^{k+1} can be solved by a Gauss-Seidel algorithm. However, subproblem (35) consists in solving linear systems of size Q × Q for each image channel, so the computational cost of Gauss-Seidel is heavy. Inspired by Chen and Cheng [12], and without loss of generality, we choose β = λ/2, so that (35) becomes (I − Δ_w) u = u_0 − (1/2) div_w d. Furthermore, approximating by a first order Taylor expansion,

( I − Δ_w )^{-1} ≈ I + Δ_w = W,

where W = (w_{α,β}) ∈ R^{Q×Q} is the weight matrix of the similar blocks of each channel, normalized to a symmetric matrix with the sum of every row equal to 1 (so that Δ_w = W − I). Thus the solution to (35) is approximated by

u^{k+1} = W ( u_0 − (1/2) div_w d^{k+1} ).   (37)

According to (9), the divergence at pixel α is (div_{w_i} d_i)_α = Σ_{β ∈ N_α} (d_{i,α,β} − d_{i,β,α}) √w_{α,β}, i = 1, ..., M. To conclude this section, we summarize the alternating minimization procedure as Algorithm 1.
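A minimal sketch of the resulting alternating scheme, under our reading that β = λ/2 and with a row-normalized dense weight matrix standing in for W (the names below are illustrative, not the implementation used in the experiments):

```python
import numpy as np

def alternating_nlvtv(u0, W, lam=0.05, beta=None, iters=30):
    """Sketch of Algorithm 1: alternate the d-step (30), a multidimensional
    shrinkage of the nonlocal gradient, and the u-step (37), one pass of
    the row-normalized weight matrix. u0: (Q, M) noisy channels (pixels
    flattened); W: (Q, Q) symmetric nonnegative nonlocal weights."""
    if beta is None:
        beta = lam / 2.0                       # simplifying choice in the text
    Wn = W / W.sum(axis=1, keepdims=True)      # rows sum to 1, cf. (37)
    sqw = np.sqrt(W)[:, :, None]
    u = u0.copy()
    for _ in range(iters):
        # d-step: d = shrink(grad_w u, 1/beta), per pixel over (beta, channel)
        G = (u[None, :, :] - u[:, None, :]) * sqw
        n = np.sqrt((G ** 2).sum(axis=(1, 2)))[:, None, None]
        d = np.where(n > 0, G / np.maximum(n, 1e-30), 0.0) \
            * np.maximum(n - 1.0 / beta, 0.0)
        # u-step: u = W (u0 - (beta/lam) div_w d), the approximation (37)
        div = ((d - d.transpose(1, 0, 2)) * sqw).sum(axis=1)
        u = Wn @ (u0 - (beta / lam) * div)
    return u
```

Both steps are closed-form, which is exactly why the scheme avoids the ε-regularization and the many sweeps of the fixed point iteration.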

Convergence Analysis
It has been proved in our in-press work that model (15) has a unique solution; in this section, we prove that the solution obtained by Algorithm 1 is that unique one. First, the convergence of the quadratic penalty method as the penalty parameter goes to infinity is well known [13]: as β → ∞, the solution of (28) converges to that of (15). We therefore analyze the convergence properties of Algorithm 1 for a fixed β > 0.
We show that, for any fixed β, the algorithm that minimizes over u and d alternately has finite convergence for some variables and q-linear convergence for the rest.

Proof. According to (30) and (37), the functional E(u, d) in (28) satisfies

E(u^{k+1}, d^{k+1}) ≤ E(u^k, d^{k+1}) ≤ E(u^k, d^k).   (38)

Therefore, the sequence {E(u^k, d^k)} is nonincreasing; since it is also nonnegative, it converges in R. We denote its limit by m. We prove in the following that m = E(u*, d*), where (u*, d*) is the solution of (28).
Because E is coercive and the sequence {E(u^k, d^k)} is convergent, the sequence {(u^k, d^k)} is bounded. Thus there exists a subsequence {(u^{k_j}, d^{k_j})} that converges to an accumulation point (û, d̂) as k_j → ∞. According to (30) and (37), for all k_j ∈ N and all d,

E(u^{k_j}, d^{k_j+1}) ≤ E(u^{k_j}, d),   (39)

and, for all k_j ∈ N and all u,

E(u^{k_j+1}, d^{k_j+1}) ≤ E(u, d^{k_j+1}).   (40)

Let d̃ denote an accumulation point of the sequence {d^{k_j+1}}, which is bounded by (38) and the continuity of E. Because the shrinkage operator in (30) is continuous, we may take limits on both sides of (30) and obtain

d̃ = shrink( ∇_w û, 1/β ).   (41)

Taking limits in (39) and (40) gives

E(û, d̃) ≤ E(û, d) for all d,   E(ũ, d̃) ≤ E(u, d̃) for all u.   (43)

According to (43), (0, 0) ∈ ∂E(û, d̂). Thus every limit point of a subsequence of {(u^k, d^k)} is a solution to (28); by the uniqueness of the solution, the whole sequence {(u^k, d^k)} converges to (u*, d*).
where ν < 1 is a positive constant. Then we have the following conclusion.

Experiments
In this section, we apply Algorithm 1 to color image denoising to test its efficiency. We first compare this method with that in [10] to show the detail-preserving property of the nonlocal VTV model; we then compare the convergence rate of this algorithm with that of the fixed point algorithm presented in Section 2.

Texture-Preserving Property of the NLVTV Model.
We choose the color Barbara picture of size 300 × 300 × 3 and the color Baboon picture of size 256 × 256 × 3 for testing; both contain many textures, such as the woman's scarf and the baboon's beard. We add Gaussian white noise with standard deviation 20 to the two original pictures and then restore them with the VTV model and the NLVTV model, separately. In this experiment, the signal-to-noise ratio (SNR), in dB, is used to evaluate the restoration quality:

SNR = 10 log_10 ( var(u) / var(u − ũ) ),

where u is the clean image, ũ is the image being evaluated, and var(A) represents the variance of all the elements of matrix A.
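A sketch of this variance-based SNR (the function name and the exact normalization are our assumptions):

```python
import numpy as np

def snr_db(u_clean, u_restored):
    """SNR = 10 log10( var(u_clean) / var(u_clean - u_restored) ), in dB,
    where var(.) is the variance over all matrix elements."""
    return 10.0 * np.log10(u_clean.var() / (u_clean - u_restored).var())
```

With this definition, shrinking the residual noise by a factor of 10 raises the SNR by exactly 20 dB, which makes the scores in Table 1 directly comparable across methods.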
As shown in Figure 1 and Table 1, the NLVTV model obtains SNR similar to the VTV model. Figure 2 shows the results of the experiment on the color Barbara picture, and Figure 3 shows a zoomed part of the restored image, the woman's scarf: the NLVTV model preserves the details better than the VTV model does. Figures 4 and 5 show the results of the experiment on the color Baboon picture. Although the NLVTV model obtains a lower SNR than the VTV model here, the textures in this picture are better preserved by the NLVTV model.

Computation Complexity Analysis and the Convergence Rate Comparison

In this subsection, we compare the computational complexity and the convergence rate of this algorithm with those of the fixed point algorithm. First, we analyze the computational complexities of the two algorithms, respectively. Assume the size of the multichannel image is m × n × M, with Q = m × n pixels per channel, and the size of the nonlocal search window is s × s. We do not count the cost of computing the nonlocal weights, because that step is executed identically in both algorithms.
At each iterative step of Algorithm 1, namely (30) and (37), the dominant cost of (30) consists of the computation of the nonlocal gradient and the shrinkage over the s × s search window of each pixel in each channel, that is, O(MQs²) operations. The dominant cost of (37) consists of one nonlocal divergence, O(MQs²), and one multiplication by the sparse weight matrix W, again O(MQs²). Hence the total cost per iteration is O(MQs²), requiring three sparse sweeps over the image.
On the other hand, each iterative step of the fixed point algorithm computes the nonlocal curvature, which requires the gradient norms at both pixels of every weighted pair and costs O(MQs²), followed by three further sparse matrix multiplications, each of cost O(MQs²). The total cost per iteration is again O(MQs²), but with roughly four sparse sweeps and a larger constant.
Consequently, the per-iteration cost of the proposed algorithm is lower than that of the fixed point algorithm. The convergence rate can be visualized intuitively by plotting the standard deviation of the change of the restored image against the iteration step; Figure 6 shows that the alternating minimization algorithm converges faster.

Conclusion
In this paper, we propose a fast alternating minimization algorithm for multichannel image processing, based on the recently introduced nonlocal vectorial total variation model. We use the variable splitting and penalty techniques to split the energy minimization problem into two subproblems: a simple ℓ1 minimization problem, solved by a thresholding (shrinkage) method, and a simple ℓ2 minimization problem, solved by the calculus of variations. As a result, the algorithm is faster than the classical algorithm because it avoids dealing directly with the nondifferentiability of the nonlocal TV norm.
The main contribution of this paper is the introduction of the alternating minimization algorithm for the nonlocal vectorial total variation model, which is proved to be fast and can be applied to many multichannel image processing problems. In this paper, we apply the algorithm to RGB color image denoising as an example; it can easily be extended to other vectorial problems, such as color image deblurring, and to other color image models, such as the hue-saturation-value (HSV) model, as well.
Thus d̂ = arg min_d E(û, d). According to (41), d̃ = arg min_d E(û, d) as well. Since E is strictly convex with respect to d, the minimizer of min_d E(û, d) is unique, and hence d̃ = d̂. Similarly, we can prove that ũ = û.

Figure 2: Comparison between the NLVTV and the VTV in the Barbara image.

Figure 3: Zoom in the Barbara image.

Figure 4: Comparison between the NLVTV and the VTV in the Baboon image.

Figure 5: Zoom in the Baboon image.

Figure 6: Comparison of the convergence rate.

Figure 6 shows the comparison of the convergence rate between the fixed point iteration and the proposed alternating minimization algorithm. The horizontal axis represents the iterative step, and the vertical axis represents the standard deviation of the change of the restored image at each step,

std( u^k − u^{k−1} ),   (50)

where u^k and u^{k−1} represent the restored images of step k and step k − 1, respectively. When calculating std(A), we reshape the matrix A into a vector a whose elements are taken columnwise from A; then

std(A) = std(a).   (51)
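The convergence metric (50)-(51) can be sketched as follows; the sample normalization (ddof=1) is our assumption, and the function name is illustrative:

```python
import numpy as np

def step_change(u_k, u_km1):
    """std(u^k - u^(k-1)): reshape the difference columnwise into a
    vector, as in (51), and take its sample standard deviation."""
    a = (u_k - u_km1).reshape(-1, order="F")   # columnwise, as in the text
    return a.std(ddof=1)
```

Plotting this quantity per iteration for both algorithms reproduces the comparison of Figure 6.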

Table 1: Comparison of the highest SNR obtained.