Variable Splitting Based Method for Image Restoration with Impulse Plus Gaussian Noise

Image denoising is a fundamental problem in realm of image processing. A large amount of literature is dedicated to restoring an image corrupted by a certain type of noise. However, little literature is concentrated on the scenario of mixed noise removal. In this paper, based on the model of two-phase method for image denoising proposed by Cai et al. (2008) and the idea of variable splitting, we are capable of decomposing the image denoising problem into subproblems with closed form. Numerical results illustrate the validity and robustness of the proposed algorithms, especially for restoring the images contaminated by impulse plus Gaussian noise.


Introduction
When recording a real-world scene by camera, we desire to acquire an ideal image which is a faithful representation of the scene.However, most observed images are more or less involved in blurry and noise.Hence, deblurring and denoising to the observed image are fundamental aspects in image processing.Let x ∈ R  1 × 2 be an ideal image scaled in [0, 1].Hereafter, we represent an  1 ×  2 image x as  ∈ R  ( =  1  2 ) by stacking its columns as a vector. ∈ R × is matrix representation of spatially invariant point spread function (PSF) which characterizes the blurry effects on imaging system and n ∈ R  is additive noise with some statistical distributions (e.g., Gaussian noise follows normal distribution, impulse noise follows binomial distribution or uniform distribution, and uniform white noise follows uniform distribution).Accordingly, the observed image  0 ∈ R  involved in the blurring matrix  and the additive noise n can be boiled down to The main task of image restoration is to recover the ideal image  from the observed image  0 .There are two difficulties in finishing this task.The first one is that the singular values of  decay asymptotically to zero which results in a large condition number, and as a consequence ( 1) is an ill-posed inverse problem in mathematical sense.The second difficulty is that the inevitably additive noise n in the observed image makes (1) difficult to handle.These two difficulties make the linear least squares solvers generally fail in deriving ideal image  and regularization strategy is therefore introduced to improve the numerical performances, for example, Tikhonov regularization [1], Mumford-Shah regularization [2], and total variation (TV) regularization [3].The main superiority of TV regularization is that it can regularize images without blurring their edges.The common models utilizing TV regularization for image restoration are the constrained model min ‖|∇|‖ 1 s.t.
−  0      ≤ , and it is equivalent to (or rather equivalent under a proper , see [4]) the unconstrained model where ,  are parameters depending on noise variance; ∇ is first-order derivative operator and ‖|∇ ⋅ |‖ 1 is the TV-norm 2 Mathematical Problems in Engineering (details in Section 2); ‖ ⋅ ‖  denotes the -norm in Euclidean space (if  = ∞, ‖ ⋅ ‖   herein should be comprehended as ‖ ⋅ ‖ ∞ ).Commonly,  is hinged on the statistical distribution of the additive noise n; for example,  = 1 for impulse noise removal,  = 2 for Gaussian noise removal, and  = ∞ for uniform white noise removal.It is therefore pivotal to figure out statistical information of n before opting for mathematical model.
Recently, Huang et al. [5] which is identical with (4).They employed the alternating minimization (AM) algorithm to handle model ( 4) in variables  and , respectively.The numerical results therein indicated that the model was attractive.Lately, they extended model (4) to the mixed noise (impulse plus Gaussian noise) removal in [6].
where A ⊂ Ω fl {1, 2, . . ., } contains the noise-free pixels (pixels are not contaminated by impulse noise);  A is the characteristic function of set A. Specifically, for any  ∈ R  ,  ∈ {1, 2, . . ., },  A () can be defined as which is convex and nonsmooth.Then, by using the incomplete data set in A, TV minimization method is constructed for image restoration and suitable values can be filled in the detected noisy pixel locations.
As interposition, we expand in this paragraph about mixed noise removal.Gaussian noise [3] generally results from the analog-to-digital conversion of measured voltages and it corrupts an image by exerting subtle perturbations on the gray scale of all pixels.Impulse noise [7] typically arises from the malfunctioning arrays in camera sensors or faulty memory locations in hardware and it affects an image by seriously devastating the gray scales of some pixels (but keeping the other pixels noise-free).Salt-and-pepper noise [8] and random-valued noise [9] are two categories of impulse noise.Let [ min ,  max ] be the dynamic interval of image .For given noise level (also the so-called noise intensity)  ∈ (0,1), salt-and-pepper noise and random-valued noise contaminate image , respectively, by the following: (i) Salt-and-pepper noise (see [10]) (ii) Random-valued noise (see [11]) where N imp : R  → R  represents the procedure of contaminating image  with impulse noise and   satisfies uniform distribution in interval [ min ,  max ].
Figure 1 shows distinctions of image contaminated by saltand-pepper noise and random-valued noise.Visually, the salt-and-pepper noise corrupted image possesses black-white speckles while the random-valued noise corrupted image has speckles with various gray scales.The observed image  0 is contaminated by mixed noise via where n  is additive Gaussian noise.It is tricky to retrieve the ideal image  in (11) from its observation  0 because the statistical distribution of noise in  0 is superimposed.Cai et al. [12] proposed a two-phase method for mixed noise removal.They first detect the noise-free set A by median-type filter [10,13] and then execute the procedure of image deblurring on set A. The model therein is where Γ ⊂ Ω is the set of image edges and the option for  is determined experimentally.Model ( 12) is attractive to render restoration with fine structures in image, such as neat edges and textures.However, the nonconvexity of objective function in model (12) makes it difficult to be minimized.A convex approximation of model ( 12) is considered therein.However, the computational effort on minimizing that convex approximation is time-consuming.Some literature is devoted to the case of mixed noise removal recently because of its realistic applications, for example, [6,12,[14][15][16].
As aforementioned, Huang et al. [6] proposed a fast method for mixed noise removal by solving model (7).The two-phase pattern is exploited as in [12].Concretely, one has the following.(i) Phase 1: Noise Detection.Median-type filters are favorable for impulse noise detection.Adaptive median filter (AMF) [10] is adopted for salt-andpepper noise detection and adaptive center-weighted median filter (ACWMF) [13] is chosen for randomvalued noise detection.Assuming that  0 is the observed image and  0 is the restored image by median-type filter, the set of pixels corrupted by impulse noise, denoted by N, is detected via (ii) Phase 2: Minimization on model (7).Model (7) is solved by AM algorithm.Specifically, the -related subproblem is solved by Chambolle's projection algorithm [17] and the -related subproblem is solved by preconditioned conjugate gradient (PCG) method.
The numerical results in [6] illustrated that their algorithm is efficient and promising.It is about 8-10 times (resp., 15-20 times) faster than the algorithm proposed in [12] for impulse noise (resp., mixed noise removal).However, the properties of model ( 7) can be further exploited by variable splitting strategy and the resulting subproblems can be tractably tackled with closed-form solutions.This is also the main goal of our paper.Since model ( 4) is an exceptional case of model (7) with A = Ω and  = 2, we concentrate on handling model (7) in this paper by variable splitting strategy.The contributions of this paper are as follows.Firstly, we extend models (4) and (7) to deal with different kinds of image denoising, that is, Gaussian noise, salt-and-pepper noise plus Gaussian noise, and random-valued noise plus Gaussian noise in various ways of blur kernels.Secondly, the variable splitting strategies with efficient closed-form solution are investigated.Concretely, we propose the recursions of optimizations ( 4) and ( 7) by the classical alternating direction method (ADM).Furthermore, we contemplate coping with (7) with three-block ADM scheme.To guarantee the convergence of algorithm, the ADM with Gaussian-backsubstitution is considered for handling model (7).Overall, compared with some standard methods, our proposed models and algorithms exhibit the attractive performance of restoring the images contaminated by impulse plus Gaussian noise.
The rest of the paper is organized as follows.In Section 2, we summarize some basic notations and properties that will be useful in the subsequent sections.In Sections 3 and 4, algorithms based on variable splitting are presented and followed by some numerical experiments in Section 5.The numerical comparisons with the algorithm for Gaussian noise removal in [5] and the algorithm for mixed noise removal in [6] are implemented therein.Finally, we end the paper with some conclusions in Section 6.

Preliminaries
In this section, we summarize some basic concepts and properties that will be used in subsequent analysis.Let  ∈ R  be a vector and  ≥ 1 be an integer.By ‖‖  fl (∑  =1 |  |  ) 1/ , we denote the -norm of  and ‖‖ denotes the Euclidean norm for brevity.The inner product in Euclidean space is denoted ⟨⋅, ⋅⟩.For any positive definite matrix  ∈ R × , the matrix norm of  is defined by ‖‖  fl √⟨, ⟩.We denote by  the identity operator.For any For any  > 0, the proximity function of () = ‖||‖ 1 is the well-known soft-thresholding (also the so-called shrinkage).
With special denotation S : where all operations are calculated in componentwise and The proximity operator of , denoted by prox  : R  → R  , is defined as [18,19] prox Specially, the projection operator on a nonempty closed convex set V ⊆ R  , denoted by Π V , is the proximity operator of  V .
In TV based image processing, the boundary conditions are indispensable to be taken into account for avoiding unwanted artifacts near the boundary.A common technique is assuming a certain behavior of the sharp image outside the boundary.The conventional boundary conditions are zero, periodic, and reflexive boundary conditions.We consider the reflexive boundary condition herein.By denoting ∇ fl [ ∇ 1 ∇ 2 ] as the first-order derivative operator, the horizontal direction first-order derivative ∇ 1 and the vertical direction first-order derivative ∇ 2 under reflexive boundary condition are defined as (for notational convenience, x ∈ R  1 × 2 is used for this definition) By denoting C as the discrete cosine transform (DCT) and C −1 as its inverse, ∇  ∇ (also the so-called Laplacian operator) under reflective boundary condition can be diagonalized by DCT (see [20]) as where Λ  ∈ R × is a diagonal matrix.Similarly, the blurring matrix  (under the circumstance that its PSF h is double symmetry) can be diagonalized by DCT (see [21,Chapter 4]) as where Λ  ∈ R × is a diagonal matrix.

Alternating Direction Method for Model (7)
Consider the following optimization problem with special structure: min  1 () +  3 () where  1 : R  1 → (−∞, +∞] and  3 : R  3 → (−∞, +∞] are l.s.c.proper convex functions;  ∈ R × 1 and  ∈ R × 3 are given matrices;  ∈ R  is a known vector; X ⊆ R  1 and Z ⊆ R  3 are nonempty closed convex sets (we denote  3 () and  instead of  2 () and  in (20) as being merely for convenience of analysis in subsequent sections).Suppose that ( * ,  * ) is an optimal solution of (20).For some  *  ∈   ( * ) ( = 1, 3), it follows from the first-order optimality condition in optimization that where  ∈ R  is the Lagrange multiplier of (20) [4,22].Owing to the special structure of (20), that is, both objective function and linear constraint are separable in variables  and , the methods based on variable splitting are recommended, for example, the classical ADM which was proposed originally in [23] and was studied intensively in literature (e.g., [24][25][26][27][28]).Numerous practical applications in diversified realms have illustrated its robustness and efficiency (see [29] for an overview of ADM).By introducing the augmented Lagrangian function of (20 where  > 0 is the penalty parameter of the linear constraint, the recursions of the well-known ADM can be summarized as Algorithm 1. We now concentrate on dealing with model ( 7) by ADM, that is, the second phase in [6].The case  = 2 in model ( 7) is Choose arbitrary  > 0,  0 ∈ R  3 and  0 ∈ R  for  = 1, 2, . . .do (1) Algorithm 1: Alternating direction method.
considered in this section.By introducing auxiliary variable V ∈ R  , we rewrite model (7) as By denoting  fl (, V), the linear constraint in ( 23) can be reformulated as + = .Furthermore, by setting X fl R  , Z fl R  × R  , and the optimization (23) complies with the pattern of optimization (20) and hence can be handled by ADM.We elaborate on the recursions for optimization (23) by utilizing ADM as follows.Firstly, the augmented Lagrangian function for (23) can be specified as (i) The -related subproblem for (23) exploiting ADM amounts to which can be approximated efficiently by Chambolle's projection algorithm [17].
(ii) The -related subproblem (involved in the duplets (, V)) for ( 23) utilizing ADM is According to the first-order optimality condition and the fact that Consequently,  +1 and V +1 can be solved orderly by We hence only need to concentrate on tackling the first linear system in (30).It can be solved by preconditioned conjugate gradient (PCG) method as suggested in [6] because its coefficient matrix is positive definite.
Remark 1. Specially, if A = Ω, the model ( 7) with  = 2 reduces to the model (4) proposed in [5].Because of  A =  under this circumstance, the -related subproblem is identical to (27) while the -related subproblem decays to Likewise, PCG method is a better option for the first linear system in (31).Specially, if the PSF of blurring matrix  is doubly symmetric, it can be diagonalized by DCT (see [21]).Hence, linear system (31) can be easily solved.
Note that when  = 1 is adopted in model ( 7), the resulted recursion for -related subproblem is identical to the case of  = 2 in (27).However, the -related subproblem is quite different with the case of  = 2.The -related subproblem of model (7) with  = 1 is which is tricky to be solved because of the nondifferential term ‖ A ( −  0 )‖ 1 (although it can be solved by iterative methods, the computational effort is intensive).It hence inspires our further investigation in following section.

Alternating Direction Method with Three Variables for Model (7)
Although the algorithm of ADM in Section 3 is valid for model (7) with  = 2 and it is favorable to handle model (4) for Gaussian noise removal (i.e., model (7) with  = 2 and  A = ), we have deadlock over the model (7) with  = 1.Consequently, we contemplate coping with model ( 7) by ADM with three separable variables.
Thereby, the linear constraint in (36) can be realigned as in the form of  +  +  = .Furthermore, by denoting the objective function in (36) can be split into three-block functions in variables , , and , respectively.Overall, optimization (36) favors the pattern of (33).Similarly, the augmented Lagrangian function of (36) can be specified as where We now elucidate the procedures of handling optimization (36) by Algorithm 2, and it will be shown that all the resulting subproblems possess closed-form solutions (either linear least squares problem or problem involved in softthresholding).
(i) The -related subproblem for (36) utilizing Algorithm 2 can be formulated as which essentially is a linear least squares problem.By firstorder optimality condition in optimization, it follows that  +1 satisfies the linear system As aforementioned, when the first-order partial derivatives operator ∇ is defined as in (17), the above linear system is tractable to be solved by DCT as in (18).
(ii) The -related subproblem for (36) exploiting Algorithm 2 can be rewritten as which is also a linear least squares problem.Likewise, by the first-order optimality condition, it follows that which is homogeneous to the first linear system in (31).Accordingly, it can be handled as promoted in Remark 1.
(iii) The -related subproblem for (36) applying Algorithm 2 is involved in the triplets (, V, ) as follows: It is obvious that the unconstrained optimization ( 44) is separable in variables , V, and .Hence,  +1 = ( +1 , V +1 ,  +1 ) can be acquired simultaneously (in parallel manner if possible) as follows: (a) The -related subproblem is and its closed-form solution is obtained by the softthresholding defined in (14).Concretely, we have which can be facilely solved by Combining the definition of  A , the following follows.
Remark 2. Literally, we take a fixed penalty  for the foregoing inferences.In practice, it is attractive to choose  dynamically, for example, the continuation strategy (e.g., [30]) and self-adaptive strategy (e.g., [28]).Moreover, it is reasonable to choose  differently in the last three terms of (39) so as to penalize those three different linear constraints in (36); that is, the augmented Lagrangian function in (39) can be rewritten as where   > 0,  = 1, 2, 3, are penalties corresponding to different linear constraints.Hence, the recursions for -, -, and -related subproblems can be regained accordingly.Generally, those skills on penalties can numerically render impressive performance (see, e.g., [7,31,32] and references therein).
Recently, A series variant algorithms for the ADM with more than two variables [33][34][35] are proposed, which are proved to be convergent and illustrate attractive numerical performances in tackling optimization problems with separable structure.Generally, the three-block ADM may not converge, which was demonstrated by Chen et al. [36] using counterexamples.Nevertheless, if all the objective functions are strongly convex, Han and Yuan [37] proved the global convergence of the three-block ADM scheme.Then, this condition was relaxed and only two or more functions in the objective are required to be strongly convex to ensure the convergence.More recently, it was studied in [38,39] that the global convergence of the three-block ADM scheme can be guaranteed only when one of the objective functions is strongly convex.However, neither of the three-block functions in the objective function in (36) is strongly convex and then the convergence of Algorithm 2 is still ambiguous.Under some special circumstances (e.g., the matrix involved in the linear constraint is the identity matrix or it is sparse), the performances of some variant algorithms can be matchable with that of Algorithm 2, even fairly close to that of Algorithm 2 [36].We recommend herein the ADM with Gaussian-back-substitution algorithm proposed in [40] for handling model (7).
where  is blurring matrix and  is defined in (37), ADM with Gaussian-back-substitution algorithm for the concrete optimization ( 36) is summarized in Algorithm 3.
The augmented Lagrangian function L(, , , ) in Algorithm 3 is as defined in (39).The detailed recursions for solving (36) utilizing Algorithm 3 can be inferred analogically as the scenario of using Algorithm 2. The convergence analysis of Algorithm 3 was expanded in [40].The option on parameter  ≈ 1 is recommended for practical numerical experiments.

Numerical Experiments
We test the aforementioned algorithms on the fundamental image processing problem, image denoising (both Gaussian noise removal and mixed noise removal).We first concentrate on the scenario of Gaussian noise removal in Section 5.1, with numerical comparisons to the algorithm in [5] (abbr, HNW08) and then turn to the case of mixed noise removal in Section 5.2, with numerical comparisons to the algorithm in [6] (abbr, HNW09).We abbreviate hereafter Algorithms 1, 2, and 3 as "ADM2," "ADM3," and "ADMGB," respectively.All the codes are written in matlab 7.9 and are run on a desktop computer with Intel(R) Core (TM) 2.66 HZ and 2 G memory.All the testing images for numerical experiments in this section are illustrated in Figure 2. To measure the quality of restored image, we define the signal-to-noise ratio (SNR) as where  is the restoration of the ideal image .The stopping criterion for testing algorithms is set as As mentioned in [12], it is an old and open problem for choosing the optimal parameters in algorithms; we therefore tone those parameters (e.g., , , and ) by trial and error.As suggested in Remark 2, we opt for the penalty  differently in Algorithms 2 and 3 for various linear constraints.5.1.Gaussian Noise Removal.In this subsection, we consider the case of Gaussian noise removal, that is, the model (4).The ideal images are degraded by various blurs and are further contaminated by additive Gaussian noise with noise variance  2 .Those procedures can be realized via fspecial and imnoise in matlab Image Processing Toolbox (IPT).The concrete parameters for both commands in experiments are summarized in Table 1.We test HWN08 (codes are available at http://www.math .hkbu.edu.hk/∼mng/) and ADM2 in this numerical experiments.We choose the weights  = 10 −4 and  = 1 in model (4) for both testing algorithms.The penalty  = 2 × 10 −3 is adopted for ADM2.Both HNW08 and ADM2 require Chambolle's projection algorithm [17] to approximate the solution of -related subproblem.We take 10 steps for this approximation at each iteration.The initial point  0 is chosen as the observed degraded image and  0 = 0 and  0 = 0.The stopping criterion in (53) is set as Tol = 10 −4 .The numerical results are reported in Table 2.The datum therein, such as "62/1.94/23.91,"represents the iteration number, computing time in seconds and SNR, respectively.We can see that ADM is faster than AM algorithm.The reason is that the dual variable  is exploited and is updated at each iteration.Figure 3 illustrated some devastated images and their restorations by ADM2.

Mixed Noise Removal.
We now test the case of mixed noise removal, that is, the model (7), and report numerical comparisons between HNW09, ADM2, ADM3, and ADMGB.The ideal image of cameraman is degraded by Gaussian blur with hsize = 7 andsigma = 2 and out-offocus blur with radius = 3, respectively.The blurred images are further corrupted by additive Gaussian noise with noise variance as  2 = 0.005 and impulse noise with various noise level .The median-type filters are attractive in removing impulse noise.We choose AMF for salt-and-pepper noise detection while ACWMF for random-valued noise detection as in [6].
We first experiment on the scenario of salt-and-pepper noise plus Gaussian noise removal.The parameters in model (7) are taken as  = 10 −4 ,  = 1, and  = 2 for all testing algorithms.Moreover, we choose  = 10 −3 for ADM2;  1 = 10 −3 ,  2 = 10 −3 , and  3 = 10 −1 for ADM3 and ADMGB.The window size for median-type filter AMF is set as 19 as in [12].Because the and -related subproblems in both HNW09 and ADM2 should be approximated by Chambolle's projection method [17] and PCG method, respectively, we set 5-step Chambolle's approximation for -related subproblem   3. We plot the evolutions of SNR with respect to iteration numbers and computing time in Figure 4.It manifests the efficiency of ADM3 and ADMGB as the resulting subproblems possess closed-form solutions.Figure 5 illustrates some restored images by ADM3 with different impulse noise level.We also test algorithms on other images for salt-and-pepper noise plus Gaussian noise removal.The ideal 512 × 512 images are degraded by out-offocus blur with radius = 7 and contaminated by salt-andpepper noise with noise level 70% plus Gaussian noise with noise variance  2 = 0.005.Figure 6 shows the restored images by ADM3.We now experiment on the scenario of random-valued noise plus Gaussian noise removal.Generally, Gaussian noise mixed with random-valued noise is more intractable to be removed than that of Gaussian noise mixed with salt-andpepper noise.The reason is that it is difficult to distinguish pixels contaminated by Gaussian noise with those contaminated by small value random-valued noise.We tackle model (7) by choosing  = 3 × 10 −3 ,  = 1, and  = 2 for all testing algorithms.We take  = 5×10 −2 for ADM2 and  1 = 5×10 −2 ,  2 = 5 × 10 −2 , and  3 = 1 for ADM3 and ADMGB.The median-type filter ACWMF is run 4 times in phase of noise detection as in [12].The and -related subproblems in HNW09 and ADM2 are still approximated by Chambolle's projection method and PCG method as the scenario of random-valued noise plus Gaussian noise removal.Likewise, the initial points are also taken as  0 = 0,  0 = 0, and  0 = 0 and the stopping criterion in (53) is set as Tol = 5 × 10 −4 .
The numerical results are listed in Table 4. Figure 7 plots the evolutions of SNR with respect to iteration numbers and computing time.Some restored images by ADM3 are illustrated in Figure 8.Those 512 × 512 images are also tested for random-valued noise plus Gaussian noise removal.All images are degraded by out-of-focus blur with radius = 7 and contaminated by random-valued noise with noise level 40% and Gaussian noise with noise variance  2 = 0.005.Figure 9 shows the restored images by ADM3.

Conclusions
The variable splitting is a hot-investigated strategy in both variational inequalities and optimization.Numerous practical applications illustrate the attractive and efficient performance of this strategy.We illustrate in this paper that the variable splitting can be efficient and promising in image     restoration.Note that the numerical effects for mixed noise removal utilizing model (7) with  = 1 are similar as that of  = 2 (see also [6]).As stated in this paper, we only consider the image denoising with blurry.However, the mixed noise may be occurring in image inpainting, superresolution, and decomposition, even in the area of video processing, which may be difficult to be handled.They are thus some of our future research topics.

Figure 4 :
Figure 4: Evolutions of SNR with respect to iteration numbers and computing time (cameraman image degraded by Gaussian blur and contaminated by salt-and-pepper noise ( = 70%) plus Gaussian noise ( 2 = 0.005)).

Figure 7 :
Figure 7: Evolutions of SNR with respect to iteration numbers and computing time (cameraman image degraded by out-of-focus blur and contaminated by random-valued noise ( = 40%) plus Gaussian noise ( 2 = 0.005)).

Table 2 :
Computational results on Gaussian noise removal by solving model (4).