Image Regularity and Fidelity Measure with a Two-Modality Potential Function

We define a strictly convex, smooth potential function and use it to measure both the data fidelity and the regularity for image denoising and cartoon-texture decomposition. The new model has several advantages over the well-known ROF (TV-$L^2$) model and the TV-$L^1$ model. First, owing to the two-modality property of the new potential function, the new regularity term regularizes strongly in all directions in smooth areas, which encourages noise removal there, while near edges it smooths mainly along the tangent direction and thus preserves the edges well. Second, the new potential function is very close to the $L^1$ norm, so using it to measure the data fidelity makes the new model perform very well in removing impulse noise and preserving contrast. Lastly, the proposed fidelity and regularization terms are strictly convex and smooth, so the model admits a unique global minimizer, which can be computed by the steepest descent method. Numerical experiments show that the proposed model outperforms TV-$L^2$ and TV-$L^1$ in removing impulse noise and mixed noise. It also outperforms some state-of-the-art methods specially designed for impulse noise. Tests on cartoon-texture decomposition show that our method is effective and performs better than TV-$L^1$.


Introduction
Image denoising aims to recover a clean image from a noisy observation. In this work, we mainly focus on removing impulse noise, which randomly contaminates a portion of the pixels so that their true values are completely lost. Impulse noise is physically caused by malfunctioning pixels in camera sensors, faulty memory locations in hardware, or transmission over a noisy channel [1]. It can be categorized into two types: random-valued impulse noise, for which the noisy pixels can take any random value between the minimal and the maximal pixel values, and salt-and-pepper noise, for which the noisy pixels can take only the maximal or the minimal pixel value. For both types of noise, the noisy pixels are assumed to be randomly distributed in the image.
Let $u(x, y) \in L^2(\Omega)$ be the original clean image defined on its domain $\Omega \subset \mathbb{R}^2$ with Lipschitz boundary, and let $f(x, y) \in L^2(\Omega)$ be the observed image corrupted by impulse noise. The corruption can be formulated in the following general form:
\[
f = N(u),
\]
where $N$ represents an impulse noise. Two main models for the impulse noise are used in a wide variety of applications: salt-and-pepper noise and random-valued impulse noise [2]. Denote the dynamic range of $u$ by $[u_{\min}, u_{\max}]$; that is, $u_{\min} \le u(x, y) \le u_{\max}$ for every pixel $(x, y)$. The model of the salt-and-pepper noise is defined by
\[
f(x, y) =
\begin{cases}
u_{\min}, & \text{with probability } p/2, \\
u_{\max}, & \text{with probability } p/2, \\
u(x, y), & \text{with probability } 1 - p,
\end{cases}
\tag{2}
\]
where $u(x, y)$ denotes the gray level of $u$ at pixel location $(x, y)$ and $p$ determines the level of the salt-and-pepper noise. The model of the random-valued impulse noise is defined by
\[
f(x, y) =
\begin{cases}
d(x, y), & \text{with probability } p, \\
u(x, y), & \text{with probability } 1 - p,
\end{cases}
\]
where the $d(x, y)$ are identically and uniformly distributed random numbers in the range $[u_{\min}, u_{\max}]$ and $p$ defines the level of the random-valued impulse noise.
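The two noise models can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not the code used in the experiments: the image is a plain list of rows, the function names `add_salt_and_pepper` and `add_random_valued` are hypothetical, and the arguments `p`, `u_min`, and `u_max` mirror the noise level and dynamic range introduced above.

```python
import random


def add_salt_and_pepper(image, p, u_min=0, u_max=255, rng=None):
    """Return a copy of `image` where each pixel becomes u_min or u_max
    with probability p/2 each and is kept with probability 1 - p."""
    if rng is None:
        rng = random.Random()
    noisy = []
    for row in image:
        new_row = []
        for value in row:
            r = rng.random()  # uniform in [0, 1)
            if r < p / 2:
                new_row.append(u_min)
            elif r < p:
                new_row.append(u_max)
            else:
                new_row.append(value)
        noisy.append(new_row)
    return noisy


def add_random_valued(image, p, u_min=0, u_max=255, rng=None):
    """Return a copy of `image` where each pixel is replaced, with
    probability p, by a uniform random value in [u_min, u_max]."""
    if rng is None:
        rng = random.Random()
    return [
        [rng.uniform(u_min, u_max) if rng.random() < p else value
         for value in row]
        for row in image
    ]


if __name__ == "__main__":
    flat = [[128] * 64 for _ in range(64)]  # synthetic constant image
    noisy = add_salt_and_pepper(flat, p=0.2, rng=random.Random(0))
    changed = sum(v != 128 for row in noisy for v in row)
    print("fraction of corrupted pixels:", changed / (64 * 64))
```

Passing a seeded `random.Random` instance makes the corruption reproducible, which is convenient when several methods are compared on the same noisy input.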

Mathematical Problems in Engineering
Image denoising is a typical ill-posed inverse problem, and one of the most popular approaches is to solve a minimization problem of the form
\[
\min_u \; F(u) + R(u),
\]
where $F(u)$ is a data fitting term derived according to the assumed noise type and $R(u)$ is a regularization term that imposes a prior on $u$. Many methods have been proposed by using various a priori knowledge about the image and the noise [3][4][5]. One of the most influential examples is the Rudin-Osher-Fatemi (ROF, or TV-$L^2$) model [6]:
\[
\min_u \; \lambda \int_\Omega |\nabla u| \, dx\,dy + \frac{1}{2} \int_\Omega (u - f)^2 \, dx\,dy,
\]
where $|\nabla u|$ is the modulus of the gradient of $u$. The ROF model uses the $L^2$ norm to measure the data fidelity under the additive Gaussian noise assumption and uses the total variation (TV) to measure the regularity by assuming that the image is piecewise smooth or that the gradient of the image is sparse. The total variation regularity allows for reconstruction of images with discontinuities across hypersurfaces and is extensively used in variational image restoration. Nevertheless, the $L^2$ fidelity leads to some limitations. One important issue is the loss of contrast in the restored image even if the observed image is noise-free; another is that the $L^2$ fidelity term deals well with Gaussian noise but does not perform well in removing impulse noise. In [7], Chan and Esedoglu use the $L^1$ norm as a measure of fidelity and formulate the following variational problem (TV-$L^1$):
\[
\min_u \; \lambda \int_\Omega |\nabla u| \, dx\,dy + \int_\Omega |u - f| \, dx\,dy.
\tag{6}
\]
It was shown that the $L^1$ norm better preserves the contrast, and that the order in which features disappear in the regularization process is completely determined by their geometry (area and length), rather than by the contrast as in the ROF model. This important geometric property is also used for the active contour global minimization problem [8]. Using the $L^1$ fidelity, as analyzed in [7], model (6) implicitly detects the pixels contaminated by impulse noise and preserves edges very well. Empirically, TV-$L^1$ outperforms TV-$L^2$ in detecting outliers and removing impulse noise [9]. However, in order to detect large connected noisy regions, it requires a greater weight on the regularization term in the cost function, which distorts some pixels near edges. Moreover, it has some mathematical limitations: the minimizers of the variational problem (6) need not be unique in general because the $L^1$ fidelity term is not strictly convex; it is not smooth either, so solving the problem requires some regularization tricks. A weighted sum of $L^1$ and $L^2$ fidelities has been used as the data fitting term, and it works effectively and robustly for the removal of mixed noise or almost any type of unknown noise [10]. But it still suffers from the shortcomings of the $L^1$ fidelity. Huber norms [11] have been used for TV in order to avoid undesirable staircase effects [12]. In [13], the Huber loss is used for both data fidelity and regularization. Compared with the $L^2$ norm, the Huber loss better preserves geometric features such as edges; compared with the $L^1$ norm, which is not differentiable and leads to staircase artifacts, it has continuous derivatives. However, the Huber norm involves a parameter that affects the results.
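The different behavior of the $L^2$ and $L^1$ fidelities in the presence of an outlier can be seen in a toy computation. The sketch below is our own simplification: it drops the regularization term entirely and fits a single constant gray level to a small patch containing one impulse. The helper name `best_constant` and the patch values are hypothetical.

```python
import statistics


def best_constant(patch, cost):
    """Brute-force the gray level in 0..255 that minimizes the summed cost
    of the residuals between the constant and the patch values."""
    return min(range(256),
               key=lambda c: sum(cost(c - value) for value in patch))


# A nearly flat patch with a single impulse at 255.
patch = [100, 101, 99, 100, 255, 100, 98, 102, 100]

l2_fit = best_constant(patch, lambda r: r * r)  # squared residuals
l1_fit = best_constant(patch, abs)              # absolute residuals

print("L2 fit:", l2_fit, "(mean of patch:", statistics.mean(patch), ")")
print("L1 fit:", l1_fit, "(median of patch:", statistics.median(patch), ")")
```

The squared cost is minimized near the mean, which the impulse pulls upward, whereas the absolute cost is minimized at the median, which ignores the impulse. This is the elementary mechanism behind the outlier robustness of the $L^1$ fidelity discussed above.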
Besides the variational methods, some filtering-based methods exist for impulse noise removal, such as the Adaptive Median Filter (AMF) [14] and the Adaptive Center Weighted Median Filter (ACWMF) [15]. The AMF method uses a median filter with variable window size to filter out impulse noise. It is robust in removing mixed impulses with a high probability of occurrence while preserving sharpness. But it is ineffective when an image is corrupted by other types of mixed noise, such as a mixture of Gaussian, Poisson, and impulse noise. The ACWMF further uses a spatially varying central weight to improve on the AMF, and it is better than the AMF in preserving details and in suppressing impulse noise, additive white noise, and signal-dependent noise. However, the ACWMF tends to become an identity filter if impulses exist within a window; in that case, it is not effective in suppressing impulses, especially for salt-and-pepper noise.
Cartoon-texture decomposition is an important mathematical tool for image analysis. It aims to decompose an image $f$ into a cartoon component $u$ and a texture component $v$, with $f = u + v$. Ideally, the cartoon component is a piecewise smooth approximation of the original image and mainly contains object hues and sharp edges, while the texture component contains repeated small-scale patterns. The general framework for cartoon-texture decomposition has the following form:
\[
\min_{u, v} \; F_1(u) + F_2(v),
\]
where $F_1(u)$ and $F_2(v)$ are two functionals, usually norms, measuring the cartoon $u$ and the texture $v$, respectively. Meyer [2] shows that the ROF model is ineffective in cartoon-texture decomposition because the $L^2$ norm is not a good measure of the texture, yet the TV is effective in measuring the cartoon component. To overcome the ineffectiveness of the $L^2$ norm in measuring the texture component, Meyer [2] and Haddad and Meyer [16] proposed using the $G$-norm, Vese and Osher [17] approximated the $G$-norm by the $\operatorname{div}(L^p)$ norm, Osher et al. [18] proposed using the $H^{-1}$ norm, Lieu and Vese [19] proposed using the more general $H^{-s}$ norm, and Le and Vese [20] proposed using the $\operatorname{div}(\mathrm{BMO})$ norm to measure the texture component. However, the models involving these norms are difficult to solve. Yin et al. [21] show that the $L^1$ norm is effective in measuring the texture component and proposed the TV-$L^1$ model.
In this work we define a strictly convex, smooth potential function $\phi(s)$ and use it to measure the data fidelity as well as the regularity for image restoration and cartoon-texture decomposition. Like the Huber norm, the new potential function has two modalities: it is approximately half the square function (corresponding to the $L^2$ norm) near 0 and approximately a linear function (corresponding to the $L^1$ norm) when $s$ is far away from 0. But the Huber norm involves a parameter while our potential function does not. The new model has several advantages over the well-known Rudin-Osher-Fatemi (ROF, or TV-$L^2$) model and the TV-$L^1$ model. First, owing to the two-modality property of the new potential function, applying it to the image gradient to measure the regularity makes the regularity work in two ways: in smooth areas of the image, it results in a diffusion term that is uniform and isotropic, with strong regularizing properties in all directions, and thus encourages noise removal; near edges, it results in a diffusion process that smooths mainly along the tangent direction and thus preserves the edges well. This behavior differs from that of the TV; in particular, in smooth areas the TV regularity causes a staircasing effect while our method does not. Second, the new potential function is very close to the $L^1$ norm, so using it to measure the data fidelity makes the new model perform very well in removing impulse noise and preserving contrast. Lastly, the proposed fidelity and regularization terms are strictly convex and smooth, so the new model admits a unique global minimizer, which can be computed by the steepest descent method. Mathematical analysis and numerical experiments show that the proposed model outperforms TV-$L^2$ and TV-$L^1$ in removing impulse noise and mixed noise. It also outperforms the Adaptive Median Filter (AMF) and the Adaptive Center Weighted Median Filter (ACWMF) in removing mixed noise. We also apply this model to cartoon-texture decomposition. Experimental results show that it performs better than TV-$L^1$ in cartoon-texture decomposition.

The Proposed Model
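The proposed model is as follows:
\[
\min_u \; \lambda \int_\Omega \phi(|\nabla u|) \, dx\,dy + \int_\Omega \phi(u - f) \, dx\,dy,
\tag{9}
\]
where $\lambda$ is a nonnegative tuning parameter and $\phi(s)$ is defined by
\[
\phi(s) = s \tan^{-1} s - \frac{1}{2} \ln(1 + s^2).
\]
The rationale for this potential function can be explained as follows. First of all, the function $\phi(s)$ is strictly convex and differentiable since $\phi'(s) = \tan^{-1} s$ and $\phi''(s) = 1/(1 + s^2) > 0$; thus our model (9) admits a unique global minimizer, which can be computed by the steepest descent method. Secondly, when $\phi(s)$ is used to measure the data fidelity, similar to the Huber norm, it is a good approximation of the $L^1$ norm in the sense that $\phi(s) \approx s^2/2$ as $s \to 0$ and $\phi(s) \approx (\pi/2)|s| - \ln|s|$, up to an additive constant, as $|s| \to \infty$, so it performs similarly to the $L^1$ fidelity in removing impulse noise and preserving image contrast. Figure 1 compares $\phi(s)$, the $L^1$ norm, the $L^2$ norm, and the Huber norm. Lastly, when $\phi(s)$ is used to measure the regularity as in (9), it induces the following gradient descent flow:
\[
\frac{\partial u}{\partial t} = \lambda \operatorname{div}\left( \frac{\phi'(|\nabla u|)}{|\nabla u|} \nabla u \right) - \phi'(u - f),
\tag{11}
\]
where the right side is the negative gradient of the functional in (9), with the first and the second term being deduced from the regularity term and the fidelity term, respectively. The diffusion term $\operatorname{div}((\phi'(|\nabla u|)/|\nabla u|)\nabla u)$ can be decomposed as
\[
\operatorname{div}\left( \frac{\phi'(|\nabla u|)}{|\nabla u|} \nabla u \right) = \frac{\phi'(|\nabla u|)}{|\nabla u|} \, u_{TT} + \phi''(|\nabla u|) \, u_{NN},
\]
where $T$ and $N$ denote the tangent and normal directions to the isophote lines (lines along which the intensity is constant) and $u_{TT}$ and $u_{NN}$ denote the second derivatives of $u$ in the $T$-direction and $N$-direction. In a flat or smooth region of the image, where the variations of the intensity are weak, that is, $s = |\nabla u| \approx 0$, the coefficient of $u_{TT}$, $\phi'(s)/s$, and the coefficient of $u_{NN}$, $\phi''(s)$, satisfy
\[
\lim_{s \to 0} \frac{\phi'(s)}{s} = \lim_{s \to 0} \phi''(s) = 1,
\]
and then (11) becomes
\[
\frac{\partial u}{\partial t} = \lambda \Delta u - \phi'(u - f),
\tag{14}
\]
where $\Delta$ denotes the Laplacian operator. So, at these points, $u$ locally satisfies (14), in which the diffusion term is uniform and isotropic, with strong regularizing properties in all directions, and thus encourages noise removal in smooth areas. Near edges of the image, that is, for $s = |\nabla u| \gg 0$, $\phi'(s)/s$ and $\phi''(s)$ satisfy
\[
\lim_{s \to \infty} \frac{\phi'(s)}{s} = \lim_{s \to \infty} \phi''(s) = 0.
\]
This means that the coefficient of $u_{TT}$, $\phi'(s)/s$, and the coefficient of $u_{NN}$, $\phi''(s)$, both vanish. However, $\phi''(s)$ vanishes faster than $\phi'(s)/s$: for large $s$, $\phi''(s)$ decays like $1/s^2$ while $\phi'(s)/s$ decays like $\pi/(2s)$. This allows the diffusion process to smooth the edge a little along the tangent direction while diffusing even less across it; thus our regularity term preserves the edge well. The TV regularity can be regarded as a special case of our regularity term by taking $\phi(s) = s$, for which $\phi'(s) = 1$ and $\phi''(s) = 0$. In smooth areas, that is, for $s = |\nabla u| \approx 0$, the coefficient of $u_{TT}$, $\phi'(s)/s = 1/s$, becomes large while the coefficient of $u_{NN}$ is $\phi''(s) = 0$; this may be the reason why the TV regularity causes the staircasing effect in smooth areas.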
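The two modalities of the potential and the behavior of the two diffusion coefficients can be checked numerically. The following sketch is ours and only illustrative: it assumes the closed form $s \tan^{-1} s - \frac{1}{2}\ln(1 + s^2)$, which is the antiderivative of $\tan^{-1} s$ vanishing at 0, and the function names (`phi`, `tangent_coeff`, `normal_coeff`) are hypothetical.

```python
import math


def phi(s):
    """Potential with phi(0) = 0 and phi'(s) = arctan(s)."""
    return s * math.atan(s) - 0.5 * math.log1p(s * s)


def phi_prime(s):
    return math.atan(s)


def phi_second(s):
    return 1.0 / (1.0 + s * s)


def tangent_coeff(s):
    """Coefficient of u_TT, phi'(s)/s, extended by its limit 1 at s = 0."""
    return 1.0 if s == 0 else math.atan(s) / s


def normal_coeff(s):
    """Coefficient of u_NN, phi''(s)."""
    return phi_second(s)


if __name__ == "__main__":
    print("   s      phi(s)     s^2/2      |s|   tangent   normal")
    for s in (0.01, 0.1, 1.0, 10.0, 100.0):
        print(f"{s:6.2f} {phi(s):10.5f} {s * s / 2:10.5f} {abs(s):8.2f} "
              f"{tangent_coeff(s):9.5f} {normal_coeff(s):9.5f}")
```

For small `s` the first two value columns agree (quadratic regime) and both coefficients are close to 1 (isotropic diffusion); for large `s` the potential grows linearly and the normal coefficient becomes much smaller than the tangent coefficient, which is the edge-preserving regime described above.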
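The minimization problem (9) can be solved iteratively by the gradient descent method. Numerically, we use the following forward finite difference scheme in time to discretize the gradient descent flow (11):
\[
\frac{u^{n+1}_{i,j} - u^{n}_{i,j}}{\Delta t} = \lambda D^{n}_{i,j} - \tan^{-1}\left( u^{n}_{i,j} - f_{i,j} \right),
\]
where $\Delta t$ denotes the time step size and $D^{n}_{i,j}$ is the discretized diffusion term $\operatorname{div}((\phi'(|\nabla u|)/|\nabla u|)\nabla u)$ evaluated at $u^n$, in which $|\nabla u|$ is regularized by a constant $\epsilon$ to avoid dividing by 0; we set $\epsilon = 0.01$ in our experiments. The spatial derivatives are discretized by central differences.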
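A compact sketch of such an explicit iteration is given below. It is one possible reading of the scheme, not the authors' implementation, and it rests on several assumptions of ours: the image is a list of rows of floats, borders are handled by replicating the nearest pixel, the gradient modulus is regularized as $\sqrt{u_x^2 + u_y^2 + \epsilon^2}$, and the divergence is obtained by applying central differences twice, which couples only every other pixel and is therefore cruder than a carefully designed stencil. The names `gradient`, `diffusion_term`, and `denoise` are hypothetical.

```python
import math


def _clamp(i, n):
    """Replicate the border: indices outside [0, n-1] map to the edge."""
    return min(max(i, 0), n - 1)


def gradient(a):
    """Central differences of a 2D list; returns (d/dx, d/dy)."""
    rows, cols = len(a), len(a[0])
    ax = [[(a[i][_clamp(j + 1, cols)] - a[i][_clamp(j - 1, cols)]) / 2.0
           for j in range(cols)] for i in range(rows)]
    ay = [[(a[_clamp(i + 1, rows)][j] - a[_clamp(i - 1, rows)][j]) / 2.0
           for j in range(cols)] for i in range(rows)]
    return ax, ay


def diffusion_term(u, eps=0.01):
    """Approximate div( arctan(|grad u|) / |grad u| * grad u )."""
    rows, cols = len(u), len(u[0])
    ux, uy = gradient(u)
    px = [[0.0] * cols for _ in range(rows)]
    py = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            # Assumed regularization of |grad u| to avoid division by zero.
            mag = math.sqrt(ux[i][j] ** 2 + uy[i][j] ** 2 + eps ** 2)
            coeff = math.atan(mag) / mag
            px[i][j] = coeff * ux[i][j]
            py[i][j] = coeff * uy[i][j]
    pxx, _ = gradient(px)
    _, pyy = gradient(py)
    return [[pxx[i][j] + pyy[i][j] for j in range(cols)]
            for i in range(rows)]


def denoise(f, lam=1.0, dt=0.1, steps=200, eps=0.01):
    """Explicit gradient descent started from the observed image f."""
    rows, cols = len(f), len(f[0])
    u = [row[:] for row in f]
    for _ in range(steps):
        d = diffusion_term(u, eps)
        u = [[u[i][j] + dt * (lam * d[i][j] - math.atan(u[i][j] - f[i][j]))
              for j in range(cols)] for i in range(rows)]
    return u


if __name__ == "__main__":
    f = [[100.0] * 9 for _ in range(9)]
    f[4][4] = 255.0  # a single impulse on a flat background
    u = denoise(f, lam=1.0, dt=0.1, steps=200)
    print("impulse before:", f[4][4], "after:", round(u[4][4], 2))
```

Because $\tan^{-1}$ is bounded, both the diffusion flux and the fidelity force saturate, so a large impulse is reduced at a bounded rate per iteration; many iterations may be needed, and the number used here is arbitrary.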

Numerical Simulation
This section is mainly devoted to the numerical simulation of image denoising in the presence of impulse noise and of mixed noise consisting of Gaussian, Poisson, and impulse noise. We also use our model to decompose an image into a cartoon component and a texture component. The simulations are performed using MATLAB 8.5.0 (R2015a) under Windows 7 on a PC with a 3.30 GHz Intel Core i5-4590 CPU and 4 GB of RAM. To assess the restoration performance quantitatively, we evaluate the peak signal-to-noise ratio (PSNR), defined as [22]
\[
\mathrm{PSNR} = 10 \log_{10} \left( \frac{255^2}{\frac{1}{MN} \sum_{i,j} \left( \hat{u}_{i,j} - u_{i,j} \right)^2} \right),
\]
where $\hat{u}_{i,j}$ and $u_{i,j}$ are the pixel values of the restored image and of the original image, respectively, and $M \times N$ is the image size. In the presence of Poisson noise, the maximum intensity of the original noise-free image is varied in order to create images with different levels of Poisson noise.
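For reference, the PSNR of a restored image can be computed as in the following sketch, which assumes 8-bit images stored as lists of rows; the function name `psnr` and the `peak` argument are our own choices.

```python
import math


def psnr(restored, original, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    rows, cols = len(original), len(original[0])
    mse = sum((restored[i][j] - original[i][j]) ** 2
              for i in range(rows) for j in range(cols)) / (rows * cols)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    original = [[10, 20], [30, 40]]
    restored = [[11, 19], [31, 41]]  # every pixel off by one gray level
    print("PSNR (dB):", psnr(restored, original))
```

Higher values indicate a restored image closer to the original; identical images give an infinite PSNR, which the sketch returns explicitly rather than dividing by zero.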

Image Denoising.
We first show the effectiveness of our method in removing impulse noise, including salt-and-pepper noise and random-valued impulse noise. In all experiments the time step size is set to $\Delta t = 0.1$.
The regularization parameter $\lambda$ plays an important role in denoising because it balances the competition between the data fidelity and the regularization term. When $\lambda$ takes large values, the regularization term dominates the total energy, which tends to force the restored image to be smoother and cleaner. When $\lambda$ takes small values, the fidelity term dominates the total energy, which tends to force the restored image to be closer to the observed noisy image. In the following we analyze through experiments how the PSNR of the restored image depends on the value of $\lambda$. We show the results for the test images "Cameraman" (256 × 256) and "Lena" (256 × 256) with intensity values ranging from 0 to 255. In the experiment, the noisy images are produced by corrupting the test images with salt-and-pepper noise or random-valued impulse noise of levels 10%, 20%, and 30%. Figure 2 plots the PSNR versus the value of $\lambda$ for the image "Cameraman" with salt-and-pepper noise at different levels. Figure 3 plots the same for random-valued impulse noise. From the plots, one can observe the following. First of all, for both types of impulse noise, the PSNR of each of the three methods increases rapidly to a maximum and then decreases slowly as the value of $\lambda$ increases. Moreover, the numerically optimal value of $\lambda$ (corresponding to the maximum PSNR) depends on the level of impulse noise. Lastly, for all levels of noise, the maximum PSNRs obtained by TV-$L^1$ and our method are comparable, while the maximum PSNRs obtained by TV-$L^2$ are much lower (about 2 dB less). This again indicates that TV-$L^2$ is not suited to impulse noise removal. To study how the optimal value of $\lambda$ depends on the noise level for TV-$L^1$ and our method, we show the best values of $\lambda$ for various levels of impulse noise in Figure 4 (salt-and-pepper noise) and Figure 5 (random-valued impulse noise). From Figures 4 and 5, one can see that TV-$L^1$ and our method show similar patterns in the dependency of the best $\lambda$ on the noise level. In general, the higher the noise level, the larger the best value of $\lambda$. More specifically, for both methods, the best value of $\lambda$ tends to be stable in $[0.8, 1.2]$ when the noise level is above 15%. Moreover, Figures 2 and 3 show that the PSNR decreases slowly if the value of $\lambda$ is a little larger than the optimal value. For convenience, we choose $\lambda = 1.1$ uniformly when the noise level is above 15% and $\lambda = 0.8$ when the noise level is below 15%.
In the following experiments we compare visually and quantitatively the performance of our method with TV-$L^1$, AMF, and ACWMF in removing impulse noise. Figures 6 and 7, respectively, show the results obtained by these methods for salt-and-pepper noise and random-valued impulse noise. The maximum window size used in AMF [14] is 19. The ACWMF [15] is successively performed 4 times with different parameters, which are chosen to be the same as those in [23]. For both salt-and-pepper noise and random-valued impulse noise, TV-$L^1$ and our method perform well in removing noise and preserving the edges. But our method is a little better than TV-$L^1$ in two respects. Objectively, the PSNR of our method is about 0.2 to 0.3 dB higher than that of TV-$L^1$, and visually, there is less staircasing in the smooth areas of the restored images. The ACWMF and AMF are better than TV-$L^1$ and our method in preserving small-scale details such as the textured ground in the image "Cameraman," and the PSNR of the AMF on the image "Cameraman" is even higher than that of our method by 2.65 dB in the case of salt-and-pepper noise. However, the ACWMF and AMF cannot successfully detect all the impulse noise, in that some scattered peak points are visible in the restored images. Moreover, the AMF fails to suppress the random-valued impulse noise.

As indicated in [10], the weighted sum of $L^1$ and $L^2$ fidelity is robust to any kind of commonly used noise prior, yet empirically, the $L^1$ norm absolutely dominates the fidelity. This motivates us to apply our model to remove Gaussian noise. Table 1 presents some results obtained by TV-$L^2$, TV-$L^1$, and our method. Figure 8 compares the visual effects of these methods. One can see that TV-$L^1$ performs worse than the other two methods at higher noise levels, and the image restored by TV-$L^2$ exhibits obvious and annoying staircase artifacts. The PSNRs and the restored images show that our method also performs better for additive Gaussian noise, for which the $L^2$ fitting function is, based on statistical analysis, the best choice among all possible data fitting terms. The better results come from the two-modality property of our potential function.

Now we test the performance of our method in removing mixed noise consisting of additive Gaussian noise, Poisson noise, and impulse noise. We also compare our method with TV-$L^2$, TV-$L^1$, ACWMF, and AMF. The Poisson noise is generated using the "poissrnd" function in MATLAB with the input image scaled to the maximum intensity ($u_{\max} = 255$). For the impulse noise, we only consider the random-valued impulse noise, because a pixel contaminated by such noise is not as distinctive an outlier as one contaminated by salt-and-pepper noise and consequently is more difficult to detect. We consider three levels of the random-valued impulse noise: 10%, 20%, and 30%. The standard deviation of the white Gaussian noise is 10. In all cases, impulse noise is the first to be added and Gaussian noise is the last to be added. The PSNRs of the different methods are presented in Table 2 and some of the restored images are shown in Figure 9. For all levels of impulse noise, our method obtains the best PSNRs and visual effects. TV-$L^1$ performs comparably in removing impulse noise, but it does not perform as well as our method in removing mixed noise containing Gaussian noise. This may be explained by the two modalities of our potential function. Among the median-filter-based methods, the AMF in particular is well suited to salt-and-pepper noise, but it does not perform well in the case of random-valued impulse noise or mixed noise containing random-valued impulse noise. In fact, the AMF is good at detecting salt-and-pepper noise because in that case most of the noisy pixels are much more dissimilar to regular pixels and hence are easier to detect. However, the AMF is not effective in detecting random-valued impulse noise when the noise ratio is high.

Cartoon-Texture Decomposition.
In this subsection, we show the effectiveness of our method in cartoon-texture decomposition and compare it with the TV-$L^1$ method. Since the potential function $\phi$ is a good approximation of the $L^1$ norm, our model (9) can be used in cartoon-texture decomposition. We use model (9) to obtain the cartoon component $u$ and finally take the texture component $v = f - u$. Figure 10 shows some results obtained by the two methods on four test images, each of which contains smooth areas bounded by large-scale edges (cartoon) and repeated small-scale details (texture). The top row shows the original test images. The other rows show the decomposition results. One can observe that our method separates the cartoon and the texture more thoroughly. In the cartoon components obtained by TV-$L^1$, some textures are left behind. The cartoon component obtained by our method contains only the main structure of the image, that is, the smoothed objects and their boundaries, and the small-scale details are to a large extent separated into the texture part.
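The decomposition step can be illustrated on a one-dimensional signal. The sketch below is a simplification of ours, not the experiment reported here: it minimizes the discrete 1D energy $\lambda \sum_i \phi(u_{i+1} - u_i) + \sum_i \phi(u_i - f_i)$ by explicit gradient descent, with fluxes evaluated between neighboring samples and zero flux at the two ends, and then sets the texture to the residual. The names `cartoon_texture_1d` and `regularity`, the test signal, and the parameter values are all hypothetical.

```python
import math


def phi(s):
    """Potential with phi'(s) = arctan(s) and phi(0) = 0."""
    return s * math.atan(s) - 0.5 * math.log1p(s * s)


def regularity(signal):
    """Discrete regularity term: sum of phi over neighboring differences."""
    return sum(phi(signal[i + 1] - signal[i]) for i in range(len(signal) - 1))


def cartoon_texture_1d(f, lam=1.0, dt=0.1, steps=200):
    """Return (u, v) with u the smoothed cartoon part and v = f - u."""
    n = len(f)
    u = list(f)
    for _ in range(steps):
        # flux[i] = phi'(u[i+1] - u[i]) lives between samples i and i+1.
        flux = [math.atan(u[i + 1] - u[i]) for i in range(n - 1)]
        new_u = []
        for i in range(n):
            right = flux[i] if i < n - 1 else 0.0  # zero flux at the ends
            left = flux[i - 1] if i > 0 else 0.0
            step = lam * (right - left) - math.atan(u[i] - f[i])
            new_u.append(u[i] + dt * step)
        u = new_u
    v = [fi - ui for fi, ui in zip(f, u)]
    return u, v


if __name__ == "__main__":
    # A step edge (cartoon) plus a small alternating pattern (texture).
    f = [(0.0 if i < 20 else 50.0) + 3.0 * (-1) ** i for i in range(40)]
    u, v = cartoon_texture_1d(f)
    print("regularity of f:", round(regularity(f), 3))
    print("regularity of u:", round(regularity(u), 3))
```

By construction $u + v = f$, and since each descent step with a sufficiently small `dt` does not increase the energy, the cartoon part `u` has a smaller regularity term than the input.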
Finally, we test the robustness of our method for cartoon-texture decomposition in the presence of noise. The results are shown in Figure 11. The first row shows the input images: image (a) is corrupted with salt-and-pepper noise ($p = 20\%$), Gaussian noise with standard deviation $\sigma = 10$, and Poisson noise; synthetic image (b) is corrupted with random-valued impulse noise ($p = 20\%$), Gaussian noise with standard deviation $\sigma = 10$, and Poisson noise. Both TV-$L^1$ and our method place the noise in the texture component. In the cartoon components obtained by TV-$L^1$ there are noticeable staircase artifacts, while the cartoon components obtained by our method are visually much better.

Conclusions
In this work we define a new potential function and use it to measure the data fidelity as well as the regularity for image denoising and cartoon-texture decomposition. The new potential function has some attractive mathematical properties: it is strictly convex, smooth, and has two modalities, which gives the proposed model several advantages over the classical TV-$L^2$ and TV-$L^1$ models. For example, it removes a wider range of noise types well, including additive Gaussian noise, impulse noise, Poisson noise, and their mixtures; like the TV regularity, it preserves important geometric structures such as image edges, but unlike the TV regularity, it does not cause a staircase effect in smooth areas; moreover, the new model admits a unique global minimizer, which can be computed by the steepest descent method. Numerical experiments show that the proposed model outperforms TV-$L^2$ and TV-$L^1$ in removing commonly encountered types of noise. Tests on cartoon-texture decomposition show that our method is effective and performs better than TV-$L^1$.

Figure 2: PSNR versus $\lambda$ for the image "Cameraman" corrupted by salt-and-pepper noise at different levels.

Figure 3: PSNR versus $\lambda$ for the image "Cameraman" corrupted by random-valued impulse noise at different levels.

Figure 4: The best value of $\lambda$ for the proposed method and TV-$L^1$ on the image "Cameraman," in the case of salt-and-pepper noise at different levels.

Figure 10: Cartoon-texture decomposition results (for each method, the cartoon component is shown on the left and the texture component on the right).

Table 1: Denoising results (PSNR) on three test images corrupted by Gaussian noise. The best PSNRs are given in bold.

Table 2: Denoising results (PSNR) on four test images in the presence of a mixture of random-valued impulse noise, Gaussian noise with standard deviation $\sigma = 10$,