A Residual-Based Kernel Regression Method for Image Denoising

,


Introduction
In image processing, noise reduction with preserving fine features is a long-standing research problem.Many applications such as image segmentation and medical image analysis require effective noise suppression to obtain reliable results.In recent years, many efficient mathematical methods such as kernel regression based methods and locally adaptive based methods for the purpose of image denoising have been developed in the literature (see [1][2][3][4][5][6][7][8][9][10] and the references cited therein).For example, Tomasi and Manduchi proposed the well-known bilateral filtering for image filtering in [1], which exploits the local structural patterns to regularize the image filtering procedure and assume that images are locally smooth except at edges.The bilateral filter can preserve certain details of images, while more information inherent in the image still remains to be discovered and utilized.Tschumperlé [2] proposed a common framework for image filtering which is based on the iterative local diffusion in the image guided by the local structure tensor.The method makes use of the local gradients of the image to control the shape of the kernel and hence to keep texture details of the image efficiently.However, for a very noisy image, the gradients of the image are likely to be corrupted by the noise and thus might raise huge errors in terms of gradient estimator in smooth areas, which might generate negatively "ripple" phenomena.To overcome the drawbacks mentioned above, Takeda et al. [3] further proposed an iterative steering kernel regression filter, of which an iterative filtering procedure is used to exploit the output (less noisy) image of each iteration to improve the local orientation estimates used in the kernel in the next iteration.The implementation of the method is very timeconsuming, even exceeding that of nonlocal methods [11,12]; and it shows instability in estimating the local gradients of the image under high-level noisy case.López-Rubio and Florentín-Núñez [4] proposed to adopt a general kernel regression framework to 3D MR image denoising.The method is rooted on a zeroth-order 3D kernel regression, which calculates a weighted average of the pixels over a regression window; and the weights are obtained from the similarities among small sized feature vectors associated to each pixel.One common feature for these kernel regression methods is the exploration of the local structural regularity properties in natural images/videos.Based on this observation, the kernel regression based methods (or combined with other methods) are also successfully applied to image and video deblurring, upscaling, interpolation, fusion [3], superresolution [6,[13][14][15][16], registration [17], JPEG image deblocking [18], and so forth.On the other hand, the residualbased image denoising methods have also been studied.For instance, in [19], the corrupted image was first denoised with the total variation minimization algorithm; then the residual image was denoised with the adaptive Wiener filter; and finally, based on a statistical test, the filtered residuals were added back to the denoised image.Similarly, in [20], the authors first adopted nonlocal means filter to denoise the degraded image and added the final filtered residual image to the denoised image to recover the lost image details.In [21], the image denoising was implemented in the wavelet transform domain.In the method, based on the kurtosis statistic of the residual image, a criterion was proposed for determining wavelet coefficient thresholds.
In this paper, different from the residual-based methods in [19][20][21], we propose a new residual-based denoising method, which combines the bilateral filter and the local structure adaptive kernel filter by making use of their respective advantages.The proposed method can prevent the fine details loss caused by the bilateral kernel regression method and avoid image ripple phenomena produced from the local structure adaptive kernel method.Numerical experiments demonstrate that, for images contaminated with Gaussian noise, the performance of our combined approach is generally superior to those of some traditional and state-of-the-art restoration methods in terms of the image quality measured by PSNR and SSIM [22].

Classic Kernel Regression Methods
In this section, we briefly review bilateral kernel and structural adaptive kernel methods which will be used in the development of our new method later.
Classical regression methods for image processing rely on a specific model of the signal of interest and calculate the parameters of the model in the presence of noise.Compared to the parametric methods, kernel regression methods [23,24] rely on the data itself to dominate the structure of the model.Generally, the traditional kernel regression methods are based on solving the following optimization problem: where N(x) is a set that consists of the index numbers of the neighbors of x;   is the pixel observation at x  ; ỹ(x) denotes the estimate of the pixel at location x of the image.(x  − x) is a generic kernel (e.g., Gaussian, exponential, or other forms) which is used as the penalty for distance between x  and x, so that the influence of x on x  decreases as the distance between these two points increases.Considering  0 as the first coefficient of some regression function on the coordinates x, (1) is essentially a zero-order estimation.In order to approximate the image local structure better, higher order estimation was used in [3,25].Specifically, assuming that the image is locally smooth to some order, we can relay on the local expansion of the function using the Taylor series where tril denotes the operator that extracts the lower triangular part of a matrix and stacks it to a column vector.Then, the optimal regression coefficients {  }  =0 can be calculated from the following optimization problem: Although all regression coefficients {  }  =0 can be estimated effectively, the first element  0 is the desired pixel value estimation at x.For the details of computing  0 , one can refer to [3,23,24].
As mentioned above, the traditional kernel methods simply take the distance of the pixels into consideration.That is, large weight is assigned to nearby pixels, while smaller weights are assigned to farther pixels.This may lead to obscurity of the image, especially on the edges.To relieve the problem, Tomasi and Manduchi [1] proposed the bilateral filter as solving the following optimization problem: where Here, the kernel functions  1 and  2 are used for penalizing the spatial distance between the pixel of interest x and its neighbors {x  } and the radiometric "distance" between the corresponding pixels  and {  }.The main idea of bilateral kernel is that, when calculating weights, it considers both the pixel distance and space distance between the points at x and x  .Then, due to considerable variation of the pixel values around the image edges, the values of the associated weights  2 are relatively small, which contributes to preserving the edges of the image.In addition, the generalizations of the bilateral filter, including the choices of various kernels and higher order estimation, were reported in [10,26,27].
Although the bilateral filter can solve the indistinction problem of image margins to some extent, only distance information of the pixels is used, which might make a loss of details in the textures of image margins.Tschumperlé [2] further proposed to utilize correlation information of pixels to estimate the local directions of the image.And he constructed a local structure adaptive kernel as follows: where  is the diffusion tensor controlling the spatial structure of the kernel and  is the parameter that controls the bandwidth of the structural kernel.In order to obtain ,  we construct a structure tensor, which can be defined as follows: where (⋅) is a Gaussian kernel and (x) and (x) are the -gradient and -gradient at coordinates x, respectively.Here we use Prewitt operator [23,24] to estimate the gradient.After performing eigenvalue decomposition on (x), we get where  1 ,  2 are the eigenvalues of (x) with  1 ≤  2 ̸ = 0, reflecting the strength of the gradient along each eigenvector direction and u and k are the eigenvectors of (x).According to methods for finding diffusion tensor described in [2,24], we set where Here,  ≥ 0 is a sensible parameter so that the shape of the local structure adaptive kernel will respond to the local gradient of the image.

A Residual-Based Joint Denoising Method
Bilateral filter and structure adaptive denoising methods described above can restore the noisy images efficiently.For example, the original noise-free image "Lena" shown in Figure 1(a) is degraded by an additive white Gaussian noise with noise variances  2 = 20 2 , as shown in Figure 2(a).We test the two methods on the Noisy "Lena" image, respectively.As we can see, the bilateral filter with PSNR = 27.56 dB, SSIM = 0.86 and the structure adaptive denoising method with PSNR = 30.20 dB, SSIM = 0.84 exhibit acceptable image quality as shown in Figures 2(b) and 2(c), respectively.On the other hand, as shown in the figure, the bilateral filter can smooth the Noisy "Lena" efficiently but erase certain details of the image in the meanwhile; and the local self-adaptive method can deliver the fine features, such edges, of the image, whereas it might generate image ripples caused by noise in the smooth area.Based on this observation, we here propose a new method, that is, a residual feedback-based joint image denoising method using bilateral filter and structure adaptive methods.
Firstly, we define an image residual model as follows: where (  ) is the pixel value at point   of the noisy image with  pixels; (  ) is the pixel value at point   of the denoised image; and ε is the estimate of the location disturbance of   , namely, image residual.We assume that image residual ε still obeys Gaussian distribution with mean of 0 and a low-level variance; that is, We next define a criterion for testing image residual as follows: where  ℎ 1 (  ) denotes a neighbor of   with the radius being ℎ 1 .As the expectation of image disturbance is 0, the criterion (⋅) should not be too large or too small in a flat area.Thus we can tag abnormal points by where  is a preprescribed value, used as an indicator of the toughness of tagging.As an example, the image residual  between Figures 2(a) and 2(b) is shown in Figure 3(a) and  1 shown in Figure 3(b) is obtained by using (13).From Figure 3(b), we can see that the details are tagged to some extent, but the noise still remains.So, for a tagged residual figure  1 (  ), we need to "filter" it a second time.Define Here, similarly to  ℎ 1 (  ),  ℎ 2 (  ) is a neighbor of   and ℎ 2 is its radius; and  1 is a preselected threshold.The main idea of designing  2 in ( 14) is based on the fact that the useful information, such as edges, in the residual tagged image  1 () tends to be clustered while the noise-tags distribute randomly in the image  1 ().Thus we can determine whether a given pixel, say   , is informative or noisy by simply calculating  2 (  ).As shown in Figure 3(c), the noise scattered in the tagged image has been eliminated visibly after the second filtering process.In addition, it is worth noting that there exist some interrupted points in image  2 (), which might result from the fact that some useful and informative areas are covered by the noise.So, we further expand the tagged information by linking those unconnected abutting areas with following function: After the operation using  3 , we get a binary image shown as in Figure 3(d), which just tags useful information, such as edges, and meaningless noise has been eliminated clearly.
According to the analysis described above,  3 can clearly indicate the indistinct margins caused by bilateral denoising.On the other hand, as discussed in Section 2, the structure adaptive kernel based methods can protect fine features, such as edges, of an image as well as possible.So, we can avoid the appearance of indistinct margins and ripples in the restored image by combining the favorite properties of these two methods.Detailedly, let  1 and  2 be the images denoised by using structure adaptive kernel method and bilateral filter, respectively.Then the resulting image  3 can be formulated as follows: It is worth noting that in order to get  3 , instead of filtering the whole noisy image to obtain  1 by structure adaptive kernel method, we only need to get some parts of  1 by filtering those pixels satisfying the first restriction in (15), which can reduce the computation complexity greatly.In addition, in order to erase those image edges caused by splicing, a smoothing formula is defined as where # denotes the number of elements of set . Thus we can replace   3 with  3 in ( 16) to gain a smoothed splicing image.
In summary, the main steps of our proposed method are as follows: (1) Restore the noisy image  by bilateral filter and get the restored image  2 .
(4) Apply structure adaptive method to denoise those pixels   of the noisy image  satisfying the first restriction of ( 15) and get  1 .
(5) Get the restored image  3 by Figure 2(d) illustrates the restored "Lena" image by using our proposed method.This image has PSNR = 31.65dB and SSIM = 0.92.As we can see, the proposed method performs remarkably better than the bilateral filter and structure adaptive kernel method.Furthermore, the execution time of our proposed method with 55. 16 seconds is much faster than that of the structure adaptive kernel method with 78.12 seconds.This can be explained by the fact that our proposed method only performs structure filter on tagged points indicated by  3 , instead of the whole image, leading to decrease of the computation complexity.

Numerical Experiments
In this section, we present numerical results on simulated and read data to demonstrate the performance of the proposed method and compare it with several traditional methods and some existing state-of-the-art methods, which include bilateral filter, structure adaptive kernel method described in the previous Section 2, Wiener filter [28], Kuwahara filter [29], nonlinear complex diffusion (NCD) processes [30], BiShrink [31], NeighShrinkSURE [32], OWT-SURELET [33], FP 2 O-ITV [34], and BM3D [35].We note that all algorithms are implemented by using MATLAB (version 7. (1/) ∑  =1 (  − f ) 2 ) , where   and f are the pixel intensity of the original image and the retrieved image, respectively. denotes the total number of pixels.Moreover, we here also use the Structural SIMilarity (SSIM) Index introduced by [22] to measure the similarity between the original image  and the restored image f.Note that in this paper we use Gaussian type kernel functions in the three kernel based methods.For parameters in bilateral filter, corresponding to  1 and  2 , we choose  1 =  2 = 3 and the window size ℎ = 4; for structure adaptive kernel method, we choose  = 3, ℎ = 5; and for our proposed joint method, we choose ℎ 1 = 3 in (12),  1 and  2 , respectively, being 15% quantile and 85% quantile of the residuals in the set  ℎ 1 (⋅) in ( 13),  1 =  2 = 15 in ( 14) and ( 15), and ℎ 1 = ℎ 2 = ℎ 3 = 3 in all the three equations.For the other test denoising methods, we experimentally select the parameters according to the suggestions discussed in the literature.For example, in the Wiener filter [28] and Kuwahara filter [29], we set the window size being 6 × 6 and 5 × 5, respectively; in the BiShrink [31], we use the separable discrete wavelet transform with Daubechies' least asymmetric compactly supported wavelet with eight vanishing moments with four scales; also in the NeighShrinkSURE [32] and OWT-SURELET [33]  methods, the wavelet type used in the method is "sym8" with the number of wavelet decomposition levels equal to four, and we assume that the noise variances were known; in the algorithm FP 2 O-ITV [34], we fix  = 1/0.05and  = 0.25 for convenience.
We first choose the 512 × 512 standard test images, "Lena," "Pirate," "Boat," and "House" shown as in Figure 1, in our experiments.They were contaminated by the Gaussian random noise with standard deviation varying from  = 20 to  = 40 and the step size being 5.As we can see from Table 1, when the quality of the restored images is measured by PSNR and SSIM, the proposed method significantly performs better than the traditional methods, bilateral filer, structure adaptive kernel method, Wiener filter, and Kuwahara filter; and it generally outperforms the NCD, BiShrink, and FP 2 O-ITV, especially for high-level noisy cases (e.g.,  ≥ 30).When compared with NeighShrinkSURE and OWT-SURELET methods, it is also promising.In addition, when only focusing on the SSIM-values of the table, our proposed method possesses particularly favorable results in comparison with all others except BM3D, which can be explained by the fact that the proposed method inherits the advantages of the bilateral filter and the local structure adaptive kernel filter, coinciding with our aforementioned analysis in Section 3. Note that the performance of BM3D nonlocal method is superior to all the other methods.

Conclusion
In this paper, we propose a new method for joint denoising with the help of residual feedbacks.This method combines favorable properties of bilateral filter and local self-adaptive kernel; and it can efficiently preserve the fine structure of the image.Numerical experiments show that our new combined method has acceptable denoising performance even for handling images with heavy Gaussian noise.It remarkably exceeds traditional bilateral filter, local self-adaptive kernel, Wiener filter, and Kuwahara filter methods and generally outperforms some

Figure 2 :
Figure 2: Comparison of the three denoising methods: (a) noise image with  = 20, (b) bilateral filter, (c) structure adaptive filter, and (d) the proposed method.
10) and preformed on Windows 7 with CPU E3V2 3.3 GHz and 8 GB memory.The denoising image quality is measured by Peak Signal-to-Noise Ratio (PSNR) defined by PSNR = 10 log 10 ( 255 2

Figures 4 -
7 illustrate the comparative results of the corrupted "Lena" and "Pirate" images with noise-levels,  = 30 and  = 40, respectively.As shown in the figures, the proposed method can preserve the fine features of the images favorably in visual effects.Finally, we apply the three denoising methods on the color image shown in Figure8, which is contaminated by real film grain and scanning process noise.To obtain better color evaluations, we transform the RGB image to the YCrCb representation.Here, Y represents the luminance component and Cr and Cb denote the chrominance components.Then we apply the bilateral filter, the structure adaptive filter, and our proposed denoising method on each channel separately.The results are shown in Figures 8(b)-8(d), respectively.As we can see, the image in Figure 8(d) appears clearer than that in Figure 8(b).To further demonstrate this, Figures 8(e) and 8(f) show the absolute values of the residuals on the Y channel.It can be seen that the proposed method produces less features than bilateral filter and more noise-like residuals than structure adaptive method, which, in the absence of original noise-free image, is a reasonable indicator of superior performance.

Figure 8 :
Figure 8: Performance of different denoising methods is compared on a color image with real noise; (e)-(g) show the residual between given noisy image and the respective estimates.(a) Real noisy image; (b) bilateral filter; (c) structure adaptive filter; (d) the proposed method; (e) bilateral filter; (f) structure adaptive filter; (g) the proposed method.

Table 1 :
A comparison of different denosing methods.