Image Denoising Algorithm Combined with SGK Dictionary Learning and Principal Component Analysis Noise Estimation



Introduction
During image acquisition and transmission, noise is inevitably introduced and degrades image quality, so image denoising is of great importance. Image denoising algorithms can be divided into spatial-domain and frequency-domain methods. The former include mean filtering, median filtering, and Wiener filtering. The latter include the Fourier transform [1], Laplace transform [2], and wavelet transform [3]. Building on wavelet theory, a series of post-wavelet multiscale tools have been developed to filter noise effectively, such as the curvelet [4], directionlet [5], bandelet [6], and shearlet [7].
In recent years, several novel denoising algorithms have emerged, such as nonlocal means denoising [8], Gaussian mixture model denoising [9], and dictionary learning denoising [10] based on sparse representation [11]. An image denoising method based on wavelet and SVD transforms improves denoising performance [12]. Moreover, K-singular value decomposition (K-SVD) [13], which is based on overcomplete sparse representation, has recently been the subject of intense research activity within the denoising community [14, 15]. However, K-SVD requires more iterations when dealing with large data. Sujit therefore proposed the SGK dictionary learning algorithm [13] in 2013, which not only overcomes the drawback of ordinary dictionary learning, namely that it breaks the structure of the sparse coefficients, but can also be applied to a variety of sparse representations with low complexity and fast computation [10].
At present, many image denoising algorithms need to know the noise standard deviation in advance [16], but it is usually unknown in practice. Noise estimation has therefore been developed within the image denoising community. The classic filtering approach in [17] estimates the noise standard deviation by convolving the image with a filter. The DCT of an image patch [18] concentrates the image structure in the low-frequency coefficients, so that noise estimation can be performed on the high-frequency coefficients. It is also common to estimate the noise level from the grayscale values of the image [19]. Patch-based local variance methods [20] generally estimate the noise level with robust statistical algorithms. The Bayesian shrinkage algorithm [21] denoises the image and analyzes the autocorrelation of the residuals over a range of noise standard deviations to find the true value. The distribution of the bandpass filter response [22] can be divided into two parts, according to the difference between image and noise, which are fitted by expectation maximization [23]. The kurtosis of the marginal bandpass filter response distributions [24] is constant for a noise-free image, so a kurtosis model can be established and the noise standard deviation can be evaluated by finding the best parameters of the model. However, most of the above algorithms assume that the image contains sufficiently uniform areas. For images with abundant textures, Pyatykh et al. [25] proposed PCA noise estimation based on image patches, where the noise variance is estimated as the minimum eigenvalue of the image patch covariance matrix.
Based on the above considerations, a denoising algorithm that combines SGK dictionary learning with PCA noise estimation is proposed. Firstly, the image with additive white Gaussian noise is divided into patches, and the noise level is estimated from the minimum eigenvalue of the image patch covariance matrix. Then the estimated noise standard deviation is passed to the SGK dictionary learning algorithm to denoise the image. During the denoising process, each image patch is sparsely represented, and the sparse representation coefficients are calculated by a pursuit algorithm. The dictionary atoms are updated with the sparse representation coefficients, so that a more accurate approximation of each image patch is obtained. The experimental results show that the proposed algorithm is superior to the other algorithms in noise level estimation and has better denoising performance.

SGK Dictionary Learning Denoising Algorithm

Image Denoising Problem and SGK Dictionary Learning.
The SGK dictionary learning algorithm is a generalization of K-means clustering. When used for denoising, it mainly consists of two stages, a sparse coding stage and a dictionary update stage [13]; the flow chart is shown in Figure 1. The SGK algorithm first processes the image with the initial DCT dictionary and then updates the dictionary using the sparse representation coefficients. Each local patch extracted from the image is then sparse-coded with the newly trained dictionary to denoise the image.
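The initial DCT dictionary mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `overcomplete_dct_dictionary` and the choice of 256 atoms are assumptions; only the 8 × 8 patch size is stated in the paper. It builds a separable overcomplete DCT dictionary as the Kronecker product of a one-dimensional overcomplete cosine basis with itself.

```python
import numpy as np

def overcomplete_dct_dictionary(patch_side=8, atoms_per_dim=16):
    """Return a (patch_side**2) x (atoms_per_dim**2) dictionary with unit-norm atoms.

    atoms_per_dim=16 (256 atoms in total) is an assumed value, not taken from the paper.
    """
    base = np.zeros((patch_side, atoms_per_dim))
    for k in range(atoms_per_dim):
        v = np.cos(np.arange(patch_side) * k * np.pi / atoms_per_dim)
        if k > 0:
            v = v - v.mean()              # remove the mean from non-constant atoms
        base[:, k] = v / np.linalg.norm(v)
    # Separable 2-D atoms: every pair of 1-D atoms gives one 2-D atom.
    return np.kron(base, base)

D0 = overcomplete_dct_dictionary()
```

Each column of `D0` is one vectorized 8 × 8 atom, so the dictionary can be used directly for sparse coding of vectorized patches.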

Sparse Coding Stage.
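For an image $A$ of size $\sqrt{N} \times \sqrt{N}$ corrupted by additive white Gaussian noise $W \in \mathbb{R}^{\sqrt{N} \times \sqrt{N}}$, the noisy image $B$ is

$$B = A + W. \tag{1}$$

Assume that the dictionary $D \in \mathbb{R}^{n \times K}$ consists of atoms $d_k \in \mathbb{R}^{n}$, where $k = 1, 2, \ldots, K$. $Q_{ij}$ represents an $n \times N$ matrix that extracts the patch of size $\sqrt{n} \times \sqrt{n}$ at location $(i, j)$ of image $A$ (with the image stacked as a column vector of length $N$), that is, $\forall ij \; \{Q_{ij} A \in \mathbb{R}^{n}\}$. For each local patch, $a = Q_{ij} A$ can be sparsely represented over the dictionary $D$:

$$\hat{\beta} = \arg\min_{\beta} \|\beta\|_0 \quad \text{s.t.} \quad D\beta \approx a. \tag{2}$$

For any patch in the image,

$$\hat{\beta}_{ij} = \arg\min_{\beta} \; \mu_{ij} \|\beta\|_0 + \|D\beta - Q_{ij} A\|_2^2. \tag{3}$$

Therefore the global image representation is

$$\{\hat{\beta}_{ij}, \hat{A}\} = \arg\min_{\beta_{ij}, A} \; \lambda \|A - B\|_2^2 + \sum_{ij} \mu_{ij} \|\beta_{ij}\|_0 + \sum_{ij} \|D\beta_{ij} - Q_{ij} A\|_2^2. \tag{4}$$

For the solution of (4), $\mu_{ij}$ is determined by the patch-wise problem (3). With an appropriate choice of $\lambda$, $B$ is taken as the approximation of $A$, so the sparse coefficients are obtained as

$$\hat{\beta}_{ij} = \arg\min_{\beta} \; \mu_{ij} \|\beta\|_0 + \|D\beta - Q_{ij} B\|_2^2. \tag{5}$$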
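The sparse coding stage is solved by a pursuit algorithm. As an illustration only, the sketch below uses orthogonal matching pursuit (OMP), one common pursuit method; the paper does not say which pursuit variant is used, and the function name `omp`, the sparsity cap `max_atoms`, and the residual tolerance `tol` are assumptions. `D` is the dictionary with unit-norm columns and `y` is one vectorized noisy patch.

```python
import numpy as np

def omp(D, y, max_atoms, tol=1e-8):
    """Greedy sparse coding: fit y with at most `max_atoms` columns of D."""
    y = np.asarray(y, dtype=float)
    max_atoms = min(max_atoms, D.shape[1])
    beta = np.zeros(D.shape[1])
    support = []
    residual = y.copy()
    while len(support) < max_atoms and np.linalg.norm(residual) > tol:
        # Select the atom most correlated with the current residual.
        scores = np.abs(D.T @ residual)
        if support:
            scores[support] = -1.0        # never reselect an atom
        support.append(int(np.argmax(scores)))
        # Least-squares refit on the whole support (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
        beta[:] = 0.0
        beta[support] = coef
    return beta

D = np.eye(4)                             # toy dictionary with unit-norm atoms
y = np.array([0.0, 3.0, 0.0, -2.0])
beta = omp(D, y, max_atoms=2)
```

In a denoising setting the tolerance would normally be tied to the noise standard deviation, which is why an accurate noise estimate matters for this stage.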

Dictionary Update Stage.
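In the dictionary update stage, the atoms are updated sequentially to minimize the sparse representation error. For the patch at location $(i, j)$, this error is denoted as

$$g_{ij} = Q_{ij} B - D\hat{\beta}_{ij} = Q_{ij} B - \sum_{k=1}^{K} d_k \hat{\beta}_{ij}(k). \tag{6}$$

In (6), $\hat{\beta}_{ij}(k)$ is the $k$th component of $\hat{\beta}_{ij}$, and the error matrix $G$ is composed of all these elements $\{g_{ij}\}$ as columns. The error of each patch without the contribution of atom $d_k$ can be expressed as

$$g_{ij}^{k} = Q_{ij} B - \sum_{l \neq k} d_l \hat{\beta}_{ij}(l). \tag{7}$$

All these $\{g_{ij}^{k}\}$ form the columns of the error matrix $G_k$, and the corresponding $\{\hat{\beta}_{ij}(k)\}$ form the row vector $\beta^{k}$, so that

$$\|G\|_F^2 = \|G_k - d_k \beta^{k}\|_F^2, \tag{8}$$

where $\|\cdot\|_F$ is the Frobenius norm. According to the sequential generalization of K-means [12], the atom that minimizes (8) is

$$\hat{d}_k = \arg\min_{d_k} \|G_k - d_k \beta^{k}\|_F^2. \tag{9}$$

The closed-form solution of (9) is

$$\hat{d}_k = G_k (\beta^{k})^{T} \left( \beta^{k} (\beta^{k})^{T} \right)^{-1}. \tag{10}$$

Each atom is replaced with $d_k = \hat{d}_k$, and the trained SGK dictionary $\hat{D}$ is used to obtain the final sparse representation $\hat{\beta}_{ij}$ of each extracted local patch. $\hat{A}$ is then obtained as

$$\hat{A} = \arg\min_{A} \; \lambda \|A - B\|_2^2 + \sum_{ij} \|\hat{D}\hat{\beta}_{ij} - Q_{ij} A\|_2^2, \tag{11}$$

where $\lambda$ weights the fidelity to the noisy image $B$. The final solution of the sparse representation error minimization problem is

$$\hat{A} = \left( \lambda I + \sum_{ij} Q_{ij}^{T} Q_{ij} \right)^{-1} \left( \lambda B + \sum_{ij} Q_{ij}^{T} \hat{D} \hat{\beta}_{ij} \right). \tag{12}$$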
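The sequential atom update can be sketched in a few lines. This is a hedged illustration rather than the authors' implementation: the function name `sgk_dictionary_update` and the matrix layout (patches as columns of `Y`, coefficients as columns of `X`) are assumptions. Each atom is replaced by the least-squares minimizer of the residual that remains once that atom's own contribution is removed, while the sparse coefficients are left untouched.

```python
import numpy as np

def sgk_dictionary_update(D, X, Y):
    """One sequential pass over all atoms.

    D: n x K dictionary, X: K x P sparse coefficients, Y: n x P patches (columns).
    """
    D = np.asarray(D, dtype=float).copy()
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    for k in range(D.shape[1]):
        row = X[k, :]                      # coefficients that use atom k
        energy = row @ row
        if energy == 0.0:
            continue                       # unused atom: leave it unchanged
        # Residual with the contribution of atom k added back.
        G_k = Y - D @ X + np.outer(D[:, k], row)
        # Least-squares atom for fixed coefficients.
        D[:, k] = G_k @ row / energy
    return D

rng = np.random.default_rng(0)
Y = rng.normal(size=(16, 200))
D0 = rng.normal(size=(16, 32))
X = rng.normal(size=(32, 200)) * (rng.random((32, 200)) < 0.1)
D1 = sgk_dictionary_update(D0, X, Y)
```

Because every step solves its own least-squares problem exactly, the total representation error cannot increase during the pass. If the pursuit step expects unit-norm atoms, the columns would need to be renormalized afterwards; that step is omitted here.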

Noise Estimation Theory Based on PCA.
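Suppose that $A$ is a clean image of size $\sqrt{S} \times \sqrt{S}$, and $B$ represents the image with additive white Gaussian noise $W$. The noise variance $\sigma^2$ is unknown, so it needs to be estimated. For $A$, $B$, and $W$, each image contains $N = (\sqrt{S} - \sqrt{M} + 1)(\sqrt{S} - \sqrt{M} + 1)$ patches of size $M = \sqrt{M} \times \sqrt{M}$. Since $W$ is independent additive white Gaussian noise, $W \sim \mathcal{N}_M(0, \sigma^2 I)$ and $\operatorname{cov}(A, W) = 0$. Suppose that $J_A$ and $J_B$ are, respectively, the sample covariance matrices of the patches of $A$ and $B$. Let $\tilde{\mu}_{A,1} \geq \tilde{\mu}_{A,2} \geq \cdots \geq \tilde{\mu}_{A,M}$ be the eigenvalues of $J_A$, with corresponding eigenvectors $\tilde{W}_{A,1}, \ldots, \tilde{W}_{A,M}$. Similarly, $\tilde{\mu}_{B,1} \geq \tilde{\mu}_{B,2} \geq \cdots \geq \tilde{\mu}_{B,M}$ are the eigenvalues of $J_B$, with corresponding eigenvectors $\tilde{W}_{B,1}, \ldots, \tilde{W}_{B,M}$. $\tilde{W}_{B,1}^{T} B, \ldots, \tilde{W}_{B,M}^{T} B$ represent the sample principal components of $B$ [26], so that

$$s^2\left( \tilde{W}_{B,k}^{T} B \right) = \tilde{\mu}_{B,k}, \quad k = 1, 2, \ldots, M, \tag{13}$$

where $s^2$ represents the sample variance.
In order to apply PCA to noise variance estimation, a positive integer $m$ is defined. The patches of the clean image $A$ are assumed to satisfy $A_i \in W_{M-m} \subset \mathbb{R}^{M}$, that is, they lie in a subspace $W_{M-m}$ whose dimension $M - m$ is less than the number of coordinates $M$. Then

$$E\left( \left| \tilde{\mu}_{B,h} - \sigma^2 \right| \right) = O\left( \sigma^2 / \sqrt{N} \right), \quad N \to \infty, \tag{14}$$

which holds for all $h = M - m + 1, \ldots, M$. When considering the population principal components, $\operatorname{cov}(A, W) = 0$ implies $\Sigma_B = \Sigma_A + \Sigma_W$, where $\Sigma_B$, $\Sigma_A$, and $\Sigma_W$, respectively, represent the population covariance matrices of $B$, $A$, and $W$. Since $\Sigma_W = \sigma^2 I$ and the minimum eigenvalue of $\Sigma_A$ is zero, the minimum eigenvalue of $\Sigma_B$ is $\sigma^2$. As the sample size $N$ tends to infinity,

$$\lim_{N \to \infty} E\left( \left| \tilde{\mu}_{B,M} - \sigma^2 \right| \right) = 0, \tag{15}$$

which means that $\tilde{\mu}_{B,M}$ converges to $\sigma^2$, so the noise variance can be estimated as $\tilde{\mu}_{B,M}$, and this is a consistent estimate of the noise level.
If the above assumptions hold, the expected value of $\tilde{\mu}_{B,M-m+1} - \tilde{\mu}_{B,M}$ can be bounded using the triangle inequality and (14):

$$E\left( \left| \tilde{\mu}_{B,M-m+1} - \tilde{\mu}_{B,M} \right| \right) = O\left( \sigma^2 / \sqrt{N} \right). \tag{16}$$

The condition derived from (16) is

$$\tilde{\mu}_{B,M-m+1} - \tilde{\mu}_{B,M} < \frac{T \sigma^2}{\sqrt{N}}, \tag{17}$$

where $T$ is a fixed value satisfying $T > 0$.
An estimate $\sigma^2_{\mathrm{est}}$ of the noise variance can be verified by (17). If (17) holds, $\sigma^2_{\mathrm{est}}$ is the final estimate. If (17) does not hold, a subset of the image patches with small standard deviation is extracted and the noise estimation is performed again on that subset. This is repeated until (17) is satisfied and the final noise estimate is obtained.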
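The core of the estimator, the smallest eigenvalue of the patch covariance matrix, can be illustrated with a short sketch. It is deliberately simplified and is not the full procedure: the iterative patch-subset selection and the verification step are omitted, and the function name `pca_noise_std`, the 5 × 5 patch size, and the synthetic ramp test image are assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def pca_noise_std(image, patch=5):
    """Square root of the smallest eigenvalue of the patch sample covariance."""
    img = np.asarray(image, dtype=float)
    windows = sliding_window_view(img, (patch, patch))
    X = windows.reshape(-1, patch * patch)   # one row per overlapping patch
    cov = np.cov(X, rowvar=False)            # sample covariance of the patches
    eigvals = np.linalg.eigvalsh(cov)        # returned in ascending order
    return float(np.sqrt(max(eigvals[0], 0.0)))

# Synthetic check: a smooth ramp (its patches span a low-dimensional subspace)
# plus white Gaussian noise with standard deviation 10.
rng = np.random.default_rng(0)
ramp = np.add.outer(np.arange(128.0), np.arange(128.0))
noisy = ramp + rng.normal(0.0, 10.0, ramp.shape)
sigma_est = pca_noise_std(noisy)
```

With a finite number of patches the smallest sample eigenvalue tends to fall slightly below the true variance, so this raw estimate is expected to be close to, but a little under, the true standard deviation.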

Analysis of Experimental Results
The standard Kodak Photo CD benchmark was used to evaluate the performance of the denoising algorithms. Some images are of size 256 × 256, and the others are of size 512 × 512. The patch size is 8 × 8 for all images.

Comparison of Four Dictionary Learning Denoising Algorithms.
We conduct image denoising experiments with the SGK, DCT, Global, and K-SVD dictionary learning algorithms and the BM3D algorithm. The first four algorithms are similar in that they build a dictionary and then use it to denoise. The DCT algorithm denoises an image by sparsely representing each block with the overcomplete DCT dictionary and averaging the represented parts [11]. The Global algorithm denoises an image by training a dictionary on patches from the noisy image, sparsely representing each block with this dictionary, and averaging the represented parts [11]. The K-SVD algorithm is initialized with the DCT dictionary and then uses singular value decomposition for dictionary updating [11]. The BM3D algorithm is an image denoising method based on enhanced sparse representation in the transform domain [26].
In this experiment, we use the SGK, DCT, Global, K-SVD, and BM3D algorithms to denoise the Barbara image with σ = 25 as an example. As shown in Figures 3(b)-3(f), the denoising results of the five algorithms are visually similar, and the image details are largely preserved. The quantitative comparison between the five algorithms is given in Table 1. The PSNR of SGK is similar to that of K-SVD and higher than those of the Global, DCT, and BM3D algorithms. Encouragingly, SGK runs much faster than K-SVD. The MSE of SGK is smaller than those of the Global, DCT, and BM3D algorithms. Taken together, these results demonstrate the advantage of SGK over the other algorithms.
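For reference, the two quality measures used in this comparison can be computed as in the following sketch. It assumes 8-bit grayscale images with a peak value of 255, which the paper does not state explicitly; the function names `mse` and `psnr` are illustrative.

```python
import numpy as np

def mse(reference, estimate):
    """Mean squared error between the clean image and the denoised image."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    return float(np.mean((ref - est) ** 2))

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; `peak` is the assumed maximum gray level."""
    err = mse(reference, estimate)
    if err == 0.0:
        return float("inf")               # identical images
    return float(10.0 * np.log10(peak ** 2 / err))

clean = np.zeros((4, 4))
denoised = np.full((4, 4), 5.0)           # constant error of 5 gray levels
```

A lower MSE always corresponds to a higher PSNR for a fixed peak value, so the two columns of a results table rank the algorithms identically.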

Comparison of SGK Combined with Different Noise Estimation Algorithms.
In order to analyze the sensitivity of the SGK algorithm to the noise level, the noise standard deviation is set to σ = 5, 10, 15, 20, and 25, respectively. The SGK algorithm is used to denoise the Peppers image at each of these noise standard deviations.
Figure 4 shows the results of the denoising experiment using the SGK algorithm with different offsets. The five differently colored curves represent, for each given noise standard deviation, how the PSNR varies with the offset of the noise standard deviation supplied to the algorithm. Figure 4 shows that the PSNR of the denoised image is essentially unchanged when the noise standard deviation has a negative offset of 0 to −5% or a positive offset of 0 to +5%. As the offset grows further, the PSNR drops significantly, both for negative offsets of −5% to −25% and for positive offsets of +5% to +25%. The SGK algorithm is therefore sensitive to offsets of the noise standard deviation, so it is necessary to estimate the noise level before image denoising. The closer the estimated noise level is to the true one, the more accurate the denoising results will be.
In order to analyze the performance of SGK dictionary denoising combined with different noise estimation algorithms, this paper also introduces four noise estimation algorithms: Kurtosis [25], local standard deviation distribution mode (Mode) [27], local standard deviation median (Med) [27], and local standard deviation minimum (Min) [27].
Kurtosis assumes that the kurtosis of the marginal bandpass filter response distributions should be constant for a noise-free image, whereas the kurtosis of a noisy image may vary across scales. Under this assumption, the noise standard deviation can be estimated with a kurtosis model.
Mode estimates the noise standard deviation from the mode of the distribution of the image's local standard deviation. As the variance of the noise is constant throughout the picture, it affects every local variance value equally. As a result, the maximum of the bell-shaped distribution reflects the local variance of the degraded image within homogeneous areas. This value is the mode of the distribution, and it is very close to the noise variance, so the mode can be used to estimate it.
Med estimates the noise standard deviation from the median of the image's local standard deviation. If practical difficulties arise in estimating the mode properly, it may be useful to use the median operator instead, because it is simpler and because the two parameters, albeit different, are not far apart in practice. The median therefore provides an alternative estimation procedure.
Min estimates the noise standard deviation from the minimum value of the local standard deviation of the image. Within a uniform area, the variance of the degraded image equals the variance of the noise. One straightforward way to estimate the standard deviation is therefore to calculate the variance within homogeneous regions, where the variance of the original image is close to zero.
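The three local-statistics estimators can be sketched together, since they differ only in how the map of local standard deviations is summarized. This is an illustrative reading of the descriptions above, not the reference implementation of [27]: the function name `local_std_estimates`, the 7 × 7 window, and the histogram-based mode with 30 bins are all assumptions.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def local_std_estimates(image, window=7, bins=30):
    """Summarize the local standard deviation map by its minimum, median, and mode."""
    img = np.asarray(image, dtype=float)
    windows = sliding_window_view(img, (window, window))
    values = windows.std(axis=(2, 3), ddof=1).ravel()   # one value per window
    # Mode approximated by the center of the most populated histogram bin.
    counts, edges = np.histogram(values, bins=bins)
    peak = int(np.argmax(counts))
    mode = 0.5 * (edges[peak] + edges[peak + 1])
    return {
        "Min": float(values.min()),
        "Med": float(np.median(values)),
        "Mode": float(mode),
    }

# Synthetic check: a flat image plus white Gaussian noise with standard deviation 10.
rng = np.random.default_rng(0)
flat_noisy = 100.0 + rng.normal(0.0, 10.0, (128, 128))
estimates = local_std_estimates(flat_noisy)
```

On a perfectly flat image the median and mode sit near the true value while the minimum is biased low, which is consistent with Min giving the weakest results in the comparison; on textured images the minimum is the only one of the three that is not inflated by image structure.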
All of the above noise estimation algorithms are combined with the SGK dictionary learning algorithm in this paper. The combined algorithms are referred to as PCA + SGK, Kurtosis + SGK, Mode + SGK, Med + SGK, and Min + SGK. Figure 5 shows a comparison of the five algorithms. Figure 6 illustrates more intuitively how the performance of the algorithms varies with the noise standard deviation. Figure 6(a) shows the variation of the absolute error of the noise estimate with the noise standard deviation. PCA + SGK has the smallest error of the five algorithms, so its estimate is the most accurate. Figure 6(b) shows the variation of the noise estimation time with the noise standard deviation. Mode + SGK, Med + SGK, and Min + SGK take the least time, followed by PCA + SGK and then Kurtosis + SGK. Figure 6(c) shows the variation of PSNR with the noise standard deviation. As the standard deviation increases, PCA + SGK and Kurtosis + SGK maintain the highest PSNR, followed by Med + SGK and Mode + SGK, and Min + SGK has the lowest PSNR. Figure 6(d) shows the variation of MSE with the noise standard deviation. In this experiment, PCA + SGK has the lowest MSE, which means that its denoising performance is better than that of the other four algorithms.

Denoising Experiment of Noisy Image with Unknown Standard Deviation.
The above experiments assumed that the standard deviation of the noise contained in the image is known. In order to demonstrate the advantage of the proposed PCA + SGK algorithm, it is used to denoise noisy images with unknown standard deviation and is compared with the original SGK algorithm. Twelve classic original images are shown in Figure 7. These images are corrupted with additive white Gaussian noise of unknown standard deviation and then denoised with PCA + SGK and with SGK, respectively. When SGK is used for denoising, the noise standard deviation is unknown, so it can only be guessed from the noisy image or set to a random value. When PCA + SGK is used for denoising, the noise standard deviation is first estimated by PCA and then passed to SGK for denoising. Denoising results of the Cameraman image using these two algorithms are shown in Figure 8. The experiment confirms the good performance of our approach: the denoising performance of PCA + SGK is better than that of SGK. The denoising results of the 12 images are shown in Table 3. The PSNR of SGK is lower than that of PCA + SGK; that is, PCA + SGK has better denoising performance than SGK. Because the standard deviation of the noisy image is not given when using SGK, the noise level can only be guessed before it is passed to SGK for denoising. With PCA + SGK, PCA is first used to estimate the standard deviation, and the estimated value is then passed to SGK, so the denoising performance is better. Quantitative comparisons with traditional SGK illustrate the benefits of PCA + SGK.

Conclusions
In this paper, an algorithm combining PCA noise estimation with SGK dictionary learning was proposed for image denoising. The noisy image is first divided into patches, and the noise standard deviation is estimated from the minimum eigenvalue of the image patch covariance matrix. After that, the estimated noise standard deviation is passed to the SGK dictionary learning algorithm. The sparse representation of each training sample is obtained by sparse coding, and the dictionary atoms are updated in the dictionary update stage to denoise the image. This algorithm effectively solves the problem that the SGK algorithm requires the noise standard deviation to be known in advance. This paper draws the following three conclusions.
Firstly, the SGK dictionary learning algorithm is compared with the K-SVD, DCT, Global, and BM3D algorithms. The PSNR of the SGK algorithm does not differ much from those of the other four algorithms, and its MSE is higher only than that of the K-SVD algorithm. The SGK algorithm has a great advantage in denoising time and is much faster than the K-SVD, DCT, and Global algorithms. Therefore, the SGK algorithm has the best overall denoising performance. Secondly, the PCA algorithm is compared with the other four noise estimation algorithms: Kurtosis, Mode, Med, and Min. The five algorithms are, respectively, combined with the SGK algorithm to denoise images with additive white Gaussian noise of different standard deviations. The absolute deviation of the noise estimated by PCA + SGK is the smallest of the five; that is, its estimate of the noise standard deviation is the most accurate. In terms of noise estimation time, Min + SGK, Med + SGK, and Mode + SGK are the fastest, followed by the PCA + SGK algorithm proposed in this paper, and Kurtosis + SGK is the slowest. PCA + SGK and Kurtosis + SGK maintain the highest PSNR, followed by Med + SGK and Mode + SGK, and the lowest PSNR is that of Min + SGK. At the same time, the MSE of PCA + SGK is the lowest. So the denoising performance of the proposed algorithm is better than that of the other four algorithms. Overall, the proposed algorithm estimates the noise standard deviation more accurately while offering fast denoising and good denoising performance.

Mathematical Problems in Engineering
Thirdly, PCA + SGK and SGK are, respectively, used to denoise images with different noise standard deviations. Experiments show that the PSNR of PCA + SGK is much higher than that of SGK. When SGK is used for denoising, the noise standard deviation is unknown, so the denoising performance is poor. PCA + SGK first uses PCA to estimate the noise standard deviation, which is close to the true noise level, so the denoising performance is better and the image details are better preserved. While the performance improvement differs between images, the results nonetheless indicate the potential of the proposed algorithm over the original SGK algorithm.

Figure 2 is a PCA noise estimation example for the house image. Figure 2(a) is the house image, and Figure 2(b) shows the noise estimation results under different noise standard deviations. It can be seen that the PCA noise estimate is very close to the true value.

Figure 2: Noise estimation based on principal component analysis. (b) Noise level estimation results.

Figure 3: Using five kinds of dictionary learning algorithms for image denoising.

Figure 5: Comparison of five kinds of algorithms.

Figure 8(a) shows the noisy image with σ = 10. Figure 8(b) shows the image denoised by SGK, and Figure 8(c) shows the image denoised by PCA + SGK.

Figure 6: Variation of denoising performance with noise standard deviation for the five algorithms.

Figure 8: Comparison of two denoising algorithms on the Cameraman image.

Table 1: Image denoising results of five dictionary learning algorithms.

Table 2: Denoising indicators of SGK combined with five noise estimation algorithms.

Table 3: Denoising results of two algorithms for 12 images.