In traditional image denoising, the noise level is an important scalar parameter that determines how much the input noisy image should be smoothed. Existing noise estimation methods often assume that the noise level is constant at every pixel. However, real-world noise is signal dependent; that is, the noise level is not constant over the whole image. In this paper, we attempt to estimate a precise, pixelwise noise level instead of a single global scalar. To the best of our knowledge, this is the first work on the problem. In particular, we propose a deep convolutional neural network named “deep residual noise estimator” (DRNE) for pixelwise noise-level estimation. We carefully design the architecture of the DRNE, which consists of a stack of customized residual blocks without any pooling or interpolation operation. The proposed DRNE formulates noise estimation as pixel-to-pixel prediction. The experimental results show that the DRNE achieves better performance on nonhomogeneous noise estimation than state-of-the-art methods. In addition, the DRNE brings denoising performance gains in removing signal-dependent Gaussian noise when combined with recent deep learning denoising methods.
Noise level is an important parameter which decides how much the input noisy image should be smoothed. This parameter is directly required for many well-known denoising algorithms including Wiener filtering [
Image denoising is often formulated to remove Gaussian white noise, which is additive and homogeneous.
However, the homogeneous noise assumption is not valid in real-world images [
Signal-dependent image noise. (a) Linear RGB image processed from a raw image without denoising, brightening, or gamma correction. (b) Bright part (strong noise). (c) Dark part (slight noise). (d) Normal part (medium noise).
Another observation is that noise in the dark part seems stronger than that in the bright part in standard RGB (sRGB) images. The reason is that sRGB images are gamma corrected, so intensities in the dark part, including the noise, are amplified more than those in the bright part.
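The amplification effect can be illustrated with a minimal sketch. It assumes a simple power-law gamma of 2.2 in place of the exact piecewise sRGB curve; `GAMMA`, `gamma_encode`, and `local_gain` are hypothetical names introduced only for this illustration.

```python
import numpy as np

GAMMA = 2.2  # assumed power-law gamma; the real sRGB curve is piecewise


def gamma_encode(linear):
    """Apply a power-law gamma to linear intensities in [0, 1]."""
    return np.clip(linear, 0.0, 1.0) ** (1.0 / GAMMA)


def local_gain(linear, eps=1e-4):
    """Central-difference slope of the gamma curve at `linear`.

    The slope tells how much a small perturbation, such as noise,
    around that intensity is scaled by gamma correction.
    """
    return (gamma_encode(linear + eps) - gamma_encode(linear - eps)) / (2 * eps)


dark_gain = local_gain(0.05)    # slope at a dark linear intensity
bright_gain = local_gain(0.80)  # slope at a bright linear intensity
print(f"dark gain: {dark_gain:.2f}, bright gain: {bright_gain:.2f}")
```

Under this assumed curve, the slope is greater than one at the dark intensity and less than one at the bright intensity, so the same linear-domain noise looks stronger in dark regions after gamma correction.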
In the image signal processing (ISP) pipeline, as shown in Figure
Image processing pipeline (ISP). In the ISP, light signals are converted to the raw image through a color filter array (CFA) and then go through white balancing and demosaicing to result in a linear RGB image. This linear RGB image further goes through denoising, color space conversion, and finally nonlinear gamma correction to produce the final sRGB image.
Since the image noise variance is not constant in an image, granularity of noise-level estimation needs to be smaller in order to improve denoising performance. Pixelwise noise-level estimation is the ultimate form.
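To make the notion of a pixelwise noise level concrete, the following sketch simulates signal-dependent Gaussian noise together with its ground-truth noise-level map. It assumes a variance that grows linearly with the clean intensity, a common form for signal-dependent models; the parameters `a` and `b` and the function name are hypothetical and are not taken from this paper.

```python
import numpy as np


def add_signal_dependent_noise(clean, a=0.01, b=0.0004, seed=0):
    """Corrupt `clean` (floats in [0, 1]) with Gaussian noise whose
    variance is a * clean + b (an assumed, illustrative model).

    Returns the noisy image and the per-pixel standard deviation map.
    """
    rng = np.random.default_rng(seed)
    sigma_map = np.sqrt(a * clean + b)
    noisy = clean + rng.standard_normal(clean.shape) * sigma_map
    return noisy, sigma_map


# Horizontal intensity ramp: dark on the left, bright on the right.
clean = np.tile(np.linspace(0.0, 1.0, 256), (256, 1))
noisy, sigma_map = add_signal_dependent_noise(clean)
residual = noisy - clean
print(residual[:, :32].std(), residual[:, -32:].std())
```

In this sketch, a single scalar cannot describe the noise: the empirical standard deviation in the bright columns is larger than in the dark columns, which is the situation a pixelwise estimate is meant to capture.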
Traditional denoising methods may require some modifications to take advantage of pixelwise noise estimation. However, deep learning methods can use pixelwise noise estimation results naturally. Recent deep learning-based image denoising methods [
In this paper, we propose a deep residual convolutional neural network for pixelwise noise-level estimation. The main architecture simply consists of a stack of customized residual blocks, which are well suited to noise estimation. The proposed architecture uses neither pooling nor convolution with a larger stride. Such operations are intended to extract high-level features, such as semantic information, and these features are unnecessary for noise estimation, which relies on low-level features, e.g., boundaries and local variance. Given a noisy image, the proposed method is able to produce a pixelwise noise-level estimation map as well as a global scalar noise level. The contributions of this study are as follows: Although many works have pointed out that noise standard deviation is not uniform across the image [ Deep convolutional neural networks, built from several carefully designed residual blocks, are adopted to provide pixel-to-pixel predictions. In terms of traditional scalar average estimation error, the proposed method is able to compete with the state-of-the-art methods [
The challenge of noise-level estimation lies in distinguishing high-frequency noise from high-frequency image details. To overcome the difficulty, traditional methods often divide an image into fixed-size patches and search for flat parts to estimate the noise level [
Lee and Hoppel [
In recent years, PCA-based approaches [
All methods above compute a scalar estimation, which works well with traditional denoising methods including BM3D [
Recent works on color image noise estimation consider noise variance difference on different color channels [
Recent deep learning-based image denoising methods [
Deep denoising model: the recent model takes a noisy image and a noise-level map as the input and produces an estimated clean output.
Currently, the noise-level map of deep denoising methods is filled with constant values. However, equation (
Let the noisy RGB image be denoted by
The target of this paper is to learn a dense corresponding mapping function
The goal of this paper is to produce a pixelwise noise-level estimation map, with which we can visualize the noise level over the image. We can also compute a global noise-level scalar from the map and compare the proposed method with traditional estimation methods.
The 2D convolution samples a regular grid
Fully convolutional networks (FCNs) inherit the correspondence property from 2D convolution and are therefore good at modeling dense correspondence problems on images. For example, FCNs are suitable for image segmentation [
Residual networks [
There are a few layer operations that are not suitable for image denoising. Recent deep denoising methods [
A deep residual noise estimation (DRNE) convolutional neural network is proposed. Figure
(a) Whole network architecture. The input is a noisy RGB image with three channels, which goes through a stack of residual blocks followed by a convolution and ReLU [
As for implementation, the sizes of all convolution kernels are
Training images are required to be clean and richly textured. We select the first 4,000 images from the Waterloo exploration dataset [
In theory, every training image needs to be randomly cropped and corrupted with signal-dependent noise levels. However, we find that corrupting every training image with only one noise level in
As a result, our method is trained from homogeneous noisy patches. However, it is able to produce signal-dependent pixelwise noise estimation results in Section
Pixelwise noise-level estimation results. (a, e) Clean images. (b, f) Images with artificial nonhomogeneous Gaussian white noise. The image is divided into four rectangular parts, to which noise is added in counterclockwise order at levels
In theory, equation (
(a) Unregularized loss; (b) regularized loss.
This phenomenon suggests that the loss function may require regularization. In the training stage, we feed the model with images corrupted by homogeneous noise. Thus, we use the mean value of the elements of the predicted noise matrix to regularize the loss:
After applying the regularization term, the training and evaluation losses become significantly more stable and converge more easily. Figure
We implement the model with TensorFlow [
Further ablation studies could be carried out on these parameters, and more intensive trials might yield additional performance gains. Since the experiments already make up a large part of this paper and the current parameter set performs well, we leave this for future work.
In this section, we give quantitative and qualitative evaluations of the proposed method. In the quantitative evaluations, we use the traditional scalar average error and standard deviation as evaluation criteria, since the other methods can only produce a scalar estimate. In the qualitative evaluations, we visualize the estimation map on both simulated data and real noisy images to show the effectiveness of the proposed method. In addition, we combine the compared noise estimation methods with two deep denoising methods to reveal the denoising performance gain brought by the DRNE.
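The two scalar criteria can be sketched as follows. This is only an illustration under an assumption: the error for each image is taken to be the absolute difference between the estimated and the true noise level, and the statistics are computed over the images of a dataset. The function name and the sample values are hypothetical.

```python
import numpy as np


def error_statistics(estimates, true_level):
    """Mean and standard deviation of the absolute estimation error
    over a set of images (assumed definition of the two criteria)."""
    errors = np.abs(np.asarray(estimates, dtype=np.float64) - true_level)
    return float(errors.mean()), float(errors.std())


# Hypothetical scalar estimates for four images at a true level of 10.
avg_err, std_err = error_statistics([9.5, 10.5, 11.0, 10.0], true_level=10.0)
print(avg_err, std_err)
```

A low average error indicates accuracy on the dataset as a whole, while a low standard deviation indicates that the accuracy is consistent from image to image.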
We compare the proposed method with state-of-the-art methods on three datasets: Kodak, McMaster [
Four methods, including Pyatykh’s method [
We first evaluate all methods on traditional homogeneous noise. Homogeneous Gaussian white noise of a fixed level is added to the clean RGB images of the three datasets. For a fair comparison, we implement a framework that performs noise addition and performance evaluation. Each compared method receives exactly the same noisy input images and returns its estimation results to the framework through a wrapper. Note that Pyatykh’s method [
Table
Average error on simulated homogeneous noise (dB).
Dataset | Noise level | Pyatykh [ | Liu [ | Chen [ | DRNE |
---|---|---|---|---|---|
Kodak (24 images) |
|
0.91 | 0.96 |
|
|
|
1.79 | 0.17 |
|
|
|
|
3.93 | 0.19 |
|
|
|
|
6.13 | 0.33 |
|
|
|
|
8.42 | 0.55 |
|
|
|
|
10.87 | 0.90 |
|
|
|
|
|||||
McMaster (18 images) |
|
|
|
1.60 | 0.83 |
|
1.70 |
|
0.23 |
|
|
|
4.12 | 0.65 |
|
|
|
|
6.93 | 1.22 |
|
|
|
|
9.84 | 1.89 |
|
|
|
|
12.73 | 2.65 |
|
|
|
|
|||||
BSD500 (500 images) |
|
0.52 |
|
0.54 |
|
|
1.66 | 0.18 |
|
|
|
|
3.70 | 0.35 |
|
|
|
|
5.93 | 0.57 |
|
|
|
|
8.30 | 0.90 |
|
|
|
|
10.79 | 1.33 |
|
|
Bold font denotes the best performance and italics denote the second-best performance.
Table
Standard deviation of error on simulated homogeneous noise (dB).
Dataset | Noise level | Pyatykh [ | Liu [ | Chen [ | DRNE |
---|---|---|---|---|---|
Kodak (24 images) |
|
0.45 | 0.30 |
|
|
|
0.51 | 0.26 |
|
|
|
|
1.00 | 0.49 |
|
|
|
|
1.46 | 0.67 |
|
|
|
|
1.91 | 0.86 |
|
|
|
|
2.31 | 1.07 |
|
|
|
|
|||||
McMaster (18 images) |
|
|
|
0.71 | 0.24 |
|
|
0.18 | 0.28 |
|
|
|
0.52 | 0.46 |
|
|
|
|
0.87 | 0.76 |
|
|
|
|
1.11 | 1.09 |
|
|
|
|
|
1.42 | 1.37 |
|
|
|
|||||
BSD500 (500 images) |
|
0.77 | 0.26 |
|
|
|
1.12 | 1.31 |
|
|
|
|
0.80 | 0.42 |
|
|
|
|
1.25 | 0.61 |
|
|
|
|
1.71 | 0.85 |
|
|
|
|
2.17 | 1.11 |
|
|
Bold font denotes the best performance and italics denote the second-best performance.
In general, the performance of Chen’s method and the proposed DRNE on homogeneous noise estimation is quite close. The difference is that Chen’s method achieves a lower average error, while the proposed DRNE achieves a lower standard deviation.
For example, the average error of Chen’s method on the McMaster dataset at noise level 0 is 1.60, which is quite large. More specifically, for the first image of the McMaster dataset shown in Figure
We then evaluate all methods on nonhomogeneous noise. Clean images from the datasets are first divided into four rectangular parts. Then, noise of a different level is added to each of the four parts:
Note that this noise pattern is designed for easy comparison, since it is easy to visualize and allows a global scalar estimate to be computed. Real noisy images are evaluated later.
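The four-part noise pattern can be sketched as below. The noise levels used here are hypothetical placeholders rather than the levels of our experiments, and the quadrant ordering (top-left, bottom-left, bottom-right, top-right) is one assumed reading of "counterclockwise"; `add_quadrant_noise` is an illustrative name.

```python
import numpy as np


def add_quadrant_noise(clean, levels, seed=0):
    """Add Gaussian noise with a different standard deviation to each
    quadrant of `clean`, a float array of shape (H, W, C) on a 0-255 scale.

    `levels` holds four standard deviations, ordered top-left,
    bottom-left, bottom-right, top-right (assumed ordering).
    Returns the noisy image and the (H, W) ground-truth noise-level map.
    """
    rng = np.random.default_rng(seed)
    h, w = clean.shape[:2]
    hm, wm = h // 2, w // 2
    sigma_map = np.empty((h, w), dtype=np.float64)
    sigma_map[:hm, :wm] = levels[0]
    sigma_map[hm:, :wm] = levels[1]
    sigma_map[hm:, wm:] = levels[2]
    sigma_map[:hm, wm:] = levels[3]
    # Broadcast the per-pixel level over the color channels.
    noise = rng.standard_normal(clean.shape) * sigma_map[..., None]
    return clean + noise, sigma_map


clean = np.full((64, 64, 3), 128.0)
noisy, sigma_map = add_quadrant_noise(clean, levels=(5.0, 10.0, 15.0, 20.0))
# Area-weighted average of the four levels, usable as a scalar reference.
global_level = float(sigma_map.mean())
print(global_level)
```

Because the ground-truth map is piecewise constant, it can be compared directly with a predicted map, and its mean gives a scalar reference for methods that only output a single value.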
For traditional methods, the noise estimation results should be the weighted average of noise levels of all patches, which is
Tables
Average error on simulated nonhomogeneous noise (dB).
Dataset | Noise level | Pyatykh [ | Liu [ | Chen [ | DRNE |
---|---|---|---|---|---|
Kodak |
|
3.33 | 1.08 |
|
|
|
7.34 | 0.48 |
|
|
|
|
11.71 | 0.86 |
|
|
|
|
|||||
McMaster |
|
3.39 | 1.18 | 1.03 |
|
|
7.88 | 1.46 |
|
|
|
|
13.31 | 2.74 |
|
|
|
|
|||||
BSD500 |
|
3.15 | 1.47 | 0.82 |
|
|
7.12 | 0.81 |
|
|
|
|
11.56 | 1.38 |
|
|
Bold font denotes the best performance and italics denote the second-best performance.
Standard deviation of error on simulated nonhomogeneous noise (dB).
Dataset | Noise level | Pyatykh [ | Liu [ | Chen [ | DRNE |
---|---|---|---|---|---|
Kodak |
|
1.67 | 0.64 |
|
|
|
1.40 | 0.68 |
|
|
|
|
2.45 | 1.10 |
|
|
|
|
|||||
McMaster |
|
0.61 | 0.70 |
|
|
|
0.76 | 1.03 |
|
|
|
|
|
1.58 | 1.35 |
|
|
|
|||||
BSD500 |
|
1.03 | 0.86 | 0.14 |
|
|
1.11 | 0.77 |
|
|
|
|
1.94 | 1.20 |
|
|
Bold font denotes the best performance and italics denote the second-best performance.
The proposed DRNE is not only able to give a scalar prediction but also able to produce a pixelwise noise-level map. Figures
To show the effectiveness of the proposed method on real images, we also evaluate our method on real linear RGB images. These images are captured using mobile phones and saved in raw (DNG) format. Then, they go through early stages of the image processing pipeline without denoising, brightness adjustment, or gamma correction. Figures
Noise estimation results on a real linear RGB image. In general, the bright part in images suffers from stronger noise, which is in accordance with the noise model (
In general, the dark part in images suffers from weaker noise, which is in accordance with the noise model (
We mentioned in the contributions that pixelwise noise estimation is expected to improve the performance of deep learning denoising methods [
First, we prepare two existing deep denoising models [
Previous evaluations have shown that the performance of Chen’s method [
Table
Applying noise estimation methods to deep learning models to remove homogeneous noise (dB).
Dataset | Noise level | Chen [ | DRNE + DRDD | Chen + FFDNet | DRNE + FFDNet |
---|---|---|---|---|---|
Kodak |
|
|
|
39.80 | 39.80 |
|
34.84 |
|
34.84 |
|
|
|
32.99 | 33.02 | 33.04 |
|
|
|
|||||
McMaster |
|
|
39.15 | 38.84 | 38.84 |
|
34.74 |
|
34.65 | 34.67 | |
|
32.26 |
|
32.34 | 32.42 | |
|
|||||
BSD500 |
|
|
39.35 | 39.36 | 39.35 |
|
34.26 |
|
34.23 | 34.24 | |
|
32.34 | 32.39 | 32.36 |
|
Table
Applying noise estimation methods to deep learning models to remove nonhomogeneous noise (dB).
Dataset | Noise level | Chen [ | DRNE + DRDD | Chen + FFDNet | DRNE + FFDNet |
---|---|---|---|---|---|
Kodak |
|
|
39.90 | 39.84 | 39.84 |
|
34.97 |
|
34.97 |
|
|
|
33.04 | 33.07 | 33.10 |
|
|
|
36.29 | 36.31 | 36.38 |
|
|
Noise model ( |
33.30 |
|
33.24 |
|
|
|
|||||
McMaster |
|
39.62 |
|
39.30 | 39.32 |
|
35.04 |
|
34.93 | 34.95 | |
|
32.30 | 32.39 | 32.38 |
|
|
|
34.66 |
|
34.61 | 34.67 | |
Noise model ( |
|
33.18 | 33.14 | 33.14 | |
|
|||||
BSD500 |
|
|
39.46 | 39.44 | 39.43 |
|
34.28 |
|
34.25 | 34.26 | |
|
32.34 | 32.38 | 32.35 |
|
|
|
35.66 |
|
35.64 | 35.66 | |
Noise model ( |
33.01 |
|
32.92 | 33.17 |
Text in bold denotes the best method. Text in italics denotes significant performance gains of the DRNE against Chen’s method.
Average running time (seconds).
Dataset | Pyatykh (CPU) | Liu (CPU) | Chen (CPU) | DRNE (CPU/GPU) |
---|---|---|---|---|
McMaster | 1.75 | 2.20 |
|
5.31/1.16 |
When the real noise model is adopted, the DRNE achieves a significant performance gain on the Kodak and BSD500 datasets. On the McMaster dataset, the performance of the two methods is close. We can conclude from Table
Figure
Applying noise estimation methods to deep denoising of nonhomogeneous noise. (a, b) Visual results of the test case noise model (
In this section, we show with both quantitative and qualitative results that, when working with a CNN denoising model, the proposed DRNE is generally better at removing nonhomogeneous noise.
In order to handle large images, the DRNE crops images into patches of a fixed size and processes them sequentially. For a fair comparison, we use the McMaster dataset for evaluation and set the crop size the same as the input image size
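The patch-by-patch processing can be sketched as follows. This is a minimal illustration, not our implementation: it assumes non-overlapping tiles, and `estimator` is a hypothetical callable standing in for the trained network, mapping a patch of shape (h, w, C) to a noise-level map of shape (h, w).

```python
import numpy as np


def estimate_by_tiles(image, estimator, tile=64):
    """Run `estimator` on non-overlapping tiles of `image`, one tile at a
    time, and stitch the per-tile maps into a full (H, W) noise-level map.

    Tiles at the right and bottom borders may be smaller than `tile`.
    """
    h, w = image.shape[:2]
    out = np.zeros((h, w), dtype=np.float64)
    for top in range(0, h, tile):
        for left in range(0, w, tile):
            patch = image[top:top + tile, left:left + tile]
            ph, pw = patch.shape[:2]
            out[top:top + ph, left:left + pw] = estimator(patch)
    return out


def constant_estimator(patch):
    """Placeholder estimator that predicts level 7.0 at every pixel."""
    return np.full(patch.shape[:2], 7.0)


image = np.random.default_rng(0).random((100, 130, 3))
noise_map = estimate_by_tiles(image, constant_estimator, tile=64)
print(noise_map.shape)
```

Processing tiles sequentially bounds the peak memory by the tile size rather than by the image size, at the cost of one estimator call per tile.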
Table
In this paper, we propose a deep residual convolutional neural network named “DRNE” for Gaussian noise-level map estimation of images. The main architecture consists of a stack of carefully designed customized residual blocks. Given a noisy image, the proposed DRNE is able to produce a pixelwise noise-level estimation map as well as an overall scalar noise level.
Experiments show that the proposed DRNE is able to compete with state-of-the-art methods such as Chen’s method on traditional scalar noise-level estimation. In addition, the DRNE is able to produce a signal-dependent noise-level map, which is in accordance with the linear RGB image noise model (
Pixelwise noise-level estimation is helpful for precise noise removal. Recent deep learning-based noise removal methods [
Deep learning might be the ultimate method for separating high-frequency image details from noise, since it automatically mines patterns from image data. For future work, we believe that jointly training the deep noise estimation model and the deep denoising model could surpass all traditional methods in image denoising.
The three datasets Kodak, McMaster, and BSD500 are publicly available at
The authors declare that they have no conflicts of interest.
This work was partially supported by NSFC (National Natural Science Foundation of China) under Grant 61703415.