An Objective Evaluation Method for Image Restoration

Accurate image restoration is of paramount importance for low-level vision, computer vision, and various other fields. Numerous complex restoration algorithms have been proposed in the literature. The performance of these restoration algorithms varies with the nature of the image and the distortion. These algorithms are evaluated either qualitatively or quantitatively by comparing the restored image with the original image. The practical drawback of this quantitative comparison is that it requires the original image. In this paper, we propose a theoretically grounded measure that objectively evaluates restoration algorithms without any knowledge of the original image. This measure analyses the deblurring as well as the denoising nature of the restoration method. Its main utility will be in designing automatic restoration systems. The effectiveness of this measure is substantiated with experimental results.


Introduction
Images become blurred when there is a smearing of the point spread function caused by certain imaging conditions such as camera motion, object motion, atmospheric turbulence, and so forth. In addition, sensor noise leads to additional degradation of the image. Image restoration [1,2] is concerned with estimation of the undistorted image from the given blurred and noisy image. It is of utmost importance in medical imaging, astronomy, and outdoor surveillance photography, and it is a frequent preprocessing step for high-level vision. Image restoration for linear spatially invariant degradation is generally modelled as

$$g = Hf + n, \tag{1}$$

where g, f, and n are lexicographically ordered vectors of the blurred image, the original image, and the noise, respectively. H is the convolution matrix of the linear, spatially invariant kernel h.

Numerous image restoration algorithms [3-8] have emerged utilizing a wide variety of regularizations, derived from some a priori knowledge about the original image. Image restoration is a complex problem without an exact solution. The performance of a restoration algorithm varies across sets of images, with a dynamically changing environment, and with the values chosen for its own parameters [9]. These algorithms have a regularization parameter that controls the trade-off between data fidelity and the noise reducing capability of the algorithm. Too large a value of the regularization parameter distorts the signal, whereas too small a value results in a singularity problem followed by noise magnification [1]. Thus, ranking of the algorithms is of paramount importance. Visual evaluation of a restoration algorithm is subjective (it depends on the evaluator) and puts extra load on users. An ability to numerically evaluate multiple image restoration algorithms for the given blurred image is necessary to make image restoration adaptive.
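The linear spatially invariant degradation model, in which the blurred image is the original image convolved with the kernel h plus additive noise n, can be simulated with a minimal Python sketch. The function names, the circular boundary handling, and the Gaussian noise model are assumptions made for this illustration only; the paper does not prescribe them.

```python
import random


def convolve2d_circular(image, kernel):
    """Convolve `image` with `kernel` (lists of lists), wrapping at the borders."""
    rows, cols = len(image), len(image[0])
    krows, kcols = len(kernel), len(kernel[0])
    cy, cx = krows // 2, kcols // 2  # kernel centre
    out = [[0.0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            acc = 0.0
            for i in range(krows):
                for j in range(kcols):
                    # Circular indexing keeps the blur spatially invariant.
                    acc += kernel[i][j] * image[(y - i + cy) % rows][(x - j + cx) % cols]
            out[y][x] = acc
    return out


def degrade(image, kernel, noise_sigma, seed=0):
    """Return g = h * f + n, with n drawn as zero-mean Gaussian noise (assumed)."""
    rng = random.Random(seed)
    blurred = convolve2d_circular(image, kernel)
    return [[v + rng.gauss(0.0, noise_sigma) for v in row] for row in blurred]
```

With a 1 × 1 kernel of value 1 and zero noise the function returns the input unchanged, which is a quick sanity check of the indexing.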
The improvement in the quality of the estimated image $\hat{f}$ over the blurred image is measured in terms of improvement in signal-to-noise ratio (ISNR) as [10]

$$\mathrm{ISNR} = 10 \log_{10} \frac{\|f - g\|^2}{\|f - \hat{f}\|^2}. \tag{2}$$

The ISNR can be computed only if the original image is accessible, which makes this objective evaluation impossible in real-time scenarios. Despite significant advances in the field of image restoration, evaluation of these techniques has, by and large, been subjective under real field conditions.
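The ISNR can be sketched in a few lines of Python. The sketch assumes the commonly used definition, ten times the base-10 logarithm of the ratio between the squared error of the blurred image and the squared error of the restored image, both taken against the original. The function name and the flat-list image representation are illustrative choices.

```python
import math


def isnr(original, blurred, restored):
    """Improvement in SNR (dB) of `restored` over `blurred`, given `original`.

    All three arguments are flat sequences of pixel values of equal length.
    """
    err_blurred = sum((f - g) ** 2 for f, g in zip(original, blurred))
    err_restored = sum((f - fh) ** 2 for f, fh in zip(original, restored))
    if err_blurred <= 0 or err_restored <= 0:
        raise ValueError("ISNR is undefined when either error energy is zero")
    return 10.0 * math.log10(err_blurred / err_restored)
```

The original image is a required argument, which is exactly the dependency that prevents the use of ISNR in the field.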
In this paper, we introduce a new information theoretic approach (based on entropy) to evaluate restoration. Our method provides a score to evaluate different parameterizations of a single restoration algorithm as well as fundamentally different restoration algorithms. The measure does not demand any knowledge about the original image, which makes it feasible to automate the restoration process in the field. The proposed measure quantifies the noise reducing as well as the deblurring capability of restoration algorithms.
The advantages of the proposed objective evaluation are the following: (i) it can automate the restoration process in real time, as it does not demand any knowledge of the original image; (ii) it ranks the algorithms based on data fidelity as well as denoising capability; and (iii) it does not assume the statistics of the noise or the image.
Section 2 introduces the proposed empirical measure and justifies the use of entropy as an evaluation function. Sections 3 and 4 furnish the theoretical and empirical results, respectively, to substantiate the use of the given measure in judging restoration algorithms.

New Evaluation Method
This section presents our information theoretic approach to restoration evaluation and justifies the use of each term of the proposed measure in the evaluation process.
A good restoration method should estimate images as close as possible to the original image. The deviation of the estimate from the original image is a quantitative measure of the data fidelity of the restoration algorithm. Thus, the data fidelity of restoration algorithms can be quantified with a function φ of the residual image r as given in (3),

$$\phi(r), \qquad r = g - \hat{h} * \hat{f}, \tag{3}$$

where $\hat{f}$ and $\hat{h}$ are the estimated image and blur, respectively, and * denotes convolution.
Maximum likelihood (ML) estimation or least square (LS) estimation could be used as φ of the residue. There are disadvantages associated with the use of ML or LS. ML can be applied only when the exact distribution of the residue is known a priori, which is not possible in real-time scenarios. LS works best for data that does not contain a large number of random residual errors with extreme values. The main disadvantage of LS is its sensitivity to outliers. Outliers have a large influence on the results, as squaring the residue magnifies the effect of these extreme data points.
Due to the inherent disadvantages of ML and LS, we use entropy (H) as φ of the residue. Entropy is robust to outliers and it is a nonparametric approach. This is of particular interest in scenarios where the statistics of the underlying residual error are not available a priori. In addition to the entropy of the residue, we consider the entropy of the estimated image and the blur to quantify the denoising capability of restoration algorithms (explanation in Section 3.2). Thus, the proposed measure turns out to be

$$H(r) + H(\hat{f}) + H(\hat{h}). \tag{4}$$

The restoration algorithm that gives the minimum value of the above expression can be concluded to be the best restoration algorithm. The role played by each entropy term is given in detail in the forthcoming sections.
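The contrast between LS and entropy with respect to outliers can be illustrated with a small Python sketch. The residue values below are made up for the illustration, and the histogram-based entropy estimate in bits is an assumption of the sketch. The LS cost grows with the magnitude of a single outlier, whereas the entropy depends only on how often each value occurs.

```python
import math
from collections import Counter


def least_squares_cost(residue):
    """Sum of squared residual values."""
    return sum(v * v for v in residue)


def entropy_bits(values):
    """Shannon entropy (bits) of the empirical distribution of `values`."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical residues that differ only in the size of one outlier.
mild_outlier = [1, -1, 1, -1, 0, 0, 0, 10]
extreme_outlier = [1, -1, 1, -1, 0, 0, 0, 1000]

ls_mild = least_squares_cost(mild_outlier)        # 104
ls_extreme = least_squares_cost(extreme_outlier)  # 1000004
h_mild = entropy_bits(mild_outlier)
h_extreme = entropy_bits(extreme_outlier)         # identical to h_mild
```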

Justification of the Derived Measure
Equation (4) comprises three terms. The first term is a measure of the data faithfulness; the second and third terms are measures of the noise content of the restored image and blur. So, the first term is called the Data Fidelity term (DF); the second and third terms are called the Noise Assessing terms (NA_1 and NA_2), respectively.
3.1. DF Term. This section proves the significance of the DF term in quantifying the deblurring effect of the given algorithm. Writing Hf in (1) as the convolution h * f, the residue r can be written as

$$r = g - \hat{h} * \hat{f} = x + n, \tag{5}$$

where

$$x = h * f - \hat{h} * \hat{f}. \tag{6}$$

If the restoration algorithm completely removes the noise n from the estimates $\hat{h}$ and $\hat{f}$, then x and n are independent of each other. In such scenarios, the minimum value of H(r) is H(n), which is proved in the following paragraph.
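Let us consider the mutual information (MI) between x and r:

$$I(x; r) = H(r) - H(r \mid x). \tag{7}$$

As n and x are assumed to be independent of each other, and r = x + n,

$$H(r \mid x) = H(x + n \mid x) = H(n \mid x) = H(n). \tag{8}$$

Substituting (8) in (7),

$$I(x; r) = H(r) - H(n). \tag{9}$$

The non-negativity of MI implies

$$H(r) \geq H(n). \tag{10}$$

Thus, H(n) is the minimum value that H(r) can attain when the estimates are independent of the input noise n.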
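The bound can be checked numerically on a toy discrete example. In the Python sketch below, the probability mass functions for the noise n and the estimation error x are hypothetical, and x and n are taken to be independent, so the distribution of r = x + n is the convolution of the two distributions.

```python
import math


def entropy_bits(pmf):
    """Shannon entropy (bits) of a pmf given as {value: probability}."""
    return -sum(p * math.log2(p) for p in pmf.values() if p > 0)


def pmf_of_sum(pmf_a, pmf_b):
    """Distribution of a + b for independent discrete a and b."""
    out = {}
    for a, pa in pmf_a.items():
        for b, pb in pmf_b.items():
            out[a + b] = out.get(a + b, 0.0) + pa * pb
    return out


noise = {-1: 0.25, 0: 0.5, 1: 0.25}          # hypothetical noise n
perfect_x = {0: 1.0}                         # perfect estimates: x = 0
imperfect_x = {-2: 0.3, 0: 0.4, 2: 0.3}      # hypothetical estimation error

h_noise = entropy_bits(noise)                             # 1.5 bits
h_r_perfect = entropy_bits(pmf_of_sum(perfect_x, noise))  # equals H(n)
h_r_imperfect = entropy_bits(pmf_of_sum(imperfect_x, noise))  # exceeds H(n)
```

The perfect estimate attains the lower bound H(n), while any independent estimation error can only raise H(r).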
When the image and the blur are estimated perfectly, x = 0 and H(r) becomes H(n), the minimum value of H(r). This proves the legitimacy of H(r) in estimating the data faithfulness. H(r) alone will suffice for assessing the restoration algorithms that remove the input noise completely from the estimated image and the blur. If a restoration algorithm fails to remove the noise totally from the estimates, the assumption of x and n being independent is violated, (10) may not hold, and there may be situations (as given in Section 3.2) where a noisy estimate yields a lower H(r) than the perfect estimates.

(11) and thus leading to H(r) as given in (12). Algo 1 gives a perfect estimate of the image and blur, whereas algo 2 has amplified the noise in the process of deblurring:

$$H(r_2) < H(r_1). \tag{13}$$

By (13), H(r) of algo 2 is lower than that of algo 1, indicating algo 2 to be the best algorithm, which is not true. This shows the ineffectiveness of the DF term in assessing the restoration algorithm in the case of noisy estimates. To prevent noisy estimates from being judged as good estimates, we need the additional NA terms. The NA terms quantify the noise reducing capability of the given algorithm.

Entropy is a measure of disorder in the image. The randomness of noise leads to higher entropy, whereas natural images, which are not as random as noise, possess smaller entropy [11,12]. That is, the entropy of an image corrupted with independent noise n will be greater than that of the original image:

$$H(f + n) > H(f). \tag{14}$$
To prove (14), let us consider the MI between y = f + n and n:

$$I(y; n) = H(y) - H(y \mid n) = H(f + n) - H(f + n \mid n). \tag{15}$$

As f and n are independent of each other, H(f + n | n) = H(f | n) = H(f), and the above equation becomes

$$I(y; n) = H(f + n) - H(f). \tag{16}$$

The non-negativity of MI gives H(f + n) ≥ H(f), with strict inequality whenever the MI is nonzero, that is, the entropy of the image with additive noise is greater than that of the original image. This justifies the NA terms in the proposed measure as a means to assess the denoising capability of restoration algorithms.

Minimum Description Length (MDL) Principle.
An alternative insight into our evaluation method can be obtained by applying the MDL principle [13] to balance the trade-off between data fidelity and noise removal. Our evaluation formula has two description lengths: (i) H(r) and (ii) $H(\hat{f}) + H(\hat{h})$. Restoration algorithms can be broadly classified as follows: (i) algorithms that find a very good data fit (smaller H(r)) but noisy estimates (larger $H(\hat{f}) + H(\hat{h})$); (ii) algorithms that find a rather bad data fit (larger H(r)) but denoised estimates (smaller $H(\hat{f}) + H(\hat{h})$). The sum of the two description lengths will be minimized by the restoration algorithm whose estimates are fairly (but not overly) denoised, with a good (but not overly perfect) data fit. This has been validated with empirical studies in the forthcoming section.
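The selection rule implied by this trade-off can be sketched in Python. The entropy values below are invented purely for illustration and are not results from the paper; the function names are likewise hypothetical. Each candidate is scored by the sum of its two description lengths, and the candidate with the smallest sum is ranked first.

```python
def total_description_length(h_residue, h_image, h_blur):
    """Sum of the data-fit length H(r) and the estimate length H(f^) + H(h^)."""
    return h_residue + (h_image + h_blur)


def rank_algorithms(entropy_terms):
    """Return algorithm names ordered from best (smallest sum) to worst.

    `entropy_terms` maps a name to a tuple (H(r), H(f^), H(h^)).
    """
    return sorted(entropy_terms,
                  key=lambda name: total_description_length(*entropy_terms[name]))


# Hypothetical entropy terms (bits) for three candidate restorations.
candidates = {
    "good fit, noisy estimate": (1.0, 6.5, 2.0),      # sum 9.5
    "poor fit, smooth estimate": (5.5, 3.0, 1.5),     # sum 10.0
    "balanced": (2.5, 4.0, 1.8),                      # sum 8.3
}

ranking = rank_algorithms(candidates)
```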

Experimental Results
This section presents the experiments conducted to test and validate the efficacy of the proposed measure. We have studied the effectiveness of the proposed measure by verifying its consistency with the conventional Distortion Measure (DM, (17)). Table 1 provides the details of the input images (size 256 × 256 pixels), blur, type of additive noise, and the iterative deconvolution methods [14,15] used in the different experiments.
The entropies of the residue and blur kernel are estimated as given in (18):

$$H(r) = -\sum_i p_r(i) \log p_r(i), \qquad H(\hat{h}) = -\sum_i p_h(i) \log p_h(i), \tag{18}$$

where p_r(i) and p_h(i) are the probabilities that a pixel in the residue and in the estimated blur, respectively, has a luminance value i. To estimate the NA_1 term, $\hat{f}$ is divided into R subimages, indexed j = 1, ..., R, of size 16 × 16 pixels. NA_1 (19) is calculated as the average of the entropies of the subimages (20):

$$\mathrm{NA}_1 = \frac{1}{R} \sum_{j=1}^{R} H_j(\hat{f}), \tag{19}$$

$$H_j(\hat{f}) = -\sum_i p_{fj}(i) \log p_{fj}(i), \tag{20}$$

where $H_j(\hat{f})$ is the entropy of sub-image j and p_{fj}(i) is the probability that a pixel of sub-image j has luminance value i. A sub-image of any real image will be smooth except for the presence of noise. Thus, the entropy of the sub-images gives a more appropriate measure of the noise content in the image than the entropy of the image as a whole.
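The histogram-based entropy estimates and the block-wise NA_1 term can be sketched in Python as follows. The sketch assumes that the residue, the estimated image, and the estimated blur have already been quantized to integer luminance levels, and it uses base-2 logarithms; both choices, as well as the function names, are assumptions of this illustration.

```python
import math
import random
from collections import Counter


def histogram_entropy(pixels):
    """Entropy (bits) of the empirical luminance distribution of `pixels`."""
    counts = Counter(pixels)
    total = len(pixels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def blockwise_entropy(image, block=16):
    """Average entropy over non-overlapping block x block sub-images (NA_1)."""
    rows, cols = len(image), len(image[0])
    entropies = []
    for top in range(0, rows - block + 1, block):
        for left in range(0, cols - block + 1, block):
            patch = [image[y][x]
                     for y in range(top, top + block)
                     for x in range(left, left + block)]
            entropies.append(histogram_entropy(patch))
    return sum(entropies) / len(entropies)


def proposed_measure(residue, estimate, blur, block=16):
    """H(r) + NA_1 + H(h^) for quantized 2-D inputs given as lists of lists."""
    flat_residue = [v for row in residue for v in row]
    flat_blur = [v for row in blur for v in row]
    return (histogram_entropy(flat_residue)
            + blockwise_entropy(estimate, block)
            + histogram_entropy(flat_blur))


# A two-level test image: every 16 x 16 block is flat, so NA_1 is zero
# even though the entropy of the whole image is 1 bit.
smooth = [[50 if x < 16 else 200 for x in range(32)] for _ in range(32)]
rng = random.Random(0)
noisy = [[v + rng.randint(-3, 3) for v in row] for row in smooth]
```

On the two-level test image the whole-image entropy is nonzero although the image is noise free, while the block-wise average is zero and rises only once noise is added, which matches the motivation for using sub-images.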

4.1. Results of Experiment 1. This section gives the results of Exp 1a and 1b (Table 1). Figures 1(a Table 2 lists the proposed measure and the distortion measure calculated on the restored images of Exp 1a and 1b. Both the proposed measure and the DM conclude Wiener to be the best restoration method in Exp 1a and 1b, supporting the credibility of the proposed measure.

4.2. Results of Experiment 2. This section furnishes the results of the three experiments Exp 2a, 2b, and 2c (refer to Table 1). Table 3 gives the values of the proposed and distortion measures for Exp 2a. The proposed measure agrees with the distortion measure, supporting the efficacy of the proposed measure.
Figure 3 is the plot of the normalized cross correlation (21) between the proposed measure and the DM of Exp 2b:

$$\rho_{pd}(l) = \frac{r_{pd}(l)}{\sqrt{r_{pp}(0)\, r_{dd}(0)}}, \tag{21}$$

where r_pd(l) is the cross correlation between the proposed measure and the distortion measure at lag l. Lag refers to the lag in the ordering of the DM values in the experimental table. r_pp(0) and r_dd(0) are the autocorrelations of the proposed and distortion measures, respectively, at lag 0. The curve in Figure 3 reaches its maximum value at lag 0, indicating that the proposed and distortion measures follow the same trend in their value fluctuation. Figure 4 is the result of Exp 2c. As can be seen, the minimum values of the proposed measure and the distortion measure are obtained at the same lambda (= 10), indicating lambda = 10 to be the optimum parameter for the given blurred image.
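The normalized cross correlation between two sequences of scores can be sketched in Python as follows. The sketch assumes the raw (not mean-subtracted) correlation sum, a particular sign convention for the lag, and made-up score values; none of these details are stated in the paper.

```python
import math


def cross_correlation(p, d, lag):
    """Sum over k of p[k] * d[k + lag], restricted to valid indices."""
    n = len(p)
    return sum(p[k] * d[k + lag] for k in range(n) if 0 <= k + lag < n)


def normalized_cross_correlation(p, d, lag):
    """Cross correlation at `lag`, normalized by the lag-0 autocorrelations."""
    norm = math.sqrt(cross_correlation(p, p, 0) * cross_correlation(d, d, 0))
    return cross_correlation(p, d, lag) / norm


# Hypothetical scores: the second sequence is a scaled copy of the first,
# so the two follow the same trend and the peak is expected at lag 0.
proposed_scores = [1.0, 3.0, 2.0, 5.0, 4.0]
distortion_scores = [2.0 * v for v in proposed_scores]

lags = range(-4, 5)
curve = {l: normalized_cross_correlation(proposed_scores, distortion_scores, l)
         for l in lags}
peak_lag = max(curve, key=curve.get)
```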

Conclusion
We have proposed a measure to analyse the performance of restoration algorithms for the given blurred image, which (i) does not demand any knowledge about the original image and (ii) is derived without any assumption about the noise or image statistics. This helps in automating the restoration process without the intervention of the user. The given measure has a data fidelity term and noise assessing terms, thus analysing the deblurring as well as the denoising nature of the restoration method. The validity of the proposed measure was substantiated with different sets of empirical results.

Figure 3: Cross correlation between Proposed and DM.

Table 1: Details of the experiments conducted.

Table 2: Comparison of Proposed and Distortion measure of Experiments 1a and 1b.

Table 3: Comparison of Proposed and Distortion measure of Experiment 2a.