A Variational Bayesian Superresolution Approach Using Adaptive Image Prior Model

1School of Computer Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, China 2The Key Laboratory of Image Processing and Intelligent Control, Ministry of Education, Wuhan 430074, China 3Sino-US Intelligent Information Processing Joint Lab, Anyang Normal University, Anyang 455000, China 4Center for Machine Vision and Security Research, Kennesaw State University, Marietta, GA 30060, USA


Introduction
Superresolution (SR) technique [1][2][3][4][5] is an important branch in image fusion technology which targets the reconstruction of a high-resolution (HR) image from a set of degraded low-resolution (LR) images.The LR images are usually affected by warping, blurring, downsampling, and noising.Due to advancement of the technology, superresolution has been widely used in a broad range of applications such as computer vision, medical imaging, public safety, and military reconnaissance.
SR is a typically ill-posed inverse problem that is hard to solve without the introduction of some prior image information [6][7][8].Thus, a number of regularization-based SR approaches have been proposed [9][10][11] by incorporating the prior knowledge of the unknown high-resolution image in the regularization strategy [12].The authors in [9] proposed a regularization-based SR approach based on the Tikhonov regularizer using 2 norm.It can remove noises effectively; however, it also blurs the image edges.Farsiu et al. [10] proposed the bilateral total variation (BTV) regularizer, which penalized gradient magnitudes measured by 1 norm, with the intention of preserving edges in the reconstruction.However, the BTV regularizer often produces artifacts in the smoothed regions due to the use of 1 norm [11].In [11], the authors improved BTV with the norm (, ) =  √  2 +  2 −  2 , where  is the gradient value and  is a positive scale parameter, and proposed the bilateral edge-preserving (BEP) regularizer.This new norm can adaptively use 1 and 2 norms, and the transition from 1 to 2 can be controlled by modifying the positive scale parameter .Unfortunately, the BEP regularizer employed a fixed scale parameter for the whole image and ignored the local image features.With this drawback, it cannot well control the level of punishment to local gradients.
It is well known that accurate registration which refers to the estimation of motion information is a very important factor in the success of the SR reconstruction method [13].The above SR methods only registered the observations once before reconstructing the HR image.This is not a robust method.The error caused by an inaccurate registration would produce poor effects in the reconstruction process [13].Other than the motion parameters, the hyperparameters related to the image prior model and the noise model could intimately affect the image reconstructing quality [12].Recently, the variational Bayesian method has been used for SR [14][15][16].This method approximates the joint posterior distribution of the unknowns, including the HR image, the motion parameters, and the hyperparameters, using a tractable distribution by minimizing the Kullback-Leibler (KL) divergence and by estimating the unknowns simultaneously in each iteration.
The Bayesian approach uses a prior model to estimate a priori knowledge of the unknown HR image.For edgepreservation, the variational Bayesian approaches based on TV prior model [14] and ℓ1 prior model [16] have been proposed.In these two prior models, the gradient magnitudes are measured by the 1 norm, and consequently both of them can preserve edges.Compared with the TV prior, the ℓ1 prior contains two model parameters, which constrain certain preferred edge directions [15,16].However, the 1 norm may not always distinguish the real image features from the effects of errors caused by inaccurate registration and additive Gaussian noise.Therefore, some artifacts will be produced in the smoothed image regions.Villena et al. made progress by combining the TV prior model and simultaneous autoregressive (SAR) (i.e., based on 2 norm) prior model [15].A similar approach is also applied to the ℓ1 prior model.The problem is that it is difficult to choose a proper parameter to balance the priors in the combination.
In this paper, a variational Bayesian estimate of the HR image, the motion parameters, and the hyperparameters are proposed and derived and then given a set of LR images.In order to overcome the difficulty of aforementioned SR methods, an adaptive image prior model is proposed, in which the degree of interaction between pixels is adjusted adaptively by using the norm (, ) in order to preserve edges and remove noises.It is important to note that we propose a method for automatically estimating the scale parameters (i.e., ) to the proposed prior model.Information needed to determine these scale parameters is updated iteratively based on the available estimated HR image.Experiment results show the effectiveness of the proposed SR method.
The rest of this paper is organized as follows: the observation model and a Bayesian framework for image SR are described in Section 2. The adaptive image prior model is sketched in Section 3. The proposed SR approach is presented in Section 4. Experimental results are demonstrated in Section 5. Finally, the conclusion is given in Section 6.In addition, Appendix is presented at the end of the paper showing the solving process in detail.

The Problem Formulation
We formulate the problem to be solved in this section.The formulation is divided into the observation model and hierarchical Bayesian framework which are discussed below.

The Observation Model
. Formulating an observation model that relates the original HR image to the LR observations is the first step to fully analyze the SR reconstruction problem.
Let  denote the original HR image of size  1  1 ×  2  2 , and each observed LR image is of size  1 ×  2 . 1 and  2 denote the downsampling factors in the horizontal and vertical directions, respectively.The HR image and LR images are written in lexicographical notations as the vectors of sizes  1  1  2  2 × 1 and  1  2 × 1, respectively.In this work, the image acquisition process is modeled by geometric transformation, blurring, downsampling, and adding with white Gaussian noise.Thus, the following popular matrix notation [15] is adopted to describe the acquisition process: where ) is the warp matrix, and   (the dimension of   is  1  2 × 1) is the additive white Gaussian noise.The motion that occurs during the image acquisition is represented by warp matrix (  ).In this paper, we assume that the motion includes global rotation and translation; that is,   = (  ,   ,   )  , where   ,   , and   are the rotation angle and the displacement in horizontal direction and vertical direction of the th HR image, respectively, relative to the reference frame.In this study, we assume that the blur matrices   are determined by sensors.The sensor PSF is usually modeled as a spatial averaging operator, and the characteristics of this blur are usually assumed to be known.Under these assumptions, the matrices   and (  ) have a block circulant with circulant block structure.The downsampling matrix  is determined by the downsampling factors  1 and  2 , and it generates LR images from the warped and blurred HR image.

The Hierarchical Bayesian
Framework.We use a hierarchical Bayesian framework [17] to model the acquisition process, the HR image , the motion parameters {  }, and the hyperparameters  and {  }.Parameters  and {  } are the model parameters of our proposed prior model and the noise model, respectively.Thus, we can obtain the following joint posterior distribution of , {  }, {  }, and  given V by using the Bayes rule: where (V | , {  }, {  }) represents the conditional distribution of the LR images V and ( | ), ({  }), ({  }), and () are prior distributions for the unknown image , the motion parameters {  }, and the hyperparameters {  } and , respectively.Suppose that the noise   in the LR image V  ( = 1, 2, . . ., ) is the white Gaussian noise with a zero-mean distribution   ∼ N(0,  −1  ), and we can write where ‖ ⋅ ‖ 2 denotes the 2 norm of a given vector ⋅.
Moreover, if we assume the statistical independence of noises among the LR images, we can obtain the conditional distribution of the LR images V By using the available registration algorithm, the initial values of motion parameters can be obtained.However, the obtained initial values are often inaccurate.In this paper, we refine the values of motion parameters by modeling them as stochastic variables following Gaussian distributions (  ∼ N( 0  ,   )), which is similar to the model used in [15].The prior distributions for the hyperparameters are defined to be Gamma distributions ( ∼ Γ(  ,   ),  ∈ {, {  }}).Gamma distributions for the hyperparameters were selected because they are conjugate for the variance of the Gaussian distribution; therefore, the posteriors will also have Gamma distributions in the Bayesian formulation [18].
Our proposed adaptive image prior model, denoted by ( | ), will be presented in the next section.

The Proposed Adaptive Image Prior Model
In this section, we propose an adaptive image prior model and then present a method to calculate the scale parameter  used in the proposed prior model.

The Adaptive Image Prior Model.
The adaptive norm (, ) =  √  2 +  2 −  2 is used to measure the horizontal and vertical gradients in which  is the gradient value and  is a positive scale parameter.These parameters are to determine the severity of penalizing image gradients.A new adaptive image prior model is proposed and defined as follows: where  =  1  1 ×  2  2 is the number of pixels in the HR image.Symbols ∇    and ∇    represent the horizontal and vertical gradient components, respectively, for the pixel ,  = { 1 ,  2 } is the hyperparameter of this prior controlling the degree of regularization, and  ,1 and  ,2 are the scale parameters of the adaptive norms measuring the horizontal and vertical gradient component, respectively, of pixel .
Thus, the expression  =   (10).In the SR process based on (10), when the LR pixel splits in both horizontal and vertical directions into a block of HR pixels, the large gradient magnitude in the vertical direction will be preserved due to the 1 norm, and the small gradient magnitude in the horizontal direction will be smoothed due to the 2 norm.Thus whenever there exists an edge along horizontal or vertical direction,  √  2 + | ⋅ | 2 −  2 can act as the 1 norm in the direction perpendicular to the edge, preserving large gradients, and, at the same time,  √  2 + | ⋅ | 2 −  2 can act as the 2 norm in the direction along the edge, obtaining relatively ideal smoothing effects.In (12), when |∇   | → ∞ and |∇   | → ∞, the 1 norm acts on both horizontal and vertical directions, to preserve the edges.In (13), when |∇   | → 0 and |∇   | → 0, the noise will be effectively removed.

Adaptive Calculation of Parameter 𝑎.
It is known that the gradients produced by edges should be preserved, while the gradients produced by errors, which suffer from inaccurate registration and additive Gaussian noise, need to be smoothed.Thus, the severity of penalizing image gradients should be determined according to the local image features.In (, ), the scale parameter, , can control the severity of penalizing local gradients.For the gradients produced by errors, 2 norm is a good choice.In this case, the scale parameter should be set as a large value.When the value of the scale parameter increases, (, ) accepts a larger range of errors and handles them like 2 norm [11].Otherwise, it should be set as a small value.Therefore, the property of scale parameter should be constrained with the following conditions: (1) its value is determined according to the local image features, (2) the value is inversely proportional to gradient magnitudes, and (3) the value is larger than zero.
We use the well-known monotonically decreasing function [19] to calculate  ,1 , and  ,2 in (6), where |∇   | and |∇   | represent the magnitudes of horizontal and vertical gradient components, respectively, for the pixel .
We also obtain for the posterior distribution ({  }) the multivariate Gaussian with mean value and covariance In ( 23) and ( 24), Finally, we obtain the mean values of the hyperparameter distributions, which are used to estimate hyperparameters, At each iteration step, after obtaining the HR image, the scale parameters R are updated using (10).We now conclude our algorithm (Algorithm 1).

Experimental Results
We will give our experimental results in detail in this section.The experimental environment and parameter settings are presented in Section 5.1.The evaluation standards and simulated experimental results obtained by four popularly used SR approaches and our proposed SR approach are illustrated in Section 5.2.Some of the results on real dataset are given in Section 5.3.

General Description.
We compare the performance of our proposed approach with the bicubic approach, the BEP model [11] based approach, the ℓ1 model [16] based approach, the TV model [14] based approach, and the SAR model [15] based approach in terms of Peak Signal-to-Noise Ratio (PSNR), Structural Similarity (SSIM) values [21], and thorough visual inspection.The images presented in Figure 1 were chosen for the simulation test.We used MATLAB R2009a on a 3.0 GHz Pentium Dual core computer with 4.0 GB RAM.
We need to derive a set of LR images from original images in the experiments.These LR images should have been generated with subpixel motion, rotation, blurring, downsampling, and noise addition.In the experiments, we let   ∈ (−3 ∘ , 3 ∘ ),   ∈ (−1, 1),   ∈ (−1, 1),  = 1, 2, 3, 4, 5, and the transformation of LR images be variant from each other.3 × 3 average and Gaussian blurring operators with deviation equal to 1 were used in the simulation.The downsampling factors were  1 = 2 and  2 = 2. Finally, additive zero-mean white Gaussian noises were added to the LR images with Signalto-Noise Ratio (SNR) levels of 1 dB, 5 dB, 10 dB, and 30 dB, respectively.Thus, in each level, five LR images were obtained from each of the original images.The initial parameters used in the experiments were set as follows: our proposed method:  0 = (/4)/ ∑  √ 0  ; the SAR method: with Laplacian operator ℏ; the BEP method: , where  is the scale parameter,    and    shift  0  by  and  pixels in the horizontal and vertical directions, respectively, and these parameters are selected to obtain the best reconstructions; consider  0 = (/4)/∑  √ 0  .For all the methods, a bicubic interpolation of V 1 is used as initial value  0 ,  0  = /‖V  − (  ) 0 ‖ 2 2 , the initial motion parameters were estimated using the method proposed in [22],   = 0, and   =   = 0,  ∈ { 1 ,  2 , {  }}.In our work, the convergence of the algorithms was set with the same criterion ‖ +1 −   ‖ 2 2 /‖  ‖ 2 2 ≤ 10 −5 .

Simulation Experiments.
We used LR images derived from artificial HR images to test our proposed approach.
The PSNR and SSIM are used to evaluate the reconstruction quality of different approaches.The first set of experiments was with the high level of noise (i.e., SNR = 1 dB and SNR = 5 dB).For the second set of experiments, white Gaussian noises with SNR = 10 dB and SNR = 30 dB were used and the motion parameters were estimated in each iteration.
In the first set of experiments, we will show the improvement of our proposed approach in heavy noise (i.e., SNR = 1 dB and SNR = 5 dB).Tables 1 and 2 show the PSNR values of all the estimated HR images in heavy noise with average blur and Gaussian blur.Based on the PSNR values, our proposed method shows a good performance.Moreover, Tables 1 and  2 give the corresponding SSIM values that also demonstrate the efficiency of our proposed method.
For the second set of experiments, the motion parameters were estimated.In order to further illustrate the effectiveness of our approach, we will present the reconstruction results for "zebra" and "car" images in noise with SNR = 10 dB and SNR = 30 dB, respectively.Tables 3 and 4 show       the comparisons of PSNR values of bicubic approach, BEP approach, ℓ1 approach, SAR approach, TV approach, and our approach.Tables 3 and 4 also give the SSIM results of these approaches with average blur and Gaussian blur, respectively.From Tables 3 and 4, we can see the improvement of our proposed approach over other tested methods.Among all the approaches tested, the proposed method achieves the highest PSNR and SSIM values on both test images.For visual quality comparison, the error images, that is, the difference between the estimated HR image and the original image, are shown in Figures 2 and 3. Our proposed approach can preserve edge details well.Figures 2(g)-2(l) and 3(g)-3(l) show the corresponding error images to the reconstructed images obtained with different approaches.Brighter pixels represent a large error.From these error images, the difference between different SR approaches is     4 and 5, the HR images obtained by the bicubic and SAR approaches are still blurred.In addition, there exist artifacts in the HR images obtained by the BEP, ℓ1, and TV approaches.Our approach is superior to the approaches compared.

Conclusions
The errors caused by inaccurate registration and noises in the traditional regularization-based SR methods often produce unsatisfactory results in the reconstruction.Thus, the variational Bayesian method, which can simultaneously estimate the HR image, the motion parameters, and the hyperparameters, has been used to improve the reconstruction quality.However, the existing variational Bayesian approaches cannot adapt to local image features.Therefore, in this paper, a Bayesian SR approach is proposed by designing a new adaptive image prior model, based on an adaptive norm.
Our adaptive image prior model can be adjusted automatically based on the evaluation of the local image features; therefore, this new model not only preserves edge details but also avoids artifacts in the smoothed regions.We also propose a method for automatically estimating the scale parameters for the proposed adaptive image prior model.Information needed to determine these scale parameters is updated in each iteration step based on the available estimated HR image, and they are calculated by using a monotonically decreasing function.In our approach, the acquisition process, the HR image, the motion parameters, and the hyperparameters are modeled in a stochastic sense by using a hierarchical Bayesian framework.And all unknowns are estimated by employing the variational Bayesian inference.
The experimental results show that the HR images obtained with our SR approach are better than those previously tested.In the proposed adaptive image prior model, we only calculate the gradients of neighboring pixels at the top and left of the center pixel for fast computation.We will investigate further for the prior on the selection of 4neighbourhood or 8-neighbourhood in order to improve the proposed algorithm in our future work.

Figure 1 :
Figure 1: The 128 × 128 test images: (a) "zebra" and (b) "car." Error image of bicubic interpolation (h) Error image of SAR (i) Error image of ℓ1 (j) Error image of TV (k) Error image of BEP (l) Error image of ours

Figure 2 :
Figure 2: Results obtained by applying different approaches to LR zebra images corrupted with Gaussian blur and white Gaussian noise with SNR = 10 dB and their corresponding error images.Brighter pixels represent a large error in error image.
Error image of bicubic interpolation (h) Error image of SAR (i) Error image of ℓ1 (j) Error image of TV (k) Error image of BEP (l) Error image of ours

Figure 3 :
Figure 3: Results obtained by applying different approaches to LR car images corrupted with average blur and white Gaussian noise with SNR = 30 dB and their corresponding error images.Brighter pixels represent a large error in error image.

Figure 4 :
Figure 4: The reconstructed images for "Adyoron" sequence obtained by different approaches.A subimage is enlarged and displayed at the right bottom corner in the image.

Figure 5 :
Figure 5: The reconstructed images for "EIA" sequence obtained by different approaches.A subimage is enlarged and displayed at the right bottom corner in the image.

Table 1 :
Comparisons of PSNR and SSIM with average blur and white Gaussian noise with SNR = 1 dB and SNR = 5 dB.

Table 2 :
Comparisons of PSNR and SSIM with Gaussian blur and white Gaussian noise with SNR = 1 dB and SNR = 5 dB.

Table 3 :
Comparisons of PSNR and SSIM with average blur and white Gaussian noise with SNR = 10 dB and SNR = 30 dB.
5.3.Experiments on RealData.The performance of our approach is tested with real dataset.The datasets are those popular used video sequences, downloaded from

Table 4 :
Comparisons of PSNR and SSIM with Gaussian blur and white Gaussian noise with SNR = 10 dB and SNR = 30 dB.Milanfar's website https://users.soe.ucsc.edu/∼milanfar/software/sr-datasets.html.The first ten LR images were used to reconstruct the HR image.Figures4 and 5present the obtained HR images for two of the image sequences downloaded.In Figures