Multiframe Superresolution of Vehicle License Plates Based on Distribution Estimation Approach

Low-resolution (LR) license plate images or videos are often captured in practical applications. In this paper, a distribution-estimation-based superresolution (SR) algorithm is proposed to reconstruct the license plate image. Unlike previous work, the high-resolution (HR) image is estimated from its posterior probability distribution, which is obtained within the variational Bayesian framework. To regularize the estimated HR image, a feature-specific prior model is proposed that exploits the most significant characteristic of license plate images, namely, that the target has high contrast with the background. To ensure a successful SR reconstruction, models representing smoothness constraints on images are used together with the proposed feature-specific prior model to regularize the estimated HR image. Experiments under a challenging blur of size 7 × 7 and zero-mean Gaussian white noise with variances 0.2 and 0.5 show that the proposed method achieves a peak signal-to-noise ratio (PSNR) of 22.69 dB and a structural similarity (SSIM) of 0.9022 at variance 0.2, and a PSNR of 19.89 dB and an SSIM of 0.8582 even at variance 0.5, which are improvements of 1.84 dB and 0.04, respectively, in comparison with other methods.


Introduction
Nowadays, the intelligent transport system (ITS) is increasingly used to address traffic problems. The ITS applies advanced information technology, data communication and transmission technology, electronic sensor technology, control technology, and computer technology to the whole transportation management system effectively and efficiently. The vehicle license plate character recognition (VLPCR) system is one of the most important parts of the ITS and is widely used in traffic monitoring and control. However, if the license plate image is captured at low resolution, the license plate may be unreadable, and the ITS then cannot work well. Many factors degrade the acquired license plate images, such as downsampling, blurring, warping, and noise. Thus, the problem addressed in this paper is the use of the multiframe superresolution (SR) technique [1][2][3] to reconstruct license plate images with better quality.
The objective of multiframe SR is to fuse a sequence of low-resolution (LR) images representing the same scene into a single high-resolution (HR) image. Such SR techniques can be classified into three classes: (i) frequency domain approaches [4,5], (ii) interpolation approaches [6,7], and (iii) regularization approaches [8][9][10]. Among these, the regularization approach is widely studied. Because SR reconstruction problems are ill-posed, the basic idea of the regularization approach is to incorporate prior knowledge of the unknown HR image into the reconstruction process. Regularization approaches are either deterministic or stochastic. The former use prior models as regularization terms, and the latter use prior models to establish prior probability distributions.
The most commonly used prior models are the Tikhonov model [11], the total variation (TV) type model [12], and the Markov random field (MRF) model [13]. The Tikhonov model is based on the L2 norm. It penalizes noise heavily; however, it may blur the edges. The well-known TV type model penalizes the total amount of change in the image by using the L1 norm of the gradient. However, the TV type model cannot remove heavy noise completely, which may produce artifacts in the estimated HR images. In order to exploit the advantages of both the L1 and L2 norms, Suresh et al. proposed a discontinuity-adaptive Markov random field (DAMRF) prior model to reconstruct license plate images [14]. In [15], a generalized DAMRF prior model was used to make the license plate more legible. In [16], the authors proposed a bimodal prior model for text images and combined it with the Huber prior model. In these methods, the estimation of the motion parameters and the reconstruction of the HR image are separated and conducted independently, which is well known to be suboptimal. Moreover, only translational motion was considered in [14,15], which is not appropriate for many real situations. In [15], the authors proposed a method to estimate the regularization parameter; however, they did not consider estimating the noise variance, which is also important for reconstructing the HR image.
In this paper, a new method based on the variational Bayesian inference (VBI) estimator is proposed to perform the SR of vehicle license plates, in which the HR image, the motion parameters, and the hyperparameters are estimated jointly. The VBI estimator is a distribution estimation algorithm that can solve nonlinear, high-dimensional problems effectively [17].
The VBI framework used in this paper is similar to the ones used in [12,18]. The main difference between the proposed method and these similar methods is the image modeling. An accurate and comprehensive image model is very useful for improving the quality of the reconstructed images. The image models proposed in [12,18] have been shown to be efficient; however, they do not consider a significant feature of the license plate image, namely, that its gray-level distribution is bimodal.
The most significant characteristic of a license plate is that the target has high contrast with the background, which makes it readable. Thus, in the gray-level image of a license plate, the target pixels tend to cluster around one center, while the background pixels tend to cluster around another. From this point of view, we propose a feature-specific model for the license plate image by considering that its gray-level distribution has two peaks. In this paper, this feature-specific prior information is introduced explicitly as a constraint in the SR reconstruction of the vehicle license plate. Moreover, in order to ensure a successful reconstruction, a smoothing prior is combined with the feature-specific prior model to regularize the estimated license plate image. During the reconstruction process, the target pixels and the background pixels tend to cluster around different centers, so the gradient information can be estimated more accurately. The smoothing prior, in turn, helps to separate the target from the background. Experimental results demonstrate that combining the feature-specific and smoothing prior models reduces artifacts effectively.
The paper is organized as follows. The mathematical model for image degradation and the VBI estimator are provided in Section 2. In Section 3, the prior probability distributions for the HR image, motion parameters, and hyperparameters are presented, and the corresponding optimization procedure is described in detail in Section 4. In Section 5, experimental results are presented. The conclusion then follows.

Bayesian Framework
2.1. Degradation Model. Before attempting to solve the SR problem, it is necessary to know the process that generates the LR images. We assume that a set of LR observations is obtained from a single corresponding HR image. The sizes of each LR image and of the HR image are $N_1 \times N_2$ and $PN_1 \times PN_2$, respectively, where $P$ is the downsampling factor in the horizontal and vertical directions, and we let $N = N_1 \times N_2$. It is usually assumed that the LR observations are generated from the HR image through a sequence of operations that includes (i) geometric warping, (ii) blurring, (iii) downsampling, and (iv) additive zero-mean white Gaussian noise. The resulting degradation model for the LR image $y_k$ derived from the HR image $x$ is given by
\[ y_k = A_k H_k C(s_k)\, x + n_k, \quad k = 1, 2, \ldots, L, \tag{1} \]
where $L > 1$ is the total number of LR observations, $x \in \mathbb{R}^{P^2N \times 1}$ and $y_k \in \mathbb{R}^{N \times 1}$ are the vectorized HR and LR images, respectively, $A_k \in \mathbb{R}^{N \times P^2N}$ is a downsampling operator, $H_k \in \mathbb{R}^{P^2N \times P^2N}$ is a blurring operator, $C(s_k) \in \mathbb{R}^{P^2N \times P^2N}$ is a warp operator that represents the subpixel shift between the LR image $y_k$ and the reference frame, and $n_k \in \mathbb{R}^{N \times 1}$ is a noise vector.
In this paper, we assume that the downsampling matrix $A_k$ and the blurring matrix $H_k$ remain the same across the LR images and are known. The warp matrix $C(s_k)$ represents the motion that occurs during image acquisition. In this paper, the motion is assumed to be purely global, and the motion model contains global translation and rotation; that is, $s_k = (\theta_k, c_k, d_k)$, where $\theta_k$ is the rotation angle and $c_k$ and $d_k$ are the horizontal and vertical translations of the $k$th HR image with respect to the reference frame. The noise in the LR observations is modeled as white Gaussian noise. Supposing that the noise $n_k$ ($k = 1, 2, \ldots, L$) is white Gaussian noise with $n_k \sim \mathcal{N}(0, \beta_k^{-1} I)$, we get
\[ p(n_k \mid \beta_k) \propto \beta_k^{N/2} \exp\left( -\frac{\beta_k}{2} \| n_k \|^2 \right). \tag{2} \]
Then, the following equation is obtained by using (1):
\[ p(y_k \mid x, s_k, \beta_k) \propto \beta_k^{N/2} \exp\left( -\frac{\beta_k}{2} \| y_k - A_k H_k C(s_k)\, x \|^2 \right). \tag{3} \]
Since the noise among the LR images is mutually independent, we obtain
\[ p(y \mid x, \{s_k\}, \{\beta_k\}) = \prod_{k=1}^{L} p(y_k \mid x, s_k, \beta_k), \tag{4} \]
where $y = \{ y_k \mid k = 1, 2, \ldots, L \}$.
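The degradation chain described above (warp, blur, downsample, add noise) can be illustrated with a short simulation. The following is a minimal sketch in Python with NumPy, which is our choice of language and not the authors' implementation. The function names `box_blur` and `degrade` are hypothetical, the warp is simplified to an integer circular translation (the paper's model also includes rotation and subpixel shifts), and downsampling is plain decimation.

```python
import numpy as np

def box_blur(img, size=7):
    """Average (box) blur with edge replication; `size` must be odd."""
    img = np.asarray(img, dtype=float)
    pad = size // 2
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    # Sum the size*size shifted copies of the image, then normalise.
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (size * size)

def degrade(hr, shift=(0, 0), factor=2, blur_size=7, noise_var=0.0, rng=None):
    """Simulate one LR frame: warp, blur, downsample, add Gaussian noise."""
    rng = np.random.default_rng() if rng is None else rng
    # Simplified warp: integer circular translation (assumption, no rotation).
    warped = np.roll(np.asarray(hr, dtype=float), shift, axis=(0, 1))
    blurred = box_blur(warped, blur_size)          # blurring operator
    lr = blurred[::factor, ::factor]               # downsampling by decimation
    noise = rng.normal(0.0, np.sqrt(noise_var), lr.shape)  # zero-mean white noise
    return lr + noise
```

Calling `degrade` several times on the same HR image with different `shift` values gives a toy sequence of LR observations of the kind the SR method takes as input.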

2.2. The VBI Estimator. By using the VBI estimator, the variables, including the HR image, the motion parameters, and the hyperparameters, are estimated through their corresponding posterior probability distributions. Usually, the mean value of the obtained posterior probability distribution is used as the estimate of the corresponding variable.
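The following conditional distribution is obtained by using the Bayes rule:
\[ p(\Theta \mid y) = \frac{p(y \mid \Theta)\, p(\Theta)}{p(y)}, \tag{5} \]
where $\Theta$ denotes the set of all unknowns (the HR image, the motion parameters, and the hyperparameters). By using the VBI estimator, $p(\Theta \mid y)$ is approximated by a tractable distribution $q(\Theta)$. This approximating distribution is found by minimizing the Kullback-Leibler (KL) divergence, which measures the difference between the two distributions $p(\Theta \mid y)$ and $q(\Theta)$. The KL divergence is defined as follows:
\[ C_{KL}\big( q(\Theta) \,\|\, p(\Theta \mid y) \big) = \int q(\Theta) \log \frac{q(\Theta)}{p(\Theta \mid y)} \, d\Theta. \tag{6} \]
Under the assumption that $x$, $s_k$, $\beta_k$, and $\alpha$ are mutually independent, $q(\Theta) = q(x)\, q(\alpha) \prod_{k=1}^{L} q(s_k)\, q(\beta_k)$. Here $q(x)$ and $q(s_k)$ are the posterior probability distributions of $x$ and $s_k$, respectively, and $q(\beta_k)$ and $q(\alpha)$ are the posterior probability distributions of the hyperparameters $\beta_k$ and $\alpha$, respectively.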
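For orientation, the generic textbook identities behind this kind of factorized approximation are sketched below. They are standard results of variational inference and are not formulas taken from this paper; $\theta$ stands for any single factor of the approximation and $\Theta \setminus \theta$ for all the others.

```latex
% Decomposition of the log evidence: minimising the KL divergence is
% equivalent to maximising the lower bound F(q), since log p(y) is fixed.
\log p(y) = \underbrace{\int q(\Theta) \log \frac{p(\Theta, y)}{q(\Theta)} \, d\Theta}_{F(q)}
          + C_{KL}\big( q(\Theta) \,\|\, p(\Theta \mid y) \big),
\qquad C_{KL} \ge 0 .

% Optimal form of one factor when the others are held fixed:
\log q(\theta) = \mathbb{E}_{q(\Theta \setminus \theta)}\big[ \log p(\Theta, y) \big] + \text{const} .
```

The second identity is what makes alternating updates possible: each factor is refreshed in turn using expectations taken under the current estimates of the others.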

The Prior Probability Distributions
3.1. Image Prior Probability Distribution. In the Bayesian method, the prior information about the original HR image, represented by the prior model, plays an important role. A significant property of a license plate is that its target has high contrast with the background (see Figure 1). The pixels in license plate images can therefore be divided into two classes, and the original HR license plate image can be represented as
\[ x = x_{tar} \cup x_{bac}, \tag{7} \]
where $x_{tar}$ represents the set of pixels belonging to the target regions (i.e., the white regions in Figure 1) and $x_{bac}$ represents the set of pixels belonging to the background regions (i.e., the black regions in Figure 1). Therefore, we assume that, for license plate images, the target pixels tend to cluster around one center, while the background pixels tend to cluster around another center. In order to make use of this characteristic, we propose the prior model (8) to regularize the estimated HR license plate image. This model is defined in terms of a vector $m$ of size $P^2N \times 1$. In the following, we demonstrate how to obtain the vector $m$.
In (8), each element of the vector $m$ is the mean value of either $x_{tar}$ or $x_{bac}$. That is, if $x_i \in x_{tar}$ (or $x_{bac}$), then $m_i = \mu_{tar}$ (or $\mu_{bac}$), where $\mu_{tar}$ and $\mu_{bac}$ are the mean values of $x_{tar}$ and $x_{bac}$, respectively. They are obtained as follows:
\[ \mu_{tar} = \frac{1}{N_{tar}} \sum_{x_i \in x_{tar}} x_i, \qquad \mu_{bac} = \frac{1}{N_{bac}} \sum_{x_i \in x_{bac}} x_i, \tag{9} \]
where $N_{tar}$ and $N_{bac}$ represent the total number of pixels in the target regions and background regions, respectively. In order to obtain $m$, two auxiliary vectors $m_1$ and $m_2$ of size $P^2N \times 1$ are first defined, with $m_1(i) = \mu_{tar}$ and $m_2(i) = \mu_{bac}$ for $i = 1, 2, \ldots, P^2N$. In their definition, $W$ is a diagonal matrix of size $P^2N \times P^2N$ whose elements are 0s and 1s, and $E$ is a matrix of size $P^2N \times P^2N$ whose elements are all 1s.
Then, to obtain the matrix $W$, we use Otsu's method to partition the estimated HR image into target regions and background regions. After the partition, the diagonal elements of $W$ corresponding to the pixels in the target regions are set to 1, and the others are set to 0. From this partition, two intermediate vectors are obtained, one for the target regions and one for the background regions; finally, their nonzero elements are extracted and placed in $m$ at the corresponding locations.
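The construction of the class-mean vector described above can be sketched as follows. This is an illustrative Python/NumPy version under stated assumptions, not the authors' code: the histogram-based Otsu threshold is a plain reimplementation, the function names `otsu_threshold` and `class_mean_image` are hypothetical, and the brighter class is taken as the target (following the remark that the target regions are the white regions in Figure 1). The image is kept as a 2-D array instead of being vectorized.

```python
import numpy as np

def otsu_threshold(img, bins=256):
    """Return the threshold that maximises the between-class variance."""
    hist, edges = np.histogram(np.asarray(img, dtype=float).ravel(), bins=bins)
    centers = 0.5 * (edges[:-1] + edges[1:])
    w0 = np.cumsum(hist).astype(float)        # pixel count at or below each bin
    w1 = w0[-1] - w0                          # pixel count above each bin
    s0 = np.cumsum(hist * centers)            # intensity mass at or below each bin
    mu0 = s0 / np.maximum(w0, 1)              # mean of the lower class
    mu1 = (s0[-1] - s0) / np.maximum(w1, 1)   # mean of the upper class
    between = w0 * w1 * (mu0 - mu1) ** 2      # between-class variance (unnormalised)
    return centers[np.argmax(between)]

def class_mean_image(x):
    """Replace every pixel by the mean of its Otsu class (target or background).

    Assumes both classes are non-empty, i.e. the image is not constant.
    """
    x = np.asarray(x, dtype=float)
    target = x > otsu_threshold(x)            # boolean mask, plays the role of diag(W)
    m = np.empty_like(x)
    m[target] = x[target].mean()              # mean of the target pixels
    m[~target] = x[~target].mean()            # mean of the background pixels
    return m, target
```

Since the partition is computed from the current HR estimate, a routine like this would be re-run whenever that estimate changes.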
Based on the prior model (8), a prior probability distribution $p(x \mid \alpha_1)$ for the estimated HR license plate image is proposed, where $\alpha_1$ is the hyperparameter of this prior distribution.
In order to ensure the success of the reconstruction, we adopt the total variation (TV) prior model and the simultaneous autoregressive (SAR) prior model [18] to regularize the estimated HR image. The TV and the SAR prior models are defined as follows:
\[ TV(x) = \sum_{i=1}^{P^2N} \sqrt{ (\nabla_i^h x)^2 + (\nabla_i^v x)^2 }, \]
where $\nabla_i^h x$ and $\nabla_i^v x$ represent the horizontal and vertical gradient components of the $i$th element of $x$, respectively, and
\[ SAR(x) = \| C x \|^2, \]
where $C$ denotes the Laplacian operator.
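The two smoothness energies can be evaluated numerically as in the sketch below (Python/NumPy). The discretization choices are assumptions made for illustration: forward differences with a zero gradient at the last row and column for TV, and a 5-point Laplacian with periodic boundaries for SAR. The function names `tv_energy` and `sar_energy` are hypothetical.

```python
import numpy as np

def tv_energy(x):
    """Isotropic total variation using forward differences."""
    x = np.asarray(x, dtype=float)
    dh = np.zeros_like(x)
    dv = np.zeros_like(x)
    dh[:, :-1] = x[:, 1:] - x[:, :-1]   # horizontal gradient component
    dv[:-1, :] = x[1:, :] - x[:-1, :]   # vertical gradient component
    return float(np.sum(np.sqrt(dh ** 2 + dv ** 2)))

def sar_energy(x):
    """Squared norm of a 5-point Laplacian with periodic boundaries."""
    x = np.asarray(x, dtype=float)
    lap = (np.roll(x, 1, axis=0) + np.roll(x, -1, axis=0) +
           np.roll(x, 1, axis=1) + np.roll(x, -1, axis=1) - 4.0 * x)
    return float(np.sum(lap ** 2))
```

Both energies vanish on a constant image. On an image with a single sharp vertical edge, the TV energy grows only with the edge length, whereas the SAR energy penalizes the squared Laplacian response along the edge, which is why TV is usually described as edge preserving.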
The prior distributions corresponding to the TV model and the SAR model are denoted by $p(x \mid \alpha_2)$ and $p(x \mid \alpha_3)$, with hyperparameters $\alpha_2$ and $\alpha_3$, respectively.
3.2. Motion Prior Probability Distribution. The motion parameters are modeled as stochastic variables following Gaussian distributions, similar to [12,18]:
\[ p(s_k) = \mathcal{N}\big( s_k \mid \bar{s}_k^0, \Lambda_k^0 \big), \]
where $\bar{s}_k^0$ is the a priori mean vector and $\Lambda_k^0$ is the a priori covariance matrix. These two parameters can incorporate prior knowledge about the motion parameters into the estimation process. Setting $\bar{s}_k^0$ and the inverse covariance $(\Lambda_k^0)^{-1}$ equal to zero represents the fact that no such knowledge is available, so that only the observed LR images are responsible for the estimation process.
In this work, the parameters $\bar{s}_k^0$ are obtained by using the Lucas-Kanade method [19], and the inverse covariance matrices $(\Lambda_k^0)^{-1}$ are set equal to zero matrices. They are used as the initial values in the following SR method.

3.3. Hyperparameter Prior Probability Distribution.
The prior information about the hyperparameters is usually expressed by using a conjugate prior distribution, which is convenient to compute with. Moreover, the corresponding posterior distribution then has the same functional form as the prior distribution, and hence an analytic solution can be obtained. It is well known that the inverse Gamma distribution is the conjugate prior for the variance of a Gaussian distribution whose mean value is known; equivalently, the Gamma distribution is the conjugate prior for the inverse variance. Since the hyperparameters here act as inverse variances, we assume that they obey Gamma distributions; that is,
\[ p(\omega) = \Gamma\big( \omega \mid a_\omega^0, b_\omega^0 \big) = \frac{ (b_\omega^0)^{a_\omega^0} }{ \Gamma(a_\omega^0) }\, \omega^{a_\omega^0 - 1} \exp\big( -b_\omega^0\, \omega \big), \]
where $\omega$ denotes any one of the hyperparameters, with the shape parameter $a_\omega^0$ and rate parameter $b_\omega^0$. These parameters can incorporate prior knowledge about the variances of the HR image and of the noise in the observed LR images into the estimation process. In the following SR method, $a_\omega^0$ and $b_\omega^0$ are used as the initial values. $a_\omega^0$ is set equal to 1 and $b_\omega^0$ is set equal to 0, which corresponds to utilizing flat prior distributions for the hyperparameters; in this case, only the observed LR images are responsible for the estimation process.
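The conjugacy argument can be made concrete with a short worked derivation. It is the generic Gaussian-Gamma calculation, shown here for a single noise precision $\beta$ and a residual vector $r \in \mathbb{R}^N$ (both symbols are introduced for this illustration only); it is not claimed to be the exact update formula of the paper, where $\|r\|^2$ would be replaced by an expectation under the approximating distributions.

```latex
% Likelihood of a zero-mean Gaussian residual with precision beta,
% and Gamma prior with shape a and rate b:
p(r \mid \beta) \propto \beta^{N/2} \exp\left( -\tfrac{\beta}{2} \|r\|^2 \right),
\qquad
p(\beta) \propto \beta^{a-1} \exp(-b \beta).

% Posterior: again a Gamma distribution (same functional form as the prior).
p(\beta \mid r) \propto \beta^{a + N/2 - 1} \exp\left( -\beta \left( b + \tfrac{1}{2} \|r\|^2 \right) \right)
= \Gamma\left( \beta \,\middle|\, a + \tfrac{N}{2},\; b + \tfrac{1}{2} \|r\|^2 \right).

% Posterior mean, and its value for the flat setting a = 1, b = 0:
\mathbb{E}[\beta \mid r] = \frac{a + N/2}{b + \tfrac{1}{2}\|r\|^2},
\qquad
\left. \mathbb{E}[\beta \mid r] \right|_{a=1,\, b=0} = \frac{N + 2}{\|r\|^2}.
```

With the flat setting, the estimate depends only on the data term $\|r\|^2$, which matches the statement that only the observed LR images are then responsible for the estimation.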

Optimization
In this method, three prior models (i.e., the proposed prior model, the TV prior model, and the SAR prior model) are used to regularize the estimated HR image; however, establishing a single prior probability distribution that includes all of these prior models is difficult. Here, a linear combination of three KL divergences, weighted by the coefficients $\lambda_1$, $\lambda_2$, and $\lambda_3$, is used to combine the proposed prior model, the TV prior model, and the SAR prior model.
Due to the form of the TV model, (20) is difficult to solve directly. In this work, this difficulty is overcome by resorting to the majorization-minimization (MM) approach [12]. A lower bound of $p(x \mid \alpha_2)$ is found by using the MM approach; this bound depends on auxiliary variables $u = \{ u_i,\ i = 1, 2, \ldots, P^2N \}$, which need to be calculated as well. Then, the posterior probability distributions $q(x)$, $q(s_k)$, $q(\beta_k)$, $q(\alpha_1)$, $q(\alpha_2)$, and $q(\alpha_3)$ can be obtained. The solving process is described in detail in the Appendix. Consequently, explicit expressions are obtained for calculating the HR image, the motion parameters, and the hyperparameters.
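The inequality that typically underlies an MM bound for a square-root (TV-type) term is recalled below as a worked step. It is an elementary fact and is given only as background; the symbols $w_i$, $u_i$, and the constant $c$ are introduced for this illustration, and the exact bound and constants used in the paper are those of its Appendix and of [12].

```latex
% For any w >= 0 and u > 0, expanding (sqrt(w) - sqrt(u))^2 >= 0 gives
\sqrt{w} \;\le\; \frac{w + u}{2\sqrt{u}}, \qquad \text{with equality if and only if } w = u .

% Applied pixelwise with w_i = (\nabla_i^h x)^2 + (\nabla_i^v x)^2:
TV(x) = \sum_i \sqrt{w_i} \;\le\; \frac{1}{2} \sum_i \frac{w_i + u_i}{\sqrt{u_i}} .

% Hence, for any c > 0, a quadratic-in-x lower bound of the exponential term:
\exp\big( -c \, TV(x) \big) \;\ge\; \exp\left( -\frac{c}{2} \sum_i \frac{w_i + u_i}{\sqrt{u_i}} \right).
```

The right-hand side is quadratic in $x$ for fixed $u$, so it behaves like a Gaussian term and can be handled in closed form; the bound is tightest when each $u_i$ equals the current value of $w_i$.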
The formulas for calculating the HR image and the motion parameters are given by (25) and (26), respectively. In (25) and (26), $I$ is an identity matrix of size $P^2N \times P^2N$, and the formulas for the remaining quantities appearing in them (including those denoted by $\Gamma$, $\Psi$, and $\Phi$) are given in the Appendix. The hyperparameters are likewise calculated by explicit formulas. The optimization procedure is summarized in Algorithm 1.
In the experiments on real data, the settings $\lambda_1 = 10$ and $\lambda_2 = 0.2$ are used.
Our method is compared with the bicubic interpolation method, the TV-SAR method, and the l1-SAR method. The MATLAB code provided in [18] was used for the testing. For the test images, the performance of the reconstruction methods is evaluated by measuring the improvement in the peak signal-to-noise ratio (PSNR) and the structural similarity (SSIM) index.
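As a reminder of how the first of these quality measures is computed, a minimal PSNR sketch in Python/NumPy follows. The function name `psnr` and the default peak value `max_val=1.0` (images scaled to [0, 1]) are assumptions for illustration; the intensity range used in the paper's experiments is not stated in this section. SSIM is more involved and is not reproduced here.

```python
import numpy as np

def psnr(reference, estimate, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    ref = np.asarray(reference, dtype=float)
    est = np.asarray(estimate, dtype=float)
    mse = np.mean((ref - est) ** 2)          # mean squared error
    if mse == 0:
        return float("inf")                  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))
```

Because PSNR is a logarithmic function of the mean squared error, a gain of 3 dB corresponds to roughly halving the error energy.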

5.1. Simulation Experiments.
In this subsection, we show the experimental results obtained by using the pictures presented in Figure 2 as test images. The results illustrate the effectiveness of the proposed model compared with the bicubic interpolation method, the TV-SAR method, and the l1-SAR method. Figures 3, 5, 7, and 9 show the reconstructed images obtained by using the different SR methods. Results obtained by applying the different approaches to LR images generated from Figures 2(a) and 2(c) are presented in Figures 3 and 7, respectively; these LR images are corrupted by white Gaussian noise with variance 0.2, whereas the LR images generated from Figures 2(b) and 2(d), whose results are shown in Figures 5 and 9, are corrupted by noise with variance 0.5. In order to make the visual contrast more obvious, the corresponding binary results are presented in Figures 4, 6, 8, and 10. A binarization step is also usually included in vehicle license plate recognition. Although the reconstructed results obtained by our proposed method appear somewhat blurry, the corresponding binary results are better than those obtained by the other methods and are closer to the binary results of the original HR image. For example, there are fewer spurious points in the binary results presented in Figures 6 and 10.
The PSNR and SSIM values of each SR reconstruction method are presented in Tables 1 and 2. From these two tables, we see that our proposed method produces the reconstructed HR image with the highest PSNR and SSIM values. We take one of the HR images presented in Figure 2 as an example. In the case of noise variance 0.2, the PSNR value of our proposed method (20.76 dB) exceeds that of the bicubic interpolation method by more than 4.0 dB and is slightly better than those of the l1-SAR method and the TV-SAR method. Even under stronger noise, with variance 0.5, the PSNR value of our proposed method (18.14 dB) is at least 1.5 dB larger than those of the bicubic interpolation method, the TV-SAR method, and the l1-SAR method.
5.2. Discussion. In this paper, the proposed method and the comparison methods are all based on the variational Bayesian framework, for a fair comparison. The computational complexity of this kind of method has been analyzed in [12]. The majority of the computations are performed for estimating the HR image and the motion parameters. The HR image is calculated by using the conjugate gradient method [12], and the motion parameters are calculated by inverting a 3 × 3 matrix for each observed LR image. Note that the matrix multiplications can be performed very efficiently by implementing the corresponding operators rather than storing full matrices. A comparison of the computation times is listed in Table 3, under the average blur of size 7 × 7 and zero-mean Gaussian white noise with variance 0.2. Table 3 shows that the proposed method uses the least time among the iterative methods. In the proposed method, the characteristic of the gray-level distribution is introduced as a constraint, together with the TV-SAR model, into the reconstruction. During the reconstruction process, the target pixels and the background pixels tend to cluster around different centers, which helps to reduce the computation time.
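The remark about implementing operators instead of storing full matrices can be illustrated with a matrix-free conjugate gradient solver. The sketch below (Python/NumPy) is a generic textbook version, not the authors' solver; the system it solves, $(I + D^T D)\,x = b$ with $D$ a one-dimensional forward-difference operator, is a small stand-in chosen for illustration, and the names `conjugate_gradient` and `apply_op` are hypothetical.

```python
import numpy as np

def conjugate_gradient(apply_A, b, x0=None, tol=1e-10, max_iter=200):
    """Solve A x = b for symmetric positive definite A, given only v -> A v."""
    b = np.asarray(b, dtype=float)
    x = np.zeros_like(b) if x0 is None else np.asarray(x0, dtype=float).copy()
    r = b - apply_A(x)                 # initial residual
    p = r.copy()                       # initial search direction
    rs = float(r @ r)
    for _ in range(max_iter):
        if np.sqrt(rs) < tol:          # residual norm small enough
            break
        Ap = apply_A(p)
        alpha = rs / float(p @ Ap)     # step length along p
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = float(r @ r)
        p = r + (rs_new / rs) * p      # next conjugate direction
        rs = rs_new
    return x

def apply_op(v):
    """Apply (I + D^T D) to v without forming any matrix (toy example)."""
    d = np.diff(v)                     # D v: forward differences
    out = v.astype(float).copy()
    out[:-1] -= d                      # D^T d, contribution of the -1 entries
    out[1:] += d                       # D^T d, contribution of the +1 entries
    return out
```

For an image with $P^2N$ pixels, the full system matrix would have $(P^2N)^2$ entries, while the operator form only needs a few arrays of the image size, which is the practical reason for the design choice mentioned above.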
In Section 5.1, the experimental results under challenging zero-mean Gaussian white noise with variances 0.2 and 0.5 are presented. In practical applications, the noise may sometimes be small. In order to verify the performance of the proposed Bayesian framework in the presence of Gaussian white noise of various intensities, several simulated experiments are conducted by using Figure 2. In the experiments under different noise variances, the parameters $\lambda_1$ and $\lambda_2$ are adjusted so as to obtain the best results. Taking the average blur of size 7 × 7 and zero-mean noise with variance 0.1 as an example, the relationship between the parameters $\lambda_1$ and $\lambda_2$ and the PSNR is shown in Figure 12. From Figure 12, we see that the highest PSNR value is obtained when $\lambda_1$ is near 4 and $\lambda_2$ is near 0.1. Otherwise, the PSNR value decreases if $\lambda_1$ is kept fixed. When $\lambda_1$ is near 7 and $\lambda_2$ is near 0.9, the PSNR value drops relatively sharply. Figure 12 thus shows this relationship quantitatively.

5.3. Experiments on Real Data.
In this subsection, we test our proposed method on several vehicle plate image sequences. A commercial digital camera in continuous shooting mode was used to capture the vehicle image sequences. In this experiment, we use four sequences as examples, each containing ten images. Figure 13 shows one of the ten LR images obtained with the camera far from the vehicle, and some plate image sequences of very poor quality captured under different conditions are also used, as shown in Figure 16. In this experiment, the license plate is selected as the region of interest. Figures 14, 15, 17, and 18 show a comparison between the bicubic interpolation method, the l1-SAR method, the TV-SAR method, and our proposed method. In Figure 14(a), the number "5" could be misinterpreted as "S," and there are obvious artifacts in the reconstructed images obtained by l1-SAR and TV-SAR. In Figure 15, the results obtained by the bicubic interpolation method are quite blurred and hard to read. In Figures 15(b) and 15(c), the number "6" and the letter "P" are unreadable. However, the Chinese character is still not readable in Figures 14 and 15, and the letter "A" might be confusing. In Figures 17(b) and 17(c), the letter "C" could be misinterpreted as "0." In Figure 18, the result reconstructed by the proposed method has higher visual contrast between the target and the background and is therefore easier to read. In the proposed method, the characteristic of the gray-level distribution is introduced as a constraint into the reconstruction. During the reconstruction process, the target pixels and the background pixels tend to cluster around different centers, which helps the gradient information preserve the image edges and suppress noise. Meanwhile, the smoothing prior helps to separate the target from the background. Thus, by using the prior information of the license plate image more fully, the proposed method obtains better reconstructed results. These experimental results show that the proposed method achieves the best visual effect. The experiments described above also demonstrate the practical utility and potential of the proposed method for enhancing LR frames captured from real traffic.

Conclusion
Existing reconstruction methods for license plate images use image models based on gradient information. However, they do not perform well, especially under heavy noise, which leads to poor recognition results. In this paper, we proposed a new SR method to reconstruct license plate images. Given the significant characteristic of license plates, a feature-specific prior model has been proposed and combined with the TV-SAR prior model. The target and the background are separated as far as possible during the reconstruction process, which helps the gradient information preserve the image edges and suppress noise. The HR image, the motion parameters, and the hyperparameters are estimated jointly by using the variational Bayesian inference estimator, and hence an unsupervised SR method for reconstructing license plate images is established.
In future work, we will focus on estimating the parameters $\lambda_1$, $\lambda_2$, and $\lambda_3$ and on finding the relationship between these parameters.
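In the Bayes rule of Section 2.2, $p(y_k \mid x, s_k, \beta_k)$ represents the conditional distribution of the LR image $y_k$, $p(s_k)$ and $p(\beta_k)$ are the prior distributions of $s_k$ and $\beta_k$, respectively, $p(x \mid \alpha)$ is the prior distribution for the unknown HR image $x$, and $\alpha$ is the hyperparameter of the prior distribution $p(x \mid \alpha)$. For convenience, we denote $\Theta = \{ x, \{s_k\}, \{\beta_k\}, \alpha \}$.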

Figure 1: An example of the license plate image.

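In the linear combination of KL divergences used in the optimization, $\Theta_1$, $\Theta_2$, and $\Theta_3$ denote the sets of all the variables corresponding to the prior probability distributions based on the proposed prior model, the TV prior model, and the SAR prior model, respectively, $\lambda_i \geq 0$ for $i = 1, 2, 3$, and $\lambda_2 + \lambda_3 = 1$. Then, $p(\Theta \mid y)$ is approximated by the distribution $q(\Omega)$ that minimizes this linear combination, as expressed in (20).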

Figure 3: The reconstructed results for Figure 2(a) under white Gaussian noise with variance 0.2.

Figure 5: The reconstructed results for Figure 2(b) under white Gaussian noise with variance 0.5.

Figure 7: The reconstructed results for Figure 2(c) under white Gaussian noise with variance 0.2.

Figure 9: The reconstructed results for Figure 2(d) under white Gaussian noise with variance 0.5.

Figure 12: Relationship between the parameters $\lambda_1$ and $\lambda_2$ and the PSNR.

Figure 13: One of the LR plate images in the case of long distance.(a) Vehicle sequence I and (b) vehicle sequence II.

Figure 14: The reconstructed results for the first LR license plates.

Figure 15: The reconstructed results for the second LR license plates.

Figure 16: One of the LR plate images in the case of (a) rain and fog weather and (b) black vehicle exhaust.

Figure 17: The reconstructed results for the LR license plates obtained under the rain and fog weather.

Figure 18: The reconstructed results for the LR license plates disturbed by black vehicle exhaust.

Table 1: Comparisons of PSNR (dB) and SSIM with average blur and white Gaussian noise of variance 0.2.

Table 2: Comparisons of PSNR (dB) and SSIM with average blur and white Gaussian noise of variance 0.5.

Table 3: Comparison of the computation times (in seconds).