An Improved Method of Training Overcomplete Dictionary Pair

Training an overcomplete dictionary pair is a critical step in mainstream superresolution methods. To address the high time complexity of dictionary training and its susceptibility to corrupted example images, an improved method based on the lifting wavelet transform and robust principal component analysis is reported. The high-frequency components of the example images are estimated from the wavelet coefficients of a 3-tier lifting wavelet transform decomposition. Because sparse coefficients are similar across multiframe images, the inexact augmented Lagrange multiplier method is employed to perform robust principal component analysis while imposing global constraints. Experiments reveal that the new algorithm not only reduces the time complexity while preserving clarity but also improves robustness to corrupted example images.


Introduction
Superresolution (SR) reconstruction techniques produce high-resolution (HR) images from one or more low-resolution (LR) images, thereby increasing high-frequency image detail and correcting degradation caused by LR sensors. SR technology is employed in numerous applications, including medical imaging systems, face hallucination, remote sensing image processing, and military target reconnaissance and surveillance. There are three principal approaches to achieving SR. Interpolation-based and reconstruction-based methods are traditional; examples include the Bicubic [1], IBP [2], POCS [3], and mixed ML/MAP/POCS methods [4]. In contrast, learning-based methods construct optimally weighted constraints inferred from a trained overcomplete dictionary pair. Learning-based methods are capable of extrapolating high-frequency image features that are not apparent in LR images. Instances of learning methods are found in the example-based method [5], the support vector regression (SVR) approach [6], neighbor embedding methods (NESR) [7], and sparse representation for superresolution (SRSR) methods [8, 9], which are currently among the most effective SR algorithms.
In SR algorithms based on sparse representation, an overcomplete dictionary pair, D_h for high-resolution patches and D_l for low-resolution patches, is trained to estimate the sparse coefficients. The accuracy of the trained dictionary directly affects the performance of the SR algorithm. However, SR algorithms are often affected by corruption in the acquired digital images, caused by lack of camera focus, faulty illumination, or missing data.
Principal component analysis (PCA) has been widely applied in traditional denoising SR algorithms by solving a constrained optimization problem. This method works well in practice as long as the noise power is small. However, it breaks down when corruption is severe, even if only very few of the observations are affected. For example, two PCA simulation results are shown in Figure 1. For comparison, 100 sampled 2-dimensional arrays are represented by circles, with some small noise in Figure 1(a) and with one gross error in Figure 1(b). At the bottom of Figure 1(a), the circles can be arranged in order of size, whereas every circle has the same size in Figure 1(b). The results verify that PCA fails to estimate the correct order when the data are corrupted by large errors. Furthermore, training a dictionary has high time complexity. To remedy these shortcomings, this paper presents an improved dictionary-training algorithm based on the lifting wavelet transform (LWT) and robust principal component analysis (RPCA). High-frequency wavelet coefficients are estimated by a 3-tier LWT decomposition, thereby reducing the two-dimensional image data storage by about 75%. For the low-frequency image components of the LWT decomposition, scale coefficients are determined through RPCA instead of PCA. RPCA can preserve image detail and edge information while simultaneously recovering broken data. The next section explains the dictionary-training strategy of the SR algorithm.
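The breakdown of PCA under gross corruption can be reproduced with a minimal numerical sketch. The data below are synthetic and purely illustrative (not the experiment of Figure 1): a single large error among 100 samples is enough to swing the estimated principal direction away from the true axis.

```python
import numpy as np

def principal_direction(X):
    """First principal direction of the row-wise samples in X."""
    Xc = X - X.mean(axis=0)                    # center the data
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[0]                               # top right singular vector

rng = np.random.default_rng(0)
t = rng.uniform(-1.0, 1.0, 100)
X = np.column_stack([t, 0.5 * t]) + 0.01 * rng.standard_normal((100, 2))

d_clean = principal_direction(X)               # close to the true axis (2, 1)/sqrt(5)

X_bad = X.copy()
X_bad[0] = [0.0, 50.0]                         # one gross error among 100 samples
d_bad = principal_direction(X_bad)

# with the outlier, the estimated subspace swings away from the clean estimate
agreement = abs(d_clean @ d_bad)
```

With small noise the estimated direction matches the true axis almost exactly; with one large error the agreement between the two estimates collapses, which mirrors the failure mode that motivates RPCA.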

Training Dictionary Based on Sparse Representation
2.1. Sparse Representation Overview. Suppose that a signal x ∈ R^n can be represented as a linear combination of elements in an overcomplete dictionary D ∈ R^(n×K), where K denotes the number of atoms in the dictionary. Then an observation of the signal x, say y, can be expressed as follows:

y = Sx = SDα_0, ‖α_0‖_0 ≪ K, (1)

in which α_0 ∈ R^K is a sparse vector and S ∈ R^(m×n) (m < n) is the matrix used to downsample x. A similar model can be applied to digital images. Let X denote an original HR image and Y denote a degraded LR version of X. Now suppose that X is partitioned into submatrices, called patches, of dimension a × a. Each patch in X is denoted by a lowercase "x" with a single integer index, say x_1, x_2, .... Similarly, Y is partitioned into b × b LR patches, y_1, y_2, .... For the purposes of this development, the relation between the index on patch x_i and its position in X is arbitrary, and similarly for the index of y_i and its position in Y. However, there must be a one-to-one correspondence between patches x_i and y_i for a given i. The image degradation between X and Y is modeled by

Y = SHX, (2)

in which H represents the blurring filter. The model is shown in Figure 2. The process of image degradation (per patch) is viewed as a projection from a high- to a low-dimensional space SH : R^(a×a) → R^(b×b). According to the theory of manifolds, the local characteristics of the image patches are essentially unchanged by projection [10]. Given sufficient sparsity of α_0, HR image patches can be perfectly recovered from the sparse representation of LR image patches with high probability. Similar to (1), the LR and HR image patches are represented as linear combinations of dictionary elements as follows:

min ‖α_i‖_0 s.t. x_i = D_h α_i, y_i = SH D_h α_i = D_l α_i, (3)

where ‖·‖_0 indicates the ℓ_0-norm. α_i
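The per-patch degradation operator SH can be illustrated with a toy implementation. The 3 × 3 box blur for H and plain decimation for S are assumptions for illustration only; the paper does not specify the blur kernel or sampling scheme.

```python
import numpy as np

def degrade(x, factor=3):
    """Toy SH operator: a 3x3 box blur (H) followed by downsampling (S)."""
    p = np.pad(x, 1, mode="edge")              # replicate borders for the blur
    blurred = sum(p[i:i + x.shape[0], j:j + x.shape[1]]
                  for i in range(3) for j in range(3)) / 9.0
    return blurred[::factor, ::factor]         # keep every 3rd row and column

hr = np.arange(81, dtype=float).reshape(9, 9)  # a 9x9 HR patch
lr = degrade(hr)                               # its 3x3 LR counterpart
```

The projection maps a 9 × 9 patch to a 3 × 3 patch, i.e., from R^(9×9) to R^(3×3), matching the high-to-low-dimension projection described above.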
denotes the sparse basis vector used to represent patch i in both the HR and LR images. D_h ∈ R^(a²×K) and D_l ∈ R^(b²×K) are the overcomplete joint dictionary pair corresponding to the HR and LR patches, respectively. Since (3) is underdetermined, we use a Lagrange multiplier to solve the ill-posed problem [8] as follows:

α*_i = arg min_α ‖D̃α − ỹ_i‖²_2 + λ‖α‖_1, D̃ = [F D_l; P D_h], ỹ_i = [F y_i; w], (4)

in which λ balances sparsity of the solution and fidelity of the approximation to ỹ_i. F indicates the feature-extraction operator, which retains the high-frequency details of the image. P extracts the region of overlap between the current target patch and the previously reconstructed HR image, and w contains the values of the previously reconstructed HR image on the overlap. Finally, HR image patches are reconstructed using the optimal α*_i determined in the following equation:

x_i = D_h α*_i. (5)

The trained dictionary pair D*_h and D*_l is required to solve for the optimal sparse vector of (4).
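The ℓ_1-regularized sparse coding step can be sketched with an iterative soft-thresholding (ISTA) solver. The solver choice and all sizes and parameters below are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def ista(D, y, lam=0.01, n_iter=1000):
    """Solve min_a 0.5*||D a - y||^2 + lam*||a||_1 by iterative soft-thresholding."""
    L = np.linalg.norm(D, 2) ** 2                 # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = a - D.T @ (D @ a - y) / L             # gradient step on the data term
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a

rng = np.random.default_rng(1)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)                    # unit-norm dictionary atoms
a_true = np.zeros(50)
a_true[[3, 17]] = [1.0, -0.8]                     # a 2-sparse code
y = D @ a_true
a_hat = ista(D, y)
```

Even though the system is underdetermined (20 equations, 50 unknowns), the ℓ_1 penalty recovers the 2-sparse code, which is the mechanism exploited in (4).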

LWT Strategies for Example Images.
To reduce the time complexity of the SR algorithm, the LWT is applied to the example images (the patches x in (7)) before training. By adopting separate training strategies for the high- and low-frequency components of the example images, the LWT can differentially preserve the critical features of these separate bandwidths.
The example images are converted by the LWT into low-frequency and horizontal, vertical, and diagonal high-frequency domain coefficients. For the three high-frequency components, the wavelet coefficients of one layer can be accurately estimated from the next layer's wavelet coefficients because of their strong correlation. Therefore, in the 3-tier LWT, the first layer's high-frequency wavelet coefficients can be estimated from the second and third layers' high-frequency wavelet coefficients. The number of example image pixels involved in training the dictionary is then only 25% of the original, which greatly reduces the iteration time while preserving edge information.
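A minimal lifting implementation, using the Haar wavelet as an illustrative assumption (the paper does not name the wavelet), shows the split-predict-update steps and the 3-tier recursion on the low-frequency (LL) band:

```python
import numpy as np

def lift_rows(a):
    """One Haar lifting step along rows: split, predict, update."""
    even, odd = a[:, ::2], a[:, 1::2]
    d = odd - even                # predict: detail (high-frequency) coefficients
    s = even + d / 2.0            # update: approximation (low-frequency) coefficients
    return s, d

def lwt_haar(x):
    """One level of a 2-D Haar lifting wavelet transform."""
    s, d = lift_rows(x)
    ll, lh = (t.T for t in lift_rows(s.T))   # lift the columns of the approximation
    hl, hh = (t.T for t in lift_rows(d.T))   # lift the columns of the detail
    return ll, lh, hl, hh

# three tiers: re-apply the transform to the LL band each time
ll1, *_ = lwt_haar(np.ones((8, 8)))
ll2, *_ = lwt_haar(ll1)
ll3, lh3, hl3, hh3 = lwt_haar(ll2)
```

Each tier halves both dimensions, so the first-layer detail bands hold 75% of the coefficients; estimating them from the deeper layers is what leaves only 25% of the pixels in the training set.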
For low-frequency components, RPCA discussed in the next section is employed to recover the corruption.

Matrix Completion and RPCA
3.1. Overview. The matrix completion problem has been the subject of intense research in recent years. In 2009, Candès and Recht [11] demonstrated exact matrix completion using convex optimization. As the rank of a matrix is not a convex function, the nuclear norm of the matrix is used to approximate its rank, yielding a convex minimization problem for which numerous efficient solvers exist [11].
In 2010, Lin et al. [12] published a fast, scalable algorithm for solving the robust PCA problem. The method is based on recovering a low-rank matrix with an unknown fraction of its entries corrupted. The algorithm proceeds as follows: given a rank-r matrix A ∈ R^(m×n), where r ≪ min(m, n) is the target dimension of the subspace, the observation matrix D can be modeled as

D = π_Ω(A + E), (8)

where π_Ω(·) is a subsampling projection operator and E represents the matrix of perturbations, including complex noise and large errors; E should be relatively sparse compared to A.

Matrix Completion.
The objective of matrix completion is to recover the truly low-rank matrix A from D under the assumption that E is zero:

min_A ‖A‖_* s.t. π_Ω(A) = D, (9)

in which ‖·‖_* denotes the nuclear norm of a matrix (the sum of its singular values). It has been shown that the solution to this convex relaxation can exactly recover the true low-rank matrix A under quite general conditions [11]. Furthermore, the recovery is stable to small bounded noise [12], that is, when the entries of E are nonzero but small and bounded.
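A nuclear-norm completion problem of this form can be solved, for example, by singular value thresholding (SVT). The following sketch uses illustrative parameters (τ, step size, matrix sizes, observation rate) that are assumptions, not values from the paper:

```python
import numpy as np

def svt_complete(D_obs, mask, tau=100.0, step=1.7, n_iter=300):
    """Singular value thresholding for nuclear-norm matrix completion."""
    Y = np.zeros_like(D_obs)
    A = np.zeros_like(D_obs)
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(Y, full_matrices=False)
        A = U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt  # shrink singular values
        Y += step * mask * (D_obs - A)                  # correct on observed entries
    return A

rng = np.random.default_rng(2)
A_true = rng.standard_normal((20, 3)) @ rng.standard_normal((3, 20))  # rank 3
mask = rng.random((20, 20)) < 0.7                       # ~70% of entries observed
A_hat = svt_complete(mask * A_true, mask)
```

Because the rank-3 matrix has far fewer degrees of freedom than the number of observed entries, the missing entries are filled in accurately.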

Robust Principal Component Analysis.
The conventional PCA method is used to efficiently estimate a low-dimensional subspace from high-dimensional data. It can be written as follows:

min_{A,E} ‖E‖_F s.t. rank(A) ≤ r, D = A + E, (10)

where ‖·‖_F is the Frobenius norm. That is, the input matrix A is corrupted by i.i.d. Gaussian noise E to generate the observation matrix D. To use PCA, the singular value decomposition (SVD) of D is computed and the columns of D are projected onto the subspace spanned by the r principal left singular vectors of D.
Robust principal component analysis (RPCA) uses the identity operator for π_Ω(·) and a sparse error matrix E, which distinguishes it from matrix completion and from PCA. Wright [8, 13] has shown that a low-rank matrix A can be recovered exactly from the observation matrix D by solving the following convex optimization problem, as long as the error matrix E is sufficiently sparse:

min_{A,E} ‖A‖_* + λ‖E‖_1 s.t. D = A + E, (11)

where λ is a positive weighting parameter. RPCA has been used for background modeling, removal of shadows from face images, alignment of human faces, and image denoising.

Training Dictionary Based on RPCA.
Because the low-frequency part of an image carries most of its energy, the LWT is applied to the example images before training the dictionary. RPCA is then used to improve the effectiveness and robustness of the SR algorithms.
There is little difference among the coefficient values of the low-frequency subimages after an LWT on different images of the same scene. RPCA coefficients are used to represent the low frequencies in an attempt to preserve fidelity and coherence between the subbands. Algorithms have been developed to solve the RPCA problem, recovering the low-rank matrix A and sparse matrix E from the observation matrix D. In this paper, we employ the IALM method to compute the low-frequency subband coefficients of the dictionary images.
Suppose Γ = {I_i}, i = 1, ..., n, denotes a set of n multisensor corrupted images. For the ready-to-be-fused images I_i, low-frequency subimages are computed using the LWT. Let {Ĩ_i}, i = 1, ..., n, be the low-frequency subimages, where Ĩ_i ∈ R^(m_1×m_2) contains the gray values of the pixels with coordinates (p, q). If we represent each Ĩ_i as a vector by concatenating all columns of the subimage, we can define an m_1 m_2 × n matrix Ĩ whose ith column is the vectorized Ĩ_i. The data are standardized. For convenience, let m_1 = m_2 = m.
Then, we can express the subimages in an equation similar to (8):

Ĩ = Ĩ_A + Ĩ_E, (12)

where Ĩ_A ∈ R^(m²×n) is the low-rank matrix and Ĩ_E ∈ R^(m²×n) is the sparse error matrix, and the augmented Lagrange multiplier function is

L(Ĩ_A, Ĩ_E, Y, μ) = ‖Ĩ_A‖_* + λ‖Ĩ_E‖_1 + ⟨Y, Ĩ − Ĩ_A − Ĩ_E⟩ + (μ/2)‖Ĩ − Ĩ_A − Ĩ_E‖²_F. (13)

In this equation, λ is a positive weighting parameter representing the ratio of the sparse matrix Ĩ_E to the low-rank matrix Ĩ_A, μ is a positive penalty value, ⟨X, Y⟩ = Δ(X, Y) is the trace of XᵀY, and Y is the iterated Lagrange multiplier.
The augmented Lagrange multiplier method has excellent convergence and solution accuracy. In the method proposed in this paper, RPCA is coupled with the inexact augmented Lagrange multiplier (IALM) method to determine the low-frequency coefficients of the LWT for corrupted example images. Firstly, we define the variables used below, which are listed in the Notation of IALM section; k denotes the iteration index.
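A minimal sketch of the IALM iteration for the RPCA problem min ‖A‖_* + λ‖E‖_1 s.t. A + E = D follows. The initialization and parameter choices (λ = 1/√max(m, n), μ growth factor ρ = 1.5) are standard defaults from the RPCA literature, not necessarily the paper's exact settings:

```python
import numpy as np

def rpca_ialm(D, lam=None, n_iter=100, tol=1e-7):
    """Inexact ALM for RPCA: min ||A||_* + lam*||E||_1  s.t.  A + E = D."""
    m, n = D.shape
    lam = lam if lam is not None else 1.0 / np.sqrt(max(m, n))
    norm_two = np.linalg.norm(D, 2)                # spectral norm
    Y = D / max(norm_two, np.abs(D).max() / lam)   # common dual initialization
    mu, rho = 1.25 / norm_two, 1.5
    E = np.zeros_like(D)
    norm_D = np.linalg.norm(D)
    for _ in range(n_iter):
        # A-step: singular value thresholding at 1/mu
        U, s, Vt = np.linalg.svd(D - E + Y / mu, full_matrices=False)
        A = U @ np.diag(np.maximum(s - 1.0 / mu, 0.0)) @ Vt
        # E-step: entrywise soft thresholding at lam/mu
        T = D - A + Y / mu
        E = np.sign(T) * np.maximum(np.abs(T) - lam / mu, 0.0)
        Z = D - A - E                              # primal residual
        Y += mu * Z                                # multiplier update
        mu *= rho                                  # penalty update
        if np.linalg.norm(Z) / norm_D < tol:
            break
    return A, E

rng = np.random.default_rng(3)
L0 = rng.standard_normal((30, 5)) @ rng.standard_normal((5, 30))  # rank-5 part
S0 = np.zeros((30, 30))
spots = rng.random((30, 30)) < 0.05               # ~5% gross errors
S0[spots] = 10.0 * rng.standard_normal(spots.sum())
A_hat, E_hat = rpca_ialm(L0 + S0)
```

Each pass alternately minimizes the augmented Lagrangian in A and in E and then updates Y and μ, which is exactly the update pattern described for Ĩ_A and Ĩ_E below.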
Then, the results Ĩ_A at iteration k + 1 and Ĩ_E at iteration k + 1 are computed by alternately minimizing the augmented Lagrangian with respect to each variable while the other is held fixed, after which the multiplier Y and the penalty μ are updated. In summary, IALM is used to determine the low-frequency component of the example images. The dictionary-training times of the proposed method are 61.32%, 62.00%, 60.40%, and 62.67% shorter than those of SRSR as the number of atoms K increases. A larger number of atoms gives the overcomplete dictionary pair greater reconstruction accuracy but requires more time to train. In this study, balancing training accuracy against time complexity, the value of K is set to 1024.

Robustness Experiments.
As mentioned in Section 1, PCA fails to estimate the original data when the data are severely corrupted. To solve this problem, sparse coefficients are determined through RPCA instead of PCA.
To verify the proposed method's robustness to missing data and errors, 10% random errors and 10% missing data were applied to the HR image patches used for training the dictionary, so the total corruption of the original images reaches 20%. An overcomplete dictionary pair D_h and D_l was then estimated to perform SR. Test images called "bookshelf," "girl face," "lena," "flower," and "building" were processed by four SR algorithms: Bicubic, NESR, SRSR, and the proposed method.
The peak signal-to-noise ratio (PSNR) expresses the ratio between the maximum possible power of a signal and the power of the distorting noise that affects the quality of its representation. This objective metric is used to compare the robustness of the algorithms by measuring the proximity between the SR-reconstructed image and the original image.
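PSNR is computed as 10 log10(peak² / MSE). A short sketch, assuming 8-bit images (peak value 255):

```python
import numpy as np

def psnr(ref, out, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference and a reconstruction."""
    mse = np.mean((ref.astype(float) - out.astype(float)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

ref = np.zeros((8, 8))
out = ref + 16.0           # a uniform error of 16 gray levels -> about 24 dB
```

A perfect reconstruction gives infinite PSNR; larger distortions give lower values, so higher PSNR in Table 1 indicates better robustness.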
Table 1 shows the PSNR (dB) performance measures for the four SR algorithms. The results reveal that the proposed method retains the highest PSNR when the example images are corrupted.

Summary
To address the computational complexity of training the overcomplete dictionary pair, this paper proposes a 3-tier LWT that decomposes the example images into low-frequency and horizontal, vertical, and diagonal high-frequency components. The high-frequency components of the example images can be accurately estimated from the second and third layers' wavelet coefficients. PCA fails to estimate the gray values of broken pixels in the example images when corruption is severe, even if only very few of the observations are affected. Exploiting the similarity of multiframe observation images, this research uses RPCA coupled with the IALM method instead of PCA to solve the robustness problem.
Experimental results show that the new algorithm not only shortens the time to train the overcomplete dictionary pair by more than 60% but also provides significantly improved clarity in the reconstructed SR images.

Figure 1 :
Figure 1: PCA fails to reconstruct when data is corrupted by large errors.

Figure 3 :
Figure 3: Comparison of time complexity.
2.2. Training Overcomplete Dictionary Pair. The joint dictionary pair D_h and D_l is trained using known example image patches. Given x̂ ∈ R^(a×a) as HR image patches and ŷ ∈ R^(b×b) as LR image patches, the optimization problem becomes the following:

min_{D_h, D_l, α} ‖x̂ − D_h α‖²_2 + ‖ŷ − D_l α‖²_2 + λ‖α‖_1. (7)

Ĩ_A ∈ R^(m²×n) denotes the clean and integrated low-frequency subimage sequence matrix, and Ĩ_E ∈ R^(m²×n) denotes the sparse matrix of errors and noise that remain even though the low-pass filter removes most of the noise power. The coefficient values of the low-frequency subimages of the same scene are similar after an LWT. If the recovered matrix Ĩ_A were free of noise and the corrupted elements were fixed successfully, all column vectors in Ĩ_A would share similar underlying image structures, and the rank of Ĩ_A would be low, which is the prerequisite for matrix completion and RPCA. In such an ideal case, a good estimate of Ĩ_A may be found by matrix completion and RPCA, as indicated by the following equation:

min ‖Ĩ_A‖_* + λ‖π_Ω(Ĩ_E)‖_1 s.t. Ĩ_A + Ĩ_E = Ĩ. (14)

Table 1 :
PSNR comparisons of SR algorithms.
4.1. Time Complexity Experiments. To assess the accuracy and robustness of training the overcomplete dictionary pair, 100,000 standard LR and HR image patches were partitioned and trained from image libraries (http://decsai.ugr.es/cvg/dbimagenes/). To validate the new procedure, four numbers of atoms were set, respectively: K_1 = 256, K_2 = 512, K_3 = 1024, and K_4 = 2048. The size of each atom is 9 × 9 pixels. Unlike the SRSR algorithm, the proposed method applies the 3-tier LWT to the example images. By adopting the inverse lifting wavelet transform (ILWT), the reconstructed HR image patches preserve the critical high-frequency features, while the number of example image pixels involved in training the dictionary is reduced by 75%. Experimental results are shown in Figure 3.