Self-Similarity Superresolution for Resource-Constrained Image Sensor Node in Wireless Sensor Networks

Wireless sensor networks, in combination with image sensors, open up a broad field of sensing applications. Recovering a high resolution (HR) image from its low resolution (LR) counterpart is a challenging problem, especially for low-cost, resource-constrained image sensors with limited resolution. Sparse representation-based techniques have increasingly been developed to solve this ill-posed inverse problem. Most of these solutions are based on an external dictionary learned from a huge image gallery and consequently need many iterations and a long matching time. In this paper, we explore the self-similarity inside the image itself and propose a new combined self-similarity superresolution (SR) solution with low computation cost and high recovery performance. In the self-similarity image super resolution model (SSIR), a small sparse dictionary is learned from the image itself by methods such as K-SVD. The most similar patch is searched for and specially combined during the sparse regulation iteration. Detailed information, such as edge sharpness, is preserved more faithfully and clearly. Experimental results confirm the effectiveness and efficiency of this double self-learning method for image super resolution.


1. Introduction
Wireless sensor networks, in combination with image sensors, open up a broad field of sensing applications. Visual information provided by image sensors is the most intuitive information perceived by humans, especially for recognition, monitoring, and surveillance. Low-cost, resource-constrained image sensors with limited resolution are mainly employed [1][2][3], so recovery from low resolution to high resolution is a pressing need for the image sensor node. Image super resolution (SR) has received growing interest recently and has many applications in image sensors, digital cameras, mobile phones, image enhancement, high definition TV [4][5][6], and so forth. It aims to reconstruct a high-resolution (HR) image from the low-resolution (LR) one based on reasonable assumptions or prior knowledge. From the viewpoint of the target HR image, the LR image can be generated by downsampling and a blurring operator.
Hence, the SR task has always been formulated as an inverse problem:

y = ΦBx + n, (1)

where x is the HR image to be recovered, y is the known LR image, Φ is the downsampling operator, B is the blurring operator that minimizes the high frequency aliasing effect, and n is the noise. Traditionally, the downsampling operator Φ and the blurring operator B are applied together. Hence, we can use formulation (2) instead of (1):

y = Hx + n, (2)

where H = ΦB is the generalized blurring and downsampling operator. However, the detailed information, especially the high frequency part, is lost after these two operations. Hence, image super resolution becomes a highly underdetermined reconstruction problem.
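As a minimal illustration, the degradation model y = Hx + n can be simulated by fusing the blur B and decimation Φ into a single average-pooling operator. This is a sketch under an assumed 2 × 2 box kernel, not the paper's actual blur kernel:

```python
import numpy as np

def degrade(x_hr, scale=2, noise_sigma=0.0):
    """Sketch of y = Hx + n with H = Phi*B: a box blur B followed by
    decimation Phi, implemented jointly as average pooling, plus noise n.
    The box kernel is an illustrative assumption."""
    h, w = x_hr.shape
    y = x_hr.reshape(h // scale, scale, w // scale, scale).mean(axis=(1, 3))
    if noise_sigma > 0:
        y = y + np.random.normal(0.0, noise_sigma, y.shape)
    return y

hr = np.arange(16.0).reshape(4, 4)
lr = degrade(hr, scale=2)
print(lr.shape)
```

Each LR pixel is the mean of a scale × scale HR block, which is exactly why the high-frequency detail discussed above cannot be recovered by inverting H directly.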
The classical SR solutions are interpolation-based methods, including bilinear, bicubic, and spline interpolation and other improved versions [7,8]. These methods tend to generate overly smooth HR images with ringing and jaggy artifacts, so their visual clarity is very limited. Edge-preserving and directional interpolators have been proposed to improve the reconstructed image's visual clarity [9][10][11]. However, blurring and noise remain obstacles to overcome.
Sparse representation-based SR methods have become more popular recently since the sparse representation problem is consistent with (2). Sparse representation provides a different perspective on solving underdetermined problems [12][13][14][15]. This powerful and promising tool has proven effective for a wide range of problems, such as sub-Nyquist sensing of signals and coding, image denoising, and deblurring [16][17][18][19][20][21][22][23]. Several sparse representation based SR algorithms have been proposed with superior results reported [12,22,24,25]. Most of them require training dictionaries on a large-scale external image gallery, which match the target image to only a limited degree and are time consuming. Another issue is that the external dictionary depends on the blurring model and so has less generality. Self-learning SR algorithms, which have emerged lately, show that the internal statistics of the image itself often have stronger predictive power than external statistics and can give more powerful image-specific priors [26,27].
In this paper, we explore the self-similarity inside the image and propose a new combined self-similarity super resolution (SR) solution, which successfully restores the missing detailed image information. In this self-similarity image super resolution model (SSIR), the patches from the LR image are first downsampled to form smaller LR patches (SLR). A small sparse dictionary is learned from the image itself by methods such as K-SVD. Then, the most similar patch for each unrecovered LR patch is searched for and combined during the sparse iteration to preserve faithful detailed information. Experimental results confirm the effectiveness and efficiency of this double self-similarity learning method for image super resolution.
The rest of this paper is organized as follows. Section 2 describes our SSIR framework with its self-learned dictionary. In Section 3, experiments are conducted to compare the proposed method with other methods. Conclusions are given in Section 4.

2. The Proposed Self-Similarity-Based Image Super Resolution Approach

2.1. Patch-Based Sparse Representation. The HR image recovery procedure can be seen as the minimization of the ℓ1-norm problem:

min_α ‖α‖_1 s.t. ‖y − HDα‖_2 ≤ ε, (3)
x = Dα, (4)

where y is the LR image and H is the generalized blurring and downsampling degradation matrix. The quality of the recovered HR image is largely determined by details such as edges and contrast. However, such details are lost when the HR image is downsampled. Hence, small-patch-based recovery is more popular than whole-image-based recovery, since it prevents the loss of large-scale details, and we follow the patch-based learning strategy in our approach. For an N × N LR image, the atoms in D are learned from patches of size n × n, where n can be 8, 10, and so forth. Then the sparse representation (4) can be rewritten patchwise as

min_{α_i} ‖α_i‖_1 s.t. ‖y_i − HDα_i‖_2 ≤ ε, x_i = Dα_i, (5)

where y_i is the LR patch of size (n/scale) × (n/scale), x_i is the HR patch, α_i is the coefficient vector of the patch, and H and D act on the corresponding patch of size n × n.

2.2. Internal Dictionary Learning. Most sparse representation SR methods are based on dictionary learning from an external image library [12,22,25]. The number of atoms in the dictionary D must be large enough to ensure the sparsity of α and to avoid image hallucination and blurring [16]. Normally, the size of an external dictionary is above a thousand atoms and the recovery time is correspondingly large.
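The patch-based strategy above (raster-scan extraction with overlap, as in Step 1 of Algorithm 1) can be sketched as follows; the patch size and overlap here are example values, not fixed by the formulation:

```python
import numpy as np

def extract_patches(img, patch=8, overlap=3):
    """Raster-scan patch extraction with pixel overlap."""
    step = patch - overlap
    patches = []
    h, w = img.shape
    for r in range(0, h - patch + 1, step):
        for c in range(0, w - patch + 1, step):
            # each patch is vectorized into one row
            patches.append(img[r:r + patch, c:c + patch].ravel())
    return np.array(patches)

lr = np.arange(64.0).reshape(8, 8)
p = extract_patches(lr, patch=4, overlap=2)
print(p.shape)
```

On an 8 × 8 image with 4 × 4 patches and 2-pixel overlap this yields a 3 × 3 grid of patches, i.e. a 9 × 16 matrix whose rows serve as training samples for the dictionary.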
For various natural images, especially high-gradient ones, high recovery performance cannot be reached easily and quickly if the dictionary is learned from an outside image gallery. External dictionary approaches are therefore not suitable for the resource-constrained image sensor node. A different idea is to make full use of the information inside the image itself, as shown in [26,27]. Textures or patterns with the same structure can be found more easily within the image, so for a given target image the dictionary does not need to be enormous to match different kinds of natural images. Inspired by [26,27], the dictionary D in our approach is first learned from the LR image to classify the local structures.
The internal training patches are extracted from the LR image and then used to generate an overcomplete dictionary D ∈ R^(n²×K) which contains K atoms. It is assumed that a training patch y_TS can be represented as y_TS = Dα, which satisfies ‖y_TS − Dα‖ < ε. Hence, the training dictionary is the solution of

min_{D,{α_i}} Σ_i ‖α_i‖_0 s.t. ‖y_TS,i − Dα_i‖_2 ≤ ε for every training patch i. (6)

Iterative optimization is used to solve this dictionary training problem. Each iteration consists of two basic steps: (1) sparse coding: fix the dictionary D and search for the sparse representation α of each patch; and (2) dictionary update: update the dictionary atoms {d_k}, k = 1, ..., K, and their corresponding coefficients one by one. Inspired by [28,29], we use the orthogonal matching pursuit (OMP) algorithm in the sparse coding step and K-singular value decomposition (K-SVD) based iterative optimization in the dictionary update step, respectively. These two steps run iteratively until the maximum number of iterations or convergence is reached.
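The sparse coding step can be illustrated with a minimal OMP implementation. This is a generic sketch of the algorithm, not the authors' code, and the toy orthonormal dictionary exists only to make the check deterministic:

```python
import numpy as np

def omp(D, y, sparsity):
    """Orthogonal Matching Pursuit: greedy sparse coding.
    D: (d, K) dictionary with unit-norm atoms; y: (d,) signal."""
    residual = y.astype(float).copy()
    support = []
    coef = np.zeros(0)
    for _ in range(sparsity):
        # pick the atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # re-fit coefficients on the chosen support by least squares
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    alpha = np.zeros(D.shape[1])
    alpha[support] = coef
    return alpha

# Toy check: with an orthonormal dictionary, OMP recovers the exact code.
D = np.eye(8)
y = 2.0 * D[:, 1] + 3.0 * D[:, 5]
alpha = omp(D, y, sparsity=2)
print(round(alpha[1], 6), round(alpha[5], 6))
```

In the K-SVD dictionary update step, each atom and its coefficients are then refit from the rank-1 SVD of the residual restricted to the patches that use that atom.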
Typically, the self-learned dictionary size K is set below 256 in our approach, and we obtain recovery performance similar to that of external dictionaries. A detailed comparison is given in Section 3.

2.3. Self-Similarity Regulation Scheme.
Local image structures in the LR image can be classified by the patch dictionary learned from the image itself. However, detailed information, such as sharp edges and corners, cannot be clustered perfectly by a limited number of atoms and may be lost to some extent after downsampling from the HR patch. Table 1 demonstrates a real HR patch from Lena, its corresponding LR patch, and the reconstructed patch obtained with a self-learned K-SVD dictionary of 256 atoms. As Table 1 shows, the rich variation among the HR pixels is omitted in the LR patch and smoothed in the reconstructed patch. The smoothing effect under the K-SVD dictionary arises mainly because the dictionary atoms are trained not for one specific patch, but for all the patches in the image.
Hence, accurate reconstruction of each patch is difficult even under the sparse self-learned dictionary, and more prior information should be incorporated into the recovery procedure to improve the HR image quality. Several additional constraints have been studied, such as frequency, histogram, low-pass, and nonlocal means constraints [22,25]. Unlike these statistical constraints, we use true information inside the image as the regulation index.
As aforementioned, distinct edges and corners become blurred after the downsampling operation: information is lost when the HR image is downsampled to the LR image. A similar information loss appears when the LR image is downsampled to an even lower resolution image, and the information lost during this latter procedure can be recovered from the image before downsampling. This provides a way to learn how to recover more realistic HR patches. A new self-similarity regulation scheme is therefore proposed, based on finding an image patch similar to the one being recovered:

f(p) < δ_similar, (7)

where δ_similar is the regulation threshold and f(·) is the similarity prior. We divide the whole sparse regulation into two steps: self-similarity regulation and sparse dictionary regulation. The self-similarity regulation step can be seen as an internal regulation step that compensates the sharpness of the edges; the sparse dictionary regulation step provides the basic framework to enlarge the LR image. The self-similarity regulation step is described in detail in Figure 1. First, the input unrecovered LR patch, denoted y_LR, is upscaled by the bicubic operator. Then, a similar HR patch of the same upscaled size, denoted p_HR, is searched for around the LR patch y_LR inside the LR image y. If a similar HR patch p_HR is found, its corresponding downsampled LR patch p_LR is obtained, and the true HR patch x_HR is approximated from the similar HR patch p_HR. Because this recovered HR patch x̂_HR comes from real pixels, it can be closer to the ground truth x_HR than a patch recovered by the statistical constraints studied previously. During the approximation, the similar downsampled LR patch p_LR is first subtracted from the unrecovered LR patch y_LR. Then the difference is estimated from this residual r_LR by the self-learned sparse dictionary; the estimate is denoted Δx_HR. Finally, the recovered HR patch x̂_HR is computed by adding the similar patch p_HR and the difference estimate Δx_HR.

Algorithm 1:
Input: LR image y, LR patch size m, HR patch size n, the degradation matrix H.
Output: HR image x.
Step 1. Extract patches y_LR ∈ R^m from the LR image y in raster-scan order, starting from the upper-left corner (some pixel overlap in each direction is allowed).
Step 2. Recover HR image patches x̂_HR iteratively by Steps 2.1 and 2.2, until the maximum number of iterations or convergence is reached.
Step 2.1. Self-similarity regulation step:
Step 2.1.1. Use the bicubic method to upscale the unrecovered LR patch y_LR to the same size n as the HR patch, defined as b_HR.
Step 3. Assemble all x̂_HR to recover the HR image x (if there is pixel overlap, the weighted average method is needed).
The above self-similarity regulation step can be represented as

x̂_HR^(k+1/2) = p_HR^(k) + Δx_HR^(k), (8)

where k is the current iteration index, p_HR^(k) is the most similar patch found in the kth iteration, Δx_HR^(k) is the recovered difference between p_HR^(k) and x̂_HR, x̂_HR^(k+1/2) represents the updated x̂_HR, and D_l and D_h are the dictionaries trained for low-resolution and high-resolution patches, respectively.
We introduce the sum of squared errors (SSE) as the self-similarity prior f(·) and use it to decide which patch is the most matching one. The SSE is defined as

SSE = Σ_i (N_i − B_i)², (9)

where N_i are the pixels taken from a neighbor patch of y_LR in the searching zone and B_i are the pixels taken from the bicubic upscaled patch b_HR; both have the same size as the output HR patch x̂_HR. The patches we search come from the LR image itself, so fidelity is guaranteed. The similarity threshold δ_similar is used to decide whether a patch is similar to the destination HR patch. δ_similar is adaptive to y_LR instead of being a fixed value; it is defined as δ_similar = a·Var + b, where Var is the variance of the patch y_LR being processed and a, b are associated parameters. If the minimum f(·) within the searching zone is smaller than δ_similar, the corresponding patch is taken as the most similar patch p_HR.
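The SSE-based search over a neighborhood can be sketched as below. The helper names and the values of a, b, and the zone size are hypothetical, and the reference patch stands in for the bicubic-upscaled patch b_HR:

```python
import numpy as np

def sse(a, b):
    """Sum of squared errors between two equally sized patches."""
    return float(np.sum((a - b) ** 2))

def find_similar_patch(img, ref, top_left, zone=10, a=0.5, b=1.0):
    """Scan a roughly (zone x zone) neighborhood of `top_left` in `img`
    for the patch with the smallest SSE against the reference patch `ref`.
    The acceptance threshold a*Var + b is adaptive to the reference patch."""
    p = ref.shape[0]
    r0, c0 = top_left
    best, best_pos = None, None
    for r in range(max(0, r0 - zone // 2), min(img.shape[0] - p, r0 + zone // 2) + 1):
        for c in range(max(0, c0 - zone // 2), min(img.shape[1] - p, c0 + zone // 2) + 1):
            d = sse(img[r:r + p, c:c + p], ref)
            if best is None or d < best:
                best, best_pos = d, (r, c)
    threshold = a * float(np.var(ref)) + b
    return (best_pos, best) if best <= threshold else (None, best)

# Toy check: plant an exact copy of the reference patch in the image.
img = np.zeros((16, 16))
ref = np.arange(16.0).reshape(4, 4)
img[6:10, 6:10] = ref
pos, d = find_similar_patch(img, ref, (5, 5), zone=10)
print(pos, d)
```

Because the candidate patches come from the LR image itself, a match below the adaptive threshold carries real pixel detail rather than a statistical estimate.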
The sparse dictionary regulation step is then performed under the self-learned dictionary, which can be represented by

x̂_HR^(k+1) = D_h α̂^(k), α̂^(k) = argmin_α { ‖x̂_HR^(k+1/2) − D_h α‖_2² + λ‖α‖_1 }. (10)

The above two regulation steps are performed alternately until the maximum number of iterations or convergence is reached.
The procedure of the self-similarity regulation scheme is described in detail in Algorithm 1.

2.4. Overall Diagram of the Self-Similarity-Based Image Super Resolution Approach. After all the analyses above, the overall diagram of the self-similarity-based image super resolution approach is shown in Figure 2. First, the input LR image y, regarded as a downsampled version of the corresponding HR image x, is segmented into patches y_LR. Then the sparse representation dictionaries D_l and D_h are trained from these internal patches. Next, the self-similarity regulation scheme is applied to find a matching patch p_HR. Afterwards, the HR patch x̂_HR is recovered by sparse regulation based on the self-learned dictionary. Finally, we assemble all the recovered HR patches x̂_HR to obtain a high-quality HR image x.

3. Experiments

3.1. Experimental Background.
In this section, several experimental results for the proposed method are given. All simulations are conducted in MATLAB 7.5 on a PC with an Intel Core 2 CPU at 1.6 GHz and 1 GB of RAM. The test LR images include several typical 256 × 256 natural images, and we aim to recover their 512 × 512 HR counterparts. Input LR images produced with different degradation matrices H (the direct downsampling matrix H_d and the blur downsampling matrix H_b) are tested. Every experiment is evaluated by the luminance peak signal-to-noise ratio (Y-PSNR) and the SSIM index and is compared with state-of-the-art methods such as Yang et al.'s [12,22] and Dong et al.'s [25]. We thank the above authors for providing their program code. Figure 3 shows the experiment on the image Lena under the different downsampling matrices. Dong et al.'s [25] NCSR method cannot achieve acceptable performance without the Gaussian low-pass filter, which is why that case is not illustrated in Figure 3. These experimental results show that our method outperforms the state-of-the-art methods [12,24,25] in both cases. The Bicubic method cannot recover the high-frequency details in either case. Although Yang et al.'s [24] method can recover the blur-downsampled LR image very well, it produces too many artifacts and fake high-frequency details in the direct downsampling case.
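The Y-PSNR metric used throughout the experiments can be computed as follows; the peak value of 255 assumes 8-bit luminance:

```python
import numpy as np

def y_psnr(ref, test, peak=255.0):
    """Luminance PSNR (dB) between a reference and a recovered image."""
    mse = np.mean((ref.astype(float) - test.astype(float)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.zeros((8, 8))
test = ref + 1.0                      # uniform error of 1 gray level
print(round(y_psnr(ref, test), 2))   # 20*log10(255) ≈ 48.13
```

In practice, the comparison is done on the luminance (Y) channel only, since chroma errors contribute little to perceived sharpness.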

3.2. Experiments on Different Downsampled Images.
The experimental result on the image Pepper is shown in Figure 4. Pepper has many edges, which makes it a suitable test image. Images produced by industrial environment sensors are tested too, as shown in Figures 5 and 6. The recovered high resolution images in Figure 6 show the effectiveness of our approach.
Furthermore, we run experiments on the Foreman video sequence to test the stability of our algorithm. Each frame is processed as an image. Figure 7 shows the PSNR comparison between the proposed method and the Bicubic method. The proposed approach stably outperforms the Bicubic method. From about the 210th frame, recovery performance decays rapidly, since the subsequent frames are full of wild high-frequency details.

3.3. Influence of Different Parameters.

To further observe the impact of different parameters, several comparison experiments are conducted.

Influence of Dictionary Size. Another advantage of the proposed approach is that the sparse dictionary needs only a small number of atoms: 128 atoms are enough to obtain a favorable result with the proposed method. Meanwhile, Yang et al.'s method [12,24] needs to train external dictionaries with at least 512 atoms. In [22], Yang et al. propose a CS-based SR method, which also needs a dictionary with 500 atoms trained on an external database. Comparison experiments are conducted on gray 512 × 512 natural images, including Lena, Pepper, and Boat. Table 3 shows the recovery PSNR of three sparse-based SR methods with different dictionary sizes. The proposed method can recover favorable HR images with the smallest dictionary. The test results show that the proposed self-similarity learning method is more suitable for resource-constrained image sensor nodes.
For external-dictionary-based SR methods, recovery performance improves as the dictionary size grows. Figure 8 shows another comparison on Lena between Yang et al.'s method [24] and the proposed method. Yang et al.'s method [24] is run with dictionary sizes of 256, 512, 1024, and 2048; the proposed method is run with dictionary sizes of 64, 128, 256, and 512. We use the PSNR increment over the Bicubic method as the comparison index. From the PSNR growth curves in Figure 8, we can see that the recovery performance of Yang et al.'s method relies much more on the dictionary size: its dictionary must be roughly three times larger than the dictionary in the proposed method. By contrast, our approach gives stable performance across dictionary sizes.

Influence of Self-Similarity Searching Zone. Self-similarity is introduced as the sparse regulation prior in our approach, and the above tests show its effectiveness and stability in preserving detailed information such as edge sharpness. The size of the self-similarity searching zone is tested here on the image Tank, from an 8 × 8 to a 14 × 14 neighborhood. The results are illustrated in Table 4 and Figure 9, and the similar patches found are shown in Figure 10. The experiment shows that more edge patches can be found, and the recovery performance improves, as the searching zone grows.

3.4. Limitation and Further Research Direction. Although we have shown the outstanding performance of the proposed self-similarity super resolution approach, some limitations should still be considered. Like most SR methods, the proposed method assumes that the degradation matrix H is known; further research should consider how to estimate the optimal blur kernel in the blind setting. Another point is that the SSE self-similarity prior used in the proposed algorithm is quite simple. We will explore more delicate priors, such as Parzen window estimation [31] and BM3D [32], to obtain a better match with the destination HR patch.

4. Conclusion
This paper has presented a novel double self-similarity super resolution approach for resource-constrained image sensor nodes in wireless sensor networks. The proposed method needs no external database; it uses only the LR image itself as the training sample for a sparse representation dictionary with a small number of atoms. A self-similarity sparse prior is combined into the regulation iteration to preserve detailed information. Experiments are conducted on benchmark test images, and the effects of different parameters have been surveyed. Comparative tests show the effectiveness and stability of the proposed method over state-of-the-art sparse-based SR methods.

Figure 2: The overall diagram of the self-similarity-based image super resolution approach.
Figure 3: (a) The original Lena image. (b)-(d) The HR Lena images recovered from the H_d downsampled LR image by Bicubic, Yang et al.'s [24] method, and the proposed method, respectively. (e) The image recovered by Dong et al.'s [25] NCSR method, which uses an elaborate Gaussian low-pass filter. (e)-(g) The recovered HR Lena images from the H_b downsampled LR image by Bicubic, Yang et al.'s [24] method, and the proposed method, respectively.

Figure 5: Low resolution test images from industrial environment sensors.

Figure 6: Recovered high resolution test images from industrial environment sensors (scale factor = 2).

Figure 7: Self-similarity-based SR performance on the Foreman video sequence.

Figure 8: Recovery performance comparison over different dictionary sizes.


Table 1: HR patch, corresponding LR patch, and reconstructed patch under the K-SVD dictionary.
In this test, our method is evaluated on several 512 × 512 common experimental natural images, such as Lena, Plane, and Pepper. The input 256 × 256 LR image is downsampled from the original 512 × 512 HR image. We use both the direct downsampling degradation matrix H_d and the blur downsampling degradation matrix H_b to test the algorithm's adaptability. First, a sparse dictionary is trained on the 8 × 8 patches taken from the input LR image. The dictionary has 128 atoms; hence, it is a 64 × 128 matrix. Then, the 8 × 8 HR image patches are recovered from the 4 × 4 LR image patches under our self-similarity-based SR approach. We set 3 pixels of overlap in the LR patches by default, and the neighbor searching zone is set to 10 × 10.

Table 2: Comparison results of different SR methods.

A similar result is observed for the recovery effect around the edge. The edge recovered by Yang et al.'s [24] method is not clear when the LR image is downsampled by H_d. This failure may be caused by the inconsistency between Yang et al.'s [24] pair of HR and LR dictionaries. In comparison, our proposed method preserves the edge's sharpness well. Beyond edge sharpness, the information recovered by self-learning is more faithful to the true HR details. More benchmark comparisons are illustrated in Table 2. Our proposed method shows high recovery performance under both kinds of downsampling degradation matrices. The comparison shows that self-similarity is a powerful image-specific prior for sparse representation SR methods.

Table 3: Recovery PSNR of three sparse-based SR methods with different dictionary sizes.

Table 4: Recovery effects of different searching zone sizes on image Tank.