Shearlet-Wavelet Regularized Semismooth Newton Iteration for Image Restoration

Image normally has both dots-like and curve structures. But the traditional wavelet or multidirectional wave (ridgelet, contourlet, curvelet, etc.) could only restore one of these structures efficiently so that the restoration results for complex images are unsatisfactory. For the image restoration, this paper adopted a strategy of combined shearlet and wavelet frame and proposed a new restoration method.Theoretically, image sparse representation of dots-like and curve structures could be achieved by shearlet and wavelet, respectively. Under the L 1 regularization, the two frame-sparse structures could show their respective advantages and efficiently restore the two structures. In order to achieve superlinear convergence, this paper applied semismooth Newton method based on subgradient to solve objective functional without differentiability. Finally, through numerical results, the effectiveness of this strategy was validated, which presented outstanding advantages for any individual frame alone. Some detailed information that could not be restored in individual frame could be clearly demonstrated with this strategy.


Introduction
Image restoration is a processing procedure to rebuild or restore deteriorated images.Since the 1950s, many technologies have been developed for image restoration.In practical applications, image restoration could be eventually transformed into inverse problems, for example, geological exploration [1,2], medical imaging [3,4], and astronomy [5,6].Generally, image is defined as the 2D matrix in the R 2 space.The image is often recorded with a device known as a CCD (charge coupled device).The image often contains two kinds of noise, that is, Poisson noise and Gaussian noise.The Poisson component models the photon count, while the additive Gaussian term accounts for background noise in the recording electronics [7].
In this paper we discuss image restoration with high SNR.So we just consider the noise model is additive Gaussian.Though Gaussian white-noise processes do not exist physically, they provide a convenient approximation and manipulative tool.Because of its mathematical tractability in both the spatial and frequency domains, Gaussian (also called normal) noise models are used frequently in practice.
In fact, this tractability is so convenient that it often results in Gaussian models being used in situation in which they are marginally applicable at best.
Image restoration mainly includes parametric and nonparametric object models.There are plenty of literatures about nonparametric object models.Though few works are related to parametric object models, it is an interesting direction.
For parametric image restoration, the desire to know parameters that cannot be directly observed often occurs in practice [7,8].Given an output image, the parametric image restoration task is to find a set of parameters that represent the input image.Methods solving parametric object models mainly include ML (maximum-likelihood) estimation method, EM (expectation-maximization) estimation method, and constrained estimation method (regularization method).In this paper we just discuss nonparametric object models.The primary objective of image restoration is to restore the unknown image  ∈ R 2 by observation image : where  is Gaussian white noise with standard deviation  and  is linear operator.In image deconvolution, image inpainting, and image denoising,  is the convolution operator, projection operator, and identity operator, respectively.Currently, "sparsity" in image restoration has been getting more attention, and the sparse strategy has created a new field for studies on the regularization of inverse problem.The primary principle of this strategy is to represent original image by a column of coefficients under a certain frame.If the frame is selected appropriately, only a very few nonzero coefficients exist, that is, the sparse representation.The frame normally selected Fourier transform or windowed Fourier transform, cosine transform, and wavelet transform.The representation method of wavelet and its multiscale is the most common tool in image processing issues.However, wavelet and its multiresolution analysis theory are greatly limited in terms of directionality and all of its scales and positions show isotropy.Although the application of wavelet in singular signal representation has achieved great success, the limitation leads to the situation that wavelet frame fails to accurately represent highly anisotropic images.Therefore this phenomenon makes multidirectional frames emerging persistently.
Candés and Donoho set up a series of new multidirectional frameworks, such as curvelet [9] and ridgelet [10].These frameworks have essential differences from wavelet frames, showing more acute directionality as well as anisotropy.For example, curvelet is located along the curve in 2D space and along the slice in 3D space.The final numerical results of this paper will present that wavelet has fundamental differences from other multiscale frameworks.Curvelet, as a representative of the new generation of multiscale framework, has been proved efficient in image restoration but it could still be improved greatly [11,12].A strategy of combined wavelet and curvelet [13] also reached remarkable effects in image restoration.The combination of different frames could produce different results of image restoration [14].For example, wavelet could not restore the slender boundaries with high fidelity.Meanwhile, curvelet could not restore dots-like characteristics such as eyes of people satisfactorily.Generally, each frame has its own strengths and potentials, which also inspire researchers to try different combinations of frames for image restoration [15][16][17][18][19].
Recently, a new multiscale framework, namely, shearlet, has been found.Shearlet [20][21][22][23] could provide unified processing method for continuous and discrete models, rapid decomposition operator, and optimal sparse representation.Shearlet shares many similarities with curvelet.However, shearlet also has some advantages that curvelet does not have.In fact, shearlet originates from affine system and is generated by singular functions, showing simpler structure.However, the translation, rotation, and stretching involved in curvelet function are not generated by singular function.Moreover, a very important feature of shearlet is that it possesses multiresolution analysis, which is crucial for achieving fast algorithm.It should also be noted that numerical calculation showed more advantages in shearlet transform for shearlet than in the rotational transform for curvelet [24,25].Thus, it could be naturally inferred that the combination of wavelet and shearlet could acquire faster (because of the sparser coefficient) and more accurate algorithm when restoring dotlike and curve images.In fact, literature [26] has theoretically demonstrated the feasibility of combining wavelet with shearlet.
This paper is structured as follows.The first chapter introduces the theoretical framework of shearlet.The second chapter shows the sparse algorithm combined with wavelet and shearlet based on  1 regularization.The third chapter testifies the superiority of this algorithm for image restoration through numerical results.The fourth chapter gives the conclusion and prospects of this paper.

Shearlet Transform
2.1.Definition of Shearlet.The primary idea of shearlet transform [23] is to ensure the multiscale and multidirectionality of the transform by selecting two parameters.Traditionally, for the rotational direction, unified processing methods were hard to form for continuous and discrete regions due to the changes of grids.Therefore, the selection of direction parameter in shearlet became especially important.Shearlet used slope rather than angles to express directions.Shearlet matrix retains the integral constitution of grid, which is crucial for shearlet because it enables the accurate digitalization of shearlet in continuous regions.
For  > 0,  ∈  2 , let   and   be parabolic scaling matrix and shearlet matrix: In order to realize equivalent treatment results along and -axes, frequency domain was separated into four cones  11 −  22 .As shown in Figure 1, the following forms (equations) were generated: Let  1 ∈  2 (R) be wavelet function and its Fourier transform φ1 ∈  ∞ (R) and the support suup φ1 ⊆ For  12 , at scale , and shearlet is defined by Fourier transform: where  = (, , , ), respectively, represented scale, direction, position, and cone.For other cones  11 ,  21 , and  22 , the shearlet definition was similar.Its structure and frequency domain of shearlet are shown in Figure 1, and the discrete shearlet system is denoted by The definition shows that shearlets live on anisotropic regions of width 2 −2 and length 2 − at various orientations.It should be mentioned that recently introduced compactly supported shearlets [27,28] do not require projecting the shearlets to the respective cones; however, despite other advantageous properties, they do not form a tight frame for  2 ( 2 ).Setting  = ⋃ 22 =11   , the discrete shearlet system is tight frame for { ∈  2 ( 2 ) : supp f ⊆ } from [29].For the definition of a tight frame, please refer to [30].

Fast Shearlet Transform.
Shearlet transform in continuous areas will cause trapezoidal overlapped areas in the frequency domain.Special coordinate systems were introduced in the 2D continuous frequency domain space.The discrete shearlet transform could be represented by the following five operators string: (1) classic Fourier transform; (2) pseudo-polar coordinate conversion element; (3) radiation density compensation factor weighting; (4) trapezoidal overlay decomposition; (5) inverse Fourier transform of each trapezoid.This transform could be completely transplanted to the digital area.In addition, the pseudo-polar coordinates were compatible with digital image processing, leading to the same results for discrete and continuous shearlet transform.What is more fortunate, [31] had founded pseudo-polar Fourier transform.This transform realized Fourier transform on pseudo-polar grid Ω = Ω 1 ∪ Ω 2 : On pseudo-polar grid Ω pseudo-polar Fourier transform of (, V) was defined as the following form: where 0 ≥  indicating positive integer.Pseudo-polar Fourier transform facilitates the quick shearlet transform.However, the conversion of the concept is normally indirect because shearlet transform needs a tight support frame.Moreover, pseudo-polar grid is not isometric.The most difficulty is the highly asymmetrical structure for pseudopolar grids.Therefore, for areas with dense dots, the change of variable could be used to reduce weights.In fact, with sufficient pseudo-polar radiation samples, equidistant grid could be materialized.To sum up, quick shearlet transform of ×-dimensional images encompasses the following steps.
(1) Pseudo-polar Fourier transform was used and intensive sampling was conducted along the radial direction.
(2) The density compensating weights were adopted on the pseudo-polar grid.
(3) Windows were generated by scales and shears to decompose pseudo-polar indexes into rectangle subbands, and inverse Fourier transform was carried out for each column.By proper selection of weights and windows, quick shearlet transform could realize the equidistant grid.Thus, its inverse transform requires the calculation of adjoint matrix at each step.Fast shearlet transform code [32]  Wavelet combined with shearlet transform will obtain the overdetermination system.Assume  ∈ R × is the noised image.Let where  1 ,  2 were matrixes.Each column was formed by the primary function of wavelet and shearlet transform,  1 ,  2 were coefficient vectors of wavelet and shearlet.In order to obtain the sparse representation of the image, that is, the minimum  1 norm solution, The commonly used two methods for the solution of (10) are basis pursuit [33] (BP) algorithm and matching pursuit [34] (MP) algorithm.In MP algorithm, as one of the greedy algorithms, the global optimization was not considered.It only presented the locally optimal solution in a sense.BP algorithm, as a global optimization, could stabilize the restoration of original image.However, the calculation of the algorithm was huge.In order to overcome this shortcoming, sparser coefficients should be used to represent original images.In practical application, image   always contains noises.In order to solve the optimization problem (10), it is normally transformed into unconstraint optimization problem.At each scale of , the specific form is where   ≥  0 > 0 indicating a series of regularization parameters; let  = (  1 0 0  2 ),  = (  1  2 ); the optimization problem (11) could be eventually transformed into namely, where  is the length of the sparse coefficient  and   was the th component of .
Our method can be interpreted as an extension of Glenn Easley's sparse directional image representation [35], where shearlet sparse representation with  1 regularization is applied to image denoising.In our paper, we use two dictionaries (wavelet and shearlet) combined with  1 regularization.Observe that our model is different from the models for image separation [36,37].Starck et al. [37] proposed a MCA (morphological component analysis) method for image separation, in which a model argmin 2 } uses two-frame  1 sparsity constraints to decompose the image into cartoon and texture components   +   .Our model is also different from the models for compress sensing.Plonka and Ma [38] proposed a splitting Bregman distance method for compress sensing, in which a model argmin

Preprocessing.
To avoid high complexity, before addressing optimization issues, the image   for observation should be preprocessed.Dots and curves in the image generally correspond to the high-frequency part in the frequency domain.Hence it would be essentially sufficient for achieving accurate representation with only sufficiently large scales.Therefore, some bandwidth corresponding to large-scale  can accurately represent   .This idea simplifies the solution of related issues.Firstly,  + 1 filters  0 , . . .,   could be selected.For its specific form, please see [19]. 0 is low pass filter.The observation image   could be expressed as For each scale , consistent weights   > 0 were selected.When  <   , let it satisfy   <    .On the basis of these weights, a new figure ỹ could be acquired with the specific form as The pretreatment method simplifies the optimization problem ( 12) and highlights dots and curves in the image by suppressing the low frequencies.
The convergence of soft-threshold iteration algorithm has been proved in [39].The algorithm is easy to realize but the rate of convergence is relatively slow as linear convergence [41].The convergence rate of iteration hard-threshold algorithm is relatively fast in some practical applications.However this algorithm is slower than iteration soft-threshold algorithm theoretically.The Newton method has relatively faster convergence rate, but the penalty of  1 regularization is not differentiable.Thus, Newton method could not be directly used in  1 constrained optimization.To address the key issue, Guo and Donoho et al. rebuilt  1 penalty so that it could meet the requirement of second-order continuous differential.In this way, it could realize Newton iteration.However, the structure and realization of this method is relatively complexity.Its application range was limited.Semismooth Newton method based on subgradient [43] is now gaining more and more academic attention due to its less required conditions and faster rate of convergence.Griesse and Lorenz successfully applied this method to deconvolution and deblurring under weaker conditions.Currently, semismooth Newton method for nonlinear operator has also appeared.To sum up, our research intended to use semismooth Newton method in issues of image restoration.This issue belonged to a linear inverse problem.Wavelet transform and shearlet transform matrix formed the linear operator, which performed outstanding attributes.It is believed that this method is feasible for image restoration.This sector primarily discussed necessary conditions of subgradient calculation.
Proposition 2. If  2 →  is injective, the functional has a unique minimizer  ∈  2 .This minimizer is characterized by Then the sparse constraint minimum problem ( 17) can be transformed as for some  > 0. Next we discuss the subdifferentiability of ().
Definition 3 (see [44]).Let  :  →   be defined on the open set  ⊂   .Then, for 0 <  ≤ If  is -order semismooth at all  ∈ , then we call  -order semismooth on .The function  is called a generalized derivative (or slanting function) for  at .
The sets () and () are given by *  is defined as follows: Then the generalized derivatives are given by The iterative scheme of semismooth Newton method for ( 12) is The step of semismooth Newton method for ( 12) is as follows.
(ii) Calculate the sets () and (): (iii) Compute the residual (iv) If   meets the stopping criteria, then stop; let  true =   .
Mathematical Problems in Engineering (v) If   does not meet the stopping criteria, calculate (vi) Iterate  +1 :=   + .

Numerical Results
This section showed the numerical results with  1 sparse regularization combined with wavelet and shearlet for image restoration, and we compared our algorithm with wavelet sparsity constraint method, shearlet sparsity constraint method, and TV approach.The relevant code of these methods can be downloaded at the following websites: (wavelet) http://www-stat.stanford.edu/∼wavelab/,(shearlet) http://www.shearlet.org/,(total variation) http://www.math.montana.edu/∼vogel/Book/Codes/Ch8/.
All images in the numerical results are added with Gauss white noise, and its form is Gwnoise =  * randn(512, 512), where noise standard deviation  = 30.In the combined sparsity regularization method, wavelet is "Symlet 4" with 5-level decomposition.The iteration steps of semismooth Newton method are 10 steps.Signal-to-noise ratio (SNR) is used to evaluate the effects of image restoration: where  and  are original and restored image.
As mentioned in Section 1, although a wide range of wavelet-based tools and ideas have been proposed and studied for image processing, wavelets are only well adapted to point singularity not the curve singularities, which has limited its development in many applications.The shearlet transform has been shown to be a very efficient tool for many different applications in image processing.Shearlet transform can be regarded as an optimal representation to deal with curve-like edges.
If  = {  :  ∈ } is a basis or, more generally, a tight frame for  2 ( 2 ), then an image  can be (nonlinearly) approximated by the partial sums   = ∑ ∈  ⟨,   ⟩  , where   is the index set of the  largest coefficients ⟨,   ⟩  .The resulting approximation error is ‖ −   ‖ 2 = ∑ ∉  |⟨,   ⟩| 2 .If the image  is  2 , then the approximation   obtained from the  largest shearlet coefficients satisfies ‖ −   ‖ 2 2 = ( −2 ) compared with the rate ( −1 ) of wavelet.For the nonlinear approximation problem along a smooth boundary at scale 2 − , coefficients of the number of wavelet transform approximately are (2  ), while the number for shearlet is about (2 /2 ).If keeping the nonzero coefficients up to coarse level , the error that occurs for wavelets is about (2 − ) and correspondingly (2 −2 ) for shearlets.So Combined with wavelet methods, excellent performance of the shearlet transform will be shown in image processing.
The combined shearlet-wavelet method is very efficient in representing images including curve-like edges and point singularities.But the current combined systems still have two main drawbacks: (1) they are not optimal for sparse approximation of curve features beyond  2 -singularities; (2) the discrete transform is highly redundant.So it has to pay much higher computational cost for its adaptation.
In first experiment, Haar, Daubechies 4, Symlets 4, and Coiflet 4 wavelets are compared with the corresponding combined shearlet-wavelets method.Figure 2 shows that shearlet-wavelets methods have significant advantages over the traditional four kinds of wavelets denoising scheme not only in visual effects, but also in terms of SNR.Combination of shearlet and Haar, Daubechies, and Coiflet gives similar results.But combination of shearlet and Haar wavelet does not show the desired results.This may be because of the discontinuity of Haar wavelet.
Figure 3 shows performances of wavelet (Symlet 4) method and shearlet-wavelet (Symlet) method when the noise standard deviations are 30, 20, 15, and 10.Denoising results of images (from left to right) have similar trend.
Figure 4 presented the results of image restoring of Lena (512 × 512) using four different methods (combined sparse, shearlet, wavelet, and TV method).The combined sparse method showed the highest SNR, and the restoration results of the other three methods did not show significant differences.
Figure 5 shows the residual error between restored image and original image by different methods.Obviously, the structural information of original image showed in the residual errors with the combined method is much better.
Figures 6 and 7 showed the restoring effects of locally magnified structures.Figures 6 and 7 show that wavelet and shearlet are advantageous in terms of restoring dot-like structures (eyes) and curve structures (hats), respectively.Wavelet in Figure 7 presented excellent restoring effects of human eyes.However, the restoring of curves on the edge of hats turned out to be too smooth to show clear boundaries.In terms of the integrity of image, the combined sparsity regularization method showed excellent restoration.The conclusion could be obtained from SNR of Figure 4 and residual errors of Figure 5.
Figure 8 shows the comparison of the convergence rates of four methods.It also shows that the sparse regularization of wavelet, as linear convergence, was faster than TV.However, the internal iteration of combined sparsity regularization method was realized by semismooth Newton method, which managed to achieve superlinear convergence.Thus, the rate of convergence was faster.It should be noted that wavelet and shearlet required longer calculation time as sparsity regularization methods compared with the total variation method.However, the combined sparsity regularization method integrated linear operation of both wavelet and shearlet, thus showing greater size of matrix.From this perspective, the efficiency of the combined sparsity regularization method is      not necessarily the better.However, this method showed the best restoration results.Figure 9 provides SNR corresponding to iterative steps.The results showed that the combined sparsity constraint method only needed four iterative steps to get ideal SNR.This method could acquire SNR by two iterations while the other three methods require 6 iterations.This method could get ideal SNR with fast rate of convergence.Besides, the eventual restoring results of the image were much better than other methods.Table 1 showed SNR results of restoring four different images with 4 different methods.Although the restoration results of the four methods varied from images to images, the combined constraint methods showed the optimal restoration results.
In this part we test computational cost of SSN and IST with different sizes of the image Lena.Table 2 reports CPU times required for the SSN method and the IST.The results show SSN is more efficient than IST with same stopping criterion.

Conclusion
Based on the respective advantages of shearlet and wavelet in the sparsity regularization method, this paper proposed a combined sparsity regularization method.This method could efficiently restore the dots-like and curve structures in images, generating higher SNR of restored images.In order to improve convergence rate, we did not use the traditional soft and hard-threshold algorithm, but adopted semismooth Newton method with superlinear convergence rate.Numerical results showed the analysis results of Lena image restoration from various perspectives.Results demonstrated that the combined sparsity regularization method could restore more accurately and efficiently in the way of SNR, residual errors, and local dots or curve structures of images.In this paper, shearlet was used as a part of the sparse tools.This method shows extensive potentials in applications like denoising of seismic exploration, restoration of ground penetrating radar profile, and so forth.

Figure 1 :
Figure 1: Frequency domain induced by the discrete shearlets and frequency domain induced by the discrete shearlets on the cone.

Figure 4 :
Figure 4: Results of image restoration by different methods.

Figure 5 :Figure 6 :Figure 7 :
Figure 5: The residual error of original image compared to restored image.

Figure 8 :
Figure 8: Convergence rate of four different methods.

Figure 9 :
Figure 9: SNR of four different methods.

Table 1 :
SNR of four images restored by four different methods.

Table 2 :
Comparison of the CPU time in seconds for the SSN and IST algorithms with different sizes of image Lena.