Undersampled Hyperspectral Image Reconstruction Based on Surfacelet Transform

Hyperspectral imaging is a crucial technique for military and environmental monitoring. However, limited equipment hardware resources severely affect the transmission and storage of a huge amount of data for hyperspectral images. This limitation has the potentials to be solved by compressive sensing (CS), which allows reconstructing images from undersampled measurements with low error. Sparsity and incoherence are two essential requirements for CS. In this paper, we introduce surfacelet, a directional multiresolution transform for 3D data, to sparsify the hyperspectral images. Besides, a Gram-Schmidt orthogonalization is used in CS random encoding matrix, two-dimensional and three-dimensional orthogonal CS random encoding matrixes and a patchbasedCS encoding scheme are designed.The proposed surfacelet-based hyperspectral images reconstruction problem is solved by a fast iterative shrinkage-thresholding algorithm. Experiments demonstrate that reconstruction of spectral lines and spatial images is significantly improved using the proposed method than using conventional three-dimensional wavelets, and growing randomness of encoding matrix can further improve the quality of hyperspectral data. Patch-based CS encoding strategy can be used to deal with large data because data in different patches can be independently sampled.


Introduction
Typical hyperspectral imaging (HSI) is acquired on satellites and aerospace probes and then transmitted to grounds.The imaging spectrometer can provide tens to hundreds of narrow-band spectral information for each spatial location.The huge amounts of data but scarce equipment hardware resources on satellites and aerospace severely limit the transmission and storage of hyperspectral images [1,2].However, the ground receiving side holds very strong processing capability.If we transfer the system complexity from the satellites and aerospace probes to the ground, it is expected to potentially solve the limitation in traditional high spectral sampling.
Traditionally, super-resolution reconstruction is used to improve the spatial resolution of HSI [3,4] while compression can improve the transmission efficiency [5][6][7][8].We acquire "all" data and then "throw away" most of it in transmission process.Can we just directly measure the part that we need to save the cost of storage?Compressed sensing (CS) has been proposed to solve these contradictions in remote sensing.The theory of CS shows that a sparse signal can be recovered from a relatively small number of linear measurements [9,10].The difference between the conventional method and CS based transmission method is shown in Figure 1.
CS has been widely used in medical imaging [11,12] and wireless sensor network [13,14] to reduce the sampled data in recent years which has been applied in remote sensing.Duarte et al. [15] proposed a single-pixel imaging method.Ma [2] applied CS in single frame imaging.Aravind et al. [16] used ten spectral bands to compare orthogonal matching pursuit with simultaneous orthogonal matching pursuit in reconstruction.Fowler [17] presented compressiveprojection PCA to effectively shift the computational burden of PCA from the encoder to the decoder.According to the CS theory, the reconstruction error is bounded by the sparse approximation error [18].For a signal x ∈ R  , let x be the solution to obey CS conditions, Ψ  a forward transform, and (Ψ  x)  the sparse approximation of x in transform domain by retaining only the -largest entries of Ψ  x; then, the reconstruction error is where  0 ,  1 are small positive constants and  is noise power [19].Equation (1) implies that a sparser representation will reduce the reconstruction error.An optimal sparsifying transform is always important for sparse image reconstruction to reduce the reconstruction error.
In this study, we introduce the surfacelet transform (ST) to sparsely represent hyperspectral images by making use of the spatial and spectral information.The advantage of ST over wavelet on sparse HSI data reconstruction is demonstrated, where the data are reconstructed by using a fast iterative shrinkage-thresholding algorithm [20].To further improve the reconstruction performance, a 3-dimensional (3D) random encoding is designed and the Gram-Schmidt orthogonalization is adopted.Finally, a patch-based CS encoding scheme is designed to deal with large size data.
The remainder of this paper is organized as follows.First, ST-based compressive sensing HSI reconstruction method is introduced in Section 2. The experimental results are presented in Section 3. Finally, conclusions are given in Section 4.

ST-Based Compressive Sensing HSI Reconstruction
The CS theory comes up with two principles: sparsity, which asserts that the reconstructed signal is sparse with a transform Ψ, and incoherence, which requires that the encoding matrix Φ is incoherent with Ψ [10].
Surfacelet transform (ST), proposed by Lu and Do [25], can efficiently capture the surface intrinsic geometrical structure within -dimensional signals [25].It offers 3 × 2  directional subbands with decomposition level  by combining the multiscale pyramid with the 3-dimensional directional filter banks (3D-DFB).Thus, surfacelet with more directions may help reducing the blocky artifacts caused by orthogonal wavelets [24,25].
ST is a multiscale version of the 3D-DFB.The input signal [] first goes through the 3D hourglass filter  0,0 0,0 (), which is a three-channel undecimated filter bank.One branch of the three-channel structure of 3D-DFB is given by Figure 2. The output [] is then fed into a 2D filter bank, denoted by IRC 12 , which operates on the ( 1 ,  2 ) planes.The treestructured filter bank IRC 12 produces 2  2 output subbands, denoted by   [] for 0 ≤  ≤ 2  2 .Each output is then fed into another 2D filter bank IRC ( 3 ) 13 operating on the ( 1 ,  3 ) planes.In the end, we get 2  2 + 3 outputs, represented by  , [𝑛] for 0 ≤  ≤ 2  2 and 0 ≤  ≤ 2  3 [25].3 × 2  spatial domain basis images of ST with 2 levels of decomposition are shown in Figure 3.
ST is optimal for the  2 singularities, thus providing sparser representation of smooth curves and surface singularities than wavelet.Theoretically, for the  2 singularities of an image x, the best -term approximation error ‖x  − x‖ 2 using ST has error decay rate of ( −2 ), while this error decay rate is ( −1 ) for typical wavelets [28,29].Figure 4 shows that a sparser representation is achieved using ST.According to the CS theory [19], the reconstruction error bound is proportional to ‖Ψ  x − (Ψ  x)  ‖ 1 , where Ψ  is the forward transform and  means the number of preserved coefficients in transform domain.Therefore, ST is expected to reduce the image reconstruction error of HSI.

Incoherence and Gaussian Random Encoding Matrix.
Incoherence means a column of the CS encoding matrix must be trying the proliferation in the corresponding sparsity basis Ψ.Since this paper focuses on investigating the spatialspectral sparsity of HSI, a Gaussian random matrix is chosen as Φ because it is incoherent with the entire existing basis where Φ  (1 ≤  ≤ ) is an encoding matrix for the band  of HSI.When image is encoded with the same encoding matrix, then Φ is called 2D random encoding matrix; when  arrays are different for different bands.The performance of both encoding schemes will be discussed in Section 3.
In order to improve the performance of the recovered signal, a Gram-Schmidt orthogonalization (GSO) is used in CS encoding matrix.Given one band random encoding matrix Φ  (1 ≤  ≤ ) and the th column denote as v  , the GSO is expressed by , 1 ≤  ≤ . ( The new matrix Φ  = ( 1 ,  2 , . . .,   ) is adopted as the CS encoding matrix of one spectral band.The randomness of the matrix is not affected by GSO.In the following, columns of each Φ  are processed with GSO.

Reconstruction Algorithm.
For HSI x = (x (1) , x (2) , . .., x () ) ∈ R ×× , where x () represents the th spectral band image,  ×  represent the spatial dimensions, and  represents the spectral depth of HSI.Let x  = vec(x () ) (1 ≤  ≤ ), where x  is a column vector of the th band HSI with size  × 1.The data acquisition model for CS is given by where Φ  ∈ R × ( < ) is a random encoding matrix for the th spectral band image and y  ∈ R ×1 is the acquired undersampled data.The sampling ratio is defined as where CSR < 1, which means the HSI are undersampled.In this paper, ST is adopted to provide sparser representation of HSI data and is expected to reduce the reconstruction  error.Let Ψ represent inverse ST and let Ψ  denote forward ST; CS recovers x by solving [18,19] where  = Ψ  x, ‖ ⋅ ‖  ( = 1, 2) stands for   -norm, and  is the regularization parameter which decides the tradeoff between the sparsity and the data fidelity.Many researchers seek for a simple and fast algorithm to solve (6), such as the conjugate gradient [30], Bregman iteration [31,32], low rank reconstruction [33], and other methods.In this paper, we choose FISTA to solve (6) because of its simplicity and fast convergence, whose convergence rate is ( −2 ), where  is an iteration counter [20].

Patch-Based Compressed Sensing with Surfacelet Reconstruction (PCSST).
For a larger HSI data x, bigger encoding matrix may exceed the memory of computer or leads long computation time.In this case, a patch-based sampling operation can be used.Let R be a patch operator and let R  (1 ≤  ≤ ) divide the th band of x into  patches, and then the data acquisition model for CS is given by where y  (1 ≤  ≤ ) denotes measurements of patches in th band of x and Θ  satisfies Θ  = (Θ 1 , Θ 2 , . . ., Θ  )  , where Θ  means random encoding on each block.In our scheme, the encoding on different patches of each band is different and these patches are nonoverlapped.PCSST allows implementing the proposed method on a larger dataset.It has the potential to allow for parallel computing as shown in Figure 5 and also provides the possibility to reduce the complexity of sensor arrays since the sensing detectors independently sample the data in different patches.

Experimental Results
Experiments are conducted in three aspects.First, better reconstructed signal using ST than using wavelet is demonstrated.Second, improving the randomness of encoding matrix for each spectral band is shown to improve the recovery quality and the reconstruction performance of two encoding schemes is compared.Third, a patch-based CS encoding scheme is designed to deal with large data.
The HSI data is obtained from U.S. AVRIS website, including Moffet Field and Lunar Lake [35].These data contain 224 spectral bands with spatial size 64 × 64, and every pixel is encoded with 16 bits.Linear interpolation is employed to fix the junk bands [36] and maintain the consistency of spectral lines, and all data are normalized.
To evaluate the performance, the mean-square-error (MSE), peak signal-to-noise ratio (PSNR), and structural similarity (SSIM) [37] are adopted as criteria to measure the reconstruction error.Their definitions are where  denotes the original image,   and   are mean and standard deviation of , x stands for the recovered image,  x and  x are mean and standard deviation of x;  1 ,  2 are small constant, and  1 =  2 = 0.08 is adopted in our work, which is used to avoid instability when either ( 2 x +  2  ) or ( 2 x +  2  ) is very close to zero [37].Mean SSIM index is adopted in this paper.Simulations run on a dual core 2.5 GHz CPU laptop with 4 GB RAM.A fast ST implementation in C language is adopted.

Surfacelet and Wavelet with 2D Encoding.
To simulate the CS data acquisition, the HSI are undersampled by random encoding matrix Φ. 50% sampled data means In order to improve the quality of the recovered signal, Gram-Schmidt orthogonalization is used to obtain the CS encoding matrix.With the 2D encoding matrix (i.e., each band has the same encoding matrix), the ST-based reconstruction is compared with the 3D wavelets-based reconstruction.Daubechies filters "db4" with 2 levels of decomposition is used, and there are 7 directional subbands for 3D wavelet.The multiscale pyramid with 2 decomposition levels is chosen, and there are 12 directional subbands for ST.The complexity of ST is ( log 10 ) and wavelet is ().
In order to well discuss the relationship between  and PSNR, we compared the PSNR performance from  = 10 to  = 10 −4 as shown in Figure 6. = 10 −2 is empirically chosen to give optimal PSNR in our work.As shown in Figure 7, ST significantly improves the PSNR of each spectral band more than 3D wavelet.Edges and curves are better reconstructed using ST than 3D wavelets as shown in Figure 8.At a given spatial location, the spectral line reconstructed using ST is much more consistent with the ground truth than wavelets as shown in Figure 9.Under different sampling ratios, ST achieves much better PSNRs than wavelet as shown in Figure 10, and more advantage of ST is seen at low ratios.Besides, ST shows better performance for the Lunar Lake dataset which has richer geometric textures than Moffet Field dataset.Running time (unit: second) based on ST and WT with different sampled rate is shown in Table 1.

2D and 3D Encoding Matrix.
The performance of 2D encoding and 3D encoding schemes is compared in this section.The recovered spatial images with two encoding schemes are shown in Figure 11.Edges and curves are better reconstructed using 3D encoding than 2D encoding.The recovered spectral lines using 3D encoding are more consistent with ground truth than using 2D encoding as shown in Figure 12.The improvement is more obvious for Moffet Field dataset with textures in low intensities as shown in Figures 11(a) and 12(a).These results imply that increasing the encoding randomness among different bands will achieve better reconstruction.

Reconstruction with PCSST.
We tested the methods on Lunar Lake data with size 256 × 256 × 224.A reconstructed spectral band is shown in Figure 13.The image structures recovered by surfacelet are much sharper than wavelet.

Conclusions
Compressive sensing is a new sampling theorem.In this paper, the surfacelet transform is introduced into hyperspectral image reconstruction from compressive sampled data.The surfacelet transform is a directional multiresolution transform for 3D data, which is applied to sparsify the hyperspectral images for the first time.Simulations are conducted in three aspects.First, better reconstructed signal using surfacelet than using wavelet is demonstrated.Second, improving the randomness of encoding matrix for each spectral band is shown to improve the recovery quality.Third, a patch-based CS encoding scheme is designed to deal with large data.It provides the possibility to reduce the complexity of sensor arrays because sensing detectors in different patches independently sample data.Experiments demonstrate that reconstruction of spectral lines and spatial images is significantly improved using the proposed method than using conventional three-dimensional wavelets.
In the future, our work includes the following two aspects.The first aspect is optimizing sparse representation of hyperspectral imaging, for example, combining adaptive sparse representation [38][39][40] and surfacelet transform.An adaptive dictionary may provide a sparser representation leading to a lower reconstruction error.Therefore, it is meaningful to try a low-complexity training method in the future.The second aspect is overlapping patches compressed sensing reconstruction.Overlapping patches can reduce the "block artifacts." But overlapping also introduces more encoded data and higher computations on image reconstructions.Trading the compressive sampling rate with image qualities using overlapping and speeding up the reconstruction algorithm [41] will be discussed in the future.

Figure 2 :
Figure 2: One branch of the three-channel structure of 3D-DFB.
that is, each 2D image is encoded differently, then Φ is called 3D random encoding matrix.Physically, 3D random encoding means that digital micromirror device (DMD) 4 Journal of Sensors

Figure 5 :
Figure 5: Parallel processing of compressive sampling on HIS.

Figure 6 :
Figure 6: PSNR performance versus different .Note: test is performed on Lunar Lake dataset at sampling ratio  = 0.4.

Table 1 :
Running time (unit: second) based on ST and WT with different sampled rate.Note: the ST is implemented in C and WT is implemented in MATLAB.