Hyperspectral Image Denoising via Nonconvex Logarithmic Penalty

Hyperspectral images (HSIs) deliver more reliable representations of real scenes than traditional images and enhance the performance of many computer vision tasks. In real cases, however, an HSI is often degraded by a mixture of noise types, including Gaussian noise and impulse noise. In this paper, we propose a logarithmic nonconvex regularization model for HSI mixed noise removal. The logarithmic penalty function approximates the tensor fibered rank more accurately than convex relaxations and treats singular values differently. An alternating direction method of multipliers (ADMM) is presented to solve the optimization problem, and each subproblem within ADMM is proven to have a closed-form solution. Experimental results demonstrate the effectiveness of the proposed method.


Introduction
A hyperspectral image (HSI) consists of multiple discrete bands at specific frequencies. It delivers information about real scenes that the human eye fails to capture and has attracted much interest from researchers in a wide range of application fields, such as land use analysis, environmental monitoring, and field surveillance [1][2][3]. However, HSIs always suffer from various degradations, such as Gaussian noise, impulse noise, and random noise, which affect subsequent image processing tasks such as unmixing, classification, and target detection [4,5]. Improving HSI quality merely through a hardware scheme is unsustainable and impractical. Therefore, it is natural to introduce image processing-based approaches to obtain a high-quality HSI before subsequent applications. Numerical optimization algorithms play an important role in HSI denoising; for example, Liu et al. [6] proposed a two-step wavelet-domain estimation method to extract the noise map, and Lu et al. [7] presented some representative high-order variational models in the context of image denoising. From the perspective of the prior data format, we classify the existing HSI restoration methods into three categories: (1) 1D vector-based sparse representation methods [8][9][10][11][12][13]; (2) 2D matrix-based low-rank matrix recovery (LRMR) methods [14][15][16][17][18][19][20][21]; and (3) 3D tensor-based approximation methods [22][23][24][25][26][27][28][29][30][31][32][33]. Although the existing works have made significant progress in HSI restoration, there are still drawbacks to be addressed; in particular, transforming the multidimensional HSI data into a vector or matrix usually breaks the spectral-spatial structural correlation. A tensor low-rankness characterization of an HSI is expected to explore the global correlation and preserve the intrinsic structural information.
The recovery of tensors from a limited number of measurements is an important problem that arises in a variety of applications, such as computer vision [34][35][36]. Based on low tubal-rank tensor recovery, Zhang et al. [37] proposed an HSI mixed noise removal model. However, the framework of the tensor singular value decomposition (t-SVD) lacks flexibility for handling the different correlations along the different modes of HSIs, leading to suboptimal denoising performance. Then, Zheng et al. [38] proposed an HSI mixed noise removal model with the tensor fibered rank, which is based on the mode-k t-SVD. Moreover, Zheng et al. [38] also proposed a three-directional tensor nuclear norm (3DTNN) as its convex relaxation to provide an efficient numerical solution and a three-directional log-based tensor nuclear norm (3DLogTNN) as its nonconvex relaxation to promote the low-rankness of the solution. Compared to 3DTNN, 3DLogTNN has two advantages: it is a closer approximation to the fibered rank, and, by using the sum of the log function of the singular values, it treats small and large singular values differently instead of shrinking them uniformly.
It is well known that suitable nonconvex penalty functions induce sparsity among the singular values more effectively. However, nonconvex penalty functions lead to nonconvex optimization problems, which suffer from issues such as spurious local minima in the subproblems; 3DLogTNN in [38] is one example. To avoid these intrinsic difficulties, we introduce a new nonconvex logarithmic regularization model, which allows the use of a nonconvex penalty function while maintaining the convexity of the subproblems within ADMM. The new model also provides a good approximation of the fibered rank and preserves the major information well. The rest of the paper is structured as follows. Section 2 presents some notations, explains the t-SVD, and defines the mode-k t-SVD. Section 3 introduces the proposed ADMM based on the parametric penalty function for HSI denoising. The experimental results and analysis are reported in Section 4. Finally, the conclusion is given in Section 5.

Brief Overview of Tensor Singular Value Decomposition
In this section, we first describe the notations used throughout the paper and then introduce the tensor decomposition proposed in [39][40][41] and the mode-k t-SVD proposed in [38].

Notation and Preliminaries

Throughout this paper, for a third-order tensor A ∈ R^{n1×n2×n3}, A_i = A(:, :, i) denotes its i-th frontal slice; similarly, the block unfolding of A along its frontal slices is a matrix in R^{n1 n3 × n2}, and fold denotes the inverse operation. As can be seen in Figure 3, for A ∈ R^{n×n×3}, let A_i = A(:, :, i); then the D_i are the frontal slices of the tensor D, where D is computed by applying the Fast Fourier Transform (FFT) along each tube of A, i.e., D = fft(A, [], 3).
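As a minimal numerical sketch of this relation (NumPy; shapes and variable names are illustrative), applying the FFT along the third mode at once agrees with transforming each tube individually:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5, 3))   # a small third-order tensor

# D is obtained by applying the FFT along each tube (mode-3 fiber) of A.
D = np.fft.fft(A, axis=2)

# Tube-by-tube computation for comparison.
D_ref = np.empty_like(D)
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        D_ref[i, j, :] = np.fft.fft(A[i, j, :])

assert np.allclose(D, D_ref)
```

In MATLAB notation this is exactly D = fft(A, [], 3), the first step of the t-SVD computation.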

t-SVD.
In this section, we review the t-SVD. The t-SVD interprets third-order tensors as linear operators on a space of oriented matrices [39]. Based on the t-SVD, Semerci et al. [42] studied the decomposition, completion, and recovery of multilinear data, and Zhang et al. [34] proposed novel methods for the completion and denoising of multilinear data and, as an application, considered 3D and 4D (color) video data completion and denoising.

Definition 1 (t-product). For G ∈ R^{n1×n2×n3} and K ∈ R^{n2×n4×n3}, the t-product D = G * K is a tensor of size n1 × n4 × n3 whose tubes, for i = 1, 2, ..., n1 and j = 1, 2, ..., n4, are given by D(i, j, :) = Σ_{s=1}^{n2} G(i, s, :) ⊛ K(s, j, :), where ⊛ denotes circular convolution. The t-product is thus analogous to matrix multiplication, except that circular convolution replaces the multiplication operation between the elements, which are represented by tubes.
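A small sketch of Definition 1 (NumPy; names are illustrative): the t-product computed slice-wise in the Fourier domain agrees, tube by tube, with the circular-convolution definition.

```python
import numpy as np

def t_product(G, K):
    """t-product via the FFT: transform along tubes, multiply the frontal
    slices as matrices, transform back."""
    Gh = np.fft.fft(G, axis=2)
    Kh = np.fft.fft(K, axis=2)
    Dh = np.einsum('ijt,jkt->ikt', Gh, Kh)   # per-slice matrix products
    return np.fft.ifft(Dh, axis=2).real

# Check one tube against the definition: D(i, j, :) is the sum over s of the
# circular convolution of G(i, s, :) with K(s, j, :).
rng = np.random.default_rng(1)
G = rng.standard_normal((2, 3, 4))
K = rng.standard_normal((3, 2, 4))
D = t_product(G, K)

i, j = 0, 1
tube = sum(np.real(np.fft.ifft(np.fft.fft(G[i, s, :]) * np.fft.fft(K[s, j, :])))
           for s in range(3))
assert np.allclose(D[i, j, :], tube)
```

The FFT route is the standard way to evaluate the t-product efficiently, since circular convolution becomes element-wise multiplication in the Fourier domain.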
Any A ∈ R^{n1×n2×n3} admits a t-SVD factorization A = U * S * V^T, where U ∈ R^{n1×n1×n3} and V ∈ R^{n2×n2×n3} are orthogonal tensors, S ∈ R^{n1×n2×n3} is a rectangular diagonal tensor, and * denotes the t-product. Figure 4 illustrates the decomposition for the 3D case. This decomposition can be obtained by computing matrix SVDs in the Fourier domain; see Algorithm 1.
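A sketch of this construction in the spirit of Algorithm 1 (NumPy; not the authors' code): compute the SVD of every frontal slice in the Fourier domain and transform back.

```python
import numpy as np

def t_svd(A):
    """t-SVD via matrix SVDs of the frontal slices in the Fourier domain."""
    n1, n2, n3 = A.shape
    Ah = np.fft.fft(A, axis=2)
    Uh = np.zeros((n1, n1, n3), dtype=complex)
    Sh = np.zeros((n1, n2, n3), dtype=complex)
    Vh = np.zeros((n2, n2, n3), dtype=complex)
    for t in range(n3):
        U, s, VT = np.linalg.svd(Ah[:, :, t])
        Uh[:, :, t] = U
        for i, sv in enumerate(s):
            Sh[i, i, t] = sv
        Vh[:, :, t] = VT.conj().T
    return tuple(np.fft.ifft(X, axis=2) for X in (Uh, Sh, Vh))

# Verify A = U * S * V^T by multiplying frontal slices in the Fourier domain.
rng = np.random.default_rng(2)
A = rng.standard_normal((3, 4, 5))
U, S, V = t_svd(A)
Uh, Sh, Vh = (np.fft.fft(X, axis=2) for X in (U, S, V))
Ah = np.einsum('ijt,jkt,lkt->ilt', Uh, Sh, Vh.conj())
assert np.allclose(np.fft.ifft(Ah, axis=2).real, A)
```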
However, when handling different correlations along different modes of an HSI, the t-SVD and the induced tubal rank lack flexibility.
This inflexible HSI characterization usually does not yield good denoising effects. Zheng et al. [38] proposed a novel tensor decomposition by generalizing the t-SVD to the mode-k t-SVD, which achieves a more flexible and accurate HSI low-rankness characterization.

Mode-k t-SVD.
In this section, we introduce the mode-k t-SVD and the related definitions. For a third-order tensor A ∈ R^{n1×n2×n3}, the mode-k block circulation operation circ(A, k) arranges the mode-k slices A_k^{(i)} into a block-circulant matrix, where A_k^{(i)} is the i-th mode-k slice of A. The mode-k block diagonalization operation bdiag(A, k) stacks the mode-k slices into a block-diagonal matrix, and the mode-k block vectorization operation stacks them into a block column vector; each of these operations has a corresponding inverse operation.

Definition 2 (Mode-k t-product). For A ∈ R^{n1×n2×n3} and B ∈ R^{n1×n3×n4}, the mode-1 t-product is a tensor C of size n1 × n2 × n4 with C(:, j, t) = Σ_{s=1}^{n3} A(:, j, s) * B(:, s, t), where * denotes circular convolution between the mode-1 fibers.

For A ∈ R^{n1×n2×n3} and B ∈ R^{n4×n2×n1}, the mode-2 t-product is a tensor of size n4 × n2 × n3, defined analogously with circular convolution along mode 2. For A ∈ R^{n1×n2×n3} and B ∈ R^{n2×n4×n3}, the mode-3 t-product is a tensor of size n1 × n4 × n3, defined with circular convolution along mode 3; this coincides with the t-product of Definition 1.

Definition 3 (Mode-k identity tensor). I_k ∈ R^{n1×n2×n3} is the mode-k identity tensor, whose first mode-k slice is an identity matrix and whose other mode-k slices are all zeros.
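All three mode-k products can be sketched with one routine (NumPy; `mode_k_t_product` is a hypothetical helper, not code from [38]): FFT along mode k, batched matrix products over the mode-k slices, inverse FFT. With the third mode it reduces to the ordinary t-product.

```python
import numpy as np

def mode_k_t_product(A, B, axis):
    """Mode-k t-product sketch: transform along `axis`, multiply the
    corresponding mode-k slices as matrices, and transform back."""
    Ah = np.moveaxis(np.fft.fft(A, axis=axis), axis, 0)
    Bh = np.moveaxis(np.fft.fft(B, axis=axis), axis, 0)
    Ch = Ah @ Bh                       # batched products over mode-k slices
    return np.fft.ifft(np.moveaxis(Ch, 0, axis), axis=axis).real

# For axis=2 this coincides with the slice-wise Fourier-domain t-product.
rng = np.random.default_rng(3)
A = rng.standard_normal((2, 3, 4))
B = rng.standard_normal((3, 5, 4))
C = mode_k_t_product(A, B, 2)
Ch = np.einsum('ijt,jkt->ikt', np.fft.fft(A, axis=2), np.fft.fft(B, axis=2))
assert np.allclose(C, np.fft.ifft(Ch, axis=2).real)
assert C.shape == (2, 5, 4)
```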

Definition 4 (Mode-k conjugate transpose). For A ∈ R^{n1×n2×n3}, the mode-k conjugate transpose A^{T_k} ∈ R^{n2×n1×n3} is obtained by transposing each of the mode-k slices and then reversing the order of the transposed mode-k slices 2 through n_k.
Definition 5 (Mode-k diagonal tensor). A ∈ R^{n1×n2×n3} is mode-k diagonal if each of its mode-k slices is a diagonal matrix.
Definition 6 (Mode-k orthogonal tensor). A tensor A ∈ R^{n1×n2×n3} is mode-k orthogonal if A^{T_k} *_k A = A *_k A^{T_k} = I_k.
Definition 7 (Mode-k t-SVD). Any A ∈ R^{n1×n2×n3} can be factorized as A = U_k *_k S_k *_k V_k^{T_k}, where U_k and V_k are mode-k orthogonal tensors and S_k ∈ R^{n1×n2×n3} is a mode-k diagonal tensor.

Definition 8 (Tensor fibered rank). For A ∈ R^{n1×n2×n3}, the tensor fibered rank is the vector rank_f(A) = (rank_1(A), rank_2(A), rank_3(A)), where the mode-k fibered rank rank_k(A) is the number of nonzero mode-k fibers of S_k in the mode-k t-SVD of A.
The mode-k t-SVD can be efficiently obtained by computing a series of matrix SVDs in the Fourier domain and achieves a more flexible and accurate HSI low-rankness characterization (see Algorithm 2).

HSI Denoising Model and Its ADMM
In this section, we present our new denoising model and derive an ADMM for solving it.

The Logarithmic Penalty Function.
This article adopts the logarithmic penalty function [43], which serves as the model for the penalty developed in the HSI denoising model below and is designed to have less bias. The logarithmic penalty is given by

φ(x; a) = (1/a) log(1 + a|x|), (13)

where a > 0 controls the degree of nonconvexity of the penalty function; φ is symmetric, nondecreasing on [0, ∞), and concave on [0, ∞). In [44][45][46], the authors prove that, for 0 < a ≤ 1/λ, the function

f(x) = (1/2)(y − x)^2 + λ φ(x; a) (14)

is convex. Therefore, the proximity operator finds an optimal solution of the convex minimization problem (14). The proximity operator associated with the logarithmic penalty (13),

prox(y) = arg min_x (1/2)(y − x)^2 + λ φ(x; a), (15)

is a continuous nonlinear threshold function with λ as the threshold value [44] and is given by

prox(y) = 0 if |y| ≤ λ; prox(y) = sign(y) [ (|y| − 1/a)/2 + sqrt( ((|y| + 1/a)/2)^2 − λ/a ) ] if |y| > λ. (16)
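The closed-form threshold can be checked numerically. The sketch below (NumPy) reconstructs φ and its proximity operator by solving the scalar stationarity condition; the formula is our own derivation from (13) and (14), offered as an illustration rather than the authors' implementation.

```python
import numpy as np

def log_penalty(x, a):
    """phi(x; a) = (1/a) log(1 + a|x|), with a > 0."""
    return np.log1p(a * np.abs(x)) / a

def prox_log(y, lam, a):
    """Threshold function associated with lam * phi(.; a), valid for
    0 < a <= 1/lam (the convexity condition). For |y| > lam it solves the
    stationarity condition a x^2 + (1 - a|y|) x + (lam - |y|) = 0 on x >= 0."""
    assert 0 < a <= 1.0 / lam
    y = np.asarray(y, dtype=float)
    mag = np.abs(y)
    root = (mag - 1/a) / 2 + np.sqrt(np.maximum(((mag + 1/a) / 2)**2 - lam/a, 0))
    return np.where(mag <= lam, 0.0, np.sign(y) * root)

# The prox output matches the grid minimizer of f(x) = (y-x)^2/2 + lam*phi(x).
lam, a, y = 0.8, 1.0, 2.0
xs = np.linspace(-3, 3, 60001)
f = 0.5 * (y - xs)**2 + lam * log_penalty(xs, a)
assert abs(float(prox_log(y, lam, a)) - xs[np.argmin(f)]) < 1e-3
assert float(prox_log(0.5, lam, a)) == 0.0
```

Note that the threshold is continuous at |y| = λ precisely because of the condition a ≤ 1/λ, which keeps each scalar subproblem convex.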

HSI Denoising Model.

Zhang et al. proposed an HSI mixed noise removal model based on low tubal-rank tensor recovery (LRTR) [37]. It addresses the mixed noise in HSIs by decomposing a noisy HSI into three parts: a low-tubal-rank part (the clean HSI), a Gaussian noise part, and a sparse noise part. Zheng et al. [38] proposed an HSI denoising model based on a low-fibered-rank prior, formulated as

min_{A,N,S} rank_f(A) + λ1 ‖N‖_F^2 + λ2 ‖S‖_1  s.t.  Y = A + N + S, (17)

where Y is the observed noisy HSI, N is Gaussian noise, S is sparse noise, A ∈ R^{n1×n2×n3} is the clean hyperspectral signal, which has the low-fibered-rank property, and λ1 and λ2 are regularization parameters. The Gaussian noise model (large degrees of freedom) corresponds to a dense noise type [47,48]; N should not be low in rank, since otherwise A could not be recovered from the random noise. Due to the stochastic nature of Gaussian noise, it is assumed that there is no correlation (or only a weak correlation) among the noise components. Thus, the rank of N is normally full and much larger than the rank of A. Therefore, the low-fibered-rank component A can be recovered from equation (17) by properly choosing the tuning parameters λ1 and λ2.
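A toy instance of this degradation model (NumPy; sizes and noise levels are illustrative, not the paper's settings): a low-rank clean cube A plus dense Gaussian noise N and sparse impulse noise S.

```python
import numpy as np

rng = np.random.default_rng(4)
n1, n2, n3, r = 20, 20, 8, 2

# Clean signal A with low-rank structure, normalized to [0, 1].
A = (rng.random((n1, r)) @ rng.random((r, n2 * n3))).reshape(n1, n2, n3)
A /= A.max()

N = 0.05 * rng.standard_normal((n1, n2, n3))   # dense Gaussian noise (full rank)

S = np.zeros((n1, n2, n3))                     # sparse impulse noise
mask = rng.random((n1, n2, n3)) < 0.1          # roughly 10% corrupted entries
S[mask] = rng.choice([-1.0, 1.0], size=int(mask.sum()))

Y = A + N + S                                  # observed noisy HSI
assert Y.shape == (n1, n2, n3) and 0 < mask.mean() < 0.2
```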
Directly minimizing the tensor fibered rank is NP-hard. Based on the logarithmic penalty (13), we propose the following 3DLogATNN HSI denoising model:

min_{A,N,S} Σ_{k=1}^{3} τ_k LogATNN_k(A) + λ1 ‖N‖_F^2 + λ2 ‖S‖_1  s.t.  Y = A + N + S, (18)

where the τ_k ≥ 0 (k = 1, 2, and 3) with Σ_{k=1}^{3} τ_k = 1 are the weights of the fibered rank, and the regularizers F_k(A) are set as LogATNN_k(A) in the HSI denoising model:

LogATNN_k(A) = Σ_{i=1}^{m} φ(σ_i(A); a), (19)

where the σ_i(A) are the singular values in the mode-k t-SVD of A, m is their number, and φ is the logarithmic penalty (13). We use the ADMM to solve equation (18). By introducing three auxiliary variables X_k (k = 1, 2, and 3), equation (18) can be rewritten as follows:

min Σ_{k=1}^{3} τ_k LogATNN_k(X_k) + λ1 ‖N‖_F^2 + λ2 ‖S‖_1  s.t.  Y = A + N + S, X_k = A (k = 1, 2, 3). (20)

ADMM for Solving the Model
The augmented Lagrangian function of equation (20) is

L(X_k, A, N, S, W_k, Q) = Σ_{k=1}^{3} [ τ_k LogATNN_k(X_k) + ⟨W_k, X_k − A⟩ + (ζ_k/2) ‖X_k − A‖_F^2 ] + λ1 ‖N‖_F^2 + λ2 ‖S‖_1 + ⟨Q, Y − A − N − S⟩ + (ρ/2) ‖Y − A − N − S‖_F^2,

where the W_k and Q are the Lagrange multipliers and ζ_k (k = 1, 2, 3) and ρ are positive scalars. The problem can now be solved within the ADMM framework.
With the other parameters fixed, X_k can be updated by solving

arg min_{X_k} τ_k LogATNN_k(X_k) + ⟨W_k, X_k − A⟩ + (ζ_k/2) ‖X_k − A‖_F^2,

which is equivalent to the following subproblem for X_k ∈ R^{n1×n2×n3} (k = 1, 2, and 3):

arg min_{X_k} τ_k LogATNN_k(X_k) + (ζ_k/2) ‖X_k − (A − W_k/ζ_k)‖_F^2. (23)

To solve equation (23), note that it decouples over the singular values of bdiag(A − W_k/ζ_k, k) into scalar problems of the form arg min_x (1/2)(y − x)^2 + λ φ(x; a), where y is the corresponding singular value and λ = τ_k/ζ_k. From equation (16), each scalar problem has a closed-form threshold solution, which yields the update for X_k in equation (26). Though model (20) is based on a nonconvex penalty function, its parameters are constrained so that the X_k subproblem remains convex.
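The X_k update can thus be sketched as singular-value shrinkage of the mode-k slices in the Fourier domain; here `shrink` stands in for the scalar threshold function of the logarithmic penalty, and `mode_k_svt` is a hypothetical illustration (NumPy), not the authors' implementation.

```python
import numpy as np

def mode_k_svt(T, axis, shrink):
    """Apply a scalar shrinkage `shrink` to the singular values of every
    mode-k slice of T in the Fourier domain along `axis`."""
    Th = np.moveaxis(np.fft.fft(T, axis=axis), axis, 0)
    out = np.empty_like(Th)
    for t in range(Th.shape[0]):
        U, s, VT = np.linalg.svd(Th[t], full_matrices=False)
        out[t] = (U * shrink(s)) @ VT        # U diag(shrink(s)) V^H
    return np.fft.ifft(np.moveaxis(out, 0, axis), axis=axis).real

# Sanity check: with the identity shrinkage, the operator reproduces its input.
rng = np.random.default_rng(5)
T = rng.standard_normal((5, 6, 3))
assert np.allclose(mode_k_svt(T, 2, lambda s: s), T)
```

Replacing the identity with the threshold function of equation (16) gives the closed-form X_k update.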
With the other parameters fixed, A can be updated by solving

arg min_A Σ_{k=1}^{3} [ ⟨W_k, X_k − A⟩ + (ζ_k/2) ‖X_k − A‖_F^2 ] + ⟨Q, Y − A − N − S⟩ + (ρ/2) ‖Y − A − N − S‖_F^2,

which is a quadratic subproblem with the closed-form solution

A = ( Σ_{k=1}^{3} (ζ_k X_k + W_k) + ρ (Y − N − S) + Q ) / ( Σ_{k=1}^{3} ζ_k + ρ ). (29)

With the other parameters fixed, N can be updated by solving

arg min_N λ1 ‖N‖_F^2 + ⟨Q, Y − A − N − S⟩ + (ρ/2) ‖Y − A − N − S‖_F^2,

and it has the closed-form solution

N = ( ρ (Y − A − S) + Q ) / (2 λ1 + ρ). (32)

With the other parameters fixed, S can be updated by solving

arg min_S λ2 ‖S‖_1 + ⟨Q, Y − A − N − S⟩ + (ρ/2) ‖Y − A − N − S‖_F^2.

Using the tensor soft-thresholding operator, the following solution can be obtained [38]:

S = Soft_{λ2/ρ} (Y − A − N + Q/ρ). (35)

Finally, the Lagrange multipliers W_k and Q are updated as

W_k = W_k + ζ_k (X_k − A), k = 1, 2, 3, (36)
Q = Q + ρ (Y − A − N − S). (37)
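The S update relies only on element-wise soft thresholding, the proximal map of the l1 norm; a minimal sketch (NumPy):

```python
import numpy as np

def soft_threshold(X, tau):
    """Element-wise soft thresholding Soft_tau(X) = sign(X) * max(|X| - tau, 0),
    the closed-form proximal map of tau * ||.||_1 used for the sparse term S."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)

assert np.allclose(soft_threshold(np.array([2.0, -0.5, 1.0]), 1.0),
                   [1.0, 0.0, 0.0])
```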

Hence, the proposed algorithm for HSI denoising is summarized in Algorithm 3.

Experiment Results
To validate the effectiveness of the proposed method for HSI denoising, we perform experiments on simulated data and compare the results both quantitatively and visually. The Washington DC Mall data, the Pavia City Center data, and the Indian Pines data are used. In our experiments, only a subimage of the Washington DC Mall data is used (191 bands, each of size 256 × 256), and only a subimage of the Pavia City Center data (80 bands, each of size 200 × 200). The synthetic data of Zhang et al. [37] were generated from the ground truth of the Indian Pines dataset; the size of the synthetic HSI is 145 × 145 × 224. In addition, to facilitate the numerical calculation and visualization, all bands of the HSI are normalized into [0, 1] and stretched back to the original level after restoration. The three evaluation measures are the mean peak signal-to-noise ratio (MPSNR), mean structural similarity (MSSIM), and spectral angle mapping (SAM).
The three metrics are defined as follows to measure the quality of the denoised result.

MPSNR: PSNR(b) = 10 log10( L^2 / MSE(b) ), with MSE(b) = (1/(MN)) Σ_{x,y} [I_b(x, y) − Î_b(x, y)]^2 and MPSNR = (1/B) Σ_{b=1}^{B} PSNR(b), where M × N is the spatial image size, I(x, y) is the original image, Î(x, y) is the distorted image, L is the maximum pixel value, and B is the number of bands.

MSSIM: SSIM(x, y) = ( (2 μ_x μ_y + C1)(2 σ_xy + C2) ) / ( (μ_x^2 + μ_y^2 + C1)(σ_x^2 + σ_y^2 + C2) ) and MSSIM = (1/Q) Σ_{j=1}^{Q} SSIM(x_j, y_j), where C1 and C2 are small constants stabilizing the divisions, μ_x and μ_y are local means, σ_x and σ_y are local standard deviations, σ_xy is the local covariance, X and Y are the original and distorted images, x_j and y_j are the j-th local window contents, and Q is the number of local windows.

SAM: SAM(v, v′) = arccos( ⟨v, v′⟩ / (‖v‖ ‖v′‖) ), averaged over all pixels, where v and v′ are the reference and distorted spectrum vectors of length Z, the number of wavelengths.

The PSNR and structural similarity index measure (SSIM) are two conventional perceptual quality indexes (PQIs) in image processing and computer vision. They evaluate the similarity between the target image and the reference image based on the mean square error (MSE) and structural consistency, respectively. The larger these two measures are, the closer the target HSI is to the reference HSI. The SAM is a physically based spectral classification method that uses an N-dimensional angle to match pixels to reference spectra. Different from the former two measures, the smaller the SAM is, the more similar the target HSI is to the reference HSI.
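The MPSNR and SAM indexes can be sketched directly from their definitions (NumPy; `ref` and `dist` hold the reference and distorted HSIs as (rows, cols, bands) arrays, names illustrative):

```python
import numpy as np

def mpsnr(ref, dist, peak=1.0):
    """Mean PSNR over bands for HSIs normalized to [0, peak]."""
    mse = np.mean((ref - dist) ** 2, axis=(0, 1))   # per-band MSE
    return float(np.mean(10 * np.log10(peak ** 2 / mse)))

def sam(ref, dist):
    """Mean spectral angle (radians) between per-pixel spectrum vectors."""
    v = ref.reshape(-1, ref.shape[-1])
    w = dist.reshape(-1, dist.shape[-1])
    cos = np.sum(v * w, axis=1) / (np.linalg.norm(v, axis=1)
                                   * np.linalg.norm(w, axis=1))
    return float(np.mean(np.arccos(np.clip(cos, -1.0, 1.0))))

rng = np.random.default_rng(6)
X = rng.random((8, 8, 6))
assert abs(mpsnr(X, X + 0.1) - 20.0) < 1e-6   # constant offset 0.1 -> 20 dB
assert sam(X, X) < 1e-6                        # identical spectra -> zero angle
```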
Real HSIs usually include several different types of noise. To simulate real-noise scenarios, we consider the variance of the Gaussian noise β and the variance of the impulse noise δ. We use statistical structures to simulate different types of noise, including independent and identically distributed (i.i.d.) and non-i.i.d. noise.

Algorithm 3 (summary). Input: the noisy HSI Y and the parameters τ_k, ζ_k (k = 1, 2, and 3). Repeat until convergence: update X_k^{l+1} with equation (26), k = 1, 2, and 3; update A^{l+1} with equation (29); update N^{l+1} with equation (32); update S^{l+1} with equation (35); update W_k^{l+1} with equation (36), k = 1, 2, and 3; update Q^{l+1} with equation (37); let ζ_k = t ζ_k, k = 1, 2, and 3, ρ = t ρ, and l = l + 1; check the convergence condition.

Parameter setting: we analyze the parameters involved in the proposed method on the Washington DC Mall, Pavia City Center, and Indian Pines HSIs, i.e., the weights τ_k, the regularization parameters λ1 and λ2, the threshold parameter ς_k = τ_k/ξ, the penalty parameter ρ = 1/mean(ξ), and the constant a in 3DLogATNN. In all the following experiments, the parameters of the compared methods were manually adjusted according to their default strategies.

The regularization parameter λ1 for 3DLogATNN restricts the sparsity of the Gaussian noise. We set λ1 = C/β, where β is the standard deviation of the Gaussian noise and C is a tuning parameter. The results were based on the simulated data experiment in case 1-1. Figure 5 shows the restoration results as C varied; the results of the 3DLogATNN solver are relatively stable in terms of both MPSNR and MSSIM values for C between 0.002 and 0.003. Therefore, we suggest the use of C = 0.002 in all the simulated data experiments.

The regularization parameter λ2 for 3DLogATNN restricts the sparsity of the impulse noise. We set λ2 using a tuning parameter B. The results were based on the simulated data experiment in case 2-1.
Figure 6 shows the corresponding restoration results as B varied. For the constant a, the results were based on the simulated data experiment in case 2-1. Figure 7 shows the restoration results as a varied over the set {0.01, 0.04, 0.05, 0.06, 0.1, 0.5, 1, 5, 10}. It can be clearly observed from this figure that the results of the 3DLogATNN solver are relatively stable in terms of both MPSNR and MSSIM values for a between 0.05 and 0.06. Therefore, we suggest the use of a = 0.05 in all the simulated data experiments. We adjust the parameters to achieve the best visual results, and the parameter settings are presented in Table 1.
Compared with state-of-the-art methods on low-rank matrix/tensor approximation and noise modeling, including TRPCA + BM4D [36,49], LRMR [37], LRTR [37], LRTDTV [50], and NMoG [51], extensive experimental results demonstrate that the 3DTNN and 3DLogTNN [38] methods are better at removing mixed noise. Therefore, the denoising results of the proposed method are quantitatively and visually compared with these two state-of-the-art HSI denoising methods, i.e., 3DTNN and 3DLogTNN. The denoising results of all the methods in six cases are shown in Table 2. Three typical bands of the denoised HSIs in the mixture noise case obtained with the different methods are shown in Figures 8-10: Figure 8 shows the denoising results at band 71 of the Washington DC Mall HSI, Figure 9 at band 52 of the Pavia City Center HSI, and Figure 10 at band 28 of the Indian Pines HSI. It can be seen that the proposed 3DLogATNN effectively removes the mixed noise while preserving the detailed information of the original image, obtaining the best visual quality among the compared methods. Table 2 shows that the 3DLogATNN method converges faster than the 3DLogTNN-based method on the Washington DC Mall, Pavia City Center, and Indian Pines data, and that our method outperforms the compared ones on the Pavia City Center data. Moreover, 3DLogATNN is more stable than the other two algorithms, as it obtains the best results in most cases.

Conclusion
In this paper, we presented a new 3DLogATNN method for HSI denoising based on the mode-k t-SVD. The logarithmic penalty function introduced in 3DLogATNN enables it to extract low-rank and sparse components more accurately from a degraded HSI contaminated by several types of noise. In addition, an ADMM-based algorithm is applied to solve the proposed HSI denoising model effectively.

Data Availability

The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.