Sparse Signal Recovery via ECME Thresholding Pursuits

The emerging theory of compressive sensing (CS) provides a new sparse signal processing paradigm for reconstructing sparse signals from undersampled linear measurements. Numerous algorithms have been developed to solve convex optimization problems for CS sparse signal recovery; however, in certain circumstances greedy algorithms exhibit performance superior to convex methods. This paper is a follow-up to the recent paper of Wang and Yin (2010), who refine BP reconstructions via iterative support detection (ISD). We apply the heuristic idea of ISD to greedy algorithms and develop two approaches for accelerating the ECME iteration. The resulting algorithms, named ECME thresholding pursuits (EMTP), introduce two greedy strategies: in each iteration, EMTP detects a support set I by thresholding the result of an ECME iteration and estimates the reconstructed signal by solving a truncated least-squares problem on the support set I. Two effective support detection strategies are devised for sparse signals whose nonzero components have a fast decaying distribution. Experimental studies demonstrate that EMTP offers an appealing alternative to state-of-the-art algorithms for sparse signal recovery.


Introduction
Sparsity exploitation has recently received a great amount of attention in the applied mathematics and signal processing communities. Sparse signal processing algorithms have been developed for various applications; a recent Proceedings of the IEEE special issue on applications of sparse representation and compressive sensing was devoted to this topic, and some of the exciting developments are addressed in [1][2][3][4][5][6][7]. Compressed sensing, also known as compressive sensing or compressive sampling (CS) [8,9], has been the subject of significant activity in sparse signal processing, where one seeks to efficiently recover a sparse unknown signal vector of dimension n from a much smaller number m of undersampled linear measurements. For a k-sparse unknown signal x 0 ∈ R n , sparse signal recovery is intimately related to solving an underdetermined system of linear equations y = Ax with a sparseness constraint:

(P 0 ): min ‖x‖ 0 subject to y = Ax,

where x 0 is the true signal to be recovered, ‖x 0 ‖ 0 denotes the number of nonzero components of x 0 , A ∈ R m×n is the measurement matrix and y ∈ R m is the measurement vector. The key insight in the pioneering work on CS [8,9] is to replace the ℓ 0 norm in (P 0 ) by the ℓ 1 norm, since (P 0 ) is non-convex and combinatorial. This yields the basis pursuit problem [10]:

(BP): min ‖x‖ 1 subject to y = Ax.

Hereafter, numerous researchers developed various computational algorithms for sparse signal recovery [11,12]. Generally, there are three major classes of computational algorithms: convex relaxation, Bayesian inference and greedy pursuit. Convex relaxation replaces the combinatorial problem (P 0 ) with a convex optimization problem such as (BP); examples include basis pursuit [10], basis pursuit denoising [13], the least absolute shrinkage and selection operator (LASSO) [14] and least angle regression (LARS) [15]. Bayesian inference derives the maximum a posteriori estimator from a sparse prior model, as in sparse Bayesian learning (SBL) [16,17] and Bayesian compressive sensing (BCS) [18,19]. Another popular approach is
to use greedy algorithms, which iteratively refine a sparse solution by successively selecting one or more elements. These algorithms include matching pursuit (MP) [20], orthogonal matching pursuit (OMP) [21][22][23], subspace pursuit [24], compressive sampling matching pursuit (CoSaMP) [25] and iterative thresholding algorithms [26,27].
Iterative hard thresholding (IHT) [27] is a simple and powerful approach for sparse recovery. Recently, an alternative algorithm has been presented to alleviate the slow convergence of IHT [28,29]. Qiu and Dogandzic [28,29] derived an expectation-conditional maximization either (ECME) iteration from a probabilistic framework in which the signal is Gaussian distributed with unknown variance, and proposed an acceleration method, termed the double overrelaxation (DORE) thresholding scheme, to improve the convergence speed of the ECME algorithm. In addition, Qiu and Dogandzic [28,29] presented an automatic double overrelaxation (ADORE) thresholding method for the case where the underlying sparsity level is not available. Separately, Wang and Yin [30] presented an iterative support detection (ISD) algorithm that refines failed reconstructions by thresholding the solution of a truncated (BP) problem.
Inspired by the theoretical and empirical evidence of the good recovery performance of ECME [29] and ISD [30], we combine the two to devise novel sparse signal recovery methods, dubbed ECME thresholding pursuits (EMTP). EMTP detects a support set I using an ECME iteration and estimates the reconstructed signal by solving a truncated least-squares problem on the support set I, iterating these two steps a small number of times. We present two effective support detection strategies (hard thresholding and dynamic thresholding) for sparse signals whose nonzero components have a fast decaying distribution (called fast decaying signals in [30]); these include sparse Gaussian signals, sparse Laplacian signals and certain power-law decaying signals.
This paper considers iterative greedy algorithms and abstracts them into two types [31]: One Stage Thresholding (OST) and Two Stage Thresholding (TST), discussed in Sections 2 and 3. In Section 4 we describe the proposed approaches. Section 5 then details the experimental results. Finally, we conclude the paper in Section 6.

Notation
We introduce the notation used in this paper: • x (t) : the algorithms described in this paper are iterative; the reconstructed signal x in the current iteration t is denoted as x (t) . The same convention is used for other vectors and matrices.
• I, A I : an index set I; the matrix A I denotes the submatrix of A containing only those columns of A with indexes in I. The same convention is used for vectors.
• supp(x): the support set of a vector x, i.e. the index set corresponding to the nonzeros of x.
• H k (x): the hard thresholding operator that sets all but the k largest-in-magnitude elements of a vector x to zero.
• |x|, ‖x‖ p , x T : the absolute value, ℓ p norm and transpose of a vector x, respectively.
• A † : the Moore-Penrose pseudoinverse of the matrix A ∈ R m×n ; A † = A T (AA T ) −1 for m ≤ n with A of full row rank.
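The two operators above can be checked numerically. The following sketch (in Python/NumPy rather than the paper's MATLAB, purely for illustration) implements H k and verifies the closed form of A † for a full-row-rank matrix with m ≤ n; the dimensions are arbitrary examples:

```python
import numpy as np

def hard_threshold(x, k):
    """H_k(x): set all but the k largest-in-magnitude entries of x to zero."""
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-k:]   # indices of the k largest |x_i|
    out[idx] = x[idx]
    return out

x = np.array([0.1, -3.0, 0.5, 2.0, -0.2])
print(hard_threshold(x, 2))           # keeps only -3.0 and 2.0

# A† = Aᵀ(AAᵀ)⁻¹ for m <= n when A has full row rank (a.s. for Gaussian A).
rng = np.random.default_rng(0)
m, n = 4, 8
A = rng.standard_normal((m, n))
A_dag = A.T @ np.linalg.inv(A @ A.T)
assert np.allclose(A_dag, np.linalg.pinv(A))
assert np.allclose(A @ A_dag, np.eye(m))   # A† is a right inverse of A
```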

One Stage Thresholding
Qiu and Dogandzic derived an expectation-conditional maximization either (ECME) iteration based on a probabilistic framework [29]:

x (t+1) = H k ( x (t) + A † (y − Ax (t) ) ).

Note that the ECME iteration reduces to the IHT step when the measurement matrix A has orthonormal rows (that is, (AA T ) is the identity matrix). These one stage thresholding algorithms (e.g. IHT [27] and ECME [29]) are guaranteed to recover sparse signals and converge within a limited number of iterations. However, OST is not the empirical choice for practical applications due to slow convergence. To this end, Qiu and Dogandzic proposed an acceleration method, termed the double overrelaxation (DORE) thresholding scheme [28,29], to improve the convergence speed of the ECME algorithm. DORE utilizes two overrelaxation steps, where α 1 , α 2 are the line search parameters. Finally, an additional hard thresholding step ensures that the resulting signal is guaranteed to be k-sparse. In addition, Qiu and Dogandzic further presented an automatic double overrelaxation (ADORE) thresholding method for the case where the underlying sparsity level is not available.
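The ECME iteration above can be sketched in a few lines of NumPy. This is an illustrative implementation of the one-stage scheme, not the authors' code; the problem sizes are arbitrary:

```python
import numpy as np

def ecme_iterate(A, y, k, n_iter=200, x_init=None):
    """One-stage thresholding via the ECME iteration:
       x^(t+1) = H_k( x^(t) + A† (y - A x^(t)) ),  with A† = Aᵀ(AAᵀ)⁻¹.
    When A has orthonormal rows (AAᵀ = I) this is exactly the IHT step."""
    m, n = A.shape
    A_dag = A.T @ np.linalg.inv(A @ A.T)     # precomputed once, reused each iteration
    x = np.zeros(n) if x_init is None else x_init.astype(float).copy()
    for _ in range(n_iter):
        g = x + A_dag @ (y - A @ x)          # ECME update before thresholding
        x = np.zeros(n)
        idx = np.argsort(np.abs(g))[-k:]     # H_k: keep the k largest magnitudes
        x[idx] = g[idx]
    return x

# Tiny usage example on a synthetic problem.
rng = np.random.default_rng(1)
m, n, k = 80, 200, 8
A = rng.standard_normal((m, n)) / np.sqrt(m)
x0 = np.zeros(n)
x0[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
y = A @ x0
x_hat = ecme_iterate(A, y, k)
print("residual:", np.linalg.norm(y - A @ x_hat))
```

Note that any k-sparse signal consistent with y is a fixed point of the iteration, since the residual correction A†(y − Ax) then vanishes.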

Two Stage Thresholding
The algorithms described in this paper fall into the category of a general family of iterative greedy pursuit algorithms. Following [31], we adopt the name "Two Stage Thresholding" (TST). Consider a CS recovery problem: the sparse recovery algorithms aim to detect the support set I and approximate y using the columns of A indexed by I. Starting with the initial solution x = 0, TST iterates between two main steps.

Step 1: Support detection: detect the support set I of the signal x, i.e. select the atoms of the measurement matrix A that were used to generate y; in other words, determine the active atoms in the sparse representation of the signal x. In some of the literature, this step is also called basis selection or atom selection.

Step 2: Signal estimation: estimate the signal on the detected support set I, typically by solving a least-squares problem restricted to I.

ECME Thresholding Pursuits
In this section, we describe our proposed approaches, named ECME Thresholding Pursuits (EMTP), which combine OST and TST. The detailed description of the proposed algorithms is as follows.

Step 1: Initialization: set x (0) = 0, set the residual r (0) = y, and set the iteration counter t = 1.
Step 2: Support detection: update the signal approximation with an ECME iteration and detect the support set I (t) by thresholding the result, using either strategy 1 (hard thresholding: keep the k largest-in-magnitude entries) or strategy 2 (dynamic thresholding: keep the entries whose magnitudes exceed a data-driven threshold governed by β).

Step 3: Signal estimation: estimate the signal by solving a truncated least-squares problem on the support set I (t) , and update the residual.

Step 4: Halting: check the stopping condition and, if it is not yet time to stop, increment t = t + 1 and return to Step 2. If it is time to stop, the recovered signal x has nonzero entries in the support set I (t) , and the corresponding support vector lies in x (t) .

We present two thresholding proposals and the corresponding algorithms k-EMTP (shown as Algorithm 1) and β-EMTP (shown as Algorithm 2), where β ∈ (0, 1) is the thresholding parameter.
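A minimal NumPy sketch of the two EMTP variants follows. The exact dynamic-threshold rule is our assumption (β times the current largest magnitude, in the spirit of ISD), and the problem dimensions are illustrative:

```python
import numpy as np

def emtp(A, y, k=None, beta=None, n_iter=30):
    """Sketch of EMTP.  Each iteration: (1) support detection by thresholding
    one ECME update; (2) signal estimation by truncated least squares on I.
    k-EMTP (pass k): keep the k largest-magnitude entries.
    beta-EMTP (pass beta): keep entries with |g_i| > beta * max|g|, with the
    support growing monotonically across iterations."""
    m, n = A.shape
    A_dag = A.T @ np.linalg.inv(A @ A.T)
    x = np.zeros(n)
    supp = set()
    for _ in range(n_iter):
        g = x + A_dag @ (y - A @ x)                  # ECME update (support detection)
        if k is not None:                            # strategy 1: hard thresholding
            I = np.argsort(np.abs(g))[-k:]
        else:                                        # strategy 2: dynamic thresholding
            supp |= set(np.flatnonzero(np.abs(g) > beta * np.abs(g).max()))
            I = np.array(sorted(supp))
        x = np.zeros(n)
        x[I] = np.linalg.lstsq(A[:, I], y, rcond=None)[0]  # truncated least squares
    return x

# Usage on a synthetic problem.
rng = np.random.default_rng(2)
m, n, k = 100, 256, 10
A = rng.standard_normal((m, n)) / np.sqrt(m)
x0 = np.zeros(n)
x0[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
y = A @ x0
x_k = emtp(A, y, k=k)          # k-EMTP
x_b = emtp(A, y, beta=0.5)     # beta-EMTP
```

By construction, the least-squares estimate on any support set never increases the residual relative to the zero signal.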

Remarks:
1. EMTP updates the leading elements of x (t) using a least-squares solution on the support set I.
However, DORE updates all the elements of x (t) using double overrelaxation steps and finally needs a hard thresholding step to ensure that the new approximation is k-sparse. β-EMTP demands no prior knowledge of the underlying sparsity level k.
2. EMTP is a special case of TST that utilizes OST in the support detection stage. It differs from OMP [23], Subspace Pursuit [24] and CoSaMP [25], which use the correlation between the residual signal and the atoms of the measurement matrix A to detect the support set.
3. Like other greedy pursuits such as Subspace Pursuit [24] and CoSaMP [25], k-EMTP fixes the cardinality of the support set I and can remove previous false detections. In contrast, β-EMTP refines a support set I that is increasing and nested over the iterations.
4. The dynamic thresholding strategy used in β-EMTP is inspired by ISD [30]. It finds significant nonzeros by comparing them against a threshold rather than maintaining a fixed number (k) of items, which is appealing when the underlying sparsity level k is not available.

5. EMTP resembles ISD [30]. EMTP and ISD share the same idea of iterative support detection; however, the support detection step is based on different sparse recovery methods, ECME and BP respectively. EMTP updates the reconstructed signal using a least-squares solution on the detected support set I, whereas ISD iteratively refines the BP solution on the complement of the detected support set I.
6. β-EMTP with large β obtains high-quality reconstructions from a small number of measurements. However, because the support set I grows slowly, β-EMTP takes a larger number of iterations (to be discussed in Section 5). In k-EMTP, wrongly detected elements can often be pruned out in later iterations.
7. Like ISD (as discussed in [30]), β-EMTP only performs well for fast decaying signals. It does not work on sparse signals that decay slowly or have no decay at all (e.g. ternary and binary sparse signals). k-EMTP performs worse for sparse signals whose nonzero elements have similar magnitudes [31,33]. However, we can apply EMTP to non-fast-decaying signals via linear or non-linear pre-mapping [34,35].

Experimental Results
To assess the performance of the proposed approaches, we conduct numerical experiments on computer simulations. All algorithms were implemented and tested in MATLAB v7.6 running on Windows XP with a 2.53 GHz Intel Celeron CPU and 2 GB of memory. We compared the proposed approaches with the accelerated ADORE/DORE approaches; the code of ADORE/DORE is available on the authors' homepage. We initialized x with the zero sparse signal. The search length of ADORE was set to 1. All results were averaged over 100 Monte Carlo problem instances. We used our unoptimized code. The least-squares solution x = arg min ‖y − Ax‖ 2 2 was implemented using the MATLAB pseudoinverse function, x = pinv(A)*y. The MATLAB code was partially adapted from [30,33,36]. In all the experiments, the measurement matrices and sparse test signals were generated as detailed at the end of the paper. The variable lambda controls the rate of decay; we set lambda=10 and lambda=0.5 for sparse Laplacian signals and power-law decaying signals, respectively.
For fair comparison, we stopped the iterations once the relative error fell below a certain convergence tolerance or the number of iterations exceeded 100. The convergence tolerance is defined on the relative change of the iterates, ‖x (t) − x (t−1) ‖ 2 / ‖x (t) ‖ 2 . We empirically evaluate reconstruction performance in terms of signal-to-noise ratio (SNR) and number of iterations. The SNR is defined as SNR = 20 log 10 ( ‖x 0 ‖ 2 / ‖x 0 − x‖ 2 ) dB, where x 0 is the true signal and x is the recovered signal.
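The two quantities can be written down concretely. A small Python sketch follows, assuming the standard SNR definition and a relative-change stopping rule; the tolerance value is illustrative:

```python
import numpy as np

def snr_db(x_true, x_rec):
    """Reconstruction SNR in dB: 20 log10( ||x_true||_2 / ||x_true - x_rec||_2 )."""
    return 20.0 * np.log10(np.linalg.norm(x_true) / np.linalg.norm(x_true - x_rec))

def stop(x_new, x_old, n_it, tol=1e-14, max_it=100):
    """Halt when the relative change of the iterate falls below tol,
    or after max_it iterations."""
    rel = np.linalg.norm(x_new - x_old) / max(np.linalg.norm(x_new), np.finfo(float).tiny)
    return rel < tol or n_it >= max_it

x0 = np.array([3.0, 0.0, 4.0])
print(snr_db(x0, np.array([3.0, 0.0, 3.995])))  # ||x0|| = 5, error 0.005 → ≈ 60 dB
```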

Influence Of Thresholding Parameter
We evaluated the influence of the thresholding parameter β by varying β from 0.4 to 0.7. First, we fixed the signal dimension n = 400 and the underlying sparsity level k = 20, 40, 80, and varied the number of measurements.
The influence of the thresholding parameter β for sparse Gaussian signals is presented in Figures 1, 2 and 3, respectively. It is clear that a larger thresholding parameter achieves better recovery performance with fewer measurements; however, the cost is more iterations. Large β is particularly competitive when the number of measurements is fairly small. It is worth noting that the number of iterations required for exact recovery is smaller than the underlying sparsity level k, especially for large k. As shown in Figures 1, 2 and 3, β-EMTP achieves an SNR of around 300 dB, that is, a relative error almost as low as double precision. Second, we tested the comparisons by fixing the number of measurements m = 80, 140, 200, as shown in Figures 4, 5 and 6, respectively. Finally, we enlarged the dimension of the signals (n = 2000), as shown in Figures 7 and 8. The latter tests lead to similar conclusions as the first test. We used β-EMTP with β = 0.5 for the later experiments.

Comparisons With The Accelerated OST
We present comparisons in terms of SNR (dB) and number of iterations. We omit comparisons with ℓ 1 minimization methods and some other OST/TST methods, as they have already been investigated in [31,33].

Performance For Other Fast-Decaying Signals
As discussed in [30,33], fast-decaying signals include sparse Laplacian signals and power-law decaying signals in addition to sparse Gaussian signals; the results for these signals are presented below.

Summary
To sum up, we have compared EMTP with ADORE/DORE. EMTP has significant advantages over the accelerated OST in terms of SNR and number of iterations: it significantly reduces the number of iterations required and achieves significantly higher SNR. For a low indeterminacy level m/n, EMTP requires fewer measurements. β-EMTP can work with no prior knowledge of the underlying sparsity level k and yet achieves recovery performance better than ADORE. Among all the methods, β-EMTP appeared to be the best.

Conclusions And Future Work
In this paper, we propose ECME thresholding pursuits (EMTP) for sparse signal recovery. EMTP detects a support set I using an ECME iteration and estimates the reconstructed signal by solving a truncated least-squares problem on the support set I, iterating these two steps a small number of times. We present two effective support detection strategies (hard thresholding and dynamic thresholding) for sparse signals whose nonzero components have a fast decaying distribution. The experimental studies demonstrate that EMTP offers an attractive alternative to state-of-the-art algorithms for sparse signal recovery.
EMTP can significantly reduce the number of iterations required. We then turn to the problem of reducing the computational cost of each iteration, especially for large-scale applications. Future research includes three directions: 1. EMTP requires the pre-computation and storage of the pseudoinverse matrix A † . Replacing the ECME iteration with an IHT step in the support detection stage would therefore be an advisable choice: IHT does not require the computation and storage of a pseudoinverse matrix A † , which is a significant advantage for large-scale applications.
2. In the signal estimation stage, i.e. updating the reconstructed signal by solving a least-squares problem, EMTP uses the orthogonal projection, calculated as x I = A † I y, where A † I is the pseudoinverse of A I . We will approximate the orthogonal projection efficiently by gradient pursuits [32].
3. This paper devotes its efforts to devising computational algorithms that are experimentally reproducible, and provides empirical studies as useful guidelines for practical applications; the theoretical analysis of EMTP is left for future investigations.

Experimental data generation. The m × n measurement matrix A was generated from the uniform spherical ensemble (USE), i.e. we generate the measurement matrix by sampling each entry independently from the standard normal distribution and then normalizing each column to have unit norm. The MATLAB code is

A = randn(m, n); A = A ./ repmat(sqrt(sum(A.^2)), [m 1]);

The underlying k-sparse vectors were generated by randomly selecting a support set of size k, with each entry in the support set sampled from a specific distribution. The sparse signals were generated in MATLAB by

x = zeros(n,1); support = randperm(n); x(support(1:k)) = v;

with the nonzeros v generated by the following code. The sparse Gaussian signals were generated by

v = randn(k,1);

the sparse Laplacian signals by

z = rand([k, 1]); v = zeros([k, 1]);
in = z <= 0.5; ip = z > 0.5;
v(in) = 1/lambda * log(2 * z(in));
v(ip) = -1/lambda * log(2 * (1-z(ip)));

and the power-law decaying signals by

v = sign(randn(k,1)) .* ((1:k) .^ (-1/lambda))';

The results for the enlarged signals are shown in Figures 15 and 16. The latter tests lead to similar conclusions as the first test. These figures do not fully capture the performance of β-EMTP, since we only set β = 0.5; nevertheless, β-EMTP achieved performance almost uniformly superior to ADORE/DORE.
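The signal ensembles above can equivalently be generated in Python/NumPy. This is a direct translation of the MATLAB snippets (the Laplacian nonzeros come from inverse-CDF sampling); the specific dimensions and lambda values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, k, lam = 100, 400, 40, 10.0

# USE measurement matrix: i.i.d. standard normal entries, unit-norm columns.
A = rng.standard_normal((m, n))
A /= np.linalg.norm(A, axis=0)

def sparse_signal(n, k, nonzeros, rng):
    """Place the given nonzero values on a uniformly random support of size k."""
    x = np.zeros(n)
    x[rng.choice(n, size=k, replace=False)] = nonzeros
    return x

gauss = sparse_signal(n, k, rng.standard_normal(k), rng)

# Laplacian nonzeros via inverse-CDF sampling (mirrors the MATLAB snippet).
z = rng.random(k)
lap = np.where(z <= 0.5, np.log(2 * z) / lam, -np.log(2 * (1 - z)) / lam)
laplace = sparse_signal(n, k, lap, rng)

# Power-law decaying nonzeros with random signs (lambda = 0.5 in the paper).
pl = np.sign(rng.standard_normal(k)) * np.arange(1, k + 1) ** (-1 / 0.5)
powerlaw = sparse_signal(n, k, pl, rng)
```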
The results for sparse Laplacian and power-law decaying signals are shown in Figures 17, 18, 19 and 20. The conclusions for sparse Gaussian signals can also be generalized to sparse Laplacian signals and power-law decaying signals. Power-law decaying signals at a low indeterminacy level m/n are an exception, for which β-EMTP achieves inferior performance.

Figures

Figure 1 -
Figure 1 -Influence of thresholding parameter β for sparse Gaussian signals with n = 400, k = 20: comparisons in terms of SNR (dB) and number of iterations.

Figure 2 -
Figure 2 -Influence of thresholding parameter β for sparse Gaussian signals with n = 400, k = 40: comparisons in terms of SNR (dB) and number of iterations.

Figure 3 -
Figure 3 -Influence of thresholding parameter β for sparse Gaussian signals with n = 400, k = 80: comparisons in terms of SNR (dB) and number of iterations.

Figure 4 -
Figure 4 -Influence of thresholding parameter β for sparse Gaussian signals with n = 400, m = 80: comparisons in terms of SNR (dB) and number of iterations.

Figure 5 -
Figure 5 -Influence of thresholding parameter β for sparse Gaussian signals with n = 400, m = 140: comparisons in terms of SNR (dB) and number of iterations.

Figure 6 -
Figure 6 -Influence of thresholding parameter β for sparse Gaussian signals with n = 400, m = 200: comparisons in terms of SNR (dB) and number of iterations.

Figure 7 -
Figure 7 -Influence of thresholding parameter β for sparse Gaussian signals with n = 2000, k = 100: comparisons in terms of SNR (dB) and number of iterations.

Figure 8 -
Figure 8 -Influence of thresholding parameter β for sparse Gaussian signals with n = 2000, m = 400: comparisons in terms of SNR (dB) and number of iterations.

Figure 9 -
Figure 9 -Sparse Gaussian signals with n = 400, k = 20: comparisons in terms of SNR (dB) and number of iterations.

Figure 10 -
Figure 10 -Sparse Gaussian signals with n = 400, k = 40: comparisons in terms of SNR (dB) and number of iterations.

Figure 11 -
Figure 11 -Sparse Gaussian signals with n = 400, k = 80: comparisons in terms of SNR (dB) and number of iterations.

Figure 12 -
Figure 12 -Sparse Gaussian signals with n = 400, m = 80: comparisons in terms of SNR (dB) and number of iterations.

Figure 13 -
Figure 13 -Sparse Gaussian signals with n = 400, m = 140: comparisons in terms of SNR (dB) and number of iterations.

Figure 14 -
Figure 14 -Sparse Gaussian signals with n = 400, m = 200: comparisons in terms of SNR (dB) and number of iterations.

Figure 15 -
Figure 15 -Sparse Gaussian signals with n = 2000, k = 100: comparisons in terms of SNR (dB) and number of iterations.

Figure 16 -
Figure 16 -Sparse Gaussian signals with n = 2000, m = 400: comparisons in terms of SNR (dB) and number of iterations.

Figure 17 -
Figure 17 -Sparse Laplacian signals with n = 400, k = 40: comparisons in terms of SNR (dB) and number of iterations.

Figure 18 -
Figure 18 -Sparse Laplacian signals with n = 2000, k = 100: comparisons in terms of SNR (dB) and number of iterations.

Figure 19 -
Figure 19 -Power-law decaying signals with n = 400, k = 40: comparisons in terms of SNR (dB) and number of iterations.

Figure 20 -
Figure 20 -Power-law decaying signals with n = 2000, k = 100: comparisons in terms of SNR (dB) and number of iterations.