Compressive Sensing in Signal Processing: Algorithms and Transform Domain Formulations

Compressive sensing has emerged as an area that opens new perspectives in signal acquisition and processing. It is an alternative to traditional sampling theory that aims to reduce the number of samples required for successful signal reconstruction. In practice, compressive sensing aims to save sensing resources, transmission, and storage capacity and to facilitate signal processing when certain data are unavailable. To that end, compressive sensing relies on mathematical algorithms that reconstruct data from a greatly reduced number of measurements by exploiting the properties of sparsity and incoherence. This concept therefore includes optimization procedures that seek the sparsest solution in a suitable representation domain. This work offers a survey of the compressive sensing idea and its prerequisites, together with the commonly used reconstruction methods. Moreover, the compressive sensing problem formulation is considered in signal processing applications for some of the commonly used transform domains, namely, the Fourier transform domain, the polynomial Fourier transform domain, the Hermite transform domain, and the combined time-frequency domain.


Introduction
The fundamental approach to signal reconstruction from its measurements is defined by the Shannon-Nyquist sampling theorem, which states that the sampling rate needs to be at least twice the maximal signal frequency. In the discrete case, the number of measurements should be at least equal to the signal length for the signal to be exactly reconstructed. However, this approach may require large storage space, significant sensing time, heavy power consumption, and a large number of sensors. Compressive sensing (CS) is a novel theory that goes beyond the traditional approach [1][2][3][4]. It shows that a sparse signal can be reconstructed from far fewer incoherent measurements. The basic assumption in the CS approach is that most signals in real applications have a concise representation in a certain transform domain, where only a few of the coefficients are significant, while the rest are zero or negligible [5][6][7]. This requirement is defined as signal sparsity.
Another important requirement is the incoherent nature of measurements (observations) in the signal acquisition domain. Therefore, the main objective of CS is to provide an estimate of the original signal from a small number of linear incoherent measurements by exploiting the sparsity property [3,4].
The CS theory covers not only the signal acquisition strategy, but also the signal reconstruction possibilities and different algorithms [8][9][10][11][12][13][14][15][16][17]. Several approaches for CS signal reconstruction have been developed, and most of them belong to one of three main groups: convex optimizations [8][9][10][11] such as basis pursuit, Dantzig selector, and gradient-based algorithms; greedy algorithms like matching pursuit [14] and orthogonal matching pursuit [15]; and hybrid methods such as compressive sampling matching pursuit [16] and stagewise OMP [17]. When comparing these algorithms, convex programming provides the best reconstruction accuracy, but at the cost of high computational complexity. The greedy algorithms have low computational complexity, while the hybrid methods try to provide a compromise between these two requirements [18].
This work provides a survey of the general compressive sensing concept, supplemented with several existing approaches and methods for signal reconstruction. These are briefly explained and summarized in the form of algorithms, with the aim of giving readers an easier and more practical insight into the state of the art in this field. Apart from the standard CS algorithms, a few recent solutions have been included as well. Furthermore, the paper provides an overview of different sparsity domains and the possibilities of employing them in the CS problem formulation. An additional contribution is provided through examples showing the efficiency of the presented methods in practical applications.
The paper is organized as follows. In Section 2, a brief review of the general compressive sensing idea is provided, together with the conditions for successful signal reconstruction from a reduced set of measurements and the signal recovery formulations using minimization approaches. In Section 3, the commonly used CS algorithms are reviewed. The commonly used domains for CS strategy implementation are given in Section 4, while some examples from real applications are provided in Section 5. The concluding remarks are given in Section 6.

Sparsity and Compressibility.
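Reducing the sampling rate using CS is possible for sparse signals, that is, signals that can be represented by a small number of significant coefficients in an appropriate transform basis. A signal having $K$ nonzero coefficients is called $K$-sparse. Assume that a signal $\mathbf{s}$ of length $N$ exhibits sparsity in a certain orthonormal basis $\boldsymbol{\Psi}$ defined by the basis vectors $\{\psi_1, \psi_2, \ldots, \psi_N\}$. The signal can be represented using its sparse transform domain vector $\mathbf{x}$ as follows:
$$s(n) = \sum_{i=1}^{N} x_i \psi_i(n). \tag{1}$$
In matrix notation, the previous relation can be written as
$$\mathbf{s} = \boldsymbol{\Psi}\mathbf{x}. \tag{2}$$
Commonly, the sparsity is measured using the $\ell_0$-norm, which represents the cardinality of the support of $\mathbf{x}$:
$$\|\mathbf{x}\|_0 = \operatorname{card}\{\operatorname{supp}(\mathbf{x})\} = K. \tag{3}$$
In real applications, the signals are usually not strictly sparse but only approximately sparse. Therefore, instead of being sparse, these signals are often called compressible, meaning that the amplitudes of the coefficients decrease rapidly when arranged in descending order. For instance, if we consider coefficients $|x_1| \ge |x_2| \ge \cdots \ge |x_N|$, then the magnitude decays with a power law if there exist constants $C_1$ and $q > 0$ satisfying [19]
$$|x_i| \le C_1 i^{-q}, \tag{4}$$
where larger $q$ means faster decay and consequently a more compressible signal. The signal compressibility can be quantified using the minimal error between the original and the sparsified signal (obtained by keeping only the $K$ largest coefficients):
$$\sigma_K(\mathbf{x})_p = \min_{\hat{\mathbf{x}} :\, \|\hat{\mathbf{x}}\|_0 \le K} \|\mathbf{x} - \hat{\mathbf{x}}\|_p. \tag{5}$$
Let $\mathbf{y}$ denote the vector of $M$ measurements acquired through the measurement matrix $\boldsymbol{\Phi}$. Using the sparse transform domain representation of vector $\mathbf{s}$ given by (2), we have
$$\mathbf{y} = \boldsymbol{\Phi}\mathbf{s} = \boldsymbol{\Phi}\boldsymbol{\Psi}\mathbf{x} = \mathbf{A}\mathbf{x},$$
where $\mathbf{A}$ will be referred to as the CS matrix.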
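As a small numerical illustration of the sparsity measure and of the best $K$-term approximation error described above, the following sketch may help. The function names `l0_norm` and `best_k_term_error` and the two test vectors are illustrative assumptions, not part of the original work.

```python
import numpy as np

def l0_norm(x, tol=1e-12):
    """Cardinality of the support: number of entries with magnitude above tol."""
    return int(np.sum(np.abs(x) > tol))

def best_k_term_error(x, K, p=1):
    """l_p error left after keeping only the K largest-magnitude entries of x."""
    order = np.argsort(np.abs(x))[::-1]   # indices sorted by decreasing magnitude
    tail = x[order[K:]]                   # entries discarded by the K-term approximation
    return float(np.linalg.norm(tail, ord=p))

# Strictly sparse vector: three nonzero coefficients out of eight.
x_sparse = np.array([0.0, 3.0, 0.0, 0.0, -2.0, 0.0, 0.5, 0.0])
# Compressible vector: magnitudes follow the power law |x_i| = i^(-q) with q = 2.
x_compressible = np.arange(1, 9, dtype=float) ** -2.0

print("l0 norm of sparse vector:", l0_norm(x_sparse))
print("3-term error, sparse vector:", best_k_term_error(x_sparse, K=3))
print("3-term error, compressible vector:", best_k_term_error(x_compressible, K=3))
```

For the strictly sparse vector the 3-term error vanishes, while for the compressible vector it is small but nonzero and shrinks as the decay exponent grows.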
In order to define some requirements for the CS matrix $\mathbf{A}$, which are important for successful signal reconstruction, let us introduce the null space of a matrix. The null space of the CS matrix $\mathbf{A}$ contains all vectors that are mapped to $\mathbf{0}$:
$$\mathcal{N}(\mathbf{A}) = \{\mathbf{z} : \mathbf{A}\mathbf{z} = \mathbf{0}\}.$$
In order to provide a unique solution, it is necessary that two distinct $K$-sparse vectors $\mathbf{x}$ and $\mathbf{x}'$ do not result in the same measurement vector. In other words, their difference should not be part of the null space of the CS matrix $\mathbf{A}$:
$$\mathbf{A}(\mathbf{x} - \mathbf{x}') \neq \mathbf{0}.$$
Since the difference between two $K$-sparse vectors is at most $2K$-sparse, a $K$-sparse vector $\mathbf{x}$ is uniquely defined if the null space of $\mathbf{A}$ contains no $2K$-sparse vectors. This corresponds to the condition that any $2K$ columns of $\mathbf{A}$ are linearly independent; that is,
$$\operatorname{spark}(\mathbf{A}) > 2K,$$
and since $\operatorname{spark}(\mathbf{A}) \in [2, M + 1]$, we obtain a lower bound on the number of measurements:
$$M \ge 2K.$$
In the case of strictly sparse signals, the spark can provide reliable information about the exact reconstruction possibility. However, in the case of approximately sparse signals, this condition is not sufficient and does not guarantee stable recovery. Hence, there is another property, called the null space property, that measures the concentration of the null space of matrix $\mathbf{A}$. The null space property is satisfied if there is a constant $\gamma \in (0, 1)$ such that [19]
$$\|\mathbf{h}_S\|_1 \le \gamma \|\mathbf{h}_{S^c}\|_1, \quad \forall \mathbf{h} \in \mathcal{N}(\mathbf{A}),$$
for all sets $S \subset \{1, \ldots, N\}$ with cardinality $K$ and their complements $S^c = \{1, \ldots, N\} \setminus S$. If the null space property is satisfied, then a strictly $K$-sparse signal can be perfectly reconstructed by using $\ell_1$-minimization. For approximately $K$-sparse signals, an upper bound of the $\ell_1$-minimization error can be defined as follows [20]:
$$\|\mathbf{x} - \hat{\mathbf{x}}\|_1 \le \frac{2(1 + \gamma)}{1 - \gamma}\, \sigma_K(\mathbf{x})_1,$$
where $\sigma_K(\mathbf{x})_1$ is defined in (5) for $p = 1$ as the minimal error induced by the best $K$-sparse approximation. The null space property is necessary and sufficient for establishing guarantees for recovery. A stronger condition is required in the presence of noise (and approximately sparse signals).
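Hence, in [8], the restricted isometry property (RIP) of the CS matrix $\mathbf{A}$ has been introduced. The CS matrix $\mathbf{A}$ satisfies the RIP with constant $\delta_K \in (0, 1)$ if
$$(1 - \delta_K)\|\mathbf{x}\|_2^2 \le \|\mathbf{A}\mathbf{x}\|_2^2 \le (1 + \delta_K)\|\mathbf{x}\|_2^2$$
for every $K$-sparse vector $\mathbf{x}$. This property shows how well the distances are preserved by a certain linear transformation. We might now say that if the RIP is satisfied for $2K$ with $\delta_{2K} < 1$, then there are no two $K$-sparse vectors $\mathbf{x}$ that can correspond to the same measurement vector $\mathbf{y}$. Finally, the incoherence condition mentioned before, which is also related to the RIP of matrix $\mathbf{A}$, refers to the incoherence of the projection basis $\boldsymbol{\Phi}$ and the sparsifying basis $\boldsymbol{\Psi}$. The mutual coherence can be simply defined by using the columns $\mathbf{A}_i$ of the combined CS matrix $\mathbf{A}$ as follows [21]:
$$\mu(\mathbf{A}) = \max_{i \neq j,\; 1 \le i, j \le N} \frac{|\langle \mathbf{A}_i, \mathbf{A}_j \rangle|}{\|\mathbf{A}_i\|_2 \|\mathbf{A}_j\|_2}.$$
The mutual coherence is related to the restricted isometry constant using the following bound [22]:
$$\delta_K \le (K - 1)\mu.$$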
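Unlike the spark or the restricted isometry constant, the mutual coherence is cheap to evaluate numerically. The sketch below computes it for a partial inverse DFT matrix with randomly chosen rows and evaluates the coherence-based bound on the isometry constant. The helper names (`mutual_coherence`, `partial_inverse_dft`) and the sizes `N`, `M`, `K` are assumptions chosen for illustration only.

```python
import numpy as np

def mutual_coherence(A):
    """Largest normalized inner product between two distinct columns of A."""
    A = np.asarray(A, dtype=complex)
    A_unit = A / np.linalg.norm(A, axis=0, keepdims=True)  # unit-norm columns
    gram = np.abs(A_unit.conj().T @ A_unit)                # |<A_i, A_j>| for all pairs
    np.fill_diagonal(gram, 0.0)                            # exclude i == j
    return float(gram.max())

def partial_inverse_dft(N, rows):
    """Rows of the N x N inverse DFT matrix at the given sample positions."""
    n = np.asarray(rows)[:, None]
    k = np.arange(N)[None, :]
    return np.exp(2j * np.pi * n * k / N) / N

rng = np.random.default_rng(0)
N, M, K = 64, 24, 3
rows = np.sort(rng.choice(N, size=M, replace=False))
A = partial_inverse_dft(N, rows)

mu = mutual_coherence(A)
delta_bound = (K - 1) * mu   # upper bound on the isometry constant delta_K
print(f"mu = {mu:.3f}, (K - 1) * mu = {delta_bound:.3f}")
```

The bound is informative only when it is below one, which in general requires low coherence or small $K$.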

Signal Recovery Using Minimization Approach.
The signal recovery problem is defined as the reconstruction of vector $\mathbf{x}$ from the measurements $\mathbf{y} = \mathbf{A}\mathbf{x}$. This problem can be generally seen as a problem of solving an underdetermined set of linear equations. However, when $\mathbf{x}$ is sparse, the problem can be reduced to the following minimization:
$$\min \|\mathbf{x}\|_0 \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\mathbf{x}.$$
The $\ell_0$-minimization requires an exhaustive search over all $\binom{N}{K}$ possible sparse combinations, which is computationally intractable. Hence, the $\ell_0$-minimization is replaced by the convex $\ell_1$-minimization, which will provide the sparse result with high probability if the measurement matrix satisfies the previous conditions. The $\ell_1$-minimization problem is defined as follows:
$$\min \|\mathbf{x}\|_1 \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\mathbf{x},$$
and it is known as basis pursuit.
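To see why the $\ell_1$ problem is tractable, it may help to recall the standard reformulation of basis pursuit as a linear program. The sketch below assumes real-valued $\mathbf{A}$ and $\mathbf{y}$ and introduces auxiliary nonnegative vectors $\mathbf{u}$ and $\mathbf{v}$, which are not part of the original notation.

```latex
% Split x into its positive and negative parts: x = u - v with u, v >= 0.
% At the optimum u and v have disjoint supports, so ||x||_1 = 1^T (u + v).
\begin{aligned}
\min_{\mathbf{u},\,\mathbf{v}} \quad & \mathbf{1}^{T}(\mathbf{u} + \mathbf{v}) \\
\text{subject to} \quad & \begin{bmatrix} \mathbf{A} & -\mathbf{A} \end{bmatrix}
\begin{bmatrix} \mathbf{u} \\ \mathbf{v} \end{bmatrix} = \mathbf{y}, \\
& \mathbf{u} \ge \mathbf{0}, \quad \mathbf{v} \ge \mathbf{0}.
\end{aligned}
```

Any general-purpose linear programming solver can then be applied, and the estimate is recovered as $\hat{\mathbf{x}} = \mathbf{u} - \mathbf{v}$.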
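In the situation when the measurements are corrupted by noise $\mathbf{e}$ of bounded level, $\mathbf{y} = \boldsymbol{\Phi}\boldsymbol{\Psi}\mathbf{x} + \mathbf{e} = \mathbf{A}\mathbf{x} + \mathbf{e}$ and $\|\mathbf{e}\|_2 \le \varepsilon$, the reconstruction problem can be defined in the form
$$\min \|\mathbf{x}\|_1 \quad \text{subject to} \quad \|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2 \le \varepsilon, \tag{19}$$
called basis pursuit denoising. The error bound for the solution $\hat{\mathbf{x}}$ of (19), where $\mathbf{A}$ satisfies the RIP of order $2K$ with $\delta_{2K} < \sqrt{2} - 1$ and $\mathbf{y} = \mathbf{A}\mathbf{x} + \mathbf{e}$, is given by
$$\|\hat{\mathbf{x}} - \mathbf{x}\|_2 \le C_0 \frac{\sigma_K(\mathbf{x})_1}{\sqrt{K}} + C_1 \varepsilon,$$
where the constants $C_0$ and $C_1$ are defined as [19]
$$C_0 = 2\,\frac{1 - (1 - \sqrt{2})\delta_{2K}}{1 - (1 + \sqrt{2})\delta_{2K}}, \qquad C_1 = 4\,\frac{\sqrt{1 + \delta_{2K}}}{1 - (1 + \sqrt{2})\delta_{2K}}.$$
For a particular regularization parameter $\lambda > 0$, the minimization problem (19) can be defined using the unconstrained version as follows:
$$\min_{\mathbf{x}} \frac{1}{2}\|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 + \lambda\|\mathbf{x}\|_1, \tag{22}$$
which is known as the Lagrangian form of basis pursuit denoising. These problems are commonly solved using primal-dual interior-point methods [22]. Another form of basis pursuit denoising is solved using the least absolute shrinkage and selection operator (LASSO), and it is defined as follows:
$$\min_{\mathbf{x}} \|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 \quad \text{subject to} \quad \|\mathbf{x}\|_1 \le \tau,$$
where $\tau$ is a nonnegative real parameter. The convex optimization methods usually require high computational complexity and high numerical precision. When the noise is unbounded, one may apply the convex program based on the Dantzig selector (it is assumed that the noise variance is $\sigma^2$ per measurement, i.e., the total variance is $M\sigma^2$):
$$\min \|\mathbf{x}\|_1 \quad \text{subject to} \quad \|\mathbf{A}^H(\mathbf{y} - \mathbf{A}\mathbf{x})\|_\infty \le \sqrt{2\log N}\,\sigma,$$
which (for enough measurements) reconstructs a signal with the error bound
$$\|\hat{\mathbf{x}} - \mathbf{x}\|_2 \le C\sigma\sqrt{2K\log N}.$$
The $\ell_\infty$ norm is the infinity norm, also called the supremum (maximum) norm.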

Review of Some Signal Reconstruction Algorithms
The $\ell_1$-minimization problems in CS signal reconstruction are usually solved using convex optimization methods. In addition, there exist greedy methods for sparse signal recovery which allow faster computation compared to $\ell_1$-minimization. Greedy algorithms can be divided into two major groups: greedy pursuit methods and thresholding-based methods. In practical applications, the widely used ones are the orthogonal matching pursuit (OMP) and compressive sampling matching pursuit (CoSaMP) from the group of greedy pursuit methods, while from the thresholding group the iterative hard thresholding (IHT) is commonly used due to its simplicity, although it may not always be efficient in providing an exact solution. Some of these algorithms are discussed in detail in this section.

Matching Pursuit.
The matching pursuit algorithm is known for its simplicity and was first introduced in [14]. It is the first algorithm from the class of iterative greedy methods that decompose a signal into a linear combination of basis functions. Through the iterations, this algorithm chooses in a greedy manner the basis functions that best match the signal. In each iteration, the algorithm also removes the signal component having the form of the selected basis function and obtains the residual. This procedure is repeated until the norm of the residual becomes lower than a certain predefined threshold value (halting criterion) (Algorithm 1). The matching pursuit algorithm, however, converges slowly and generally does not provide sufficiently sparse results.
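The steps described above can be sketched in a few lines of NumPy. This is a minimal illustration for a real-valued CS matrix, not a reproduction of Algorithm 1; the function name `matching_pursuit`, the tolerance, and the iteration cap are assumptions.

```python
import numpy as np

def matching_pursuit(A, y, tol=1e-6, max_iter=1000):
    """Greedy decomposition of y over the columns (atoms) of A."""
    A = np.asarray(A, dtype=float)
    r = np.asarray(y, dtype=float).copy()      # residual starts as the measurements
    x = np.zeros(A.shape[1])
    col_norm2 = np.sum(A ** 2, axis=0)         # squared norms of the atoms
    for _ in range(max_iter):
        if np.linalg.norm(r) < tol:            # halting criterion on the residual norm
            break
        inner = A.T @ r
        # Atom that best matches the residual (normalized correlation).
        j = int(np.argmax(np.abs(inner) / np.sqrt(col_norm2)))
        coeff = inner[j] / col_norm2[j]        # projection coefficient onto atom j
        x[j] += coeff                          # the same atom may be selected repeatedly
        r -= coeff * A[:, j]                   # remove the matched component
    return x, r

rng = np.random.default_rng(0)
M, N = 20, 40
A = rng.standard_normal((M, N))
x_true = np.zeros(N)
x_true[[3, 17, 28]] = [1.0, -2.0, 1.5]
y = A @ x_true

x_hat, r = matching_pursuit(A, y)
print("residual norm:", np.linalg.norm(r))
print("nonzeros in estimate:", np.count_nonzero(x_hat))
```

Because an atom can be picked many times and earlier coefficients are never re-optimized, the estimate typically ends up with more nonzero entries than the true vector, which is the limitation noted above.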

Orthogonal Matching Pursuit.
The orthogonal matching pursuit (OMP) has been introduced [15] as an improved version that overcomes the limitations of the matching pursuit algorithm. OMP is based on the principle of orthogonalization. In each iteration, it computes the inner products of the residual and the columns of the CS matrix, and then selects the index of the column with the maximum correlation and extracts this column. The extracted columns are included into the selected set of atoms. Then, OMP performs an orthogonal projection onto the subspace of the previously selected atoms, providing a new approximation vector used to update the residual. Here, the residual is always orthogonal to the already selected columns of the CS matrix, so no column is selected twice and the set of selected columns grows through the iterations. OMP provides better asymptotic convergence compared to the previous matching pursuit version (Algorithm 2).
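A compact sketch of the OMP iteration described above is given next. It assumes that the sparsity $K$ is known and is used as the iteration count; the function name `omp` and the test problem are illustrative assumptions rather than the content of Algorithm 2.

```python
import numpy as np

def omp(A, y, K, tol=1e-10):
    """Orthogonal matching pursuit: at most K atoms, least squares on the selected set."""
    dtype = np.result_type(A, y, np.float64)
    A = np.asarray(A, dtype=dtype)
    y = np.asarray(y, dtype=dtype)
    support = []
    coef = np.zeros(0, dtype=dtype)
    r = y.copy()
    for _ in range(K):
        corr = np.abs(A.conj().T @ r)          # correlation of residual with all columns
        if support:
            corr[support] = 0.0                # guard against reselection due to round-off
        support.append(int(np.argmax(corr)))
        # Orthogonal projection of y onto the span of the selected columns.
        coef = np.linalg.lstsq(A[:, support], y, rcond=None)[0]
        r = y - A[:, support] @ coef           # residual is orthogonal to selected columns
        if np.linalg.norm(r) < tol:
            break
    x = np.zeros(A.shape[1], dtype=dtype)
    x[support] = coef
    return x, sorted(support)

rng = np.random.default_rng(1)
M, N, K = 40, 60, 3
A = rng.standard_normal((M, N))
x_true = np.zeros(N)
x_true[[5, 22, 47]] = [1.5, -2.0, 1.0]
y = A @ x_true

x_hat, support = omp(A, y, K)
print("detected support:", support)
print("max abs error:", np.max(np.abs(x_hat - x_true)))
```

The least squares step is what distinguishes OMP from plain matching pursuit: all selected coefficients are re-estimated jointly in every iteration.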
Compressive Sampling Matching Pursuit.
In the compressive sampling matching pursuit (CoSaMP) [16], in each iteration the residual vector $\mathbf{r}$ is correlated with the columns of the CS matrix $\mathbf{A}$, forming a "proxy signal" $\mathbf{z}$. Then, the algorithm selects the $2K$ columns of $\mathbf{A}$ corresponding to the $2K$ largest absolute values of $\mathbf{z}$, where $K$ defines the signal sparsity (number of nonzero components). Namely, all but the largest $2K$ elements of $\mathbf{z}$ are set to zero and the $2K$ support $\Omega$ is obtained by finding the positions of the nonzero elements. The indices of the selected columns ($2K$ in total) are then added to the current estimate of the support of the unknown vector. After solving the least squares problem, a $3K$-sparse estimate of the unknown vector is obtained. Then, the $3K$-sparse vector is pruned to obtain the $K$-sparse estimate of the unknown vector (by setting all but the $K$ largest elements to zero). Thus, the estimate of the unknown vector remains $K$-sparse and all columns that do not correspond to the true signal components are removed, which is an improvement over OMP. Namely, if OMP selects in some iteration a column that does not correspond to a true signal component, the index of this column will remain in the final signal support and cannot be removed (Algorithm 3).

Algorithm input and output. Input: transform matrix $\boldsymbol{\Psi}$; measurement matrix $\boldsymbol{\Phi}$; CS matrix $\mathbf{A} = \boldsymbol{\Phi}\boldsymbol{\Psi}$; measurement vector $\mathbf{y}$. Output: a signal estimate $\hat{\mathbf{x}}$; an approximation $\mathbf{a}$ to the measurements $\mathbf{y}$; a residual $\mathbf{r} = \mathbf{y} - \mathbf{a}$; a set $\Omega$ with the positions of the nonzero elements of $\hat{\mathbf{x}}$.
The CoSaMP can be observed through five crucial steps:
(i) Identification (Line 5). It finds the largest $2K$ components of the signal proxy.
(ii) Support Merge (Line 6). It merges the support of the signal proxy with the support of the solution from the previous iteration.
(iii) Estimation (Line 7). It estimates a solution via least squares where the solution lies within a support T.
(iv) Pruning (Line 8). It takes the solution estimate and compresses it to the required support.
(v) Sample Update (Line 9). It updates the "sample," meaning the residual as the part of signal that has not been approximated.
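The five steps listed above map directly onto code. The following is a minimal sketch for a real-valued CS matrix, assuming the sparsity $K$ is known and $3K \le M$; the function name `cosamp`, the stopping rule, and the test problem are illustrative assumptions and do not reproduce the line numbering of Algorithm 3.

```python
import numpy as np

def cosamp(A, y, K, max_iter=50, tol=1e-10):
    """CoSaMP sketch: identify 2K atoms, merge, least squares, prune to K, update residual."""
    A = np.asarray(A, dtype=float)
    y = np.asarray(y, dtype=float)
    N = A.shape[1]
    x = np.zeros(N)
    r = y.copy()
    for _ in range(max_iter):
        z = A.T @ r                                    # signal proxy
        omega = np.argsort(np.abs(z))[-2 * K:]         # (i) identification: 2K largest
        T = np.union1d(omega, np.flatnonzero(x))       # (ii) support merge (at most 3K)
        b = np.zeros(N)
        b[T] = np.linalg.lstsq(A[:, T], y, rcond=None)[0]   # (iii) estimation on T
        keep = np.argsort(np.abs(b))[-K:]              # (iv) pruning to K largest
        x = np.zeros(N)
        x[keep] = b[keep]
        r = y - A @ x                                  # (v) sample (residual) update
        if np.linalg.norm(r) < tol * np.linalg.norm(y):
            break
    return x

rng = np.random.default_rng(2)
M, N, K = 40, 80, 4
A = rng.standard_normal((M, N)) / np.sqrt(M)
x_true = np.zeros(N)
x_true[[6, 19, 44, 71]] = [1.0, -1.5, 2.0, 0.8]
y = A @ x_true

x_hat = cosamp(A, y, K)
print("max abs error:", np.max(np.abs(x_hat - x_true)))
```

The pruning step is what allows a wrongly selected column to leave the support in a later iteration, in contrast to OMP.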

Iterative Hard Thresholding Algorithm.
Another group of algorithms for signal reconstruction from a small set of measurements is based on iterative thresholding. Generally, these algorithms are composed of two main steps. The first one is the optimization of the least squares term, which is done by solving the optimization problem without $\ell_1$-minimization. The other one is the decreasing of the $\ell_1$-norm, which is done by applying the thresholding operator to the magnitudes of the entries in $\mathbf{x}$. In each iteration, the sparse vector $\mathbf{x}_i$ is estimated from the previous version $\mathbf{x}_{i-1}$ using the negative gradient of the objective function defined as
$$f(\mathbf{x}) = \frac{1}{2}\|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2,$$
while the negative gradient is then
$$-\nabla f(\mathbf{x}) = \mathbf{A}^T(\mathbf{y} - \mathbf{A}\mathbf{x}).$$
Generally, the obtained estimate $\mathbf{x}_i$ is not sparse, and we need to set all but the $K$ largest components to zero using the thresholding operator. Here, we distinguish two types of thresholding. Hard thresholding [24,25] sets all but the $K$ largest magnitude values to zero, where the thresholding operator can be written as
$$H_K(\mathbf{x})(k) = \begin{cases} x(k), & |x(k)| \ge \eta, \\ 0, & \text{otherwise}, \end{cases}$$
where $\eta$ is the magnitude of the $K$th largest component of $\mathbf{x}$. The algorithm is summarized in Algorithm 4. The stopping criterion for IHT can be a fixed number of iterations, or the algorithm terminates when the sparse vector does not change much between consecutive iterations.
The soft-thresholding function with threshold $\lambda$ can be defined as
$$\operatorname{soft}(x(k), \lambda) = \operatorname{sign}(x(k)) \max(|x(k)| - \lambda, 0),$$
and it is applied to each element of $\mathbf{x}$.
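The two thresholding operators, and the iterations built on them, can be sketched as follows for real-valued data. The step size $1/\|\mathbf{A}\|_2^2$ in `iht` and the parameter $\alpha = \|\mathbf{A}\|_2^2$ in `ista` are assumptions made here so that both iterations are non-expansive; the `ista` update is the standard soft-thresholded gradient step for the objective $\frac{1}{2}\|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 + \lambda\|\mathbf{x}\|_1$. All function names and the test problem are illustrative.

```python
import numpy as np

def hard_threshold(x, K):
    """Keep the K largest-magnitude entries of x and set the rest to zero."""
    out = np.zeros_like(x)
    keep = np.argsort(np.abs(x))[-K:]
    out[keep] = x[keep]
    return out

def soft_threshold(x, lam):
    """Shrink every entry of x toward zero by lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def iht(A, y, K, n_iter=500):
    """Iterative hard thresholding with a fixed step size (assumed 1 / ||A||_2^2)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = hard_threshold(x + step * A.T @ (y - A @ x), K)   # gradient step, then threshold
    return x

def ista(A, y, lam, n_iter=2000):
    """Iterative soft thresholding for 0.5 * ||y - Ax||^2 + lam * ||x||_1."""
    alpha = np.linalg.norm(A, 2) ** 2        # largest eigenvalue of A^T A
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(x + A.T @ (y - A @ x) / alpha, lam / alpha)
    return x

rng = np.random.default_rng(3)
M, N, K = 40, 80, 3
A = rng.standard_normal((M, N)) / np.sqrt(M)
x_true = np.zeros(N)
x_true[[7, 30, 61]] = [1.5, -2.0, 1.0]
y = A @ x_true

x_iht = iht(A, y, K)
x_ista = ista(A, y, lam=0.01)
print("IHT residual norm:", np.linalg.norm(y - A @ x_iht))
print("ISTA largest entries at:", np.sort(np.argsort(np.abs(x_ista))[-K:]))
```

Hard thresholding enforces exactly $K$ nonzero entries, while soft thresholding yields a solution whose amplitudes are slightly biased toward zero by the regularization parameter.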

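Algorithm 3: CoSaMP. Input: transform matrix $\boldsymbol{\Psi}$; measurement matrix $\boldsymbol{\Phi}$; CS matrix $\mathbf{A} = \boldsymbol{\Phi}\boldsymbol{\Psi}$; measurement vector $\mathbf{y}$; $K$ being the signal sparsity. Output: $K$-sparse approximation $\hat{\mathbf{x}}$ of the target signal.

Iterative Soft Thresholding for Solving LASSO.
Let us consider the minimization of the function
$$f(\mathbf{x}) = \frac{1}{2}\|\mathbf{y} - \mathbf{A}\mathbf{x}\|_2^2 + \lambda\|\mathbf{x}\|_1, \tag{32}$$
as given by the LASSO minimization problem (22). One of the algorithms that has been used for solving (32) is the iterative soft-thresholding algorithm (ISTA), also called the thresholded Landweber algorithm. In order to provide an iterative procedure for solving the considered minimization problem, we use the majorization-minimization approach [26] to minimize $f(\mathbf{x})$. Hence, we should create a function $G_i(\mathbf{x})$ that is equal to $f(\mathbf{x})$ at $\mathbf{x}_i$ and otherwise upper-bounds $f(\mathbf{x})$. Minimizing a majorizer $G_i(\mathbf{x})$ is easier and avoids solving a system of equations. Hence [27],
$$G_i(\mathbf{x}) = f(\mathbf{x}) + \frac{1}{2}(\mathbf{x} - \mathbf{x}_i)^T(\alpha\mathbf{I} - \mathbf{A}^T\mathbf{A})(\mathbf{x} - \mathbf{x}_i),$$
and $\alpha$ is such that the added function is nonnegative, meaning that $\alpha$ must be equal to or greater than the maximum eigenvalue of $\mathbf{A}^T\mathbf{A}$:
$$\alpha \ge \operatorname{maxeig}(\mathbf{A}^T\mathbf{A}).$$
In order to minimize the smooth (quadratic) part of $G_i(\mathbf{x})$, its gradient should be zero:
$$-\mathbf{A}^T(\mathbf{y} - \mathbf{A}\mathbf{x}) + (\alpha\mathbf{I} - \mathbf{A}^T\mathbf{A})(\mathbf{x} - \mathbf{x}_i) = \mathbf{0}.$$
The solution is
$$\mathbf{b}_i = \mathbf{x}_i + \frac{1}{\alpha}\mathbf{A}^T(\mathbf{y} - \mathbf{A}\mathbf{x}_i),$$
so that, up to a constant that does not depend on $\mathbf{x}$,
$$G_i(\mathbf{x}) = \frac{\alpha}{2}\|\mathbf{b}_i - \mathbf{x}\|_2^2 + \lambda\|\mathbf{x}\|_1.$$
The minimizer of this function is obtained using the soft-thresholding approach applied elementwise, with threshold $\lambda/\alpha$. Hence, we may write the soft-thresholding-based solution as follows:
$$\mathbf{x} = \operatorname{soft}\left(\mathbf{x}_i + \frac{1}{\alpha}\mathbf{A}^T(\mathbf{y} - \mathbf{A}\mathbf{x}_i), \frac{\lambda}{\alpha}\right).$$
In terms of iteration $i + 1$, the previous relation can be rewritten as
$$\mathbf{x}_{i+1} = \operatorname{soft}\left(\mathbf{x}_i + \frac{1}{\alpha}\mathbf{A}^T(\mathbf{y} - \mathbf{A}\mathbf{x}_i), \frac{\lambda}{\alpha}\right).$$

Automated Threshold Based Iterative Solution.
This algorithm (see [12,13]) starts from the assumption that the missing samples produce noise in the DFT domain, whose variance depends on $M$, $N$, and the measurement vector $\mathbf{y}$, where $M$ is the number of available samples, $N$ is the total number of samples, and $\mathbf{y}$ is the measurement vector. Consequently, using the variance of the noise, it is possible to define a threshold that separates the signal components from the nonsignal components in the DFT domain. For a desired probability $P(T)$, the threshold $T$ is derived from this noise variance. The automated threshold based algorithm, in each iteration, detects a certain number of DFT components above the threshold. A set of positions $\mathbf{k}$ corresponding to the detected components is selected and the contribution of these components is removed from the signal's DFT. This reveals the remaining components that are below the noise level. Further, it is necessary to update the noise variance and the threshold value for the new iteration.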
Since the algorithm detects a set of components in each iteration, it usually needs just a couple of iterations to recover the entire signal. In the case that all signal components are above the noise level in DFT, the component detection and reconstruction are achieved in single iteration.
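A single iteration of a threshold-based detection of this kind can be sketched as follows. Two ingredients are assumptions introduced here for illustration and are not taken from the text above: the noise variance is modeled as $\sigma^2 = M\frac{N - M}{N - 1}\cdot\frac{1}{M}\sum_m |y(m)|^2$ (the variance of a sum of $M$ out of $N$ samples drawn without replacement), and the threshold is obtained as $T = \sqrt{-\sigma^2 \ln(1 - P^{1/N})}$ from a Rayleigh model of the noise magnitude. The amplitudes on the detected support are then computed by least squares. The function name `threshold_detect` is hypothetical.

```python
import numpy as np

def threshold_detect(y, positions, N, P=0.9999):
    """One detection pass: threshold the DFT of the zero-filled signal, then least squares."""
    positions = np.asarray(positions)
    M = len(positions)
    s0 = np.zeros(N, dtype=complex)
    s0[positions] = y                        # missing samples are treated as zeros
    X0 = np.fft.fft(s0)                      # initial DFT: signal peaks plus noise
    # Assumed variance of the DFT noise caused by the missing samples.
    sigma2 = M * (N - M) / (N - 1) * np.mean(np.abs(y) ** 2)
    # Assumed threshold: all N noise-only bins stay below T with probability P.
    T = np.sqrt(-sigma2 * np.log(1.0 - P ** (1.0 / N)))
    support = np.flatnonzero(np.abs(X0) > T)
    # Partial inverse DFT matrix: rows at available positions, columns at the support.
    A_K = np.exp(2j * np.pi * np.outer(positions, support) / N) / N
    X = np.zeros(N, dtype=complex)
    X[support] = np.linalg.lstsq(A_K, y, rcond=None)[0]
    return X, support, T

rng = np.random.default_rng(4)
N, M = 128, 96
k_true = np.array([10, 37, 90])
amps = np.array([1.0, 0.9, 0.8])
n = np.arange(N)
s = (amps[None, :] * np.exp(2j * np.pi * np.outer(n, k_true) / N)).sum(axis=1)
positions = np.sort(rng.choice(N, size=M, replace=False))
y = s[positions]

X_hat, support, T = threshold_detect(y, positions, N)
print("threshold:", T)
print("detected support:", support)
```

In this example all components are well above the noise level, so a single pass suffices; with weaker components the detection would be repeated after removing the contribution of the components already found.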

Adaptive Gradient-Based Algorithm.
This algorithm belongs to the group of convex minimization approaches [11]. Unlike other convex minimization approaches, this method starts from some initial values of the unavailable samples (initial state), which are changed through the iterations so as to constantly improve the concentration in the sparsity domain. In general, it does not require the signal to be strictly sparse in a certain transform domain, which is an advantage over other methods. In particular, the missing samples in the signal domain can initially be considered as zero values. In each iteration, the missing samples are changed by $+\Delta$ and by $-\Delta$. Then, for both changes, the concentration is measured as the $\ell_1$-norm of the transform domain vectors $\mathbf{x}^+$ (for the $+\Delta$ change) and $\mathbf{x}^-$ (for the $-\Delta$ change), while the gradient is determined using their difference (lines 8, 9, and 10). Finally, the gradient is used to update the values of the missing samples. Here, it is important to note that each sample is observed separately in this algorithm and one iteration is finished when all samples are processed. When the algorithm reaches the vicinity of the sparsity measure minimum, the gradient changes direction by almost 180 degrees (line 16), meaning that the step $\Delta$ needs to be decreased (line 17).
The precision of the result in this iterative algorithm is estimated based on the change of the result in the last iteration (lines 18 and 19).
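A simplified sketch of this gradient-based recovery is given below, using the DFT as the sparsity domain. It departs from the description above in one respect, stated here as an assumption: instead of monitoring the angle between successive gradient vectors, the step $\Delta$ is reduced by a factor of $\sqrt{10}$ after a fixed number of inner iterations. The normalization of the measure by $N$, the function name `gradient_reconstruct`, and all numerical parameters are likewise illustrative choices.

```python
import numpy as np

def gradient_reconstruct(y, available, N, n_levels=15, n_inner=30):
    """Recover missing samples by decreasing the l1-norm of the DFT with finite differences."""
    available = np.asarray(available)
    missing = np.setdiff1d(np.arange(N), available)
    s = np.zeros(N)
    s[available] = y                          # missing samples start from zero
    delta = np.max(np.abs(y))                 # initial step
    measure = lambda v: np.sum(np.abs(np.fft.fft(v))) / N   # normalized l1 concentration
    for _ in range(n_levels):
        for _ in range(n_inner):
            g = np.zeros(N)
            for i in missing:                 # each missing sample is observed separately
                s_plus = s.copy()
                s_plus[i] += delta
                s_minus = s.copy()
                s_minus[i] -= delta
                g[i] = measure(s_plus) - measure(s_minus)   # finite-difference gradient
            s = s - g                         # update only the missing samples (g is 0 elsewhere)
        delta /= np.sqrt(10.0)                # assumed fixed schedule for reducing the step
    return s

rng = np.random.default_rng(5)
N = 64
n = np.arange(N)
s_true = np.cos(2 * np.pi * 5 * n / N) + 0.5 * np.sin(2 * np.pi * 12 * n / N)
available = np.sort(rng.choice(N, size=40, replace=False))
y = s_true[available]

s_hat = gradient_reconstruct(y, available, N)
rel_err = np.linalg.norm(s_hat - s_true) / np.linalg.norm(s_true)
print("relative reconstruction error:", rel_err)
```

Since the available samples are never modified, the measurement constraint is satisfied by construction, and only the concentration measure is minimized over the missing samples.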

CS in the Standard Fourier Transform Domain.
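As the simplest case of a CS scenario, we will assume a multicomponent signal that consists of $K$ sinusoids and is sparse in the DFT domain:
$$s(n) = \sum_{i=1}^{K} A_i e^{j2\pi k_i n / N},$$
where the sparsity level is $K \ll N$, with $N$ being the total signal length, while $A_i$ and $k_i$ denote the amplitudes and frequencies of the signal components, respectively. Since $\mathbf{s}$ is sparse in the DFT domain, we can write
$$\mathbf{s} = \boldsymbol{\Psi}\mathbf{X} = \mathbf{F}_N^{-1}\mathbf{X},$$
where $\mathbf{X}$ is the vector of DFT coefficients in which at most $K$ coefficients are nonzero, while $\mathbf{F}_N^{-1}$ is the inverse Fourier transform matrix of size $N \times N$. If $\mathbf{s}$ is a signal in a compressive sensing application, then only the random measurements $\mathbf{y} \subset \mathbf{s}$ are available, and these are defined by the set of $M$ positions $\{n_1, n_2, \ldots, n_M\}$. Therefore, the measurement process can be modeled by the matrix $\boldsymbol{\Phi}$:
$$\mathbf{y} = \boldsymbol{\Phi}\mathbf{s} = \boldsymbol{\Phi}\mathbf{F}_N^{-1}\mathbf{X} = \mathbf{A}\mathbf{X}.$$
The CS matrix $\mathbf{A}$ represents a partial random inverse Fourier transform matrix obtained by omitting the rows of $\mathbf{F}_N^{-1}$ that correspond to the positions of unavailable samples:
$$\mathbf{A} = \boldsymbol{\Phi}\boldsymbol{\Psi} = \begin{bmatrix} \psi_0(n_1) & \psi_1(n_1) & \cdots & \psi_{N-1}(n_1) \\ \psi_0(n_2) & \psi_1(n_2) & \cdots & \psi_{N-1}(n_2) \\ \vdots & \vdots & \ddots & \vdots \\ \psi_0(n_M) & \psi_1(n_M) & \cdots & \psi_{N-1}(n_M) \end{bmatrix},$$
with the Fourier basis functions
$$\psi_k(n) = \frac{1}{N} e^{j2\pi nk / N}.$$
The CS problem formulation is now given as
$$\min \|\mathbf{X}\|_1 \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\mathbf{X}.$$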
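The construction of the partial inverse Fourier matrix can be checked numerically with a few lines of NumPy. The sizes, frequencies, and amplitudes below are arbitrary illustrative values, and the unnormalized DFT convention of `numpy.fft.fft` is assumed, so that the inverse matrix carries the factor $1/N$.

```python
import numpy as np

rng = np.random.default_rng(6)
N, M = 64, 24
k_true = np.array([4, 17, 40])           # frequencies of the K = 3 sinusoids
amps = np.array([1.0, 0.7, 0.4])         # their amplitudes
n = np.arange(N)

# Multicomponent signal that is sparse in the DFT domain.
s = (amps[None, :] * np.exp(2j * np.pi * np.outer(n, k_true) / N)).sum(axis=1)

# Full inverse DFT matrix (sparsity basis) and random measurement positions.
F_inv = np.exp(2j * np.pi * np.outer(n, n) / N) / N
positions = np.sort(rng.choice(N, size=M, replace=False))

A = F_inv[positions, :]                  # CS matrix: keep only rows of available samples
y = s[positions]                         # available measurements

X = np.fft.fft(s)                        # DFT vector of the full signal
print("nonzero DFT coefficients:", np.flatnonzero(np.abs(X) > 1e-9))
print("measurement model holds:", np.allclose(A @ X, y))
```

Any of the reconstruction algorithms reviewed earlier can then be applied to the pair `(A, y)` to estimate the sparse DFT vector.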

CS in the Polynomial Fourier Transform Domain.
In this section, we consider the possibility of CS reconstruction of polynomial phase signals [28]. Observe the multicomponent polynomial phase signal vector $\mathbf{s}$, with elements $s(n)$:
$$s(n) = \sum_{i=1}^{K} s_i(n) = \sum_{i=1}^{K} r_i e^{j(2\pi/N)\left(n a_{1i} + n^2 a_{2i}/2 + \cdots + n^p a_{pi}/p!\right)},$$
where the polynomial coefficients are assumed to be bounded integers. The assumption is that the signal can be considered as sparse in the polynomial Fourier transform (PFT) domain. Hence, the discrete PFT form is given by
$$X(k_1, k_2, \ldots, k_p) = \sum_{n=0}^{N-1} s(n) e^{-j(2\pi/N)\left(n^2 k_2/2 + \cdots + n^p k_p/p!\right)} e^{-j(2\pi/N) n k_1}. \tag{50}$$
If we choose a set of parameters $(k_2, k_3, \ldots, k_p)$ that match the polynomial phase coefficients $(a_{2i}, a_{3i}, \ldots, a_{pi})$,
$$(k_2, k_3, \ldots, k_p) = (a_{2i}, a_{3i}, \ldots, a_{pi}), \tag{51}$$
then the $i$th signal component is demodulated and becomes a sinusoid $r_i e^{j(2\pi/N)(n a_{1i})}$, which is dominant in the PFT. In other words, the spectrum is highly concentrated at $k_1 = a_{1i}$. Therefore, we might say that if (51) is satisfied, then the PFT is compressible with the dominant $i$th component. Note that the sparsity (compressibility) in the PFT domain is observed with respect to the single demodulated component.
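In order to define the CS problem in the PFT domain, instead of the signal $\mathbf{s}$ itself, we will consider a modified signal form obtained after the multiplication with the exponential term
$$\xi(n) = e^{-j(2\pi/N)\left(n^2 k_2/2 + \cdots + n^p k_p/p!\right)},$$
given in the PFT definition (50). The new signal form is obtained as
$$z(n) = s(n)\xi(n),$$
or in the vector form, with $\circ$ denoting the elementwise product,
$$\mathbf{z} = \mathbf{s} \circ \boldsymbol{\xi}.$$
Then, the PFT definition given by (50) can be rewritten in the vector form as follows:
$$\mathbf{X}_\xi = \mathbf{F}_N \mathbf{z},$$
where $\mathbf{F}_N$ is the discrete Fourier transform matrix ($N \times N$). For a chosen set of parameters $(k_2, k_3, \ldots, k_p)$ in $\boldsymbol{\xi}$ that is equal to the set $(a_{2i}, a_{3i}, \ldots, a_{pi})$ in $\mathbf{s}$, $\mathbf{X}_\xi = \mathbf{X}_{\xi_i}$ is characterized by one dominant sinusoidal component at the frequency $k_{1i}$.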
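The demodulation effect can be illustrated for the simplest case of a single second-order polynomial phase signal (a linear chirp). The sketch below assumes the phase term has the form $n^2 a_2 / 2$ with an integer chirp rate $a_2$; the function name `pft_second_order` and the numerical values are illustrative assumptions.

```python
import numpy as np

def pft_second_order(s, k2):
    """Second-order PFT: demodulate with exp(-j(2*pi/N) k2 n^2 / 2), then take the DFT over k1."""
    N = len(s)
    n = np.arange(N)
    xi = np.exp(-1j * 2 * np.pi / N * k2 * n ** 2 / 2)   # demodulation term
    return np.fft.fft(s * xi)

N = 128
n = np.arange(N)
a1, a2 = 20, 3                                           # assumed phase coefficients
s = np.exp(1j * 2 * np.pi / N * (a1 * n + a2 * n ** 2 / 2))

X_match = pft_second_order(s, k2=a2)                     # chirp rate matched
X_mismatch = pft_second_order(s, k2=a2 + 1)              # chirp rate off by one

print("matched:    peak at k1 =", int(np.argmax(np.abs(X_match))),
      "with magnitude", np.abs(X_match).max())
print("mismatched: largest magnitude", np.abs(X_mismatch).max())
```

With the matched parameter the whole signal energy collapses into one coefficient, whereas with a mismatched parameter the spectrum is spread over all frequencies and no dominant component appears.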

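CS Scenario.
Now, assume that $\mathbf{z}$ is incomplete in the compressive sensing sense, and instead of $\mathbf{z}$ we are dealing with the available measurements defined by the vector $\mathbf{y}_\xi$, and
$$\mathbf{y}_\xi = \mathbf{A}\mathbf{X}_\xi,$$
where $\mathbf{A}$ is the partial random Fourier matrix obtained by omitting the rows of $\mathbf{F}_N^{-1}$ that correspond to the unavailable samples. When $(k_2, k_3, \ldots, k_p) = (a_{2i}, a_{3i}, \ldots, a_{pi})$, then $\mathbf{X}_\xi$ can be observed as a demodulated version of the $i$th signal component, $\mathbf{X}_{\xi_i}$, having the dominant $i$th component in the spectrum with the support $k_{1i}$. The rest of the components in the spectrum are much lower than the dominant one and could be observed as noise. Hence, we may write the minimization problem in the form
$$\min \|\mathbf{X}_{\xi_i}\|_1 \quad \text{subject to} \quad \mathbf{y}_\xi = \mathbf{A}\mathbf{X}_{\xi_i},$$
where only the components that exceed a certain threshold $T$ are retained. The dominant components can be detected using the threshold based algorithm (Algorithm 5) described in the previous section. Using an iterative procedure, one may change the values of the parameters $k_2, \ldots, k_p$ between $k_{\min}$ and $k_{\max}$, $\forall p$. The algorithm will detect the support $k_{1i}$ when the set $(k_2, k_3, \ldots, k_p)$ matches the set $(a_{2i}, a_{3i}, \ldots, a_{pi})$; otherwise, no component support is detected. Hence, as a result of this phase, we have identified the sets of signal phase parameters: $\mathbf{k}_i = (k_{1i}, k_{2i}, \ldots, k_{pi}) = (a_{1i}, a_{2i}, \ldots, a_{pi})$. In order to calculate the exact amplitudes $r_1, r_2, \ldots, r_K$ of the signal components, we observe the set of equations in the form
$$y(n_m) = \sum_{i=1}^{K} r_i e^{j(2\pi/N)\left(n_m k_{1i} + n_m^2 k_{2i}/2 + \cdots + n_m^p k_{pi}/p!\right)}, \quad m = 1, \ldots, M,$$
or in other words we have another system of equations given by
$$\mathbf{y} = \mathbf{A}\mathbf{R},$$
where $\mathbf{R} = [r_1, \ldots, r_K]^T$ contains the desired signal amplitudes. The matrix $\mathbf{A}$ of size $M \times K$ is based on the PFT:
$$\mathbf{A} = \begin{bmatrix} e^{j(2\pi/N)\left(n_1 k_{11} + \cdots + n_1^p k_{p1}/p!\right)} & \cdots & e^{j(2\pi/N)\left(n_1 k_{1K} + \cdots + n_1^p k_{pK}/p!\right)} \\ \vdots & \ddots & \vdots \\ e^{j(2\pi/N)\left(n_M k_{11} + \cdots + n_M^p k_{p1}/p!\right)} & \cdots & e^{j(2\pi/N)\left(n_M k_{1K} + \cdots + n_M^p k_{pK}/p!\right)} \end{bmatrix}. \tag{60}$$
The rows of $\mathbf{A}$ correspond to the positions of measurements $(n_1, n_2, \ldots, n_M)$, and the columns correspond to the phase parameters $\mathbf{k}_i = (k_{1i}, k_{2i}, \ldots, k_{pi}) = (a_{1i}, a_{2i}, \ldots, a_{pi})$, for $i = 1, \ldots, K$. The solution of the observed problem can be obtained in the least squares sense as follows:
$$\mathbf{R} = (\mathbf{A}^*\mathbf{A})^{-1}\mathbf{A}^*\mathbf{y}.$$
The resulting reconstructed signal is obtained as
$$s(n) = \sum_{i=1}^{K} r_i e^{j(2\pi/N)\left(n k_{1i} + n^2 k_{2i}/2 + \cdots + n^p k_{pi}/p!\right)}.$$

CS in the Hermite Transform Domain.
The Hermite expansion of a signal $s(t)$ can be written as [29,30]:
$$s(t) = \sum_{p=0}^{N-1} c_p \psi_p(t, \sigma).$$
Here, the $p$th order Hermite basis function is defined using the $p$th order Hermite polynomial $H_p$ as follows:
$$\psi_p(t, \sigma) = \left(\sigma 2^p p! \sqrt{\pi}\right)^{-1/2} e^{-t^2/(2\sigma^2)} H_p(t/\sigma),$$
where $\sigma$ is the scaling factor used to "stretch" or "compress" the Hermite functions in order to match the signal. The Hermite functions are usually calculated using the following fast recursive realization:
$$\psi_0(t, \sigma) = \frac{1}{\sqrt{\sigma\sqrt{\pi}}} e^{-t^2/(2\sigma^2)}, \qquad \psi_1(t, \sigma) = \frac{\sqrt{2}\,t}{\sigma}\psi_0(t, \sigma),$$
$$\psi_p(t, \sigma) = \frac{t}{\sigma}\sqrt{\frac{2}{p}}\,\psi_{p-1}(t, \sigma) - \sqrt{\frac{p-1}{p}}\,\psi_{p-2}(t, \sigma), \quad p \ge 2.$$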

Mathematical Problems in Engineering
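The Hermite transform coefficients can be calculated in the matrix form as follows:
$$\mathbf{c} = \mathbf{H}\mathbf{s}.$$
Following the Gauss-Hermite approximation, with the signal sampled at the points $t_1, t_2, \ldots, t_N$, the inverse matrix $\boldsymbol{\Psi} = \mathbf{H}^{-1}$ contains the Hermite functions and it is given by
$$\boldsymbol{\Psi} = \begin{bmatrix} \psi_0(t_1) & \psi_1(t_1) & \cdots & \psi_{N-1}(t_1) \\ \psi_0(t_2) & \psi_1(t_2) & \cdots & \psi_{N-1}(t_2) \\ \vdots & \vdots & \ddots & \vdots \\ \psi_0(t_N) & \psi_1(t_N) & \cdots & \psi_{N-1}(t_N) \end{bmatrix}. \tag{67}$$
Now, in the context of CS, let us assume that a signal $\mathbf{s}$ is $K$-sparse in the Hermite transform domain, meaning that it can be represented by a small set of nonzero Hermite coefficients $\mathbf{c}$:
$$s(t_n) = \sum_{i=1}^{K} c_{p_i} \psi_{p_i}(t_n),$$
where $c_p$ is the $p$th element from the vector $\mathbf{c}$ of Hermite expansion coefficients. In the matrix form we may write
$$\mathbf{s} = \boldsymbol{\Psi}\mathbf{c},$$
such that most of the coefficients in $\mathbf{c}$ are zero. Furthermore, assume that the compressive sensing is done using a random selection of $M$ signal values at the positions $[n_1, n_2, \ldots, n_M]$. Then, the CS matrix $\mathbf{A}$ can be defined as a random partial inverse Hermite transform matrix:
$$\mathbf{A} = \boldsymbol{\Phi}\boldsymbol{\Psi} = \begin{bmatrix} \psi_0(t_{n_1}) & \psi_1(t_{n_1}) & \cdots & \psi_{N-1}(t_{n_1}) \\ \psi_0(t_{n_2}) & \psi_1(t_{n_2}) & \cdots & \psi_{N-1}(t_{n_2}) \\ \vdots & \vdots & \ddots & \vdots \\ \psi_0(t_{n_M}) & \psi_1(t_{n_M}) & \cdots & \psi_{N-1}(t_{n_M}) \end{bmatrix}.$$
For a vector of available measurements $\mathbf{y} = (s(t_{n_1}), s(t_{n_2}), \ldots, s(t_{n_M}))$, the linear system of equations (an underdetermined system of $M$ linear equations and $N$ unknowns) can be written as
$$\mathbf{y} = \mathbf{A}\mathbf{c}.$$
Since $\mathbf{c}$ has only $K$ nonzero components, the reconstruction problem can be defined as
$$\min \|\mathbf{c}\|_1 \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\mathbf{c}.$$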
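The Hermite basis matrix and the corresponding CS matrix can be built numerically as sketched below. The sketch assumes the scaling factor $\sigma = 1$, samples the signal at the zeros of the $N$th order Hermite polynomial (obtained with `numpy.polynomial.hermite.hermgauss`), and uses the Gauss-Hermite quadrature form of the direct transform, $H(p, m) = \psi_p(t_m) / \left(N \psi_{N-1}^2(t_m)\right)$, which is an assumption not spelled out in the text above. The recovery step uses least squares on a known support, purely to verify the model.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss

def hermite_functions(t, P):
    """Matrix Psi with Psi[m, p] = psi_p(t[m]), p = 0..P-1, via the three-term recursion."""
    Psi = np.zeros((len(t), P))
    Psi[:, 0] = np.pi ** -0.25 * np.exp(-t ** 2 / 2)
    if P > 1:
        Psi[:, 1] = np.sqrt(2.0) * t * Psi[:, 0]
    for p in range(2, P):
        Psi[:, p] = (t * np.sqrt(2.0 / p) * Psi[:, p - 1]
                     - np.sqrt((p - 1) / p) * Psi[:, p - 2])
    return Psi

N = 20
t, _ = hermgauss(N)                       # zeros of the N-th order Hermite polynomial
Psi = hermite_functions(t, N)             # inverse Hermite transform matrix
H = (Psi / (N * Psi[:, [N - 1]] ** 2)).T  # direct transform (Gauss-Hermite quadrature)

# K-sparse signal in the Hermite domain and its random measurements.
rng = np.random.default_rng(7)
support = np.array([0, 2, 5])
c = np.zeros(N)
c[support] = [1.0, -0.6, 0.3]
s = Psi @ c
positions = np.sort(rng.choice(N, size=12, replace=False))
A = Psi[positions, :]                     # random partial inverse Hermite transform matrix
y = s[positions]

# With the support known, the coefficients follow from least squares.
c_hat = np.zeros(N)
c_hat[support] = np.linalg.lstsq(A[:, support], y, rcond=None)[0]
print("H is the inverse of Psi:", np.allclose(H @ Psi, np.eye(N), atol=1e-8))
print("max coefficient error:", np.max(np.abs(c_hat - c)))
```

In a real reconstruction the support is not known in advance and would first be estimated, for example with OMP applied to `(A, y)`.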
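If we identify the signal support in the Hermite transform domain by the set of indices $[p_1, p_2, \ldots, p_K]$, then the problem can be solved in the least squares sense as
$$\hat{\mathbf{c}}_K = (\mathbf{A}_K^*\mathbf{A}_K)^{-1}\mathbf{A}_K^*\mathbf{y},$$
where $(*)$ denotes the conjugate transpose operation, while $\mathbf{A}_K$ and $\hat{\mathbf{c}}_K$ denote the matrix formed by the columns of $\mathbf{A}$ at the support positions and the vector of the corresponding Hermite coefficients, respectively.

CS in the Combined Time-Frequency Domain.
The short-time Fourier transform (STFT) at an instant $n$ can be written in the matrix form
$$\mathbf{STFT}_M(n) = \mathbf{F}_M \mathbf{s}(n),$$
where $\mathbf{s}(n) = [s(n), s(n+1), \ldots, s(n+M-1)]^T$ is the signal segment within the window and $\mathbf{F}_M$ is the standard Fourier transform matrix of size $M \times M$ with elements
$$F_M(m, k) = e^{-j2\pi km/M}.$$
If we consider the nonoverlapping signal segments for the STFT calculation, then $n$ takes the values from the set $\{0, M, 2M, \ldots, N - M\}$, where $M$ is the window length while $N$ is the total signal length. The STFT vector, for $n \in \{0, M, 2M, \ldots, N - M\}$, is given by
$$\mathbf{STFT} = \begin{bmatrix} \mathbf{STFT}_M(0) \\ \mathbf{STFT}_M(M) \\ \vdots \\ \mathbf{STFT}_M(N - M) \end{bmatrix} = \begin{bmatrix} \mathbf{F}_M & \mathbf{0} & \cdots & \mathbf{0} \\ \mathbf{0} & \mathbf{F}_M & \cdots & \mathbf{0} \\ \vdots & \vdots & \ddots & \vdots \\ \mathbf{0} & \mathbf{0} & \cdots & \mathbf{F}_M \end{bmatrix} \mathbf{s} = \mathbf{F}_{M,N}\,\mathbf{s}.$$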
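Furthermore, if we assume that the STFT vector is not completely available but has only a small percentage of available samples, then we are dealing with the CS problem in the form
$$\mathbf{y}_{STFT} = \mathbf{A}\mathbf{X},$$
where $\mathbf{y}_{STFT}$ is a vector of available measurements from the STFT, $\mathbf{X}$ is the DFT vector of the signal ($\mathbf{s} = \mathbf{F}_N^{-1}\mathbf{X}$), while $\mathbf{A}$ is the CS matrix obtained as a partial combined transform matrix $\boldsymbol{\Psi} = \mathbf{F}_{M,N}\mathbf{F}_N^{-1}$, with the rows corresponding to the available samples $\mathbf{y}_{STFT}$. In order to reconstruct the entire STFT, we can define the minimization problem in the form
$$\min \|\mathbf{X}\|_1 \quad \text{subject to} \quad \mathbf{y}_{STFT} = \mathbf{A}\mathbf{X},$$
which can be solved using any of the algorithms provided in the previous section.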
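The combined transform matrix that links the sparse DFT vector to the stacked STFT vector can be assembled and verified as in the sketch below. It assumes nonoverlapping rectangular windows, the unnormalized DFT convention, and small illustrative sizes; the variable names (`F_MN`, `Psi`, `avail`) are hypothetical.

```python
import numpy as np

N, M = 64, 8                                  # signal length and window width
n = np.arange(N)
m = np.arange(M)

F_M = np.exp(-2j * np.pi * np.outer(m, m) / M)        # M x M DFT matrix
F_MN = np.kron(np.eye(N // M), F_M)                   # block-diagonal STFT operator
F_N_inv = np.exp(2j * np.pi * np.outer(n, n) / N) / N # N x N inverse DFT matrix
Psi = F_MN @ F_N_inv                                  # maps the DFT vector to the STFT vector

# Stationary test signal: two sinusoids, hence a sparse DFT vector.
s = np.exp(2j * np.pi * 8 * n / N) + 0.5 * np.exp(2j * np.pi * 24 * n / N)
X = np.fft.fft(s)

stft_vec = Psi @ X
direct = np.concatenate([np.fft.fft(s[i:i + M]) for i in range(0, N, M)])
print("combined matrix reproduces the STFT:", np.allclose(stft_vec, direct))

# Keep only a quarter of the STFT values as available measurements.
rng = np.random.default_rng(8)
avail = np.sort(rng.choice(N, size=N // 4, replace=False))
A = Psi[avail, :]                                     # partial combined transform matrix
y_stft = stft_vec[avail]
print("measurement model holds:", np.allclose(A @ X, y_stft))
```

The pair `(A, y_stft)` has the same structure as in the other transform domains, so the same reconstruction algorithms apply without modification.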

Experimental Evaluation
Example 1. Let us observe the QRS complex extracted from a real ECG signal. It is known that these types of signals are sparse in the Hermite transform (HT) domain [33,34]. As an illustration, we present the Hermite transform and the discrete Fourier transform (DFT) in Figure 1, from which it can be observed that the HT provides a sparse representation while the DFT is dense. The analyzed signal is obtained from the MIT-ECG Compression Test Database [35], originally sampled uniformly with the sampling period $\Delta t = 1/250$ s and with amplitude gain 1/400. The extracted QRS is of length $N = 51$, centered at the peak. Note that, in order to provide the sparsest HT representation, the scaling factor $\sigma$ should be set to the optimal value [33,34], which is $\sigma = 6.8038$ in this example, that is, $\sigma\Delta t = 0.027$ in seconds, proportional to the scaling factor 0.017 presented in [34] for a signal with 27 samples, when it is scaled with the signal lengths ratio. Also, the QRS signal is resampled at the zeros of the Hermite polynomial using sinc interpolation functions as described in [33].
The QRS signal is subject to compressed sensing approach, meaning that it is represented by 50% of available measurements. The available measurements and the corresponding HT (which is not sparse in the CS conditions) are shown in Figure 2.
In order to reconstruct the entire QRS signal from available measurements, two of the presented algorithms are employed, namely, the OMP and gradient-based reconstruction algorithm.
The reconstruction results are shown in Figure 3 in both the time and the transform domain. It can be observed that both algorithms provide quite successful reconstruction performance, which is better in the case of the gradient-based approach. In order to measure and compare the reconstruction performance of these two algorithms, the mean square error (MSE) is calculated, showing that the gradient-based algorithm in the considered case provides a lower MSE than OMP.
Example 2. In this example, the reconstruction of images using the adaptive gradient-based algorithm (Algorithm 6) is considered. Images are generally not strictly sparse in any domain and as such could be quite demanding for reconstruction. For that reason, as stated in Section 3.7, the adaptive gradient-based algorithm is employed, since it does not require the strict sparsity condition. As a suitable transform domain representation for natural images, the two-dimensional DCT (2D DCT) is used. We might say that the image can be observed as approximately sparse in the 2D DCT domain.
The image with missing pixels is divided into 16 × 16 blocks, where 40% of the pixels are missing in each block (missing pixels are denoted in white in Figures 4(a), 4(c), and 4(e)). Hence, the algorithm is applied on a block-by-block basis, where the measurement vector $\mathbf{y}$ is created for each observed image block using the available block pixels. The value of the gradient step $\Delta = 128$ is used. The resulting reconstructed images are shown in Figures 4(b), 4(d), and 4(f).
The quality of the reconstructed image is measured using the structural similarity index (SSIM) [36] between the original image (with the full set of pixels) and the reconstructed image. The SSIM ranges between 0 and 1, with 1 meaning identical images and 0 meaning no similarity. It is designed to improve on traditional metrics such as the peak signal-to-noise ratio (PSNR) in order to be more consistent with the perception of the human visual system. The achieved values of SSIM for the test images in Figure 4 are SSIM = 0.9, SSIM = 0.94, and SSIM = 0.95.
Example 3. Let us assume that a radar signal consists of stationary (sinusoidal) components belonging to the rigid body and nonstationary components belonging to the disturbances caused by rotating target parts and noise. The Fourier transform and the STFT of the considered radar signal type are given in Figure 5.
The total signal length is 4096 samples. The STFT is calculated for the window width $M = 64$ samples (the STFT is of size 64 × 64). The aim of this experiment is to extract the rigid body components, which are represented as a set of stationary sinusoids.
The STFT values of the original full signal are sorted along the columns, and then 75% of the largest values (possibly belonging to rotating components and noise) are set to zero, that is, discarded from the STFT plane. A small percentage of the smallest values along the sorted columns is also set to zero, with the aim of discarding small values belonging to noise. The remaining STFT values are unsorted and placed at their original positions in the STFT plane. These values represent a small number of available measurements in the STFT domain, and we are faced with the CS problem. In order to reconstruct the stationary rigid body components from the available measurements, the CS matrix is defined as the partial combined transform matrix with rows corresponding to the available samples in the STFT (according to the procedure described in Section 4.4). The STFT of the reconstructed stationary components is given in Figure 6(a), while the corresponding DFT is given in Figure 6(b).

Conclusion
The paper reviews the fundamental concepts of compressive sensing theory, comprising the main requirements, conditions, and common optimization problem formulations. Several algorithms for signal reconstruction from incomplete measurements are summarized, allowing an insight into the diversity of approaches, the related complexity, assumptions, and efficiency. As a special contribution, the paper presents some interesting compressive sensing formulations for different types of signals and sparsity domains, such as the Fourier transform domain, polynomial Fourier transform domain, Hermite transform, and time-frequency domain. Particularly, this part aims to reveal the possibilities and perspectives of using compressive sensing in different signal processing scenarios.