Noise Folding in Completely Perturbed Compressed Sensing

This paper presents a new generally perturbed compressed sensing (CS) model, $y = (A + E)(x + u) + e$, which simultaneously incorporates a general nonzero perturbation $E$ into the sensing matrix $A$ and a noise $u$ into the signal $x$, building on the standard CS model $y = Ax + e$. We call it the noise folding in completely perturbed CS model. Our construction whitens the proposed model and studies the restricted isometry property (RIP) and coherence of the whitened model under some conditions. Finally, we use orthogonal matching pursuit (OMP) in a numerical simulation, which shows that the model is feasible, although the recovered signal is not exact compared with the original signal because of the measurement noise $e$, the signal noise $u$, and the perturbation $E$.


Introduction
Compressed sensing (CS), proposed by Candes et al. [1] and Donoho [2], has become a hot topic and has attracted many researchers over the past years because it provides a technique for recovering a signal. It has been widely applied in areas such as radar systems [3], signal processing [4], and image processing [5]. These applications rely on the main function of the CS model, which is to recover the original signal, using algorithms such as convex relaxation [6, 7], greedy pursuit [7], and Bayesian algorithms [8, 9] to estimate the best approximation of the original signal.
The classic and basic CS model, in an unperturbed scenario, can be formulated as
$$y = Ax. \tag{1}$$
Here, $y \in \mathbb{R}^m$ is the measurement vector (observation value) and $A \in \mathbb{R}^{m \times n}$ is a full rank measurement matrix with $m \ll n$.
A signal $x \in \mathbb{R}^n$ is called $K$-sparse if no more than $K$ of its entries are nonzero.
In practice, however, the measurement vector $y$ in (1) is often contaminated by noise or error. More concretely, a noise term $e \in \mathbb{R}^m$, called an additive noise, is incorporated into $y = Ax$ to yield a partially perturbed model [21-23]:
$$y = Ax + e, \tag{2}$$
where the noise or error $e$ ($e \neq 0$) is uncorrelated with the signal $x$. There are two methods to model the noise $e$ in [24]. Here, the noise $e$ is randomly sampled from a Gaussian distribution. This model has been used in many areas [21-23] and has a mature theory. For example, a number of accurate algorithms for (2) have emerged, such as BP [1, 21], OMP [21], CoSaMP [16], and Bayesian algorithms [8, 9].
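Models (1) and (2) can be simulated in a few lines. The following is a minimal sketch in Python with NumPy, not taken from the paper: the sizes `m`, `n`, `K`, the noise level `sigma`, and the choice of i.i.d. Gaussian entries with variance 1/m for the sensing matrix are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, K = 64, 256, 5            # assumed sizes with m << n

# K-sparse signal: K nonzero entries at random positions.
x = np.zeros(n)
support = rng.choice(n, size=K, replace=False)
x[support] = rng.standard_normal(K)

# Assumed Gaussian sensing matrix with i.i.d. N(0, 1/m) entries.
A = rng.standard_normal((m, n)) / np.sqrt(m)

sigma = 0.01                    # assumed standard deviation of the additive noise
e = sigma * rng.standard_normal(m)

y_clean = A @ x                 # unperturbed model (1)
y_noisy = A @ x + e             # partially perturbed model (2)
```

The two measurement vectors differ only by the additive noise term, which is what makes model (2) the natural baseline for the perturbed models discussed next.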
In 2010, Herman and Strohmer [25] first incorporated a random nontrivial perturbation $E$ into the matrix $A$ in (2) to generate a general perturbation model [25-27]:
$$y = (A + E)x + e, \tag{3}$$
where $E \in \mathbb{R}^{m \times n}$ is called a general perturbation or a multiplicative noise. They studied the influence of $E$ on the signal $x$ and argued that this CS model must be considered [25-27].
Intuitively, the multiplicative noise $E$ is harder to analyze than the additive noise $e$ because $E$ is coupled to the signal $x$ through $Ex$.
As for (3), there are two different scenarios from different points of view [25-27]. First, from the user's point of view, the sensing process is paired with a recovery process that estimates $\hat{x}$ from $(\hat{y}, \hat{A}, \ldots)$. Thus, the useful measurement matrix is the perturbed matrix $\hat{A}$, not the original measurement matrix $A$. This system has been studied for signal recovery with BP in [16, 25] and with OMP in [26, 27].
Second, from the designer's point of view, the useful sensing matrix is $A$, not $\hat{A}$, and the observation value is $\hat{y}$. To the best of our knowledge, no work has focused on signal recovery in the context of the general perturbation $E$ except for [25-27].
In some practical scenarios, the signal itself is contaminated by noise, as happens in sub-Nyquist converters. Although introducing noise into the signal is significant, few papers have studied such signal noise. The exception is [24], which first added an unknown random noise $u \in \mathbb{R}^n$ to the signal $x$ in $y = Ax + e$ to produce the noise folding CS model [24]:
$$y = A(x + u) + e. \tag{6}$$
They analyzed the RIP and coherence of the equivalent system after whitening and showed that the differences in RIP and coherence between the original $A$ and the whitened matrix are small [24]. Based on [24-27], we propose a new CS model and study its related properties in Section 3.

Preliminaries
In this paper, we restrict our attention to the RIP and coherence of our new CS model. By convention, the sensing matrix $A$ and the perturbation $E$ are assumed to have independent and identically distributed (i.i.d.) Gaussian entries, since such matrices satisfy the RIP and coherence conditions [7, 24] with high probability.
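Definition 1 (see [24]). For (1) and (2), there is another equivalent statement of the RIP of $A$ in some special cases. For any index set $\Lambda \subset \{1, \ldots, n\}$ of size $K$, let $A_\Lambda$ denote the submatrix of $A$ consisting of the column vectors indexed by $\Lambda$. The matrix $A$ possesses the RIP with constants $0 < \alpha_\Lambda \le \beta_\Lambda$ if, for any index set $\Lambda \subset \{1, \ldots, n\}$ of size $K$ and any $z \in \mathbb{R}^K$,
$$\alpha_\Lambda \|z\|_2^2 \le \|A_\Lambda z\|_2^2 \le \beta_\Lambda \|z\|_2^2,$$
where $K$ is a positive integer.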
For (6), there is another form of the RIP for the matrix $A$, given by Lemma 2 [24], since the matrix $A$ is whitened.
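The index-set form of the RIP in Definition 1 suggests a simple numerical probe: the squared extreme singular values of a column submatrix $A_\Lambda$ bound $\|A_\Lambda z\|_2^2 / \|z\|_2^2$. The sketch below is an illustration, not a procedure from the paper; the function name `sampled_rip_constants` and its parameters are hypothetical. Because it samples only some index sets, it returns optimistic estimates, that is, an upper estimate of the lower constant and a lower estimate of the upper constant.

```python
import numpy as np

def sampled_rip_constants(A, K, trials=200, seed=0):
    """Estimate RIP-type constants from randomly sampled K-column submatrices.

    Checking all index sets is combinatorial, so only `trials` random sets are
    examined; the true constants can only be worse than the returned values.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    alpha, beta = np.inf, 0.0
    for _ in range(trials):
        idx = rng.choice(n, size=K, replace=False)
        s = np.linalg.svd(A[:, idx], compute_uv=False)  # sorted, largest first
        alpha = min(alpha, s[-1] ** 2)                  # smallest squared singular value
        beta = max(beta, s[0] ** 2)                     # largest squared singular value
    return alpha, beta
```

For example, calling `sampled_rip_constants(A, 5)` on a Gaussian matrix with variance-1/m entries would typically return values on either side of 1, with the gap widening as `K` grows.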
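The perturbation $E$ and the sensing matrix $A$ in (3) can be quantified as in [25-27]:
$$\frac{\|E\|_2}{\|A\|_2} \le \varepsilon_A, \qquad \frac{\|E\|_2^{(K)}}{\|A\|_2^{(K)}} \le \varepsilon_A^{(K)},$$
where $\|A\|_2$ denotes the spectral norm of a matrix $A$, $\|A\|_2^{(K)}$ denotes the largest spectral norm taken over all $K$-column submatrices of $A$, and $\sigma_{\max}^{(K)}(A)$ [25] denotes the largest nonzero singular value taken over all $K$-column submatrices of $A$. It is appropriate to assume that these relative perturbations are positive and much smaller than 1.
Lemma 3 (RIP for $\hat{A}$ [25]). For $K = 1, 2, \ldots$, given the RIC $\delta_K$ associated with the matrix $A$ in (3) and the relative perturbation $\varepsilon_A^{(K)}$ associated with $E$, fix the constant
$$\hat{\delta}_{K,\max} = (1 + \delta_K)\left(1 + \varepsilon_A^{(K)}\right)^2 - 1.$$
Then the RIC $\hat{\delta}_K \le \hat{\delta}_{K,\max}$ for the matrix $\hat{A} = A + E$ is the smallest nonnegative number such that
$$(1 - \hat{\delta}_K)\|x\|_2^2 \le \|\hat{A}x\|_2^2 \le (1 + \hat{\delta}_K)\|x\|_2^2$$
for any $K$-sparse vector $x$.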
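To get a feel for Lemma 3, the bound on the perturbed RIC can be evaluated directly. The sketch below assumes the form of the bound given by Herman and Strohmer [25], namely $(1 + \delta_K)(1 + \varepsilon_A^{(K)})^2 - 1$; the function name `perturbed_ric_bound` is hypothetical.

```python
def perturbed_ric_bound(delta_k, eps_k):
    """Upper bound on the RIC of A + E, assuming the bound of [25]:
    (1 + delta_K) * (1 + eps_A^(K))**2 - 1.
    """
    return (1.0 + delta_k) * (1.0 + eps_k) ** 2 - 1.0

# With delta_K = 0.1, a 5% relative perturbation roughly doubles the bound.
print(perturbed_ric_bound(0.1, 0.05))
```

With no perturbation (`eps_k = 0`) the bound reduces to `delta_k`, so the formula measures how much RIC headroom the perturbation consumes.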
Definition 4 (see [7]). The coherence $\mu(A)$ of a matrix $A$ is the largest absolute inner product between any two distinct columns $a_i$, $a_j$, $i \neq j$, of $A$.
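Definition 4 translates directly into code. The sketch below makes one assumption that should be checked against [7]: columns are normalised to unit Euclidean norm before the inner products are taken, so that the result lies in [0, 1]. The function name `coherence` is hypothetical.

```python
import numpy as np

def coherence(A):
    """Largest absolute inner product between distinct columns of A.

    Assumption: columns are first scaled to unit norm.
    """
    An = A / np.linalg.norm(A, axis=0)   # unit-norm columns
    G = np.abs(An.T @ An)                # absolute Gram matrix
    np.fill_diagonal(G, 0.0)             # ignore each column paired with itself
    return float(G.max())
```

An orthonormal set of columns has coherence 0, while two columns at 45 degrees to each other give $1/\sqrt{2}$.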

A New Completely Perturbed CS Model.
As mentioned above, only one noise in (2), or two noises in (3) and (6), affects the CS model. In practice, a noise $e$, a noise $u$, and a perturbation $E$ may affect the CS model simultaneously, although no paper has studied this case. Combining the ideas of [24] and [25-27], we propose the model
$$y = (A + E)(x + u) + e, \tag{15}$$
where $e \in \mathbb{R}^m$ is a random noise vector with covariance $\sigma^2 I$ and $u \in \mathbb{R}^n$ is a random premeasurement noise vector with covariance $\sigma_0^2 I$, independent of $e$. Here $e$ and $u$ are regarded as additive noise. $E \in \mathbb{R}^{m \times n}$ is a random perturbation matrix; more details on the perturbation $E$ can be found in [25]. We call CS model (15) the noise folding in completely perturbed CS model.
Analogous to (3) in [25-27], (15) can also be considered in two different situations. From the user's point of view, an incorrect sensing matrix is obtained via an unknown measurement model, together with a corresponding recovery algorithm. The only difference from the user's-view system for (3) is the noise $u$. From the designer's point of view, the sensing process can be formulated as
$$\hat{y} = \hat{A}(x + u) + e, \qquad \hat{A} = A + E, \tag{17}$$
together with a corresponding recovery process. Again, compared with the designer's-view system for (3), the only addition is the noise $u$. In this paper, we study only the RIP and coherence of the model after whitening.
Model (15) can be extended to a general multiperturbation CS model (18), in which each $E_i$ is a perturbation. System (18) can be viewed as a generalization of our proposed CS model (15), which implies that the general conclusion for (18) can be obtained from the special conclusion for (15). The concrete results are given in the next section. Other general CS systems can also be conjectured naturally. Although they may have many properties, we do not know how to exploit and analyze them, and we leave them as open problems. Here we mainly study the RIP and coherence of (15) and (18). In the next section, we give general results.

Problem Formulation.
For (15), our goal is to analyze the effect of the premeasurement noise $u$ and the perturbation $E$ on its RIP and coherence.
Throughout this paper, assume that $e$ is a random noise vector with covariance $\sigma^2 I$ and that $u$ is a random noise vector with covariance $\sigma_0^2 I$, independent of $e$. Under these assumptions, (15) will be shown to be equivalent to
$$y = Bx + v, \tag{20}$$
where $B$ is a matrix whose coherence and RIP constants are very close to those of $\hat{A}$, $v$ is whitened noise with variance $\sigma^2 + (n/m)\sigma_0^2$, and $I$ is the identity matrix.
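The whitening step behind this equivalence can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's own construction: it takes the effective noise of (15) to be $(A + E)u + e$ with covariance $Q = \sigma^2 I + \sigma_0^2 (A+E)(A+E)^T$, normalises by $\sigma^2 + (n/m)\sigma_0^2$, and multiplies by the inverse square root. The names `Q`, `Q1`, `W`, `gamma`, the helper `inv_sqrt_spd`, and all numerical values are assumptions.

```python
import numpy as np

def inv_sqrt_spd(Q):
    """Inverse square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(Q)
    return (V / np.sqrt(w)) @ V.T        # V diag(w^{-1/2}) V^T

rng = np.random.default_rng(1)
m, n = 32, 128
sigma, sigma0, eps = 0.05, 0.02, 0.01    # assumed noise and perturbation levels

A = rng.standard_normal((m, n)) / np.sqrt(m)
E = eps * rng.standard_normal((m, n)) / np.sqrt(m)   # assumed Gaussian perturbation
A_hat = A + E

gamma = sigma**2 + (n / m) * sigma0**2   # target variance of the whitened noise
Q = sigma**2 * np.eye(m) + sigma0**2 * (A_hat @ A_hat.T)  # covariance of A_hat u + e
Q1 = Q / gamma
W = inv_sqrt_spd(Q1)

B = W @ A_hat                            # whitened measurement matrix
cov_v = W @ Q @ W.T                      # covariance of v = W (A_hat u + e)
print(np.allclose(cov_v, gamma * np.eye(m)))
```

Under these assumptions the whitened noise covariance equals $(\sigma^2 + (n/m)\sigma_0^2) I$ by construction, so the interesting question is how far `B` moves away from `A_hat`, which is what the RIP and coherence results address.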

RIP and Coherence of Our CS Model.
We will show that the conclusion holds generally. In other words, even if $(A + E)(A + E)^T$ is not proportional to the identity matrix $I$, (15) and (20) are roughly equivalent. We now describe this in detail.
Note that if $E$ is a random matrix, then $\hat{A} = A + E$ is a random matrix. To study the RIP and coherence of $\hat{A}$, we whiten the effective noise $\hat{A}u + e$, whose covariance is $Q = \sigma^2 I + \sigma_0^2 \hat{A}\hat{A}^T$, by multiplying by $Q_1^{-1/2}$ with $Q_1 = Q/(\sigma^2 + (n/m)\sigma_0^2)$, and obtain the equivalent system
$$y = Bx + v, \qquad B = Q_1^{-1/2}\hat{A}.$$
Note that the noise vector $v$ is whitened, with covariance matrix $(\sigma^2 + (n/m)\sigma_0^2) I$ exactly, if $(A + E)(A + E)^T$ is proportional to the identity matrix. The biggest difference is that whitening changes the measurement matrix from the original matrix $\hat{A}$ to $B$. The size of this change is measured with three important indexes: the RIP constant, the coherence, and $\eta$.
Our theory mainly depends on approximating $(A + E)(A + E)^T$ by $(n/m) I$, even when $\hat{A}$ is an arbitrary matrix. Let
$$\eta = \left\| I - \frac{m}{n}(A + E)(A + E)^T \right\|$$
measure the accuracy of this approximation, where $\|\cdot\|$ denotes the standard operator norm in $\mathbb{R}^m$. For convenience of derivation, assume that $\eta$ is very small; we then show that the coherence and RIP constants of $B$ are very close to those of $\hat{A}$. By convention, the entries of $A$ are i.i.d. Gaussian random variables with mean zero and variance $1/m$; thus, it is easy to justify that $\eta$ is always small. Another useful quantity is
$$\eta_0 = \left\| I - \frac{m}{n} A A^T \right\|,$$
which was introduced in [24]. The fact that $\eta_0$ is very small was proved in [24] with restrictions on the matrix $A$ only. It is natural to ask whether the difference between $\eta$ and $\eta_0$ is very small. Theorem 5 confirms this conjecture and further prompts the question of whether the coherence and RIP of $B$ differ only slightly from those of $\hat{A}$ and $A$, respectively. The later theorems give positive answers.
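The two approximation measures can be compared numerically. The sketch below is an illustration, not the statement of Theorem 5: it assumes the operator-norm forms $\eta = \|I - (m/n)(A+E)(A+E)^T\|$ and $\eta_0 = \|I - (m/n)AA^T\|$, and the simple perturbation $E = \varepsilon A$ from [25]. In that special case the triangle inequality gives $|\eta - \eta_0| \le (m/n)(2\varepsilon + \varepsilon^2)\sigma_1^2$, where $\sigma_1$ is the largest singular value of $A$; this bound is derived here for the sketch and may differ from the paper's. The sizes and the helper name `eta_of` are assumptions.

```python
import numpy as np

def eta_of(M):
    """Operator-norm distance between (m/n) M M^T and the identity."""
    m_, n_ = M.shape
    return np.linalg.norm(np.eye(m_) - (m_ / n_) * (M @ M.T), 2)

rng = np.random.default_rng(2)
m, n, eps = 20, 2000, 0.01               # small m/n keeps both measures small

A = rng.standard_normal((m, n)) / np.sqrt(m)
A_hat = (1.0 + eps) * A                  # assumed simple perturbation E = eps * A

eta0 = eta_of(A)
eta = eta_of(A_hat)
sigma1 = np.linalg.norm(A, 2)            # largest singular value of A
bound = (m / n) * (2 * eps + eps**2) * sigma1**2

print(eta0, eta, abs(eta - eta0), bound)
```

The printed difference stays below the bound, which shrinks linearly with the perturbation size, in line with the claim that the perturbation changes the approximation quality only slightly.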
Theorem 5. Assume that sensing matrix $A \in \mathbb{R}^{m \times n}$, an unknown random matrix ... where $\sigma_1$ is the largest nonzero singular value of $A$; then ...
Proof. The detailed proof is postponed to the Appendix.
Theorem 7 shows the RIP of $B$ in the case of $\eta < 1/2$, though $0 < \eta < 1$ is sufficient for the proof of the RIC for $B$.

Theorem 7. Assume that sensing matrix $A \in \mathbb{R}^{m \times n}$, an unknown random matrix ..., suppose that $\hat{A}$ satisfies the RIP of order $K$ with $0 < \alpha_\Lambda \le \beta_\Lambda$, and $\sigma_1 > 0$ is the largest singular value of the matrix $A$; then $B$ satisfies the RIP of order $K$ with different constants below: ...
Proof. The detailed proof is postponed to the Appendix.
Remark 8. In Theorems 5 and 7, the condition $\|E\|/\|A\| \le \varepsilon_A$ with $0 < \varepsilon_A \ll 1$ can be used in place of $E = \varepsilon A$ [25], in which $E$ is a simple version of the perturbation, to obtain another result. Because of space limitations, these results are omitted here; their proofs are straightforward.
Multiperturbation CS system (18) can be viewed as a generalization of the newly proposed CS system (15), so that the general conclusion for (18) follows from that for (15). ...
Proof. The detailed proof is postponed to the Appendix.
Remark 12. Though $0 < \eta < 1$ is sufficient for the proof, the RIC for $B$ is positive under the restriction $\eta < 1/2$.
Next, we compare the coherence of $B$, obtained by whitening $\hat{A}$, with that of $\hat{A}$. Let $a_i$, $i = 1, \ldots, n$, denote the $i$th column vector of a matrix $A$. Similar to the coherence of $A$, the coherence of $\hat{A}$ is first given in Definition 13.
Definition 13. Assume that $A \in \mathbb{R}^{m \times n}$ is a random matrix, $E \in \mathbb{R}^{m \times n}$ is an unknown random matrix in CS, and $\hat{A} = A + E$; then the coherence of $\hat{A}$, denoted by $\mu(\hat{A})$, is the largest absolute inner product between any two columns $\hat{a}_i$, $\hat{a}_j$, $i \neq j$.
Theorem 14. ... Here $b_i$ denotes the $i$th column vector of the whitened matrix $B$, and $\hat{a}_i$ denotes the $i$th column vector of $\hat{A}$; that is, $\hat{a}_i$ is the sum of the $i$th columns of $A$ and $E$.
Proof. The detailed proof is postponed to the Appendix.
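The comparison between the coherence of the perturbed matrix and that of its whitened version can be explored numerically. The sketch below is illustrative only and rests on several assumptions: coherence is computed on unit-normalised columns, the perturbation is the simple choice $E = \varepsilon A$, and whitening uses the covariance $\sigma^2 I + \sigma_0^2 \hat{A}\hat{A}^T$ normalised by $\sigma^2 + (n/m)\sigma_0^2$. All names and values are hypothetical.

```python
import numpy as np

def coherence(M):
    """Largest absolute inner product between distinct unit-normalised columns."""
    Mn = M / np.linalg.norm(M, axis=0)
    G = np.abs(Mn.T @ Mn)
    np.fill_diagonal(G, 0.0)
    return float(G.max())

rng = np.random.default_rng(4)
m, n = 32, 128
sigma, sigma0, eps = 0.05, 0.02, 0.01    # assumed noise and perturbation levels

A = rng.standard_normal((m, n)) / np.sqrt(m)
A_hat = (1.0 + eps) * A                  # assumed perturbation E = eps * A

gamma = sigma**2 + (n / m) * sigma0**2
Q1 = (sigma**2 * np.eye(m) + sigma0**2 * (A_hat @ A_hat.T)) / gamma
w, V = np.linalg.eigh(Q1)
B = (V / np.sqrt(w)) @ V.T @ A_hat       # B = Q1^{-1/2} A_hat

mu_hat, mu_B = coherence(A_hat), coherence(B)
print(mu_hat, mu_B, abs(mu_hat - mu_B))
```

Rerunning the sketch with a smaller ratio m/n or smaller `sigma0` is a quick way to see how the gap between the two coherence values behaves.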
In [25], $E$ is a simple version of a random perturbation, such as $E = \varepsilon A$ with $0 < \varepsilon \ll 1$. The relation between $\mu(\hat{A})$ and $\mu(A)$ is given by Theorem 15 in the case of $E = \varepsilon A$, $0 < \varepsilon \ll 1$.
Proof. The proof of Theorem 15 is similar to that of Theorem 14 and is omitted here.

Numerical Experiment
In Figure 1, the vertical axis (sinusoidal) shows the signal value and the horizontal axis shows time in seconds. The black line is the original signal and the red line is the recovered signal.
Here we use OMP to give three numerical simulation results, which demonstrate that our proposed generally perturbed CS model is feasible. Comparing signal recovery with OMP across the three figures, recovery from the measurement noise model $y = Ax + e$ is almost exact because only the noise $e$ is present in the basic CS model $y = Ax + e$. There are large differences between the recovered signal and the original signal in both the $y = (A + E)(x + u) + e$ and $y = A(x + u) + e$ CS models because of the noises $e$, $u$, and $E$ in these two models. Comparing the change between the recovered signal and the original signal in Figure 1(c) with that in Figure 1(b), the change in Figure 1(c) differs slightly because the perturbation $E$ is involved in Figure 1(c) but not in Figure 1(b), which shows that the different noises $e$, $u$, and $E$ have different impacts on signal recovery. Compared with the change (error) between the recovered signal and the original signal in Figure 1(a), the changes (errors) in Figures 1(b) and 1(c) differ little from each other. Namely, the differences between the recovered signal and the original signal from $y = (A + E)(x + u) + e$ are almost the same as those from $y = A(x + u) + e$, which indicates that our proposed CS model is feasible.
Compared with the change between the recovered signal and the original signal in Figure 1(a), the changes in Figures 1(b) and 1(c) are quite different. This shows that OMP is not the best algorithm for recovering $x$ from $y = (A + E)(x + u) + e$ and $y = A(x + u) + e$, although OMP recovers the original signal from $y = Ax + e$ almost exactly. Thus, it is important to search for a more powerful algorithm, or several algorithms, to recover the original sparse signal exactly from $y = A(x + u) + e$ and $y = (A + E)(x + u) + e$. We leave this as an open problem for interested researchers, since searching for optimal recovery algorithms is beyond the focus of this paper.
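An experiment of this kind can be sketched as follows. This is not the authors' simulation: the OMP implementation is a textbook version, the sizes and noise levels are assumptions, the perturbation is taken as $E = \varepsilon A$, the intermediate model is taken to be the noise folding model (6), and recovery is run with the unperturbed matrix `A`. The resulting errors will therefore not match the values reported for Figure 1.

```python
import numpy as np

def omp(A, y, K):
    """Orthogonal matching pursuit: pick K columns greedily, refit by least squares."""
    residual = y.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(K):
        corr = np.abs(A.T @ residual)
        if support:
            corr[support] = 0.0                      # never reselect a chosen column
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coef
    return x_hat

rng = np.random.default_rng(3)
m, n, K = 64, 256, 5
sigma, sigma0, eps = 0.01, 0.05, 0.05                # assumed noise and perturbation levels

x = np.zeros(n)
x[rng.choice(n, size=K, replace=False)] = rng.choice([-1.0, 1.0], size=K)
A = rng.standard_normal((m, n)) / np.sqrt(m)
E = eps * A                                          # assumed simple perturbation
u = sigma0 * rng.standard_normal(n)                  # signal noise
e = sigma * rng.standard_normal(m)                   # measurement noise

models = {
    "y = Ax + e": A @ x + e,
    "y = A(x + u) + e": A @ (x + u) + e,
    "y = (A + E)(x + u) + e": (A + E) @ (x + u) + e,
}
errors = {name: float(np.linalg.norm(omp(A, y, K) - x)) for name, y in models.items()}
for name, err in errors.items():
    print(f"{name}: Euclidean recovery error = {err:.4f}")
```

Swapping `omp` for another solver while keeping the three measurement vectors fixed is one way to probe the open problem of finding algorithms better suited to the perturbed models.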

Conclusion
We first propose a new CS system (15) by introducing a multiplicative noise $E$, a signal noise $u$, and an additive noise $e$ into the unperturbed CS model (1). We derive the RIP and coherence for $\hat{A} = A + E$ after whitening (15). In fact, this paper shows that our proposed completely perturbed CS model (15) is equivalent to the classic CS model (2). The only difference is the changed measurement matrix, obtained by incorporating a nontrivial perturbation matrix $E$ into the measurement matrix and a nontrivial noise $u$ into the signal $x$. This increases the noise variance by a factor of $n/m$, so that a tighter upper bound and lower bound on the RIC are produced. As for the coherence of the deformed measurement matrix $\hat{A} = A + E$ in CS model (15), the constant is nearly invariant as $m/n \to 0$ and $m, n \to \infty$. Finally, we use OMP to give three figures that recover the signal from the CS models $y = Ax + e$, $y = A(x + u) + e$, and $y = (A + E)(x + u) + e$, respectively. Figures 1(b) and 1(c) in our experiment demonstrate that the change between the recovered signal and the original signal is much bigger than that in Figure 1(a), which indicates that our proposed CS model is feasible and that OMP is not well suited to recovering the signal from $y = A(x + u) + e$ and $y = (A + E)(x + u) + e$. Thus, we can try to find one or more better algorithms to recover the signal exactly from these two CS models, although OMP recovers the original signal from $y = Ax + e$ almost exactly.

Future Work
Given the features of our proposed CS model (15), much work remains to be done. The change between the recovered signal and the original signal in Figures 1(a), 1(b), and 1(c) indicates that the CS model proposed in this paper is feasible, although with OMP the differences between the recovered signal and the original signal in Figures 1(b) and 1(c) are much bigger than those in Figure 1(a). Thus, an obvious problem is to find one or more algorithms suitable for $y = A(x + u) + e$ and $y = (A + E)(x + u) + e$ that recover the signal exactly.
The related RIP results in [24] further motivate us to think that $E$, as a perturbation of the sensing matrix, could form a perturbed CS model of its own. Thus, (15) may be viewed as consisting of two similar systems, and it may likewise be divided into two other models or into three basic parts. If so, what can we do to reduce or eliminate the influence of the error system among these parts? Can we recover the signal $x$ from such an error system, and if so, how? There may also exist CS models without the measurement noise $e$, such as $y = (A + E)(x + u)$. In addition, we could consider impulse noise by adding an impulse noise term to $e$. This might generalize our model and give good results for the impulse noise model. We do not study the impulse noise model here and leave it as an open problem, too.
These open problems are worth considering and are left for future work. This paper carries out only elementary research on the proposed CS model, and we hope that its idea and simple analysis will help in studying wide applications of the model in the future. We also hope that higher-level compressed sensing models will be put forward and that more researchers will explore this area.

Appendix
... The last equation holds because $\|A\|_2 = \sigma_1$ and $\|A\|_2 = \|A^T\|_2$. Combining (A.1) with (A.2) yields (32).
Proof of Theorem 7. The three inequalities come from different proving processes, but in essence they are the same. Here we prove only the first inequality in detail; the proofs of the second and third inequalities are similar. For convenience, we denote them by Cases 1, 2, and 3, respectively. The proofs depend on the fact that $Q_1 = Q/(\sigma^2 + (n/m)\sigma_0^2)$ is close to $I$, by the definition of $\eta$. Suppose that ... Assume that (15) can be written as $y = Bx + v$, where ... (A.5). The last equation holds because ... holds when $n \to \infty$; therefore, (A.5) $\to (n/m)\sigma_0^2 / (\sigma^2 + (n/m)\sigma_0^2) < 1$.

Figure 1: (a) Recovery of $x$ from $y = Ax + e$. The Euclidean-norm error between the recovered and original signal values is $6.4835 \times 10^{-4}$, which indicates that the recovered signal and the original signal are almost the same. (b) Recovery of $x$ from $y = A(x + u) + e$. The Euclidean-norm error between the recovered and original signal values is 6.2258, which means that the recovered signal and the original signal are quite different. (c) Recovery of $x$ from $y = (A + E)(x + u) + e$. The Euclidean-norm error between the recovered and original signal values is 5.8855, which indicates that the recovered signal is very different from the original signal. Compared with the change (error) between the recovered signal and the original signal in (a), the changes (errors) in (b) and (c) differ little from each other. Moreover, comparing (c) with (b), the change in (c) differs slightly because $E$ is involved in (c) but not in (b).
