A Novel Detection Scheme with Multiple Observations for Sparse Signal Based on Likelihood Ratio Test with Sparse Estimation

Recently, the problem of detecting unknown and arbitrary sparse signals has attracted much attention from researchers in various fields. However, many difficulties and challenges remain, because the key information is contained in only a small fraction of the signal and no prior information is available. In this paper, we consider a more general and practical scenario of multiple observations with no prior information except for the sparsity of the signal. A new detection scheme referred to as the likelihood ratio test with sparse estimation (LRT-SE) is presented. Under the Neyman-Pearson testing framework, LRT-SE estimates the unknown signal by employing the $\ell_1$-minimization technique from compressive sensing theory. The detection performance of LRT-SE is preliminarily analyzed in terms of error probabilities in finite dimension and Chernoff consistency in the high dimensional condition. The error exponent is introduced to describe the decay rate of the error probability as the number of observations grows. Finally, these properties of LRT-SE are demonstrated by experimental results on synthetic sparse signals and sparse signals from real satellite telemetry data. The results indicate that the proposed detection scheme performs very close to the optimal detector.


Introduction
Nowadays, with the boom in networks and information, there is a massive influx of data in every walk of life, including military detection, industrial monitoring, the IT industry, the entertainment industry, and many other fields. These big data of high-volume, high-velocity, and high-variety information demand innovative forms of information processing for enhanced insight, decision-making, and signal detection [1]. However, it is increasingly difficult to detect and extract the truly useful portion from massive data.
Plenty of state-of-the-art methods have been put forward to detect targets and signals in big data, for example, utilizing artificial neural networks or deep neural networks for fault diagnosis with massive data [2,3]. As research on signal sparsity develops further, employing sparse representation to characterize massive data has also become a cutting-edge method, because these data are usually full of redundant information. A signal can be regarded as sparse if the vector itself, or its coefficient vector under a certain basis, contains only a few nonzero components [4].
In this paper, the sparse signal detection problem, which is to determine whether a sparse signal exists in a small fraction of the background noise, is discussed. This problem has attracted growing attention because sparse signals are usually closely related to unpredictable changes, abnormality, and danger, for example, in fault detection and diagnosis of satellite navigation, network anomaly detection, industrial process supervision, and prediction of natural disasters. It is of great significance to avoid the serious consequences of missing these hard-to-discover early warnings.
However, distinguishing and separating these sparse signals from noise poses many challenges, because useful information takes up only a small part of the entries in the signal. In addition, prior information is almost unavailable due to the diversity and indeterminism of the abnormal sparse signals.
Mathematical Problems in Engineering
It is common to formulate the detection problem as a composite hypothesis test. A classical approach to detecting signals with unknown parameters is the generalized likelihood ratio test (GLRT) [5]. The basic idea of GLRT is to replace the unknown parameters with their maximum likelihood (ML) estimates and then apply the likelihood ratio test (LRT). The test has the form
$$\Lambda_g(\mathbf{x}) = \frac{\max_{\boldsymbol{\theta}_1} p(\mathbf{x} \mid \boldsymbol{\theta}_1)}{\max_{\boldsymbol{\theta}_0} p(\mathbf{x} \mid \boldsymbol{\theta}_0)}, \tag{1}$$
where $\mathbf{x}$ is the observation vector and $\boldsymbol{\theta}_1$ and $\boldsymbol{\theta}_0$ are the parameter vectors to be determined. Frequently, we work with the logarithm $\ln p(\mathbf{x} \mid \boldsymbol{\theta}_i)$. If the maximum of the log likelihood function is interior to the range of $\boldsymbol{\theta}_i$ and $\ln p(\mathbf{x} \mid \boldsymbol{\theta}_i)$ has a continuous first derivative, then a necessary condition on the ML estimate is obtained by differentiating $\ln p(\mathbf{x} \mid \boldsymbol{\theta}_i)$ with respect to $\boldsymbol{\theta}_i$ and setting the result to zero. Therefore, it is fairly important to divide the parameter space into two partitions, which has been proved to be a crucial factor of the asymptotic optimality of GLRT [6]. Unfortunately, the problem in this paper can hardly meet the necessary condition of GLRT. In addition, Hartigan [7] and Bickel [8] proved that the usual GLRT has nonstandard behavior within the detection boundary of LRT, as the maximized ratio tends to $\infty$ under $H_0$. Moreover, computing GLRT is time-consuming and almost infeasible when the signal is of high dimension.
Inspired by previous studies, a novel scheme for detecting unknown and arbitrary signals against noise, named LRT-SE, is proposed in this paper. LRT-SE integrates the hypothesis test under the Neyman-Pearson framework with the $\ell_1$-minimization technique from compressive sensing (CS) theory. The new method differs notably from GLRT in that it estimates the unknown sparse signal using $\ell_1$-minimization, which takes full advantage of the signal's sparsity. Preliminary analyses are provided to characterize the performance of LRT-SE in both finite and high dimension. The experimental results on both synthetic sparse signals and sparse satellite telemetry data show that near-optimal performance is achievable.
The rest of this paper is organized as follows. Section 2 provides a review of some related work. Section 3 introduces the background on the Neyman-Pearson test and CS theory. In Section 4, the application scenario and the formulation of the problem are first introduced. The scheme of LRT-SE is then presented with the corresponding flow chart. In addition, theoretical analyses are discussed, preliminarily characterizing the properties of LRT-SE. These properties are verified in Section 5 by experiments on synthetic signals and satellite telemetry data. Section 6 discusses the results in detail. Finally, Section 7 concludes the paper and points out future research directions.

Related Work
Detection of an unknown sparse signal mixed with noise is a long-standing composite hypothesis testing problem that arises in many scientific applications such as signal processing [9], wireless sensor networks [10], and remote sensing systems. A variety of methods and schemes have been developed to solve this universal yet intricate problem.
Apart from the classical GLRT method, most methods so far rely on assumptions about prior information on the signals of interest to alleviate the difficulty. On this premise, some researchers turned to seeking test statistics that exploit the signal's sparsity by adopting the recently prevalent compressive sensing (CS) framework. For instance, Duarte et al. estimated the relevant sufficient statistics for the test by directly extracting them from a small number of random projections without ever reconstructing the signal. They showed that the CS framework is information scalable to a much wider range of statistical inference tasks [11]. In the same year, Davenport et al. further demonstrated how to solve a variety of signal detection problems, including sparse signal detection and estimation problems, directly from the measurements without ever reconstructing the signals themselves [12]. Another paper by Haupt and Nowak proposed a detector that collects a set of universal samples obtained without prior knowledge of the signal structure and examined the performance of CS for the problem of signal detection. However, some auxiliary channels are required to provide the so-called "future knowledge" [13]. Later, in 2008, Wang et al. introduced a set of detectors called subspace compressive detectors, aimed at the problem that compressive measurements are not efficient at gathering signal energy [14]. In a recent article, Wimalajeewa and Varshney discussed the problem of sparse signal detection based on partial support set estimation with compressive measurements in a distributed network. The basic idea of this article resembles the work of [11], but it studies in depth how to determine the minimum fraction of the support to be estimated so that the detection performs optimally [15].
Nonetheless, the aforementioned schemes are less universally applicable because they require, to a greater or lesser extent, prior knowledge about the sparse signals: some schemes exploit greedy algorithms, and others rely on auxiliary conditions to improve performance.
On the other hand, the detection boundary in terms of sparsity and signal strength is another research priority. The sum of the two types of error probabilities tends to zero or one depending on whether the mean of the sparse signal, denoted by $\mu$, exceeds this detection boundary. The sparsity exponent $\beta$ describes the fraction of nonzero components, $n^{-\beta}$, where $n$ is the signal dimension, while the smallest detectable signal strength $r$ is defined as a function of the sparsity exponent, denoted by
$$\rho^*(\beta) = \begin{cases} \beta - \tfrac{1}{2}, & \tfrac{1}{2} < \beta \le \tfrac{3}{4}, \\ \left(1 - \sqrt{1 - \beta}\right)^2, & \tfrac{3}{4} < \beta < 1. \end{cases}$$
In [16], with the mean of the signal calibrated as $\mu = \sqrt{2 r \log n}$, it is shown that the likelihood ratio test attains asymptotically full power when $r > \rho^*(\beta)$, while conversely, if $r < \rho^*(\beta)$, the two hypotheses are asymptotically indistinguishable, as the total error probability converges to one. Based on this conclusion, Donoho and Jin demonstrated that this boundary still holds for the Higher Criticism, which is also called second-level significance testing [17]. Afterward, the authors of [16] extended the detection boundary phenomenon to high dimensional linear regression and further established the boundaries under the condition that the noise variance is unknown [18]. Moreover, [19] considered the different detection boundaries among the analysis of variance (ANOVA), the Max test, and the Higher Criticism when detecting sparse alternatives. It proved that ANOVA obtains optimal performance under moderate levels of sparsity, that is, $\beta \in [0, 1/2]$, requiring $\mu$ to grow as a power of $n$. By contrast, reliable detection with the Max test is possible only if the sparsity is very strong, namely, $\beta \in [3/4, 1]$, requiring $\mu$ to be on the order of $\sqrt{\log n}$, while the detection threshold for Higher Criticism remains unchanged. Inspired by these papers, the authors of [20] discussed the detection problem under a general sparse mixture model and derived an explicit expression for the detection boundary under mild regularity conditions. To sum up, these papers address the question of when the sparse signal is detectable.
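To make the detection boundary concrete, the following minimal sketch evaluates the boundary function of the classical sparse normal mixture from the Higher Criticism literature [17]. It assumes that the fraction of nonzero components is $n^{-\beta}$ with $1/2 < \beta < 1$ and that the nonzero mean is calibrated as $\mu = \sqrt{2 r \log n}$. The function names `detection_boundary` and `is_detectable` are illustrative and do not come from the cited works.

```python
import math


def detection_boundary(beta: float) -> float:
    """Boundary rho*(beta) of the sparse normal mixture, for 1/2 < beta < 1.

    Assumption: nonzero fraction n**(-beta), nonzero mean sqrt(2 * r * log(n)).
    """
    if not 0.5 < beta < 1.0:
        raise ValueError("beta must lie in (1/2, 1)")
    if beta <= 0.75:
        return beta - 0.5
    return (1.0 - math.sqrt(1.0 - beta)) ** 2


def is_detectable(r: float, beta: float) -> bool:
    """True when the strength exponent r lies strictly above the boundary."""
    return r > detection_boundary(beta)


# Tabulate the boundary for a few sparsity exponents.
for beta in (0.6, 0.75, 0.9):
    print(f"beta={beta}: rho*={detection_boundary(beta):.4f}")
```

The two branches agree at $\beta = 3/4$, where both give $1/4$, so the boundary is continuous; sparser signals (larger $\beta$) require a larger strength exponent $r$ to be detectable.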
In light of these prior works and respecting the detection boundary, we aim to put forward a more efficient and practical detection scheme for the unknown sparse signal mixture model. Detailed theoretical analyses of the properties of the new scheme are also introduced in this paper. Finally, we present extensive experimental results obtained by detecting synthetic sparse signals and telemetry data.

Theoretical Background
The detection problem discussed in this paper mainly employs the hypothesis test as its framework. The null hypothesis is assumed to be white Gaussian noise, while the alternative hypothesis is an unknown sparse signal with no prior information except for its sparsity. As is well known, under the likelihood ratio test framework, the Neyman-Pearson (N-P) criterion achieves the largest detection probability for a given constraint on the false alarm probability. Furthermore, the sparse estimation technique from compressive sensing is incorporated into the framework to estimate the unknown parameter of the likelihood function.

Neyman-Pearson Test.
Statistical hypothesis testing is a crucial method for detecting and classifying signals. Based on statistical theory, there exist a variety of decision criteria for hypothesis testing, for example, the Bayes criterion, the minimum total error probability criterion, the Neyman-Pearson criterion, and the maximin criterion. These criteria have been widely applied to radar signal detection, multisensor nondestructive testing, robot control, medical diagnosis, and other areas.
Instead of calculating the risk as the Bayes criterion requires, Neyman and Pearson [21] sought a criterion that makes $P_F$ (false alarm probability) as small as possible while making $P_D$ (detection probability) as large as possible. Afterward, the corresponding lemma, which identifies the most powerful test, was put forward and is referred to as the Neyman-Pearson lemma. The test based on this lemma matches the situation in this paper, in which no prior information is available and the hypothesis is composite. In addition, the lemma tells us exactly how to find the appropriate threshold separating the acceptance and rejection regions of the test [22].
Assume that $\mathbf{x}$ is a vector of observations from the observation space with distribution $p_{\theta}(\mathbf{x})$. Constrain $P_F = \alpha' \le \alpha$; then the likelihood ratio test
$$\Lambda(\mathbf{x}) = \frac{p(\mathbf{x} \mid H_1)}{p(\mathbf{x} \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} \eta \tag{2}$$
is proved to be the most powerful test, where the threshold $\eta$ is decided by
$$P_F = \int_{\eta}^{\infty} p(\Lambda \mid H_0)\, d\Lambda = \alpha'. \tag{3}$$
The receiver operating characteristic (ROC) curve, which plots $P_D$ versus $P_F$, is commonly adopted to describe the performance of the test as a function of the parameter of interest. The larger the area under the ROC curve, the better the performance of the test.
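As an illustration of how the N-P threshold is fixed by the false alarm constraint, the following sketch treats the simplest Gaussian instance: a known signal $\mathbf{s}$ observed in white Gaussian noise of standard deviation $\sigma$, for which the likelihood ratio test reduces to thresholding $T = \mathbf{s}^T\mathbf{x}$. The known-signal assumption and the function names are ours, chosen only to show the mechanics; this is not the LRT-SE detector itself.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: provides cdf and inv_cdf


def np_threshold(signal, sigma, alpha):
    """Threshold on T = s^T x giving P_F = alpha under H0: x ~ N(0, sigma^2 I)."""
    norm_s = math.sqrt(sum(v * v for v in signal))
    # Under H0, T ~ N(0, sigma^2 * ||s||^2).
    return sigma * norm_s * STD_NORMAL.inv_cdf(1.0 - alpha)


def detection_probability(signal, sigma, alpha):
    """P_D of the N-P test under H1: x ~ N(s, sigma^2 I)."""
    norm_s = math.sqrt(sum(v * v for v in signal))
    # Under H1, T ~ N(||s||^2, sigma^2 * ||s||^2).
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    return 1.0 - STD_NORMAL.cdf(z_alpha - norm_s / sigma)


def roc_points(signal, sigma, alphas):
    """Pairs (P_F, P_D) tracing the ROC curve for the given test levels."""
    return [(a, detection_probability(signal, sigma, a)) for a in alphas]


example_signal = [0.0, 3.0, 0.0, 4.0]  # hypothetical 2-sparse signal, norm 5
for p_f, p_d in roc_points(example_signal, 2.0, [0.01, 0.05, 0.1]):
    print(f"P_F={p_f:.2f}  P_D={p_d:.3f}")
```

Sweeping the test level $\alpha$ from 0 to 1 traces the whole ROC curve, and a larger ratio $\|\mathbf{s}\|_2/\sigma$ pushes the curve toward the upper left corner.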

$\ell_1$-Minimization of Compressed Sensing.
Compressed Sensing, first proposed by Donoho [23], has attracted widespread attention from the signal processing, image processing, computer vision, and pattern recognition fields [24]. As a cutting-edge technique, the critical observation of CS is that, given prior knowledge of a signal's sparsity, it is possible to efficiently and accurately reconstruct the full-length signal from a small amount of collected data, with a sampling rate far below the Nyquist rate.
As one of the most active research topics in CS, the recovery algorithms put forward so far can be roughly divided into three categories [25]: greedy pursuits (matching pursuit), basis pursuits (or $\ell_1$-minimization), and combinatorial algorithms. Each type of algorithm has its merits and applicable scenarios. $\ell_1$-minimization algorithms succeed with the fewest measurements and without requiring prior information, at the expense of a relatively heavy computational burden. Matching pursuit has an advantage in computation speed, yet it does not guarantee the globally optimal solution. Combinatorial algorithms require a large number of unusual samples that may not be easily acquired [25]. In reality, strong background noise always comes along with the signal. Robust recovery can be achieved by the $\ell_1$-error version of basis pursuit even when the SNR is fairly low. However, greedy pursuits may perform estimation successfully only if all nonzero components of the signal are somewhat larger than the noise level [26].
$\ell_1$-minimization is employed in this paper because it best fits the problem and provides a relatively accurate estimate of the signal. The use of the $\ell_1$ norm as a sparsity-promoting function dates back several decades, because minimizing it subject to linear equality constraints can be easily transformed into a linear program, for which efficient solution algorithms are available [27]. In the following, we first formulate a sensing problem and then introduce $\ell_1$-minimization.
A signal $\mathbf{x} \in \mathbb{R}^n$ can be sparsely represented by a linear combination of $k$ vectors from $\boldsymbol{\Psi}$. Denote the $k$-sparse vector $\mathbf{s} = \{s_1, s_2, \ldots, s_n\}$ as the coefficients of $\mathbf{x}$ under $\boldsymbol{\Psi}$. Then, $\mathbf{x}$ is sampled with a matrix $\boldsymbol{\Phi} \in \mathbb{R}^{m \times n}$, $m \ll n$, where the rows of $\boldsymbol{\Phi}$ are incoherent with the columns of $\boldsymbol{\Psi}$. Eventually, an incoherent observation is obtained as
$$\mathbf{y} = \boldsymbol{\Phi}\mathbf{x} = \boldsymbol{\Phi}\boldsymbol{\Psi}\mathbf{s} = \mathbf{A}\mathbf{s}, \tag{4}$$
where $\mathbf{A} = \boldsymbol{\Phi}\boldsymbol{\Psi}$ is defined as the measurement matrix.
After getting the sampled data $\mathbf{y}$, recovering $\mathbf{x}$ from (4) is an underdetermined linear problem because $m \ll n$. It is nearly impossible to determine a unique solution directly. The $\ell_0$-minimization method, which exhaustively searches all sparse subsets, is an NP-hard problem and is impractical [28]. Chen et al. put forward an algorithm called basis pursuit (or $\ell_1$-minimization) that uses the $\ell_1$-norm instead of the $\ell_0$-norm to alleviate the difficulty [29]:
$$\min_{\mathbf{s}} \|\mathbf{s}\|_1 \quad \text{subject to} \quad \mathbf{y} = \mathbf{A}\mathbf{s}. \tag{5}$$
In the case that $m \ge C \cdot k \cdot \log(n/k)$ for some positive constant $C$, the solution to (5) is exact with overwhelming probability. In 2005, Candès et al. further demonstrated that, under the restricted isometry property (RIP), $\ell_0$-norm and $\ell_1$-norm optimization have the same, exact solution [30].
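The transformation of basis pursuit into a linear program can be written out explicitly. One standard way, sketched here with an auxiliary vector $\mathbf{t}$ that is not part of the original notation and with $n$ denoting the length of $\mathbf{s}$, bounds each $|s_i|$ by $t_i$:

```latex
\begin{aligned}
\min_{\mathbf{s},\,\mathbf{t}} \quad & \sum_{i=1}^{n} t_i \\
\text{subject to} \quad & -t_i \le s_i \le t_i, \qquad i = 1, \ldots, n, \\
& \mathbf{A}\mathbf{s} = \mathbf{y}.
\end{aligned}
```

At the optimum each $t_i$ equals $|s_i|$, so the objective coincides with $\|\mathbf{s}\|_1$, and both the objective and the constraints are linear in $(\mathbf{s}, \mathbf{t})$.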
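Restricted Isometry Property (RIP) [30]. Let $\mathbf{A}_T$, $T \subset \{1, 2, \ldots, n\}$, be the $m \times |T|$ submatrix formed by extracting the columns of $\mathbf{A}$ indexed by $T$. Then, define the restricted isometry constant $\delta_k$ of $\mathbf{A}$ as the smallest quantity satisfying
$$(1 - \delta_k)\|\mathbf{x}\|_2^2 \le \|\mathbf{A}_T\mathbf{x}\|_2^2 \le (1 + \delta_k)\|\mathbf{x}\|_2^2 \tag{7}$$
for all subsets $T$ with $|T| \le k$ and all coefficient vectors $\mathbf{x} \in \mathbb{R}^{|T|}$. We may loosely say that a matrix $\mathbf{A}$ obeys the RIP of order $k$ if $\delta_k$ is not too close to one. One approach to generating a matrix $\mathbf{A}$ with the RIP of high order is to use random matrices such as a random Gaussian matrix [31]. Accordingly, there exists a denoising form of $\ell_1$-minimization. Suppose there is an additive noise $\mathbf{n}$.

[Figure 1. Labels: sparse event; sensor node.]
Basis Pursuit Denoising (BPDN) [29] (Also Known as Lasso). This is represented as follows:
$$\min_{\mathbf{s}} \|\mathbf{s}\|_1 \quad \text{subject to} \quad \|\mathbf{y} - \mathbf{A}\mathbf{s}\|_2 \le \epsilon, \tag{8}$$
where $\epsilon$ is the size of $\mathbf{n}$.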
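For readers who want to experiment with BPDN, the sketch below solves its Lagrangian (Lasso) form, $\min_{\mathbf{s}} \tfrac{1}{2}\|\mathbf{y} - \mathbf{A}\mathbf{s}\|_2^2 + \lambda\|\mathbf{s}\|_1$, by iterative soft-thresholding (ISTA). Two points are assumptions of this sketch rather than statements of the paper: ISTA is a swapped-in solver chosen for brevity, and the weight `lam` stands in for the noise size $\epsilon$ of the constrained form, with a problem-dependent correspondence that is not specified here. Matrices are plain lists of rows so that only the Python standard library is needed.

```python
def soft_threshold(value, thresh):
    """Proximal operator of thresh * |.|: shrink value toward zero."""
    if value > thresh:
        return value - thresh
    if value < -thresh:
        return value + thresh
    return 0.0


def lasso_objective(A, y, s, lam):
    """0.5 * ||y - A s||_2^2 + lam * ||s||_1."""
    residual = [sum(a * v for a, v in zip(row, s)) - yi for row, yi in zip(A, y)]
    return 0.5 * sum(r * r for r in residual) + lam * sum(abs(v) for v in s)


def ista_lasso(A, y, lam, n_iter=500):
    """Minimize the Lasso objective by iterative soft-thresholding."""
    m, n = len(A), len(A[0])
    # The squared Frobenius norm upper-bounds ||A||_2^2, so the step size
    # 1 / lipschitz guarantees that the objective never increases.
    lipschitz = sum(a * a for row in A for a in row)
    s = [0.0] * n
    for _ in range(n_iter):
        residual = [sum(a * v for a, v in zip(row, s)) - yi for row, yi in zip(A, y)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        s = [
            soft_threshold(s[j] - grad[j] / lipschitz, lam / lipschitz)
            for j in range(n)
        ]
    return s


# Sanity check in the pure denoising case A = I.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
print(ista_lasso(identity, [3.0, 0.0, -2.0], lam=1.0))
```

When $\mathbf{A}$ is the identity, the minimizer is simply the soft-thresholded observation, which gives a quick correctness check; for a general measurement matrix, accelerated variants or a dedicated convex solver would normally be preferred.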

LRT-SE Algorithm
As stated in the preceding sections, the problem in this paper is more general and practical, for no prior information about the signal is assumed apart from its sparsity. Applying the background theories discussed above, the general strategy of LRT-SE is a joint estimation and detection technique, which estimates the parameter of the likelihood function, namely the sparse signal, by utilizing $\ell_1$-minimization. It is important to note that we are more interested in making a decision than in estimating the exact value of the sparse signal. In addition, we consider a multiple-observation scenario, in which either multiple sensors are deployed or a single receiver receives the signal multiple times. Consequently, we give an overall judgment of whether the sequence of observations contains sparse signals or only noise.

Problem Model Setup.
We consider a scenario in which a total of $L$ sensors are employed, as shown in Figure 1, or a receiver receives signals $L$ times in a system, monitoring an event which can be described as a signal $\mathbf{s}$.
This signal is $k$-sparse in the time domain, which means that apparent impulses take place at only $k$ points, while at other times the signal remains zero. Meanwhile, our knowledge about the signal is limited to its sparsity. The occurrence time and the amplitude of each impulse can take any real value within range. Moreover, the detection suffers from white Gaussian noise, which impairs monitoring. The problem addressed can therefore be formulated as a binary composite hypothesis test. Let $(\mathbf{x}_0, \mathbf{x}_1, \ldots, \mathbf{x}_L)$ $(L \ge 1)$ be a sequence of i.i.d. observations from the $L$ sensors measuring the event. According to the description of the problem, the binary hypotheses are formulated as
$$H_0: \mathbf{x}_l = \mathbf{n}, \qquad H_1: \mathbf{x}_l = \mathbf{s} + \mathbf{n}, \qquad l = 0, 1, \ldots, L,$$
where $\mathbf{x}_l, \mathbf{s}, \mathbf{n} \in \mathbb{R}^n$. $\mathbf{s}$ is an unknown $k$-sparse signal with $\|\mathbf{s}\|_0 \le k$, and the noise $\mathbf{n}$ is assumed to be Gaussian white noise, $\mathbf{n} \sim \mathcal{N}(0, \sigma^2\mathbf{I}_n)$. Therefore, the distributions under the two hypotheses are
$$H_0: \mathbf{x}_l \sim \mathcal{N}(0, \sigma^2\mathbf{I}_n), \qquad H_1: \mathbf{x}_l \sim \mathcal{N}(\mathbf{s}, \sigma^2\mathbf{I}_n).$$
Next, a novel scheme under the name LRT-SE is put forward to solve the problem.

Scheme of LRT-SE.
Based on the problem set up above, no test can perform better than a simple hypothesis test in which the receiver first measures $\mathbf{s}$ perfectly and then designs the optimal likelihood ratio test. However, there exists no uniformly most powerful (UMP) test, as the likelihood ratio test cannot be completely defined for every possible $\mathbf{s}$ without knowledge of $\mathbf{s}$.
A reasonable approach we propose here is to estimate $\mathbf{s}$, assuming that $H_1$ is true, by applying $\ell_1$-minimization, and then to substitute the estimated signal into the likelihood function as if it were the real signal.

LRT-SE works as follows:
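(1) For $l = 0$, estimate $\mathbf{s}$ from $\mathbf{x}_0$ assuming that $H_1$ is true. Denote the estimate $\mathbf{s}^{\#} = \mathbf{s}^{\#}(\mathbf{x}_0)$, obtained by employing the $\ell_1$-minimization algorithm in its denoising (BPDN) form:
$$\mathbf{s}^{\#} = \arg\min_{\mathbf{s}} \|\mathbf{s}\|_1 \quad \text{subject to} \quad \|\mathbf{A}\mathbf{x}_0 - \mathbf{A}\mathbf{s}\|_2 \le \epsilon,$$
where $\epsilon$ is the size of the noise, $\|\mathbf{e}\|_2 \le \epsilon$ with $\mathbf{e} = \mathbf{A}\mathbf{n}$.
(2) For $l = 1, 2, \ldots, L$ $(L \ge 1)$, the likelihood functions are calculated as
$$p(\mathbf{x}_l \mid H_0) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n} x_{l,i}^2\right), \qquad p(\mathbf{x}_l \mid H_1; \mathbf{s}^{\#}) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\left(-\frac{1}{2\sigma^2}\sum_{i=1}^{n} \left(x_{l,i} - s^{\#}_i\right)^2\right),$$
where $s^{\#}_i$ is the $i$th entry of $\mathbf{s}^{\#}$ and $x_{l,i}$ is the $i$th entry of $\mathbf{x}_l$.
(3) The likelihood ratio for the $L$ observations is
$$\Lambda(\mathbf{x}_1, \ldots, \mathbf{x}_L) = \prod_{l=1}^{L} \frac{p(\mathbf{x}_l \mid H_1; \mathbf{s}^{\#})}{p(\mathbf{x}_l \mid H_0)}.$$
(4) Set up a reasonable test level $\alpha$ $(0 < \alpha < 1)$ and choose the matching threshold $\eta_\alpha$ according to the N-P lemma (3).
(5) Ultimately, the detector of the hypothesis test is
$$\phi(\mathbf{x}_L; \eta_\alpha) = \begin{cases} 1, & \Lambda \ge \eta_\alpha, \\ 0, & \Lambda < \eta_\alpha. \end{cases}$$
The structure of the scheme is displayed in Figure 2.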
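The detection stage of the steps above can be sketched as follows, under assumptions that we state explicitly: the estimate `s_hat` has already been computed from the separate observation $\mathbf{x}_0$ and is treated as fixed, the noise standard deviation `sigma` is known, and the test is carried out on the statistic $T = \frac{1}{L}\sum_{l}\mathbf{x}_l^T\mathbf{s}^{\#}$, which is a monotone function of the log likelihood ratio. Function names are illustrative only.

```python
import math
from statistics import NormalDist


def log_likelihood_ratio(observations, s_hat, sigma):
    """ln of the likelihood ratio for Gaussian noise with s_hat plugged in."""
    energy = sum(v * v for v in s_hat)
    total = 0.0
    for x in observations:
        total += sum(x_i * s_i for x_i, s_i in zip(x, s_hat)) - 0.5 * energy
    return total / sigma ** 2


def lrt_se_decide(observations, s_hat, sigma, alpha):
    """Return (statistic, threshold, decision) for observations x_1..x_L.

    Under H0 and for fixed s_hat, T ~ N(0, sigma^2 * ||s_hat||^2 / L), so the
    threshold below yields false alarm probability alpha.
    """
    num_obs = len(observations)
    norm_hat = math.sqrt(sum(v * v for v in s_hat))
    if norm_hat == 0.0:
        return 0.0, 0.0, False  # an all-zero estimate gives no evidence for H1
    statistic = sum(
        sum(x_i * s_i for x_i, s_i in zip(x, s_hat)) for x in observations
    ) / num_obs
    z_alpha = NormalDist().inv_cdf(1.0 - alpha)
    threshold = sigma * norm_hat * z_alpha / math.sqrt(num_obs)
    return statistic, threshold, statistic >= threshold


s_hat = [0.0, 3.0, 0.0, 0.0]            # hypothetical sparse estimate
obs = [[0.1, 2.8, -0.2, 0.0], [0.0, 3.1, 0.1, -0.1]]
print(lrt_se_decide(obs, s_hat, sigma=1.0, alpha=0.05))
```

Because $\mathbf{s}^{\#}$ is estimated from $\mathbf{x}_0$ only, it is independent of the observations used in the test, which is what makes the Gaussian threshold above valid conditionally on the estimate.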

Sparse Estimation Accuracy.
Unlike ML estimation, for which specific analyses of estimation accuracy are lacking, the reconstruction performance of $\ell_1$-minimization is supported by established results. As discussed in the theoretical background, if the measurement matrix $\mathbf{A}$ obeys the RIP (7) and the vector $\mathbf{s}$ is sufficiently sparse, then Theorem 1 describes the relationship between the level of noise perturbing the measurements and the accuracy of the sparse estimation.
Theorem 1. Suppose that $k$ is such that $\delta_{3k} + 3\delta_{4k} < 2$. Then, for any signal $\mathbf{s}$ supported on $T$ with $|T| \le k$ and any perturbation $\mathbf{e} = \mathbf{A}\mathbf{n}$ with $\|\mathbf{e}\|_2 \le \epsilon$, the solution $\mathbf{s}^{\#}$ to (8) obeys
$$\|\mathbf{s}^{\#} - \mathbf{s}\|_2 \le C_s \cdot \epsilon, \tag{15}$$
where the positive constant $C_s$ depends only on $\delta_{4k}$; for example, typically $C_s \approx 8.82$ for $\delta_{4k} = 1/5$.
Later on, a coherence-based guarantee for BPDN was established by Ben-Haim et al. under the assumption of stochastic noise [26], as Theorem 2 describes.
Theorem 2 (see [26]). Suppose that $\mathbf{n} \sim \mathcal{N}(0, \sigma^2\mathbf{I})$ is a random noise vector. With overwhelming probability, the solution $\mathbf{s}^{\#}$ to (8) is unique and obeys a coherence-based error bound that holds for some fairly small positive constant (see [26] for the explicit inequality). This theorem presents a stronger performance guarantee because it assumes random noise and characterizes the behavior of the estimator for typical noise values.
Utilizing the result of Theorem 1, we propose a corollary that states the upper bound on the noise level under which the estimate is highly correlated with the original signal. In addition, it provides the prerequisite under which the N-P test can be taken as the UMP test when detecting the sparse signal in a composite hypothesis framework.
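Proof. It is easy to see that the perturbation $\mathbf{e} = \mathbf{A}\mathbf{n} \in \mathbb{R}^m$ is still white Gaussian noise with $\mathbf{e} \sim \mathcal{N}(0, \sigma^2\mathbf{I}_m)$.
From inequality (15), it is straightforward to obtain
$$\|\mathbf{s}^{\#} - \mathbf{s}\|_2 \le C_s\epsilon.$$
From the triangle inequality and the guarantee of Theorem 1, we get
$$\|\mathbf{s}\|_2 - \|\mathbf{s}^{\#}\|_2 \le \|\mathbf{s}^{\#} - \mathbf{s}\|_2 \le C_s\epsilon,$$
which becomes
$$\|\mathbf{s}^{\#}\|_2 \ge \|\mathbf{s}\|_2 - C_s\epsilon.$$
Therefore, if $\|\mathbf{s}\|_2 > C_s\epsilon$, denoting $\tau = \|\mathbf{s}\|_2^2 - \|\mathbf{s}\|_2 C_s\epsilon$ and applying the Cauchy-Schwarz inequality together with the inequality above, it yields that
$$\mathbf{s}^T\mathbf{s}^{\#} = \|\mathbf{s}\|_2^2 + \mathbf{s}^T(\mathbf{s}^{\#} - \mathbf{s}) \ge \|\mathbf{s}\|_2^2 - \|\mathbf{s}\|_2 C_s\epsilon = \tau > 0.$$
For Gaussian noise $\mathbf{e} \sim \mathcal{N}(0, \sigma^2\mathbf{I}_m)$, $\|\mathbf{e}\|_2 \le \epsilon$ holds with large probability when $\epsilon^2 = \sigma^2(m + \lambda\sqrt{2m})$. The upper bound on the noise level is then obtained as
$$\sigma < \frac{\|\mathbf{s}\|_2}{C_s\sqrt{m + \lambda\sqrt{2m}}}.$$
Usually, $\lambda = 2$ or 3 is a reasonable choice, and other choices are also possible.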
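The quantities appearing in this proof are easy to evaluate numerically. The sketch below assumes the noise radius $\epsilon^2 = \sigma^2(m + \lambda\sqrt{2m})$ and the condition $\|\mathbf{s}\|_2 > C_s\epsilon$, and uses the example constant $C_s \approx 8.82$ quoted with Theorem 1 as a default; the function names and default values are illustrative, not prescribed by the paper.

```python
import math


def noise_radius(sigma, m, lam=2.0):
    """epsilon with epsilon^2 = sigma^2 * (m + lam * sqrt(2 * m))."""
    return sigma * math.sqrt(m + lam * math.sqrt(2.0 * m))


def max_noise_std(signal_norm, m, c_s=8.82, lam=2.0):
    """Supremum of sigma for which ||s||_2 > C_s * epsilon holds."""
    return signal_norm / (c_s * math.sqrt(m + lam * math.sqrt(2.0 * m)))


def correlation_lower_bound(signal_norm, sigma, m, c_s=8.82, lam=2.0):
    """tau = ||s||^2 - ||s|| * C_s * epsilon, a lower bound on s^T s#."""
    return signal_norm ** 2 - signal_norm * c_s * noise_radius(sigma, m, lam)


m = 128  # hypothetical number of measurements
print("epsilon for sigma=1:", noise_radius(1.0, m))
print("largest admissible sigma for ||s||=1:", max_noise_std(1.0, m))
```

With these defaults the admissible noise level is small, which reflects how conservative the worst-case constant $C_s$ is; the bound is a sufficient condition, and the experiments operate at noise levels well above it.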

Preliminary Analyses in Finite Dimension.
In practice, we aim to detect the sparse signal even when it is weak and the noise is substantial. Therefore, the detection scheme is required to be sensitive to the target and robust against interference. In this paper, detection sensitivity and robustness are measured in terms of two kinds of error probabilities, $P_F$ and $P_M$. Next, preliminary finite dimensional analyses are introduced focusing on these two properties. The miss detection probability $P_M$ and the false alarm probability $P_F$ reflect the sensitivity and the robustness of the detection scheme, respectively. However, these two properties conflict with each other. Raising the decision threshold improves the robustness of the detection system, as $P_F$ decreases, while the sensitivity gets worse, as $P_M$ rises. Conversely, lowering the decision threshold makes the system more sensitive, since $P_M$ decreases, yet $P_F$ grows. In practice, $1 - P_M$, which is also called the detection probability $P_D$, is usually used to measure the sensitivity of the detection scheme. As adopted in this paper, the Neyman-Pearson test obtains the largest $P_D$ under a constrained $P_F$. Moreover, an upper bound on $P_M$ can be obtained by employing the corollary mentioned before.
Lemma 4. Under the test of (13), supposing that $L = 1$, if the false alarm probability is given as $P_F = \alpha$, then $P_M$ admits an upper bound that decays exponentially with $\left(\|\mathbf{s}\|_2\cos\theta/\sigma - \Phi^{-1}(1-\alpha)\right)^2$, where $\cos\theta = \mathbf{s}^T\mathbf{s}^{\#}/(\|\mathbf{s}\|_2 \times \|\mathbf{s}^{\#}\|_2)$ and $\theta$ is the angle between $\mathbf{s}$ and $\mathbf{s}^{\#}$.
Proof. By taking the logarithm, the test of (13) is equivalent to a test on the statistic $T = \frac{1}{L}\sum_{l=1}^{L}\mathbf{x}_l^T\mathbf{s}^{\#}$, which reduces to $T = \mathbf{x}_1^T\mathbf{s}^{\#}$ when there is only one observation. The threshold changes correspondingly as
$$\gamma = \frac{\sigma^2}{L}\ln\eta_\alpha + \frac{\|\mathbf{s}^{\#}\|_2^2}{2}.$$
The distributions of the test statistic $T$ become
$$H_0: T \sim \mathcal{N}\left(0, \frac{\sigma^2\|\mathbf{s}^{\#}\|_2^2}{L}\right), \qquad H_1: T \sim \mathcal{N}\left(\mathbf{s}^T\mathbf{s}^{\#}, \frac{\sigma^2\|\mathbf{s}^{\#}\|_2^2}{L}\right).$$
As a result, for a test level $\alpha$ $(0 < \alpha < 1)$ and $L = 1$, the threshold should satisfy
$$\alpha = 1 - \Phi\left(\frac{\gamma}{\sigma\|\mathbf{s}^{\#}\|_2}\right),$$
where $\Phi$ is the CDF of the standard normal distribution.
It follows that
$$P_M = \Phi\left(\frac{\gamma - \mathbf{s}^T\mathbf{s}^{\#}}{\sigma\|\mathbf{s}^{\#}\|_2}\right) = \Phi\left(\Phi^{-1}(1-\alpha) - \frac{\|\mathbf{s}\|_2\cos\theta}{\sigma}\right).$$
Then, utilizing the inequality for the tail probability of the standard normal distribution [32], $P_M$ is bounded above by a term that decays exponentially with $\left(\|\mathbf{s}\|_2\cos\theta/\sigma - \Phi^{-1}(1-\alpha)\right)^2$. Under the condition of high SNR, applying a technique similar to that of Corollary 3, which gives $\cos\theta > 1 - 2C_s\epsilon/\|\mathbf{s}\|_2$, the bound can be expressed through $(\|\mathbf{s}\|_2 - 2C_s\epsilon)/\sigma$ in place of $\|\mathbf{s}\|_2\cos\theta/\sigma$.
This lemma indicates that the miss detection probability decays exponentially with the SNR and with the accuracy of the sparse estimation in terms of the angle $\theta$. Considering the factors of the exponential decay rate, $\|\mathbf{s}\|_2/\sigma$ is the square root of the SNR of a matched filter when $\mathbf{s}$ is available, which is totally different from the input signal-to-noise ratio, denoted by $\mathrm{SNR} = \|\mathbf{s}\|_2^2/(n\sigma^2)$. The latter declines linearly with the increase of the signal dimension $n$. Another factor of the decay rate is $C_s\epsilon$, which shows the cost incurred by the unknown nature of the signal through the error of the sparse estimation.
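The behavior described by the lemma can be checked numerically. The sketch below assumes $L = 1$ and treats $\mathbf{s}^{\#}$ as fixed, so that the exact miss probability is $P_M = \Phi\left(\Phi^{-1}(1-\alpha) - \|\mathbf{s}\|_2\cos\theta/\sigma\right)$. For the upper bound it uses the standard Gaussian tail inequality $Q(t) \le \tfrac{1}{2}e^{-t^2/2}$ for $t \ge 0$, which is our choice for illustration and may differ in its constant from the inequality of [32] used in the paper.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()


def miss_probability(signal_norm, cos_theta, sigma, alpha):
    """Exact P_M for L = 1 when the estimate s# is treated as fixed."""
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    deflection = signal_norm * cos_theta / sigma
    return STD_NORMAL.cdf(z_alpha - deflection)


def miss_probability_bound(signal_norm, cos_theta, sigma, alpha):
    """Tail bound Q(t) <= 0.5 * exp(-t^2 / 2), applicable when t >= 0."""
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    t = signal_norm * cos_theta / sigma - z_alpha
    if t < 0.0:
        return 1.0  # tail bound not applicable; fall back to the trivial bound
    return 0.5 * math.exp(-0.5 * t * t)


# Hypothetical setting: unit-norm signal, estimate at angle cos(theta) = 0.9.
for sigma in (0.5, 0.3, 0.2):
    exact = miss_probability(1.0, 0.9, sigma, 0.05)
    bound = miss_probability_bound(1.0, 0.9, sigma, 0.05)
    print(f"sigma={sigma}: P_M={exact:.5f}  bound={bound:.5f}")
```

Both the exact value and the bound shrink rapidly as $\sigma$ decreases or as $\cos\theta$ approaches one, which mirrors the two factors of the decay rate discussed above.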

Experiments and Results
A large number of simulation experiments on LRT-SE have been carried out using synthetic sparse signals and typical sparse signal samples from satellite telemetry to verify the properties of LRT-SE. The setup of the simulation is explained in Section 5.1, including the description of the sparse signals and the parameter settings. Section 5.2 presents the experimental results and compares the performance of LRT-SE against the optimal LRT and other detectors.

Sparse Signals and Parameters Setting.
To evaluate the proposed detection scheme, both synthetic and real sampled sparse signals are used in the experiments. The synthetic sparse signal $\mathbf{s} \in \mathbb{R}^n$ is created by randomly choosing its support set $\operatorname{supp}(\mathbf{s}) := \{i \mid s_i \ne 0\}$ with sparsity $k$. The nonzero components are drawn i.i.d. from the standard normal distribution. Empirically, energy estimation can be regarded as an insignificant factor in finite settings, which enables us to normalize the energy of $\mathbf{s}$ so that $\|\mathbf{s}\|_2 = 1$. Figure 3 presents a typical randomly generated sparse signal. Because the sparse signal is randomly generated, during the simulation $\mathbf{s}$ is deterministic but unknown to the LRT-SE detector, while in the case of the LRT scheme it is known.
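The signal generation just described can be sketched with the standard library alone. The function names, the explicit random generator argument, and the example sizes are our own choices for illustration; the actual experimental parameters are those of Table 1.

```python
import math
import random


def make_sparse_signal(n, k, rng):
    """Length-n signal with k nonzero N(0, 1) entries, scaled to unit l2 norm."""
    support = rng.sample(range(n), k)  # random support set of size k
    signal = [0.0] * n
    for idx in support:
        signal[idx] = rng.gauss(0.0, 1.0)
    norm = math.sqrt(sum(v * v for v in signal))
    return [v / norm for v in signal]


def observe(signal, sigma, rng, signal_present=True):
    """One observation: x = s + n under H1, x = n under H0, n ~ N(0, sigma^2 I)."""
    return [
        (s_i if signal_present else 0.0) + rng.gauss(0.0, sigma)
        for s_i in signal
    ]


rng = random.Random(2024)                # fixed seed for repeatability
s = make_sparse_signal(n=64, k=5, rng=rng)
x_h1 = observe(s, sigma=0.5, rng=rng)
x_h0 = observe(s, sigma=0.5, rng=rng, signal_present=False)
print(sum(1 for v in s if v != 0.0), "nonzero entries")
```

Passing the generator explicitly keeps repeated Monte Carlo runs reproducible, which matters when averaging detection results over many test signals.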
The specific parameters of the simulation experiments on synthetic sparse signal are tabulated in Table 1.
The satellite telemetry data employed in this paper come from a high altitude satellite and consist of a 1024-dimensional synthetic angular momentum of wheel vector, a 512-dimensional yaw angle vector, and a 512-dimensional pitch angle vector, as Figure 4 shows. These data all possess the characteristics of high dimensionality, huge amount, sparsity in the time domain, and noise contamination. Moreover, according to the downlink signal specification, the signal power of the telemetry data is −110 dBm while the noise power is −100.7 dBm. After amplification, the final SNR is approximately −9.3 dB. Also, as stated above, there exists no prior information about the telemetry data except for their sparsity, which accords with the application condition of LRT-SE.

Simulation Results.
Subsequently, a large number of simulation experiments were conducted, and several plots were generated, as the result figures show.
Figures 5-7 show results from simulation experiments on synthetic sparse signals. Figure 5 depicts the ROC curves of LRT-SE and LRT. The variation of detection probability with respect to SNR is displayed in Figure 6. Figure 7 illustrates the ROC curve of LRT-SE as a function of the number of measurements. These figures are averages of experimental results over 10000 test signals mixed with Gaussian noise vectors.
Figure 8 shows the ROC curves of LRT-SE detecting telemetry data. This experiment is carried out on 500 signals extracted from the satellite downlink signal files.
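(i) ROC Curves of Detecting Synthetic Signals. See Figure 5.
(ii) Detection Probability versus SNR. See Figure 6.
(iii) ROC Curves versus Measurements Number. See Figure 7.
(iv) ROC Curves of Detecting Telemetry Data. See Figure 8.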

Discussion
In this section, a detailed discussion of the results presented in the previous sections is first carried out. Then, we further investigate the high dimensional properties of LRT-SE and discuss the corresponding result figures.

Analytical Discussion
(i) ROC Curves of Detecting Synthetic Signals.From Figure 5, it can be observed that the proposed LRT-SE has approximate performance to the optimal LRT under relatively low SNR.By choosing appropriate   , such as 0.05, the detection rate could reach 0.9 which is acceptable in most cases.The difference of the detection probabilities between the two detectors is obvious when   is less than 0.1 due to the cost of no prior information.
(ii) Detection Probability versus SNR.Similar to Figure 5, in Figure 6, the curve of LRT performs as an ideal detector when SNR is fairly low while LRT-SE and GLRT approach nearly perfect detection when SNR exceeds −5 dB with the sparsity  = 15.This is mainly because when noise is repressed, s  s # would enlarge according to Corollary 3 and bring about improvement of detection probability of likelihood test based on sparse estimation.It should be pointed out that although GLRT seems to own better asymptotic optimality compared with LRT-SE, it is an N-P hard optimization problem utilizing GLRT if the signal is compressed by measurement matrix in advance [33].GLRT has to search every set satisfying {s : ‖s‖ 0 ≤ } making it impossible to calculate for high dimensional signals in practice.Conversely, LRT-SE could estimate the unknown and arbitrary sparse signal with relatively low computational complexity.In general, in the case of strong sparsity, the asymptotic performance of LRT-SE is in the vicinity of GLRT while LRT-SE could be more easily implemented.
(iii) ROC Curves versus Measurements Number.As plotted in Figure 7, it could be discovered that, for a fixed sparsity level of 2, SNR of −5 dB, and  = 1, the detection probability increases as more measurements have been undertaken.The compression ratio is defined as /.It is not hard to expect such a result for the larger the compression ratio is, the more the crucial information may be lost inducing difficulty in accurately reconstructing the sparse signal.Although compression ratio is not specifically demanded in this paper, there exists a lower bound of measurements where  ≥  ⋅  ⋅ log(/) to guarantee the matrix obeying the restricted isometry property, just as Section 3.2 mentioned.
(iv) ROC Curves of Detecting Telemetry Data.Every figure of Figure 8 depicts three curves with  = 1, 5, 10, respectively, because the satellite telemetry system could provide at most ten observations.From Figure 8(a), it is notable that LRT-SE reaches nearly a hundred percent detection rate with   less than 0.05 even if only one observation is provided.It performs near optimally with larger .However, the curves of Figures 8(b) and 8(c) are not as good as in Figure 8(a) which is partly because of the sparsity and the signal amplitude as shown in Figures 4(b) and 4(c).Comparing Figures 4(a) and 4(b), the amplitudes of yaw angle data are less uniform than of angular momentum which is reflected in poorer detection probability.Likewise, it could be indicated that Figure 8(c) shows worse performance because the sparsity of pitch angle data is weaker than of angular momentum data.This phenomenon is worth researching in the future works.
Nonetheless, the loss caused by weak sparsity and irregular amplitudes can be compensated for by obtaining more observations, as the arrows in Figures 8(b) and 8(c) suggest. For a given false alarm probability and SNR, the more observations are available, the better the detection rate that LRT-SE can achieve.
6.2. Further Discussion. In the high-dimensional setting, asymptotic properties such as consistency are usually adopted to describe the performance of point estimators and hypothesis tests. Further investigation has been conducted under the condition that  → ∞,  → ∞.
(i) Chernoff Consistency. Consistency is a property describing the convergence of the error probabilities to zero in some sense, as  → ∞. Among the different definitions of consistency, Chernoff consistency requires that both the false alarm probability and the miss detection probability converge to 0 as  → ∞ [34].
Definition 5. The detector (x  ; ) of the hypothesis test is called Chernoff-consistent if and only if there exists a criterion   such that the hypothesis test (13) can perfectly distinguish samples from the binary hypotheses when  → ∞. That is,
(1) lim →∞   [(x  ; )] = 1, for any x  under  1 ;
(2) lim →∞   [(x  ; )] = 0, for any x  under  0 .
The next theorem states that a reliable and consistent detector is possible only if the noise level is controlled within a certain range.
Theorem 6. One has A ∈ R × , and s is the -sparse unknown signal of interest where  is a function of A. Let {x 0 , x 1 , . . ., x  } ( ≥ 1) be a sequence of i.i.d. observations sampled from one of the two hypotheses in (13). Let (x  ; ) be a detector defined in (14).
Summing up the above, the detector (x  ; ) of LRT-SE is Chernoff-consistent.
We use the smallest total error probability, that is, the sum of the false alarm and miss detection probabilities minimized over all possible thresholds, to characterize Chernoff consistency and to check whether it eventually reaches zero. The result is plotted in Figure 9 and agrees with the theoretical derivation. When sufficiently many observations are available, the sum of the false alarm and miss detection probabilities decreases to zero for all three sparsity levels. Moreover, the sparser the signal, the faster the test approaches zero error. This indicates that, at a fixed SNR, the less sparse the signal is, the harder it is for detectors to separate it from background noise.
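The minimum total error criterion described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `min_total_error` and `toy_total_error`, the decision rule (declare a signal when the statistic is at least the threshold), and the Gaussian toy statistics with `mean_shift = 0.5` are all assumptions made for the example.

```python
import numpy as np


def min_total_error(stats_noise, stats_signal):
    """Smallest empirical (false alarm + miss) rate over all thresholds.

    Assumed decision rule (hypothetical): declare "signal present" when the
    test statistic is greater than or equal to the threshold.
    """
    stats_noise = np.asarray(stats_noise, dtype=float)
    stats_signal = np.asarray(stats_signal, dtype=float)
    # Only thresholds at observed values (plus +inf) can change the decisions.
    thresholds = np.concatenate([stats_noise, stats_signal, [np.inf]])
    # Rows index thresholds, columns index trials.
    p_false_alarm = np.mean(stats_noise[None, :] >= thresholds[:, None], axis=1)
    p_miss = np.mean(stats_signal[None, :] < thresholds[:, None], axis=1)
    return float(np.min(p_false_alarm + p_miss))


def toy_total_error(num_obs, mean_shift=0.5, trials=1000, seed=0):
    """Toy experiment: the statistic is a sum of num_obs i.i.d. Gaussian terms.

    The Gaussian model and mean_shift are illustrative assumptions only.
    """
    rng = np.random.default_rng(seed)
    noise_only = rng.normal(0.0, 1.0, size=(trials, num_obs)).sum(axis=1)
    with_signal = rng.normal(mean_shift, 1.0, size=(trials, num_obs)).sum(axis=1)
    return min_total_error(noise_only, with_signal)


if __name__ == "__main__":
    # The minimum total error should shrink as observations accumulate.
    for n in (1, 10, 100):
        print(n, toy_total_error(n))
```

Scanning only the observed statistic values is enough because the empirical error rates change only when the threshold crosses one of them.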
(ii) Error Exponent. The error exponent is the rate at which the error probabilities of the detector decrease as the number of observations increases. It is practical to discuss the error exponent because it provides a rough estimate of how many observations are needed to keep the error probabilities fairly small. Consequently, the error exponent for a Neyman-Pearson LRT-SE detector is defined as the rate at which the miss probability decays exponentially as the number of observations tends to infinity.
Note that the test statistic in (23) can be written as the sum of i.i.d. random variables () =   s # , whose distributions are given in (32). The test has been simplified to better characterize the error exponent. By applying the following lemma, the ideal error exponent can be obtained.
Lemma 7 gives a quantitative description of the error exponent. Theorem 8 then derives the theoretical value of the error exponent using Lemma 7.

Theorem 8. One has
where  is the angle between $\mathbf{s}$ and $\mathbf{s}^{\#}$.
Proof. The Kullback-Leibler distance is taken between $h_0$ and $h_1$, the distributions in (32). After a simple calculation of the integral, the stated result follows, where cos  = $\mathbf{s}^{T}\mathbf{s}^{\#}/(\|\mathbf{s}\|_{2}\,\|\mathbf{s}^{\#}\|_{2})$.
In Figure 10, the green curve depicts the theoretical decay of LRT-SE indicated by Theorem 8, while the blue curve displays the actual miss detection probabilities of LRT-SE. Both the actual and the theoretical curves decay exponentially as the number of observations increases, since each is approximately a straight line. Nevertheless, a persistent gap remains between the two curves because the analysis does not account for the constant term. Beyond the result of this figure, it can be inferred and easily verified that the optimal LRT detector also possesses an error exponent and can reach the order of $10^{-10}$ with very few observations.
From this further discussion, we conclude via asymptotic analysis that the performance of LRT-SE improves when  and  are large enough. To satisfy the sufficient condition for consistency, the noise level should stay below a certain limit. In addition, the detection error probability is expected to decay faster as more observations become available.
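As a sanity check on the kind of integral used in the proof, the sketch below compares a closed-form Kullback-Leibler distance with a numerical evaluation of the standard definition, the integral of $h_0 \log(h_0/h_1)$. It assumes, purely for illustration, that $h_0$ and $h_1$ are Gaussian densities with a common variance; the actual distributions are those of (32), which are not reproduced here, and the function names are hypothetical.

```python
import numpy as np


def kl_equal_variance_gaussians(mean0, mean1, variance):
    """Closed-form KL distance D(h0 || h1) for two Gaussians sharing a variance."""
    return (mean1 - mean0) ** 2 / (2.0 * variance)


def kl_numeric(mean0, mean1, variance, half_width=12.0, points=200001):
    """Riemann-sum approximation of the integral of h0 * log(h0 / h1)."""
    sigma = np.sqrt(variance)
    # Integrate over a window centred on h0 that is wide enough to hold its mass.
    z = np.linspace(mean0 - half_width * sigma, mean0 + half_width * sigma, points)
    log_norm = -0.5 * np.log(2.0 * np.pi * variance)
    log_h0 = log_norm - (z - mean0) ** 2 / (2.0 * variance)
    log_h1 = log_norm - (z - mean1) ** 2 / (2.0 * variance)
    dz = z[1] - z[0]
    return float(np.sum(np.exp(log_h0) * (log_h0 - log_h1)) * dz)


if __name__ == "__main__":
    # Assumed example values: means 0 and 1.5, common variance 2.
    print(kl_equal_variance_gaussians(0.0, 1.5, 2.0))
    print(kl_numeric(0.0, 1.5, 2.0))
```

Working with log-densities avoids dividing by values of $h_1$ that underflow in the tails.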

Conclusions and Future Works
In this paper, a new method named LRT-SE, which integrates the likelihood ratio test and sparse estimation for sparse signal detection, is proposed. As long as the background noise level stays within a relatively broad range, LRT-SE performs close to the existing asymptotically optimal methods in both finite and high dimensions. It is revealed that the miss detection probability has an upper bound in direct proportion to the angle between the estimated signal and the original signal. However, as the number of observations increases toward infinity, the false alarm probability and the miss detection probability eventually reach zero. Moreover, the exponential decay rate of the miss detection probability can be defined as the error exponent and calculated by large deviation theory. In short, the LRT-SE scheme provides a novel detection method that achieves high availability and robustness in detecting sparse signals with no prior information, with relatively low computational complexity compared with GLRT.
In the future, there are two promising directions for further research. First, the accuracy of the sparse estimation could be improved by employing a better reconstruction algorithm. Second, the application of LRT-SE in distributed networks, such as a wireless sensor network or a telemetry system, deserves deeper study.

Figure 4: Snapshots of the sparse telemetry data: (a) original data of synthetic angular momentum of wheel; (b) original data of yaw angle; and (c) original data of pitch angle.

Figure 5: ROC curves of LRT-SE compared to LRT.

Figure 7: ROC curve of LRT-SE under different measurement numbers.

Figure 8: LRT-SE performance of detecting telemetry data: (a) angular momentum detection performance; (b) yaw angle detection performance; and (c) pitch angle detection performance.

Figure 9: Total error probability based on 1000 tests with fixed SNR.

Figure 10: Miss detection probability against number of observations ($P_F$ = 0.02, SNR = −10 dB).
(i) Result Figure of Error Exponent. The error exponent is illustrated in Figure 10, which plots the miss detection probability against the number of observations in semi-logarithmic coordinates.
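One way to read an empirical error exponent off a curve such as the one in Figure 10 is to fit a straight line to the logarithm of the miss detection probability. The sketch below is a hedged illustration, not the procedure used in the paper: the function name `fit_error_exponent` and the synthetic data are assumptions, and the natural logarithm is used.

```python
import numpy as np


def fit_error_exponent(num_obs, p_miss):
    """Least-squares fit of log(p_miss) ~ intercept - rate * num_obs.

    Returns (rate, intercept). Zero miss probabilities are skipped because
    their logarithm is undefined.
    """
    num_obs = np.asarray(num_obs, dtype=float)
    p_miss = np.asarray(p_miss, dtype=float)
    keep = p_miss > 0
    slope, intercept = np.polyfit(num_obs[keep], np.log(p_miss[keep]), 1)
    return float(-slope), float(intercept)


if __name__ == "__main__":
    # Synthetic curve (assumed values): p_miss = 0.5 * exp(-0.3 * n).
    n = np.arange(1, 21)
    rate, intercept = fit_error_exponent(n, 0.5 * np.exp(-0.3 * n))
    print(rate, intercept)
```

The fitted rate plays the role of the error exponent, while the intercept captures the constant term that a purely asymptotic analysis leaves out.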