CS-PLM: Compressive Sensing Data Gathering Algorithm Based on Packet Loss Matching in Sensor Networks

Data transmission in Wireless Sensor Networks (WSNs) often suffers from errors and packet losses caused by environmental interference. To address this problem, we propose a Compressive Sensing data gathering algorithm based on Packet Loss Matching (CS-PLM). It is proven that, under tree routing, packet loss on communication links severely undermines the data reconstruction accuracy of the Compressive Sensing (CS) based data gathering process. It is further pointed out that the packet loss in CS based data gathering exhibits a correlation effect. We design a sparse observation matrix based on packet loss matching and verify that the designed matrix satisfies the Restricted Isometry Property (RIP) with a probability arbitrarily close to 1. In addition, reliable transmission of the compressed data is guaranteed by adopting multipath backup routing among CS nodes. Simulation results show that, with a link packet loss ratio of 60%, the CS-PLM algorithm can still ensure effective reconstruction of the data gathered by the CS algorithm, with a relative reconstruction error lower than 5%. The proposed algorithm therefore effectively alleviates the sensitivity of CS based data gathering to packet losses on unreliable links.


Introduction
The nodes in wireless sensor networks (WSNs) are usually densely deployed, so the gathered data contains a lot of redundancy, which wastes the energy of the nodes. Compressive sensing (CS) is a new technique which can largely reduce the sampling frequency and execute the sampling process in parallel with the compression process. As a result, this technique has drawn much attention from researchers.
In order to balance and reduce the energy consumption of the nodes as well as prolong the network lifetime, researchers have proposed data gathering algorithms based on compressed sensing. At present, most of these algorithms focus on how to effectively reduce the network energy consumption and extend the network lifetime [1][2][3]. For example, it was proposed in [4] to employ sparse measurement matrices to reduce the communication cost of each measurement. The spatial and temporal correlation of the sensing data was exploited in [5] to improve the compression ratio and further reduce the number of measurements. A multilevel hierarchical clustering topology was employed in [6] to gather the data in the network and to reduce the number of packets sent and received at each layer of nodes. As a result, the total number of transmitted packets is reduced in the entire network. It was pointed out in [7] that the block diagonal measurement matrix could guarantee the reconstruction accuracy with a smaller number of measurements and a longer network lifetime.
In recent years, with the progress in the theory of CS, researchers have started to investigate CS based data gathering algorithms for practical applications. It was considered in [8] that the data sparsity in realistic data sets varies with time and space. Therefore, it was proposed to employ the autoregressive (AR) model to predict data changes and adaptively adjust the number of measurements to achieve the optimal reconstruction performance. It was pointed out in [9] that the environmental noise of the wireless links has a prominent influence on the transmission of the undersampled CS data in the network. An approximate gradient descent algorithm was therefore proposed to reconstruct the compressed data under the influence of noise. For the packet loss problem of wireless links in practical application scenarios, there are relatively few works studying CS based data gathering algorithms. Due to the dynamics and asymmetry of wireless links, channel interference, improper antenna direction and height, etc., unreliable links are often the key issue faced by data gathering algorithms in practical sensor networks [10].
Wireless Communications and Mobile Computing
For the CS based data gathering problem on unreliable links, we propose a compressive sensing data gathering algorithm based on packet loss matching (CS-PLM). In this algorithm, the nodes in the network are divided into two types, i.e., the traditional forwarding (TF) nodes and the compressive sensing (CS) nodes. The packet losses of TF nodes are not correlated, while the packet losses of CS nodes are strongly correlated. In the process of CS data gathering, one packet loss leads to the loss of the data gathered from multiple nodes; this packet loss correlation effect is caused by the superimposed transmission of the data collected from each node of the multihop link during the CS compression sampling process. The closer the node with the packet loss is to the Sink, the greater the effect of the packet loss. In particular, if the packet of a one-hop neighbor node of the Sink is lost, the correlation effect will result in the loss of the data collected by some nodes of the whole network. In contrast, TF nodes only relay data in the traditional way of data gathering, so their packet losses are not correlated. As a result, we design a sparse measurement matrix based on packet loss matching to recover the gathered data if packet losses occur at TF nodes. The recovery problem for lost packets is thereby transformed into a sparse matching sampling process in CS. For the lost packets at CS nodes, however, we design a multipath backup transmission scheme to guarantee reliable data transmission and avoid the correlation effect of packet loss. In this way, the impact of unreliable links on the CS based data gathering process is alleviated and the reconstruction accuracy is guaranteed.
The main contributions of this paper are as follows.
(A) By analyzing the routing with tree structure, it is pointed out that packet loss seriously undermines the reconstruction accuracy of the CS based data gathering process and that the packet loss in the CS based data gathering process exhibits the correlation effect.
(B) We design an SPLM measurement matrix and further prove that this matrix satisfies the Restricted Isometry Property (RIP) with a probability arbitrarily close to 1.
(C) We propose a multipath backup routing transmission scheme based on Hybrid CS to guarantee the reliable handover of the CS projection data.

Related Works
Due to its feature of simple encoding and complex decoding, compressed sensing theory has been widely applied to data collection in WSNs. At present, the research on CS based data collection algorithms in wireless sensor networks mainly focuses on how to use CS technology to reduce the network energy consumption of the data gathering process. Most of these works assume an ideal network link, so that the impact of packet loss on the CS based data gathering process is ignored. The number of transmitted data packets under the tree topology, with and without a CS based data gathering algorithm, was presented in [11]. Furthermore, a hybrid CS based data gathering method was proposed which combines the conventional relay based data gathering method with the CS based data gathering method. It was shown that the network energy consumption could be further reduced because fewer data packets are transmitted in the proposed protocol.
The application of CS under the clustering routing structure was investigated in [2,12]. Since single-hop transmission is employed in the clustering topology, the packet loss on the links does not exhibit the correlation effect. As a result, the CS based data gathering algorithm is insensitive to the packet loss on the links. A CS based data gathering algorithm was proposed in [13] for unreliable links under the cluster topology, where the column vectors of the measurement matrix are adjusted according to the nodes with packet loss in the cluster. In this way, the influence of packet loss on CS based data reconstruction can be alleviated. The tree-like multihop routing topology is often used for large-scale WSNs. The CS based data gathering algorithm with multihop routing was studied in [14,15], where the unreliability of the wireless link was ignored while special attention was paid to the optimal matching between the measurement matrix in CS and the structure of the tree-like routing topology. However, the transmission of CS data packets under multihop routing requires the weighted superposition of the data from multiple nodes, so the original data of multiple nodes can be lost once a single packet is lost. Therefore, the transmission of CS data packets is highly sensitive to packet loss under multihop routing.
It was shown via simulations in [16][17][18] that the data reconstruction accuracy under the tree topology can be seriously affected by the packet losses on the link, and the Sparsest Random Scheduling (SRS) was further proposed for CS based data gathering in lossy WSNs. In this protocol, a sparsest measurement matrix is constructed according to the reception condition at the Sink end; this matrix is then employed to reconstruct the original sensing data of all the nodes in the network and to alleviate the influence of packet losses on CS data reconstruction [19][20][21][22]. However, this algorithm is limited to application scenarios where the spatial correlation of the sensing data in the network is relatively strong.
There are other reliability mechanisms for data gathering in conventional WSNs, such as ARQ, multipath transmission, and network coding. However, there are relatively few works studying reliable CS based data gathering algorithms in WSNs. Furthermore, the CS based data gathering algorithm is much more sensitive to packet losses than conventional methods. Therefore, the study of CS based data gathering algorithms on unreliable links is quite meaningful to the application of CS theory in practical scenarios.
In wireless sensor networks, serious packet loss undermines the communication performance, the service quality, and the application effect of the network. In recent years, research on CS based data gathering has mostly assumed an ideal link. However, because of the dynamic characteristics and asymmetry of the wireless link, channel interference and conflicts, and improper antenna direction and height, unreliable links are commonly encountered in practical applications. There are many methods to ensure reliable link transmission in the traditional data gathering methods of wireless sensor networks, but, to the best of our knowledge, there is little work on reliable data gathering methods based on compressed sensing. In addition, the sensitivity of the CS data gathering method to link packet loss is much higher than that of traditional data gathering, so the research on compressed sensing data gathering algorithms under unreliable links is of great significance to the application of compressed sensing technology in real sensor networks.

Network Model and Problem Description
CS is a new technique which samples a sparse signal at a frequency below the Nyquist sampling frequency and achieves the projective transformation of the target signal from a high-dimensional space to a low-dimensional one. The accurate reconstruction of the compressed signal is achieved via an optimal reconstruction algorithm. Due to its excellent compression performance, CS is widely studied and applied in many areas.
Assume that $N$ nodes are randomly deployed in the WSN and the gathered data is denoted as $d = (d_1, d_2, \ldots, d_N)^T$. Assuming that $d$ is sparse with respect to the base $\Psi_{N \times N}$, i.e., $d = \Psi \cdot s$ with a sparse coefficient vector $s$, and that the measurement matrix is $\Phi = (\varphi_{ij})_{M \times N}$, the received vector $y_{M \times 1}$ can be expressed as $y = (y_i)_{M \times 1} = \Phi \cdot d = \Phi \cdot \Psi \cdot s$. The Sink node can reconstruct the original data with certain accuracy by solving the following optimization problem:

$$\min_{s} \|s\|_{p} \quad \text{s.t.} \quad y = \Theta \cdot s,$$

where $\Theta = \Phi \cdot \Psi$ is the sensing matrix and $\|s\|_{p}$ is the $l_p$ norm of the vector $s$, which is defined as

$$\|s\|_{p} = \Big( \sum_{i=1}^{N} |s_i|^{p} \Big)^{1/p}.$$

In the data gathering process of WSNs, each round of CS based data gathering is performed with $M$ independent measurements, which is expressed as follows:

$$y_i = \sum_{j=1}^{N} \varphi_{ij} d_j, \quad i = 1, 2, \ldots, M.$$

Assume that $N$ ordinary sensor nodes and one immobile Sink node are deployed in the WSN. All the sensor nodes are uniformly and randomly deployed with fixed locations in a monitoring area of size $a \times a$. The Sink node is at the center of the monitoring area, while the sensor nodes periodically gather the sensing data and transmit it to the Sink node. Furthermore, the transmission power of the sensor nodes can be adjusted dynamically and adaptively. The Sink node is assumed to have strong computation capability so that it can periodically gather and reconstruct the sensing data and acquire the location information of all the nodes in the network. The Minimum Spanning Tree (MST) routing is established by all the nodes in the network to perform data gathering, i.e., a connected undirected graph $G = (V, E(q))$ is constructed, where $V = \{v_1, v_2, \ldots, v_N\}$ is the set of sensor nodes, $E(q) = \{e_1(q), e_2(q), \ldots, e_N(q)\}$ is the set of links in the MST, and $e_i(q)$ indicates that the link is connected with probability $q$. If we set $p = 1 - q$, then $p$ indicates the packet loss ratio of the link.

In addition, the CS technique is employed for data gathering in the WSN, which exhibits the following features: (A) The Discrete Fourier Transform (DFT) is employed as the sparse transformation base $\Psi_{N \times N}$ of the sensing data vector. The sparse transformation and the orthogonal sparse base are presented in (5) and (6), respectively. When $M$ measurements are received at the Sink end, we employ the Orthogonal Matching Pursuit (OMP) algorithm to reconstruct the original sensing data. (B) The relative error defined in (7) is adopted as the metric of the CS based reconstruction accuracy, and a lower relative error means a more accurate reconstruction. If the relative error is higher than 5%, the reconstruction is considered a failure.
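The measurement and reconstruction steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the simulation code of this paper: a real DCT basis stands in for the DFT base so that all arithmetic stays real, the measurement matrix has random ±1 entries, the OMP loop stops when the residual vanishes, and the names `dct_basis`, `omp`, and `rel_err` are hypothetical.

```python
import numpy as np

def dct_basis(n):
    """Orthonormal DCT synthesis basis; columns are basis vectors, so d = psi @ s."""
    idx = np.arange(n)
    psi = np.cos(np.pi * (2 * idx[:, None] + 1) * idx[None, :] / (2 * n))
    psi[:, 0] *= np.sqrt(1.0 / n)
    psi[:, 1:] *= np.sqrt(2.0 / n)
    return psi

def omp(theta, y, max_atoms, tol=1e-9):
    """Greedy OMP: add the best-correlated column, refit by least squares, repeat."""
    support, coef = [], np.zeros(0)
    residual = y.copy()
    for _ in range(max_atoms):
        if np.linalg.norm(residual) <= tol * np.linalg.norm(y):
            break  # measurements are fully explained
        idx = int(np.argmax(np.abs(theta.T @ residual)))
        if idx in support:
            break  # no new column helps any further
        support.append(idx)
        coef, *_ = np.linalg.lstsq(theta[:, support], y, rcond=None)
        residual = y - theta[:, support] @ coef
    s_hat = np.zeros(theta.shape[1])
    s_hat[support] = coef
    return s_hat

rng = np.random.default_rng(0)
N, M, K = 64, 40, 3                      # nodes, measurements, sparsity (assumed values)
psi = dct_basis(N)
s = np.zeros(N)
s[rng.choice(N, size=K, replace=False)] = rng.uniform(1.0, 2.0, size=K)
d = psi @ s                              # sensing data, sparse in the chosen basis
phi = rng.choice([-1.0, 1.0], size=(M, N))
y = phi @ d                              # M weighted sums received at the Sink
s_hat = omp(phi @ psi, y, max_atoms=M // 2)
d_hat = psi @ s_hat
rel_err = np.linalg.norm(d - d_hat) / np.linalg.norm(d)
```

With a loss-free link the relative error is far below the 5% failure threshold; the interesting case, treated next in the text, is what happens when some of the terms in the weighted sums never arrive.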
The CS based data gathering process on unreliable links under the tree-like topology is illustrated in Figure 1.
If packet loss occurs on the link between $S_5$ and the Sink end, all the packets corresponding to $S_5$ as well as to the child nodes of $S_5$ will be lost, as shown in the frame in Figure 1. Therefore, the loss is more serious if the packet loss occurs on a link closer to the Sink end. Furthermore, since weighted superposition data packets are transmitted between nodes in the CS based data gathering process, the Sink end also receives a weighted superposition of data after each measurement. As a result, the Sink node cannot tell whether a packet loss has occurred or how many packets were lost. It assumes that it has received the weighted superposition of the data of all the nodes in the network as the current measurement, and performs the reconstruction on this basis to recover the original sensing data X.
Therefore, the CS based data gathering on unreliable links exhibits the following features: (A) One packet loss on the link results in the data loss of multiple nodes; i.e., the packet loss exhibits the correlation effect. (B) The Sink node has no information on the packet loss situation of the nodes in the network and regards the measurement data of the nodes in the network as the data projection to perform reconstruction.
That is, the sensing data for compression does not match the sampling of the measurement matrix.
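The correlation effect can be illustrated with a small sketch. The six-node tree below is a hypothetical example (it is not the topology of Figure 1), node 0 plays the role of the Sink, and `delivered_nodes` is an illustrative helper name: a node's contribution reaches the Sink only if every link on its path is intact.

```python
def delivered_nodes(parent, lost_links):
    """Return the nodes whose contribution still reaches the Sink (node 0)."""
    delivered = set()
    for node in parent:
        cur, reached = node, True
        while cur != 0:
            if (cur, parent[cur]) in lost_links:
                reached = False      # a lost link on the path drops the whole partial sum
                break
            cur = parent[cur]
        if reached:
            delivered.add(node)
    return delivered

# Hypothetical tree: child -> parent, with the Sink as node 0.
parent = {1: 0, 2: 1, 3: 1, 4: 2, 5: 0, 6: 5}

leaf_loss = delivered_nodes(parent, {(4, 2)})        # loss far from the Sink
near_sink_loss = delivered_nodes(parent, {(1, 0)})   # loss on a link next to the Sink
```

A loss on the leaf link removes only node 4, whereas a loss on the link next to the Sink removes nodes 1 to 4 at once, which is the behaviour described in feature (A).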

Design for SPLM Measurement Matrix.
In order to solve the mismatch problem between the sensing data and the sampling of the measurement matrix, we design a sparse measurement matrix based on Packet-loss Matching (SPLM).
In each measurement, the information of the nodes with packet loss is omitted by the measurement matrix. As a result, the packet loss problem for CS based data gathering under the tree topology is transformed into a measurement matrix projection problem based on sparse matching. Hence, large-scale measurement and sampling are accomplished for the data in the network; meanwhile, erroneous judgement of the data gathering situation can be avoided at the Sink end.
The detailed realization of this process is as follows.
Definition 1 (link state matrix, LSM). The LSM is defined as the $M \times N$ matrix which records the link state information, where $M$ is the number of measurements and $N$ is the number of nodes in the network. The entries of the LSM are defined as follows:

$$\mathrm{LSM}_{ij} = \begin{cases} 1, & \text{the packet of node } j \text{ is received in the } i\text{th measurement}, \\ 0, & \text{the packet of node } j \text{ is lost in the } i\text{th measurement}. \end{cases}$$

Definition 2 (dense random projections, DRP). Each row of the DRP matrix contains $O(N)$ nonzero elements, and the DRP matrix is usually constructed with entries that take the values $+1$ and $-1$ with equal probability.

The sparse measurement matrix based on packet loss matching employs the randomness of the packet losses on realistic links to construct the random sparse measurement matrix. The construction is achieved by multiplying the LSM with the DRP element-wise:

$$\Phi_{\mathrm{SPLM}} = \mathrm{LSM} \odot \mathrm{DRP}.$$

If the packet loss ratio of the unreliable link is $p$, then each entry of the SPLM matrix is distributed as follows:

$$\varphi_{ij} = \begin{cases} +1, & \text{with probability } (1-p)/2, \\ 0, & \text{with probability } p, \\ -1, & \text{with probability } (1-p)/2. \end{cases} \tag{11}$$

The design of the measurement matrix should guarantee that the matrix satisfies the RIP constraint with respect to most orthogonal bases. However, verifying the RIP condition is an NP-hard problem. It was pointed out in [11] that if the measurement matrix is full-rank, then, after the projection by this matrix, the data can be accurately reconstructed with a probability arbitrarily close to 1. Since each element of the SPLM matrix follows the discrete random distribution in (11), each row of $\Phi_{\mathrm{SPLM}}$ can be regarded as a random sequence generated by the random variable $\varphi_{ij}$, which can be further expressed as a discrete random process $\{X(n), n = 1, 2, \ldots, N\}$.

Theorem 3. For a matrix $\Phi_{\mathrm{SPLM}} = (\varphi_1, \varphi_2, \ldots, \varphi_M)^T$ whose rows $\varphi_i$ are independently and identically distributed (i.i.d.) random discrete sequences, if the random variable $\varphi_{ij}$ which constitutes each sequence follows the distribution in (11), the matrix $\Phi_{\mathrm{SPLM}}$ is full-rank with a probability arbitrarily close to 1.
Proof. Assume that a matrix $\Phi_{\mathrm{SPLM}}$ satisfying the conditions above is not full-rank, i.e., for the $i$th row of the matrix there exists a set of coefficients such that

$$\varphi_i = a_1 \varphi_1 + a_2 \varphi_2 + \cdots + a_{i-1} \varphi_{i-1} + a_{i+1} \varphi_{i+1} + \cdots + a_M \varphi_M, \tag{12}$$

and not all of the coefficients $a_1, a_2, \ldots, a_M$ are zero. Define the random process $\{X(n), n = 0, 1, \ldots, N\}$ as the row vector $\varphi_i$, with the mean and variance functions given in (13). Define the random process $\{Y(n), n = 0, 1, \ldots, N\}$ as $a_1 \varphi_1 + a_2 \varphi_2 + \cdots + a_{i+1} \varphi_{i+1} + \cdots + a_M \varphi_M$; its mean and variance functions are given in (14). Therefore, $X(n)$ and $Y(n)$ denote different random processes. For the discrete random process $X(n)$, the possible values of the random variable $X(i)$ are $X(i) \in \{+1, -1, 0\}$, so the size of its state space is $3^N$. For the discrete random process $Y(n)$, the possible values of the random variable $Y(i)$ are $-M+1 \le Y(i) \le M-1$ with $Y(i) \in \mathbb{Z}$. Therefore, the size of its state space is $(2M-1)^N$.
Define the case that (12) holds as event A, the case that not all of the coefficients $a_1, a_2, \ldots, a_M$ are zero as event B, and the case that only one of the coefficients $a_1, a_2, \ldots, a_M$ is nonzero as event C. The probability of event A can then be analyzed through the conditional probability $P(A \mid C)$. Solving for $P(A \mid C)$ can be transformed into solving for the probability that two i.i.d. random processes $X_1(n)$ and $X_2(n)$ have the same state simultaneously. According to the distribution in (11), different states in the state space of the random process $X(n)$ generally have different probabilities. For simplicity of analysis and without loss of generality, we set the parameter in (11) as $p = 1/3$, so that all $3^N$ states are equally likely and two i.i.d. processes coincide with probability $3^{-N}$, which vanishes as $N$ grows. Therefore, the probability of event A is extremely small; i.e., the original assumption does not hold and the matrix $\Phi_{\mathrm{SPLM}}$ is full-rank with a probability close to 1.
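The construction of the SPLM matrix and the full-rank claim can be checked numerically with a short sketch. The assumptions here are ours: link states are drawn independently for every entry with loss ratio p (so the correlation effect is not modelled), the DRP entries are ±1 with equal probability, and the sizes M = 20 and N = 100 are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)
M, N, p = 20, 100, 1.0 / 3.0           # measurements, nodes, link packet loss ratio

# Link state matrix: 1 where the packet was received, 0 where it was lost.
lsm = (rng.random((M, N)) >= p).astype(int)

# Dense random projection matrix with +1 / -1 entries.
drp = rng.choice([-1, 1], size=(M, N))

# SPLM: element-wise product, so lost packets are zeroed out of the projection.
splm = lsm * drp

rank = np.linalg.matrix_rank(splm)
zero_fraction = float(np.mean(splm == 0))
```

In this sketch the fraction of zero entries stays close to p and the rank equals M, which is consistent with Theorem 3; a single random draw is of course an illustration, not a proof.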
To evaluate the performance of the SPLM matrix, the classical CS data gathering algorithm, i.e., the CDG algorithm [18], is selected for comparison. Three different methods of packet loss processing are compared, as shown in Figure 2. CDG-DRP means that the CDG algorithm is applied to an unreliable tree topology where packet losses exhibit the correlation effect, and a dense measurement matrix is used to measure the data of the entire network. CDG-SPLI indicates that the CDG algorithm is applied to the unreliable tree topology where packet losses exhibit the correlation effect, and the sparse measurement matrix based on packet loss tags is used to measure the data of the entire network. CDG-SPLI-NC indicates that the CDG algorithm employs the sparse measurement matrix based on packet loss tags to measure the network data without considering the correlation effect of packet losses.
It is shown in Figure 2 that the reconstruction accuracy of the CDG-SPLI algorithm and the CDG-SPLI-NC algorithm is significantly higher than that of the CDG-DRP algorithm, which utilizes a dense measurement matrix. This is because the Sink end of the CDG-DRP algorithm misjudges the data packet reception condition, whereas this problem is avoided by the CDG-SPLI and CDG-SPLI-NC algorithms, which employ the sparse matrix for measurement. It can be seen that the misjudgment of the data packet reception condition at the Sink end can seriously deteriorate the reconstruction accuracy of the CS based data gathering algorithm. As for the two algorithms employing the sparse measurement matrix, the CDG-SPLI-NC algorithm does not consider the correlation effect of the packet losses and its CS based data reconstruction exhibits a high accuracy: the relative reconstruction error can be as low as 1.8% when the packet loss ratio is 40%. However, for the CDG-SPLI algorithm, where the correlation effect exists, the relative reconstruction error exceeds 5% when the packet loss ratio is beyond 10%; i.e., the reconstruction fails. This is because the correlation effect causes the loss of a large part of the network data when the packet loss ratio is high, and the resulting severe lack of CS measurements further deteriorates the reconstruction accuracy. Hence, in the data gathering process with correlated packet losses, it is not sufficient to simply solve the misjudgment of the packet reception situation at the Sink end.

CS-PLM Algorithm.
For the unreliable link under the tree topology, the CS based data gathering not only exhibits the correlation effect for packet losses but also suffers from the misjudgment of the data packet reception situation at the Sink end. However, the SPLM matrix can only solve the misjudgment at the Sink end. Therefore, additional mechanisms are needed to solve the problem of the correlation effect in the process of CS based data gathering. Essentially, the correlation effect of the packet loss is caused by the weighted superposition of data packets during the CS based data gathering process, which is also the advantage of CS based data gathering. Therefore, the most effective method for solving the correlation effect is to guarantee the reliability of link transmission and avoid the appearance of correlated loss. However, the cost of guaranteeing reliable transmission on every link of the network is huge. To reduce the maintenance cost of the network, and according to the performance analysis of the SPLM measurement matrix, this paper designs a hybrid CS method for the data gathering in the network and divides the nodes of the entire network into traditional forwarding (TF) nodes and CS nodes. A TF node only forwards data in the traditional data gathering manner, and its packet loss does not exhibit the correlation effect. A CS node transmits and receives data in the CS based data gathering manner, and its packet losses are correlated. Therefore, for the data gathering between TF nodes, simply adopting the SPLM measurement matrix can overcome the impact of packet losses on CS based data reconstruction. For CS nodes, however, in addition to using the SPLM measurement matrix, a corresponding mechanism must be designed to ensure the reliability of data transmission between CS nodes.
This paper designs a transmission mechanism based on multipath backup routing to ensure the reliability of data transmission between CS nodes. Under normal conditions, the CS node uses MST routing to transmit and receive data packets. If a packet loss occurs on the CS link, the transmitting node of the data packet chooses another transmission path and uses this backup path to send data packets to the Sink. The transmitting node can be seen as the source while the Sink is the destination node. There are many ways to construct the route from the source to the destination node. To reduce the energy consumption of the backup path transmission, the minimum energy consumption spanning tree is employed for constructing the routing. The energy consumption model of the network is as follows:

$$E(d) = \varepsilon_1 + \varepsilon_2 d^{n},$$

where $\varepsilon_1$ is the energy consumption coefficient of the circuit, $\varepsilon_2$ is the power amplifying coefficient, $d$ is the transmission distance, and $n$ is the path loss factor with $2 \le n \le 5$. The value of $n$ is usually taken as 2 in free space. Therefore, the minimization of the energy consumption for the routing can be modelled as the optimization problem in (20), where $d_0$ is the distance from the source node to the destination node, $K$ is the number of hops, and $d_i$ is the link distance of the $i$th hop:

$$\min \sum_{i=1}^{K} \left( \varepsilon_1 + \varepsilon_2 d_i^{n} \right) \quad \text{s.t.} \quad \sum_{i=1}^{K} d_i = d_0. \tag{20}$$

The above optimization problem is solved using the Lagrangian multiplier method. The energy consumption of the network is minimal if and only if every hop between the source node and the destination node covers the same distance. The value of the characteristic distance $d_{char}$ is further presented in (21). Finally, the optimal number of hops $K_{opt}$ is determined from $d_0$ and $d_{char}$.
We propose a centralized construction method for the backup path, which can be divided into the following four steps:
(A) According to the location information of each node, the Sink node calculates the distance $d$ from itself to the nodes.
(B) The Sink end compares the distance $d$ to each node with the characteristic distance $d_{char}$. If $d \le d_{char}$, a single-hop backup path is constructed. Otherwise, the optimal number of hops $K_{opt}$ is calculated first. Then, according to $K_{opt}$ and the equal distance principle for each hop on the link from the CS nodes to the Sink end, the ideal locations of the relay nodes are derived.
(C) According to the ideal locations of the relay nodes, the Sink end chooses the CS nodes nearest to the ideal locations as the relay nodes.
(D) The constructed routing with the minimal energy consumption is broadcast from the Sink node to the CS nodes and the related relay nodes.
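The four construction steps above can be sketched in a few lines. Everything numeric here is an assumption made for illustration: the per-hop energy is taken as eps1 + eps2 * d**n, the characteristic distance is the value that minimizes the total energy of equal-length hops under that model, the hop count is rounded up from d0 / d_char, and the function names, coefficients, and candidate coordinates are hypothetical.

```python
import math

def characteristic_distance(eps1, eps2, n):
    """Hop length minimizing K * (eps1 + eps2 * (d0 / K) ** n) over K (assumed model)."""
    return (eps1 / (eps2 * (n - 1))) ** (1.0 / n)

def plan_backup_path(source, sink, candidates, eps1, eps2, n=2):
    """Steps (A)-(C): distance check, hop count, ideal relay spots, nearest CS nodes."""
    d0 = math.dist(source, sink)
    d_char = characteristic_distance(eps1, eps2, n)
    if d0 <= d_char:
        return [source, sink]                 # single-hop backup path
    hops = math.ceil(d0 / d_char)             # assumed rounding rule for the hop count
    path = [source]
    for k in range(1, hops):
        # Ideal relay location: equally spaced on the segment from source to Sink.
        ideal = tuple(s + (t - s) * k / hops for s, t in zip(source, sink))
        relay = min(candidates, key=lambda c: math.dist(c, ideal))
        if relay not in path:
            path.append(relay)
    path.append(sink)
    return path

eps1, eps2 = 50e-9, 100e-12                   # illustrative coefficients only
candidates = [(81, 1), (59, -2), (41, 0), (22, 3), (50, 50)]
long_path = plan_backup_path((100, 0), (0, 0), candidates, eps1, eps2)
short_path = plan_backup_path((10, 0), (0, 0), candidates, eps1, eps2)
```

With these coefficients the characteristic distance is about 22.4, so a source 100 units away is served by five hops through the four candidates nearest to the ideal positions, while a source 10 units away sends directly. Step (D), the broadcast of the resulting route, is not modelled.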
The CS-PLM algorithm first divides the nodes in the entire network into TF nodes and CS nodes. The MST routing tree is built for the nodes in the entire network, where the number of child nodes of node $i$ is denoted as $c_i$. Specifically, $c_i = N$ for the Sink end and $c_i = 0$ for the nodes at the end of the links. Define $M - 1$ as the threshold for discriminating the node types; the decision is made by each individual node. If $c_i > M - 1$, node $i$ participates in the data gathering in the CS manner and is defined as a CS node. If $c_i \le M - 1$, the node participates in the data gathering in the traditional relay manner and is defined as a TF node. Therefore, in the data gathering process, the number of transmitted packets $PN(i)$ for node $i$ is

$$PN(i) = \begin{cases} c_i + 1, & c_i \le M - 1, \\ M, & c_i > M - 1. \end{cases}$$

During the data gathering process of the CS-PLM algorithm, the TF nodes gather data along the MST routing in the traditional relay manner, as shown by the white nodes in Figure 3, and their packet loss does not exhibit the correlation effect. The CS nodes, however, gather data along the MST routing in the CS manner, as shown by the black nodes in Figure 3, and their packet losses are correlated. In the MST routing, a CS node directly connected to TF nodes weights and superimposes all the gathered node data after completing the data gathering for all its child nodes. The superimposed data is then combined into one data packet which is suitable for transmission between CS nodes. Flag bits are included in the node ID part of the data packets from the child nodes to identify the data reception status of each node.
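The node classification rule can be sketched as follows. We assume that the "number of child nodes" counts all descendants in the routing tree (which is what makes the count equal to N at the Sink), and that a TF node forwards its descendants' packets plus its own; the tree, the value M = 3, and the helper names are hypothetical.

```python
def descendants_count(children, node):
    """Number of nodes in the subtree below `node` (excluding the node itself)."""
    return sum(1 + descendants_count(children, c) for c in children.get(node, []))

def classify(children, nodes, m):
    """Map each node to its type and the number of packets it transmits per round."""
    plan = {}
    for i in nodes:
        c_i = descendants_count(children, i)
        if c_i > m - 1:
            plan[i] = ("CS", m)          # sends M weighted sums
        else:
            plan[i] = ("TF", c_i + 1)    # relays descendants' packets plus its own
    return plan

# Hypothetical routing tree: parent -> list of children, with the Sink as node 0.
children = {0: [1, 5], 1: [2, 3], 2: [4], 5: [6]}
plan = classify(children, nodes=[1, 2, 3, 4, 5, 6], m=3)
```

In this example only node 1, with three descendants, crosses the threshold and becomes a CS node; every other node transmits fewer than M packets by plain forwarding.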
In the case of an unreliable link, no processing is performed if packet losses occur on the link between TF nodes. However, if a packet loss occurs on the link between CS nodes, the data packet is directly sent to the Sink end using the minimum energy consumption backup path. After each round of data gathering, the Sink node builds an SPLM measurement matrix according to the packet losses during each measurement process and further employs the SPLM measurement matrix and the $M$ measurements received at the Sink end to reconstruct the original sensing data.
The algorithm operates in three stages. At the first stage, the node networking, the accumulation of the a priori information on the reception state of each link, and the configuration of the node measurement vectors are accomplished. At the second stage, the CS based data gathering is performed on the lossy link, and effective CS based sampling and gathering is achieved for all the nodes in the network. At the third stage, the CS based reconstruction is performed on the sampled data and the original data of the nodes in the network is recovered. The operation of the algorithm is detailed as follows:
(1) Initialization of the sensor network: initiated by the Sink end, the minimum spanning tree (MST) routing is established to achieve the node networking. Then the Sink end broadcasts the heartbeat data packet to all the nodes in the network. On receiving the heartbeat data packet, each node in the network starts its own timer. Within the time period $T_1$, the nodes transmit and relay data packets containing the heartbeat information along the routing to the Sink end. In this process, each node tracks the real-time packet reception state on its own reception link and stores the result in its memory. This result is later employed as the a priori information for the packet loss type prediction, which is based on the sliding window scheme. That is, the reception sequence of each node is initialized by the end of $T_1$. Then the Sink end broadcasts the random seed. On receiving the random seed, node $i$ combines the seed with its own ID to generate a message consisting of the seed and the ID. A column of the measurement matrix, $(\varphi_{i1}, \varphi_{i2}, \ldots, \varphi_{iM})^T$, is then generated for each node and stored in its own memory.
(2) CS based data gathering: in the CS based data gathering process, each node multiplies its gathered data $d_i$ with the corresponding measurement coefficient $\varphi_{ij}$ according to the routing. Then the messages are added up sequentially and relayed to the Sink end. When a node recognizes a packet loss in this process, the probability density functions $f_1(z)$ and $f_2(z)$ of the random variable $z$ under the two different packet loss types are determined according to the current bit error rate $P_b$ of the link and the data packet length. According to (15), the decision threshold for the two types of packet loss is calculated. In the sliding window, the value of the random variable $z$ is calculated and compared with the threshold. When $z$ is below the threshold, the retransmission scheme is employed for packet recovery. Assuming that the maximal number of retransmissions is max_num, the sliding window is updated in real time during the retransmission. If max_num is reached and the received data of the node still remains unrecovered, the value of $z$ is compared with the threshold again according to the latest update of the sliding window. The current packet loss type is then predicted again according to the comparison result and the corresponding recovery scheme is adopted. Specifically, when $z$ is above the threshold, the prediction based on the correlation of the temporal sequence is adopted and a $k$-order temporally correlated sequence $H_k$ is constructed for the node with lost packets. According to (17), the prediction value $\hat{h}_{k+1}$ is calculated for the lost packet and transmitted to the next-hop node, assuming that the predicted packet is identical to the lost packet. The CS based measurement and sampling is thus finished and the prediction order $k$ is updated.
(3) Data reconstruction: when the Sink end receives $M$ measurements in one round of data gathering, the measurement vector $y = (y_1, y_2, \ldots, y_M)^T$ is constructed, and the Sink end reconstructs the measurement matrix $\Phi_{M \times N}$ according to the random seed and the IDs in the whole network. With the sparse base $\Psi_{N \times N}$, the CS reconstruction algorithm is employed to reconstruct the sparse signal $S$, and the original signal vector $d$ is recovered as $d = \Psi \cdot S$.

According to the process stated above, the prediction of the packet loss type, the lost packet retransmission, and the temporal correlation based prediction are simply executed serially on a node, so the execution complexity is $O(1)$. As a result, the algorithm can run online in real time even on performance-limited nodes. For the whole network, at least $N \times M$ data packets have to be received or transmitted in one round of data gathering of the CS-PLM algorithm, where $N$ is the number of nodes in the network and $M$ is the number of measurements. During the operation of the CS based data gathering algorithm, receiving or transmitting one data packet involves one multiplication and some additions, and no complex regressive operations. Furthermore, according to the CS theory, the number of measurements satisfies $M \geq O(K \log N)$, where $K$ is the sparsity degree of the original data and can be regarded as a constant. For the whole network, the complexity of the CS-PLM algorithm is therefore $O(N \log N)$, which indicates that the CS-PLM algorithm causes no additional computation in comparison with conventional dense-projection CS data gathering algorithms.
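The seed-based initialization in step (1) and the sequential aggregation in step (2) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the way the seed and the node ID are mixed, the Gaussian coefficients, the function names, and the chain-shaped route are all assumptions made for the example. It only shows that the Sink end can rebuild the same matrix from the seed and the IDs without receiving it.

```python
import random

def measurement_column(seed, node_id, m):
    """Column of m coefficients for one node, derived from (seed, ID).

    Assumption: seed and ID are mixed into one integer that seeds a
    pseudo-random generator; the text only says they are combined.
    """
    rng = random.Random(seed * 100003 + node_id)
    return [rng.gauss(0.0, 1.0) for _ in range(m)]

def gather_along_route(seed, readings, m):
    """Each node adds coefficient * reading to the partial sums and relays them.

    readings maps node ID -> sensed value d_i, visited in route order.
    """
    partial = [0.0] * m
    for node_id, d_i in readings.items():
        column = measurement_column(seed, node_id, m)
        partial = [p + c * d_i for p, c in zip(partial, column)]
    return partial  # the m measurements delivered to the Sink end

def sink_rebuild_matrix(seed, node_ids, m):
    """The Sink end regenerates the m x n matrix from the seed and the IDs."""
    columns = [measurement_column(seed, i, m) for i in node_ids]
    return [[columns[j][row] for j in range(len(node_ids))] for row in range(m)]

if __name__ == "__main__":
    seed, m = 2024, 4
    readings = {1: 20.5, 2: 21.0, 3: 19.8, 4: 20.1, 5: 20.7}
    y = gather_along_route(seed, readings, m)
    phi = sink_rebuild_matrix(seed, list(readings), m)
    d = list(readings.values())
    y_check = [sum(phi[r][j] * d[j] for j in range(len(d))) for r in range(m)]
    # Largest gap between the relayed sums and the matrix-vector product.
    print(max(abs(a - b) for a, b in zip(y, y_check)))
```

Because every node derives its column locally, only the seed has to be broadcast; the reconstruction step itself (any standard CS solver) is left out of this sketch.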

Performance Evaluation
In order to evaluate the performance of the CS-PLM algorithm, we perform the simulations in MATLAB R2014a. Based on the network model in this paper, we assume that the network is deployed in a 200 × 200 monitoring area in which 400 sensor nodes are placed. The original data sources are assumed to follow a two-dimensional Gaussian distribution.
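A comparable setup can be sketched in a few lines. The sketch below is only one possible reading of the description: uniform random node placement, a single Gaussian-shaped source at the center of the area, and the values of `amplitude` and `sigma` are assumptions that the text does not specify.

```python
import math
import random

def deploy_nodes(n_nodes=400, side=200.0, seed=0):
    """Place nodes uniformly at random in a side x side area (assumed layout)."""
    rng = random.Random(seed)
    return [(rng.uniform(0.0, side), rng.uniform(0.0, side)) for _ in range(n_nodes)]

def gaussian_reading(x, y, center=(100.0, 100.0), amplitude=30.0, sigma=40.0):
    """Reading at (x, y) from a two-dimensional Gaussian-shaped source.

    center, amplitude and sigma are illustrative values, not from the paper.
    """
    dist2 = (x - center[0]) ** 2 + (y - center[1]) ** 2
    return amplitude * math.exp(-dist2 / (2.0 * sigma ** 2))

if __name__ == "__main__":
    nodes = deploy_nodes()
    data = [gaussian_reading(x, y) for x, y in nodes]
    print(len(data), round(min(data), 3), round(max(data), 3))
```

A smooth field of this kind is compressible in a transform basis such as the DCT, which is what makes it a reasonable test signal for CS based gathering.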
The reconstruction accuracy of the CS-PLM algorithm under different network packet loss ratios is illustrated in Figure 4. For comparison, we also investigate the performances of the Sparsest Random Scheduling (SRS) based CS algorithm [2] and the CDG-SPLM algorithm. Figure 4 shows that the reconstruction errors of all three algorithms increase with the packet loss ratio, while the CS-PLM algorithm outperforms the others. The gap in reconstruction error between the CS-PLM algorithm and the others also widens as the packet loss ratio grows. The gap arises because the SRS algorithm employs the sparsest measurement matrix to reconstruct the original data. On the one hand, the amount of sampled data is far from enough for the purpose of measurement. On the other hand, packet loss reduces the number of measurements, so the reconstruction accuracy is further undermined by the lack of samples. Although the CDG-SPLM algorithm overcomes the misjudgment of the packet reception state at the Sink end, the correlation of packet losses in the CS based data gathering process still seriously affects its reconstruction accuracy. In contrast, the proposed CS-PLM algorithm accomplishes as many measurements as possible and overcomes both the misjudgment problem at the Sink end and the packet loss correlation. These improvements guarantee the reliability of the CS based data gathering process on unreliable links and a highly accurate CS based reconstruction. Furthermore, according to the simulation results, the CS-PLM algorithm still guarantees effective reconstruction of the gathered data at a 60% packet loss ratio, which demonstrates the robustness of the proposed algorithm.

In order to evaluate the performance of the algorithm under different degrees of correlation, we choose two data sets with different degrees of sparsity. Figure 5 shows the reconstruction accuracy of the CS-PLM algorithm and the SRS algorithm on these data sets, where the link packet loss ratio is set to 20%. For the same algorithm, a lower sparsity degree improves the reconstruction accuracy, because the data become more strongly correlated as the sparsity degree decreases and, according to the CS theory, more strongly correlated data can be reconstructed more accurately. When the sparsity degree is set to 11, the proposed algorithm only slightly outperforms the SRS algorithm, but when the sparsity degree is 23, the performance of the CS-PLM algorithm is far better than that of the SRS algorithm. This is because, when the data are weakly correlated, the SRS algorithm cannot recover the data precisely from its small number of compressed samples. The CS-PLM algorithm, however, accomplishes as many measurements as possible, so effective reconstruction can be guaranteed for data sets with an ordinary degree of correlation. In general, the CS-PLM algorithm effectively reduces the strong dependency of CS based data gathering algorithms on the correlation of the data set, and reliable CS based data gathering can be guaranteed on unreliable links for data sets with an ordinary degree of correlation.
The network lifetimes of the different algorithms are compared in Figure 6, where the CDG-Retransmission algorithm adds a retransmission scheme to the CDG algorithm to tackle the packet loss on unreliable links. The network lifetime is defined as the number of rounds from the beginning to the first node failure. Figure 6 shows that the proposed CS-PLM algorithm increases the network lifetime by 400% and 140% in comparison with the SRS algorithm and the CDG-Retransmission scheme, respectively. For the SRS algorithm, this is due to its energy imbalance, which leaves capability gaps in the network and causes its early failure. For the CDG-Retransmission algorithm, the retransmission scheme adopted for packet loss recovery causes more packets to be transmitted, and the network lifetime is therefore shortened.
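The lifetime definition above (rounds until the first node fails) can be made concrete with a small sketch. The constant per-round energy cost per node and the function name are assumptions for illustration; the energy model used in the simulations is not restated here.

```python
def network_lifetime(initial_energy, per_round_cost):
    """Number of full rounds completed before the first node runs out of energy.

    initial_energy: starting energy of each node.
    per_round_cost: energy each node spends per round (assumed constant).
    """
    rounds = []
    for energy, cost in zip(initial_energy, per_round_cost):
        if cost <= 0:
            continue  # a node that spends nothing never fails
        rounds.append(int(energy // cost))
    return min(rounds) if rounds else float("inf")

if __name__ == "__main__":
    energy = [2.0, 2.0, 2.0, 2.0]
    balanced = [0.25, 0.25, 0.25, 0.25]
    imbalanced = [0.5, 0.25, 0.125, 0.125]  # same total cost per round
    print(network_lifetime(energy, balanced), network_lifetime(energy, imbalanced))
```

With the same total energy spent per round, the imbalanced profile fails earlier because the lifetime is set by the most heavily loaded node, which is the effect attributed to the SRS algorithm above.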
According to the theory of CS, the correlation between the measurement matrix and the sparse base of the data affects the reconstruction performance of the algorithm. To verify the performance of the CS-PLM algorithm under different sparse bases, we compare two of them, the Discrete Cosine Transform (DCT) and the Discrete Fourier Transform (DFT). The results are shown in Figure 7. Under different packet loss ratios, the proposed algorithm performs better with the DCT base; i.e., the DCT base exhibits a weaker correlation with the SPLM matrix than the DFT base does, and a higher reconstruction accuracy can be obtained by choosing the DCT base to capture the changing sparsity degree of the data.
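The correlation between a measurement matrix and a sparse base is commonly quantified by the mutual coherence, i.e., the largest normalized inner product between a row of the matrix and a basis vector. The sketch below computes it for the DCT and DFT bases. It does not reproduce the paper's result: the sparse random ±1 matrix is only a stand-in for the SPLM matrix, and the sizes and the `keep_prob` value are assumptions.

```python
import cmath
import math
import random

def dct_basis(n):
    """Vectors of the orthonormal DCT-II basis."""
    basis = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        basis.append([scale * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
                      for t in range(n)])
    return basis

def dft_basis(n):
    """Vectors of the unitary DFT basis."""
    return [[cmath.exp(2j * math.pi * k * t / n) / math.sqrt(n) for t in range(n)]
            for k in range(n)]

def sparse_random_rows(m, n, keep_prob, seed=0):
    """m rows with entries in {-1, 0, +1}; stand-in for a sparse measurement matrix."""
    rng = random.Random(seed)
    rows = []
    for _ in range(m):
        row = [rng.choice((-1.0, 1.0)) if rng.random() < keep_prob else 0.0
               for _ in range(n)]
        if not any(row):
            row[rng.randrange(n)] = 1.0  # avoid an all-zero row
        rows.append(row)
    return rows

def mutual_coherence(rows, basis):
    """Largest |<row, vector>| / (||row|| * ||vector||) over all pairs.

    The rows are real, so omitting the complex conjugate does not change
    the magnitude of the inner product.
    """
    best = 0.0
    for row in rows:
        row_norm = math.sqrt(sum(v * v for v in row))
        for vec in basis:
            inner = sum(r * v for r, v in zip(row, vec))
            vec_norm = math.sqrt(sum(abs(v) ** 2 for v in vec))
            best = max(best, abs(inner) / (row_norm * vec_norm))
    return best

if __name__ == "__main__":
    rows = sparse_random_rows(16, 32, keep_prob=0.2)
    print(mutual_coherence(rows, dct_basis(32)), mutual_coherence(rows, dft_basis(32)))
```

A lower value means the matrix and the base are less correlated, which is the condition under which CS theory predicts better reconstruction.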
The impact of the packet loss ratio on the relative reconstruction error is illustrated in Figure 8 for different numbers of measurements. The number of measurements for the CS-PLM algorithm is set to 60, 80, 120, and 140, respectively. Figure 8 shows that when the number of measurements is 60, the relative reconstruction error changes significantly with the packet loss ratio, and an increasing packet loss ratio undermines the recovery accuracy. However, as the number of measurements increases, the impact of the packet loss ratio on the reconstruction accuracy is alleviated. For example, when the number of measurements is 140, the relative reconstruction errors are almost the same for packet loss ratios of 10%, 30%, and 50%. The reason behind this phenomenon is that the proposed CS-PLM algorithm adopts the measurement matrix based on packet loss matching to sample the data in the network. An increase of the packet loss ratio makes the measurement matrix sparser. In order to maintain the reconstruction accuracy, the number of measurements should be increased to compensate for the sparser matrix. Therefore, more measurements can alleviate the impact of the packet loss ratio on the reconstruction accuracy to some extent.

[Figure legend: DCT and DFT bases, each at packet loss rates of 20% and 40%.]
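The link between the packet loss ratio and the sparsity of the measurement matrix can be illustrated with a simplified sketch. It assumes that the entry for each lost contribution is set to zero and that losses are independent per entry; the actual SPLM construction and the correlated losses under tree routing are not modeled, and the function names are hypothetical.

```python
import random

def loss_matched_matrix(m, n, loss_ratio, seed=0):
    """Dense Gaussian matrix whose entries are zeroed where a packet was lost.

    Assumption: every (measurement, node) contribution is lost independently
    with probability loss_ratio.
    """
    rng = random.Random(seed)
    matrix = []
    for _ in range(m):
        row = []
        for _ in range(n):
            coeff = rng.gauss(0.0, 1.0)
            lost = rng.random() < loss_ratio
            row.append(0.0 if lost else coeff)
        matrix.append(row)
    return matrix

def zero_fraction(matrix):
    """Fraction of entries that are exactly zero."""
    total = sum(len(row) for row in matrix)
    zeros = sum(1 for row in matrix for v in row if v == 0.0)
    return zeros / total

if __name__ == "__main__":
    for ratio in (0.1, 0.3, 0.5):
        print(ratio, round(zero_fraction(loss_matched_matrix(60, 400, ratio)), 3))
```

Under this model the fraction of zero entries tracks the loss ratio, so each measurement row carries fewer node contributions as losses grow, and more rows are needed to retain the same amount of information.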
When the decay coefficient of the event source is n=0.01 and the neighborhood range is r=2, the performance of the CS-PLM algorithm is shown in Figure 9. When the bit error rate is relatively low, i.e., $P_b = 10^{-5}$, the performances of the three algorithms (CDG, SRS-DG, and CS-PLM) are almost the same. However, when the bit error rate is as high as $P_b = 10^{-3}$, the CS-PLM algorithm outperforms the others. It is shown in Figure 7 [...] affects the performance of the CDG algorithm. The SNR for data reconstruction of the SRS-DG algorithm is 29.72 dB. Since the SRS-DG algorithm is designed for recovering block loss, a data block is abandoned once a transmission error occurs, and only the error-free nodes are measured with the sparse measurement matrix. The measured information is therefore reduced every time. The lost data block is compensated by increasing the number of measurements in the next round of data gathering, which raises the reconstruction SNR of that later round; as a result, the SNR of the current round of data reconstruction is not high. The data reconstruction SNR obtained by the CS-PLM algorithm is 35.91 dB. The transmission error is predicted by the spatial correlation of the data under certain conditions, and the amount of abandoned information is therefore reduced. Hence, in wireless scenarios with a high bit error rate, the CS-PLM algorithm does not cause additional communication energy consumption and can overcome the impact of erroneous data blocks on data reconstruction. The efficiency of the CS-PLM algorithm is therefore verified.

For the CS-PLM algorithm, under the premise of a constant data compression rate, different network sizes require different numbers of measurements in the data gathering process, which in turn changes the threshold for judging the node type. Therefore, different network sizes lead to different proportions of node types, which affects not only the total packet throughput of the network but also the complexity of constructing the backup paths between CS nodes and the reliability of their transmission. The performance of the algorithm is thus affected by the network size, and the related results are illustrated in Figure 10, where the packet loss ratio on the link is set to 20%. The curves in Figure 10 show that as the number of nodes increases, the proportion of CS nodes in the network decreases. The reason is that, in large-scale WSNs, the branches in tree topologies are rather large and there are many smaller branches, whereas, according to the node classification rules in hybrid CS schemes, usually only the nodes in the main branch are classified as CS nodes. Therefore, the proportion of CS nodes decreases, and so does the complexity of constructing backup paths between CS nodes. According to the curves of the relative reconstruction error in Figure 10, the reconstruction accuracy of the CS-PLM algorithm improves with the increasing number of nodes in the network. This is because, as the proportion of CS nodes in the network decreases, the packet losses occurring in the network are uncorrelated with a larger probability. As a result, the reconstruction is less affected by the reliability of the backup paths and a higher reconstruction accuracy can be guaranteed. According to the simulations above, the CS-PLM algorithm exhibits better performance in large-scale WSNs.

Since the SPLM measurement matrix is designed in the CS-PLM algorithm, the correlation between the measurement matrix and the sparse base will, according to the CS theory, affect the reconstruction performance of the algorithm. In order to investigate the performance of the CS-PLM algorithm under different sparse bases, we adopt the DCT base and the DFT base for comparison, as shown in Figure 11. With different numbers of measurements, the algorithm performs better under the DCT base. Therefore, it can be concluded that the SPLM matrix exhibits a lower correlation with the DCT base. That is, a better reconstruction performance of the algorithm can be ensured by choosing the DCT base to sense the changes in the data sparsity.
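The two quality measures used throughout this section, the relative reconstruction error and the reconstruction SNR in dB, can be computed as sketched below. The definitions are the usual ones (an $\ell_2$ error ratio and a signal-to-error power ratio) and are assumed here, since the section does not restate the formulas it uses.

```python
import math

def relative_error(original, reconstructed):
    """||d - d_hat||_2 / ||d||_2 (assumed definition of the relative error)."""
    num = math.sqrt(sum((a - b) ** 2 for a, b in zip(original, reconstructed)))
    den = math.sqrt(sum(a * a for a in original))
    return num / den

def reconstruction_snr_db(original, reconstructed):
    """10 * log10(||d||^2 / ||d - d_hat||^2) (assumed definition of the SNR)."""
    signal = sum(a * a for a in original)
    noise = sum((a - b) ** 2 for a, b in zip(original, reconstructed))
    return 10.0 * math.log10(signal / noise)

if __name__ == "__main__":
    d = [3.0, 4.0]
    d_hat = [3.0, 4.5]
    print(relative_error(d, d_hat), reconstruction_snr_db(d, d_hat))
```

Under these definitions the two measures are tied by SNR = -20 log10(relative error), so a 5% relative error corresponds to roughly 26 dB.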

Conclusions
In order to address the CS based data gathering problem on unreliable links, we have proposed the CS-PLM algorithm. We designed the SPLM measurement matrix by analyzing the influence of packet loss on CS based data gathering and further verified through simulations that the correlation of packet losses undermines the reconstruction performance of the SPLM measurement matrix. The nodes in the network are therefore divided into TF nodes and CS nodes. Packet losses between TF nodes do not exhibit correlation, so we simply adopt the SPLM measurement matrix to perform the measurement projection. The CS nodes, besides adopting the SPLM measurement matrix for measurement projection, also guarantee the transmission reliability through minimum energy consumption backup paths and thereby avoid correlated packet losses. The simulation results showed that when the packet loss ratio on the link is 60%, the CS-PLM algorithm can still guarantee the effective reconstruction of the compressed data. Compared with other algorithms, the proposed algorithm showed great improvements in reconstruction accuracy and in its tolerance to the sparsity degree of the data set. The simulations also characterized the influence of the network size, the sparse base, and the number of measurements on the performance of the CS-PLM algorithm. Future work may focus on the impact of the packet loss ratio for mobile nodes on the reconstruction accuracy when the network flow is very large or very small.

Figure 1: CS based data gathering on unreliable links.

Figure 2: Performance comparisons for sparse measurement matrices based on packet loss tags.

Figure 4: Reconstruction error with different packet loss ratio.

Figure 5: Network packet loss ratio with different sparsity degree.

Figure 6: Network lifetime for the CS based algorithms.

Figure 9: Performance analyses of the CS-PLM algorithm.

Figure 10: Relative reconstruction error with different network scale.

Figure 11: Relative reconstruction error under different sparse bases.