Data Reduction with Quantization Constraints for Decentralized Estimation in Wireless Sensor Networks

The problem of estimating an unknown vector with a bandwidth-constrained wireless sensor network is considered. In such networks, sensor nodes make distributed observations of the unknown vector and collaborate with a fusion center to generate a final estimate. Due to power and communication bandwidth limitations, each sensor node must compress its data before transmitting it to the fusion center. In this paper, both centralized and decentralized estimation frameworks are developed. A closed-form solution for the centralized estimation framework is derived. The decentralized estimation problem is proven to be NP-hard, and a Gauss-Seidel algorithm to search for an optimal solution is proposed. Simulation results show the good performance of the proposed algorithms.


Introduction
Developments in microelectromechanical systems technology, wireless communications, and digital electronics have enabled the large-scale deployment of low-cost wireless sensor networks (WSNs) using small sensor nodes [1]. In such networks, the distributed sensors collaborate with a fusion center to jointly estimate an unknown parameter. If the fusion center receives the raw measurement data from all sensors directly and processes them in real time, the processing is known as centralized estimation, which has several serious drawbacks, including poor survivability and reliability as well as heavy communication and computational burdens. Since all sensors have limited battery power, their computation and communication capabilities are severely limited; decentralized estimation methods have therefore been widely discussed in recent years [2][3][4][5][6]. In the decentralized estimation framework, every sensor is also a subprocessor: it first preprocesses its measurements according to some criterion and then transmits its locally compressed data to the fusion center. Upon receiving the sensor messages, the fusion center combines them according to a fusion rule to generate the final result. In such networks, less information is transmitted, leading to a significant power-saving advantage that is very important in WSNs.
To minimize the communication cost, only a limited amount of information is allowed to be transmitted through the network; dimensionality reduction estimation methods have therefore attracted considerable attention [7][8][9]. The basic idea of the dimensionality reduction strategy is to prefilter the high-dimensional observation vector by a linear transformation (matrix) that projects the observation onto the subspace spanned by a set of basis vectors and then to filter the result with a low-rank estimator. Indeed, dimensionality reduction estimation and filtering are important for a wide range of signal processing applications where data reduction, robustness against noise, and high computational efficiency are desired.
Quantization is a fundamental element in saving bandwidth by reducing the amount of data used to represent a signal, and it is well studied in digital signal processing and control, where a signal with continuous values is quantized due to the finite word length of a microprocessor [10]. In WSNs, quantization is also necessary to reduce energy consumption: communication consumes the most energy, and the amount of energy consumed is related to the amount of data transmitted. An interesting distributed estimation approach based on the sign of innovation (SOI) has been developed for dynamic stochastic systems in [11], where only a single bit of innovation needs to be transmitted. A general multiple-level quantized innovation Kalman filter for the estimation of linear dynamic stochastic systems has been presented in [12]; the optimal filter is given in terms of a simple Riccati recursion, as in the standard Kalman filter. A random field estimation problem with quantized measurements in sensor networks has been considered in [13]. In the early work [14], the trade-off between dimension reduction and quantization in the minimum mean squared error estimation problem is investigated.
In this paper, different from existing work, dimensionality reduction and quantization for local data compression are considered in an integrated way. Data reduction with quantization constraints for estimating an unknown vector is formulated as an optimization problem. Both centralized and decentralized estimation frameworks are developed. A closed-form solution for the centralized estimation framework is derived. Using computational complexity theory, the intractability of the decentralized estimation problem is established, and a Gauss-Seidel type iteration algorithm to search for an optimal solution is proposed.
The rest of this paper is organized as follows. With a given communication bandwidth, the bit allocation problem is formulated as an optimization problem in Section 2. The closed-form solution for the centralized estimation framework is derived in Section 3. In Section 4, the computational complexity of the decentralized estimation problem is proved to be NP-hard, and a Gauss-Seidel algorithm to search for an optimal solution is proposed. Simulation results are reported in Section 5 to show the performance of our methods. Concluding remarks are given in Section 6.

Problem Formulation
Consider a sensor network deployed with $N$ sensor nodes. Each sensor, say the $i$th sensor, takes an observation $x_i \in \mathbb{R}^{n_i}$ that is correlated with an unknown random parameter $\theta \in \mathbb{R}^{m}$. The observations are transmitted to a fusion center to estimate the unknown parameter $\theta$ under a certain criterion. In this paper, we consider the minimum mean squared error (MMSE) criterion [15, 16].
Through a transform matrix $C_i \in \mathbb{R}^{k_i \times n_i}$, $k_i \le n_i$, each sensor transforms its observation into a $k_i \times 1$ vector $C_i x_i$, after which the transformed vector is quantized into several bits and transmitted to the fusion center. In this paper, we assume that there is no information exchange among sensors. We also assume, without loss of generality, that the unknown parameter $\theta$ and the observations $x_i$ are zero mean. The auto- and cross-covariance matrices $\Sigma_{\theta}$, $\Sigma_{\theta x_i}$, $\Sigma_{x_i x_j}$, $\forall i, j \in \{1, \dots, N\}$, are available at the fusion center (FC). The role of the FC is to combine the received quantized information according to
$$\hat{\theta} = f\bigl(Q(C_1 x_1), \dots, Q(C_N x_N)\bigr),$$
where $f(\cdot)$ is the fusion function and $Q(\cdot)$ is a given quantizer.
Our goal is to design the linear transforms $\{C_1, \dots, C_N\}$ and the fusion function $f(\cdot)$ such that the mean squared error (MSE) is as small as possible under a constraint on the total number of bits that can be transmitted to the FC. Throughout this work, we focus on a linear design of the fusion function $f(\cdot)$, which can be represented in the form
$$\hat{\theta} = \sum_{i=1}^{N} W_i\, Q(C_i x_i).$$
The given quantizer $Q(\cdot)$ is taken to be a minimum squared error distortion quantizer [17]. We assume that the quantizer input vector $y = (y_1, \dots, y_k)^{T}$ is a random vector with uncorrelated components, each with zero mean and variance $\sigma_t^2$. Under the Gaussian assumption, the quantizer output $Q(y)$ is treated as the input corrupted by an independent white noise source $v$,
$$Q(y) = y + v,$$
whose mean is zero and whose covariance is [17]
$$\Sigma_v = \mathrm{diag}\bigl(\sigma_{v,1}^2, \dots, \sigma_{v,k}^2\bigr),$$
where $\sigma_{v,t}^2 = \sigma_t^2\, 2^{-2 b_t}$ is the squared error distortion for the $t$th component and $b_t$ is the number of bits used for the $t$th component $y_t$.
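As a hedged numerical illustration of this noise model (not part of the original development), the sketch below quantizes zero-mean Gaussian samples with a fixed-range uniform quantizer and checks that the empirical distortion decays as $2^{-2 b_t}$, up to a roughly constant quantizer-dependent factor. The quantizer range `y_max` and the midrise construction are assumptions of the sketch, not prescribed by the text.

```python
import numpy as np

def quantize_uniform(y, bits, y_max=4.0):
    """Midrise uniform quantizer with 2**bits levels covering [-y_max, y_max]."""
    step = 2.0 * y_max / (2 ** bits)
    q = (np.floor(y / step) + 0.5) * step
    # clip to the outermost reconstruction levels (overload region)
    return np.clip(q, -y_max + step / 2, y_max - step / 2)

rng = np.random.default_rng(0)
sigma = 1.0
y = rng.normal(0.0, sigma, 200_000)

ratios = []
for b in (4, 6, 8):
    d_emp = np.mean((y - quantize_uniform(y, b)) ** 2)
    d_model = sigma ** 2 * 2.0 ** (-2 * b)   # sigma_t^2 * 2^{-2 b_t}
    ratios.append(d_emp / d_model)
    print(b, d_emp, ratios[-1])              # ratio roughly constant in b
```

The roughly constant ratio plays the role of the quantizer's performance factor; an MMSE-optimal (Lloyd-Max) quantizer, as assumed in [17], would achieve a smaller constant than this simple uniform one.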
Therefore, the mean squared error at the fusion center can be calculated as
$$\mathrm{MSE} = E\Bigl\|\theta - \sum_{i=1}^{N} W_i\, Q(C_i x_i)\Bigr\|^2,$$
where $C_i$ is the local linear transform operator, $W_i$ is the fusion operator at the fusion center, and $b_{i,t}$ is the number of quantization bits for the $t$th element of the vector $C_i x_i$. The optimal estimation of the random vector $\theta$ under individual sensor bandwidth constraints can then be formulated as
$$\min_{\{C_i\}, \{W_i\}, \{b_{i,t}\}} \mathrm{MSE} \quad \text{s.t.} \quad \sum_{t=1}^{k_i} b_{i,t} \le B_i, \quad i = 1, \dots, N. \tag{7}$$
By an appropriate pre- and postwhitening process if necessary, we assume without loss of generality that the auto- and cross-covariance matrices $\Sigma_{\theta}$, $\Sigma_{\theta x_i}$, $\Sigma_{x_i x_j}$, $\forall i, j \in \{1, \dots, N\}$, have full rank and that the elements of the observation vector taken by each sensor are uncorrelated [18].

Centralized Data Reduction with Quantization Constraints
In this section, we consider a simple centralized framework in which the entire observation vector $x$ is available at a single sensor node, so the centralized case of optimization problem (7) simplifies to
$$\min_{W(r),\, \{b_t\}} E\bigl\|\theta - W(r)\, Q(x)\bigr\|^2 \quad \text{s.t.} \quad \sum_{t} b_t \le B,$$
where $W(r)$ is the approximation matrix and $B$ is the total number of bits to be transmitted. The optimal estimate without observation compression in the MMSE sense is
$$\hat{\theta} = W x, \qquad W = \Sigma_{\theta x}\, \Sigma_{x}^{-1},$$
with estimation error covariance matrix
$$\Sigma_e = \Sigma_{\theta} - \Sigma_{\theta x}\, \Sigma_{x}^{-1}\, \Sigma_{x \theta},$$
where $W$ is called the optimal estimation matrix. We write formula (9) as a linear model by introducing an estimation error $e$:
$$\theta = W x + e.$$
We now consider the problem in which the optimal estimation matrix $W$ is replaced by an approximating matrix $W(r)$ of lower rank $r < m$. With a given compressed dimension $r$, we want to find the $W(r)$ that makes the MSE as small as possible. The linear model (11) is modified as
$$\theta = W(r)\, x + e + \bigl(W - W(r)\bigr)\, x.$$
The estimation error covariance matrix can be calculated accordingly; the approximation matrix $W(r)$ introduces an extra variance term
$$\sigma_{\Delta}^2 = E\bigl\|\bigl(W - W(r)\bigr) x\bigr\|^2 = \bigl\|\bigl(W - W(r)\bigr)\Sigma_x^{1/2}\bigr\|_F^2.$$
The matrix $W \Sigma_x^{1/2} = \Sigma_{\theta x}\, \Sigma_x^{-1/2}$ has an SVD of the form
$$\Sigma_{\theta x}\, \Sigma_x^{-1/2} = U \Lambda V^{T}.$$
By minimizing $\sigma_{\Delta}^2$, it is not hard to show that the best approximation matrix is
$$W(r) = U \Lambda_r V^{T} \Sigma_x^{-1/2},$$
where $\Lambda_r$ retains the $r$ largest singular values of $\Lambda$ and sets the rest to zero. The extra variance is then
$$\sigma_{\Delta}^2 = \lambda_{r+1}^2 + \dots + \lambda_{m}^2,$$
where $\lambda_{r+1}, \dots, \lambda_{m}$ are the smallest $m - r$ singular values of $\Sigma_{\theta x}\, \Sigma_x^{-1/2}$.
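A minimal numerical sketch of this reduced-rank construction follows. It assumes, for concreteness, a linear observation model $x = H\theta + w$ with $\Sigma_{\theta} = I$ (an illustrative choice, not fixed by the text above), forms $W$, truncates the SVD of $\Sigma_{\theta x}\Sigma_x^{-1/2}$, and verifies that the extra variance equals the sum of the squared discarded singular values.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 4, 10, 2
H = rng.normal(size=(n, m))
Sx = H @ H.T + np.eye(n)          # Sigma_x for x = H theta + w, Sigma_theta = I
Stx = H.T                         # Sigma_{theta x} = Sigma_theta H^T

# symmetric square root of Sx and its inverse
evals, evecs = np.linalg.eigh(Sx)
Sx_half = evecs @ np.diag(np.sqrt(evals)) @ evecs.T
Sx_half_inv = evecs @ np.diag(1.0 / np.sqrt(evals)) @ evecs.T

W = Stx @ np.linalg.inv(Sx)       # optimal estimation matrix
M = Stx @ Sx_half_inv             # W Sigma_x^{1/2}
U, lam, Vt = np.linalg.svd(M, full_matrices=False)
Wr = (U[:, :r] * lam[:r]) @ Vt[:r] @ Sx_half_inv   # best rank-r approximation

extra = np.linalg.norm((W - Wr) @ Sx_half, 'fro') ** 2
tail = float(np.sum(lam[r:] ** 2))
print(extra, tail)                # the two quantities agree
```

The agreement of the two printed numbers is just the Eckart-Young theorem applied to $W\Sigma_x^{1/2}$, which is what makes the truncated-SVD choice of $W(r)$ optimal.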
Therefore, with the rank-$r$ approximation in place, the optimization problem (18) reduces to allocating the total bit budget $B$ over the retained components. If $r$ is given, we can solve this optimization problem by a Lagrange multiplier [19].
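Since the quantization distortion of each retained component scales as $\sigma_t^2\, 2^{-2 b_t}$, the continuous Lagrangian solution corresponds, for integer bits, to the greedy allocation sketched below (a hedged illustration under that distortion model, not the paper's own algorithm): each additional bit goes to the component whose distortion it currently reduces most.

```python
import heapq

def allocate_bits(variances, B):
    """Greedy integer bit allocation minimizing sum_t sigma_t^2 * 2^{-2 b_t}."""
    bits = [0] * len(variances)
    # marginal gain of adding a bit to component t at level b:
    # sigma_t^2 (2^{-2b} - 2^{-2(b+1)}) = (3/4) sigma_t^2 4^{-b}
    heap = [(-0.75 * v, t) for t, v in enumerate(variances)]
    heapq.heapify(heap)
    for _ in range(B):
        gain, t = heapq.heappop(heap)
        bits[t] += 1
        heapq.heappush(heap, (gain / 4.0, t))  # next bit gains 1/4 as much
    return bits

print(allocate_bits([4.0, 1.0, 0.25], 6))  # -> [3, 2, 1]
```

Note that the resulting per-component distortions $4 \cdot 2^{-6} = 1 \cdot 2^{-4} = 0.25 \cdot 2^{-2}$ are equalized, which is the familiar reverse water-filling behavior: the bandwidth goes first to the largest-variance (most important) components.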

Decentralized Data Reduction with Quantization Constraints
Let us now consider the estimation framework in a multisensor setup, under a total available rate $B$ that has to be shared among all sensors. In the decentralized manner, the $i$th sensor transforms its observation $x_i \in \mathbb{R}^{n_i}$ into a $k_i \times 1$ vector $C_i x_i$ through a transform matrix $C_i \in \mathbb{R}^{k_i \times n_i}$, $k_i \le n_i$, after which the transformed vector is quantized into several bits and transmitted to the fusion center. With the linear fusion rule at the fusion center, the mean squared error at the fusion center is
$$\mathrm{MSE} = E\Bigl\|\theta - \sum_{i=1}^{N} W_i\, Q(C_i x_i)\Bigr\|^2,$$
where $C_i$ is the local linear transform operator, $W_i$ is the fusion operator at the fusion center, and $b_{i,t}$ is the number of quantization bits for the $t$th element of the vector $C_i x_i$. Therefore, the decentralized estimation of the random vector $\theta$ under the total bandwidth constraint can be formulated as
$$\min_{\{C_i\}, \{W_i\}, \{b_{i,t}\}} \mathrm{MSE} \quad \text{s.t.} \quad \sum_{i=1}^{N}\sum_{t=1}^{k_i} b_{i,t} \le B. \tag{20}$$

Theorem 1. The computational complexity of solving problem (20) is NP-hard, even in the absence of channel distortions for the quantization of each sensor.
Proof. We present a simplified formulation to analyze the computational complexity of problem (7). Let the $N$ distributed sensor nodes make observations on a common random parameter vector $\theta \in \mathbb{R}^{m}$ according to
$$x_i = H_i \theta + v_i, \qquad i = 1, \dots, N,$$
where $H_i \in \mathbb{R}^{n_i \times m}$ is the observation matrix and $v_i \in \mathbb{R}^{n_i}$ is additive noise that is zero mean and spatially uncorrelated. According to [18], we can assume that the sensor noises are uncorrelated with the input signal $\theta$. Without loss of generality, we can assume that the unknown parameter vector $\theta$ has autocovariance matrix $\Sigma_{\theta} = I_m$ and that the noise covariance matrix is $\Sigma_{v_i} = \sigma_i^2 I_{n_i}$.
The MSE at the FC can then be calculated as
$$\mathrm{MSE} = E\Bigl\|\theta - \sum_{i=1}^{N} W_i C_i x_i\Bigr\|^2,$$
where the last step of the derivation follows from the independence assumptions and the fact that the autocovariances of $\theta$ and $v_i$ are normalized to $I_m$ and $\sigma_i^2 I_{n_i}$, respectively; the notation $\mathrm{Tr}(\cdot)$ denotes the trace of a matrix, and the subscript $F$ denotes the usual Frobenius norm of a matrix. Therefore, the optimal linear decentralized estimation design problem under the sensor power constraint can be formulated as
$$\min_{\{C_i\}, \{W_i\}} \mathrm{MSE} \quad \text{s.t.} \quad \sum_{i=1}^{N} \mathrm{Tr}\bigl(C_i \Sigma_{x_i} C_i^{T}\bigr) \le B, \tag{24}$$
where $B$ is the total rate constraint, since the transmission power for sensor $i$ to send $C_i x_i$ to the fusion center is linearly proportional to $\mathrm{Tr}(C_i \Sigma_{x_i} C_i^{T})$.
From (23), the MSE at the FC can be rewritten in terms of $\{C_i\}$ and $\{W_i\}$. Following standard results on matrix derivatives of traces [20], we can eliminate the variables $\{W_i\}$ by minimizing the squared error with respect to $\{W_i\}$ in closed form. As a result, the optimization problem (24) is equivalent to minimizing a trace objective over $\{C_i\}$ alone. When each $C_i$ is a vector, problem (27) is equivalent to the "minimum sum of squares" problem, which is NP-complete [21]. Therefore, the computational complexity of solving problem (20) is NP-hard, even in the absence of channel distortions for the quantization of each sensor.
Remark 2. The NP-hardness of optimization problem (20) makes finding the globally optimal solution in polynomial time intractable. Instead of a globally optimal solution, a locally optimal solution may be sufficient in many applications, so an effective heuristic algorithm is needed to search for a good solution of optimization problem (20).
One algorithm that can be used to search for such a solution is the Gauss-Seidel type iteration, which may converge to a locally optimal solution and is widely used in estimation, detection, and classification with sensor networks [22][23][24][25][26][27]. In this paper, a Gauss-Seidel type iteration algorithm is proposed that searches for the optimal solution of that problem sensor by sensor.
Suppose that all sensor nodes except node $i$ have fixed transformation matrices $C_j$, $j \ne i$, $j = 1, \dots, N$. The goal is to determine the optimal $C_i$. From the perspective of the selected node $i$, all other nodes have decided on (arbitrary) suitable approximations of their observations, and the question becomes how to optimally choose the approximation provided by node $i$; without loss of generality we set $i = 1$. The observations taken by sensor node 1 are denoted by $x_1$, and the remaining observations, which may be thought of as being merged into one node, are denoted by $x_2$. In line with this, we partition the covariance matrix of the entire observation vector into four blocks:
$$\Sigma_x = \begin{pmatrix} \Sigma_{x_1 x_1} & \Sigma_{x_1 x_2} \\ \Sigma_{x_2 x_1} & \Sigma_{x_2 x_2} \end{pmatrix}.$$
Denoting by $u$ the residual of $\theta$ after subtracting the fixed contribution of the merged node, the distortion caused by dimension reduction at node 1 is given in (29), and the corresponding optimal estimation matrix is given in (30). Substituting (30) into (29) yields (31); with the shorthand introduced in (32), equation (31) simplifies to (33). Consequently, the sensor-by-sensor optimization problem can be solved, because the question has been reduced to the centralized case. Based on this analysis, it is straightforward to construct a Gauss-Seidel type iteration algorithm that searches for a locally optimal solution of optimization problem (20); we omit the details here.
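A hedged sketch of such a Gauss-Seidel sweep is given below. Quantization is omitted, each product $W_i C_i$ is collapsed into a single rank-$k$ map `F[i]`, and the observation model, dimensions, and sweep count are illustrative assumptions. Each update re-solves one sensor's reduced-rank problem against the residual left by the others, so the recorded MSE is nonincreasing across sweeps.

```python
import numpy as np

rng = np.random.default_rng(2)
m, N, n_i, k, T = 6, 3, 8, 2, 5000
theta = rng.normal(size=(m, T))                 # samples of the unknown vector
H = [rng.normal(size=(n_i, m)) for _ in range(N)]
x = [Hi @ theta + rng.normal(size=(n_i, T)) for Hi in H]

def rank_k_wiener(target, xi, k):
    """Best rank-k map F minimizing E||target - F xi||^2 (sample covariances)."""
    Sx = xi @ xi.T / T
    Sux = target @ xi.T / T
    ev, V = np.linalg.eigh(Sx)
    Sx_half_inv = V @ np.diag(1.0 / np.sqrt(ev)) @ V.T
    M = Sux @ Sx_half_inv
    U, lam, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * lam[:k]) @ Vt[:k] @ Sx_half_inv

F = [np.zeros((m, n_i)) for _ in range(N)]
mses = []
for sweep in range(10):
    for i in range(N):
        # residual after subtracting the other sensors' fixed contributions
        resid = theta - sum(F[j] @ x[j] for j in range(N) if j != i)
        F[i] = rank_k_wiener(resid, x[i], k)    # centralized-case subproblem
    est = sum(F[i] @ x[i] for i in range(N))
    mses.append(np.mean((theta - est) ** 2))
print(mses[0], mses[-1])                        # MSE nonincreasing across sweeps
```

Monotonicity holds because the previous `F[i]` is always feasible for its own subproblem, so each coordinate update cannot increase the overall objective; this is exactly why the Gauss-Seidel iteration converges to a locally optimal solution rather than a global one.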

Simulations
In this section, we implement several simulations to show the performance of our proposed method.Both centralized and decentralized estimation frameworks are considered.

Centralized Estimation Framework.
In the centralized estimation framework, the entire observation is available at a single sensor node and is generated according to
$$x = H\theta + w,$$
where $H \in \mathbb{R}^{n \times m}$ and $w$ is white noise with covariance matrix $\Sigma_w = \sigma^2 I_n$; in addition, $\theta$ and $w$ are uncorrelated. In the simulation, we set $n = 50$, $m = 10$, $\sigma^2 = 1$, and $\Sigma_{\theta} = I_m$, where $H$ is drawn from a standard normal distribution. The estimation performance of the centralized framework under different bandwidth constraints is shown in Figure 1. The bottom solid line is the Cramer-Rao lower bound (CRLB). The dimension reduction and quantization account for the gap between the centralized estimation curve and the CRLB.
We plot the estimation performance for different reduced dimensions in Figure 2. Three bandwidth constraints are considered ($B = 20, 25, 30$). The bottom solid line is the CRLB. The blue line with circles is the MSE for data with dimension reduction only. When quantization is implemented after dimension reduction, the optimal strategy allocates the bandwidth to the most important dimensions; wasting bandwidth on less important dimensions leads to poor performance.
The comparison of estimation performance for different signal-to-noise ratios (SNRs) is shown in Figure 3. The SNR is defined as the ratio of the signal power to the noise power.

Decentralized Estimation Framework.
In the decentralized estimation framework, the distributed sensors collaborate with a fusion center to jointly estimate the parameter $\theta$. Since all sensors have limited battery power, their computation and communication capabilities are severely limited. As a result, local data compression is needed in order to reduce the communication requirements.
Let the 3 distributed sensor nodes make observations on a common random parameter vector $\theta \in \mathbb{R}^{m}$ according to
$$x_i = H_i \theta + v_i, \qquad i = 1, 2, 3,$$
where $H_i \in \mathbb{R}^{n_i \times m}$ is the observation matrix and $v_i \in \mathbb{R}^{n_i}$ is additive noise that is zero mean and spatially uncorrelated; in addition, $\theta$ and $v_i$ are uncorrelated. In the simulation, we set $n_i = 15$, $m = 10$, $\sigma^2 = 1$, and $\Sigma_{\theta} = I_m$, where each $H_i$ is drawn from a standard normal distribution. The comparison of centralized and decentralized estimation performance is shown in Figure 4. The bottom solid line is the CRLB. The centralized and decentralized estimates are plotted, for different bandwidth constraints, as a red dashed line with circles and a blue dotted line with squares, respectively. The decentralized estimation performance is slightly worse than the centralized one, since the Gauss-Seidel method cannot guarantee a globally optimal solution [22].

Conclusion
In this paper, we have considered a bandwidth-constrained sensor network in which a set of distributed sensors and a fusion center collaborate to estimate an unknown vector. With a given communication bandwidth, the bit allocation problem has been formulated as an optimization problem. Both centralized and decentralized estimation frameworks have been developed. A closed-form solution for the centralized estimation framework has been derived. The decentralized estimation problem has been proved to be NP-hard, and a Gauss-Seidel type iteration algorithm to search for an optimal solution has been proposed. Simulation results show the good performance of the proposed algorithms.

Figure 1: Estimation performance for centralized estimation with different bandwidth constraints.

Figure 2: Comparison of estimation performance with different reduced dimensions.

Figure 3: Comparison of estimation performance with different SNR.

Figure 4: Comparison of estimation performance for centralized and decentralized framework.