Block Least Mean Squares Algorithm over Distributed Wireless Sensor Network



Introduction
A wireless sensor network (WSN) consists of a group of sensor nodes that perform distributed sensing by coordinating among themselves through wireless links. Since the nodes in a WSN operate on limited battery power, it is important to design the network so that the required parameter vector is estimated with a minimum of communication among the nodes [1,2]. A number of research papers in the literature address the energy issues of sensor networks. According to the energy estimation scheme based on the 4th power loss model with Rayleigh fading [3], the transmission of 1 kb of data over a distance of 100 m, operating at 1 GHz using BPSK modulation with a bit-error rate of $10^{-6}$, requires 3 J of energy. The same energy can be used to execute 300 M instructions on a 100 MIPS/watt general-purpose processor. Therefore, it is of great importance to minimize the communication among nodes by maximizing local estimation in each sensor node.
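The energy equivalence quoted above can be checked with a short calculation. The sketch below assumes that 1 kb means 1000 bits and that 100 MIPS/watt means $10^{8}$ instructions per second for every watt consumed; the per-bit figure is derived here and is not stated in the source.

```latex
\begin{align*}
100~\text{MIPS/W} &= \frac{10^{8}~\text{instructions/s}}{1~\text{J/s}} = 10^{8}~\text{instructions/J},\\
3~\text{J} \times 10^{8}~\text{instructions/J} &= 3 \times 10^{8}~\text{instructions} = 300~\text{M instructions},\\
\frac{3~\text{J}}{1000~\text{bits}} &= 3~\text{mJ/bit} \quad\Longrightarrow\quad 3 \times 10^{5}~\text{instructions per transmitted bit}.
\end{align*}
```

Under these assumptions, transmitting a single bit costs roughly as much energy as several hundred thousand local instructions, which is the motivation for trading communication for local computation.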
Each node in a WSN collects noisy observations related to certain desired parameters. In the centralized solution, every node in the network transmits its data to a central fusion center (FC) for processing. This approach is not robust to failure of the FC and needs a powerful central processor. Centralized processing also lacks scalability and requires large communication resources [1]. If the intended application and the sensor architecture allow more local processing, the network is more energy efficient than with communication-intensive centralized processing. Alternatively, each node in the network can function as an individual adaptive filter that estimates the parameter from its local observations and by cooperating with its neighbors. There is therefore a need for new distributed adaptive algorithms that reduce the communication overhead in low-power, low-latency systems operating in real time.
The performance of distributed algorithms depends on the mode of cooperation among the nodes, for example, incremental [4,5], diffusion [6], probabilistic diffusion [7], and diffusion with an adaptive combiner [8]. To improve robustness against the spatial variation of the signal-to-noise ratio (SNR) over the network, an efficient adaptive combination strategy has recently been proposed [8]. A fully distributed and adaptive implementation in which each node in the network makes individual decisions is dealt with in [9].
In the block filtering technique [10], the filter coefficients are adjusted once for each new block of data, in contrast to once for each new input sample in the least mean square (LMS) algorithm. The block adaptive filter therefore permits faster implementation while maintaining performance equivalent to that of the widely used LMS adaptive filter. Block LMS algorithms can thus be used at each node to reduce the amount of communication.
With this in mind, we present a block formulation of the existing cooperative algorithms [4,11] based on distributed protocols. The distinctive feature of the adaptive mechanism proposed in this paper is that the nodes of the same neighborhood communicate with each other after processing a block of data, instead of communicating their estimates to the neighbors after every sample of input data. As a result, the average bandwidth for communication among the neighboring nodes decreases by a factor equal to the block size of the algorithm. In real-time scenarios, the nodes in the sensor network follow a particular protocol for communication [12][13][14], where the communication time is much longer than the processing time. The proposed block distributed algorithm balances the message transmission delay and the processing delay by increasing the interval between two messages and by increasing the computational load on each node in the interval between two successive transmissions. The main motivation here is to propose communication-efficient block distributed LMS algorithms (of both incremental and diffusion type). We analyze the performance of the proposed algorithms and compare them with existing distributed LMS algorithms.
The remainder of the paper is organized as follows. In Section 2, we present the BDLMS algorithm and its network global model. The performance analysis of BDLMS and its learning characteristics obtained from a simulation study are presented in Section 3. The performance analysis of BILMS and its simulation results are presented in Section 4. The performance of the proposed algorithms in terms of communication cost and latency is compared with that of the conventional distributed adaptive algorithms in Section 5. Finally, Section 6 concludes the paper.

Block Adaptive Distributed Solution
Consider a sensor network with N sensor nodes randomly distributed over the region of interest. The topology of the sensor network is modeled by an undirected graph. Let G be an undirected graph defined by a set of nodes V and a set of edges E. Nodes i and j are called neighbors if they are connected by an edge, that is, (i, j) ∈ E. We also consider a loop that consists of a set of nodes $i_1, i_2, \ldots, i_N$ such that node $i_k$ is a neighbor of $i_{k+1}$, $k = 1, 2, \ldots, N-1$, and $i_1$ is a neighbor of $i_N$. Every node i ∈ V in the network is associated with a noisy output $d_i$ corresponding to the input data vector $\mathbf{u}_i$. We assume that the noise is independent of both the input and the output data; therefore, the observations are spatially and temporally independent. The neighborhood of node i is defined as the set of nodes connected to node i:

$$\mathcal{N}_i = \{\, j \mid (i, j) \in E \,\}.$$

The objective is to estimate an M × 1 unknown vector $w^\circ$ from the measurements of the N nodes. To estimate it, every node is modeled as a block adaptive linear filter: each node updates its weights using the set of errors observed in the estimated output vector and broadcasts the result to its neighbors. The estimated weight vector of the kth node at time n is denoted by $w_k(n)$. Let $u_k(n)$ be the input data of the kth node at time instant n; the input vector to the filter at time instant n is then

$$\mathbf{u}_k(n) = [u_k(n), u_k(n-1), \ldots, u_k(n-M+1)]^T.$$

The corresponding desired output of the node for the input vector $\mathbf{u}_k(n)$ is modeled as [16,17]

$$d_k(n) = \mathbf{u}_k^T(n)\, w^\circ + \upsilon_k(n),$$

where $\upsilon_k(n)$ denotes a temporally and spatially uncorrelated white noise with variance $\sigma^2_{\upsilon,k}$. The block index j is related to the time index n as

$$n = jL + i, \qquad i = 0, 1, \ldots, L-1,$$

where L is the block length. The jth block contains the time indices $n = jL, jL+1, \ldots, jL+L-1$. Combining the input vectors of the kth node for block j gives the L × M matrix

$$X_k^j = [\mathbf{u}_k(jL), \mathbf{u}_k(jL+1), \ldots, \mathbf{u}_k(jL+L-1)]^T,$$

and the corresponding desired response at the jth block index of the kth node is represented as

$$d_k^j = [d_k(jL), d_k(jL+1), \ldots, d_k(jL+L-1)]^T.$$

Let $e_k^j$ represent the L × 1 error signal vector for the jth block of the kth node, defined as

$$e_k^j = d_k^j - X_k^j w_k^j,$$

where $w_k^j$ is the M × 1 estimated weight vector of the filter when the jth block of data is input at the kth node.
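The block data structures described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names are hypothetical, blocks are indexed from j = 0, and samples before n = 0 are assumed to be zero.

```python
def regressor(u, n, M):
    """Return [u(n), u(n-1), ..., u(n-M+1)], treating samples before n = 0 as zero."""
    return [u[n - m] if n - m >= 0 else 0.0 for m in range(M)]


def block_data(u, d, j, L, M):
    """Return the L x M block matrix X and the length-L desired vector for block j."""
    times = range(j * L, j * L + L)
    X = [regressor(u, n, M) for n in times]
    d_block = [d[n] for n in times]
    return X, d_block


def block_error(X, d_block, w):
    """Return the block error vector e = d - X w."""
    return [dn - sum(x * wi for x, wi in zip(row, w)) for row, dn in zip(X, d_block)]


# Toy data (assumed): a noiseless node whose desired output is d(n) = u(n)^T w_true.
w_true = [1.0, -0.5, 0.25]
u = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
d = [sum(x * wi for x, wi in zip(regressor(u, n, 3), w_true)) for n in range(len(u))]

X, d_block = block_data(u, d, j=1, L=3, M=3)
print(X)                                # rows are the regressors at n = 3, 4, 5
print(block_error(X, d_block, w_true))  # all zeros for the true weights
```

Each row of `X` is one shifted regressor, so a block of L samples is handled as a single matrix-vector operation instead of L separate updates.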
The regression input data and the corresponding desired responses are distributed across all the nodes and are collected in two global matrices:

$$X_g = \mathrm{col}\{X_1^j, X_2^j, \ldots, X_N^j\}, \qquad d_g = \mathrm{col}\{d_1^j, d_2^j, \ldots, d_N^j\}.$$

The objective is to estimate the M × 1 vector w from these quantities, which collect the data across the N nodes. Using this global data, the block error vector for the whole network is

$$e_g = d_g - X_g w.$$

The vector w can now be estimated by minimizing the MSE function

$$\min_{w} E\,\|d_g - X_g w\|^2.$$

The time index is dropped here to simplify the notation. Since the quantities are data collected across the network in block format, it is the block mean square error (BMSE) that is to be minimized. The BMSE is given by [17,18]

$$\mathrm{BMSE} = \frac{1}{L} E\,\|d_g - X_g w\|^2 = \frac{1}{L}\left( E[d_g^T d_g] - R_{dX}^T w - w^T R_{dX} + w^T R_X w \right).$$

Let the input regression data u be Gaussian with the correlation function $r(l) = \sigma^2 \alpha^{|l|}$ in the covariance matrix, where α is the correlation index and $\sigma^2$ is the variance of the input regression data. The correlation and cross-correlation quantities of the blocked and unblocked data are then related as

$$R_X = L R_U, \qquad R_{dX} = L R_{dU},$$

where $R_X = E[X_g^T X_g]$ and $R_{dX} = E[X_g^T d_g]$ are the autocorrelation and cross-correlation matrices for the global data in blocked form. Similarly, the correlation matrices for unblocked data are defined as $R_U = E[U_g^T U_g]$ and $R_{dU} = E[U_g^T d_{U,g}]$, where the global distribution of data across the network is represented as $U_g = \mathrm{col}\{\mathbf{u}_1^T, \ldots, \mathbf{u}_N^T\}$ and $d_{U,g} = [d_1, \ldots, d_N]^T$. By stationarity, $E[d_g^T d_g] = L\,E[d_{U,g}^T d_{U,g}]$ as well. These relations are also valid for the data at individual nodes. The block mean square error (BMSE) in (10) now reduces to

$$\mathrm{BMSE} = E[d_{U,g}^T d_{U,g}] - R_{dU}^T w - w^T R_{dU} + w^T R_U w.$$

Comparing (12) with the MSE of conventional LMS for global data [17,19], it can be concluded that the MSE is the same in both cases. Hence, the block LMS algorithm has properties similar to those of the conventional LMS algorithm. Now, (9) for blocked data can be reduced to a form similar to that for unblocked data:

$$\min_{w} E\,\|d_{U,g} - U_g w\|^2.$$

The basic difference between blocked and unblocked LMS lies in the estimation of the gradient vector used in their respective implementations. The block LMS algorithm uses a more accurately estimated gradient because of time averaging, and the accuracy increases with the block size. Taking into account these advantages of block LMS over conventional LMS, the distributed block LMS is proposed here.
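The time-averaging property of the block gradient can be illustrated numerically. The sketch below assumes the usual squared-error cost, so the per-sample gradient estimate is −2 e(n) u(n) and the block estimate is −(2/L) Xᵀe. The function names and the random test data are hypothetical.

```python
import random


def instantaneous_gradient(u, d, w):
    """LMS gradient estimate from one sample: -2 * e * u, with e = d - u^T w."""
    e = d - sum(ui * wi for ui, wi in zip(u, w))
    return [-2.0 * e * ui for ui in u]


def block_gradient(X, d_block, w):
    """Block LMS gradient estimate: -(2 / L) * X^T e, with e = d - X w."""
    L, M = len(X), len(w)
    e = [dn - sum(x * wi for x, wi in zip(row, w)) for row, dn in zip(X, d_block)]
    return [-(2.0 / L) * sum(X[i][m] * e[i] for i in range(L)) for m in range(M)]


random.seed(0)
L, M = 4, 3
w = [0.2, -0.1, 0.4]
X = [[random.gauss(0.0, 1.0) for _ in range(M)] for _ in range(L)]
d_block = [random.gauss(0.0, 1.0) for _ in range(L)]

# Average the L per-sample gradients, all evaluated at the same weight vector.
per_sample = [instantaneous_gradient(X[i], d_block[i], w) for i in range(L)]
averaged = [sum(g[m] for g in per_sample) / L for m in range(M)]
block = block_gradient(X, d_block, w)
gap = max(abs(a - b) for a, b in zip(averaged, block))
print(gap)  # zero up to floating-point rounding
```

The block estimate is the mean of L noisy per-sample estimates taken at a fixed weight vector, which is why its accuracy improves as L grows.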

Adaptive Block Distributed Algorithms.
In the adaptive block LMS algorithm, each node k in the network receives the estimates from its neighboring nodes after each block of input data in order to adapt to local changes in the environment. Two different types of distributed LMS for WSNs have been reported in the literature, namely, incremental and diffusion LMS [6,19]. These algorithms use conventional LMS for the local learning process, which in turn needs large communication resources. To achieve the same performance with fewer communication resources, the block distributed LMS is proposed here.

The Block Incremental LMS (BILMS) Algorithm.
In the incremental mode of cooperation, information flows sequentially from one node to the adjacent one in the network after one sample of data has been processed [4]. The communication in the incremental mode of cooperation can be reduced if each node needs to communicate only after processing a block of data. For any block of data j, it is assumed that node k has access to the estimate $w_{k-1}^j$ of its predecessor node, as defined by the network topology and constitution. Based on these assumptions, the proposed block incremental LMS algorithm can be stated by reducing the conventional incremental LMS algorithm ((16) in [19]) to a blocked data form as follows:

$$w_0^j = w^{j-1},$$
$$w_k^j = w_{k-1}^j + \frac{\mu_k}{L} \sum_{q=0}^{L-1} \mathbf{u}_k(jL+q)\left( d_k(jL+q) - \mathbf{u}_k^T(jL+q)\, w_{k-1}^j \right), \qquad k = 1, 2, \ldots, N,$$
$$w^j = w_N^j,$$

where $\mu_k$ is the local step size and L is the block size.
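A minimal simulation sketch of the block incremental idea follows. It is not the authors' code: it assumes the block update w + (μ/L) Xᵀ(d − Xw), a ring that is visited in a fixed order, and hypothetical white Gaussian regressors at every node.

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def block_update(w, X, d_block, mu):
    """One block LMS step: w + (mu / L) * X^T (d - X w)."""
    L = len(X)
    e = [dn - dot(row, w) for row, dn in zip(X, d_block)]
    return [wm + (mu / L) * sum(X[i][m] * e[i] for i in range(L))
            for m, wm in enumerate(w)]


def bilms(w_true, num_nodes, num_blocks, L, mu, noise_std, rng):
    """Pass one estimate around the ring; each node refines it with its own block."""
    M = len(w_true)
    w = [0.0] * M
    for _ in range(num_blocks):
        for _ in range(num_nodes):  # nodes are visited in ring order
            # Hypothetical node data: L white Gaussian regressors of length M.
            X = [[rng.gauss(0.0, 1.0) for _ in range(M)] for _ in range(L)]
            d_block = [dot(row, w_true) + rng.gauss(0.0, noise_std) for row in X]
            w = block_update(w, X, d_block, mu)  # result is handed to the next node
    return w


w_true = [0.5, -0.3, 0.8]
w_hat = bilms(w_true, num_nodes=4, num_blocks=300, L=3, mu=0.1,
              noise_std=0.0, rng=random.Random(1))
deviation = max(abs(a - b) for a, b in zip(w_hat, w_true))
print(deviation)  # small in this noiseless run
```

In this sketch a node transmits only once per block of L samples, whereas a sample-by-sample incremental scheme would transmit after every sample.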

The Block Diffusion LMS (BDLMS) Algorithm.
Here, each node k updates its estimate using a simple local rule based on the average of its own estimate and the information received from its neighborhood $\mathcal{N}_k$. For every jth block of data at the kth node, the node has access to a set of estimates from its neighbors in $\mathcal{N}_k$. Similar to block incremental LMS, the proposed block diffusion strategy for a set of local combiners $c_{kl}$ and a local step size $\mu_k$ can be described as a reduced form of conventional diffusion LMS [6,20]:

$$\phi_k^{j-1} = \sum_{l \in \mathcal{N}_k} c_{kl}\, w_l^{j-1},$$
$$w_k^j = \phi_k^{j-1} + \frac{\mu_k}{L} \sum_{q=0}^{L-1} \mathbf{u}_k(jL+q)\left( d_k(jL+q) - \mathbf{u}_k^T(jL+q)\, \phi_k^{j-1} \right).$$

The weight update equation can be rewritten in a more compact form using the data in block format given in (4) and (5) as

$$w_k^j = \phi_k^{j-1} + \frac{\mu_k}{L}\, X_k^{jT} \left( d_k^j - X_k^j \phi_k^{j-1} \right).$$

Comparing (15) with (19) in [21], it is seen that the weight update equation of diffusion LMS has simply been recast in block format.
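The combine-then-adapt structure of the block diffusion strategy can be sketched as follows. The sketch makes several assumptions that are not taken from the paper: uniform combiner weights over each node and its neighbors, a hypothetical four-node ring, white Gaussian regressors, and the block update φ + (μ/L) Xᵀ(d − Xφ).

```python
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def combine(weights, neighbors, k):
    """Uniformly average the estimates of node k and its neighbors (assumed combiner)."""
    group = [k] + neighbors[k]
    M = len(weights[k])
    return [sum(weights[l][m] for l in group) / len(group) for m in range(M)]


def adapt(phi, X, d_block, mu):
    """Block LMS step from the combined estimate: phi + (mu / L) * X^T (d - X phi)."""
    L = len(X)
    e = [dn - dot(row, phi) for row, dn in zip(X, d_block)]
    return [pm + (mu / L) * sum(X[i][m] * e[i] for i in range(L))
            for m, pm in enumerate(phi)]


def bdlms(w_true, neighbors, num_blocks, L, mu, noise_std, rng):
    N, M = len(neighbors), len(w_true)
    weights = [[0.0] * M for _ in range(N)]
    for _ in range(num_blocks):
        # One exchange per block: every node combines the previous block's estimates.
        phis = [combine(weights, neighbors, k) for k in range(N)]
        new_weights = []
        for k in range(N):
            X = [[rng.gauss(0.0, 1.0) for _ in range(M)] for _ in range(L)]
            d_block = [dot(row, w_true) + rng.gauss(0.0, noise_std) for row in X]
            new_weights.append(adapt(phis[k], X, d_block, mu))
        weights = new_weights
    return weights


ring = [[1, 3], [0, 2], [1, 3], [2, 0]]  # hypothetical 4-node ring topology
w_true = [0.5, -0.3, 0.8]
estimates = bdlms(w_true, ring, num_blocks=400, L=3, mu=0.1,
                  noise_std=0.0, rng=random.Random(2))
worst = max(abs(a - b) for w in estimates for a, b in zip(w, w_true))
print(worst)  # small at every node in this noiseless run
```

Because the combination step uses the estimates of the previous block, the neighbors exchange weights once every L samples rather than after each sample.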

Performance Analysis of BDLMS Algorithm
The performance of an adaptive filter is evaluated in terms of its transient and steady-state behaviors, which, respectively, indicate how fast and how well the filter is able to learn. Such performance analysis is usually challenging in an interconnected network because each node k is influenced by local data with local statistics $\{R_{dX,k}, R_{X,k}\}$, by its neighboring nodes through local diffusion, and by local noise with variance $\sigma^2_{\upsilon,k}$. In a block distributed system, the analysis becomes more challenging because the data have to be handled in block form. The key performance metrics used in the analysis are the mean square deviation (MSD), the excess mean square error (EMSE), and the MSE, evaluated both locally at each node and globally for the network. They are the mean-square values of the weight error vector, the a priori error, and the output error, respectively. The local error signals, namely the weight error vector and the a priori error at the kth node for the jth block, are given as

$$\tilde{w}_k^j = w^\circ - w_k^j, \qquad e_{a,k}^j = X_k^j\, \tilde{w}_k^{j-1}.$$

The algorithm described in (15) resembles an interconnection of block adaptive filters, rather than of conventional LMS adaptive filters, among all the nodes across the network.
Since (12) shows that the block LMS algorithm has properties similar to those of the conventional LMS algorithm, the convergence analysis of the proposed block diffusion LMS algorithm can be carried out in the same way as for the diffusion LMS algorithm described in [18,21]. The estimated weight vector for the jth block across the network is defined as

$$w^j = \mathrm{col}\{w_1^j, w_2^j, \ldots, w_N^j\}.$$

Let C be the N × N Metropolis matrix with entries $[c_{kl}]$; the global transition combiner matrix G is then defined as $G = C \otimes I_M$. The diffusion global vector for the jth block is defined as

$$\phi^j = G\, w^j.$$

The input data matrix at the jth block is defined as

$$X^j = \mathrm{diag}\{X_1^j, X_2^j, \ldots, X_N^j\}.$$

The desired block responses at each node k are assumed to obey the traditional data model used in the literature [16][17][18], that is,

$$d_k^j = X_k^j w^\circ + v_k^j,$$

where $v_k^j$ is the background noise vector of length L. The noise is assumed to be spatially and temporally independent with variance $\sigma^2_{\upsilon,k}$. Using the blocked desired response for a single node (17), the global response for the jth block can be modeled as

$$d^j = X^j w_g^\circ + v^j,$$

where $w_g^\circ = \mathrm{col}\{w^\circ, \ldots, w^\circ\}$ is the optimum global weight vector, which stacks one copy of $w^\circ$ for every node, and $v^j = \mathrm{col}\{v_1^j, \ldots, v_N^j\}$ is the additive Gaussian noise for the jth block index.
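The combiner matrix C and the global matrix G = C ⊗ I_M can be built as in the sketch below. It assumes the common form of the Metropolis rule, c_kl = 1/max(n_k, n_l) for connected nodes with n_k counting node k itself, and c_kk chosen so that each row sums to one; the paper does not spell the rule out, and the four-node topology is hypothetical.

```python
def metropolis_matrix(neighbors):
    """Build C from adjacency lists; neighbors[k] excludes node k itself."""
    N = len(neighbors)
    n = [len(neighbors[k]) + 1 for k in range(N)]  # neighborhood size incl. node k
    C = [[0.0] * N for _ in range(N)]
    for k in range(N):
        for l in neighbors[k]:
            C[k][l] = 1.0 / max(n[k], n[l])
        C[k][k] = 1.0 - sum(C[k])  # diagonal is still zero here, so this sums off-diagonals
    return C


def kron_identity(C, M):
    """Return G = C (x) I_M as a dense (N*M) x (N*M) list of lists."""
    N = len(C)
    G = [[0.0] * (N * M) for _ in range(N * M)]
    for k in range(N):
        for l in range(N):
            for m in range(M):
                G[k * M + m][l * M + m] = C[k][l]
    return G


topology = [[1, 2], [0, 2], [0, 1, 3], [2]]  # hypothetical undirected graph
C = metropolis_matrix(topology)
G = kron_identity(C, M=2)
for row in C:
    print([round(c, 4) for c in row])
```

With this rule C is symmetric and its rows sum to one, so multiplying the stacked weight vector by G replaces each node's estimate with a convex combination of the estimates in its neighborhood.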
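Using the relations defined above, the block diffusion strategy in (15) can be written in global form as

$$w^j = G w^{j-1} + \frac{1}{L}\, S X^{jT} \left( d^j - X^j G w^{j-1} \right),$$

where the step sizes of all the nodes are embedded in the matrix

$$S = \mathrm{diag}\{\mu_1 I_M, \mu_2 I_M, \ldots, \mu_N I_M\}.$$

Using (20), it can be written as

$$w^j = G w^{j-1} + \frac{1}{L}\, S X^{jT} \left( X^j w_g^\circ + v^j - X^j G w^{j-1} \right).$$

Mean Transient Analysis.

The mean behavior of the proposed BDLMS is similar to that of diffusion LMS given in [18,21]. With the global weight error vector $\tilde{w}^j = w_g^\circ - w^j$ and $G w_g^\circ = w_g^\circ$, the mean error vector signal is given as

$$E[\tilde{w}^j] = \left( I_{NM} - \frac{1}{L}\, S\, E[X^{jT} X^j] \right) G\, E[\tilde{w}^{j-1}],$$

where

$$E[X^{jT} X^j] = L\, E[U^{jT} U^j] = L\, \mathcal{R}_U.$$

Hence, (28) can be written as

$$E[\tilde{w}^j] = \left( I_{NM} - S \mathcal{R}_U \right) G\, E[\tilde{w}^{j-1}].$$

Comparing (30) with the corresponding relation for diffusion LMS ((35) in [21]), we find that block diffusion LMS and diffusion LMS yield the same characteristic equation for the convergence of the mean. It can therefore be concluded that the block diffusion protocol defined in (15) has the same stabilizing effect on the network as diffusion LMS.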

Mean-Square Transient Analysis.
The variance estimate is a key performance indicator in the mean-square transient analysis of any adaptive system. The variance relation for block data is similar to that of conventional diffusion LMS. Using $E[X^{jT} X^j] = L\,E[U^{jT} U^j]$ from the definition in (32), we obtain a relation that is similar to (45) in [21]. Using the properties of expectation and trace [18], the second term of (31) is then evaluated; in the result, the noise variance vector $n^j$ is not in block form, and the noise is assumed to be stationary Gaussian. Equations (31) and (32) may therefore be combined into a single variance relation. It may be noted that the variance estimate (36) for the BDLMS algorithm is exactly the same as that of DLMS [21]. In the block LMS algorithm, the local step size is chosen to be L times the local step size of diffusion LMS in order to obtain the same level of performance. As the proposed algorithm and the diffusion LMS algorithm have similar properties, the evolution of their variances is also similar. Therefore, the recursion equations of the global variances for BDLMS will be similar to (73) and (74) in [21]. Similarly, the local node performances will be similar to (89) and (91) of [21].

Learning Behavior of BDLMS Algorithm.
The learning behavior of the BDLMS algorithm is examined using simulations. The characteristic (variance) curves are plotted for block LMS and compared with those of DLMS. Row regressors with shift-invariant input [18] are used, with each regressor having the form

$$\mathbf{u}_k(i) = [u_k(i), u_k(i-1), \ldots, u_k(i-M+1)].$$

In block LMS, the regressors for L = 3 and M = 3 are given as

$$X_k^j = \begin{bmatrix} u_k(3j) & u_k(3j-1) & u_k(3j-2) \\ u_k(3j+1) & u_k(3j) & u_k(3j-1) \\ u_k(3j+2) & u_k(3j+1) & u_k(3j) \end{bmatrix}.$$

The desired data are generated according to the model given in the literature [18] for the unknown vector $w^\circ$. The input sequence $\{u_k(i)\}$ is assumed to be time-correlated and is generated as

$$u_k(i) = a_k\, u_k(i-1) + b_k\, n_k(i).$$

Here, $a_k \in [0, 1)$ is the correlation index, $n_k(i)$ is a spatially independent white Gaussian process with unit variance, and $b_k = \sqrt{\sigma^2_{u,k}\,(1 - a_k^2)}$. The regressor power profile is given by $\{\sigma^2_{u,k}\} \in (0, 1]$. The resulting regressors have Toeplitz covariance with correlation sequence $r_k(i) = \sigma^2_{u,k}\, a_k^{|i|}$, $i = 0, \ldots, M-1$. Figure 1 shows the eight-node network topology used in the simulation study. The network settings are given in Figures 2(a) and 2(b).
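The input generation described above can be sketched as a first-order autoregressive process. The sketch assumes the recursion u(i) = a·u(i−1) + b·n(i) with b = sqrt(σ²(1 − a²)), which gives a stationary variance of σ² and a lag-l correlation of σ²·a^|l|. The parameter values and function names are hypothetical.

```python
import math
import random


def correlated_input(num_samples, a, sigma2_u, rng):
    """Generate u(i) = a * u(i-1) + b * n(i) with b = sqrt(sigma2_u * (1 - a^2))."""
    b = math.sqrt(sigma2_u * (1.0 - a * a))
    u_prev = rng.gauss(0.0, math.sqrt(sigma2_u))  # start in the stationary distribution
    samples = []
    for _ in range(num_samples):
        u_prev = a * u_prev + b * rng.gauss(0.0, 1.0)
        samples.append(u_prev)
    return samples


def sample_correlation(u, lag):
    """Time-average estimate of E[u(i) u(i + lag)]."""
    n = len(u) - lag
    return sum(u[i] * u[i + lag] for i in range(n)) / n


rng = random.Random(0)
u = correlated_input(50000, a=0.5, sigma2_u=0.8, rng=rng)
r0 = sample_correlation(u, 0)
r1 = sample_correlation(u, 1)
print(r0)  # close to sigma2_u
print(r1)  # close to sigma2_u * a
```

Drawing the first sample from the stationary distribution avoids a start-up transient, so the whole sequence follows the stated correlation model.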

The Simulation Conditions.
The algorithm is valid for any block length greater than one [10], while L = M is the preferred and optimum choice. The background noise is assumed to be white Gaussian noise of variance $\sigma^2_{\upsilon,k} = 10^{-3}$, and the data used in the study are generated using the data model $d_k(n) = \mathbf{u}_k^T(n)\, w^\circ + \upsilon_k(n)$. To generate the performance curves, 50 independent experiments are performed and averaged. The steady-state results are obtained by averaging the last 50 samples of the corresponding learning curves. The global MSD curve is shown in Figure 3. It is obtained by averaging $E\|\tilde{w}_k^{j-1}\|^2$ across all the nodes over 100 experiments. Similarly, the global EMSE curve, obtained by averaging $E|e_{a,k}^j|^2$, where $e_{a,k}^j = x_k^j\, \tilde{w}_k^{j-1}$, across all the nodes over 100 experiments, is displayed in Figure 4. The global MSE is depicted in Figure 5, which shows that the MSE is the same in both cases. Since the weights are updated and then communicated for local diffusion after every L data samples, the number of communications between neighbors is reduced by a factor of L compared to the diffusion LMS case, where the weights are updated and communicated after each sample of data.
The global performance is the contribution of all the individual nodes and is obtained by taking the mean performance over all the nodes. Simulation results for individual nodes are also provided for comparison with those obtained by diffusion LMS. The local MSD evolution at node 1 is given in Figure 6(a) and at node 5 in Figure 6(b). Similarly, the local EMSE evolution at nodes 1 and 7 is depicted in Figure 7. The convergence speed is nearly the same in both the MSD and the EMSE evolution, but the performance is slightly degraded in the case of BDLMS. This loss of performance in BDLMS can be traded for the large reduction in communication bandwidth.

Performance Analysis of BILMS Algorithm
To show that the BILMS algorithm has guaranteed convergence, we follow the steady-state performance analysis of the algorithm using the same data model as the one commonly used for conventional sequential adaptive algorithms [5,22,23]. The weight-energy relation (40) is derived using the definitions of the weighted a priori and a posteriori errors [18]. Since (40) is similar to (35) in [19], the performance of BILMS is similar to that of ILMS. The variance expression (41) is obtained from the energy relation (40) by replacing the a posteriori error with its equivalent expression and then averaging both sides. The variance relation in (41) is similar to the variance relation of ILMS in [19], whose performance has been studied in detail in the literature. The theoretical performances of the block incremental LMS and conventional incremental LMS algorithms are therefore similar, because both have the same variance expressions. Simulation results validate this analysis.

Simulation Results of BILMS Algorithm.
For the simulation study of BILMS, we use regressors with shift invariance, with the same desired data as in the case of the BDLMS algorithm. The time-correlated sequences are generated at every node according to the network statistics.
The same network has been chosen here for simulation study as defined for block diffusion network in Section 3.3.
In the incremental mode of cooperation, each node receives information from its previous node, updates it using its own data, and sends the updated estimate to the next node. The ring topology used here is shown in Figure 8. We assume the background noise to be temporally and spatially uncorrelated additive white Gaussian noise with variance $10^{-3}$. The learning curves are obtained by averaging the performance of 100 independent experiments, each generated from 5,000 samples in the network. It can be observed from the figures that the steady-state performances achieved by BILMS at different nodes of the network match those of the ILMS algorithm very closely. The EMSE plots, which are more sensitive to local statistics, are depicted in Figures 9(a) and 9(b). A good match between BILMS and ILMS is observed in these plots. In [19], the authors have already shown that the theoretical steady-state nodal performance matches the simulation results. As the MSE roughly reflects the noise power and the plots indicate good performance of the adaptive network, it may be inferred that the adaptive nodes perform well in the steady state.
The global MSD curve shown in Figure 10 is obtained by averaging $E\|\tilde{\psi}_k^{(j-1)}\|^2$ across all the nodes and over 50 experiments. Similarly, the global EMSE and MSE plots are displayed in Figures 11 and 12, respectively. These are obtained by averaging $E|e_{a,k}(j)|^2$, where $e_{a,k}(j) = x_{k,j}\, \tilde{\psi}_k^{(j-1)}$, across all the nodes over 50 experiments.
If the weights are updated after L data points and then communicated to the next node, the number of communications between neighbors is reduced by a factor of L relative to ILMS, where the weights are updated and communicated after processing each sample of data. Therefore, as in BDLMS, the communication overhead of BILMS is reduced by a factor of L relative to that of the ILMS algorithm.
The performance of the two proposed algorithms, BDLMS and BILMS, on the same network is compared in Figures 13-15. One can observe from Figure 13 that the MSE of the BILMS algorithm converges faster than that of BDLMS. Since the same noise model is used for both algorithms, their steady-state performances after convergence are the same. In the MSD and EMSE performances in Figures 14 and 15, however, a small difference is observed. It is due to the different cooperation schemes used by the two algorithms. The diffusion cooperation scheme is nevertheless more adaptive to environmental change than the incremental cooperation scheme.

Performance Comparison
In this section, we present an analysis of communication cost and latency to have a theoretical comparison of the performances of distributed LMS with block distributed LMS.

Analysis of Communication Cost.
Assuming that the messages are of fixed bit width, the communication cost is modeled as the number of messages transmitted to achieve the steady-state value in the network.Let N be the number of nodes in the network, and let M be the filter length.The block length L is chosen to be the same as the filter length.Let h be the average time required for the transmission of one message, that is, for one communication between the nodes [24][25][26].

ILMS and BILMS Algorithms.
In the incremental mode of cooperation, every node sends its own estimated weight vector to its adjacent node in a unidirectional cyclic manner. Since at any instant of time only one node is active and allowed to transmit to only one designated node, the number of messages transmitted in one complete cycle is N − 1. Let K be the number of cycles required to attain the steady-state value in the network. The total number of communications required for the system to converge to the steady state is then given by $K(N-1)$. In BILMS also, at any instant of time, only one node in the network is active and allowed to transmit to one designated follower node, as in ILMS. In BILMS, however, each node sends its estimated weight vector to its follower node only after an interval of L sample periods, that is, after processing a block of L data samples. Therefore, the number of messages sent by a node in this case is reduced to K/L, and accordingly, the total communication cost is given by $K(N-1)/L$.

In the diffusion mode of cooperation, each node transmits its estimate to all the nodes connected to it in the network. So the total number of messages transmitted by all the nodes in a cycle is $\sum_{i=1}^{N} n_i$, where $n_i$ is the number of nodes connected to the ith node, and the total communication cost to attain convergence is given by $K \sum_{i=1}^{N} n_i$. In the proposed block diffusion strategy, the number of connected nodes $n_i$ and the total size of the messages remain the same as in DLMS. In the BDLMS algorithm, however, each node distributes its message only after L data samples. The communication is therefore reduced by a factor equal to the block length, and the total communication cost in this case is given by $(K/L) \sum_{i=1}^{N} n_i$.
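The message counts discussed above can be tabulated with a few helper functions. This is an illustrative sketch with several assumptions: the function names are hypothetical, and so are the node degrees, since the source does not list the connectivity of the simulated network. The cycle counts K = 50 (incremental) and K = 250 (diffusion) and the block length L = 10 follow the numerical study mentioned later in this section, assuming that one cycle corresponds to one input sample.

```python
def ilms_messages(num_nodes, num_cycles):
    """Incremental LMS: N - 1 messages per cycle, one cycle per sample."""
    return (num_nodes - 1) * num_cycles


def bilms_messages(num_nodes, num_cycles, block_length):
    """Block incremental LMS: the same hand-offs, but once per block of L samples."""
    return (num_nodes - 1) * num_cycles / block_length


def dlms_messages(degrees, num_cycles):
    """Diffusion LMS: node i sends to its n_i connected nodes in every cycle."""
    return sum(degrees) * num_cycles


def bdlms_messages(degrees, num_cycles, block_length):
    """Block diffusion LMS: the same exchanges, but once per block of L samples."""
    return sum(degrees) * num_cycles / block_length


N, L = 8, 10
degrees = [2, 3, 3, 2, 3, 3, 2, 2]  # hypothetical n_i for an 8-node network

print(ilms_messages(N, 50), bilms_messages(N, 50, L))
print(dlms_messages(degrees, 250), bdlms_messages(degrees, 250, L))
```

In both cooperation modes the block variant divides the message count by L, while the amount of data per message is unchanged.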

Analysis of Duration for Convergence.
The time interval between the arrival of input at a node and the reception of the corresponding updates by the designated node(s) may be assumed to comprise two major components, the processing delay at the node and the communication delay between nodes. In the case of BDLMS, the total communication delay per cycle is reduced by a factor of L. The mathematical expressions of communication cost and latency for the distributed LMS and the block distributed LMS algorithms are summarized in Table 1. A numerical example is given in Table 2 to show the advantage of the block distributed algorithms over the sequential distributed algorithms. The authors have simulated the hardware for 8-bit multiplication and addition in TSMC 90 nm. The addition and multiplication times are found to be $T_A = 10^{-5}$ ns and $T_M = 10^{-3}$ ns, respectively. We assume the transmission delay h = $10^{-2}$ s. From the convergence curves obtained in the simulation studies, the network attains steady state after 250 input samples in the DLMS case and after 50 input samples in the ILMS case. The filter length M and the block size L are both taken to be 10 in the numerical study.

Conclusion
We have proposed a block implementation of the distributed LMS algorithms for WSNs. The theoretical analysis and the corresponding simulation results demonstrate that the performance of the block distributed LMS algorithms is similar to that of the sequential distributed LMS algorithms. The main advantage of the proposed algorithms is that a node requires L times fewer communications, where L is the block size, than with the conventional sequential distributed LMS algorithms. This is of great advantage in reducing the communication bandwidth and the power consumption involved in the transmission and reception of messages across the resource-constrained nodes of a WSN. In the coming years, with continuing advances in microelectronics, enough computing resources can be accommodated in the nodes to reduce their processing delays, but the communication bandwidth and communication delay could remain the major operational bottlenecks in WSNs. The proposed block formulation would therefore have further advantages over its sequential counterpart in the coming years.

Figure 1: Network topology used for block diffusion LMS.

Figure 2: Network statistics used for the simulation of BDLMS. (a) Network correlation index per node. (b) Regressor power profile.

Figure 5: Global mean-square error (MSE) curve for diffusion and block diffusion LMS.

Table 2: Numerical comparison of performances of sequential and block distributed adaptive algorithms.