We consider the problem of minimum-cost (energy) data aggregation in wireless sensor networks that compute certain functions of the sensed data. We use in-network aggregation, so that data can be combined at intermediate nodes en route to the sink. We consider two types of functions: first, the summation type, which includes

Let us take an example. Consider the network in Figure

An example of computing and communicating a summary.

Next we consider extreme type summary functions such as

Our results show that the summary function

In general the single-sink aggregation problem to minimize (

The case where

Many studies have examined combining data at intermediate nodes in conjunction with routing, with the aim of retrieving the complete sensor readings efficiently. Scaling laws for achievable rates under joint source coding and routing are studied in [

The above works aim at retrieving the

We are given an undirected graph

In this paper a summary function is defined to be a nonnegative function denoted by

Abusing notation for the sake of simplicity, we let the function

We will define the problem of minimizing communication costs as follows. There exists a sink to which the data is to be aggregated. Our goal is to find a minimum-cost aggregation tree

We will use the entropy function

In addition, the measured data is transmitted as a packet in the network. Hence for each packet transmission, there is an overhead of metadata, for example, packet header. For any measurement
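As a rough illustration of the cost model described above, the following sketch computes the Shannon entropy of a discrete measurement and adds a fixed per-packet overhead. The names `entropy_bits`, `packet_bits`, and `overhead_bits`, as well as the 16-bit overhead and the uniform distribution, are assumptions made for this example and do not come from the paper.

```python
import math

def entropy_bits(pmf):
    """Shannon entropy in bits of a discrete distribution given as probabilities."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

def packet_bits(pmf, overhead_bits):
    # Hypothetical cost model: entropy of the payload plus a fixed
    # metadata overhead (for example, a packet header) per transmission.
    return entropy_bits(pmf) + overhead_bits

# A measurement uniformly distributed over four values carries 2 bits.
uniform4 = [0.25] * 4
cost = packet_bits(uniform4, overhead_bits=16)  # assumed 16-bit header
```

Under this sketch the overhead term does not shrink with aggregation, so it can dominate the cost when the payload entropy is small.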

We consider the summary functions of

We first discuss the case where

When the source data is i.i.d., we will show that there exists a randomized algorithm which finds an aggregation tree whose expected cost is within a factor of

Suppose

Goel and Estrin [

The function

For the first property, it trivially holds that

Next we consider

Next we consider the case where

Suppose

Consider the information communicated over Edge

In summary, the key question was whether

Note that some properties regarding

The discussion so far enables us to deal with more general objective functions extended from

In this section we consider summary functions regarding the extreme statistics of measurements, that is,

For extreme-type summary functions, we will show that

We consider the problem of retrieving the maximum of i.i.d. Gaussian RVs. We assume that

The differential entropy of the maximum of a set of i.i.d. RVs distributed according to an RV
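The behavior of this entropy can be checked numerically. The sketch below relies on the standard fact that the maximum of n i.i.d. RVs with CDF F and density f has density n F(x)^(n-1) f(x), and integrates -g ln g with the trapezoidal rule. Unit variance, zero mean, the integration range, and the function names are assumptions made for illustration only.

```python
import math

def std_normal_pdf(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def max_entropy_nats(n, lo=-8.0, hi=8.0, steps=16000):
    """Differential entropy (nats) of the maximum of n i.i.d. N(0, 1) RVs."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        # Density of the maximum: n * F(x)^(n-1) * f(x).
        g = n * std_normal_cdf(x) ** (n - 1) * std_normal_pdf(x)
        term = -g * math.log(g) if g > 0.0 else 0.0
        weight = 0.5 if i in (0, steps) else 1.0  # trapezoidal end weights
        total += weight * term
    return total * h

h1 = max_entropy_nats(1)  # single Gaussian: 0.5 * ln(2 * pi * e)
h2 = max_entropy_nats(2)  # maximum of two Gaussians
```

For n = 1 the result should match the closed form 0.5 ln(2πe), and for n = 2 the value is smaller, which illustrates how the entropy of the maximum can shrink as more readings are combined.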

We consider the problem of

We will show that

Consider the extreme data retrieval problem. The aggregation cost function

Since

On the right of Figure

In general, for a convex and decreasing

Before describing our algorithm, we present the motivation behind it. An important observation for the data aggregation problems was made in [

Consider the three examples of aggregation cost functions denoted by

Aggregation cost functions which are convex and decreasing for

The example provides us with some insights. Since

An outline of the proposed algorithm is as follows. The algorithm consists of multiple stages. A hub-and-spoke problem (or facility location problem) is approximately solved at each stage. The flows from source nodes are merged at the hubs. The hubs at the present stage become the source nodes in the next stage; that is, the flows are merged hierarchically. Instead of solving a complex facility location problem, we find a

At each stage, once the center nodes are determined, we build an approximately optimal Steiner tree with respect to the center nodes and the sink. We use the algorithm in [
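The Steiner tree algorithm cited above is not reproduced here. As a stand-in, the following sketch uses the classical approach of taking a minimum spanning tree over the terminals in the metric completion of the graph, whose weight is known to be at most twice that of an optimal Steiner tree. The function names and the small star-shaped example graph are hypothetical.

```python
def metric_closure(n, edges):
    """All-pairs shortest-path distances (Floyd-Warshall), undirected graph."""
    inf = float("inf")
    d = [[0.0 if i == j else inf for j in range(n)] for i in range(n)]
    for u, v, w in edges:
        d[u][v] = min(d[u][v], w)
        d[v][u] = min(d[v][u], w)
    for k in range(n):
        for i in range(n):
            for j in range(n):
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

def terminal_mst(dist, terminals):
    """Prim's MST over the terminals, using metric-closure distances."""
    terminals = list(terminals)
    in_tree = {terminals[0]}
    tree_edges, total = [], 0.0
    while len(in_tree) < len(terminals):
        # Cheapest edge leaving the current tree.
        w, u, v = min((dist[u][v], u, v)
                      for u in in_tree
                      for v in terminals if v not in in_tree)
        in_tree.add(v)
        tree_edges.append((u, v, w))
        total += w
    return tree_edges, total

# Star graph: node 0 is a non-terminal hub joined to terminals 1, 2, 3.
edges = [(0, 1, 1.0), (0, 2, 1.0), (0, 3, 1.0)]
dist = metric_closure(4, edges)
tree_edges, total = terminal_mst(dist, [1, 2, 3])
```

In this example the terminal MST has weight 4, while the optimal Steiner tree (the star through node 0) has weight 3, so the approximation stays within the factor-of-two guarantee. Each edge of the terminal MST stands for a shortest path in the original graph.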

Each stage outputs an aggregation tree. The output tree at Stage

Hence, over the stages, the algorithm progressively changes the balance between microscopic and macroscopic aspects of cost reduction in the output trees. Roughly speaking, the output trees from later stages are more biased towards the microscopic aspect. After the stages are over, we pick the tree with the minimum cost among the output trees. As a result the algorithm empirically searches for the point of the “best” balance between the two aspects of cost reduction over the stages. Hence one could expect that our algorithm will work well for any convex and decreasing

We present a formal description of the proposed algorithm followed by an explanation of further details. For given aggregation tree

Begin Algorithm

(Metric completion of

(Initialization)

(Initialize flows at sources)

(Initial output is a Steiner tree) Jump to Step 7.

(Minimum weight edge cover) Let us denote the subgraph of

(Node selection) Suppose

Remove all the noncenter nodes from

(Steiner tree construction) Build

(Merging trees) If

If

(Loop) If

(Tree selection) The final output is the tree

that is, the minimum cost tree among the output trees from all the stages.

End Algorithm
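The final tree-selection step can be sketched as follows. The cost model here is an assumption made for illustration: each tree edge contributes its weight times an aggregation cost f(k), where k is the number of sources whose data crosses that edge. The data structures (a parent map per tree), the two candidate trees, and the two example functions `f_dec` and `f_lin` are hypothetical and are not taken from the paper.

```python
def tree_cost(parent, weight, sources, f):
    """Sum of weight(e) * f(k_e) over tree edges that carry data."""
    load = {v: 0 for v in parent}
    for s in sources:
        v = s
        while v is not None:  # walk from the source up to the sink
            load[v] += 1
            v = parent[v]
    return sum(weight[(v, p)] * f(load[v])
               for v, p in parent.items()
               if p is not None and load[v] > 0)

def select_tree(trees, weight, sources, f):
    """Return the name of the minimum-cost tree among the candidates."""
    return min(trees, key=lambda name: tree_cost(trees[name], weight, sources, f))

# Node 0 is the sink; nodes 1 and 2 are sources.
weight = {(2, 1): 1.0, (1, 0): 1.0, (2, 0): 1.5}
trees = {
    "chain": {0: None, 1: 0, 2: 1},  # 2 -> 1 -> 0, data merges at node 1
    "star": {0: None, 1: 0, 2: 0},   # both sources send directly to the sink
}
sources = [1, 2]

f_dec = lambda k: 1.0 / k  # assumed convex, decreasing aggregation cost
f_lin = lambda k: k        # no aggregation gain, shown for contrast

best_dec = select_tree(trees, weight, sources, f_dec)
best_lin = select_tree(trees, weight, sources, f_lin)
```

With the decreasing cost the chain wins (cost 1.5 versus 2.5) because merging flows early pays off, while with the linear cost the star wins (2.5 versus 3). This mirrors why the algorithm keeps one output tree per stage and only chooses among them at the end.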

We explain the details of several steps in the algorithm. In Step 3 the flow variables denoted by

In this section we analyze the performance of the HCST algorithm. For set

For given network graph

Denote the optimal cost by

We first observe that

Next we would like to derive an upper bound for

Now let us consider the cost of output tree at Stage

In conclusion, we have that, from (

An interpretation for ratio

Next we discuss constant

In this section we consider a simple convex and decreasing

Figures

Performance bounds under varying number of sources.

Performance bounds under varying network sizes.

Next we present an example of the application of the HCST algorithm to a specific graph. An example of

(a) A complete graph of the sources. (b) MWEC from Stage 1. (c) MWEC from Stage 2.

Figure

(a) MWEC in

(a) MWEC in

Next consider

In our simulation we randomly generate

An example of randomly generated

We will compare the performance of the HCST algorithm with the HM algorithm [

Energy cost associated with a fixed number of sources.

Figure

Energy cost associated with a varying number of sources.

In this paper we have studied a single-sink aggregation problem for wireless sensor networks computing several widely used summary functions. It is observed that the problem is characterized by the aggregation cost function

The authors declare that there is no conflict of interest regarding the publication of this paper.

This work was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT & Future Planning (NRF-2013R1A1A1062500), and in part by the ICT R&D program of MSIP/IITP (10-911-05-006, High Speed Virtual Router that Supports Dynamic Circuit Network).