We consider the problem of minimum-cost (energy) data aggregation in wireless sensor networks that compute certain functions of sensed data. We use in-network aggregation, so that data can be combined at intermediate nodes en route to the sink. We consider two types of functions: summation-type functions, which include sum, mean, and weighted sum, and extreme-type functions, which include max and min. For both types the problem turns out to be NP-hard. We first show that, for sum and mean, there exist algorithms which approximate the optimal cost to within a factor logarithmic in the number of sources. For weighted sum we obtain a similar result for Gaussian sources. Next we show that the problem for extreme-type functions is intrinsically different from that for summation-type functions. We then propose a novel algorithm based on a crucial tradeoff in cost reduction between local aggregation of flows and finding a low-cost path to the sink; the algorithm is shown to empirically find the best tradeoff point. We argue that the algorithm is applicable to many other similar types of problems. Simulation results show that the proposed algorithm achieves significant cost savings.
1. Introduction
Motivation. In this paper we consider the problem of minimum cost (energy) data aggregation in wireless sensor networks (WSN) where the aggregated data is to be reported to a single sink. A common objective of a WSN is to retrieve a certain summary of the sensed data instead of the entire data set. The relevant summary is defined as a certain function applied to the set of measured data [1]. Specifically, we are given a function g(·) such that, for a set of measurement data x_1,…,x_n, the goal of the sink is to retrieve g(x_1,x_2,…,x_n). Examples of g(·) are mean, max, min, and so forth. When the mean function is used, g(x_1,…,x_n) = (1/n)∑_{i=1}^{n} x_i. For applications such as “alarm” systems, one can use max as g(·), for example, g(x_1,…,x_n) = max_{i=1,…,n} x_i, where the x_i can be temperature values in forest-fire monitoring systems or structural stress values measured in a building. We will refer to g(·) as a summary function throughout this paper. Certain types of g(·) allow us to combine data at intermediate nodes en route to the sink. Such combining techniques are commonly referred to as in-network aggregation [2–4]. By using in-network aggregation one can potentially save communication costs by reducing the amount of traffic [5–7]. For instance, in applications such as wireless multimedia sensor networks (WMSN), where the transmitted multimedia data has a far greater volume than that in typical WSNs, in-network aggregation is crucial for saving energy and extending network lifetime [8, 9]. While in-network aggregation offers many benefits, it poses significant challenges for network design, for example, designing routing algorithms that minimize costs such as energy expenditure and delay. In particular, we show that it is crucial to take into account how the summary function g(·) affects the statistical properties of the sensed data.
Objectives. In this paper we study the minimum cost aggregation problem for several types of g(·). The performance of in-network aggregation relies heavily on the properties of g(·). To be specific, let us briefly look at the problem formulation. Consider the single-sink aggregation problem where we define the cost function as follows. Let E denote the set of links in the network. We would like to minimize

∑_{e∈E} w_e ϕ_e,  (1)

where w_e represents the weight associated with link e and ϕ_e represents the average number of bits transmitted over e. Note that objectives similar to (1) have been considered in [10–14] as well. The most relevant objective associated with (1) is energy consumption. To see this, let us define the weight w_e := k_e d_e^α, where d_e is the distance between the nodes connected by Link e, α is the path loss exponent, and k_e is the related channel parameter. Hence (1) is proportional to the total transmit energy consumed throughout the data aggregation. Note that in [13, 14] the authors consider the same energy cost function. We refer to ϕ_e as the aggregation cost function (we will use the notation ϕ to denote the cost function in general, whereas ϕ_e denotes the cost function specifically on Link e). Note that ϕ_e depends on the source measurements aggregated on e, and also on the summary function g(·) applied to the measurements. The work in [15] also studies an aggregation problem in sensor networks computing summary functions, assuming that all the packets generated in the network have the same size. However, the amount of information generated at intermediate nodes may vary, since a summary of data can be statistically different from the original data, which is our key observation.
Let us take an example. Consider the network in Figure 1, where Nodes 1 and 2 are the source nodes and the shaded node represents the sink. The sink wants to receive a summary of the information from Nodes 1 and 2. The sensor readings generated at Nodes 1 and 2 are represented by the random variables (RVs) X_1 and X_2, respectively. Since Node 1 is a “leaf” node, Node 1 simply transmits the raw reading X_1 to Node 2. Node 2 combines X_1 with its own data X_2 by computing the summary function g(X_1,X_2), which is then transmitted to the sink. We define the aggregation cost function ϕ as follows. Suppose the sensor information to be transmitted on Edge e is the random variable Y. The average number of bits to be transmitted on e, or ϕ_e, is defined as (we temporarily ignore communication overheads incurred in addition to the sensor information, e.g., the packet header size; we will take such overheads into account later when we formally define ϕ)

ϕ_e = H(Y),  (2)

where H(·) denotes the entropy function. Note that the entropy function has also been adopted as the cost function in [10, 12], and throughout this paper we will define ϕ in terms of H(·). The average numbers of bits transmitted on Edges 1 and 2 are, respectively, given by

ϕ_1 = H(X_1),  ϕ_2 = H(g(X_1,X_2)).  (3)

Suppose g(·) is sum. Since H(g(X_1,X_2)) = H(X_1+X_2) ≠ H(X_1), the costs incurred at Edges 1 and 2, that is, ϕ_1 and ϕ_2, are different. If we had used another type of g(·), such as max, we would have ϕ_2 = H(max(X_1,X_2)), which would incur a different cost from the case where g(·) was sum. In many cases we will assume symmetric sources; that is, ϕ depends only on the number of sensor readings to which g(·) is applied. In those cases we will treat ϕ as a function ϕ: Z_+ → R_+; that is, ϕ(m) = H(g(X_1,…,X_m)) (we will also examine the case of asymmetric sources).
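As a concrete numeric illustration of how the choice of g(·) changes ϕ_2, the following sketch computes H(X_1), H(X_1+X_2), and H(max(X_1,X_2)) exactly from the joint distribution. The two i.i.d. six-valued sources are an illustrative choice, not from the paper:

```python
from itertools import product
from math import log2

def entropy(dist):
    """Shannon entropy (bits) of a {value: probability} distribution."""
    return -sum(p * log2(p) for p in dist.values() if p > 0)

# Two i.i.d. uniform readings on {1, ..., 6} as toy stand-ins for X1, X2.
support = range(1, 7)
px = {x: 1 / 6 for x in support}

def pushforward(g):
    """Distribution of g(X1, X2) for independent X1, X2 ~ px."""
    dist = {}
    for x1, x2 in product(support, support):
        v = g(x1, x2)
        dist[v] = dist.get(v, 0.0) + px[x1] * px[x2]
    return dist

h1 = entropy(px)                                      # phi_1 = H(X1)
h_sum = entropy(pushforward(lambda a, b: a + b))      # phi_2 when g = sum
h_max = entropy(pushforward(lambda a, b: max(a, b)))  # phi_2 when g = max

print(h1, h_sum, h_max)  # here H(sum) > H(X1) > H(max)
```

For this source distribution the sum has higher entropy than a single reading while the max has lower entropy, previewing the qualitative difference between the two summary types studied below.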
We will show that g(·) determines properties of ϕ(·) such as convexity and monotonicity, and that the structure of the aggregation problem heavily depends on those properties. Hence the aggregation scheme must be designed to capture the key aspects of the aggregation cost function under the given summary function. The links among summary functions, cost functions, and optimal aggregation strategies described above have not previously been well studied, as we will see in Section 2 when reviewing related work.
Figure 1: An example of computing and communicating a summary.
Contributions. In this paper we investigate the minimum energy aggregation problem for several widely used summary functions. We consider two types of g(·). The first type, called the summation type, involves sums of measurements: specifically sum, mean, and weighted sum. The second type, called the extreme type, relates to the extreme statistics of the data: specifically max and min. We will use the entropy function as the measure of information rate. We show that, when g(·) is sum or mean and the source data is i.i.d., ϕ is concave and increasing, irrespective of the distribution of the source data. This implies that one can use well-known algorithms such as the Hierarchical Matching (HM) algorithm [16] to approximate the optimal cost. When g(·) is weighted sum, however, it is unclear how to associate the flow aggregation problem with the cost function. Nonetheless we prove that, if the source data consists of independent Gaussian random variables, there exists an efficient algorithm for the problem of aggregating a weighted sum of data with arbitrary weights.
Next we consider extreme type summary functions such as max. We will show that, for certain distributions of source data, ϕ can be convex and decreasing in the (nonzero) number of aggregated measurements. The single-sink aggregation problem for concave/increasing [16–20] or convex/increasing cost functions [21, 22] has been widely studied; however, convex and decreasing ϕ has not yet been well studied. We propose a novel algorithm which effectively captures such properties of ϕ. We begin by observing that there are two aspects to cost reduction. Since ϕ is convex and decreasing, ϕ decreases faster when the number of aggregated data items is smaller. The intuition is that it pays to locally aggregate data among nearby sources in the early stages of aggregation, that is, when the number of measurements aggregated at sensors is small. This leads us to find a low-cost local clustering of sources, which is a “microscopic” aspect of cost reduction. Meanwhile, we need to simultaneously find a low-cost route to the sink, which must take the global structure of the network into account and is thus a “macroscopic” aspect of cost reduction. These aspects conflict, and a good tradeoff point between them should be sought. To that end we propose the Hierarchical Cover and Steiner Tree (HCST) algorithm. The algorithm consists of multiple stages and is designed to empirically find the best tradeoff point over the stages. We show by simulation that the algorithm can significantly reduce cost compared to baseline schemes such as a greedy heuristic using shortest path routing, or the HM algorithm.
Our results show that the summary function g(·) can significantly impact the design of aggregation schemes. However, there are many choices for g(·): suppose, for example, we would like to compute the L_p norm of the vector of measurement data. Note that the sum and max functions studied in this paper are in fact related in this way: if the measurement data is always positive, then sum is simply the L_1 norm and max is the L_∞ norm of a data vector. One could ask: what are good aggregation strategies if we take g(·) to be a different L_p norm, say the L_2 norm? We leave such questions as future work.
Paper Organization. We briefly review related work in Section 2. Section 3 introduces the model and problem formulation. Sections 4 and 5 discuss the optimal routing problem for summation and extreme type summary functions, respectively. Simulation results are presented in Section 6. Section 7 concludes the paper.
2. Related Work
In general the single-sink aggregation problem of minimizing (1) is NP-hard [23], and a substantial amount of research has been devoted to designing approximation algorithms exploiting certain properties of ϕ. In our case it is important to note that such properties of ϕ are determined by the choice of g(·). Let us briefly review the related work on the single-sink aggregation problem for two types of ϕ. Most research on the single-sink aggregation problem has focused on the case where ϕ is concave and increasing. Due to the concavity of ϕ, the link costs associated with the amount of aggregated data exhibit economies of scale; that is, the marginal cost of adding a flow at a link is cheaper when the current number of aggregated flows at the link is greater. Buy-at-bulk network design [23, 24] is based on this property of ϕ. A number of approximation algorithms have been proposed, for example, [17–19]. When ϕ is known in advance, a constant-factor approximation to the optimal cost is possible [20, 25]. Even when ϕ is unknown but concave and increasing, Goel and Estrin [16] have proposed a simple randomized algorithm called Hierarchical Matching (HM). The algorithm computes minimum weight matchings of the source nodes hierarchically over stages, and outputs a tree for aggregation. The HM algorithm approximates the optimal cost to within a factor logarithmic in the number of sources [16]. Nonuniform variants of this problem in which ϕ differs among the links have also been studied [26, 27], in which a polylogarithmic approximation to the optimal cost is shown to be achievable.
The case where ϕ is convex and increasing in the number of aggregated measurements has been studied in [21, 22]. Here ϕ exhibits diseconomies of scale; that is, the marginal cost of routing a flow at a link is more expensive when a greater number of flows are aggregated at the link. Such a phenomenon can be observed in many applications, such as speed scaling of microprocessors modeled by ϕ(m) = βm^a, where m is the clock speed, β ≥ 0 and a ≥ 1 are constants, and ϕ(m) is the energy consumption at the processor. Notably, the authors show that the problem can intrinsically differ from that for a concave and increasing ϕ. For example, they show that constant-factor approximation algorithms do not exist for certain convex and increasing ϕ [21]. They nevertheless propose a constant-factor approximation algorithm for the case ϕ(m) = βm^a. These results show that the single-sink aggregation problem crucially depends on properties of ϕ such as convexity. However, none of the above works deal with a convex and decreasing ϕ, which we study in the sequel.
There have been many studies on combining intermediate data in conjunction with routing for efficient retrieval of the complete sensor readings. Scaling laws for achievable rates under joint source coding and routing are studied in [28]. The work [11] studies the problem of minimizing flow costs under distributed source coding. The authors show that when ϕ(m) is linear in m, it is optimal to first apply Slepian-Wolf coding at the sources and then route the coded information via a shortest path tree from the sources to the sink. In [10] a single-input coding model was adopted, in which coding of information among the nodes can be done only in pairs, and joint coding of the source data from more than two nodes is not allowed. Assuming that the reduction in packet size is a linear function of the correlation coefficient between each pair of nodes, the authors proposed a minimum-energy routing algorithm. The impact of spatial correlation on routing has been explored in [12]. The authors showed that, assuming the correlation decays over distance, it pays to form clusters of nearby nodes and aggregate data at the clusterheads. The aggregated information is then routed from the clusterheads to the sink. The algorithm is shown to perform well for various correlation models. The tradeoff between integrity of aggregated information and energy consumption has been studied in [29]. Further works on in-network aggregation combined with routing include [30, 31], which propose efficient protocols for routing excessive values among the sensed data. A scheme using spatially adaptive aggregation to mitigate traffic congestion was proposed in [32].
The above works aim at retrieving the entire data set, instead of a summary, subject to certain degrees of data integrity. In our case, we design energy-efficient aggregation schemes to compute the summary function g(·) of the sensor readings. In the above-mentioned works, in-network aggregation reduces cost mainly by removing correlation among the data. In our work, by contrast, we focus on losslessly retrieving a summary of statistically independent sensor readings. We assume independence of the sensor readings because we would like to decouple the cost savings from removing correlation from the savings from applying the summary function in association with aggregation strategies; we focus on the latter. Moreover, the assumption of independence among the readings represents the “worst” case in terms of cost savings, since one cannot reduce the energy cost by removing correlations in the sensor readings. In fact, the independence assumption can be valid in certain cases. For example, consider a large sensor network in which the sensed data is spatially correlated and the correlation decays quickly over distance. If the source nodes are sparsely deployed and thus tend to be far apart from one another, the correlation among their data can be very weak. Such sparse node placement is naturally motivated by cost efficiency: sparse placement enables us to gather as much information as possible from a fixed number of sensor devices, assuming that the network senses a homogeneous field and the measure of information is given by the joint entropy function.
3. Model
3.1. Preliminaries
We are given an undirected graph G=(V,E), where V={1,2,…,n} and E⊆V×V denote the set of vertices and edges, respectively. For u,v∈V, (u,v)∈E denotes the (undirected) edge connecting nodes u and v. With each edge in E we associate a weight defined by w: E→R_+. A weight captures the cost of transmitting a unit amount of data between two nodes, for example, the expenditure of transmission energy needed to compensate for path loss. The set S⊂V denotes the set of source nodes, that is, the nodes which generate measurement data to be reported to the sink. Also define η := |S|, where |·| denotes the cardinality of a set. For a source node u∈S, its measured data is modeled by an RV denoted by X_u. We assume that the X_u’s are independent and identically distributed among the sources. The measured data is to be aggregated at the sink node denoted by t∈V. The nodes which are not source nodes act as relays in the aggregation process. For simplicity we will assume that any node in the network transmits data at most once during the aggregation process. Such an assumption has been made in other works such as [15]. Thus the routes for aggregation constitute a tree whose root is t. We refer to such a tree as an aggregation tree. The aggregation process is performed as follows. The sources initiate transmissions. An intermediate node waits for all the data from the sources which are its descendants to arrive. The node then computes the summary function of the aggregated data, which is relayed to the next hop.
In this paper a summary function is defined to be a nonnegative function g(·) which is divisible. Divisible functions are a class of summary functions which can be computed in a divide-and-conquer manner [1]. They are defined as follows: given n data samples, consider a partition of the samples into sets of size k and n−k denoted by {x_1,x_2,…,x_k} and {x_{k+1},…,x_n}, respectively. If g(·) is divisible, g(g(x_1,…,x_k), g(x_{k+1},…,x_n)) = g(x_1,x_2,…,x_n) holds for any n and k. Examples of divisible functions are sum, max, and min. When g(·) is divisible, the aggregation can be performed in a divide-and-conquer manner as follows. Suppose a set of data samples is aggregated at a node. If the node is a source, it applies g(·) to the collected samples and its own data. If the node is simply a relay, it applies g(·) to the aggregated data samples to obtain a summary, and transmits that summary to the next hop.
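The divisibility identity above can be checked numerically. The sketch below (the helper `is_divisible` is an illustrative name, not from the paper) verifies g(g(x_1,…,x_k), g(x_{k+1},…,x_n)) = g(x_1,…,x_n) over random data and random split points for sum and max, and shows that mean fails it:

```python
import random

def is_divisible(g, trials=200):
    """Check g([g(x[:k]), g(x[k:])]) == g(x) on random data and splits."""
    rng = random.Random(0)
    for _ in range(trials):
        n = rng.randint(2, 10)
        x = [rng.random() for _ in range(n)]
        k = rng.randint(1, n - 1)
        if abs(g([g(x[:k]), g(x[k:])]) - g(x)) > 1e-12:
            return False
    return True

mean = lambda v: sum(v) / len(v)
print(is_divisible(sum))   # True: sum is divisible
print(is_divisible(max))   # True: max is divisible
print(is_divisible(mean))  # False: mean is not divisible in general
```

The failure of mean motivates the scaling trick used later in Section 4.1, which reduces mean to the divisible function sum.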
Abusing notation for the sake of simplicity, we let the function g(·) take a set, a vector, or their combination as its argument. For example, if g(·) is sum, g(x,y)=x+y, g({x,y})=x+y, and g({x,y},z)=x+y+z. For some U⊆V, we define X_U as the set of RVs representing the measurements from the nodes in U; that is, X_U := {X_u | u∈U}. Thus g(X_U) is the summary function applied to the set X_U; for example, if g(·) is sum, then

g(X_U) = ∑_{u∈U} X_u.  (4)
3.2. Problem Formulation
We define the problem of minimizing communication costs as follows. There exists a sink at which the data is to be aggregated. Our goal is to find a minimum-cost aggregation tree T=(V′,E′)⊆G rooted at the sink. We would like to solve the following aggregation problem:

(P)  Minimize_{T=(V′,E′)⊆G} ∑_{e∈E′} w_e ϕ_e,  (5)

where ϕ_e represents the average number of bits communicated over Edge e. Note that the objective of (5) has been considered in [10, 12] as well. We call ϕ_e the aggregation cost function, which we define as follows.
We will use the entropy function H(·) as our measure of information rate similar to works [13, 14]. We assume that the average number of bits to represent random sensor measurement X is given by H(X). A precise definition of the entropy function H(X) depends on the nature of X: if X is a discrete RV, H(X) denotes the usual Shannon entropy. If X is a continuous RV, H(X) is implicitly defined to be H(X^) where X^ is a discrete RV obtained by applying uniform scalar quantization to X with some quantization step size, say 2-b for some integer b>0. If the quantization precision is sufficiently high, it is known [33] that H(X^)≈b+h(X) where h(·) denotes the differential entropy of continuous RVs. Note that a similar approximation has been made in defining the information rates for continuous RVs in [13, 14]. Hence in this paper, we will assume that continuous RV X incurs the cost of b+h(X) bits where b>0 is a sufficiently large parameter, and we denote such costs by H(X):=b+h(X).
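The approximation H(X̂) ≈ b + h(X) can be checked empirically. The following Monte Carlo sketch (b = 4 and the sample size are arbitrary illustrative choices) quantizes N(0,1) samples with step 2^-b and compares the plug-in entropy estimate with b + h(X), where h(X) = (1/2)log₂(2πe) ≈ 2.05 bits for the standard Gaussian:

```python
import math
import random
from collections import Counter

# Empirical check of H(X_hat) ~ b + h(X) for X ~ N(0, 1) quantized
# with step 2^-b. Here b = 4 is an arbitrary illustrative choice.
random.seed(0)
b = 4
step = 2.0 ** -b
n = 200_000

# Quantize each sample to its bin index and estimate the discrete entropy.
counts = Counter(math.floor(random.gauss(0.0, 1.0) / step) for _ in range(n))
H_hat = -sum((c / n) * math.log2(c / n) for c in counts.values())

h_X = 0.5 * math.log2(2 * math.pi * math.e)  # differential entropy of N(0,1)
print(H_hat, b + h_X)  # the two values agree closely
```

The agreement improves as b grows, consistent with the high-resolution approximation cited from [33].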
In addition, the measured data is transmitted as packets in the network. Hence for each packet transmission there is an overhead of metadata, for example, the packet header. For any measurement Y, no matter how small H(Y), there is always the overhead of transmitting such metadata in practice. We will assume the header length is fixed at α>0 bits throughout this paper. Hence the average number of bits required to send measurement information Y per transmission over a link is given by

α + H(Y).  (6)

For a given aggregation tree T=(V′,E′), let p(s)⊂E′ denote the path from a source s∈S to the sink. For a given Edge e∈E, let U(e)⊆S denote the set of source nodes whose aggregated measurements are transmitted over e; that is, U(e) = {s∈S | e∈p(s)}. The information to be communicated over Edge e is the function g(·) applied to the set of measurement values from U(e), that is, g(X_{U(e)}). Hence we define the aggregation cost function as follows:

ϕ_e = 0 if U(e) = ∅, and ϕ_e = α + H(g(X_{U(e)})) otherwise.  (7)

We would like to solve (P) using the definition of ϕ_e given by (7). In the following sections we investigate several widely used summary functions and the associated optimal aggregation problems.
4. Aggregation Schemes for Summation-Type Summary Functions
We consider the summary functions of sum, mean, and weighted sum.
4.1. sum and mean
We first discuss the case where g(·) is sum. We have

H(g(X_{U(e)})) = H(∑_{u∈U(e)} X_u).  (8)

Clearly sum is a divisible function. Thus the aggregation process is as follows: a node simply applies the sum function to the aggregated data, and relays the result to the next hop.
When the source data is i.i.d., we will show that there exists a randomized algorithm which finds an aggregation tree whose expected cost is within a factor of (log(η)+1) of the optimal cost of (5).
Proposition 1.
Suppose Xi’s are i.i.d. For any distribution of Xi, there exists an algorithm yielding the mean cost within a factor of log(η)+1 of the optimal cost of (P).
Goel and Estrin [16] studied a single-sink data aggregation problem as follows. Each source generates a unit flow which needs to be routed to a sink, where the flows are aggregated through a tree. Their objective is to minimize the following cost function:

∑_{e∈E} w_e ψ(m_e),  (9)

where w_e is the weight on Edge e, m_e is the number of flows on Edge e, and ψ: R_+→R_+ is a function that maps the total size of a flow to its cost. They proposed an algorithm to minimize (9) when ψ is a canonical aggregation function, defined as follows.
Definition 2 (see [16]).
The function ψ: R_+→R_+ is called a canonical aggregation function (CAF) if it has the following properties:
ψ(0)=0.
ψ(·) is increasing.
ψ(·) is concave.
Their algorithm, called Hierarchical Matching (HM) [16], guarantees the mean cost to be within a factor of log(η)+1 of the optimal irrespective of ψ, provided that ψ is a CAF. As mentioned previously, since the X_i’s are i.i.d., ϕ_e depends only on m_e := |U(e)|. Specifically, we define ϕ as follows:

ϕ_e = ϕ(m_e) = α + H(∑_{u∈U(e)} X_u) if m_e > 0, and ϕ(0) = 0.  (10)

We will show that ϕ(·) is a CAF by showing that it satisfies the three properties of Definition 2. Note that this implies that the HM algorithm can be used to approximately solve (P), since (9) and the objective of (P) are identical.
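For intuition, the following is a simplified sketch of the HM algorithm's staged structure on a complete Euclidean graph. Note the caveat: the exact algorithm of [16] computes a minimum-weight perfect matching at every stage, whereas this sketch substitutes a greedy matching for brevity, so the log(η)+1 guarantee is not claimed for the sketch itself:

```python
import math
import random

def greedy_matching(points):
    """Pair up points greedily by nearest remaining neighbor.
    Stand-in for the min-weight perfect matching of the exact algorithm."""
    remaining = list(points)
    pairs = []
    while len(remaining) > 1:
        a = remaining.pop()
        b = min(remaining, key=lambda q: math.dist(a, q))
        remaining.remove(b)
        pairs.append((a, b))
    return pairs, remaining  # remaining holds one leftover point if odd

def hierarchical_matching(sources, sink):
    """Return aggregation-tree edges: match active sources over stages,
    randomly promote one endpoint per matched pair, then link to the sink."""
    edges, active = [], list(sources)
    while len(active) > 1:
        pairs, leftover = greedy_matching(active)
        active = leftover
        for a, b in pairs:
            edges.append((a, b))
            active.append(random.choice((a, b)))  # survivor carries the flow
    edges.append((active[0], sink))
    return edges

random.seed(1)
srcs = [(random.random(), random.random()) for _ in range(8)]
tree = hierarchical_matching(srcs, (0.5, 0.5))
print(len(tree))  # n sources yield n tree edges (n-1 matches + sink link)
```

Each stage halves the number of active sources, so there are about log(η) stages, which is where the logarithmic approximation factor of the exact algorithm originates.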
Proof of Proposition 1.
For the first property, it trivially holds that ϕ(0)=0. For the second property, for any two independent RVs X_1 and X_2 it is known that H(X_1+X_2) ≥ H(X_1); that is, adding an independent RV to a sum never decreases entropy [33], so ϕ(2) ≥ ϕ(1) and, more generally, ϕ(m) is increasing in m. For the third property, consider the following. It is shown in [34] that the entropy of the sum of independent RVs is a submodular set function. That is, the following holds for independent RVs Y_1, Y_2, and Y_3 [34, Theorem I]:

H(Y_1+Y_3) + H(Y_2+Y_3) ≥ H(Y_1+Y_2+Y_3) + H(Y_3).  (11)

Now consider m+2 sensor measurements X_1,…,X_{m+2}, and substitute Y_1 := X_{m+1}, Y_2 := X_{m+2}, and Y_3 := ∑_{i=1}^{m} X_i in (11). We have

H(X_{m+1} + ∑_{i=1}^{m} X_i) + H(X_{m+2} + ∑_{i=1}^{m} X_i) ≥ H(∑_{i=1}^{m+2} X_i) + H(∑_{i=1}^{m} X_i).  (12)

Applying the definition of ϕ given by (10) to (12), the following holds by symmetry:

ϕ(m+1) + ϕ(m+1) ≥ ϕ(m+2) + ϕ(m).  (13)

Hence ϕ(m+2) − ϕ(m+1) ≤ ϕ(m+1) − ϕ(m); that is, the slope is decreasing in m, which implies that ϕ(·) is concave on the domain of integers. Thus ϕ(·) satisfies all the properties of Definition 2 and is a CAF. This implies that, by using the HM algorithm, one can achieve an expected cost within a factor of 1+log(η) of the optimal cost of (P).
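The two nontrivial properties in the proof can be sanity-checked numerically. The sketch below evaluates the entropy term of ϕ exactly for i.i.d. Bernoulli(1/2) sources (a distribution chosen purely for tractability; Proposition 1 holds for any i.i.d. distribution) and confirms that H(∑_{i=1}^m X_i) is increasing in m with decreasing increments:

```python
from math import comb, log2

def H_binom(m, p=0.5):
    """Exact entropy (bits) of X1 + ... + Xm, Xi i.i.d. Bernoulli(p);
    the sum is Binomial(m, p)."""
    probs = [comb(m, k) * p**k * (1 - p) ** (m - k) for k in range(m + 1)]
    return -sum(q * log2(q) for q in probs if q > 0)

H = [H_binom(m) for m in range(1, 12)]
increments = [b - a for a, b in zip(H, H[1:])]
print(all(d > 0 for d in increments))                               # increasing
print(all(d2 <= d1 for d1, d2 in zip(increments, increments[1:])))  # concave
```

Both checks print True, matching the monotonicity and concavity established via the submodularity inequality (11).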
Next we consider mean as the summary function. Note that mean, as well as the weighted sum considered in the next section, is not a divisible function in general. We will nevertheless show that the problem for those summary functions can be reduced to the sum problem as follows. Suppose every source node is aware of the total number of sources, η. In our scheme every source simply scales its measurement by η^{-1} prior to transmission; that is, Source i transmits η^{-1}X_i, and the scaled measurements are aggregated in the same way as in the sum problem. The average number of bits transmitted over Edge e can be written as α + H(∑_{u∈U(e)} (1/η)X_u). Since the ((1/η)X_u)’s are i.i.d., for the minimum cost aggregation problem for mean we can use the same algorithm as that used for sum, for example, the HM algorithm.
4.2. weighted sum
Next we consider the case where g(·) is weighted sum, as follows. We assign arbitrary weights a_i, i∈S, to the source nodes. The goal of the sink is to compute ∑_{i∈S} a_i X_i. Our method of aggregation is similar to that for mean; that is, Source i scales its measurement by a_i and transmits a_i X_i, where the aggregation process is the same as that for sum. However, the effective source data a_i X_i seen by the network is no longer i.i.d., unless the a_i’s are identical for all i∈S. The aggregation cost function is given by

ϕ_e = α + H(∑_{i∈U(e)} a_i X_i).  (14)

The difficulty is that it is hard to associate a “flow” with the source data a_i X_i due to the asymmetry; that is, the problem is no longer a flow optimization. Moreover, it is easily seen that (14) is not a CAF in general. Thus we restrict our attention to a specific distribution of X_i. We will show that, if the X_i are independent Gaussian RVs, the problem for weighted sum is indeed a single-sink aggregation problem with concave costs, and there exist algorithms similar to the HM algorithm which have a good approximation ratio. Specifically, we prove that our problem is equivalent to the single-sink aggregation/flow optimization problem with nonuniform source demands.
Proposition 3.
Suppose X_i ~ N(μ_i, σ_i²) and X_1,…,X_η are independent. Let g(·) be weighted sum with arbitrary weights a_1,…,a_η. For sufficiently large b, there exists an algorithm yielding the mean cost within a factor of log(η)+1 of the optimal cost of (P).
Proof.
Consider the information communicated over Edge e, denoted by Y:

Y = ∑_{i∈U(e)} a_i X_i.  (15)

Since the X_i’s are independent Gaussian RVs, Y is also Gaussian with variance σ_e², where σ_e² := ∑_{i∈U(e)} a_i²σ_i². Thus the differential entropy of Y is given by

h(Y) = (1/2) log(2πe ∑_{i∈U(e)} a_i²σ_i²).  (16)

We observe from (16) that we can treat a_i²σ_i² as the “flow” generated by Source i, where the sum of the flows at Edge e incurs the entropy cost in (16). Specifically, we make the following definitions:

f_i := a_i²σ_i², i∈S,  (17)

f* := min_{i∈S} f_i,  (18)

ψ(x) := α + b + (1/2) log(2πe x), x > 0,  (19)

ϕ(x) := ψ(x) for x ≥ f*, and ϕ(x) := (ψ(f*)/f*)·x for 0 ≤ x < f*.  (20)

Here f_i represents the (unsplittable) flow demand generated by Source i, and f* denotes the minimum demand. Hence under a flow routing scheme, the total amount of flow at Link e is given by ∑_{i∈U(e)} f_i. Then from (16), the associated communication cost incurred at Link e is ψ(∑_{i∈U(e)} f_i) bits; that is, ψ(·) represents the information rate of a flow aggregated at Link e. Unlike the previously defined cost functions, ψ is no longer a function of the number of sources on a link, but rather a function of the amount of flow on that link. Finally, we define the aggregation cost function ϕ in terms of ψ as in (20) in order to meet the concavity condition for ϕ: ϕ is essentially identical to ψ, and if

b ≥ (1/2) log(1/(2πe f*)),  (21)

one can show that ϕ(x) is concave and increasing for all x ≥ 0. Hence under condition (21), ϕ is an increasing concave function of the total flow aggregated on a link. In that case we can use the algorithm proposed by Meyerson et al. [19], which essentially extends the HM algorithm to problems with nonuniform source flow demands, and which approximates the optimal cost to within a factor of log(η)+1 on average.
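The reduction in the proof can be made concrete with a small numeric sketch. The weights, variances, and the values of α and b below are arbitrary example choices; the sketch computes the flow demands (17), the minimum demand (18), and the cost functions (19)-(20), and checks condition (21):

```python
import math

# Sketch of the reduction in Proposition 3: Gaussian source i with weight
# a_i contributes an unsplittable flow demand f_i = a_i^2 * sigma_i^2, and
# a link carrying the sources in U costs psi(sum of f_i over U) bits.
alpha, b = 16.0, 12.0       # header bits and quantization parameter (examples)
a = [2.0, 0.5, 1.0]         # arbitrary weights a_i
sigma2 = [1.0, 4.0, 0.25]   # source variances sigma_i^2

f = [ai**2 * s2 for ai, s2 in zip(a, sigma2)]  # flow demands (17)
f_star = min(f)                                # minimum demand (18)

def psi(x):
    """Information rate (19) of an aggregated flow of size x > 0."""
    return alpha + b + 0.5 * math.log2(2 * math.pi * math.e * x)

def phi(x):
    """Cost function (20): psi above f*, linear below f*."""
    return psi(x) if x >= f_star else psi(f_star) / f_star * x

# Condition (21) under which phi is concave and increasing on x >= 0:
assert b >= 0.5 * math.log2(1 / (2 * math.pi * math.e * f_star))
print(phi(sum(f)))  # cost of a link aggregating all three sources
```

With ϕ concave and increasing in the flow, the nonuniform-demand algorithm of Meyerson et al. [19] applies directly, as stated in the proof.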
In summary, the key question was whether (P) can be cast as a flow aggregation problem, if g(·) is weighted sum. In general, it is difficult to make such association due to asymmetry; however, we revealed that such formulation is possible for independent Gaussian sources.
4.3. Discussions
Note that some of the properties of the X_i’s used above, such as the submodularity relation in (11) used to show that ϕ is a CAF, rely heavily on the independence of the X_i’s. When the X_i’s are correlated, we can find examples of ϕ which are not CAFs even for the summary function sum, as follows. Let X_1 and X_2 be jointly Gaussian with the same marginal N(0,1) and E[X_1X_2]=ρ. Then X_1+X_2 is distributed according to N(0, 2(1+ρ)); thus, if ρ < −0.5, then

h(X_1+X_2) = (1/2) log(2πe·2(1+ρ)) < (1/2) log(2πe) = h(X_1).  (22)

Thus the entropy function does not satisfy the second condition of Definition 2 (the increasing property) required of a CAF. Hence for arbitrarily correlated sources, presumably few meaningful statements can be made about optimal aggregation problems, even for simple summary functions such as sum.
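The counterexample in (22) amounts to a one-line computation, sketched below with ρ = −0.8 as an arbitrary choice in the range ρ < −0.5:

```python
import math

def h_gauss(var):
    """Differential entropy (bits) of N(0, var)."""
    return 0.5 * math.log2(2 * math.pi * math.e * var)

# Jointly Gaussian X1, X2 with N(0,1) marginals and correlation rho:
# Var(X1 + X2) = 2 * (1 + rho), so strong negative correlation shrinks
# the sum's entropy below that of a single source.
rho = -0.8
h_single = h_gauss(1.0)         # h(X1)
h_sum = h_gauss(2 * (1 + rho))  # h(X1 + X2)
print(h_sum < h_single)         # True: the increasing property fails
```

For ρ > −0.5 the comparison reverses, so the violation of the CAF property is specific to strongly negatively correlated sources.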
The discussion so far enables us to deal with more general objective functions extended from (P). Consider a function γ: R_+→R_+ which is concave and increasing. We now define the communication overhead on an edge as the function γ applied to the average number of bits transmitted over the edge. Namely, we consider the following extension of (P):

(P′)  Minimize_{T=(V′,E′)⊆G} ∑_{e∈E′} w_e γ(ϕ_e).  (23)

Consider (P′) for the summary function sum with i.i.d. sources, or weighted sum with independent Gaussian sources. Note that the composition of two concave and increasing functions is also concave and increasing [35]. Thus γ(ϕ_e) is a concave and increasing function of the amount of flow at an edge, and hence is a CAF. Therefore the HM algorithm can be used to approximate (P′).
5. Aggregation Schemes for Extreme-Type Summary Functions
5.1. Case Study
In this section we consider summary functions concerning the extreme statistics of the measurements, that is, max or min. We first investigate the entropy of the extreme statistics of a set of RVs. Consider m measurements denoted by X_i, i=1,…,m. Since max_{1≤i≤m} X_i = −min_{1≤i≤m}(−X_i), we focus only on max without loss of generality. It is easily seen that the max function is divisible; thus the aggregation process is similar to that for sum: a node simply applies the max function to the aggregated data. For example, suppose a node receives data X_1,…,X_m. The node simply computes max_{i=1,…,m} X_i and forwards it to the next hop.
For extreme-type summary functions, we will show that ϕ is in general not a CAF. In particular we consider several cases of practical importance.
Case 1 (Gaussian RVs).
We consider the problem of retrieving the maximum of i.i.d. Gaussian RVs. We assume that X_i ~ N(0,1) for i ∈ S and, as before, that ϕ(m) = α + b + h(max_{i=1,…,m} X_i) for m ≥ 1 and some constant b. A numerical evaluation of h(max_{i=1,…,m} X_i) is shown on the left of Figure 2. We observe that ϕ(m) is strictly convex and decreasing in m for m ≥ 1; thus ϕ is not a CAF.
The differential entropy of the maximum of a set of i.i.d. RVs distributed according to an RV X. On the left, X~N(0,1), and on the right, X is uniform on [0,1].
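The left panel of Figure 2 can be reproduced by direct numerical integration, since the maximum of m i.i.d. N(0,1) variables has density m·φ(z)·Φ(z)^{m−1}. The sketch below (function names and integration parameters are ours) checks that h(max) equals (1/2)log(2πe) at m = 1 and decreases in m.

```python
import math

def h_max_gauss(m, lo=-8.0, hi=8.0, steps=40000):
    """Differential entropy of the max of m i.i.d. N(0,1) RVs, by midpoint
    integration of -f log f, where f(z) = m * phi(z) * Phi(z)**(m - 1)."""
    Phi = lambda z: 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = lambda z: m * math.exp(-z * z / 2) / math.sqrt(2 * math.pi) * Phi(z) ** (m - 1)
    dz = (hi - lo) / steps
    h = 0.0
    for i in range(steps):
        f = pdf(lo + (i + 0.5) * dz)
        if f > 0.0:
            h -= f * math.log(f) * dz
    return h

vals = [h_max_gauss(m) for m in (1, 2, 4, 8)]
assert abs(vals[0] - 0.5 * math.log(2 * math.pi * math.e)) < 1e-3  # entropy of N(0,1)
assert vals[0] > vals[1] > vals[2] > vals[3]                       # decreasing in m
```

The decreasing values mirror the convex, decreasing shape of ϕ(m) observed in the figure.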
Case 2 (Extreme data retrieval problem).
We consider the problem of extreme data retrieval, defined as follows. A source node i ∈ S measures some physical quantity distributed according to a continuous RV Y_i. We assume the Y_i are independent but not necessarily identically distributed. Suppose that with some probability Y_i equals a large value, indicating an "abnormal" event. An important application of sensor networks is to detect the maximum abnormality among the measurements, where the abnormality quantifies how far a sensor's measurement deviates from its usual statistics. Denote the cumulative distribution function (CDF) of Y_i by F_i(·), that is, P(Y_i ≤ y) = F_i(y), i ∈ S. Consider realizations y_1,…,y_η of Y_1,…,Y_η. We quantify the abnormality at Source i by how unlikely the measurement y_i is: specifically, the goal of the sink is to retrieve min_{i∈S} P(Y_i > y_i), or alternatively,

(24) max_{i∈S} F_i(y_i);

thus the abnormality of y_i is defined as F_i(y_i). Let X_i = F_i(Y_i). We assume the nodes transmit and aggregate X_i instead of Y_i, so the goal of the sink is to retrieve max_{i∈S} X_i. Since X_i = F_i(Y_i) is the RV evaluated at its own distribution function, the X_i are i.i.d. RVs uniformly distributed on [0,1]. Thus the problem reduces to an optimal aggregation problem retrieving the max of i.i.d. uniform RVs.
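The reduction to uniform RVs is the probability integral transform: X_i = F_i(Y_i) is uniform on [0,1] whenever F_i is continuous. A quick empirical check, using an exponential Y as an assumed example distribution:

```python
import math
import random

random.seed(0)
# Y ~ Exponential(1), an assumed example; its CDF is F(y) = 1 - exp(-y)
F = lambda y: 1.0 - math.exp(-y)
xs = [F(random.expovariate(1.0)) for _ in range(200000)]

# X = F(Y) should be uniform on [0,1]: mean ~ 1/2, variance ~ 1/12
mean = sum(xs) / len(xs)
var = sum((x - mean) ** 2 for x in xs) / len(xs)
assert abs(mean - 0.5) < 0.01 and abs(var - 1.0 / 12.0) < 0.01
```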
We now show that the ϕ associated with the extreme data retrieval problem is a convex and decreasing function when the number of aggregated measurements is at least 2. Suppose the data aggregated at a node is X_1,…,X_m and define Z_m := max_{i=1,…,m} X_i. As previously, we assume the node requires on average ϕ(m) = α + b + h(Z_m) bits to transmit Z_m.
Proposition 4.
Consider the extreme data retrieval problem. The aggregation cost function ϕ(m) is convex and decreasing for m≥2.
Proof.
Since Z_m is the maximum of m i.i.d. uniform RVs, the CDF of Z_m, denoted F_{Z_m}(·), is given by

(25) F_{Z_m}(z) = z^m, 0 ≤ z ≤ 1.

Thus the probability density function (pdf) of Z_m, denoted f_{Z_m}, is given by m z^{m−1}. Computing h(Z_m),

(26) h(Z_m) = −∫₀¹ f_{Z_m}(z) log f_{Z_m}(z) dz = −∫₀¹ m z^{m−1} log(m z^{m−1}) dz = −log m + (m−1)/m.

Thus

(27) ϕ(m) = α + b − log m + 1 − m^{−1} for m ≥ 1, and ϕ(0) = 0.

Regarding m as a continuous variable, we have, for m ≥ 1,

(28) dϕ/dm = −1/m + 1/m², d²ϕ/dm² = 1/m² − 2/m³ = (1/m²)(1 − 2/m).

Clearly ϕ(m) is decreasing for m ≥ 1, and since its second derivative is nonnegative for m ≥ 2, ϕ(m) is convex for m ≥ 2.
On the right of Figure 2 the plot of h(Z_m) is shown. Note that h(Z_m) is strictly convex only for m ≥ 2, but overall it appears approximately convex. Since h(Z_m) is nonpositive, one can choose b sufficiently large that ϕ(η) = α + b + h(Z_η) ≥ 0, so that ϕ(m) ≥ 0 for all 1 ≤ m ≤ η.
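The closed form h(Z_m) = −log m + (m−1)/m from the proof of Proposition 4 can be cross-checked against direct numerical integration of −∫ f log f with f(z) = m z^{m−1}:

```python
import math

def h_closed(m):
    # closed form from Proposition 4: h(Z_m) = -log m + (m - 1)/m
    return -math.log(m) + (m - 1) / m

def h_numeric(m, steps=100000):
    # midpoint integration of -f log f on [0,1], with f(z) = m * z**(m-1)
    dz = 1.0 / steps
    h = 0.0
    for i in range(steps):
        z = (i + 0.5) * dz
        f = m * z ** (m - 1)
        h -= f * math.log(f) * dz
    return h

for m in (1, 2, 3, 10):
    assert abs(h_closed(m) - h_numeric(m)) < 1e-3
```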
In general, for a convex and decreasing ϕ, (P) is clearly NP-hard, since it contains the Steiner tree problem as a special case. In the following section we present a novel algorithm which captures the key properties of a convex and decreasing ϕ; we later show by simulation that the algorithm effectively achieves low cost.
5.2. Algorithm for Convex and Decreasing Aggregation Cost Functions
5.2.1. Motivation
Before describing our algorithm we present its motivation. An important observation for data aggregation problems with concave and increasing ϕ was made in [25], which proposed a "hub-and-spoke" model for the so-called facility location problem. The idea is that when ϕ is concave and increasing, one should first aggregate flows at some "hubs," then route the aggregated flow from the hubs to the sink at minimum cost; this is done by building an approximately optimal Steiner tree in which the hubs (facility locations) are the Steiner nodes. The rationale is that, once multiple flows are aggregated at hubs, routing them collectively to the sink is cheaper than routing the sources' flows separately, due to the concavity of ϕ. We observe two aspects of such hub-and-spoke schemes. First, local aggregation of flows at hubs greedily reduces cost based on local information, which we view as the microscopic approach to cost reduction. Second, building an approximately optimal Steiner tree over the hubs and the sink takes the global network structure into account, which can be seen as the macroscopic aspect of cost reduction. Hence there is a tradeoff between the microscopic and macroscopic aspects of cost reduction; a similar observation was made in [12]. The key question, then, is how to achieve the optimal tradeoff between these aspects for a convex and decreasing ϕ.
Consider the three example aggregation cost functions ϕ_1, ϕ_2, and ϕ_3, each decreasing and convex for m ≥ 1, shown in Figure 3. The function ϕ_1 is flat for m ≥ 1, that is, the average number of bits communicated over a link is constant irrespective of the number of flows passing through it; the minimum cost routing problem then reduces to a Steiner tree problem, and a completely "macroscopic" solution is optimal. The function ϕ_2 decreases slowly in m: the more flows merge at a link, the fewer bits are needed to transmit the merged information. Suppose we use the hub-and-spoke scheme to aggregate flows locally. The number of flows aggregated at a hub is at least 2, and ϕ_2 is approximately "flat" for m ≥ 2. This implies that once two or more flows are aggregated, the benefit from further local flow aggregation is negligible; hence the optimal routing problem from the hubs to the sink approximately reduces to the Steiner tree problem. One would thus expect local aggregation (the microscopic approach) followed by an optimal Steiner tree construction (the macroscopic approach) to yield a good solution. Now consider ϕ_3, whose overall rate of decrease is higher than that of ϕ_2. When the number of aggregated flows is sufficiently large, say m greater than 6, ϕ_3 becomes effectively "flat." This suggests that one should keep aggregating flows until a sufficient number, say 6, has been aggregated, that is, the microscopic cost reduction should be applied multiple times in a hierarchical manner, and only then build an optimal Steiner tree with respect to the aggregated sources, that is, apply the macroscopic reduction.
Aggregation cost functions which are convex and decreasing for m≥1.
The example provides some insight. Since ϕ(m) is convex and decreasing, the marginal benefit of local aggregation is large for small m but diminishes as m grows. In other words, when m is small, that is, in the early stages of the overall aggregation process, one should focus on low-cost local aggregation in order to benefit from the high rate of decrease of ϕ(m) for small m. Once a large number of flows has been aggregated, it pays to perform macroscopic cost reduction from then on by building optimal Steiner trees, since ϕ becomes flatter with increasing m. This suggests that there exists a tradeoff point at which the microscopic and macroscopic reductions are optimally balanced. Unfortunately, it is difficult to know this tradeoff point in advance. The proposed algorithm not only exploits both the microscopic and macroscopic aspects of cost reduction for a convex and decreasing ϕ, but also empirically searches for the optimal tradeoff point. Details are presented in the following section.
5.2.2. Outline
An outline of the proposed algorithm is as follows. The algorithm consists of multiple stages, and at each stage a hub-and-spoke problem (or facility location problem) is approximately solved: the flows from source nodes are merged at hubs, and the hubs at the present stage become the source nodes of the next stage, that is, the flows are merged hierarchically. Instead of solving a complex facility location problem, at each stage we find a minimum weight edge cover (MWEC) on the source nodes as a simple approximation. The rationale is that we wish to cluster sources for local aggregation at low cost, and by definition the MWEC does exactly that. The MWEC consists of multiple connected components, each of which is a tree. For each connected component we select a source as a hub, called a center node (details of the selection are provided later), and the flows in that component are aggregated at the center node.
At each stage, once the center nodes are determined, we build an approximately optimal Steiner tree with respect to the center nodes and the sink, using the algorithm in [36], which provides the best known approximation ratio for the Steiner tree problem, ρ_S ≈ 1.39.
Each stage outputs an aggregation tree. The output tree at Stage i is the union of the paths from all the hierarchical aggregations found up to Stage i and the Steiner tree built at Stage i. Namely, the output tree at Stage i is a combination of i consecutive hierarchical aggregations (microscopic cost reduction) and a Steiner tree with respect to the sink and Stage i hubs (macroscopic cost reduction).
Hence, over the stages, the algorithm progressively changes the balance between microscopic and macroscopic aspects of cost reduction in the output trees. Roughly speaking, the output trees from later stages are more biased towards the microscopic aspect. After the stages are over, we pick the tree with the minimum cost among the output trees. As a result the algorithm empirically searches for the point of the “best” balance between the two aspects of cost reduction over the stages. Hence one could expect that our algorithm will work well for any convex and decreasing ϕ.
5.2.3. Algorithm Description
We present a formal description of the proposed algorithm followed by an explanation of further details. For given aggregation tree T⊆G, let c(T) denote the total energy cost associated with T, as in the objective of (P).
Hierarchical Cover and Steiner Tree (HCST) Algorithm
Begin Algorithm
Step 1 (metric completion of G). If G is not a complete graph, perform a metric completion of G to yield a complete graph: if any pair of vertices lacks an edge, create an edge between the pair and assign it a weight equal to the distance between the pair, measured as the sum of the weights on the shortest path between them.
Step 2 (initialization). i ← 0, S_0 ← S.
Step 3 (initialize flows at sources). f_u ← 1 for all u ∈ S.
Step 4 (initial output is a Steiner tree). Jump to Step 7.
Step 5 (minimum weight edge cover). Denote by G_s the subgraph of G induced by S_{i−1}. Find a minimum-weight edge cover M_i in G_s, and let C_i = (S_{i−1}, M_i) be the subgraph of G induced by the cover.
Step 6 (node selection). Suppose C_i has ν connected components, and denote the jth connected component of C_i by K_j = (V_j, E_j) for 1 ≤ j ≤ ν. For each K_j, select a node of maximum degree (ties broken arbitrarily), say u_j, called a center node. K_j is a tree, and u_j becomes its root. All the flows in K_j are aggregated at u_j: every node transmits data to its parent node after the data from its child nodes has been aggregated at the node. The total flow at u_j is updated as

(29) f_{u_j} ← Σ_{u∈V_j} f_u.

Remove all noncenter nodes from S_{i−1}, and let S_i be the resulting set of source nodes.
Step 7 (Steiner tree construction). Build a ρ_S-approximate Steiner tree T_i^s with respect to the source nodes in S_i and the sink, using the algorithm in [36].
Step 8 (merging trees). If i > 0, merge all the MWECs found up to the present stage with the Steiner tree found in Step 7; that is, let

(30) T_i ← T_i^s ∪ ⋃_{j=1}^{i} C_j.

If i = 0, T_0 ← T_0^s. We call T_i the output tree of Stage i.
Step 9 (loop). If |S_i| > 1, set i ← i + 1 and go back to Step 5. If |S_i| = 1, continue to Step 10.
Step 10 (tree selection). The final output is the tree T_{j*} such that

(31) j* = argmin_{j=0,…,i} c(T_j),

that is, the minimum-cost tree among the output trees from all the stages.
End Algorithm
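The stages above can be sketched in code. The sketch below is a simplified, hypothetical rendering, not the paper's implementation: it replaces the exact matching-based MWEC with greedy closest-pair merging, replaces the ρ_S ≈ 1.39 Steiner approximation of [36] with the classic metric-closure MST 2-approximation, replaces the max-degree center rule with a closest-to-the-sink rule, and charges shortest-path distances directly rather than mapping trees back into G. All function names and the toy graph are ours.

```python
import math
from collections import defaultdict

def metric_closure(adj, nodes):
    """All-pairs shortest-path distances (Floyd-Warshall): Step 1's metric completion."""
    d = {u: {v: (0.0 if u == v else adj.get(u, {}).get(v, math.inf))
             for v in nodes} for u in nodes}
    for k in nodes:
        for i in nodes:
            for j in nodes:
                if d[i][k] + d[k][j] < d[i][j]:
                    d[i][j] = d[i][k] + d[k][j]
    return d

def mst_edges(nodes, dist):
    """Prim's MST on the metric closure: a classic 2-approximate Steiner tree."""
    nodes = list(nodes)
    intree, edges = {nodes[0]}, []
    while len(intree) < len(nodes):
        u, v = min(((a, b) for a in intree for b in nodes if b not in intree),
                   key=lambda e: dist[e[0]][e[1]])
        edges.append((u, v))
        intree.add(v)
    return edges

def steiner_cost(hubs, sink, dist, flows, phi):
    """Cost of routing hub flows to the sink along the approximate Steiner tree;
    each tree edge carries the total flow of the subtree hanging below it."""
    tree = defaultdict(list)
    for u, v in mst_edges(set(hubs) | {sink}, dist):
        tree[u].append(v)
        tree[v].append(u)
    cost = 0.0
    def subtree_flow(u, parent):
        nonlocal cost
        f = flows.get(u, 0.0)
        for w in tree[u]:
            if w != parent:
                g = subtree_flow(w, u)
                cost += dist[u][w] * phi(g)
                f += g
        return f
    subtree_flow(sink, None)
    return cost

def hcst(adj, nodes, sources, sink, phi):
    dist = metric_closure(adj, nodes)
    flows = {u: 1.0 for u in sources}
    S, covers, stage_costs = set(sources), [], []
    while True:
        # output tree of the current stage: hierarchical covers + Steiner tree
        cost = sum(dist[u][v] * phi(f) for stage in covers for u, v, f in stage)
        cost += steiner_cost(S, sink, dist, flows, phi)
        stage_costs.append(cost)
        if len(S) <= 1:
            break
        # greedy closest-pair merging, standing in for the exact MWEC of Step 5
        stage, merged = [], set()
        for d_uv, u, v in sorted((dist[u][v], u, v) for u in S for v in S if u < v):
            if u in merged or v in merged:
                continue
            # hub selection heuristic: keep the endpoint closer to the sink
            hub, spoke = (u, v) if dist[u][sink] <= dist[v][sink] else (v, u)
            stage.append((hub, spoke, flows[spoke]))  # spoke forwards its aggregate
            flows[hub] += flows.pop(spoke)
            merged.update((u, v))
            S.discard(spoke)
        covers.append(stage)
    return min(stage_costs), stage_costs

adj = defaultdict(dict)
for u, v, w in [(1, 2, 5), (2, 3, 4), (2, 4, 3), (1, 5, 8),
                (5, 6, 2), (3, 6, 6), (4, 5, 7)]:
    adj[u][v] = w
    adj[v][u] = w
phi = lambda m: 9.0 / m + 1.0  # a convex, decreasing cost function
best, costs = hcst(adj, [1, 2, 3, 4, 5, 6], [3, 4, 5, 6], 1, phi)
```

Mirroring Step 10, the returned cost is the minimum over the per-stage output trees, so the sketch never does worse than its own Stage 0 (pure Steiner) tree.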
5.2.4. Comments
We now explain the details of several steps of the algorithm. In Step 3 the flow variables f_u, u ∈ S, associated with the source nodes are initialized; the amount of flow is tracked throughout the algorithm. In Step 6 it is natural to select a node of maximum degree as the center node, since such a node is literally a "hub." When solving the hub-and-spoke problem at each stage, we solve the MWEC problem, whereas [25] solves the load-balanced facility location problem. An advantage of the MWEC formulation is that it is considerably simpler, since an MWEC problem reduces to a minimum weight perfect matching problem [37]. Note that the algorithm in [25] solves the hub-and-spoke problem only once, so its output is analogous to the output tree of Stage 1 of our algorithm. The HM algorithm, by contrast, solves a minimum weight perfect matching at each stage in order to locally aggregate flows at low cost; it does so hierarchically until all flows are aggregated at a single source, and its final output, the union of those matchings, is analogous to the output of the final stage of our algorithm. In other words, the outputs of the above algorithms correspond to those of intermediate stages of ours. The HIERARCHY algorithm proposed in [20] hierarchically constructs Steiner trees and solves load-balanced facility location problems, but in a way that relies heavily on the concave and increasing property of ϕ; thus it may not be suitable for a convex and decreasing ϕ.
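For intuition, the edge-cover step can be solved exactly by brute force on tiny instances; the graph below is made up, and production implementations would instead use the polynomial-time reduction to minimum weight perfect matching [37].

```python
from itertools import combinations

def min_weight_edge_cover(nodes, edges):
    """Brute-force minimum-weight edge cover: the cheapest edge subset that
    touches every node. Exponential; for illustration on tiny graphs only."""
    best_w, best_cover = float("inf"), None
    for r in range(1, len(edges) + 1):
        for sub in combinations(edges, r):
            covered = {u for u, v, w in sub} | {v for u, v, w in sub}
            if covered == set(nodes):
                wsum = sum(w for _, _, w in sub)
                if wsum < best_w:
                    best_w, best_cover = wsum, sub
    return best_w, best_cover

# four sources with pairwise distances; the cover picks cheap edges
# so that every source is matched into some cluster
nodes = [2, 3, 4, 5]
edges = [(2, 3, 4.0), (2, 4, 2.0), (3, 5, 3.0), (4, 5, 1.0), (3, 4, 6.0)]
w, cover = min_weight_edge_cover(nodes, edges)
```

On this instance the optimum has weight 5.0 (for example, edges (2,4) and (3,5)), pairing the sources into two low-cost clusters.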
5.3. Performance Analysis
In this section we analyze the performance of the HCST algorithm. For a set E of weighted edges, let ‖E‖ denote the sum of its edge weights, that is, ‖E‖ = Σ_{e∈E} w_e. For a given source set Σ, let T_S(Σ) denote the edge set of the optimal Steiner tree associated with Σ.
Proposition 5.
For a given network graph G = (V,E), the cost achieved by the HCST algorithm is higher than the optimal cost by a factor of at most B, defined as

(32) B := (1/ϕ(Z)) · min_{i=0,…,I} [ ϕ(2^i) ρ_S + Σ_{k=1}^{i} ϕ(2^{k−1}) ρ_k ],

where I (I ≤ log₂ η) denotes the stage at which the HCST algorithm terminates, ρ_S ≈ 1.39 denotes the approximation ratio for the Steiner tree problem, and ρ_i ∈ [0,1] is the ratio of the sums of edge weights between the MWEC M_i at Stage i of the HCST algorithm and the Steiner tree associated with source set S, that is,

(33) ρ_i = ‖M_i‖ / ‖T_S(S)‖.

Also, Z is defined as

(34) Z := [ Σ_{i=1}^{η/2} i·w_{[i]} + η Σ_{i=0}^{n−η/2−1} w_{[|E|−i]} ] / [ Σ_{i=1}^{η/2} w_{[i]} + Σ_{i=0}^{n−η/2−1} w_{[|E|−i]} ],

where n := |V| and w_{[i]} denotes the ith smallest edge weight of G. The summation term of (32) is defined to be 0 when i = 0.
Proof.
Denote the optimal cost by OPT. We first derive a lower bound on OPT. Let T* denote the edge set of the optimal aggregation tree. Sort the edge flows of T* in increasing order and denote them by d_i, that is, 0 < d_1 ≤ d_2 ≤ ⋯ ≤ d_l, where T* has l edges. There are at least η nonzero flows since there are η sources; hence d_η > 0 and l ≥ η. In addition, l is at most n, since T* is a tree. It is also clear that d_i ≤ i for i = 1,…,η, and d_i ≤ η for η < i ≤ l. Since ϕ is decreasing, this implies ϕ(d_i) ≥ ϕ(i) for i = 1,…,η. Denote by v_i the weight of the edge carrying flow d_i, and for real numbers a and b let a∧b := min(a,b). We have

(35) OPT = Σ_{i=1}^{l} v_i ϕ(d_i) ≥ Σ_{i=1}^{l} v_i ϕ(i∧η),

(36) OPT ≥ (Σ_{i=1}^{l} v_i) ϕ( Σ_{i=1}^{l} (i∧η) v_i / Σ_{i=1}^{l} v_i ),

(37) OPT ≥ ‖T_S(S)‖ ϕ( Σ_{i=1}^{l} (i∧η) v_i / Σ_{i=1}^{l} v_i ),

where (36) follows from Jensen's inequality and the convexity of ϕ, and (37) from the definition of Steiner trees. Since ϕ is decreasing, we want the argument of ϕ in (37) to be as large as possible in order to obtain a lower bound on OPT. Hence we maximize λ(v_1,…,v_l) defined as

(38) λ(v_1,…,v_l) := Σ_{i=1}^{l} (i∧η) v_i / Σ_{i=1}^{l} v_i,

where the v_i, i = 1,…,l, are chosen from the edge weights of G. For the purpose of maximizing (38) we may assume v_1 ≤ v_2 ≤ ⋯ ≤ v_l WLOG, because over all permutations π(1),…,π(l) of {1,…,l}, Σ_i (i∧η) v_{π(i)} is maximized when v_{π(1)} ≤ v_{π(2)} ≤ ⋯ ≤ v_{π(l)}.
We first observe that λ(·) is decreasing in v_1,…,v_{η/2}: if k ≤ η/2, then

(39) ∂λ/∂v_k = [ k Σ_i v_i − Σ_i (i∧η) v_i ] / (Σ_i v_i)² = [ Σ_{j=1}^{k−1} (k−j)(v_j − v_{2k−j}) + Σ_{j=2k}^{l} (k − (j∧η)) v_j ] / (Σ_i v_i)² ≤ 0.

Hence λ(·) is maximized over v_1,…,v_{η/2} by choosing the η/2 smallest edge weights of G, that is, by letting v_i = w_{[i]} for i = 1,…,η/2.
Next we derive an upper bound for λ(w_{[1]},…,w_{[η/2]}, v_{η/2+1},…,v_l) as follows:

(40) λ(w_{[1]},…,w_{[η/2]}, v_{η/2+1},…,v_l) = [ Σ_{i=1}^{η/2} i·w_{[i]} + Σ_{i=η/2+1}^{l} (i∧η) v_i ] / [ Σ_{i=1}^{η/2} w_{[i]} + Σ_{i=η/2+1}^{l} v_i ]

(41) ≤ [ Σ_{i=1}^{η/2} i·w_{[i]} + η Σ_{i=η/2+1}^{l} v_i ] / [ Σ_{i=1}^{η/2} w_{[i]} + Σ_{i=η/2+1}^{l} v_i ]

(42) ≤ [ Σ_{i=1}^{η/2} i·w_{[i]} + η Σ_{i=0}^{n−η/2−1} w_{[|E|−i]} ] / [ Σ_{i=1}^{η/2} w_{[i]} + Σ_{i=0}^{n−η/2−1} w_{[|E|−i]} ].

For inequality (42) we used the fact that (41) is increasing in Σ_{i=η/2+1}^{l} v_i; hence we chose l = n and the largest possible weights w_{[|E|]}, w_{[|E|−1]},… for v_{η/2+1}, v_{η/2+2},… in order to maximize Σ_{i=η/2+1}^{l} v_i. From (42) we obtain λ(v_1,…,v_l) ≤ Z, and hence from (37),

(43) OPT ≥ ‖T_S(S)‖ ϕ(Z).
Now consider the cost c(T_i) of the output tree at Stage i of the HCST algorithm. Recall that S_i denotes the source set at Stage i, and T_i the output tree at Stage i. The cost of T_i divides into (i) the cost incurred by the hierarchical MWECs M_1,…,M_i and (ii) the cost of the ρ_S-approximate Steiner tree T_i^s associated with S_i. Hence

(44) c(T_i) = Σ_{k=1}^{i} Σ_{e∈M_k} w_e ϕ(d(e)) + Σ_{e∈T_i^s} w_e ϕ(d(e)),

where d(e) denotes the amount of flow on edge e under the HCST algorithm. Note that every flow in the network at Stage i is at least 2^{i−1}, since the flows are agglomerated through MWECs at every stage. Since ϕ(·) is decreasing, the first summation of (44) is at most

(45) Σ_{k=1}^{i} ϕ(2^{k−1}) Σ_{e∈M_k} w_e = Σ_{k=1}^{i} ϕ(2^{k−1}) ‖M_k‖.

(The first summation of (44) is 0 at Stage 0.) As for the second summation of (44),

(46) Σ_{e∈T_i^s} w_e ϕ(d(e)) ≤ ϕ(2^i) ‖T_i^s‖
(47) ≤ ρ_S ϕ(2^i) ‖T_S(S_i)‖
(48) ≤ ρ_S ϕ(2^i) ‖T_S(S)‖.

Inequality (48) holds because S_i ⊆ S: the Steiner tree for S is a tree that spans S_i, hence by definition the sum of edge weights of T_S(S_i) is no more than that of the Steiner tree associated with S.
In conclusion, from (43), (45), and (48),

(49) c(T_i) ≤ ρ_S ϕ(2^i) ‖T_S(S)‖ + Σ_{k=1}^{i} ϕ(2^{k−1}) ‖M_k‖ ≤ (OPT/ϕ(Z)) [ ρ_S ϕ(2^i) + Σ_{k=1}^{i} ϕ(2^{k−1}) ‖M_k‖/‖T_S(S)‖ ].

Since the cost of the HCST algorithm is min_{i=0,…,I} c(T_i), the proposition is proved.
An interpretation of the ratio B in (32) is as follows. The first term in the bracket of B bounds the macroscopic cost associated with the Steiner tree approximation; the second term bounds the cost of the hierarchical aggregation of flows, that is, the microscopic cost reduction. Clearly ‖M_1‖ ≥ ‖M_2‖ ≥ ⋯, since S_1 ⊇ S_2 ⊇ ⋯; thus ρ_1, ρ_2, … is a decreasing sequence with 0 ≤ ρ_i ≤ 1 for all i. The progressive cost reduction due to hierarchical flow aggregation is reflected in ρ_1, ρ_2, …. As in (32), B is the minimum of I + 1 numbers, each containing a weighted sum of ϕ(·) with a different combination of weights ρ_i. Hence B represents the empirical minimum over different degrees of tradeoff between microscopic and macroscopic cost reduction.
Next we discuss the constant Z in (34). Firstly, observe that Z ≤ η: the first summation in the numerator of (34) is at most η Σ_{i=1}^{η/2} w_{[i]}, so the numerator is at most η times the denominator. Note that a naive upper bound for λ(v_1,…,v_l) is simply η, yielding the lower bound OPT ≥ ‖T_S(S)‖ ϕ(η); our bound (43) improves on this, since ϕ(Z) ≥ ϕ(η).
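The constant Z of (34) is easy to evaluate for a given weighted graph. The sketch below (function name and the random weights are ours) also confirms the bound Z ≤ η: Z is a weighted average of coefficients lying between 1 and η.

```python
import random

def compute_Z(weights, n, eta):
    """Constant Z of (34). weights: the edge weights of G; w_[i] is the
    i-th smallest weight. Uses the eta/2 smallest and n - eta/2 largest weights."""
    k = eta // 2
    small = sorted(weights)[:k]                       # the eta/2 smallest weights
    big = sorted(weights, reverse=True)[: n - k]      # the n - eta/2 largest weights
    num = sum((i + 1) * wi for i, wi in enumerate(small)) + eta * sum(big)
    den = sum(small) + sum(big)
    return num / den

random.seed(1)
weights = [random.uniform(1, 10) for _ in range(40)]  # |E| = 40 edge weights
n, eta = 20, 8
Z = compute_Z(weights, n, eta)
assert 1.0 <= Z <= eta
```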
B can be numerically computed for a given graph, and in the next section we provide numerical examples of B. We also apply the HCST algorithm to a specific graph as an example.
5.4. Illustrating Examples
In this section we consider a simple convex and decreasing ϕ. As previously, the packet header length is α bits, and we assume the maximum packet size is 10 times the header length, that is, 10α. We accordingly consider ϕ(m), convex and decreasing for m ≥ 1, of the following form:

(50) ϕ(m) = 9α/m + α for m ≥ 1, and ϕ(0) = 0.

Clearly α < ϕ(m) ≤ 10α holds for m ≥ 1.
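A quick numerical check of (50), with α = 1 assumed for concreteness, confirms the bound α < ϕ(m) ≤ 10α and that ϕ is decreasing and convex for m ≥ 1:

```python
alpha = 1.0
phi = lambda m: 9.0 * alpha / m + alpha   # eq. (50), for m >= 1

vals = [phi(m) for m in range(1, 101)]
assert all(alpha < v <= 10 * alpha for v in vals)   # alpha < phi(m) <= 10*alpha
d1 = [b - a for a, b in zip(vals, vals[1:])]
d2 = [b - a for a, b in zip(d1, d1[1:])]
assert all(x < 0 for x in d1)   # first differences negative: decreasing
assert all(x > 0 for x in d2)   # second differences positive: convex
```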
Figures 4 and 5 show numerical examples of the performance bound B. B is computed and averaged over randomly generated graphs with nodes distributed uniformly in a square area. In Figure 4, the network size n is fixed to 200 and B is plotted against the number of source nodes η. We consider two types of cost functions: the curve labelled "harmonic" represents the cost function (50), in which ϕ(·) decreases as a harmonic sequence, while the curve labelled "exp" corresponds to the case where the term m^{−1} in (50) is replaced by exp(−δ(m−1)), where the parameter δ > 0 controls the decay rate of the cost function; we set δ = 0.2 in this example. In addition, we compare B with a simple analytical bound: if we build a ρ_S-approximate Steiner tree on S, the cost under that tree is at most ρ_S ‖T_S(S)‖ ϕ(1), and combining this with (43) yields the approximation ratio ρ_S ϕ(1)/ϕ(Z) for the approximately optimal Steiner tree. In Figure 4 these bounds are plotted for both the harmonic and exponential cost functions, labelled "Steiner(har)" and "Steiner(exp)," respectively. We observe that B improves on the bounds based on the ρ_S-approximate Steiner tree. In Figure 5, B is plotted against varying n under the aforementioned harmonic and exponential cost functions, where we fixed δ to 10. In Figures 4 and 5 we observe that B eventually becomes nearly constant, or at most increases very slowly, as the system grows. Hence we conclude that B provides an approximation ratio which remains effectively constant irrespective of the system size.
Performance bounds under varying number of sources.
Performance bounds under varying network sizes.
Next we present an example of the application of the HCST algorithm to a specific graph. An example of G is given in Figure 6(a). G consists of n = 10 nodes, where Node 1 is the sink, that is, t = 1. There are four source nodes, S = {2,3,4,5}, depicted shaded in the figure; each source generates 1 unit of data. We again consider the convex and decreasing ϕ(·) given by (50) and assume α = 1. Figure 6(b) shows the output of Stage 0, namely T_0, which is an approximately optimal Steiner tree. Figure 7 shows the MWECs over the stages: Figure 7(a) shows the metric completion of the subgraph induced by S, Figure 7(b) shows the MWEC at Stage 1, in which Nodes 4 and 5 become the center nodes (emphasized in the figure), and Figure 7(c) shows the MWEC and the center node at Stage 2.
(a) G in the example. (b) The output T0 from Stage 0.
(a) A complete graph of the sources. (b) MWEC from Stage 1. (c) MWEC from Stage 2.
Figure 8(a) shows the full paths in G of the MWEC at Stage 1, that is, the cover of Figure 7(b). Building an approximately optimal Steiner tree T_1^s associated with {1,4,5} and taking the union of T_1^s and C_1 as in Step 8, we obtain T_1 as in Figure 8(b). Similarly, Figure 9 illustrates Stage 2 of the algorithm: the full paths in G for the MWEC of Figure 7(c) are shown in Figure 9(a), Node 4 is selected as the center node, and the output of Stage 2, namely T_2, is shown in Figure 9(b). Let us compare the energy costs from all the stages. In T_0, a total of three flows pass through the link between Nodes 1 and 3, while every other link carries a single flow. Thus the cost of T_0 is

(51) c(T_0) = 8·ϕ(3) + (8 + 6 + 5 + 10)·ϕ(1) = 8·4 + 29·10 = 322.

Similarly, we have

(52) c(T_1) = 314.8, c(T_2) = 275.5.

Thus the final output of HCST is T_2, with the final cost of 275.5. Note that in this example the Shortest Path Tree (SPT) heuristic incurs an energy cost of 374.
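The Stage-0 cost arithmetic can be checked directly: the link between Nodes 1 and 3 (weight 8) carries three merged flows, and the four remaining links (weights 8, 6, 5, and 10) each carry one.

```python
alpha = 1.0
phi = lambda m: 9.0 * alpha / m + alpha   # the cost function of eq. (50), m >= 1

# one weight-8 link carries 3 merged flows; links of weight 8, 6, 5, 10
# each carry a single unit flow
c_T0 = 8 * phi(3) + (8 + 6 + 5 + 10) * phi(1)
assert abs(c_T0 - 322.0) < 1e-9
```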
(a) MWEC in G from Stage 1. (b) The output T1 from Stage 1.
(a) MWEC in G from Stage 2. (b) The output T2 from Stage 2.
Next consider ϕ(m) such that

(53) ϕ(m) = 1 for m ≥ 1, and ϕ(0) = 0.

Assume the algorithm yields the same T_0, T_1, and T_2 as in the previous case. Since ϕ is constant for m ≥ 1, the problem reduces to the Steiner tree problem, so one would expect T_0 to perform best, since T_0 is intended to be an approximately optimal Steiner tree. The energy costs are

(54) c(T_0) = 37, c(T_1) = 44, c(T_2) = 43;

thus the HCST algorithm indeed outputs T_0 as the best solution with cost 37, whereas the SPT heuristic yields an energy cost of 41. This demonstrates that our algorithm can effectively handle various types of convex and decreasing aggregation cost functions. In the following section we evaluate the performance of the HCST algorithm by simulation.
6. Simulation
In our simulation we randomly generate G as follows. The node locations are generated independently and uniformly on a unit square, and G is the Delaunay graph induced by the node locations. An example of G is depicted in Figure 10 for n = 20. As previously, it is assumed that the average number of bits required to transmit the aggregated information g(X_1,…,X_m) is approximately α + b + h(g(X_1,…,X_m)), where we set the header length α to 1 and the number of quantization bits b to 3. The edge weights, representing the energy consumption per transmitted bit, are randomly selected from {1,…,10}. Two types of sources are considered. The first, called the uniform type, is associated with the extreme data retrieval problem, that is, the X_i are i.i.d. uniform on [0,1]. The second, called the Gaussian type, is associated with retrieving the maximum of Gaussian source data, where X_i ~ N(0,1). The summary function g(·) is max.
An example of randomly generated G for simulation.
We compare the performance of the HCST algorithm with the HM algorithm [16] and the SPT heuristic. Figure 11 shows the average energy consumption of the algorithms when the number of sources is fixed to 8 with varying n. The energy cost shown on the left (resp., right) of Figure 11 is associated with sources of uniform (resp., Gaussian) type. We observe that the HCST algorithm achieves lower energy costs than the SPT heuristic for both source types. The energy savings achieved by the HCST algorithm range from 35% to 38% for uniform-type sources and from 24% to 25% for Gaussian-type sources. Compared to the HM algorithm, our algorithm reduces energy consumption by 20-21% and 14-15% for uniform- and Gaussian-type sources, respectively. The HM algorithm focuses on microscopic cost reduction, which may be effective for concave and increasing cost functions but not for convex and decreasing ones. Comparing the SPT heuristic and the HCST algorithm, we observe that the difference in mean energy consumption slightly increases with n. This can be interpreted as follows: in larger networks there is more room for improvement by HCST, for example, more choices of Steiner nodes and more ways to merge sources at low cost by MWEC. Thus the performance gain of the HCST algorithm relative to the SPT heuristic is expected to grow with n, as shown in the simulation.
Energy cost associated with a fixed number of sources.
Figure 12 shows the mean energy costs with varying n, where we scale the number of sources proportionally to n; specifically, we let η = n/5, that is, one out of five nodes is a source node. In the figure we see that the HCST algorithm again outperforms the SPT heuristic. The relative energy savings by the HCST algorithm range from 19% to 41% for uniform-type sources and from 14% to 27% for Gaussian-type sources. Relative to the HM algorithm, the HCST algorithm saves 20–23% and 14–17% for uniform- and Gaussian-type sources, respectively. The difference in energy cost between the algorithms increases with n as in the fixed-source case, but the rate of increase is higher here. This can be explained as follows. When the network size grows, the number of sources grows proportionally. By the previous argument, larger networks leave more room for improvement by HCST, so its relative gain increases with network size. In addition, since the number of sources grows, the total number of stages executed by the HCST algorithm also increases. Since HCST chooses the best tree among the intermediate output trees collected over the stages, a larger number of stages means the final output tree is chosen from a larger pool of trees spanning various degrees of tradeoff between the microscopic and macroscopic aspects of cost reduction. The abundance of source nodes thus allows an aggregation tree with a more refined tradeoff, which is crucial for a convex and decreasing ϕ. This explains the improved performance of HCST as the number of sources increases. We conclude from the simulation that the HCST algorithm improves performance for various proportions of source nodes in the network.
Energy cost associated with a varying number of sources.
7. Conclusion
In this paper we have studied a single-sink aggregation problem for wireless sensor networks computing several widely used summary functions. The problem is characterized by the aggregation cost function ϕ, which maps the number of aggregated measurements to the transmission cost at a link, and the properties of ϕ depend heavily on the chosen summary function g(·). When g(·) is sum or mean, we showed that ϕ is concave and increasing, implying that algorithms such as the HM algorithm approximate the optimum within a factor logarithmic in the number of sources. A similar argument was made when g(·) is weighted sum for independent Gaussian sources. When g(·) is max, however, we showed that ϕ is convex and decreasing for certain types of sources. For such ϕ we identified a tradeoff between two aspects of cost reduction: local clustering of sources (the microscopic aspect) and low-cost routing from the clustered sources to the sink (the macroscopic aspect). We proposed the Hierarchical Cover and Steiner Tree algorithm, which empirically finds the best tradeoff point between these aspects. Numerical examples and simulation results demonstrate that the HCST algorithm is versatile and improves performance for various types of convex and decreasing ϕ. A future direction is to investigate optimal aggregation problems for a wider range of summary functions. The evaluation of the HCST algorithm in a real-world testbed environment is also part of our future work.
Conflict of Interests
The authors declare that there is no conflict of interests regarding the publication of this paper.
Acknowledgments
This work was supported by Basic Science Research Program through The National Research Foundation of Korea (NRF) funded by The Ministry of Science, ICT & Future Planning (NRF-2013R1A1A1062500), and in part by the ICT R&D program of MSIP/IITP, (10-911-05-006, High Speed Virtual Router that Supports Dynamic Circuit Network).
[1] A. Giridhar and P. R. Kumar, "Computing and communicating functions over sensor networks," 2005, vol. 23, no. 4, pp. 755–764, doi:10.1109/JSAC.2005.843543.
[2] J. Heidemann, F. Silva, C. Intanagonwiwat, R. Govindan, D. Estrin, and D. Ganesan, "Building efficient wireless sensor networks with low-level naming," 2001, vol. 35, no. 5, pp. 146–159, doi:10.1145/502059.502049.
[3] C. Intanagonwiwat, R. Govindan, and D. Estrin, "Directed diffusion: a scalable and robust communication paradigm for sensor networks," in Proceedings of the 6th Annual International Conference on Mobile Computing and Networking, ACM, August 2000, pp. 56–67.
[4] L. Krishnamachari, D. Estrin, and S. Wicker, "The impact of data aggregation in wireless sensor networks," in Proceedings of the 22nd International Conference on Distributed Computing Systems Workshops, IEEE, 2002, pp. 575–578, doi:10.1109/ICDCSW.2002.1030829.
[5] M. Bagaa, Y. Challal, A. Ksentini, A. Derhab, and N. Badache, "Data aggregation scheduling algorithms in wireless sensor networks: solutions and challenges," 2014, vol. 16, no. 3, pp. 1339–1368, doi:10.1109/surv.2014.031914.00029.
[6] L. A. Villas, A. Boukerche, H. S. Ramos, H. A. B. F. de Oliveira, R. B. de Araujo, and A. A. F. Loureiro, "DRINA: a lightweight and reliable routing approach for in-network aggregation in wireless sensor networks," 2013, vol. 62, no. 4, pp. 676–689, doi:10.1109/tc.2012.31.
[7] J. Ma, W. Lou, and X.-Y. Li, "Contiguous link scheduling for data aggregation in wireless sensor networks," 2014, vol. 25, no. 7, pp. 1691–1701, doi:10.1109/tpds.2013.296.
[8] Z.-J. Zhang, C.-F. Lai, and H.-C. Chao, "A green data transmission mechanism for wireless multimedia sensor networks using information fusion," 2014, vol. 21, no. 4, pp. 14–19, doi:10.1109/mwc.2014.6882291.
[9] I. F. Akyildiz, T. Melodia, and K. R. Chowdury, "Wireless multimedia sensor networks: a survey," 2007, vol. 14, no. 6, pp. 32–39, doi:10.1109/mwc.2007.4407225.
[10] P. Von Rickenbach and R. Wattenhofer, "Gathering correlated data in sensor networks," in Proceedings of the Joint Workshop on Foundations of Mobile Computing, ACM, 2004, pp. 60–66.
[11] R. Cristescu, B. Beferull-Lozano, and M. Vetterli, "Networked Slepian-Wolf: theory, algorithms, and scaling laws," 2005, vol. 51, no. 12, pp. 4057–4073, doi:10.1109/tit.2005.858980.
[12] S. Pattem, B. Krishnamachari, and R. Govindan, "The impact of spatial correlation on routing with compression in wireless sensor networks," in Proceedings of the 3rd International Symposium on Information Processing in Sensor Networks (IPSN '04), Berkeley, Calif, USA, April 2004, pp. 28–35, doi:10.1109/IPSN.2004.1307320.
[13] R. Cristescu, B. Beferull-Lozano, and M. Vetterli, "On network correlated data gathering," in Proceedings of the 23rd Annual Joint Conference of the IEEE Computer and Communications Societies (INFOCOM '04), vol. 4, March 2004, pp. 2571–2582, doi:10.1109/infcom.2004.1354677.
[14] J. Liu, M. Adler, D. Towsley, and C. Zhang, "On optimal communication cost for gathering correlated data through wireless sensor networks," in Proceedings of the 12th Annual International Conference on Mobile Computing and Networking (MOBICOM '06), ACM, September 2006, pp. 310–321.
[15] S. Hariharan and N. Shroff, "Maximizing aggregated revenue in sensor networks under deadline constraints," in Proceedings of the 48th IEEE Conference on Decision and Control, 2010, pp. 4846–4851.
[16] A. Goel and D. Estrin, "Simultaneous optimization for concave costs: single sink aggregation or single source buy-at-bulk," in Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA '03), SIAM, 2003, pp. 499–505.
[17] M. Andrews and L. Zhang, "The access network design problem," in Proceedings of the IEEE Symposium on Foundations of Computer Science (FOCS '98), IEEE Computer Society, 1998, p. 40.
[18] Y. Bartal, "On approximating arbitrary metrics by tree metrics," in Proceedings of the 30th Annual ACM Symposium on Theory of Computing, ACM, May 1998, pp. 161–168.
[19] A. Meyerson, K. Munagala, and S. Plotkin, "Cost-distance: two metric network design," in Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS '00), IEEE Computer Society, 2000, p. 624.
[20] S. Guha, A. Meyerson, and K. Munagala, "A constant factor approximation for the single sink edge installation problems," in Proceedings of the 33rd Annual ACM Symposium on Theory of Computing, ACM, 2001, pp. 383–388.
[21] M. Andrews, A. Anta, L. Zhang, and W. Zhao, "Routing for energy minimization in the speed scaling model," in Proceedings of the IEEE INFOCOM, San Diego, Calif, USA, March 2010, pp. 1–9, doi:10.1109/INFCOM.2010.5462071.
[22] M. Andrews, S. Antonakopoulos, and L. Zhang, "Minimum-cost network design with (dis)economies of scale," in Proceedings of the IEEE 51st Annual Symposium on Foundations of Computer Science (FOCS '10), Las Vegas, Nev, USA, October 2010, pp. 585–592, doi:10.1109/focs.2010.61.
[23] F. S. Salman, J. Cheriyan, R. Ravi, and S. Subramanian, "Buy-at-bulk network design: approximating the single-sink edge installation problem," in Proceedings of the 8th Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, January 1997, pp. 619–628.
[24] B. Awerbuch and Y. Azar, "Buy-at-bulk network design," in Proceedings of the 38th IEEE Annual Symposium on Foundations of Computer Science (FOCS '97), IEEE Computer Society, October 1997, pp. 542–547, doi:10.1109/sfcs.1997.646143.
[25] D. R. Karger and M. Minkoff, "Building Steiner trees with incomplete global knowledge," in Proceedings of the 41st Annual Symposium on Foundations of Computer Science (FOCS '00), IEEE Computer Society, November 2000, pp. 613–623.
[26] M. Charikar and A. Karagiozova, "On non-uniform multicommodity buy-at-bulk network design," in Proceedings of the 37th Annual ACM Symposium on Theory of Computing, ACM, 2005, pp. 176–182.
[27] C. Chekuri, M. T. Hajiaghayi, G. Kortsarz, and M. R. Salavatipour, "Approximation algorithms for non-uniform buy-at-bulk network design," in Proceedings of the 47th Annual IEEE Symposium on Foundations of Computer Science (FOCS '06), Berkeley, Calif, USA, October 2006, pp. 677–686, doi:10.1109/focs.2006.15.
[28] A. Scaglione and S. Servetto,
D.On the interdependence of routing and data compression in multi-hop sensor networksProceedings of the 8th ACM Annual International Conference on Mobile Computing and Networking (MobiCom '02)September 2002Atlanta, Ga, USA14014710.1145/570645.570663GalluccioL.PalazzoS.CampbellA. T.Efficient data aggregation in wireless sensor networks: an entropy-driven analysisProceedings of the IEEE 19th International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC '08)September 2008162-s2.0-6994912611510.1109/pimrc.2008.4699915KandrisD.TsioumasP.TzesA.NikolakopoulosG.VergadosD. D.Power conservation through energy efficient routing in wireless sensor networks2009997320734210.3390/s909073202-s2.0-70350008054KandrisD.TsagkaropoulosM.PolitisI.TzesA.KotsopoulosS.Energy efficient and perceived QoS aware video routing over wireless multimedia sensor networks20119459160710.1016/j.adhoc.2010.09.0012-s2.0-79951677879GalluccioL.CampbellA. T.PalazzoS.Concert: aggregation-based congestion control for sensor networksProceedings of the 3rd International Conference on Embedded Networked Sensor SystemsNovember 2005San Diego, California, USA27427510.1145/1098918.1098951CoverT.ThomasJ.1991New York, NY, USAJohn Wiley & SonsMadimanM.On the entropy of sumsProceedings of the Information Theory Workshop (ITW '08)May 2008Porto, PortugalIEEE30330710.1109/ITW.2008.4578674BoydS.VandenbergheL.2004Cambridge, UKCambridge University Press10.1017/cbo9780511804441ByrkaJ.GrandoniF.RothvoßT.SanitàL.An improved LP-based approximation for steiner treeProceedings of the 42nd ACM Symposium on Theory of Computing (STOC '10)June 201058359210.1145/1806689.18067692-s2.0-77954711587SchrijverA.2003New York, NY, USASpringer