Maximizing the lifetime of wireless sensor networks is NP-hard, and existing exact algorithms run in exponential time. These algorithms implicitly use only one CPU core. In this work, we propose to use multiple CPU cores to speed up the computation. The key is to decompose the problem into independent subproblems and then solve them on different cores simultaneously. We propose three decomposition approaches. Two of them are based on the notion that a tree does not contain cycles, and the third is based on the notion that, in any tree, a node has at most one parent. Simulations on an 8-core desktop computer show that our approach can speed up existing algorithms significantly.

In wireless sensor networks, each sensor node has only a limited amount of energy. A node consumes energy whenever it sends or receives messages, so the amount of traffic through a node determines how long the node can work, which in turn determines the lifetime of the network. Consequently, finding a routing tree that prolongs the lifetime is a key issue, which is known to be NP-hard [

In fact, all existing exact algorithms run in exponential time [

In this work, instead of designing a new algorithm, we consider speeding up existing exact algorithms by using multicore CPUs to their full potential. The basic idea is to decompose the problem into independent subproblems and then solve them on different cores using existing exact algorithms. The challenge is how to decompose the problem. We propose three decomposition methods for different exact algorithms. The first is based on the fact that a tree does not contain (undirected) cycles, so we can break the network into subnetworks whenever we encounter an undirected cycle; this approach applies to all algorithms that treat the network as either an undirected or a directed graph. The second is based on directed cycles: the network is divided whenever we find a directed cycle. The third is based on the fact that every node has exactly one parent node, so the network is divided according to the possible parent choices of a given node. The second and third approaches apply to algorithms that treat the network as a directed graph.

Our contributions can be enumerated as follows:

We consider using the multiple cores of current computers to speed up existing algorithms. The proposed approaches are applicable to all exact algorithms that run on a single CPU core.

We propose three problem decomposition approaches. These approaches decompose the problem into subproblems, which can be solved on different cores using any exact algorithm. We also propose a mechanism that shares information from solved subproblems to help solve the remaining ones.

We implement our approach on an 8-core desktop computer and perform numerical simulations. The results suggest that, in general, the proposed approaches reduce the empirical running time of existing exact algorithms, especially when the problem size is large.

The rest of the paper is organized as follows. Section

Finding routing paths of messages to maximize lifetime is a critical problem in wireless sensor networks (e.g., [

A simple method is to enumerate all spanning trees [

In contrast to these works, our paper focuses on using the multiple cores of current computers to their full potential. The proposed approaches can be incorporated into existing exact algorithms. Although the idea of using multicore processors in wireless sensor networks is not new, existing works do not address our problem. For example, [

We first review the problem and then introduce the solution framework. A sensor network contains

In this work, we assume that the operating system does not automatically parallelize a program; i.e., a single-threaded program can use at most one CPU core. To verify this, we perform a simple experiment: we run an infinite-loop program on two computers, one with 4 cores and the other with 8 cores, both running the Windows operating system. The CPU utilization is roughly 25% on the 4-core computer and about 13% on the 8-core computer, which is consistent with our assumption. Note that if a program has multiple threads, the operating system distributes the threads over the cores automatically.

We refer to the set of feasible solutions of a lifetime maximization problem as its solution space, i.e., the set of directed trees pointing to the sink. A subproblem is a lifetime maximization problem with a smaller solution space. The basic idea is to find a set of subproblems whose solution spaces together contain at least one optimal solution. A decomposition method is

Each subproblem is feasible; i.e., each subproblem contains at least one feasible solution.

At least two subproblems are returned, unless the original problem has only one feasible solution; i.e., the original network graph is itself a tree.

The union of the solution spaces of all subproblems contains at least one optimal solution to the original problem.
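These three conditions can be checked mechanically once subproblems and trees are represented as edge sets. The following is a minimal sketch, assuming a caller-supplied feasibility test and a set of known optimal trees; all names are illustrative, not from the paper:

```python
def is_feasible_decomposition(subproblems, has_feasible_solution,
                              optimal_trees, original_is_tree=False):
    """Check the three conditions of a feasible decomposition method.

    subproblems: list of edge sets (frozensets of (u, v) pairs).
    has_feasible_solution(sp): caller-supplied feasibility test.
    optimal_trees: known optimal trees of the original problem.
    """
    # Condition 1: every subproblem has at least one feasible solution.
    if not all(has_feasible_solution(sp) for sp in subproblems):
        return False
    # Condition 2: at least two subproblems, unless the original
    # network graph is itself a tree (unique feasible solution).
    if len(subproblems) < 2 and not original_is_tree:
        return False
    # Condition 3: the union of the solution spaces keeps at least
    # one optimal tree of the original problem.
    return any(tree <= sp for sp in subproblems for tree in optimal_trees)
```

Here `tree <= sp` tests that every edge of the tree survives in the subproblem, i.e., that the tree is still in the subproblem's solution space.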

Figure

A problem decomposition framework.

A challenge for the above framework is that a feasible decomposition method might not generate a sufficient number of subproblems; for example, a method may yield only two subproblems. To address this, we observe that sequentially composing several feasible decomposition methods again yields a feasible decomposition method.

Suppose

Simply note that an optimal solution to subproblem

Therefore, we can repeatedly apply a feasible decomposition method until the number of subproblems is sufficient. Algorithm

(1)

(2)

(3)

(4)

(5)

(6)

(7)

(8)

(9)

(10)

(11)
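Since the listing above survives only in outline, the loop it describes can be sketched as follows; this is a plausible reconstruction under assumptions, not the paper's exact listing. `decompose` stands for any feasible decomposition method, and a subproblem that cannot be divided further (its graph is already a tree) is kept unchanged:

```python
from collections import deque

def decompose_until(problem, decompose, min_count):
    """Repeatedly apply a feasible decomposition method until at
    least `min_count` subproblems are obtained.

    decompose(p) returns a list of subproblems, or [p] itself when
    p cannot be divided further.
    """
    pending = deque([problem])
    done = []                          # indivisible subproblems
    while pending and len(pending) + len(done) < min_count:
        p = pending.popleft()
        subs = decompose(p)
        if len(subs) == 1 and subs[0] == p:
            done.append(p)             # already a tree: keep as-is
        else:
            pending.extend(subs)       # feasible split: recurse on parts
    return done + list(pending)
```

The loop terminates because each split strictly shrinks the subproblems, and it stops early once every remaining subproblem is indivisible.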

In Algorithm

Algorithm

To see that Algorithm

For the number of calls to

To prove this claim, observe that each iteration of the while loop either increases

For the last part, if

This theorem suggests that when

The straightforward method is to create one thread per subproblem. Each thread invokes an exact algorithm on its subproblem, and the operating system schedules the threads on the available CPU cores automatically. Unfortunately, this approach has several drawbacks.

First, if the number of subproblems is greater than the number of cores, then some core runs several threads. These threads compete for the core unnecessarily, wasting precious CPU time. Second, if the number of subproblems is required to be at most the number of cores, then some cores sit idle once their threads terminate early. Third, subproblems are solved independently, so solving one subproblem cannot help solve the others. For example, if a solved subproblem has a solution with lifetime 100, then for the remaining unsolved subproblems we should not waste time exploring solutions with a smaller lifetime.

To address these limitations, we create one thread per core, so that the number of threads equals the number of cores. Each thread repeatedly performs the following steps:

Retrieve an unsolved problem and the best solution up to now.

Invoke an exact algorithm on the unsolved subproblem with the best solution up to now as a lower bound.

Mark the subproblem as solved, and update the best solution up to now.

Figure

Incorporating feedback to the framework to reduce unnecessary computation. Solved subproblems can provide lifetime lower bound to unsolved subproblems.

We can see that this approach avoids the above limitations. First, since the number of threads equals the number of cores, no two threads compete for the same core. Second, the CPU cores are fully utilized, since they keep running until all subproblems are solved. Third, when a thread starts to solve a subproblem, it retrieves the status of the solved subproblems, e.g., the lifetime of the current best solution, which helps avoid unnecessary computation.
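The worker loop can be sketched as below. The paper's implementation is in Java; this Python sketch makes the same points (one thread per core, a shared task queue, and a shared lower bound protected by a lock). `solve_with_bound` stands in for any exact algorithm that accepts a lifetime lower bound; its name and signature are assumptions for illustration:

```python
import os
import queue
import threading

def solve_all(subproblems, solve_with_bound, num_threads=None):
    """Run one worker thread per core. Each worker repeatedly takes an
    unsolved subproblem, solves it with the best lifetime found so far
    as a lower bound, and publishes any improvement.

    solve_with_bound(subproblem, lower_bound) -> (tree, lifetime).
    """
    tasks = queue.Queue()
    for sp in subproblems:
        tasks.put(sp)
    best = {"lifetime": float("-inf"), "tree": None}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                sp = tasks.get_nowait()
            except queue.Empty:
                return                       # all subproblems taken
            with lock:                       # read the current lower bound
                bound = best["lifetime"]
            tree, lifetime = solve_with_bound(sp, bound)
            with lock:                       # publish an improvement
                if lifetime > best["lifetime"]:
                    best["lifetime"], best["tree"] = lifetime, tree

    n = num_threads or os.cpu_count() or 1
    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return best["tree"], best["lifetime"]
```

Because workers pull from a shared queue, a core that finishes early simply takes the next unsolved subproblem instead of idling.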

We propose three decomposition methods based on different observations. First, a tree does not contain undirected cycles. Second, a tree does not contain directed cycles. Third, a node has only one parent in a directed tree.

This approach applies to undirected graphs. Observe that a feasible solution to the lifetime maximization problem is a tree, so no feasible solution can contain a cycle. The basic idea is to find an undirected cycle and create subproblems by breaking it, i.e., removing one of its edges at a time. Each subproblem thus contains one fewer edge than the original problem. Figure

Breaking an undirected cycle to create three subproblems.

One design issue is which cycle to break. We propose choosing the cycle with the minimum number of edges. The motivation is to generate a small number of subproblems in each step, so that the total number of subproblems can be controlled more easily when calling Algorithm

Algorithm

(1) find a cycle with minimum length by the MIN_CIRCUIT algorithm in [

(2)

(3)

(4)

(5)

(6)

Algorithm

It is easy to verify that the first two conditions of a feasible decomposition method are satisfied, since a cycle contains at least three edges. For the third condition, let the original problem be

To prove this, consider an arbitrary feasible solution

For the running time, note that the algorithm in [
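The UnCycle step can be sketched as follows. We do not reproduce the cited MIN_CIRCUIT algorithm here; as an assumed stand-in, the sketch finds a minimum-length cycle by the simple edge-removal method (for each edge, find the shortest path between its endpoints that avoids the edge), which is slower but returns a cycle of the same minimum length:

```python
from collections import deque

def shortest_path(adj, src, dst, banned):
    """BFS shortest path src -> dst, ignoring the undirected edge `banned`."""
    prev = {src: None}
    q = deque([src])
    while q:
        u = q.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for w in adj[u]:
            if {u, w} == banned or w in prev:
                continue
            prev[w] = u
            q.append(w)
    return None

def min_undirected_cycle(nodes, edges):
    """A shortest undirected cycle as a list of undirected edges, or None."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    best = None
    for u, v in edges:
        path = shortest_path(adj, u, v, {u, v})
        if path and (best is None or len(path) < len(best)):
            best = path                  # cycle = this path closed by (u, v)
    if best is None:
        return None
    return [frozenset({best[i], best[i + 1]}) for i in range(len(best) - 1)] \
        + [frozenset({best[-1], best[0]})]

def uncycle_decompose(nodes, edges):
    """One subproblem per cycle edge: remove that edge from the network."""
    cycle = min_undirected_cycle(nodes, edges)
    if cycle is None:
        return [set(edges)]              # already a tree: indivisible
    return [{e for e in edges if frozenset(e) != c} for c in cycle]
```

Each subproblem drops exactly one edge of the chosen cycle, so a minimum cycle of length three yields only three subproblems, matching the design goal above.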

When the network graph is directed, no feasible solution contains a directed cycle. Thus, we first find a directed cycle and create one subproblem per edge by removing that edge from the cycle. We choose a minimum-length cycle so that the total number of subproblems can be better controlled. Figure

A directed network graph and three subproblems by breaking directed cycle ABCA.

One problem for this approach is that there may exist a subproblem that does not contain any feasible solution to the original problem. For example, in Figure

Algorithm

(1)

(2) let

(3)

(4)

(5)

(6)

(7)

(8)

(9)

(10)

(11)

(12)

(13)

(14)

(15)

(16)

Algorithm

Consider the recursion tree of Algorithm

For the running time, line (2) in Algorithm
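A sketch of the DCycle step, including the feasibility filter discussed above (a subproblem is dropped when some node can no longer reach the sink). For simplicity the sketch takes an arbitrary DFS cycle rather than the minimum one the paper prefers for controlling the subproblem count:

```python
from collections import deque

def find_directed_cycle(nodes, edges):
    """Return some directed cycle as an edge list via DFS, or None."""
    adj = {v: [] for v in nodes}
    for u, v in edges:
        adj[u].append(v)
    color = {v: 0 for v in nodes}          # 0 new, 1 on stack, 2 done
    stack = []

    def dfs(u):
        color[u] = 1
        stack.append(u)
        for w in adj[u]:
            if color[w] == 1:              # back edge: recover the cycle
                cyc = stack[stack.index(w):] + [w]
                return [(cyc[k], cyc[k + 1]) for k in range(len(cyc) - 1)]
            if color[w] == 0:
                found = dfs(w)
                if found:
                    return found
        stack.pop()
        color[u] = 2
        return None

    for v in nodes:
        if color[v] == 0:
            found = dfs(v)
            if found:
                return found
    return None

def all_reach_sink(nodes, edges, sink):
    """Check that every node can reach the sink (reverse BFS)."""
    radj = {v: [] for v in nodes}
    for u, v in edges:
        radj[v].append(u)
    seen = {sink}
    q = deque([sink])
    while q:
        for u in radj[q.popleft()]:
            if u not in seen:
                seen.add(u)
                q.append(u)
    return len(seen) == len(nodes)

def dcycle_decompose(nodes, edges, sink):
    """One subproblem per cycle edge; drop infeasible subproblems."""
    cycle = find_directed_cycle(nodes, edges)
    if cycle is None:
        return [set(edges)]
    subs = [set(edges) - {e} for e in cycle]
    return [s for s in subs if all_reach_sink(nodes, s, sink)]
```

The reverse BFS from the sink is exactly the feasibility check motivated above: if removing a cycle edge strands some node, that subproblem contains no spanning tree pointing to the sink and is discarded.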

Observe that every node except the sink has exactly one parent in a directed tree. Thus, given a node, we can create subproblems by keeping one of its outgoing edges, which fixes its parent, and deleting the others. Figure

A directed network graph and three subproblems by fixing the parent node of vertex

Two issues need to be addressed. First, a subproblem may not be feasible; this is similar to Section

Algorithm

(1)

(2) sort nodes in ascending order by initial energy, and let

(3)

(4)

(5)

(6)

(7)

(8)

(9)

(10)

(11)

Algorithm

Algorithm

For the running time, sorting nodes in line (2) runs in
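The FixP step can be sketched as follows; per line (2) above, nodes are considered in ascending order of initial energy, and if fixing one node's parent yields fewer than two feasible subproblems the sketch moves on to the next node. The helper names are illustrative:

```python
from collections import deque

def all_reach_sink(nodes, edges, sink):
    """Check that every node can reach the sink (reverse BFS)."""
    radj = {v: [] for v in nodes}
    for u, v in edges:
        radj[v].append(u)
    seen = {sink}
    q = deque([sink])
    while q:
        for u in radj[q.popleft()]:
            if u not in seen:
                seen.add(u)
                q.append(u)
    return len(seen) == len(nodes)

def fixp_decompose(nodes, edges, sink, energy):
    """Fix the parent of a low-energy node: one subproblem per outgoing
    edge of that node (the kept edge becomes its only parent link)."""
    order = sorted((u for u in nodes if u != sink), key=lambda u: energy[u])
    for v in order:
        out = [e for e in edges if e[0] == v]
        subs = [{e for e in edges if e[0] != v or e == keep} for keep in out]
        subs = [s for s in subs if all_reach_sink(nodes, s, sink)]
        if len(subs) >= 2:                 # a proper, feasible split
            return subs
    return [set(edges)]                    # no node gives a split: indivisible
```

Preferring the node with the least initial energy is a natural reading of line (2): that node is likely the lifetime bottleneck, so fixing its parent prunes the search space where it matters most.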

In this section, we analyze the overall running time of algorithms and discuss several related issues.

Suppose there are

It follows from the pigeonhole principle.
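Under assumed notation not fixed by the surviving text (let $m$ be the number of subproblems and $p$ the number of cores), the pigeonhole step can be written out as:

```latex
% If every core were assigned fewer than \lceil m/p \rceil subproblems,
% the total number of subproblems would be at most
p\left(\left\lceil \tfrac{m}{p} \right\rceil - 1\right) < p \cdot \tfrac{m}{p} = m,
% a contradiction; hence some core is assigned at least
\left\lceil \tfrac{m}{p} \right\rceil \text{ subproblems.}
```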

Incorporating Algorithm

Let

UnCycle runs in

DCycle runs in

FixP runs in

Observe that the running time of each algorithm consists of two parts, one of which is for dividing the problem into

The second part uses

Note that this theorem gives the worst-case running time. Although FixP appears to have the same complexity as DCycle, each subproblem in FixP usually has fewer edges than

Another concern is that the same tree may be produced by different subproblems, wasting computation. This is indeed true for UnCycle and DCycle. However, it happens only if the two subproblems are solved on different cores at the same time: if they are solved sequentially, the solved subproblem provides feedback to the unsolved one, eliminating redundant trees. This mechanism is shown in Figure

In this paper, we consider constructing a single tree for the network. If multiple trees are allowed, i.e., the network uses a different routing tree after some time, then the overall lifetime can be further extended. The drawback of this approach is that sensor nodes need to perform complex operations, e.g., either to record multiple routing paths in memory to change parents periodically or to receive commands from the network periodically. We plan to extend our result to this scenario in the future.

Finally, we discuss the motivation for finding the minimum cycle in UnCycle and the minimum directed cycle in DCycle. There are several reasons. Ideally, we should find a cycle with length

We compare our approach with previous single-thread approaches on randomly generated sensor networks. Sensors are uniformly and randomly distributed in a

Desktop computer configuration.

Operating System | Windows 7 64 bit |
---|---|
CPU | Intel(R) Core(TM) i7-4770 Processor |
Number of Cores | 8 |
Memory | 8GB |
Java Runtime Environment | JRE 1.8.0 |
ILP Solver | lp_solve 5.5 |

A randomly generated network consisting of 41 nodes.

We implement the proposed decomposition methods to generate subproblems: decomposition by breaking an undirected cycle (UnCycle), decomposition by breaking a directed cycle (DCycle), and decomposition by fixing the parent (FixP). We consider networks with

We study the improvement of our approach on ILP-B in terms of average running time. Figure

Average running time of ILP-B with three decomposition methods under different numbers of subproblems.

UnCycle

DCycle

FixP

We have two observations from the figure. First, our approach significantly reduces the average running time, which is in line with intuition, since all CPU cores are used. Second, the average running time is not smallest when the number of subproblems is either small (8) or large (20); we get a smaller running time with 12 or 16 subproblems. When the number of subproblems is small, most subproblems are still very similar to the original problem; when there are too many subproblems, even though most of them are simpler, it is more likely that we encounter a difficult one. Indeed, in the lifetime maximization problem, a smaller problem size does not guarantee a shorter running time. Thus, we recommend setting the number of subproblems to at most twice the number of CPU cores.

We show that approximating the running time of an unsolved instance as ten minutes is reasonable, in that the relative ordering of the algorithms' running times remains the same under this approximation. To this end, we vary the maximum allowed running time from 2 minutes to 10 minutes in increments of 2 minutes. Figure

We study the improvement on ILP-BD with the same problem instances in Section

Average running time of ILP-BD with three decomposition methods under different numbers of subproblems.

UnCycle

DCycle

FixP

Impact of the maximum allowed running time on the computation of average running time.

We can see again that our approach greatly reduces the average running time, and the improvement is more significant as the number of nodes grows. Setting the number of subproblems to 12 or 16 again gives a smaller running time than 8 or 20. Note that it is not fair to compare ILP-BD with ILP-B using Figures

Besides the average running time, we show that the number of failed networks is smaller with our approach. Figures

Number of failed networks of ILP-B.

Number of failed networks of ILP-BD.

We test the algorithms on a real network reported in [

Running time of each method on the network in [

Methods | Running time |
---|---|
Single-thread | 175.285 s |
UnCycle | 1.395 s |
DCycle | 37.831 s |
FixP | 0.746 s |

A network with 49 nodes in [

In this paper, we proposed using multiple CPU cores to speed up existing exact algorithms for finding an optimal routing tree in wireless sensor networks. The basic idea is to decompose the original problem into multiple subproblems and solve them on different cores. We proposed three decomposition methods and proved their correctness. Numerical results show that the three methods speed up the computation significantly, in terms of both average running time and the number of solved problems.

The data used to support the findings of this study are available from the corresponding author upon request.

The authors declare that they have no conflicts of interest.

This work was supported by the National Natural Science Foundation of China (61502232) and China Postdoctoral Science Foundation (2015M570445, 2016T90457).