Toward Performant and Energy-Efficient Network Queries: A Parallel and Stateless Approach



Introduction
In our last conference paper [1], we aimed to construct a three-layer topology for wireless sensor networks (WSNs) based on the Chinese remainder theorem (CRT). A mechanism to reduce data redundancy was proposed, and a fault-tolerant and low-power transmission path was designed to improve the transmission efficiency and realize parallel queries. In this paper, we not only refine that previous work, but also propose a set of event neighbor nodes to eliminate the redundancy of raw data during data transmission. At the same time, a new sleep mechanism and wake-up mechanism are designed to enable the sensor nodes to work in an orderly and efficient manner, thereby further extending the life cycle of the entire network.
With the rapid development of the Internet and perception technology, we have gradually entered the era of the Internet of Things (IoT), in which all things are connected. The IoT has also become a backbone of social and economic development [2]. As one of the most important perceptual terminals of the IoT, the WSN plays a very important role in promoting its development. As a revolution in information perception and collection technology, the WSN is one of the most important technologies of the 21st century. It has wide application prospects in military, environmental monitoring and forecasting, industrial and agricultural production, medical and health care, smart home, smart city, space exploration, and other fields. Specifically, the emerging trend in WSNs is to adopt large numbers of lightweight, inexpensive, low-power, and versatile sensor nodes, which can carry out simple computing and storage tasks by themselves instead of only transmitting the data to a remote data center. These sensors are usually battery-powered and deployed in environments that may be unsuitable (e.g., hydrothermal) for people to work in. Even if there is no technical barrier to replacing the batteries, the cost may be prohibitively high due to the sheer number of sensor nodes (e.g., climate forecasting, forest fire warning).
To address that, various energy-conservation techniques have been proposed. Olatinwo and Joubert [3] developed a successive wireless power sensor network (WPSN) system embedded with a scheduling algorithm for the allocation of resources in IoT sensor networks; Nakas et al. [4] summarized how many protocols reduce energy consumption to extend the life cycle of WSNs; Choubin and Taherpour [5] improved the performance of WSNs by utilizing the optimal collaboration between a multiantenna fusion center (FC) and a sensor with energy-harvesting (EH) capability; other literature focuses on how to use and optimize solar energy solutions to improve the life cycle and network performance of WSNs [6][7][8]. At a high level, these state-of-the-art techniques can be grouped into two categories: (i) better management of energy consumption using new, ad hoc algorithms and protocols and (ii) energy harvesting from the environment (e.g., solar and tidal). In general, limited energy is the most important factor that restricts the development of WSNs.
While the existing WSN literature has mostly focused on reducing energy consumption without incurring much query performance overhead [9,10], this work pushes the frontier further and strives to answer the following question: how, and by how much, can we reduce the energy consumption and improve the performance of network queries at the same time? To that end, this paper takes a radical approach: we propose to drop the conventional routing tables and achieve addressing and redirection through a novel encoding scheme inspired by the CRT [1]. Specifically, all of the entities in a three-tier WSN (sensor nodes, storage nodes, and the base station) are effectively connected through the keys and coordinates derived from a topology encoded by the CRT. The benefits of removing routing tables are threefold: (1) It eliminates the overhead of real-time updates to stateful routing tables, which consequently reduces the network traffic and energy consumption. (2) The system becomes more reliable, as the centralized component (i.e., the routing table residing in the base station) is completely excluded. (3) It allows for parallel queries that are backed by the CRT and completely decoupled from the underlying physical setup.
In addition to the CRT-encoded network topology, we propose three more techniques to further improve the query performance and lower the energy consumption: (i) hibernating sensor nodes under certain storage nodes coverage; (ii) parallel data preprocessing in local sensor nodes; and (iii) low-power data transmission on fault-tolerant paths. Hibernating sensor nodes can conserve the network energy and reduce the redundant data generation. The preprocessing applies data aggregation and deduplication to clusters of neighborhood nodes defined by the sensor nodes' transmission radii in the same period. The transmission of the messages follows only low-power and fault-tolerant paths that are recursively refined by our proposed algorithms.
1.1. Motivation. The urgent demand for an intelligent topology algorithm in WSNs, as observed in [11] and [12], is the first motivating factor: continuous operation depends on network connectivity and coverage quality. Our ambitious goal is to propose a topology algorithm that enables the sensor nodes in the network to work closely together, provides good network connectivity, enables complete coverage of the monitoring area, effectively reduces communication interference, and improves the network life cycle and quality of service.
In WSNs, the high density of sensor nodes leads to highly redundant data collection, and transmitting this redundant, often incomplete data wastes a lot of energy. Therefore, making the transmitted data more complete, concise, and energy-efficient is the second motivating factor for this work.
In the query algorithms of WSNs [13,14], much research has been done on reducing either query energy consumption or query response time. However, few solutions address both at once. Balancing the tradeoff between query response time and energy consumption is therefore the third motivating factor for this work.
In addition to the requirements for intelligent topology construction, overall network performance, and work efficiency in WSNs as described above, green initiatives for environmental considerations also imply that WSNs should be substantially improved in terms of energy conservation, as discussed by Patil et al. and Semprebom et al. [15,16]. That is, in order to protect the environment and prolong the life cycle of the whole network, intelligent hibernating of sensor nodes is also necessary. To protect the environment, another motivation of our work is to rationalize the density of sensor nodes and reduce the generation of redundant data.

Related Work.
For WSNs, efficient algorithms all need to consider the energy saving as a necessary factor to extend the network life cycle, and the same is true for the algorithms in this paper. In the following, we present the related work of this paper in terms of WSN topology algorithms, dormancy strategies, query-driven data aggregation, and transmission routing strategies.
Various works were proposed to improve the energy efficiency and network connectivity from the perspective of network topology. For instance, Phan and Kim [17] proposed a virtual topology-based time synchronization protocol (VTSP) to reduce the data redundancy, which is caused during the creation of the virtual links, in timing messages and for speeding up the convergence by excluding edge nodes from the consensus process. Wang et al. [18] proposed a novel head wolf mutation topology strategy, which increases the neighborhood search range of the optimal solution, enhances the uniformity of wolf pack distribution and the ergodicity ability of the wolf pack search. Liu et al. [19] investigated the issue of topology sensing of wireless networks with distributed sensors to handle the situation of limited and unreliable information, which include a time series of the signal presence instants and the corresponding transmitter identification. Roy et al. [20] presented an adaptive topology control strategy based on reinforcement learning for mobile software-defined WSN. Hu and Li [21] presented a new topology modification strategy for enhancing the robustness of scale-free WSNs. In contrast to existing works, this paper presents a radical change to the way that a WSN is built up. Specifically, to the best of our knowledge, this is the first work that exploits the CRT to encode a three-tier WSN, opening the door to many new opportunities to improve a WSN's performance and sustainability.
Various works were also proposed to improve the energy efficiency through hibernating nodes. Feng et al. [22] proposed to distinguish the redundant nodes based on local Delaunay triangulation and a multinode election dormancy mechanism. Banerjee et al. [23] proposed iSleep, which, based on the available neighboring nodes, the associated network entropy, and the traffic flow pattern, enables sensor nodes to automatically control their sleep state and maintain network connectivity for a longer time. Ahmed et al. [24] proposed the sleep-awake energy efficient distributed (SEED) clustering algorithm. Wang et al. [25] proposed demand sleep MAC (DS-MAC), which allows nodes to adjust their sleep time adaptively according to the number of received data packets in order to communicate efficiently and effectively under dynamic traffic load. In this paper, on the basis of the CRT topology algorithm, we put forward a grid-based sleep mechanism, which gives better consideration to both network connectivity and the network life cycle.
At present, there is much research on query-driven data fusion algorithms and data transmission routing protocols. Sarode and Reshmi [26] proposed the group search optimization (GSO) algorithm with a neural network (NN) for developing a novel query-based data aggregation model. Liu et al. [27] presented a queries privacy preserving mechanism for data aggregation (QPPDA), which may reduce energy consumption by allowing multiple queries to be aggregated into a single packet while preserving data privacy effectively. Bhardwaj and Kumar [28] proposed a wavelet-based least common ancestor-sliding window (WLCA-SW), which addresses energy loss and memory shortage through the successive steps of query processing, duplicate detection, data compression using the wavelet transformation, and data aggregation. Wang et al. [29] proposed a new clustering GR protocol called quadtree grid (QTGrid) to save energy and improve spatial query efficiency. In contrast, the data collection approach taken by this work employs aggregation and deduplication techniques to further lower the energy consumption without compromising the query performance.
In existing routing-table-based communication protocols, the transport path is in the vast majority of cases selected from the routing table. Jain et al. [30] proposed a query-driven virtual wheel based routing protocol to improve the data delivery performance. They also proposed a virtual ring infrastructure based query-driven ring routing protocol (QRRP) to reduce the overhead of updating the mobile sink's location information as well as routing the data toward the current location of the sink [31]. Saleh et al. [32] proposed a multiaware query driven (MAQD) routing protocol for MWSN based on a neuro-fuzzy inference system. In the above algorithms, the sensor node needs to update and store the routing table, which increases memory and energy consumption and, as a consequence, the communication cost and burden. In contrast, this paper proposes a stateless protocol without routing tables to reduce the overall network traffic and extend the life cycle of the entire network.

Contributions.
To summarize, this paper makes the following contributions: (i) We propose a new mechanism to construct the topology of WSNs. The proposed mechanism eliminates routing tables and employs a novel and efficient scheme inspired by the CRT, opening the door to parallel queries and allowing sensor nodes to hibernate when possible. (ii) We design a new protocol to preprocess data on sensor nodes in parallel. The preprocessing effectively aggregates the neighborhood clusters and deduplicates redundant data. (iii) We devise a new algorithm to allow the queries and results to be transmitted on low-power and fault-tolerant paths. The paths are efficiently constructed using recursive elections over subsets of the entire power range.
We have extensively evaluated the proposed system in terms of performance, energy, and reliability. With all the proposed techniques taken together, our system outperforms the state-of-the-art in all three metrics: (i) the query response is improved by up to 21.6%; (ii) the energy consumption is reduced by up to 16.8%; and (iii) the reliability is increased by up to 18.3%.

Organization.
In the remainder of this paper, we give a brief overview of the proposed system's building blocks in Section 2. Section 3 presents the design of the proposed system. We report the experimental results in Section 4. Section 5 concludes the paper.

Preliminaries
2.1. Three-Tier Networks. As shown in Figure 1, this paper assumes the entire network is divided into three layers. The top tier is a base station that issues query tasks and monitors the network's status. The second tier is comprised of a number of storage nodes that persist the data transferred by the sensor nodes and complete portions of the query if possible.

Wireless Communications and Mobile Computing
Large numbers of sensor nodes constitute the third layer, which is responsible for collecting data, lightweight preprocessing, and forwarding packets. The first two tiers (the base station and storage nodes) are supposed to have sufficient power; it is the third tier of sensor nodes where resources are limited. These sensor nodes are assumed to work as follows: (1) they are battery-powered, homogeneous (e.g., same power capacity, same computing and storage capability), and cannot harvest environmental energy; (2) they do not carry GPS modules, but their communication power can be controlled; (3) their topology will not change once deployed (sensor nodes generally cannot be moved because of the large amount of energy required for doing so); (4) they preprocess the collected data and periodically send them to the storage nodes.
In addition, Figure 1 shows a simplified network model where the number of storage nodes in a real application is determined by the size of the application scenario and the communication radius of the storage nodes. And the storage nodes are divided into domains to manage the whole application scenario. There are usually hundreds of sensor nodes in each storage node management domain.

Energy Models.
In this paper, we assume that the base station and storage nodes have sufficient energy. The energy consumption of the sensor nodes comes from four states: the idle state, the hibernating state, the transmitting state, and the receiving state. The energy consumption in the idle and hibernating states is negligible (in these two states, the voltage of the sensor node is adjusted to a lower level); the transmitting energy ($E_{TX}$) and the receiving energy ($E_{RX}$) of transmitting an $n$-bit message over a distance $d$ are calculated as follows:
$$E_{TX}(n, d) = n E_{elec} + n \epsilon_{amp} d^{k},$$
$$E_{RX}(n) = n E_{elec},$$
where $E_{elec}$ denotes the energy consumption of each transmitter or receiver per bit, $\epsilon_{amp}$ denotes the energy consumption of the amplifier per bit per square meter, and $k$ denotes the propagation attenuation index. In practice, the choice of $k$ [33] usually ranges between 2 and 5 depending on the environment; the higher the expected interference (e.g., high buildings and dense forests), the higher the value of $k$ should be set.
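As a rough illustration of the energy model, the following minimal sketch evaluates the transmit and receive costs under the commonly used first-order radio model, which is assumed here to match the definitions of $E_{elec}$, $\epsilon_{amp}$, and $k$ above. The function names and the numeric constants are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def transmit_energy(n_bits, distance, e_elec, eps_amp, k=2):
    """Energy to send n_bits over `distance` (assumed first-order radio model)."""
    return n_bits * e_elec + n_bits * eps_amp * distance ** k


def receive_energy(n_bits, e_elec):
    """Energy to receive n_bits; independent of distance in this model."""
    return n_bits * e_elec


# Illustrative constants only (assumptions, not values from the paper).
E_ELEC = 50e-9      # joules per bit spent by the transceiver electronics
EPS_AMP = 100e-12   # joules per bit per m^k spent by the amplifier

tx = transmit_energy(1000, 10.0, E_ELEC, EPS_AMP, k=2)
rx = receive_energy(1000, E_ELEC)
print(f"TX: {tx:.2e} J, RX: {rx:.2e} J")
```

Raising `k` (e.g., for dense forests) makes the distance term dominate quickly, which is why short multihop paths tend to be cheaper than one long transmission.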

Chinese Remainder Theorem.
What the CRT essentially says is: if we know the remainders of an integer $x$ divided by several pairwise coprime integers, then we can uniquely determine the remainder of $x$ divided by the product of those integers. Since all information is deterministic, a protocol that uses the CRT to identify membership relationships and maintain proximity (e.g., network topology) is stateless. Formally, let $n_1, n_2, \cdots, n_k$ be pairwise coprime integers:
$$\gcd(n_i, n_j) = 1, \quad i \neq j.$$
Let $r_1, r_2, \cdots, r_k$ be integers. Given the following congruence equations:
$$x \equiv r_i \pmod{n_i}, \quad i = 1, 2, \cdots, k,$$
we can uniquely calculate $x$ (modulo $N$) as follows:
$$x \equiv \sum_{i=1}^{k} r_i N_i M_i \pmod{N},$$
and
$$N = \prod_{i=1}^{k} n_i, \quad N_i = N / n_i, \quad M_i N_i \equiv 1 \pmod{n_i}.$$
We call $N$ the CRT modulus and $x$ the CRT solution.
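To make the theorem concrete, here is a minimal sketch of the standard CRT construction. The function name `crt_solve` is hypothetical; the modular inverse relies on Python's built-in three-argument `pow` with exponent `-1`, which requires Python 3.8 or later.

```python
from math import gcd, prod


def crt_solve(remainders, moduli):
    """Return (x, N) with x = r_i (mod n_i) for all i and 0 <= x < N."""
    for i, a in enumerate(moduli):
        for b in moduli[i + 1:]:
            if gcd(a, b) != 1:
                raise ValueError("moduli must be pairwise coprime")
    N = prod(moduli)
    x = 0
    for r_i, n_i in zip(remainders, moduli):
        N_i = N // n_i
        M_i = pow(N_i, -1, n_i)  # modular inverse of N_i modulo n_i
        x += r_i * N_i * M_i
    return x % N, N


x, N = crt_solve([2, 3, 2], [3, 5, 7])
print(x, N)  # 23 105
```

Because the result depends only on the remainders and the moduli, any node that knows them can recompute `x` without consulting shared state.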

CRT-Encoded Topology.
This work builds the three-tier CRT coding system from the bottom up. That is, we assume the total number of sensor nodes is known, then we calculate the size of the storage nodes, and finally, we determine the base station's keyword $N$, as shown at the top of Figure 2.
Once the range of $N$ is known, we employ a top-down approach to determine all the keywords for storage nodes, each of which further determines the keywords of its assigned sensor nodes. According to the CRT, $N$ is divided into $k$ coprime integers $n_1, n_2, \cdots, n_k$, where $k$ represents the number of storage nodes in the network. Each storage node's $n_k$, again according to the CRT, is decomposed into coprime integers $n_{kj}$, where $j$ ranges over the child sensor nodes of storage node $n_k$, whose count is statistically between several hundred and one thousand. As shown in Figure 1, we use a simplified approach and represent hundreds of child sensor nodes with several sensor nodes. In the example illustrated in Figure 2, $j = [1 \ldots a]$ for $k = 1$. There are two main operations: (i) a downward CRT operation indicates that a decomposition is performed; and (ii) an upward LCM (lowest common multiple) operation represents the least common multiple taken across those sensor nodes. We also require that the keywords assigned to storage nodes and sensor nodes increase monotonically:
$$n_1 < n_2 < \cdots < n_k$$
and
$$n_{k1} < n_{k2} < \cdots < n_{kj}.$$
3.1.1. Estimating Internode Communications. All nodes can communicate directly with each other as long as they are neighbor nodes. The set $N(i)$ of node $A_i$'s neighbors is defined as follows:
$$N(i) = \{ A_j \mid d(A_i, A_j) \le P_{range} \},$$
where $P_{range}$ denotes the power range and $d(A_i, A_j)$ denotes the Euclidean distance between nodes $A_i$ and $A_j$. Since no GPS module is available, this distance is estimated from the received signal power, where $\alpha$ denotes the sent signal power per square meter, $SP_{A_j \to A_i}$ denotes the received signal power at $A_i$, and $\beta$ denotes the loss coefficient. From the above definition, it is easy to see that the neighborhood relationship between two neighbor nodes is symmetric:
$$A_j \in N(i) \iff A_i \in N(j).$$
We will revisit the concept of neighbor nodes when we discuss how aggregation is performed for a set of events simultaneously collected by multiple sensor nodes.
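The neighbor definition can be sketched in a few lines. This is a simulation-style sketch under an explicit assumption: node positions are given as planar coordinates, whereas the paper's nodes have no GPS and estimate distances from received signal power. The function name `neighbor_sets` and the sample coordinates are hypothetical.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def neighbor_sets(positions, p_range):
    """Map each node to the set of other nodes within p_range of it."""
    return {
        a: {b for b in positions
            if b != a and dist(positions[a], positions[b]) <= p_range}
        for a in positions
    }


# Hypothetical coordinates in meters, used only for this illustration.
positions = {"A1": (0.0, 0.0), "A2": (3.0, 4.0), "A3": (10.0, 0.0)}
nbrs = neighbor_sets(positions, p_range=5.0)
print(nbrs)
```

Because Euclidean distance is symmetric, `b in nbrs[a]` holds exactly when `a in nbrs[b]`, which mirrors the symmetry property stated in the text.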
The base station broadcasts a hello message to wake up the storage nodes and collects the initial status, including the total number of storage nodes. When a storage node receives the hello message, it sends its status table to the base station. When the base station receives the reply message from each storage node, it pushes the messages into a queue according to the arrival times. The total number of storage nodes can be calculated as the length of the queue.
To maximize the data locality, we want the storage node closest to the base station to manage the largest number of sensor nodes. Thus, when assigning keywords to storage nodes, the coprime integers $n_1, n_2, \cdots, n_k$ are assigned in the reverse order of the queue at the base station, because the closer storage nodes usually acknowledge the hello message in shorter times. Once the keywords (i.e., the $n_k$'s) are assigned, each storage node can be uniquely identified by a two-dimensional coordinate [idx, key], where idx represents the index in the queue and key represents the integer calculated by the CRT principle (e.g., $n_k$).
Formally, the protocol to construct the storage node topology is illustrated in Algorithm 1. Due to limited space, we do not give the details of verifying whether a specific message has been received by the correct storage node. Essentially, this can be efficiently done by verifying the timestamps and distances during the multiple transmissions between the base station and the storage nodes, which are conducted in Lines 7, 11, and 12. Since wireless communication is unreliable, we do not consider the timing error caused by the specific communication of nodes when discussing the time complexity of the algorithm. Considering only the algorithm itself, its time complexity is $T(n) = O(n)$.
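The key assignment for storage nodes can be sketched as follows. This is only one reading of the description, stated as an assumption: the reply queue is ordered by arrival time (closest node first), and the coprime keys are handed out in reverse so that the first responder receives the largest key. The function name `assign_storage_keys` and the node labels are hypothetical, and the sketch omits the message verification steps of Algorithm 1.

```python
from itertools import combinations
from math import gcd, prod


def assign_storage_keys(reply_queue, coprime_keys):
    """Return {node: (idx, key)}; earlier repliers receive larger keys."""
    if len(reply_queue) != len(coprime_keys):
        raise ValueError("need exactly one key per storage node")
    if any(gcd(a, b) != 1 for a, b in combinations(coprime_keys, 2)):
        raise ValueError("keys must be pairwise coprime")
    keys = sorted(coprime_keys, reverse=True)
    return {node: (idx, key)
            for idx, (node, key) in enumerate(zip(reply_queue, keys))}


# Hypothetical queue: S_near replied first, S_far last.
table = assign_storage_keys(["S_near", "S_mid", "S_far"], [7, 11, 13])
N = prod(key for _, key in table.values())  # base station keyword
print(table, N)
```

Since the keys are pairwise coprime, a check such as `N % key == 0` identifies membership without any routing state, which is the stateless property the design relies on.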

Constructing the Topology of Sensor Nodes.
Like the topology of storage nodes, we start constructing the sensor node topology by applying the CRT to decompose the key of a specific storage node into coprime integers. Note that one key difference between the storage node topology and the sensor node topology is the order of the coprime integers assigned to the child nodes. Because sensor nodes are already the leaf nodes in the hierarchy of the three-tier network, we do not make specific optimizations on the size of the keys. As a result, the keys are assigned to the sensor nodes in increasing order, as opposed to the decreasing order used in the storage node topology. Because the sensor nodes are deployed at a high density, they usually exceed the upper limit that a storage node can manage. In order to better manage those overdeployed sensor nodes, we propose to partition the storage node's range into smaller sectors ($\theta$ is the smallest unit sector), as shown in Figure 3. Let $k$ denote the maximum number of nodes and $R$ denote the communication radius of the storage node A. Specifically, we align the starting edge of the first sector with the positive X-axis and evenly divide the entire range into $k$ equal sectors in the counterclockwise direction. The storage node A then broadcasts the queue of initial statuses. A sensor node B is located in sector $n$ as long as the angle $\delta$ between B and the positive X-axis satisfies
$$(n - 1)\theta \le \delta < n\theta, \quad \theta = 2\pi / k.$$
According to their locations (i.e., $\delta$), sensor nodes acquire the initial statuses, retrieve the sector numbers, and estimate the distances to the storage node. The three-dimensional coordinates of the child sensor nodes are defined as follows: the sector number of the sensor node serves as the secondary coordinate (i.e., $y$), and the $x$ and $z$ coordinates are assigned by the storage node key and the increasing order based on the distance to the storage node, respectively.

Algorithm 1: Protocol to construct the storage node topology.
Input: The message queue at the base station $Q$; all storage nodes $A_i$ ($1 \le i \le k$)
1: Base station broadcasts the hello message
2: Storage nodes initialize status, respond to base station
3: Base station enqueues the returned messages
4: Base station splits $N$ into coprime $n_1, n_2, \cdots, n_k$ by CRT
5: …
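The sector rule for sensor nodes can be illustrated with a short sketch. It assumes that the angle $\delta$ to the positive X-axis is already known to the node, that the $k$ sectors are equal and numbered counterclockwise from 1, and that each sector is a half-open interval; the function name `sector_number` is hypothetical.

```python
import math


def sector_number(delta, k):
    """Return the 1-based index of the sector containing angle delta (radians)."""
    theta = 2 * math.pi / k          # angular width of one sector
    delta = delta % (2 * math.pi)    # normalize into [0, 2*pi)
    return int(delta // theta) + 1


print(sector_number(0.1, 4))   # first sector
print(sector_number(3.0, 4))   # second sector (pi/2 <= 3.0 < pi)
print(sector_number(-0.1, 4))  # wraps around to the last sector
```

The resulting sector number would serve as the $y$ coordinate of the sensor node, next to the storage node key ($x$) and the distance rank ($z$).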
The following example illustrates how to reduce the range overlap between sensor nodes. It should be noted that the complete elimination of sensor node overlap is not realistic in practice. Here we attempt to reasonably turn off some sensor nodes (i.e., hibernation nodes) to help deduplicate the collected data (more to be discussed in Section 4.2) and reduce the energy footprint.
In this example, we assume that the storage node's communication radius is 20 m, the density of sensor nodes is 1-10 per $\mathrm{m}^2$, and the sensor node's transmission radius is set to 5 m. Figure 4 shows the hibernation nodes in gray (with communication radii in black) and working nodes in blue (with communication radii in pink); we do not enumerate all nodes due to limited space. The entire area within the range of storage node A is divided into square grids, each of which is associated with a value indicating the number of sensor nodes covering that grid. The value is initialized to 0, incremented when a sensor node's communication radius reaches the grid, and decremented when a sensor node in that radius goes into hibernation (in hibernation, the working voltage of the sensor node is in a low state, but the node can still receive the activation packet). The number is counted when the sensor nodes send a hello message containing the location information, based on which the covered grids can be identified. Sensor nodes landing in the same sector are differentiated by their distances to the storage node. As a result, the overlap areas can be found based on the grid counts; they are shown as shaded areas in Figure 4. That is how the hibernation nodes (i.e., gray nodes) are identified in the figure: they all land in the shaded areas. When a sensor node is about to run out of energy, it sends out a replacement message, and hibernating node(s) wake up to cover the area originally covered by the soon-to-be-dead sensor.
To make matters more concrete, let us focus on two sensor nodes M and N in Figure 4. M turns into the hibernation mode because all the grids within its circle (defined by the transmission radius) are covered by at least one other sensor. On the contrary, N cannot go into hibernation because some grids (toward the lower right) can be reached only by N.
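The grid-counting idea behind hibernation can be sketched as below. This is a simplified illustration, not the paper's Algorithm 2: it assumes planar node coordinates, treats a grid cell as covered when its center lies within the transmission radius, and visits nodes greedily in dictionary order. The function names and the sample layout are hypothetical.

```python
from itertools import product


def covered_cells(center, radius, cell=1.0):
    """Grid cells whose centers lie within `radius` of `center`."""
    cx, cy = center
    span = int(radius // cell) + 1
    cells = set()
    for i, j in product(range(-span, span + 1), repeat=2):
        gx, gy = int(cx // cell) + i, int(cy // cell) + j
        px, py = (gx + 0.5) * cell, (gy + 0.5) * cell
        if (px - cx) ** 2 + (py - cy) ** 2 <= radius ** 2:
            cells.add((gx, gy))
    return cells


def select_hibernating(nodes, radius, cell=1.0):
    """Greedily hibernate nodes whose cells are all covered by another node."""
    cover = {n: covered_cells(p, radius, cell) for n, p in nodes.items()}
    count = {}
    for cells in cover.values():
        for c in cells:
            count[c] = count.get(c, 0) + 1      # increment per covering node
    asleep = []
    for n, cells in cover.items():
        if cells and all(count[c] >= 2 for c in cells):
            for c in cells:
                count[c] -= 1                   # decrement on hibernation
            asleep.append(n)
    return asleep


# Hypothetical layout: M and N fully overlap, P is isolated.
nodes = {"M": (5.0, 5.0), "N": (5.0, 5.0), "P": (50.0, 50.0)}
asleep = select_hibernating(nodes, radius=2.0)
print(asleep)
```

Once M hibernates, the counts under N drop to one, so N must stay awake, which matches the reasoning given for nodes M and N in the text.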
Formally, the protocol to construct the topology of sensor nodes is depicted in Algorithm 2. The algorithm is self-explanatory and is exempt from time analysis since it runs indefinitely, continuously monitoring the statuses of sensor nodes like a daemon service. The algorithm looks similar to Algorithm 1: the base station is now replaced by a specific storage node and the storage node becomes a sensor node.
Another key difference is that we consider the hibernation of sensor nodes in Algorithm 2, which is not the case for storage nodes in Algorithm 1. The time complexity of the algorithm is $T(n) = O(n \log n)$. Thus far, we have CRT-based protocols to build both topologies, from the base station to storage nodes and from storage nodes to the sensor nodes. With these protocols, all nodes can communicate: between the base station and all storage nodes; between all storage nodes and their (child) sensor nodes; between distinct storage nodes; and between sensor nodes even if they are not under the same storage node. Each node can be uniquely identified by its coordinates and key in this three-tier network.
3.2. Preprocessing of Collected Data. Since sensor nodes are deployed so that their ranges overlap, the collected data often exhibit high redundancy. In order to reduce the transmission of large amounts of redundant data, we propose to preprocess these data right on the sensor nodes. Specifically, we propose a set of neighbor nodes to collectively deal with the events that occur within this set. The goal of this neighbor set is to aggregate and deduplicate the data collected within a specific period of time. Formally, if during a period $T$ there are $m$ events received by some of the $n$ neighbor nodes, then we say that these $n$ nodes constitute the set of neighbor nodes in the period $T$. In this work, we store the collected data (or events) in units of 16-byte $E_i$, so an entire event $EV$ can be represented by $E_1 \cup E_2 \cup \cdots$. A sensor node may only collect some portions of an event due to various reasons such as network jitter.

Building the Set of Event Neighbors.
To illustrate how the aggregation and deduplication work, we start with a hypothetical example where five events ($EV_1, EV_2, \cdots, EV_5$) happen to be received by a set of 10 neighbor sensors ($A, B, \cdots, J$) within a period $T$, as shown in Figure 5. Let us assume those five events are of various lengths; for example, $\mathrm{len}(EV_1) = 5$ means that event $EV_1$ consists of five 16-byte units. The above setup illustrates an important fact of the real world: shorter events (e.g., $EV_1$, $EV_3$) could be entirely received by multiple sensor nodes, while longer events (e.g., $EV_2$, $EV_4$, $EV_5$) are usually fragmented and collected by distinct sensor nodes.
These 10 nodes do not know up front whether they are in the same set of event neighbors. Rather, after they receive messages from others, they check whether the responding node holds any (partial) event that is also (partially) collected locally. Take node A for example: because A collects both $EV_1$ and $EV_3$, any node that has touched either event (again, not necessarily in its entirety) belongs to A's set of event neighbors $N_e(A)$.
… transmission between sensor nodes and storage nodes. It is a recursive procedure and starts with all the nodes as the base case. Recall that the $z$ coordinate of the sensor node stores its distance to its parent storage node (Section 3.1.3). Thus, we can sort all the sensor nodes according to their $z$ coordinates. Without loss of generality, we assume these 10 nodes' $z$ coordinates are in increasing order:
$$z_A < z_B < \cdots < z_J.$$
Thus, we can efficiently calculate the closest node from each node's set of event neighbors. In fact, in this example, nodes A, B, C, D, E, F, G, and J all have A as their $z_{min}$ node; we do not show the detailed calculation for brevity.
All nodes then send out their own $z_{min}$ values to the network, and all nodes then learn that {A, C, G} is the set of nodes satisfying the two conditions. The $z_{min}$ nodes then perform a union operation over all the event fragments sent from the event neighbors. The sensor nodes other than the $z_{min}$ nodes can then turn into the idle mode to save energy. As a concrete example, since $N_e(G) = \{A, B, D, F, G, J\}$, node G aggregates and deduplicates all event fragments collected at nodes {A, B, D, F, G, J}, which results in the final message that will be sent to the storage node.
In the same way, nodes A and C also perform the aggregation and deduplication. Then, the above procedure is repeated over {A, C, G}. In this example, A is eventually delegated to communicate with the storage node; for the sake of space, we do not show the details of the derivation.
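One round of the event-neighbor procedure can be sketched as follows. This is a simplified reading under stated assumptions: two nodes are event neighbors when they hold fragments of at least one common event, each node nominates the member of its neighbor set with the smallest $z$ coordinate, and fragments are modeled as sets of unit indices. The function names and the three-node data set are hypothetical and much smaller than the 10-node example in the text.

```python
def event_neighbors(fragments):
    """fragments: node -> {event_id: set of unit indices held by that node}."""
    return {
        a: {b for b, fb in fragments.items() if fa.keys() & fb.keys()}
        for a, fa in fragments.items()
    }


def elect_delegates(fragments, z):
    """Each node nominates the closest (smallest z) node among its event neighbors."""
    groups = event_neighbors(fragments).values()
    return {min(group, key=z.__getitem__) for group in groups if group}


def aggregate(fragments, members):
    """Union the fragments held by `members`; duplicates collapse in the sets."""
    merged = {}
    for node in members:
        for event_id, units in fragments[node].items():
            merged.setdefault(event_id, set()).update(units)
    return merged


# Hypothetical data: A and B share EV1, C alone holds EV2.
fragments = {"A": {"EV1": {1, 2}}, "B": {"EV1": {2, 3}}, "C": {"EV2": {1}}}
z = {"A": 1.0, "B": 2.0, "C": 3.0}
delegates = elect_delegates(fragments, z)
merged = aggregate(fragments, event_neighbors(fragments)["A"])
print(delegates, merged)
```

Repeating the election over the delegates alone would narrow the set further, in the spirit of the recursion over {A, C, G} described above.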

3.2.3. Communicating to the Network. From a system's perspective, the aggregation and deduplication can be performed in a pipeline manner, so that the performance can be further improved. More specifically, the $z_{min}$ sensor node does not need to wait for the aggregation and deduplication in time slice $t_i$ to complete before starting to receive the event fragments in time slice $t_{i+1}$.
When the storage node receives the data packet from its child sensor nodes, the events are saved per their types and fed to the level-1 address table that is periodically sent to the base station. The table comprises the storage node's key and coordinates, so that all sensor node and storage nodes' information is piggybacked to the base station. With all these metadata, the base station can efficiently retrieve the results of various network queries, both improving the performance and reducing the energy consumption.
Last but not least, we give the formal protocol for the aggregation and deduplication in the preprocessing of collected data in Algorithm 3. Lines 3-9 construct the event neighbors for all sensor nodes. Lines 10-15 recursively calculate the delegation of the entire set of event neighbors. The time complexity of the algorithm is $T(n) = O(n)$.

Fault-Tolerant and Low-Energy Transmission Paths.
In this section, we aim to find a fault-tolerant and low-power transmission path between the base station and the destination node. As discussed before, we can estimate the distance between two nodes, such as storage nodes and sensor nodes. However, the estimation approach cannot be directly applied to the base station, because the distance between the base station and other remote nodes may be out of the power range. It is also unrealistic to assume that some intermediate nodes lie perfectly on the straight line between the base station and the destination node, in which case the transmission path could be trivially found by multiple transmissions between those intermediate nodes along the straight path. To that end, we propose a series of reference nodes to help formulate an estimate of the straight line that should achieve low energy consumption and high fault tolerance.
3.3.1. Directing Queries. The base station does not store the event data. Instead, it maintains an address table to manage the storage nodes where the event information is physically persisted. More specifically, the level-1 table entries store information about the storage node, such as the keys and coordinates. The level-2 table entries are linked lists to keep events' addresses on those storage nodes. Therefore, when the base station issues the query instruction, it only needs to query the required data on the specified storage node according to the query result in the address table.
The storage node periodically sends the addresses of the newly collected events, in the form of a status table, to the base station. Such heartbeat-style synchronization is employed to keep energy consumption low (as opposed to pull- or push-based approaches). When the base station issues a query about an event, it first finds out in which storage node the event is stored by looking up the entry in the two-level address table. Then, it records the event address information and launches the query task through the GiveOrder() interface. After the query is submitted, the destination node receives it and calls ReadOrder() to retrieve the query content. Once the results are ready, ReturnData() is called to send the query results back to the base station. Here, GiveOrder() denotes the query instruction packet that the base station sends to the storage node according to the query requirements, ReadOrder() denotes the storage node reading the received query instruction packet, and ReturnData() denotes the storage node packaging the queried data and sending it back to the base station.
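The interplay between the two-level address table and the three interfaces can be sketched as follows. This is a simplified illustration rather than the system's implementation: the class names, the dictionary-based packet format, and the use of Python lists in place of linked lists are assumptions; only the roles of GiveOrder(), ReadOrder(), and ReturnData() follow the description above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class StorageNodeEntry:
    """Level-1 entry: a storage node's key and coordinates."""
    key: int
    coordinates: Tuple[float, float]
    # Level-2 entries: event type -> addresses of events on that storage node.
    event_addresses: Dict[str, List[int]] = field(default_factory=dict)


class AddressTable:
    """Kept by the base station, which stores no event data itself."""

    def __init__(self) -> None:
        self.entries: Dict[int, StorageNodeEntry] = {}

    def apply_status_table(self, key, coordinates, new_addresses) -> None:
        """Merge a periodic status table sent by a storage node."""
        entry = self.entries.setdefault(key, StorageNodeEntry(key, coordinates))
        for event_type, addresses in new_addresses.items():
            entry.event_addresses.setdefault(event_type, []).extend(addresses)

    def locate(self, event_type: str) -> List[Tuple[int, List[int]]]:
        """Return (storage node key, addresses) pairs that hold the event type."""
        return [(e.key, e.event_addresses[event_type])
                for e in self.entries.values() if event_type in e.event_addresses]


class StorageNode:
    def __init__(self, key: int) -> None:
        self.key = key
        self.events: Dict[int, str] = {}  # address -> stored event payload

    def read_order(self, order: dict) -> List[int]:
        """ReadOrder(): unpack the received query instruction packet."""
        return order["addresses"]

    def return_data(self, order: dict) -> List[str]:
        """ReturnData(): package the queried data for the base station."""
        return [self.events[address] for address in self.read_order(order)]


def give_order(table: AddressTable, nodes: Dict[int, StorageNode], event_type: str) -> List[str]:
    """GiveOrder(): consult the address table, then query only the listed nodes."""
    results: List[str] = []
    for key, addresses in table.locate(event_type):
        order = {"event_type": event_type, "addresses": addresses}
        results.extend(nodes[key].return_data(order))
    return results
```

Because the lookup happens entirely at the base station, storage nodes that hold no matching event never receive a packet in this sketch.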

Categorizing Intermediate Nodes.
Let P denote a node's transmission power, R denote its transmission radius, and $d_{S \to D}$ denote the Euclidean distance between source node S and destination node D. Then the minimum number of reference nodes $n_r$ must satisfy the following, where $r = R/2$ indicates the radius of the circle that can be reached by the next intermediate node.
To make matters more concrete, Figure 6 exemplifies how the reference nodes are selected. The solid rectangle is the range of the acceptable errors, while the dotted rectangle is the range that a node's signal may reach. There are three colors of nodes: red, green, and gray.
(i) The red nodes are the reference nodes as calculated by $n_r$; they are hypothetical and not real nodes. Nonetheless, it is possible that some real nodes happen to be located exactly at reference nodes' positions. (ii) The green nodes are those within the solid rectangle, i.e., in the range of acceptable errors. The goal here is to efficiently find a subset of these nodes to form a path along which the transmission is error free and low power. (iii) The gray nodes are outside the solid rectangle (and within the dotted rectangle). They are out of the range of acceptable errors and will not carry out any processing over the received messages, thus reducing the energy consumption of these nodes.
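The categorization above can be sketched geometrically. The sketch rests on several assumptions that are not stated in the text: nodes are points in a plane, the reference nodes are spaced evenly along the segment from S to D, both rectangles are axis-aligned with that segment and share its length, and the number of reference nodes `n_r` is supplied by the caller rather than derived here. All function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def reference_points(src: Point, dst: Point, n_r: int) -> List[Point]:
    """Place n_r hypothetical (red) reference nodes evenly between src and dst."""
    return [
        (src[0] + (dst[0] - src[0]) * k / (n_r + 1),
         src[1] + (dst[1] - src[1]) * k / (n_r + 1))
        for k in range(1, n_r + 1)
    ]


def offset_from_path(p: Point, src: Point, dst: Point) -> Tuple[float, float]:
    """Return (t, perp): fractional position along src->dst and distance from the line.

    Assumes src and dst are distinct points.
    """
    dx, dy = dst[0] - src[0], dst[1] - src[1]
    length = math.hypot(dx, dy)
    t = ((p[0] - src[0]) * dx + (p[1] - src[1]) * dy) / (length * length)
    perp = abs((p[0] - src[0]) * dy - (p[1] - src[1]) * dx) / length
    return t, perp


def classify(p: Point, src: Point, dst: Point, r_err: float, radius: float) -> str:
    """Label a real node as 'green', 'gray', or 'out_of_range'."""
    t, perp = offset_from_path(p, src, dst)
    if not 0.0 <= t <= 1.0:
        return "out_of_range"
    if perp <= r_err:
        return "green"   # inside the solid rectangle: candidate for the path
    if perp <= radius:
        return "gray"    # hears the message but does not process it
    return "out_of_range"
```

With S = (0, 0), D = (10, 0), an acceptable error radius of 2.5 m, and a communication radius of 5 m, a node at (5, 1) is green and a node at (5, 4) is gray.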

Finding the Transmission Path.
The query packet is sent out from the base station S. Without loss of generality, we assume the next intermediate node along the transmission path is within the first quadrant of the two-dimensional space. In addition, the next node must also be within the rectangle of acceptable errors. There are two cases to consider, depending on whether the two adjacent sensor nodes are under the same storage node. If so, we know that both nodes share the same x coordinate, and we need to ensure that the differences between the two nodes' y coordinates and between their z coordinates are both smaller than $\sqrt{2} \cdot r$. Otherwise, we need to make sure the two storage nodes are close enough, i.e., the difference between the two x coordinates is smaller than $\sqrt{2} \cdot r$ and the distance between the candidate node and the reference node is smaller than r.
The complete procedure for constructing the transmission path is depicted in Algorithm 4. We do not give all the details of ReturnData() in Line 13, which basically carries out the operations in Lines 3-11 in the inverse direction from D to S. When multiple candidates qualify for the next intermediate node, we pick the one with the most energy for better load balance. If multiple candidates have the same energy left, a random one will be selected. This procedure continues till the
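The next-hop rule can be sketched as below. This is not Algorithm 4: it covers only one hop, and the `Node` record, the precomputed `dist_to_reference` field, and the function names are assumptions for illustration. The qualification test follows the two cases described for adjacent nodes, and the selection follows the stated preference for the candidate with the most remaining energy, with random tie-breaking.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Node:
    node_id: int
    x: float                 # coordinate shared by all nodes under one storage node
    y: float
    z: float
    energy: float            # remaining energy in joules
    dist_to_reference: float = 0.0  # distance to the nearest reference node


def qualifies(current: Node, cand: Node, r: float) -> bool:
    """Check whether cand may follow current on the transmission path."""
    limit = math.sqrt(2) * r
    if cand.x == current.x:  # both nodes are under the same storage node
        return abs(cand.y - current.y) < limit and abs(cand.z - current.z) < limit
    # Different storage nodes: they must be close, and cand must be near the reference node.
    return abs(cand.x - current.x) < limit and cand.dist_to_reference < r


def pick_next_hop(current: Node, candidates: List[Node], r: float,
                  rng: Optional[random.Random] = None) -> Optional[Node]:
    """Pick the qualified candidate with the most energy; break ties randomly."""
    rng = rng or random.Random()
    qualified = [c for c in candidates if qualifies(current, c, r)]
    if not qualified:
        return None
    best = max(c.energy for c in qualified)
    return rng.choice([c for c in qualified if c.energy == best])
```

Returning `None` when no candidate qualifies leaves the recovery policy to the caller, since the text above does not specify one for that case.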

Experiment Setup.
The network area is set to a square of 100 m by 100 m, and the base station's communication radius is 150 m, which covers the entire network. The storage nodes' communication radius is set to 20 m, implying a total of 10-15 storage nodes. The sensor nodes are evenly distributed over the entire network area; the total number of sensors ranges between 10,000 and 100,000. Each sensor node receives its nearby events every 2 s. The time slice allocated for aggregating the events is set to 50 s. The storage nodes send the aggregated data to the base station every 5 min. The base station issues a query every 2 min. We used the OMNeT++ simulator [34] to evaluate the proposed system. Experiments were carried out on an Intel Core i5-3470 workstation with a 64-bit operating system. In all experiments, all configurations (e.g., simulation software and network parameters) were identical except for the parameters being varied. We repeated all simulations 50 times and report the average. More parameters are listed in Table 1.
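For reference, the timing and range parameters stated above can be collected in one place, together with a few quantities that follow directly from them. The class and field names are hypothetical and do not correspond to OMNeT++ configuration keys; the values are those given in the text.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ExperimentConfig:
    area_side_m: float = 100.0            # square network area
    base_station_radius_m: float = 150.0
    storage_node_radius_m: float = 20.0
    event_interval_s: float = 2.0         # a sensor node receives nearby events
    aggregation_slice_s: float = 50.0     # time slice for aggregating events
    report_interval_s: float = 300.0      # storage node -> base station
    query_interval_s: float = 120.0       # base station issues a query
    repetitions: int = 50

    def receptions_per_slice(self) -> int:
        """Event receptions per sensor node within one aggregation slice."""
        return int(self.aggregation_slice_s // self.event_interval_s)

    def slices_per_report(self) -> int:
        """Aggregation slices accumulated before each report."""
        return int(self.report_interval_s // self.aggregation_slice_s)

    def base_station_covers_area(self) -> bool:
        """True if the radius exceeds the diagonal, so any placement inside covers the square."""
        return self.base_station_radius_m >= self.area_side_m * math.sqrt(2)
```

With the defaults, each slice spans 25 receptions, each report spans 6 slices, and the 150 m radius exceeds the roughly 141.4 m diagonal of the area.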

4.2. Benchmarks. This section first evaluates the three techniques discussed above: sensor node hibernation (Section 3.1.3), data preprocessing (Section 3.2), and fault-tolerant transmission paths (Section 3.3). Then, we report the overall performance of the proposed system in terms of both response time and energy consumption. This section focuses only on benchmarking the proposed system; the next section (Section 4.3) compares the proposed system to other state-of-the-art techniques.

Hibernating Sensor Nodes.
We first investigate the hibernation of sensor nodes under a storage node's coverage. Again, we assume the sensor nodes' communication range is 5 m. The question this experiment tries to answer is: how do the sensor nodes lower their voltage (measured by the dormancy rate) in response to the changing density? In Figure 7, we show the dormancy rate when deploying various numbers of sensor nodes, ranging from 1,000 to 10,000, under a storage node's coverage. The curve keeps rising as more sensor nodes are deployed, which is expected because an increasingly larger portion of the storage node's coverage is overlapped by those sensor nodes. One important observation is that the capacity of the storage node's range saturates at some point. Specifically, in this experiment, once more than 8,000 sensor nodes are deployed, the number of working (i.e., nondormant) sensor nodes stays the same. The dormancy rate still increases because more sensor nodes are deployed, yet only 1,800 sensor nodes are in the working state. Before reaching the inflection point (i.e., 8,000), the number of working nodes increases with the number of deployed nodes. For instance, when 5,000 sensor nodes are deployed, 1,300 out of 5,000 are active, resulting in a dormancy rate of 74%.
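The dormancy rate used here is simply the fraction of deployed nodes that are not working. The short sketch below reproduces the 74% figure and shows why the rate keeps rising after saturation; the function name is hypothetical, and the post-saturation points assume the working set stays at 1,800 nodes as stated above.

```python
def dormancy_rate(deployed: int, working: int) -> float:
    """Fraction of deployed sensor nodes that are dormant."""
    return 1.0 - working / deployed


# (deployed, working) pairs taken from the discussion above.
for deployed, working in [(5_000, 1_300), (8_000, 1_800), (10_000, 1_800)]:
    print(f"{deployed:>6} deployed, {working} working -> {dormancy_rate(deployed, working):.1%} dormant")
```

Since the working set is capped at 1,800 nodes, every additional deployed node beyond the saturation point is dormant, so the rate approaches but never reaches 100%.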
We then investigate how different grid sizes impact the hibernation of sensor nodes. Intuitively, a smaller grid size counts the overlapped areas more accurately (i.e., yields lower error rates), but we do not want to make the grid as small as possible because doing so would incur computational overhead. Thus, the following experiment attempts to find a reasonably good grid size in terms of both error rate and performance overhead. Figure 8 reports how the grid sizes impact the hibernation error rate. The radii of sensor nodes' transmission are set to 3, 5, 8, and 10 m. We varied the grid size from 0.01 m all the way up to 0.5 m, as shown in the left subfigure. For small grid sizes (i.e., 0.1 m or smaller), we zoomed in on the area and reported the error rates in the right subfigure. That is, the grid areas range from 0.0001 to 0.25 m².
There are several important observations from Figure 8: (i) A sensor node's range can largely affect the error rate when the grid size is set to a relatively large value, e.g., 0.09 m² or larger in this case. For instance, when the grid size is $0.3 \times 0.3 = 0.09$ m², a sensor node of radius 3 m incurs significantly larger error rates than the other three radii. An exception is that the difference between large radii (i.e., 8 and 10 m) is not pronounced. (ii) When the grid sizes are small (e.g., $0.2 \times 0.2 = 0.04$ m² or smaller), the error rates are still proportional to the grid sizes but the sensor nodes' ranges do not make much difference. This implies that as long as the grid size is small, the sensor nodes' ranges can be determined by other metrics (e.g., energy consumption).
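The trade-off between grid size and accuracy can be illustrated with a small sketch that estimates the coverage area of a single sensor node by counting grid cells and compares the estimate with the exact disk area. It is only an illustration: the hibernation mechanism itself counts overlapped areas among neighboring nodes, and the cell-centre counting rule and the function names here are assumptions.

```python
import math


def grid_area_estimate(radius: float, cell: float) -> float:
    """Estimate a disk's area by counting grid cells whose centre lies inside it."""
    n = math.ceil(radius / cell)
    inside = 0
    for i in range(-n, n):
        for j in range(-n, n):
            cx, cy = (i + 0.5) * cell, (j + 0.5) * cell
            if cx * cx + cy * cy <= radius * radius:
                inside += 1
    return inside * cell * cell


def relative_error(radius: float, cell: float) -> float:
    """Relative deviation of the grid estimate from the exact area pi * r^2."""
    exact = math.pi * radius ** 2
    return abs(grid_area_estimate(radius, cell) - exact) / exact


def cells_visited(radius: float, cell: float) -> int:
    """Number of cells scanned, a proxy for the computation overhead."""
    return (2 * math.ceil(radius / cell)) ** 2
```

Halving the cell side quadruples the number of cells scanned, which is the computation overhead that motivates choosing a moderately small grid rather than the smallest possible one.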
In the following discussion, we pick the grid size as $0.1 \times 0.1 = 0.01$ m², which implies an error rate of 1.9% that is acceptable in most applications and, more importantly, would not incur much computation overhead in practice. An in-depth analysis of the optimal mesh size is beyond the scope of this paper and may be carried out in our future work.
As shown in Figure 9, this experiment confirms the effectiveness of data aggregation. Because sensor nodes have limited computing capacity, only simple data aggregation is applied, yet it reduces energy consumption at every scale from 500 to 1,800 working sensor nodes under a storage node's coverage. For example, at the scale of 800 nodes, energy consumption decreases from 0.055 to 0.029 J; at the scale of 1,300 nodes, it decreases from 0.102 to 0.044 J. Meanwhile, Figure 9 conveys a more important message: data preprocessing leads to more energy-efficient scalability. Specifically, the energy consumption of 1,800 nodes is still well controlled (below 0.056 J) with simple data aggregation, while it rises to 0.188 J without data aggregation. The difference between the two approaches is small at 500 nodes (i.e., coverage is less than 100%), but scales up to 240% at 1,800 nodes.

Fault Tolerance of Transmission Paths.
In order to find the optimal acceptable error radius, the system is evaluated with several performance metrics, including data transfer success rate (Figure 10), time delay (Figure 11), and energy consumption (Figure 12). In the experiment, we set the communication radius of the sensor node to 5 m. In Figure 10, we report the transmission success rates for 500-1,800 working sensor nodes under a single storage node when the acceptable error radius ranges from 1 to 5 m. As shown in Figure 10, a higher node density usually yields a higher transmission success rate, apart from a few exceptions due to network jitter.
As shown in Figure 10, the optimal radius for acceptable error is 2.5 m. Experimental results show that the transmission success rate will be suboptimal if the acceptable error radius ($r_{err}$) is too large or too small. For example, at the scale of 1,200 sensor nodes, $r_{err} = 2.5$ m has a transmission success rate of 80%, whereas $r_{err} = 1$ m and $r_{err} = 5$ m have transmission success rates of 46% and 57%, respectively. The main reasons for this result are as follows: (i) if $r_{err}$ is set too small, the transmission path between the source node and the destination node is too narrow, resulting in too few candidate nodes to form the best transmission path; (ii) if $r_{err}$ is set too large, the candidate nodes become excessively redundant (in the extreme case, the acceptable error radius equals the sensor communication radius, which is 5 m in this experiment), so that the selectable transmission frequency band becomes worthless. Moreover, with many intermediate sensor nodes involved in the transfer, energy is consumed unnecessarily. This results in a counterintuitive but explainable low transfer success rate.

System Performance. The response time of query processing is an important indicator of a network system's performance. More specifically, we are interested in the average time it takes to complete a query. Intuitively, a higher node density or a smaller acceptable error radius should be more conducive to message transmission and thus yield a shorter response time: the former allows more intermediate nodes to help relay the message, and the latter allows transmission paths to be formed more quickly.
As shown in Figure 11, the experimental results differ greatly from our predicted results: (i) when the error radius is set to the minimum value ($r_{err} = 1$ m), the best performance is not achieved; (ii) there is no monotonic relationship between system performance and node scale. More specifically, the best performance occurs at 1,200 nodes with $r_{err} = 2.5$ m. In fact, different experimental settings all show inflection points at 1,100 or 1,200 nodes. Therefore, we need to answer the following two questions: (i) Why does the minimum error tolerance not yield the best performance? (ii) Why do all error-tolerance setups achieve the optimal performance somewhere in the middle?
To answer the first question, recall that we reached a similar conclusion when discussing the success rate: the optimal error radius is 2.5 m, because it balances the width of the transmission path against the number of candidate nodes. It is therefore reasonable to believe that this conclusion also applies to the system performance. The response time of a query is positively correlated with the error rate, i.e., the complement of the success rate shown on the y-axis of Figure 10.
The second question can be explained as follows: with 1,100-1,200 nodes, the performance is optimal because the density of sensor nodes is sufficient to transmit messages efficiently. When the number of nodes continues to increase (e.g., to 1,600, 1,700, and 1,800 nodes), the effective transmission rate drops and redundant network transmissions increase. To some extent, this follows the basic rule discussed earlier: the optimal solution is usually located "somewhere in the middle" of a large parameter space. In fact, this is the third time we have seen this pattern in this work. Although the analytical exploration of theoretical optimality is beyond the scope of this paper, we plan to work in this direction in the future.

Energy Consumption.
In this section, we show the influence of different acceptable error radii at different node densities on the network energy consumption. It should be emphasized that the goal of this work is to extend the life cycle of the entire network system, rather than simply to reduce the primary energy consumption. For example, a higher density of sensor nodes may incur more energy consumption and also increase the average energy consumption per storage node, a desirable outcome despite the higher cost. That is why we chose average energy consumption as the main indicator in the later assessment. As shown in Figure 12, within the coverage of one storage node, the number of nodes scales from 500 to 1,800, while the acceptable error radius varies from 1 to 5 m. Note that the 5 m case is equivalent to having no acceptable error radius set, since the sensor node also has a communication radius of 5 m. Figure 12 clearly shows the validity of the proposed acceptable error path: in the case of no fault tolerance (i.e., $r_{err} = 5$ m), the energy consumption is much higher than in other cases. For example, at the scale of 1,200 nodes, the energy consumption with $r_{err} = 2.5$ m is only 0.069 J, while that with $r_{err} = 5$ m is 0.112 J, which is 162% of the former (i.e., about 62% higher). We also observed a positive correlation between the energy consumption per node and the number of nodes in the network, which is somewhat counterintuitive: increasing the number of nodes does not reduce the energy consumption per node; in fact, it continuously damages the sustainability of the network. The results show that a higher sensor node density leads to lower network sustainability. Therefore, it is important to configure the network with an appropriate node density, provided that other indicators (such as performance) can be met.
Finally, for small error radii (i.e., $r_{err} \in \{1, 2, 2.5\}$ m), the node energy consumption is less sensitive to the choice of radius than the metrics in the previous experiments, indicating that users can choose among these values more flexibly to optimize the other indicators.

Comparisons to Approximate Algorithm. This section compares the proposed system, CRT energy-balanced event query (CRTEBEQ), with multiple approximate techniques, namely an event-based data delivery model using a multipath routing scheme (EDMR) [35] and hierarchical routing and data aggregating for WSNs (HRDA) [36]. The key idea behind EDMR is a sink-initiated route discovery process in which the location information of the source nodes is already known to the sink nodes; it also considers communication link costs before making packet-forwarding decisions. In HRDA, on the other hand, the network is clustered, and some nodes are selected as cluster heads. After a rendezvous region is constructed and backbone nodes are selected, a tree is formed among the backbone-tree nodes, and the aggregated data are sent to the sink through backbone-tree nodes and cluster heads. Both EDMR and HRDA assume the network is constructed through conventional routing rather than by encoding the nodes. The remainder of this section compares CRTEBEQ against EDMR and HRDA from three perspectives: query time, energy consumption, and the number of queries before the first node dies, all at varying scales from 500 to 1,800 working nodes under a storage node's coverage. All configurations (e.g., sensor specifications, message lengths, simulation software, network configuration) are identical across these three systems.
Figure 13 shows the difference in the average query time between CRTEBEQ, EDMR, and HRDA. Experimental results show that the average query time of CRTEBEQ is significantly less than that of EDMR and HRDA at all scales. At the minimum scale of 500 nodes, CRTEBEQ is 19.8% and 24% faster than HRDA and EDMR, respectively. At a medium scale, such as 890 nodes, CRTEBEQ takes an average of 2.32 s to complete a query task, while HRDA and EDMR take an average of 2.79 and 2.96 s, respectively, corresponding to efficiency improvements of 16.8% and 21.6%.
These significant advantages of CRTEBEQ are attributed to the more efficient CRT-based addressing scheme and the usage of fault-tolerant and low-energy transmission paths to transmit data.
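The improvement figures quoted for the 890-node scale are relative reductions with respect to each baseline. A minimal check, using only the average query times reported above and a hypothetical helper name, is:

```python
def relative_reduction(baseline: float, proposed: float) -> float:
    """Fraction by which `proposed` is smaller than `baseline`."""
    return (baseline - proposed) / baseline


# Average query times at 890 nodes, in seconds, as reported above.
crtebeq, hrda, edmr = 2.32, 2.79, 2.96
for name, baseline in (("HRDA", hrda), ("EDMR", edmr)):
    print(f"vs {name}: {relative_reduction(baseline, crtebeq):.1%} less time per query")
```

This yields 16.8% against HRDA and 21.6% against EDMR, matching the reported improvements.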

Query Time.
The experimental results also show that, in these three systems, it is not entirely true that a higher density of sensor nodes is more helpful to queries. If the sensor nodes are too redundant, the query efficiency decreases, and excessive energy is consumed by forwarding redundant messages. Therefore, in all three systems, deploying the "right" number of sensor nodes can not only save costs, but also improve network efficiency. Figure 14 shows the average energy consumption of a single query under different sensor node densities in these three network systems. The experimental results show that: (i) the overall average energy consumption of CRTEBEQ is significantly lower than that of EDMR and HRDA at each sensor node density; (ii) the more densely sensor nodes are deployed, the more energy is consumed and the lower the effective working rate is.

Energy Consumption.
(i) Regarding the first observation, the three systems have similar trends, but CRTEBEQ performs better than EDMR and HRDA. For example, at a sensor node density of 1,020, CRTEBEQ consumes 0.93 J, while EDMR consumes 1.12 J and HRDA consumes 1.04 J. This means that CRTEBEQ achieves a 16.9% and 10.6% reduction in energy consumption compared to EDMR and HRDA, respectively. (ii) Regarding the second observation, Figure 14 shows that the average energy consumption for completing a query increases with sensor node density. Consequently, more query tasks are expected to result in a lower effective work rate and higher energy consumption.

Fault Tolerance.
In this section, we evaluate the fault tolerance of the three systems by counting the number of queries completed before the death of the first sensor node in these networks. In other words, once a sensor node dies in the network, we consider that the network is beginning to decay. It should be clear that "decay" does not mean that the network is unavailable, but that the hop provided by the dead node is no longer available; it can be replaced by a neighbor node, and query messages can be transmitted via another path. As shown in Figure 15, the purpose of this experiment is to evaluate how long the network can work without any decay at all. Moreover, Figure 15 exhibits an inflection point, which warrants further discussion.
(i) First, CRTEBEQ still outperforms EDMR and HRDA at all scales, although the advantages are not as significant as those in query time and energy consumption. For example, with 1,020 nodes, CRTEBEQ was able to complete approximately 216 queries before the first dead node appeared in the network, while EDMR and HRDA were able to complete approximately 185 and 192 queries, an improvement of 16.8% and 12.5%, respectively. This indicates that EDMR and HRDA are relatively stronger in fault tolerance than in query performance and power consumption. (ii) Second, there is an inflection point for all three network systems, after which the system performance decreases as more sensor nodes are deployed. Again, we believe that this phenomenon is caused by overly dense deployment of sensor nodes. More specifically, when the same amount of data needs to be transmitted, more sensor nodes join the transmission, and even if optimizations (such as aggregation) are performed, the effective transmission rate of the network decreases. This is similar to our previous observation on time delay (Figure 11), and offers an important insight: in real-world networks, finding the appropriate node density has a great impact on the overall network performance. (iii) Third, the inflection points of all three systems occur between 900 and 1,200 nodes (CRTEBEQ: 1,150, HRDA: 1,020, and EDMR: 1,020). This shows that the choice of optimal node density depends not only on the network configuration but also on other factors, and it is unlikely that the optimal solution can be found precisely through the network model and algorithm alone. A theoretical analysis of this problem is useful, but it is beyond the scope of this article and will be discussed in our future work.
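The metric itself, the number of queries completed before the first node dies, can be illustrated with a toy model. The model is a deliberate simplification and an assumption throughout: every node starts with the same energy, each query is relayed by a fixed number of nodes at a fixed cost per hop, and a node is considered dead once its energy reaches zero. It does not reproduce the simulation behind Figure 15; it only shows how relaying through the nodes with the most remaining energy postpones the first death.

```python
def queries_before_first_death(n_nodes: int, initial_energy: float,
                               hops_per_query: int, cost_per_hop: float,
                               balanced: bool = True) -> int:
    """Count queries completed before any node's energy is exhausted."""
    energy = [initial_energy] * n_nodes
    completed = 0
    while True:
        if balanced:
            # Prefer the nodes with the most remaining energy (load balancing).
            order = sorted(range(n_nodes), key=lambda i: energy[i], reverse=True)
        else:
            # Always reuse the same relays.
            order = list(range(n_nodes))
        for i in order[:hops_per_query]:
            energy[i] -= cost_per_hop
        if min(energy) <= 0:
            return completed  # the query that exhausts a node is not counted
        completed += 1


print(queries_before_first_death(4, 1.0, 2, 0.25, balanced=True))   # 6
print(queries_before_first_death(4, 1.0, 2, 0.25, balanced=False))  # 3
```

In this toy setting, spreading the relay load doubles the number of queries completed before the first node dies.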

Conclusion and Future Work
This paper proposes a stateless and parallel approach to achieve high-performance and energy-efficient queries over WSNs.

Wireless Communications and Mobile Computing
The system is built from the ground up with a novel topology and new addressing protocols based on the CRT. Various optimizations are applied to exploit the hibernating nodes and improve the efficiency of data collection and data transmission for network queries. Taken together, these features enable our system to significantly outperform the state-of-the-art in terms of performance, energy consumption, and reliability.
Our future work is twofold. First, we plan to explore how to further conserve the energy consumption of sensor nodes in the context of CRT-encoded topologies and protocols. Specifically, we believe there is still room to improve the energy efficiency if we could somehow turn those nondelegation nodes (i.e., not selected as $z_{\min}$ in Algorithm 3) to an idle mode in a more proactive manner. Currently, we employ a recursive algorithm to filter out unqualified candidates; we will investigate whether an analytical solution exists to this problem. Second, we will investigate how to apply the proposed techniques to other networks or paradigms, particularly, in the context of edge computing [37,38]. In edge computing, a network's terminal points take the responsibility of processing data collected on the sensor nodes. This design implies that the sensor nodes, in addition to collecting and transmitting data, perform computing tasks that could incur substantial power consumption. We plan to extend this work (e.g., a more general model that takes computing into account) to more paradigms such as edge computing.

Data Availability
The data used to support the findings of this study are included within the article.

Disclosure
An earlier version of this paper was presented at the 47th International Conference on Parallel Processing (ICPP 2018).