Intelligent Water Drops Algorithm-Based Aggregation in Heterogeneous Wireless Sensor Network

This paper provides a novel implementation of the intelligent water drops (IWD) method for resolving data aggregation issues in heterogeneous wireless sensor networks (WSN). When the aggregating node is utilized to transmit the data to the base station, the research attempts to show that the traffic situations of WSN may be modified appropriately by parameter tuning and algorithm modification. IWD is used to generate an optimum data aggregation tree in WSN as one of its applications. IWD assumes that all nodes in the environment are identical, resulting in identical parameter updates for all nodes. In practical scenarios, however, diverse nodes with varying initial energy, communication range, and sensing range characteristics are deployed. In order to replicate the influence of heterogeneity in the environment, improved IWD (IIWD) is offered as an enhancement to the original IWD. The suggested enhancement is appropriate for scenarios in which the aggregation node is utilized to transmit data to the base station in heterogeneous configurations. In terms of residual energy, dead nodes, payload, and network lifespan, a series of simulation results demonstrates that the proposed IIWD significantly improves the accuracy and effectiveness of the IWD method.


Introduction
The study of nature to model the solution of practical problems with a computer is gaining great popularity as a result of its multiple applications for tackling optimization-related issues. Several such algorithms can be explored, and intelligent water drops (IWD) is one of these nature-inspired algorithms that has recently been implemented. The IWD algorithm takes into account the dynamics followed by water droplets as they route their pathways to the lake or ocean through rivers. The algorithm models the process occurring between river water droplets and riverbed dirt. The IWD method, proposed by Hosseini [1], has been successfully used to tackle several optimization-related issues and provides benefits such as an active feedback mechanism and a high degree of resilience [2]. IWDs are modelled on natural water droplets and collaborate to discover the best solution to a given problem. The IWD method can be used to solve problems involving maximization or minimization [3,4]. Solutions are constructed incrementally by the IWD algorithm; as a result, IWD is a population-based constructive algorithm. In the IWD algorithm, each IWD has two basic attributes: dirt and velocity. Both of these attributes may change over the lifetime of an IWD [5,6]. An IWD flows from a source to a destination. The IWD begins its trip with an initial velocity and zero dirt. During its voyage, it travels through the environment, gaining velocity and removing some dirt [7,8].
The method employs several iterations in which water droplets attempt to uncover the optimum path from source to destination on a bed of environment particles. In order to do this, the node(s) create control packets that travel to the destination. These packets, known as IWDs, have two primary characteristics: velocity and dirt. The environment consists mostly of the dirt of the environment bed.
The movement of an IWD through the environment is governed by the following principles:
(i) The velocity of an IWD decreases near a high soil bed and vice versa
(ii) A high-velocity IWD accumulates more soil than a low-velocity IWD
(iii) The soil in the environment is eroded more by a high-velocity IWD than by a low-velocity IWD
Further, IWD has been used to create an optimal aggregation tree and has been shown to achieve energy efficiency in homogeneous WSN, although little critical attention has been paid to heterogeneity [9]. Furthermore, there are several applications of WSN in which the deployed nodes possess different characteristics with regard to communication range, sensing range, battery, and sensing services. A wireless sensor network can be called dynamic if it is able to handle the following two atomic operations: node-move-in and node-move-out, which, respectively, refer to nodes entering an existing network and nodes leaving it. The primary IWD algorithm has proved to be efficient in discovering the optimal path in the sensor network. However, it does not incorporate the effect of heterogeneity in the network, which is introduced by the distinctive characteristics of the nodes.
The heterogeneity of the node's features offers applications with flexibility and improves network operations within predetermined cost restrictions. For instance, if a network is installed with fewer high-energy nodes, the network may survive for a longer period of time, but the sensing range will be constrained. In contrast, deploying a network with a greater number of low-energy nodes would reduce the network's lifespan while increasing its sensing range. However, a mixture of nodes with varying energy levels may achieve a perfect balance between network longevity and sensing range [10,11]. Experimentation on energy-based heterogeneous node deployment demonstrates a substantial increase in sensing performance. In addition, the cost analysis of hybrid sensor networks verifies these networks' cost effectiveness [10].
In this paper, we suggest using the IWD technique to construct an efficient data aggregation tree in heterogeneous WSN. Heterogeneity in WSN is determined by the initial energy of the network's nodes. Thus, the words low nodes and high nodes are used to categorize nodes based on energy. Low nodes indicate low-energy sensor nodes, while high nodes indicate high-energy sensor nodes.

Problem Statement
The WSN is modelled as a join of two graphs, $G_1 = (V_n, E_n)$ and $G_2 = (V_a, E_a)$. Here, $V_n$ represents all the sensing nodes, comprising the low nodes $V_l$ and the high nodes $V_h$. Also, $E_n = \bigcup e_{ij}$, where $e_{ij}$ is the edge connecting node $i$ to node $j$; this edge exists if the distance between node $i$ and the neighboring node $j$ is less than the predefined communication range $R$ of node $i$. In $G_2$, $V_a$ represents the set of aggregation nodes and $E_a$ represents the edges connecting the aggregation nodes to the base station (BS). The problem can be stated as finding a subset $G_{opt} \subset G_1$, where $G_{opt} = (V_{opt}, E_{opt})$, $V_{opt} \subset V_n$ comprises the optimal number of low-energy and high-energy sensing nodes, and $E_{opt} \subset E_n$, such that the hop count from the sensing nodes to the aggregation nodes is minimized. The problem can be solved as a constrained optimization problem having the nonlinear programming form: minimize $f(X)$, where $X = \{x_1, x_2, x_3, \ldots, x_n\}$, subject to $R$, where $R$ is the cost function, under a constraint on the hop counts. Here, $h_{hs}$ denotes the hop count from low-energy nodes to high-energy nodes and $h_{sa}$ represents the hop count from high-energy nodes to the aggregation nodes.
Assumption: All the aggregation nodes, V a , are connected directly to the BS.
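The edge rule in the problem statement can be illustrated with a short sketch. The Python below is a minimal illustration and not the authors' implementation: the names `Node` and `build_edges`, as well as the sample coordinates and ranges, are assumptions introduced here. It adds a directed edge $e_{ij}$ whenever the Euclidean distance from node $i$ to node $j$ is less than the communication range of node $i$.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical node record; the field names are illustrative."""
    node_id: int
    x: float
    y: float
    comm_range: float  # communication range R of this node, in meters
    kind: str          # "low", "high", or "agg"


def build_edges(nodes):
    """Return the set of directed edges (i, j) with dist(i, j) < R_i."""
    edges = set()
    for a in nodes:
        for b in nodes:
            if a.node_id == b.node_id:
                continue
            if math.hypot(a.x - b.x, a.y - b.y) < a.comm_range:
                edges.add((a.node_id, b.node_id))
    return edges


# Three collinear nodes with increasing (assumed) communication ranges.
nodes = [
    Node(0, 0.0, 0.0, 10.0, "low"),
    Node(1, 8.0, 0.0, 20.0, "high"),
    Node(2, 25.0, 0.0, 30.0, "agg"),
]
edges = build_edges(nodes)
```

Because the ranges differ per node, the resulting edge set may be asymmetric: in this toy layout node 2 can reach node 0, but node 0 cannot reach node 2.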

Related Work
This section depicts the key features of the current routing protocols in WSN. Because WSNs differ from other networks such as MANETs (mobile ad hoc networks) and mobile networks, routing is particularly complex. The main task of WSN is sensing, collecting, and delivering the information for further processing. Many methods have been developed in this area [12,13]. Furthermore, according to a complex network theory, predicting connection quality in wireless sensor networks is akin to predicting link quality in social networks. The routing problem of a network determines the quickest route (also known as the optimum path) between the transmitter and the destination [14,15]. Signal strength changes a lot on mobile and ad hoc networks, causing a lot of route failures and lowering performance. Several ideas have been proposed to estimate the signal strength-based link availability projection for best routing [16,17]. This link information may be utilized to calculate the connection breakdown time and, as a result, either repair the existing route or locate a new one for the packets. The reduction of packet losses and end-to-end latency improves performance [18,19]. Because of specific WSN characteristics, the routing algorithms in WSN vary from conventional networks in numerous ways.

Journal of Sensors

WSN sensor nodes are energy restricted and cannot be recharged owing to their special application requirements. Furthermore, the major uses of WSN are to detect data, analyze it, and then broadcast it to the BS. As a result, routing towards the BS is a critical job, and multiple algorithms have been presented [20], each aimed at distinct circumstances or situations. In terms of infrastructure building and maintenance approach, the routing algorithms in WSN may be divided into three broad classes.
Flooding and gossiping [21] do not need to retain topological information in advance and build routing paths after network setup or the commencement of network activity. The flooding-based routing algorithm transmits the observed data to all adjacent nodes and continues this process until the data reaches the base station. In contrast, routing systems based on gossiping randomly choose a limited number of neighbors and relay messages to them until the BS is reached. In contrast to the flooding technique, the gossiping approach reduces the quantity of data packets transferred across the network but creates the data packet implosion problem, which incurs additional expenses for WSN.
Proactive routing algorithms design and maintain the routing architecture continually, independent of network activities. The BS establishes the paths to all sensor nodes and transmits them to all network nodes. During network operations, the sensor nodes retain this information and use it to route data packets over these paths. In MANET, the proactive DSDV [22] protocol is used, but for WSN, a variety of tree-based methods [23] are offered (e.g., one-phase pull diffusion [24]). The intelligent interaction of wireless sensor networks (WSN) and mobile ad hoc networks (MANET) with the Internet of Things increases its user appeal and commercial viability [25,26].
By merging wireless sensor and mobile networks with the Internet of Things, it is possible to develop new MANET-IoT devices and IT-based networks. This technology facilitates user mobility while decreasing network implementation expenses [27,28]. One of the fundamental principles of Internet of Things systems is the networking of intelligent objects and their compliance with communications technology. Wireless sensor networks (WSN), whose characteristics include sensing, data collecting, heterogeneous connectivity, and data processing, play a major role in the Internet of Things (IoT) system [29,30]. In proactive routing, paths are continuously maintained during network operations, although at a great cost of resources. Several evolutions of the classical methods, such as BVR [31], VRR [32], and S4 [33], enhance the classical approaches in terms of reduced resource utilization, a quality required for realistically scaled WSN.
Reactive routing algorithms generate routing pathways as necessary. The routing architecture is constructed on demand by the sensing nodes that must convey data to the base station, rather than in advance. In MANETs, the most used reactive routing method is AODV [34], whereas in WSN, push diffusion [35], FRA [36], and LRDE [37] are the most prevalent. These techniques save resources during times of inactivity but incur the cost of identifying pathways for each originating node.
The next significant category of routing algorithms contains hybrid algorithms, which integrate both reactive and proactive network behaviors based on network circumstances. Several hybrid routing techniques for MANETs now exist. Zone routing protocol (ZRP) [38] is the first hybrid method utilized in MANETs. The ZRP protocol splits the network into zones; inside these zones, routes are decided proactively, while outside of these zones, routes are established reactively. ZRP has the benefit of lower routing overhead. However, zones are determined statically in ZRP. Therefore, the SHARP protocol [39] presented an enhancement based on the dynamic generation of zones. The zones are only generated around nodes that receive a considerable amount of incoming data, which decreases routing overhead along with jitter and loss rate. However, in the context of WSN, a routing strategy that incorporates a hybrid adaptive solution has not yet been extensively deployed. In addition, MANET routing techniques are inapplicable to WSN owing to its fundamental properties. Figure 1 [40] depicts a thorough overview of routing methods in WSN based on node heterogeneity. Significant benefits of adopting energy-based node heterogeneity in WSN include increased throughput and decreased latency. Moreover, heterogeneity reduces the hop count between the sensor nodes and the sink; hence, the delivery rate in heterogeneous WSN is greater than in homogeneous WSN.
Broadly, there are three primary types of heterogeneity, namely, energy, computational, and link heterogeneity. Energy heterogeneity focuses on nodes' diverse battery power: higher-end nodes get more energy. In computational heterogeneity, a few nodes have higher computing capacity than others; complex data processing and memory-intensive processes need such powerful nodes. Link heterogeneity focuses on link bandwidth between nodes: long-distance nodes are provided with a high-bandwidth transmission connection to ensure reliable data transfer. Most WSNs employ energy heterogeneity since it uses the least resources. Computational and link heterogeneities hinder WSN without energy heterogeneity. Figure 2 illustrates the types of heterogeneity.
Energy heterogeneity is split into three types based on node power levels: two-level, three-level, and multilevel. Two-level defines regular and advanced nodes. Normal, advanced, and super nodes are specified in three-level networks. Multilevel randomizes energy distribution in nodes. Recent routing techniques have been developed to improve WSN performance [41,42]. Cluster-based and tree-based routing protocols are the main types.

LEACH is a clustering routing technique that forms clusters and elects cluster leaders to communicate with the BS [43-45]. LEACH does not consider residual energy while choosing a cluster head [20,46,47]; hence, it performs poorly in heterogeneous environments [48]. As a result, the stable election protocol (SEP), a clustering routing protocol, was devised, in which cluster heads are chosen using a weighted probability [49]. SEP performed well in two-level heterogeneous networks, but multilevel heterogeneous WSNs could not use the routing protocol properly. This led to the DEEC algorithm for multilevel heterogeneous networks [50]. In DEEC, cluster heads are chosen based on the average network energy and the sensor node energy. Other clustering-based routing protocols include EDFCM [51], an enhancement of DEEC, REP [52], and EEPCA [53].
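As a brief illustration of what a weighted election probability can look like, the sketch below uses the form in which SEP is commonly presented: with a fraction $m$ of advanced nodes carrying $\alpha$ times more energy than normal nodes, the normal and advanced probabilities are $p_{opt}/(1+\alpha m)$ and $p_{opt}(1+\alpha)/(1+\alpha m)$. This is assumed here for illustration and is not taken from this paper; the function and parameter names are hypothetical.

```python
def sep_probabilities(p_opt, m, alpha):
    """Weighted cluster-head election probabilities (assumed SEP form).

    p_opt: target fraction of cluster heads per round
    m:     fraction of advanced (high-energy) nodes
    alpha: extra energy factor of advanced nodes (E_adv = (1 + alpha) * E_nrm)
    Returns (p_normal, p_advanced).
    """
    denom = 1.0 + alpha * m
    return p_opt / denom, p_opt * (1.0 + alpha) / denom


p_nrm, p_adv = sep_probabilities(p_opt=0.1, m=0.2, alpha=1.0)

# The population-weighted average of the two probabilities equals p_opt,
# so the expected number of cluster heads per round is unchanged.
average = (1.0 - 0.2) * p_nrm + 0.2 * p_adv
```

The design point is that nodes with more energy are elected more often, while the expected number of cluster heads per round stays the same.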
The second type of routing protocol is tree-based, wherein nodes are organized as trees and the root node aggregates the data and transmits it to the BS. Tree-based techniques suit applications with aggregation needs [54], such as forest fire, industrial, event, health, and other monitoring systems.
Data aggregation in tree-based protocols is optimized for energy efficiency. Finding an optimum aggregation tree is NP-hard [55], similar to the Steiner tree and weighted set cover problems [56]. DD [57] identifies the quickest routing channels to transport data packets throughout the network and opportunistically aggregates them. However, DD is not considered efficient since the aggregation nodes are chosen randomly and may be distant from the source nodes. GIT is an approximation approach suggested to build an energy-efficient route in an ideal aggregation tree [58]. Krishnamacharya et al. [59] also demonstrated the advantages of data aggregation. Liao et al. [60] devised an ant colony optimization (ACO) method, which simulates ant foraging behavior. These ants drop pheromone to designate a trail for the colony to follow. The ant colony method is used by Schurgers et al. [61] to aggregate data. Wu et al. [62] improved the chance of locating aggregation nodes in WSN using ACO by widening the search area surrounding routing pathways. These works outperform standard approaches in energy conservation.

System Model
The system model used here takes into account the random distribution of stationary sensor nodes in a monitoring region. There are three distinct kinds and configurations of nodes: sensor nodes, aggregator nodes, and the base station (BS). The sensing node collects data and transmits it at regular intervals to the aggregation node. Sensing nodes are of two types: low nodes (representing low-energy sensing nodes) and high nodes (representing high-energy sensing nodes). Figure 3 depicts the aggregation tree for data. The aggregator node aggregates data and transmits it to the BS. The BS should be installed outside of the network, and it should transfer the processed data to the control center.
The heterogeneous WSN thus categorizes nodes as low nodes ($N_l$), high nodes ($N_h$), and aggregator nodes ($N_a$). The high nodes contain ten times as much energy as the low nodes. The communication model used in this study is the first-order radio model, and the sensing model is the deterministic sensing model [63]. This model implies that each node participates in the sensing process. The detected data is compared to a threshold whose value is predetermined. If the detected data exceeds the defined threshold, the data is sent to the next node. Consequently, the sensing coverage is the total of the sensing coverage of all network nodes. The node types differ in their predefined communication and sensing ranges. The communication and sensing ranges of $N_h$, $N_l$, and $N_a$ are abbreviated as $R_{ch}$, $R_{sh}$; $R_{cl}$, $R_{sl}$; and $R_{ca}$, $R_{sa}$, respectively. In addition, the ranges are ordered as $R_{ca} > R_{ch} > R_{cl}$ and $R_{sa} > R_{sh} > R_{sl}$.
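The communication and sensing models can be made concrete with a small sketch. It assumes the first-order radio model in its commonly quoted form, $E_{tx}(k, d) = E_{elec} k + \varepsilon_{amp} k d^2$ and $E_{rx}(k) = E_{elec} k$. The constants below are typical illustrative values rather than the parameters of this paper, and all function names are hypothetical.

```python
E_ELEC = 50e-9     # J/bit, assumed electronics energy (illustrative)
EPS_AMP = 100e-12  # J/bit/m^2, assumed amplifier energy (illustrative)


def tx_energy(bits, distance_m):
    """Energy to transmit `bits` over `distance_m` (first-order radio model)."""
    return E_ELEC * bits + EPS_AMP * bits * distance_m ** 2


def rx_energy(bits):
    """Energy to receive `bits`."""
    return E_ELEC * bits


def should_forward(reading, threshold):
    """Deterministic sensing: forward only if the reading exceeds the threshold."""
    return reading > threshold


LOW_NODE_ENERGY = 0.5                    # J, assumed initial energy of a low node
HIGH_NODE_ENERGY = 10 * LOW_NODE_ENERGY  # high nodes hold ten times as much

# Cost of one 2000-bit hop over 10 m: sender plus receiver.
hop_cost = tx_energy(2000, 10.0) + rx_energy(2000)
```

With these assumed constants a single 2000-bit hop over 10 m costs about 0.22 mJ in total, which gives a sense of how many forwarding operations a low node can afford compared with a high node.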

Preliminaries
The primary IWD algorithm is based on the observation that water drops always find the shortest route towards a lake or ocean. Despite encountering obstacles and constraints, water drops find an optimal path through twists and turns. The environment is affected as the water drops move from one place to another, and the environment in turn alters the nature of the water drops; both tend to influence each other. The environment here refers to the soil bed of the river. When the drops move fast, they tend to remove more soil from the soil bed than when they are slow. Drops that are trying to find an optimal path are called intelligent water drops (IWD). The three essential parameters that define the path taken by the water drops are the velocity ($Velocity^{IWD}$), the soil carried ($Soil^{IWD}$), and the soil of the river bed ($Soil^{Edge}$). These parameters change as the data packet moves from the source node to the destination.
The velocity is updated by an increment $\Delta Vel^{IWD}$, which depends on $soil(i, j)$ and on the application-dependent constants $a_v$, $b_v$, and $c_v$. Here, $soil(i, j)$ is the soil on the bed of the edge between node $i$ and node $j$. During the initialization phase, each edge is assigned an equal amount of soil. The decrease in the soil of an edge is calculated using the application-dependent constants $a_s$, $b_s$, and $c_s$, which specify the relationship between the weight of the edges and the time that a data packet takes to move from node $i$ to node $j$. The time taken is given in terms of $HUD(i, j)$, the heuristic function defined for an application for calculating the hop counts on the path.
In this heuristic, $R_j$ represents the routing nodes in the neighborhood of node $j$, and $h_{kj}$ and $h_{jd}$ represent the hop counts from source node $s$ to node $j$ and from node $j$ to destination node $d$ or the BS, respectively.
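For orientation, the updates described in this section are commonly written as follows in the original IWD formulation of [1]. This is offered as a sketch under that assumption: the exponents and the hop-count form of $HUD(i, j)$ used in this paper may differ.

```latex
\begin{align*}
\Delta Vel^{IWD} &= \frac{a_v}{b_v + c_v \cdot soil^{2}(i,j)} \\
Vel^{IWD}(t+1) &= Vel^{IWD}(t) + \Delta Vel^{IWD} \\
\Delta soil(i,j) &= \frac{a_s}{b_s + c_s \cdot time^{2}\!\left(i,j;\, Vel^{IWD}\right)} \\
time\!\left(i,j;\, Vel^{IWD}\right) &= \frac{HUD(i,j)}{Vel^{IWD}}
\end{align*}
```

Under this form, an edge with little soil yields a larger velocity gain, and a faster drop spends less time on an edge and therefore removes more soil from it, which agrees with principles (i) to (iii) stated in the Introduction.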

IWD Algorithm for Heterogeneous Network
In homogeneous environments, the IWD method may be used to produce an optimum data aggregation tree solution. Sensing nodes with data produce IWDs to search for pathways linking to the base station or the closest aggregator node. These IWDs produce an aggregation tree by generating pathways using the approach described in Section 2. Here, it is proposed that the low node will locate a path to the high node, which will then transfer the data to the base station or aggregator nodes. As a result, the energy consumption of low nodes will be decreased, as they will only be required to find a path to the closest high nodes. Nonetheless, there are instances in which the route constructed by an IWD lacks a connection point and hence cannot discover other nodes visited by other IWDs. In such a case, the chance of constructing an ideal tree will be diminished. To update the soil in this situation, the IWD packet is transmitted to the neighbors of all nodes. When a high node receives an IWD packet, it broadcasts an updated soil packet to its neighbors. Each neighboring node $u$ updates $soil(u, v)$ based on the information received. The packet format of IWD is shown in Table 1.
In the table, the type of packet determines whether it is a data packet or a control packet. Source ID is the ID of the source node generating the IWD. Next hop ID is the next neighboring node. IWD soil and IWD velocity are the updated parameters on the path for a particular IWD. However, there exists a condition in which an IWD cannot find any nodes that have been visited by other IWDs. The algorithm for the heterogeneous network proceeds as follows. Initially, all the aggregation nodes store the identity of the BS and broadcast their ID(s) to the network. Each sensing node stores the aggregation ID along with the information of its next-hop neighbors and the soil value of all the paths. Initially, all the paths are assigned equal values. Later, these values are updated as the IWD traverses each edge. The edge connecting a node to the BS is assigned a lower soil value so that an additional gain in velocity can be achieved. Additionally, a lower soil value of an edge represents a smaller hop count, thus attracting more IWDs to this path. Because of this proposed modification, the probability of an IWD reaching an aggregation node is higher. When a high node receives a soil update packet, it sends an update message to all its neighboring nodes; following this approach, the IWD reaches the aggregation nodes faster. In a way, a high node increases the velocity of the IWD. Thus, the velocity parameters of the neighboring nodes of the high nodes are evaluated from Equation (7). Additionally, the soil of the edge between the high node and the aggregation node is updated according to Equation (8), whose hop-count term is defined from node $i$ to the aggregation node $a$.
Here, $\rho_n$ is the local soil updating factor for the path connecting to the aggregation node. By updating these values repeatedly, the probability that an IWD reaches the aggregation nodes becomes higher, and hence the IWD reaches the aggregation node faster, reducing the delay. By enhancing the probability of all the neighboring nodes, the soil of the path declines notably, thus making the IWD reach the destination faster. The algorithm starts with the initialization of the static parameters $a_s$, $b_s$, $c_s$ and $a_v$, $b_v$, $c_v$ and then follows the steps below over several iterations. The sensing node first generates a control packet named IWD with initial values of velocity and soil, viz. Initvel and Initsoil.
(1) A neighboring node is selected randomly according to probability values that are inversely proportional to the soil of the edges. In this calculation, $vc(IWD)$ is the subset of nodes that the IWD should not visit in order to satisfy the application constraints
(2) If the next hop node is a high node, the velocity is updated from Equation (7); otherwise, it is updated from Equation (2)
(3) Similarly, the soil of the edge for a high node is updated from Equation (8); otherwise, it is updated from Equation (4)
(4) The process continues until the state of complete termination is reached, which is when the IWD reaches either an aggregation node or the BS
The IWD algorithm described above can build a data aggregation tree with a minimum hop count in a heterogeneous setup. Once an aggregation node is found, the steps above update the amount of soil in its neighborhood and hence increase the likelihood of selecting the best aggregation node, thereby enhancing the chances of an IWD moving through this aggregation node whenever it reaches its neighborhood in the next round.
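The steps above can be sketched as a single IWD walk. The Python below is a minimal illustration under stated assumptions, not the authors' implementation: it uses the standard IWD update forms, models the high-node variants (Equations (7) and (8), which are not reproduced in the text) with a hypothetical multiplicative factor `HIGH_BOOST`, uses the hop count to the nearest aggregator as the heuristic, and picks all constants arbitrarily.

```python
import random

# Illustrative constants (assumed; not the values of the paper's Table 3).
A_V, B_V, C_V = 1.0, 0.01, 1.0
A_S, B_S, C_S = 1.0, 0.01, 1.0
RHO_N = 0.9       # local soil updating factor
EPS = 0.01        # keeps the selection weight finite when soil is zero
HIGH_BOOST = 2.0  # hypothetical stand-in for Equations (7) and (8)


def choose_next(node, neighbors, soil, visited, rng):
    """Step (1): pick a neighbor with weight inversely related to edge soil.

    `visited` plays the role of vc(IWD), the nodes the IWD must not visit.
    """
    options = [n for n in neighbors[node] if n not in visited]
    if not options:
        return None
    lowest = min(soil[(node, n)] for n in options)
    shift = min(lowest, 0.0)  # shift so that the soil term is non-negative
    weights = [1.0 / (EPS + soil[(node, n)] - shift) for n in options]
    return rng.choices(options, weights=weights, k=1)[0]


def walk(source, neighbors, soil, kind, hops_to_agg, rng,
         init_vel=4.0, init_soil=0.0):
    """Move one IWD from `source` until it reaches an aggregator or the BS."""
    node, vel, carried = source, init_vel, init_soil
    path, visited = [source], {source}
    while kind[node] not in ("agg", "bs"):
        nxt = choose_next(node, neighbors, soil, visited, rng)
        if nxt is None:  # dead end: no admissible neighbor
            return path, vel, carried, False
        edge = (node, nxt)
        boost = HIGH_BOOST if kind[nxt] == "high" else 1.0
        # Step (2): velocity update; low-soil edges give a larger gain.
        vel += boost * A_V / (B_V + C_V * soil[edge] ** 2)
        # Time on the edge, with hop count as the assumed heuristic.
        time_on_edge = max(hops_to_agg[nxt], 1) / vel
        # Step (3): soil removed from the edge and carried by the drop.
        delta = boost * A_S / (B_S + C_S * time_on_edge ** 2)
        soil[edge] = (1.0 - RHO_N) * soil[edge] - RHO_N * delta
        carried += delta
        node = nxt
        path.append(nxt)
        visited.add(nxt)
    # Step (4): terminated at an aggregation node or the BS.
    return path, vel, carried, True


# Toy topology: low node 0 reaches aggregator 3 via low node 1 or high node 2.
neighbors = {0: [1, 2], 1: [3], 2: [3], 3: []}
kind = {0: "low", 1: "low", 2: "high", 3: "agg"}
hops_to_agg = {0: 2, 1: 1, 2: 1, 3: 0}
soil = {(i, j): 1000.0 for i, js in neighbors.items() for j in js}
path, vel, carried, reached = walk(0, neighbors, soil, kind, hops_to_agg,
                                   random.Random(0))
```

After the walk, the edges on `path` hold less soil than the untouched ones, so later IWDs are more likely to follow the same route; repeating the walk over many iterations is what gradually shapes the aggregation tree.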

Assumptions
(1) IWDs will try to find paths to the nearest aggregation nodes instead of BS

Results and Discussions
A C++-based simulator is used to simulate the state-of-the-art ant colony optimization (ACO), IWD, and the proposed IIWD. This simulator models actual events such as collisions, carrier sensing, latency, network lifespan, and backoff. For aggregator nodes, the proposed requirements are consistent with iPAQ motes since they compute quicker, use less power, and have a greater sensing and communication range [64,65], while the proposed specifications for sensing nodes are consistent with MICA2 sensor nodes [66,67]. The performance of a 100-node network randomly spread across a 100 × 100 m² region with a single sink is evaluated. The total number of aggregation nodes picked for a specific simulation cycle ranges from 5 to 30. The data packet is of size 250. Table 2 displays the simulation parameters. Parameters for the proposed IIWD algorithm are presented in Table 3. The values are taken according to the parameters provided in [9]. The evaluation metrics are as follows: total energy consumption (J) and network lifetime (rounds), which are discussed in the subsequent sections.

Total Energy Consumption: Analysis.
Three quantities are examined: the energy spent transmitting control packets, the energy spent sending data packets, and the network's total energy consumption. The total energy consumption is the sum of the energy used by all network nodes in a particular round. Various simulations are run with a variety of source node counts. The fundamental IWD, the ACO, and the proposed IIWD algorithm are compared. As shown in Figure 4(a), the average energy consumption of IIWD for sending control packets is slightly higher than that of ACO and IWD. This may be a result of the increased number of control packets transmitted to refresh the edge soil. As demonstrated in Figure 4(b), the average energy consumption for transmitting data packets through IIWD is slightly less than that of IWD and ACO. This is because the number of hops necessary for data transmission has dropped.
In addition, it can be determined from Figure 4(c) that IIWD's overall energy usage is less than that of IWD and ACO since its routing function provides greater aggregation options. Consequently, the network's total energy consumption is dramatically reduced.

Network Lifetime.
A comparison is made between the performance of IIWD and that of ACO and IWD in terms of network lifespan, calculated as the time until the first node runs out of energy, as shown in Figure 5. The notion of updating the velocity of all IWDs along the designated path expedites the delivery of packets to aggregation nodes, hence extending the lifespan of the nodes. IIWD has shown the greatest improvement in network lifespan. This improvement is due to the algorithm's suggested method, which minimizes the total number of data packets transmitted in the network, consequently reducing the energy consumption of the nodes. As all aggregation nodes are directly linked to the BS, the resulting routing pathways are shorter. The results illustrate the efficacy of the proposed strategy for extending network life.
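The lifetime metric used here (rounds until the first node runs out of energy) can be illustrated with a short sketch. It is a simplified model that assumes a constant per-round cost for each node; the function name, node labels, and energy values are hypothetical and are not taken from the simulation.

```python
def network_lifetime(initial_energy, cost_per_round):
    """Return the number of rounds until the first node runs out of energy.

    initial_energy: dict mapping node id to initial energy in joules
    cost_per_round: dict mapping node id to energy spent per round (assumed constant)
    """
    if any(cost <= 0.0 for cost in cost_per_round.values()):
        raise ValueError("per-round costs must be positive")
    energy = dict(initial_energy)
    rounds = 0
    while all(level > 0.0 for level in energy.values()):
        for node, cost in cost_per_round.items():
            energy[node] -= cost
        rounds += 1
    return rounds


initial = {"low": 0.5, "high": 5.0}  # high node holds ten times the energy

# Scenario A: the low node carries a comparatively heavy relay load.
lifetime_a = network_lifetime(initial, {"low": 0.125, "high": 0.25})

# Scenario B: part of that load is shifted to the high node.
lifetime_b = network_lifetime(initial, {"low": 0.0625, "high": 0.5})
```

In this toy setting the lifetime doubles from 4 to 8 rounds when the load moves from the low node to the high node, even though the total energy spent per round increases, which illustrates why the first-node-death metric rewards relieving low-energy nodes.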

Payload.
Payload is determined by comparing the actual number of data packets delivered at the base station, as shown in Figure 6(a), with the number of packets supplied by the source nodes. Typically, payload consists of the actual data transported over a network for an application. The primary concern addressed in this paper is that the payload varies based on the aggregation process used. The effectiveness of aggregation in IIWD in a heterogeneous environment is determined using the payload parameter. Figure 6(b) depicts the number of data packets received at the base station for varying numbers of rounds. In the IIWD method, payload, or the total number of packets delivered over the network, is around 70 percent lower than in the IWD algorithm. The smaller number of data packets sent may be ascribed to the aggregation procedure. Figure 6(c) provides details of the packets sent to the BS. Instead of being routed immediately to the base station, the data packets are passed to the aggregation node, which aggregates them before sending them to the base station. Thus, the total number of rounds remains the same despite a modest reduction in the number of data packets transferred to the base station.

Number of Alive Nodes.
The proposed IIWD algorithm shows significant improvement in the number of alive nodes with respect to the total number of nodes. Since the nodes create paths to the aggregation nodes and send data packets only as far as these aggregation nodes, the nodes tend to stay alive longer and subsequently improve this performance metric vis-à-vis IWD. Here, the same logic implies that the incorporation of high-end nodes and the concept of waterfall in the network enhance the number of alive nodes in the network. Remarkably, the number of alive nodes in IIWD increases by almost 40% (Figure 7).

Conclusion
The suggested optimization model seeks to crystallize the different aspects that affect the sensor network's heterogeneity. The study draws on prior understanding of efficient routing algorithms for homogeneous networks. In the context of this study, aggregator nodes insert new packet entries with regard to time and pop them depending on the freshness factor defined by packet length for each node. According to the findings of this study, incorporating heterogeneity into a network may reduce the total energy consumption and increase network longevity. This method has been seen to save both the residual energy of the nodes and the average energy of the network. This is evident from the results of the studies conducted to determine the performance of the IIWD. The technique outperforms prior algorithms because the route selection performed by its routing function offers superior aggregation alternatives. The threshold of the nodes' remaining energy is used to determine which aggregation node is chosen. The findings demonstrate that the network's longevity has also been greatly enhanced.

Future Scope
The issue which leaves room for further investigation is that of time synchronization. There lies great scope for future research while exploring strategies for weak and strong time synchronization that may be experimented on, to further streamline the aggregation process with the BS.

Data Availability
Data are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflict of interest.