Comparison of Residual Energy-Based Clustering Algorithms for Wireless Sensor Network

Wireless sensor networks promise an exceptionally fine-grained interface between the virtual and physical worlds. Clustering is a key technique for reducing energy consumption. Many clustering, power management, and data dissemination protocols have been designed specifically for wireless sensor networks (WSNs), where energy awareness is an essential design issue. Each clustering algorithm is composed of three phases: cluster head (CH) selection, the setup phase, and the steady-state phase. The central issue in these algorithms is cluster head selection. The focus here is on residual energy-based clustering protocols, which may differ depending on the application and network architecture. In this paper, state-of-the-art clustering techniques for WSNs are surveyed and compared to identify their merits and demerits. It is assumed that the sensor nodes are randomly distributed and are not mobile, and that the coordinates of the base station (BS) and the dimensions of the sensor field are known.


Introduction
With the proliferation of automated devices and the development of wireless technologies, WSNs have gained worldwide attention in recent years. WSNs are an exciting emerging domain of deeply networked systems of low-power wireless nodes with a tiny amount of CPU and memory for high-resolution sensing of the environment [1]. The wireless nodes are a large number of low-cost, multifunctional sensor nodes deployed in a region of interest. A sensor node not only senses but also processes the data with its embedded microprocessor to make it meaningful, and communicates the processed data through its transceiver [2]. The nodes communicate over a short distance via a wireless medium and collaborate to accomplish a common task, for example, environment monitoring, battlefield surveillance, or industrial process control [3]. WSNs are made up of a large number of inexpensive devices that are networked via low-power wireless communications [4,5]. The networking capability of a sensor network overcomes the limitations of a mere collection of sensors by enabling cooperation, coordination, and collaboration among sensor assets [6]. Wireless sensor network technology is expected to have a significant impact on our lives in the twenty-first century by harvesting the advances of the past decade in microelectronics, sensing, analog and digital signal processing, wireless communications, and networking. Wireless sensor networks differ fundamentally from general data networks such as the internet, and as such they require the adoption of a different design paradigm [7,8]. Wireless sensor networks are often application specific: they are designed and deployed for special purposes. In the context of wireless sensor networks, the broadcast nature of the medium must be taken into account.
Because sensors are battery operated, energy conservation is one of the most important design parameters, since replacing batteries may be difficult or impossible in many applications [9]. Thus sensor network designs must be optimized to extend the network lifetime. Among the sources of energy consumption in a wireless sensor network, data transmission is the most significant. Within a clustering organization, intracluster communication can be single hop or multihop, as can intercluster communication [10]. Researchers have shown that multihop communication between a data source and a base station is usually more energy efficient than direct transmission because of the characteristics of the wireless channel [11]. Although many protocols proposed in the literature reduce energy consumption on forwarding paths to increase energy efficiency, they do not necessarily extend network lifetime, due to the continuous many-to-one traffic pattern. In a sensor node, energy consumption can be "useful" or "wasteful" [12,13]. Useful energy consumption is due to transmitting/receiving data, processing query requests, and forwarding queries/data to neighboring nodes. Wasteful energy consumption is due to idle listening to the media, retransmitting due to packet collisions, overhearing, and generating/handling control packets [14,15].
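As a rough illustration of why multihop forwarding is usually, but not always, cheaper than direct transmission, the following sketch compares the two under a first-order radio model. The model itself and the constants `E_ELEC` and `EPS_AMP` are assumptions chosen for illustration; they are not taken from this paper.

```python
# Illustrative first-order radio model (an assumption, not from this paper).
E_ELEC = 50e-9     # J/bit spent by the radio electronics (assumed value)
EPS_AMP = 100e-12  # J/bit/m^2 spent by the amplifier (assumed, d^2 path loss)


def tx_energy(bits, distance_m):
    """Energy to transmit `bits` over `distance_m` metres."""
    return bits * (E_ELEC + EPS_AMP * distance_m ** 2)


def rx_energy(bits):
    """Energy to receive `bits`."""
    return bits * E_ELEC


def direct_cost(bits, distance_m):
    """The source transmits straight to the base station."""
    return tx_energy(bits, distance_m)


def two_hop_cost(bits, distance_m):
    """The source transmits to a relay at the midpoint, which receives and forwards."""
    half = distance_m / 2
    return tx_energy(bits, half) + rx_energy(bits) + tx_energy(bits, half)


if __name__ == "__main__":
    bits = 4000
    for d in (20, 200):
        print(d, direct_cost(bits, d), two_hop_cost(bits, d))
```

Under these assumed constants, relaying pays off at 200 m but not at 20 m, where the extra receive and transmit electronics cost outweighs the shorter hops.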
Compared with traditional wireless communication networks, WSNs have the following unique characteristics and constraints.
(i) Dense node deployment: sensor nodes are usually densely deployed in a field of interest. The number of sensor nodes in a sensor network can be several orders of magnitude higher than that in a MANET.
(ii) Battery-powered sensor nodes: sensor nodes are usually powered by battery. In most situations, they are deployed in a harsh or hostile environment, where it is very difficult or even impossible to change or recharge the batteries.
(iii) Energy, computation, and storage constraints: sensor nodes are highly limited in energy, computation, and storage capacities.
(iv) Self-configurable: sensor nodes are usually randomly deployed without careful planning and engineering. Once deployed, sensor nodes have to autonomously configure themselves into a communication network.
(v) Application specific: sensor networks are application specific. A network is usually designed and deployed for a specific application. The design requirements of a network change with its application.
(vi) Unreliable sensor nodes: sensor nodes are usually deployed in harsh or hostile environments and operate without attendance. They are prone to physical damage or failure.
(vii) Frequent topology change: network topology changes frequently due to node failure, damage, addition, energy depletion, or channel fading.
(viii) No global identification: due to the large number of sensor nodes, it is usually not possible to build a global addressing scheme for a sensor network because it would introduce a high overhead for the identification.
(ix) Many-to-one traffic pattern: in most sensor network applications, the data sensed by sensor nodes flow from multiple source sensor nodes to a particular sink, exhibiting a many-to-one traffic pattern.
[18]. During the reformation of clusters, the cluster head is changed along with the members affiliated to it. Clustering improves resource utilization and minimizes energy consumption in WSNs by reducing the number of sensor nodes that take part in long-distance transmission. In a WSN, the primary concern is energy efficiency, in order to extend the utility of the network [19].
1.1.2. Why Do WSNs Require Clustering?
It has been shown that a cluster architecture guarantees basic performance in a WSN with a large number of sensor nodes. A cluster structure provides direct benefits such as spatial reuse of resources to increase system capacity: with a nonoverlapping multicluster structure, two clusters may use the same frequency or code set if they are not neighboring clusters [20,21]. Clusters also improve routing performance, because the set of cluster heads normally forms a virtual backbone for intercluster routing. Clustering in WSNs is very challenging due to the inherent characteristics that distinguish these networks from other wireless networks such as mobile ad hoc networks or cellular networks [22]. First, due to the relatively large number of sensor nodes, it is difficult to identify every sensor and the sensed data. Furthermore, sensor nodes deployed in an ad hoc manner need to be self-organizing, since such deployment requires the nodes to form connections among themselves [23,24].

What Is the Cost of Clustering?
In a clustered network, the cost is divided into intracluster and intercluster cost. The intracluster communication cost is that of communication from the nodes inside a cluster to the head [25]. The intercluster communication cost is that of communication from the heads to the base station. The energy efficiency of a clustered sensor network depends on the selection of the heads. The cost of clustering is a key issue in validating the effectiveness and scalability of a cluster structure [26,27]. By analysing the cost of a clustering scheme qualitatively or quantitatively in different aspects, its usefulness and drawbacks can be clearly specified.
(i) When the underlying network topology changes quickly and involves many mobile nodes, the clustering-related information exchange increases drastically.
(ii) Some clustering schemes may cause the cluster structure to be completely rebuilt over the whole network when a CH's residual energy falls below a limit.
(iii) Another metric is the computation round, which indicates the number of rounds in which a cluster formation procedure can be completed.
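The split into intracluster and intercluster cost can be made concrete with a small sketch. Using squared Euclidean distance as a proxy for transmission energy is an assumption made here for illustration, and the function names are hypothetical.

```python
import math


def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def clustering_cost(clusters, base_station):
    """Return (intracluster, intercluster) cost.

    clusters maps each head position (x, y) to a list of member positions.
    Squared distance stands in for transmission energy (assumed proxy).
    """
    intra = sum(
        dist(member, head) ** 2
        for head, members in clusters.items()
        for member in members
    )
    inter = sum(dist(head, base_station) ** 2 for head in clusters)
    return intra, inter


if __name__ == "__main__":
    example = {(0, 0): [(3, 4)], (10, 0): [(10, 1)]}
    print(clustering_cost(example, base_station=(0, 10)))
```

Comparing the two totals for different head choices shows how head selection shifts cost between members and heads.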
The remainder of this paper is organized as follows. Section 2 gives an account of existing clustering algorithms for WSNs. Section 3 compares the algorithms by simulation in terms of lifetime and residual energy. The algorithms are summarized in Section 4. Finally, Section 5 gives the conclusions.

Hybrid Energy-Efficient Distributed Clustering (HEED).
HEED (Hybrid Energy-Efficient Distributed clustering) [28] is a distributed clustering scheme in which cluster heads are selected periodically according to a hybrid of the node residual energy and a secondary parameter, namely intracluster communication cost. HEED selects as cluster head the node that has the highest residual energy and requires the minimum distance for communication. Intracluster communication cost is a function of cluster properties, such as cluster size, and of whether or not variable power levels are permissible for intracluster communication. If the power level used for intracluster communication is fixed for all nodes, then the cost can be proportional to either (i) node degree, if the requirement is to distribute load among cluster heads, or (ii) 1/node degree, if the requirement is to create dense clusters.
The average of the minimum power (AMP) levels required by all M nodes within the cluster range to reach the cluster head is
$$\mathrm{AMP} = \frac{\sum_{i=1}^{M} \min(p_i)}{M},$$
where $\min(p_i)$ denotes the minimum power level required by a node $v_i$, $1 \le i \le M$, and M is the number of nodes within the cluster range.
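The AMP cost can be computed directly from the per-node minimum power levels. The sketch below is illustrative: the function names, and the idea of a node comparing candidate heads by this cost, are assumptions based on the description above.

```python
def average_minimum_power(min_power_levels):
    """AMP: mean of the minimum power levels the M in-range nodes need."""
    if not min_power_levels:
        raise ValueError("at least one node is required")
    return sum(min_power_levels) / len(min_power_levels)


def cheapest_head(candidates):
    """Pick the candidate head with the lowest AMP cost.

    candidates maps a head id to the list of min(p_i) values of the
    nodes within its cluster range (hypothetical input layout).
    """
    return min(candidates, key=lambda h: average_minimum_power(candidates[h]))


if __name__ == "__main__":
    print(average_minimum_power([1.0, 2.0, 3.0]))
    print(cheapest_head({"h1": [1.0, 2.0, 3.0], "h2": [0.5, 1.5]}))
```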
Initialization Phase. In HEED, clustering is triggered every $T_{CP} + T_{NO}$ seconds to select new cluster heads, where $T_{CP}$ is the time required to create a cluster and $T_{NO}$ is the time interval between the end of one $T_{CP}$ and the start of the next. Before the start of execution, each node sets its probability of becoming a cluster head, $CH_{prob}$, as
$$CH_{prob} = C_{prob} \times \frac{E_{residual}}{E_{max}},$$
where $C_{prob}$ is the initial percentage of cluster heads among all n nodes, $E_{residual}$ is the estimated current residual energy of the node, and $E_{max}$ is its maximum energy.
Repetition Phase. In the repetition phase, every sensor goes through several iterations until it finds the cluster head that requires the least transmission power (cost). If it hears from no other CH, the sensor elects itself as a CH and sends an announcement message to its neighbours informing them about the change of status. Finally, each sensor doubles its $CH_{prob}$ value and goes to the next iteration of this phase. It stops executing this phase when its $CH_{prob}$ reaches one.
Finalization Phase. Finally, each sensor makes a final decision on its status. A node either elects to become a cluster head according to its $CH_{prob}$ or joins a cluster according to the cluster head messages overheard within its cluster range. HEED has a worst-case processing time complexity of O(n) per node, where n is the number of nodes in the network. It also has a worst-case message exchange complexity of O(1) per node, that is, O(n) in the network. The probability that two nodes within each other's cluster range both become cluster heads is very low. HEED terminates after a constant number of iterations, independent of the network diameter.
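The doubling rule explains why HEED terminates after a bounded number of iterations. The sketch below lists the $CH_{prob}$ values a node steps through. The lower bound `p_min` is an assumption added here so that a node with very little energy still terminates; the function names are hypothetical.

```python
def initial_ch_prob(c_prob, e_residual, e_max, p_min=1e-4):
    """CH_prob = C_prob * E_residual / E_max, kept within [p_min, 1]."""
    return min(1.0, max(c_prob * e_residual / e_max, p_min))


def doubling_schedule(ch_prob):
    """Values of CH_prob over the repetition phase, ending when it reaches 1."""
    schedule = [ch_prob]
    while schedule[-1] < 1.0:
        schedule.append(min(1.0, schedule[-1] * 2))
    return schedule


if __name__ == "__main__":
    start = initial_ch_prob(c_prob=0.05, e_residual=1.0, e_max=2.0)
    print(doubling_schedule(start))
```

Because the probability doubles each time, the number of iterations grows only logarithmically in 1/CH_prob, regardless of network size.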

Distributed Weight-Based Energy-Efficient Hierarchical Clustering (DWEHC).
Ding et al. [29] proposed Distributed Weight-Based Energy-Efficient Hierarchical Clustering (DWEHC) to achieve better cluster size balance and to optimize clusters such that the minimum energy topology is maintained. DWEHC makes no assumptions about the size and density of the network. The algorithm is executed by each node individually. Nodes using DWEHC follow a hierarchical structure for clustering. The number of levels in the hierarchy depends on the cluster range and the minimum energy required to reach the cluster head. Within a cluster, TDMA (Time Division Multiple Access) is used for transmission; that is, within a particular time frame one sensor can send data to the cluster head. Each node calculates its weight after locating the neighbouring nodes in its area. The weight is a function of the sensor's reserve energy and its proximity to the neighbours. The node with the largest weight is elected as CH and the remaining nodes become members. Each node in the network is either a cluster head or a child (first level, second level, etc.). DWEHC relies on the following definitions.
(i) Relay: the authors consider only the path loss between two sensors as a function of their distance, assuming all sensors have similar antenna heights.
(ii) Relay region: let s be the sender node and let r be the relay node; the nodes in the relay region can be reached with the least energy by relaying through r.
(iii) Enclosure region: the enclosure region is the complement of the relay region.
(iv) Neighbours: the nodes that can receive directly from a node s, without relaying, when s transmits.
(v) Cluster range: the radius of the cluster, that is, the largest distance between a node and the cluster head inside the cluster.
(vi) Weight used in cluster head election: the weight is calculated from parameters such as the distance between the node and the receiver, the residual energy of the node, and the initial energy of the node.
(vii) Levels in a cluster: each cluster is multilevel, so the number of levels in a cluster depends on the cluster range and the minimum energy path to the cluster head.
(viii) $My_{range}$ and $My_{dis}$: $My_{range}$ is the distance from the corresponding node to the cluster head and $My_{dis}$ is the minimum energy path to the cluster head.
Let $T_{generate}$ be the time DWEHC takes to generate the clusters.
Two types of communication may occur in a clustered network: intracluster and intercluster. The time required for these operations is $T_{cluster}$. $T_{cluster}$ should be much longer than $T_{generate}$ to guarantee good performance. To prevent a cluster head from dying due to energy loss, the DWEHC algorithm runs periodically, every $T_{cluster} + T_{generate}$. For intercluster communication, TDMA is used for transmission. DWEHC is fully distributed over the whole network, and every node is covered by only one cluster head. The cluster heads are distributed in such a way that when two nodes are within each other's cluster range, the probability of both of them becoming cluster heads is very small. The complexity of broadcast message exchange is O(1) for each node.
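The weight-based election can be sketched as follows. The exact weight expression is not given in this section, so the form used below (a proximity sum scaled by the ratio of residual to initial energy) is an assumption chosen only because it grows with both residual energy and neighbour proximity. All function names are hypothetical.

```python
import math


def dwehc_style_weight(node, neighbours, cluster_range, e_residual, e_initial):
    """Assumed weight: closer neighbours and more remaining energy score higher."""
    proximity = sum(
        (cluster_range - math.dist(node, u)) / (6 * cluster_range)
        for u in neighbours
        if math.dist(node, u) <= cluster_range
    )
    return proximity * e_residual / e_initial


def elect_head(weights):
    """Largest weight wins; ties are broken by the smallest node id."""
    return min(weights, key=lambda n: (-weights[n], n))


if __name__ == "__main__":
    w = dwehc_style_weight((0, 0), [(3, 4)], cluster_range=10,
                           e_residual=1.0, e_initial=2.0)
    print(w)
    print(elect_head({"a": 0.2, "b": 0.5, "c": 0.5}))
```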

Hybrid Clustering Approach (HCA).
Neamatollahi et al. [30] proposed HCA, a distributed clustering algorithm for wireless sensor networks. When a CH's energy level falls to a predefined value, it indirectly informs the other nodes, so that clustering is performed at the beginning of the next round. The network lifetime L is defined as the time elapsed until the first node in the network dies. In HCA, clustering is not performed in every round, as it is in dynamic clustering approaches. Each CH saves its residual energy in memory after the clusters are formed. When the residual energy of a CH becomes less than a predefined value, it sets a specific bit in a data packet that is ready to be sent to the BS in the current TDMA frame, so that the BS informs all nodes that the clustering process will start at the beginning of the next round. The BS sends a specific synchronization pulse in a multihop fashion to all nodes. After receiving the pulse, each node prepares itself to perform clustering. Thus, cluster head election and consequently cluster formation are done on demand.
The authors argued that their approach can be useful for applications that require scalability and prolonged network lifetime. After the first setup phase, clustering is not performed again until at least one of the CHs has consumed a predefined part of its energy. This avoids the large overhead that a clustering process at the beginning of every round imposes on the network. Compared to LEACH and HEED, HCA gives 30% more efficiency in terms of lifetime.
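The on-demand trigger can be sketched as a flag that a cluster head raises once its energy drops below a fraction of the value it saved at cluster formation. Expressing the "predefined value" as a fraction `alpha` of the saved energy is an assumption made for illustration, as are the class and function names.

```python
class ClusterHead:
    """Minimal model of a CH that requests reclustering on demand."""

    def __init__(self, energy, alpha=0.5):
        self.energy = energy
        self.saved_energy = energy  # stored when the clusters are formed
        self.alpha = alpha          # assumed threshold fraction

    def consume(self, amount):
        self.energy = max(0.0, self.energy - amount)

    def recluster_bit(self):
        """Bit piggybacked on the data packet sent to the BS."""
        return self.energy < self.alpha * self.saved_energy


def base_station_should_recluster(bits):
    """The BS starts clustering next round if any CH set its bit."""
    return any(bits)


if __name__ == "__main__":
    heads = [ClusterHead(1.0), ClusterHead(1.0)]
    rounds = 0
    while not base_station_should_recluster(h.recluster_bit() for h in heads):
        heads[0].consume(0.3)  # one head drains faster than the other
        heads[1].consume(0.1)
        rounds += 1
    print("reclustering requested after", rounds, "rounds")
```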

Energy Efficient Heterogeneous Clustered Scheme (EEHCS).
De Freitas et al. [10] proposed EEHCS, an energy-efficient heterogeneous clustering scheme for wireless sensor networks, which is based on weighted election probabilities of each node to become a cluster head according to the residual energy in each node. The algorithm starts the clustering process in a heterogeneous network whose nodes have different amounts of energy at the beginning. Some of the sensor nodes are equipped with more energy resources than the normal sensor nodes in the network. The authors proposed three types of sensors: super nodes, advanced nodes, and normal nodes. Advanced and super nodes are more powerful and have higher battery power than normal nodes. The scheme elects cluster heads in a distributed fashion in a hierarchical WSN. The algorithm is based on the principles of the LEACH algorithm. The authors described how the election process of cluster heads should be adapted to deal with heterogeneous nodes. The optimal probability of a node being elected as a cluster head is a function of spatial density when nodes are uniformly distributed over the sensor field. This clustering scheme is considered optimal because the energy consumption is well distributed over all sensors and the total energy consumption is minimum. LEACH depends only on the spatial density of the sensor network, because it works on homogeneous networks, in which all nodes have the same energy. EEHCS, in contrast, works on heterogeneous networks, which are a mixture of super, advanced, and normal nodes. The lifetime of the network is much better than with LEACH, because the super nodes always become the cluster heads.
A distributed election clustering protocol (DECP) [31] has been proposed to prolong the network lifetime of wireless sensor networks; it elects suitable cluster head nodes based on residual energy and communication cost. This distributed clustering protocol works for two-level heterogeneous wireless sensor networks. In DECP, cluster head election is a function of residual energy and communication cost. If energy is not balanced across the nodes, the node with the highest energy is considered for CH selection; if the network is energy balanced, the communication cost is considered instead. DECP provides better load balance than classical protocols such as LEACH and SEP. In the cluster formation process, all nodes broadcast their current energy information and listen for the energy messages of the others. Once a node has sufficient knowledge about its neighbours, it calculates the cost based on the distance to the neighbours and the current energy. Each node then selects the minimum-cost sensor node as its candidate and sends a vote msg to it. Upon receiving the vote msg from its neighbours, the node declares itself cluster head, and all non-CH nodes join a CH to form a cluster. The protocol does not need any global energy knowledge during clustering. As long as nodes exchange local information, cluster head nodes can be selected. DECP is scalable, as it does not require the exact position of each node in the field.
EDFCM [32] is a clustering protocol proposed for heterogeneous networks to provide longer lifetime and more reliable transmission service. Unlike other energy-efficient protocols, which consider the residual energy and energy consumption rate of the nodes, cluster head selection in EDFCM is based on a one-step energy consumption forecast. In addition, management nodes play a cooperative role in the selection of cluster heads to ensure that the number of cluster heads per round is optimal. The algorithm tries to balance energy consumption round by round, which provides the longest stable period for the network. In real heterogeneous application scenarios, a node functioning as a cluster head, although it had more residual energy than the others in the previous round, may die or consume much more energy in the next round due to computational heterogeneity. Furthermore, since the nodes are deployed uniformly and the number of non-cluster-head nodes per cluster is almost the same in every round, the energy dissipation of a cluster head depends only on the locations of the nodes in its cluster. The energy dissipations in successive rounds can therefore be regarded as correlated. EDFCM uses the average energy consumption of each of the two types of cluster heads in the previous round as the forecast of their energy consumption in the next round. The more residual energy a node is forecast to have after the next round, the higher the probability that it will be selected as a cluster head. The contributions of EDFCM are to provide the longest stability period (the time until the first node dies) and to improve the clustering management scheme of LEACH and LEACH-based algorithms. Compared with other typical clustering protocols, EDFCM yields a longer stability period and more effective messages transmitted to the base station, and the number of clusters per round in EDFCM is stable.
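The one-step forecast in EDFCM can be illustrated with a short sketch: each node subtracts the previous round's average cluster head consumption for its type from its residual energy, and nodes with a larger forecast are favoured. Normalising the forecasts into selection weights is an assumption made here; the type labels `type_a` and `type_b` and the function names are hypothetical.

```python
def forecast_residual(e_residual, node_type, avg_ch_consumption):
    """Energy the node is forecast to have left after serving as CH next round."""
    return max(0.0, e_residual - avg_ch_consumption[node_type])


def selection_weights(nodes, avg_ch_consumption):
    """Map node id -> weight proportional to forecast residual energy.

    nodes maps a node id to (residual energy, node type).
    avg_ch_consumption maps a node type to the previous round's average
    energy consumption of cluster heads of that type.
    """
    forecasts = {
        nid: forecast_residual(energy, ntype, avg_ch_consumption)
        for nid, (energy, ntype) in nodes.items()
    }
    total = sum(forecasts.values())
    if total == 0:
        return {nid: 0.0 for nid in nodes}
    return {nid: f / total for nid, f in forecasts.items()}


if __name__ == "__main__":
    nodes = {"n1": (1.0, "type_a"), "n2": (1.5, "type_b")}
    print(selection_weights(nodes, {"type_a": 0.2, "type_b": 0.5}))
```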

Energy-Efficient Unequal Clustering (EEUC).
Li et al. proposed EEUC [33], an energy-efficient clustering protocol for periodical data gathering applications in WSNs. The authors aimed to remove the hotspot problem that arises in multihop routing. The hotspot problem arises when the cluster heads closer to the data sink die early because they are burdened with heavy relay traffic. The cluster heads nearer to the base station are heavily loaded with network traffic and lose energy quickly compared with the CHs farther from the BS. To solve this problem, the authors proposed an algorithm in which the clusters closer to the base station have smaller sizes, so that their heads consume less energy during intracluster communication and preserve more energy for intercluster relay traffic. After network deployment, the BS broadcasts a "hello" message to all the nodes in the network at a certain power level. Each node then calculates its approximate distance from the BS, which helps the algorithm form clusters of unequal size. Figure 1 gives an overview of the algorithm, in which circles of different sizes denote clusters of unequal size according to the distance of the nodes from the BS.
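One way to obtain clusters that shrink toward the base station is to let each candidate head use a competition radius that grows with its distance from the BS. The linear form below, the weighting constant `c`, and the maximum radius `r_max` are assumptions made for illustration; the section does not state the expression EEUC uses.

```python
def competition_radius(d_to_bs, d_min, d_max, r_max, c=0.5):
    """Radius that grows linearly from (1 - c) * r_max at d_min to r_max at d_max."""
    if d_max <= d_min:
        raise ValueError("d_max must exceed d_min")
    if not 0 <= c <= 1:
        raise ValueError("c must lie in [0, 1]")
    d = min(max(d_to_bs, d_min), d_max)  # clamp to the known distance range
    return (1 - c * (d_max - d) / (d_max - d_min)) * r_max


if __name__ == "__main__":
    for d in (10, 60, 110):
        print(d, competition_radius(d, d_min=10, d_max=110, r_max=90))
```

With `c = 0.5`, the heads nearest the BS compete over half the radius of the farthest heads, so more and smaller clusters form where relay traffic is heaviest.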
The responsibility of being a cluster head is rotated among sensors in each data gathering round to distribute the energy consumption across the network. Figure 1 shows that cluster size decreases as the distance between the CH and the BS decreases. The algorithm elects cluster heads in a distributed manner, and cluster head selection is primarily based on the residual energy of each node. Throughput results show that unequal clustering improves the network lifetime and balances the energy consumption in the network compared with LEACH and HEED.

Distributed Energy-Efficient Clustering Algorithm for HWSN (DEEC).
Qing et al. proposed DEEC [34], a distributed multilevel clustering algorithm for heterogeneous wireless sensor networks. The cluster head is elected with a probability based on the ratio between the residual energy of each node and the average energy of the network. The lifetime of a cluster head is decided according to its initial energy and residual energy, so the nodes with high initial and residual energy always have a better chance of becoming a CH. DEEC is based on the concepts of the LEACH algorithm. The role of cluster head is rotated among all nodes of the network to make the energy dissipation uniform. Two levels of heterogeneous nodes are considered in this algorithm to achieve longer network lifetime and more effective messages than other classical clustering algorithms. It also works better for multilevel heterogeneous networks.
In DEEC, all the nodes must know the total energy and lifetime of the network. The average energy of the network is used as the reference energy. The authors chose different $n_i$ based on the residual energy $E_i(r)$ of node $s_i$ at round r, where $n_i$ denotes the number of rounds for which node $s_i$ is a cluster head. At the start of a new round, each node $s_i$ computes the average probability $p_i$ from the total energy $E_{total}$, while the estimated lifetime R is broadcast by the base station, where R is the total number of rounds from the beginning of the network until all the nodes die.
The authors assumed that the N nodes are distributed uniformly in an M × M region and, for simplicity, that the base station is located in the center of the field. Each non-cluster-head node sends L bits of data to the cluster head in a round. DEEC does not require any global knowledge of energy at every election round. The election threshold $T(s_i)$ decides whether the node $s_i$ will become a cluster head in the current round.
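The quantities $p_i$, $E_{total}$, R, and $T(s_i)$ are named above without their expressions. The sketch below assumes a LEACH-style threshold combined with an election probability scaled by the ratio of residual to estimated average energy. The parameter `p_opt`, the linear decay of the average energy estimate, and all function names are assumptions, not formulas taken from this section.

```python
import random


def average_energy_estimate(e_total, r, total_rounds, n_nodes):
    """Estimated average node energy at round r, assuming linear decay over R rounds."""
    return e_total * (1 - r / total_rounds) / n_nodes


def election_probability(p_opt, e_i, e_avg):
    """p_i grows with the node's residual energy relative to the network average."""
    if e_avg <= 0:
        return 0.0
    return min(1.0, p_opt * e_i / e_avg)


def threshold(p_i, r, eligible=True):
    """LEACH-style threshold T(s_i) with rotating epoch n_i of about 1 / p_i rounds."""
    if not eligible or p_i <= 0:
        return 0.0
    if p_i >= 1:
        return 1.0
    epoch = round(1 / p_i)
    return min(1.0, p_i / (1 - p_i * (r % epoch)))


def becomes_head(p_i, r, rng=random):
    """A node becomes CH when its random draw falls below the threshold."""
    return rng.random() < threshold(p_i, r)


if __name__ == "__main__":
    e_avg = average_energy_estimate(e_total=100.0, r=0, total_rounds=1000, n_nodes=100)
    p = election_probability(p_opt=0.1, e_i=2.0, e_avg=e_avg)
    print(e_avg, p, threshold(p, r=0))
```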

Energy Efficient Clustering Scheme (EECS).
Ye et al. proposed EECS [35], a distributed, energy-efficient, and load-balanced clustering algorithm for periodical data gathering applications in WSNs. The algorithm elects as cluster heads the sensor nodes with more residual energy through local radio communication, while achieving a good cluster head distribution. During CH election, some candidate nodes are selected, and they compete among themselves to become cluster heads. EECS is based on the features of LEACH, the most popular clustering algorithm. It uses single-hop communication between the CH and the base station. At the time of cluster formation, the BS broadcasts a "hello" message to all the nodes at a certain power level. After receiving the "hello" message, each node can compute its approximate distance to the BS based on the received signal strength.
In the cluster head election phase, a node becomes a CANDIDATE node with a probability T. After becoming a candidate, it broadcasts a COMPETE_HEAD_MSG to all the nodes present within radio range $R_{compete}$. Each candidate node always checks for alternatives that have more residual
ISRN Sensor Networks
minimum communication according to the received signal strength.
The overhead complexity across the network is O(n), where n is the number of nodes. There is at most one cluster head in every $R_{compete}$ radio range; hence the cluster heads are evenly distributed. EECS produces a uniform distribution of cluster heads across the network through localized communication with little overhead. Simulation results show that EECS increases network lifetime by 35% compared to LEACH. The algorithm uses local radio communication for CH selection based on residual energy.

Multihop Routing Protocol with Unequal Clustering (MRPUC).
Gong et al. proposed MRPUC [36], a distributed multihop routing protocol with unequal clustering for WSNs to enhance network lifetime. The BS is located in the centre of the sensing field, which helps balance the energy consumption. All nodes are associated with a cluster to avoid sensing holes. All nodes have the same initial energy and a unique identifier (ID) at the start of the clustering process. The algorithm chooses as cluster head the sensor that has more residual energy.
When a new node wants to join a cluster, it considers both the distance to the cluster heads and the residual energy of the cluster heads. In the final step, the algorithm selects as relay nodes those with minimal energy consumption for forwarding the packet and maximal residual energy, to avoid early expiry. The base station broadcasts a BS ADV message to all the sensor nodes in the network at a certain power level. Based on the received signal strength, each node computes its approximate distance from the base station. Each node then broadcasts a HELLO(ID, E) message within range $R_{max}$ to its neighbours, collects the corresponding information about them, and saves it in a table. After that, the node with high residual energy is elected cluster head and broadcasts a message to all its members.
For intercluster communication, each CH broadcasts a control message to all its neighbours, and an intercluster tree is formed with multihop communication to save energy. The CH collects the reply messages from neighbouring cluster heads and stores them in a table. Based on the approximate distances calculated from the table, a suitable CH is chosen as the parent node. After the tree is formed, each node turns off its radio until its allocated transmission time arrives, and then sends its data packet to the cluster head during its allocated time. After all the data have been received, the cluster head aggregates the data packets into a single packet and sends it to the parent node, which forwards the received packet toward the base station.
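The rule that a joining node weighs both distance and the head's residual energy can be sketched with a simple cost function. The specific ratio used below (squared distance divided by residual energy) is an assumption made for illustration; the section only states that both quantities are considered. The function names and the input layout are hypothetical.

```python
import math


def join_cost(node_pos, head_pos, head_energy):
    """Assumed cost: far heads and nearly depleted heads are both penalised."""
    if head_energy <= 0:
        return math.inf
    return math.dist(node_pos, head_pos) ** 2 / head_energy


def choose_head(node_pos, heads):
    """heads maps a head id to (position, residual energy); lowest cost wins."""
    return min(
        heads,
        key=lambda h: join_cost(node_pos, heads[h][0], heads[h][1]),
    )


if __name__ == "__main__":
    heads = {"near_weak": ((1, 0), 0.1), "far_strong": ((2, 0), 1.0)}
    print(choose_head((0, 0), heads))
```

In this example the node prefers the more distant head because the nearer one has little energy left.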

Simulation Results
The performance of the existing clustering algorithms is evaluated via simulation in this section. This work uses MATLAB as the simulation tool, and all simulations are conducted on networks using IEEE 802.15.4 at the MAC layer. We consider a wireless sensor network with N = 200 nodes randomly distributed in a 250 m × 250 m field. Without loss of generality, we assume the base station is in the centre of the sensing region. Simulation parameters are listed in Table 1. Our goal is to compare the performance of these algorithms and the level of residual energy each retains after a certain number of rounds. To compare the performance of these protocols, we ignore the effects of signal collision and interference in the wireless channel.
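The deployment described above can be reproduced with a few lines. The simulations in this paper were run in MATLAB; the Python sketch below only illustrates the stated setup (200 nodes placed uniformly at random in a 250 m × 250 m field, base station at the centre). The function name and the `seed` parameter are hypothetical.

```python
import random


def deploy(n_nodes=200, side_m=250.0, seed=None):
    """Return (node positions, base station position) for a square field."""
    rng = random.Random(seed)
    nodes = [
        (rng.uniform(0, side_m), rng.uniform(0, side_m))
        for _ in range(n_nodes)
    ]
    base_station = (side_m / 2, side_m / 2)
    return nodes, base_station


if __name__ == "__main__":
    nodes, bs = deploy(seed=1)
    print(len(nodes), bs)
```

Fixing the seed makes a deployment repeatable, so different protocols can be compared on the same topology.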
Figure 2(a) shows that EDFCM retains higher residual energy after the 5000 rounds used in this simulation. DWEHC also reaches the level of EDFCM, but it starts from a higher level and in subsequent rounds falls to an average level of 0.5. All other algorithms maintain a certain level of residual energy despite the energy dissipated in each round; only HCA reaches zero, after 3600 rounds.
Figure 2(b) illustrates the total number of nodes alive over time, which indicates the lifetime of the network. HEED performs much better than the other protocols. Some protocols, such as MRPUC, EEUC, and EECS, start well but cannot sustain that performance for long. HCA also performs well in terms of lifetime compared with the other protocols except HEED. The remaining protocols also start well, but their lifetime cannot be maintained for long.

Summary
See Table 2.

Conclusions
In this paper, a detailed simulation survey of clustering algorithms that treat residual energy as the major concern is presented for energy-constrained wireless sensor networks. The paper starts with the definition of clustering and its benefits to WSNs. Simulation results are discussed to describe the effect of CH selection and cluster size based on parameters such as cluster density, frequency of reelection, and frequency of cluster changes. An overall comparison is presented in a table highlighting the characteristics, strengths, and weaknesses of the algorithms. In addition to energy constraints, quality of service metrics such as delay, data loss tolerance, and network lifetime expose reliability issues when designing recovery mechanisms for clustering schemes.
The protocols presented in this paper offer a promising improvement over conventional clustering; however, there is still much work to be done. Optimal clustering in terms of energy efficiency should eliminate all overhead associated not only with the cluster head selection process, but also with the association of nodes to their respective cluster heads. Further improvements in reliability should examine possible modifications to the reclustering mechanisms following the initial CH selection. These modifications should be able to adapt the network clusters to maintain network connectivity while reducing the resources wasted on periodic reclustering.
(x) Data redundancy: in most sensor network applications, sensor nodes are densely deployed in a region of interest and collaborate to accomplish a common sensing task. Thus, the data sensed by multiple sensor nodes typically have a certain level of correlation or redundancy.