Cluster Head Selection in a Homogeneous Wireless Sensor Network Ensuring Full Connectivity with Minimum Isolated Nodes

This work proposes a cluster head (CH) selection algorithm for a wireless sensor network. A node can become a cluster head if it is connected to at least one unique neighbor node, where a unique neighbor is one that is not connected to any other node. If no node has a connected unique neighbor, the CH is selected on the basis of residual energy and the number of neighbor nodes. Because the processing energy of the network increases with the number of clusters, the algorithm forms the minimum number of clusters, which in turn increases the network lifetime. The major novel contribution of the proposed work is an algorithm that ensures a completely connected network with the minimum number of isolated nodes. A node remains isolated only if it is not within the transmission range of any other node. With maximum connectivity, the coverage of the network is automatically maximized. The superiority of the proposed design is verified by simulations in MATLAB, which show that the total number of rounds before the network dies out is higher than for the existing protocols considered.


Introduction
A wireless sensor network (WSN) is a network of a large number of densely deployed sensor nodes. WSNs are deployed to monitor physical events or the state of physical objects such as bridges in order to support an appropriate reaction that avoids potential damage [1]. The nodes and the related protocols in a WSN should be designed to be extremely energy efficient, as battery recharging may be impossible [2]. In a direct communication WSN, the sensor nodes transmit their sensed data directly to the base station (BS) or sink without any coordination between the two. In cluster-based WSNs, however, the network is divided into clusters. Each node exchanges its information only with its cluster head (CH), which transmits the aggregated information to the BS.
The most important phase of cluster-based routing protocols is the cluster head selection (CHS) procedure, which ensures a uniform distribution of energy among the sensors and consequently increases the lifespan of the sensor network [3]. Once the CHs are identified, they form a backbone network to periodically collect, aggregate, and forward data to the BS using minimum energy (cost) routing. This method significantly enhances the network lifetime compared to other known methods. The major challenges include the equal distribution of clusters over the entire sensor network and the energy dissipation caused by the frequent information exchange between the selected cluster head and the nodes in the cluster in every setup phase of cluster formation [4,5]. If the CH is selected on the basis of the maximum number of connected nodes, it may happen that one or more nodes are not connected to any of the selected cluster heads, even though they are within transmission range. Such nodes are called isolated nodes. The proposed algorithm performs cluster head selection based on the unique node concept. A unique node is one that is not connected to any other cluster head. The current paper describes CHS using two other parameters as well, namely, the number of neighboring nodes and the residual energy of the node.
The rest of the paper is organized as follows. Section 2 describes the related work on routing in WSNs and emphasizes existing CH selection methods. Section 3 explains the system model and the assumptions used to design the algorithm. Section 4 gives the CH selection algorithm in detail along with its mathematical model. A flowchart is also used to depict the proposed algorithm. Section 5 gives the simulation results obtained in MATLAB and compares them with existing methods. Section 6 concludes and proposes future work.

Related Work
WSNs employ low-cost, densely deployed, tiny electronic nodes connected to each other via wireless communication [12]. The most power-consuming activity of a sensor node is radio communication, which must be kept as low as possible. In order to reduce the amount of traffic in the network, we build clusters of sensor nodes as proposed in [12][13][14].
Clustering is a useful mechanism in wireless sensor networks that helps to combat scalability problems and, if combined with in-network data aggregation, may increase the energy efficiency of the network [15]. The CH is the main entity in a wireless sensor network, and all the responsibility for data aggregation and communication lies with this single entity [2]. The CH should be chosen in such a way that the coverage of the network is maximized. Coverage is regarded as one of the important quality of service (QoS) parameters of a WSN for evaluating its monitoring capability [8].
ACE [8] successfully distributes clusters uniformly over the network but suffers from its unawareness of the residual energy of cluster head candidates, which can result in electing a cluster head with a low energy level. The other disadvantage is that ACE strictly draws a line between nodes that can be a cluster head and those that cannot. In some cases, this assumption may be unrealistic, especially when all the nodes within a cluster have low power resources. The current research work eliminates both problems by considering the residual energy while selecting the CH. The proposed work considers a homogeneous network in which all the nodes are identical, have equal power, and have an equal opportunity of becoming a CH.
In [10], the authors discuss connectivity in which each mote is connected to at least a specified number of other motes in the same cover. However, it may happen that a node in the network is not connected to any other node at all, in which case a completely connected network is not ensured.
Table 1 inspires the authors to come up with a CHS algorithm that selects a CH on the basis of any unique node connected to it, the residual energy, and the number of nodes connected to it.
If $R_t$ (150 m) and $R_s$ (75 m) represent the transmission and sensing ranges, respectively, it is assumed that $R_t \geq 2R_s$ as given in [5]. If there is one more sensor within the transmission range of a sensor, then the sensed information is the same for both sensors.
(5) The nodes are randomly deployed.
(6) Sensor nodes are stationary.
(7) The BS (ID 0) is fixed and installed in the middle of the network at (250, 250).
(8) The nodes are proactive; that is, they transfer data to the BS at periodic intervals.
(9) The CHs are selected initially at the onset of the network. The CH is reselected or rotated as per the proposed algorithm if the energy of the CH falls below the threshold.
In the first-order model [2] shown in Figure 1, different assumptions about the radio characteristics, such as the energy dissipation in the transmitting and receiving nodes, the path loss exponent, and so forth, will change the advantages of different protocols. Here, we have assumed a simple model where the radio dissipates 50 nJ/bit to run the transmitter or receiver circuitry. Further, it requires 100 pJ/bit/m$^2$ for the transmit amplifier to achieve an acceptable $E_b/N_0$, as depicted in Table 2. These parameters are slightly better than the current state of the art in radio design. The impact of energy loss due to channel transmission is also considered. Thus, to transmit a $k$-bit message over a distance $d$ using our radio model, the energy required is given by
$$E_{Tx}(k, d) = E_{Tx\text{-}elec}(k) + E_{Tx\text{-}amp}(k, d) = E_{elec} \cdot k + \varepsilon_{amp} \cdot k \cdot d^2, \quad d < d_0.$$

Methodology
The energy spent in receiving a $k$-bit message can be calculated by
$$E_{Rx}(k) = E_{Rx\text{-}elec}(k) = E_{elec} \cdot k.$$
Receiving a message is not a low-cost operation. The protocols therefore try to minimize the transmit distance as well as the number of transmit and receive operations for each message. The notations used in the paper are detailed in the Abbreviations section.
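The first-order radio model can be made concrete with a short numerical sketch. The Python below is only an illustration: the constants 50 nJ/bit and 100 pJ/bit/m$^2$ are the values quoted in the text, while the function names and the 2000-bit packet size are assumptions introduced for the example, and only the $d^2$ regime is modeled.

```python
# Illustrative sketch of the first-order radio model; names are hypothetical.
E_ELEC = 50e-9    # J/bit, transmitter/receiver electronics (value from the text)
E_AMP = 100e-12   # J/bit/m^2, transmit amplifier (value from the text)


def tx_energy(k_bits: int, distance_m: float) -> float:
    """Energy in joules to transmit k bits over distance d (d^2 regime only)."""
    return E_ELEC * k_bits + E_AMP * k_bits * distance_m ** 2


def rx_energy(k_bits: int) -> float:
    """Energy in joules to receive k bits."""
    return E_ELEC * k_bits


if __name__ == "__main__":
    k = 2000  # assumed packet size in bits, not taken from the paper
    print(f"transmit over 100 m: {tx_energy(k, 100.0):.6f} J")
    print(f"receive:             {rx_energy(k):.6f} J")
```

Under these assumed numbers the amplifier term dominates at 100 m, which is consistent with the motivation for keeping transmit distances short.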

Proposed Clustering Algorithm
On the basis of the location of the various nodes, the proposed algorithm identifies the clusters and CHs. The CHs are chosen on the basis of unique connected nodes, the maximum number of neighbor nodes, and the residual energy. After CH identification, the data transmission from a specific node is done using CHs till it reaches the BS.
Step 1. $N$ motes are randomly deployed in the region to be sensed. The BS is given the ID 0 and is manually located on the network field.
Step 2. Calculate the set of neighbor nodes $N_i$ and the number of neighbor nodes for the $i$th node on the basis of the transmission range, as depicted in Table 3. The function neighbor info returns all the neighboring nodes of a node. The distance between two sensors $i$ and $j$, located at $(x_i, y_i)$ and $(x_j, y_j)$, is given by
$$d_{ij} = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2},$$
and $N_i = \{n_1, n_2, n_3, \ldots, n_m\}$ is the set of neighbor node IDs within the transmitting range ($R_t$). The neighbor info function is repeated for all $N$ motes and the BS. It gives the result shown in Figure 2.
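Step 2 can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' MATLAB code: the function name mirrors the neighbor info function mentioned in the text, but its signature, the dictionary layout, and the example coordinates are assumptions.

```python
import math


def neighbor_info(positions, tx_range):
    """Return {node_id: sorted neighbor ids} for all nodes within tx_range.

    positions: dict mapping node id -> (x, y) in metres (ID 0 is the BS).
    Signature and data layout are assumptions made for this sketch.
    """
    neighbors = {node: [] for node in positions}
    ids = sorted(positions)
    for index, a in enumerate(ids):
        for b in ids[index + 1:]:
            # Euclidean distance between nodes a and b.
            if math.dist(positions[a], positions[b]) <= tx_range:
                neighbors[a].append(b)
                neighbors[b].append(a)
    return {node: sorted(found) for node, found in neighbors.items()}


if __name__ == "__main__":
    # Hypothetical deployment: BS at the centre plus three motes.
    field = {0: (250, 250), 1: (300, 250), 2: (420, 250), 3: (0, 0)}
    for node, found in neighbor_info(field, 150).items():
        print(node, found)
```

In this hypothetical deployment node 3 has an empty neighbor list, which corresponds to an isolated node in the sense used in the paper: it lies outside the transmission range of every other node.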
Step 3. The BS always behaves as a CH. The CH selection (CHS) process starts from the BS, which transmits the CH selection message (CHS msg) to all its neighboring nodes. Nodes having residual energy more than the threshold energy are eligible to become a CH.
Step 3 is depicted in Figure 3.
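The All neighbours function returns the set of all the neighboring nodes [Neighbours] of the given node.
Step 3A. This step is applicable only when the base station determines its neighbors. The base station position is set so as to minimize the total distance between the base station and its neighbor nodes, which saves energy and reduces the synchronization time:
$$D(x, y) = \sum_{i=1}^{N} d_i \rightarrow \min, \qquad d_i = \sqrt{(x_i - x)^2 + (y_i - y)^2}.$$
The optimal base station coordinates are then given by
$$(x_0, y_0) = \arg\min_{(x, y)} \sum_{i=1}^{N} d_i.$$
The minimum is obtained by setting the partial derivatives to zero:
$$\frac{\partial D}{\partial x} = 0, \qquad \frac{\partial D}{\partial y} = 0.$$
The partial derivatives are
$$\frac{\partial D}{\partial x} = -\sum_{i=1}^{N} \frac{x_i - x}{d_i}, \qquad \frac{\partial D}{\partial y} = -\sum_{i=1}^{N} \frac{y_i - y}{d_i}.$$
By using vector notation, the vector pointing to the location of the $i$th sensor node is $\mathbf{r}_i = (x_i, y_i)$, and the distance vector between the sensor node and the sink at $\mathbf{r} = (x, y)$ is $\mathbf{d}_i = \mathbf{r}_i - \mathbf{r} = (x_i - x, y_i - y)$. Let $\mathbf{e}_i$ be the unit vector from the $i$th sensor node towards the nearest sink (the orientation vector); that is,
$$\mathbf{e}_i = -\frac{\mathbf{d}_i}{|\mathbf{d}_i|}.$$
Using the above three equations,
$$\mathbf{R} = \sum_{i=1}^{N} \mathbf{e}_i = 0 \quad \text{for } \mathbf{r} = (x_0, y_0);$$
that is, the average distance is minimized if the resultant $\mathbf{R}$ of the orientation vectors is zero.
In Neighbours, the CH ID (initially the BS ID) is excluded.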
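The condition that the resultant of the orientation vectors vanishes has no closed-form solution in general, so it is usually solved iteratively. The sketch below uses Weiszfeld's fixed-point iteration, which is a standard technique chosen here for illustration and is not stated in the paper; the function names, iteration count, and tolerance are assumptions.

```python
import math


def base_station_location(points, iterations=200, eps=1e-9):
    """Approximate the point minimising the sum of distances to `points`.

    Weiszfeld's iteration (an assumed choice, not taken from the paper),
    started from the centroid of the nodes.
    """
    x = sum(p[0] for p in points) / len(points)
    y = sum(p[1] for p in points) / len(points)
    for _ in range(iterations):
        num_x = num_y = denom = 0.0
        for px, py in points:
            d = math.hypot(px - x, py - y)
            if d < eps:
                # Simple guard: the estimate coincides with a node, so stop.
                return x, y
            num_x += px / d
            num_y += py / d
            denom += 1.0 / d
        new_x, new_y = num_x / denom, num_y / denom
        if math.hypot(new_x - x, new_y - y) < eps:
            return new_x, new_y
        x, y = new_x, new_y
    return x, y


def resultant(points, bs):
    """Sum of the unit vectors pointing from each node towards bs."""
    rx = ry = 0.0
    for px, py in points:
        d = math.hypot(bs[0] - px, bs[1] - py)
        rx += (bs[0] - px) / d
        ry += (bs[1] - py) / d
    return rx, ry


if __name__ == "__main__":
    nodes = [(0, 0), (400, 0), (350, 450), (100, 300)]  # hypothetical nodes
    bs = base_station_location(nodes)
    print("base station:", bs)
    print("resultant of orientation vectors:", resultant(nodes, bs))
```

For a symmetric layout such as the four corners of the 500 m × 500 m field, the iteration stays at the centre (250, 250), which matches the base station position used in the simulations.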
Step 4. The function unique neighbor info returns the unique nodes connected to a node. The unique nodes are the ones which are not connected to any other node in the $(n+1)$th hop, as shown in Figure 4. Table 4 gives the details of the unique neighbors for the first hop (see Algorithm 1). A flag is set for all the nodes appearing in the set Neighbours of the selected node.
Step 5. Consider the unflagged elements of Neighbours.
The function set of neighbours returns the set of all the neighbours of the members of Neighbours; the result is depicted in Table 4.
Left Neighbours contains the nodes whose cluster ID flag is not set by unique count (see Algorithm 2).
Step 6. Steps 3 to 5 are repeated till all the nodes are covered by an elected CH. All nodes should have a path to the BS through a single hop or multiple hops, depending upon the distance of the node from the BS.
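One hop of Steps 3 to 5 can be sketched as follows. This Python is a hedged illustration, not the authors' implementation: the function name, the data layout, and in particular the second phase (choosing, among the remaining candidates, the one that serves the most uncovered nodes and breaking ties by residual energy) are assumptions made so that the example is complete.

```python
def select_cluster_heads(current_ch, adjacency, energy, threshold, covered):
    """One hop of CH selection (illustrative sketch under stated assumptions).

    adjacency: dict node id -> set of neighbor ids
    energy:    dict node id -> residual energy in joules
    covered:   set of nodes that already belong to a cluster
    Returns (list of newly selected CHs, updated covered set).
    """
    # Candidates: neighbors of the current CH with enough residual energy.
    candidates = [n for n in sorted(adjacency[current_ch])
                  if energy[n] > threshold]
    cand_set = set(candidates)

    # Next-hop nodes that each candidate could serve.
    reach = {c: adjacency[c] - covered for c in candidates}

    # Phase 1: a candidate with at least one unique neighbor becomes a CH,
    # because that neighbor cannot be reached through any other candidate.
    heads = []
    for c in candidates:
        unique = {u for u in reach[c] if adjacency[u] & cand_set == {c}}
        if unique:
            heads.append(c)

    served = set()
    for h in heads:
        served |= reach[h]

    # Phase 2 (assumed rule): cover the left-over nodes greedily, preferring
    # candidates with more uncovered neighbors, then higher residual energy.
    remaining = set().union(*reach.values()) - served
    while remaining:
        pool = [c for c in candidates if c not in heads]
        best = max(pool, key=lambda c: (len(reach[c] & remaining), energy[c]))
        heads.append(best)
        served |= reach[best]
        remaining -= reach[best]

    return heads, covered | served


if __name__ == "__main__":
    # Hypothetical topology: node 0 is the BS.
    adjacency = {0: {1, 2, 3}, 1: {0, 4}, 2: {0, 5, 6}, 3: {0, 5, 6},
                 4: {1}, 5: {2, 3}, 6: {2, 3}}
    energy = {0: 1.0, 1: 1.0, 2: 1.0, 3: 0.9, 4: 1.0, 5: 1.0, 6: 1.0}
    heads, covered = select_cluster_heads(0, adjacency, energy, 0.5,
                                          covered={0, 1, 2, 3})
    print("new CHs:", heads)
    print("covered:", sorted(covered))
```

In this hypothetical topology node 4 is reachable only through node 1, so node 1 must become a CH; nodes 5 and 6 are reachable through both 2 and 3, and the assumed tie-break picks node 2 because of its higher residual energy. Repeating the call from each new CH corresponds to Step 6.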
Step 7. After clustering, if one of the CHs dies out during data transmission, the CH at the previous hop comes to know about it since the data from the dead CH does not reach it. Suppose the CH at the $n$th hop dies out; in that case the clustering algorithm is repeated after the $(n-1)$th hop for the entire network.
Step 8. If the residual energy of a CH becomes less than the threshold energy ($E_{th}$), the CH selection process needs to be reinitiated. Suppose the residual energy of the CH at the $n$th hop is less than the threshold; then the clustering algorithm is repeated after the $(n-1)$th hop for that particular path.
Step 9. Once the routing path is established, the data is transmitted through multiple hops. Each CH combines the data collected from its connected nodes through data aggregation.
Data aggregation is any process in which information is gathered and expressed in a summary form. Several studies [19] have shown that aggregation at the CH considerably reduces the amount of data routed through the network, increasing the throughput and extending the lifetime of the sensor network.
Data aggregation also serves the purpose of estimating a missing value from a sensor [20]. Sometimes the data from one of the sensors does not reach the cluster head; in such cases the Jackknife estimate can be used to predict the value of the sensed parameter. It also provides fault tolerance: if the data received from one of the sensors does not match the estimated value, a correction is made accordingly.
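The Jackknife idea can be illustrated with a short sketch. The paper does not specify which statistic is estimated, so the example below assumes the mean of the readings that did reach the CH; the function name and the sample readings are hypothetical.

```python
def jackknife_mean(readings):
    """Jackknife estimate of the mean and its standard error.

    Each reading is left out in turn; the estimate is the average of the
    leave-one-out means. Requires at least two readings.
    """
    n = len(readings)
    if n < 2:
        raise ValueError("need at least two readings")
    total = sum(readings)
    leave_one_out = [(total - r) / (n - 1) for r in readings]
    estimate = sum(leave_one_out) / n
    # Standard jackknife variance formula for the estimator.
    variance = (n - 1) / n * sum((m - estimate) ** 2 for m in leave_one_out)
    return estimate, variance ** 0.5


if __name__ == "__main__":
    # Hypothetical temperature readings from the nodes that reported.
    received = [24.8, 25.1, 25.0, 24.9]
    estimate, std_err = jackknife_mean(received)
    print(f"estimate for the missing node: {estimate:.3f} +/- {std_err:.3f}")
```

A CH could use the estimate as a stand-in for the missing reading, or compare an incoming reading against it to flag a suspect sensor; the tolerance for such a check would have to be chosen for the application.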
After aggregating the data from all the nodes in the cluster, the CH sends the data to the next CH or to the BS, as the case may be. The flowchart of the complete proposed algorithm is shown in Figure 5.

Results and Discussions
Simulations are carried out in MATLAB to evaluate the proposed algorithm. The nodes are placed randomly, using a uniform distribution, throughout a network of dimension 500 m × 500 m. The BS is located at $x = 250$ and $y = 250$. The simulation parameters considered are described in Table 5. The transmission cluster radius is taken as 150 m and the initial threshold energy as $E_{th} = E_0/2$.
The proposed algorithm is evaluated using the following measures: network lifetime and connectivity. After some rounds of transmission, when the residual energy of all the nodes approaches $E_{th}$, the network adaptively reduces the value of $E_{th}$, thus increasing the network lifetime. Data aggregation at the cluster heads further enhances the network lifetime by reducing the size of the data to be transmitted by the nodes. The cluster head selection algorithm is carried out for $n - 1$ hops only when the residual energy of a cluster head is less than or equal to the threshold energy. Thus, the proposed algorithm ensures that energy is not wasted on cluster head selection in every round.
The proposed algorithm is compared with DTE (direct transmission energy) [21], LEACH (low energy adaptive clustering hierarchy) [2], multihop routing, and ACE [8]. For each protocol, nodes are randomly deployed by generating random coordinates using a uniform distribution. For each protocol, 100 iterations were performed and the reported result is their average. We performed simulations with 50 and 100 nodes. The values of the simulation parameters are shown in Table 6. The simulation results and the comparison between the different CHS protocols are depicted in Figure 6. Data aggregation energy is not considered in the simulation. FND (first node die) corresponds to the number of rounds the network runs before the first deployed sensing node dies out.
10% die out means that 10% of the deployed sensing nodes have died out. 90% die out means that 90% of the deployed sensing nodes have died out.
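The three lifetime measures can be computed from the round in which each node dies. The sketch below is an assumed post-processing step, not part of the reported MATLAB simulation; the function name and the sample data are hypothetical, and the percentage thresholds are rounded up to a whole number of nodes.

```python
def lifetime_metrics(death_rounds):
    """Return the rounds at which FND, 10% die out and 90% die out occur.

    death_rounds: one entry per deployed node, giving the round in which
    that node ran out of energy.
    """
    ordered = sorted(death_rounds)
    n = len(ordered)

    def round_at(percent):
        # Smallest whole number of dead nodes reaching the percentage.
        count = max(1, (percent * n + 99) // 100)
        return ordered[count - 1]

    return {"FND": ordered[0], "10%": round_at(10), "90%": round_at(90)}


if __name__ == "__main__":
    # Hypothetical death rounds for a 10-node network.
    rounds = [120, 300, 310, 450, 500, 510, 620, 700, 810, 900]
    print(lifetime_metrics(rounds))
```

Averaging these values over repeated random deployments would correspond to the 100-iteration averaging described above.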
The above results show that the proposed algorithm performs better than the existing methods considered. In this case, the stability period is increased because the CH is rotated only when the threshold value reaches 0.1 J.

Conclusion and Future Work
The proposed algorithm for CH selection in a WSN using the unique node concept has many advantages. The latency in transmitting the data in a single hop is much more than in the proposed multihop wireless sensor network. Each node in the network transmits the data to the CH or BS closest to it. The CH in turn transmits the data to the next CH, if required, to reach the BS. If the CH is selected on the basis of the maximum number of connected nodes, it may happen that one or more unique nodes are not connected to any of the selected CHs. Thus, this algorithm performs CH selection based on the unique node concept. In the proposed algorithm there is no possibility of having any outlier, as all the unique nodes are connected to one CH or another. Adaptability is well taken care of. After clustering, if one of the CHs dies out during data transmission, the CH at the previous hop comes to know about it since the data from the dead CH does not reach it, and the clustering algorithm is repeated after the $(n-1)$th hop for the entire network. If the residual energy of a CH becomes less than the threshold energy, the CH selection process is reinitiated after the $(n-1)$th hop for that particular path. Table 6 shows that the proposed clustering algorithm increases the network lifetime. In wireless sensor networks, the communication cost is often several orders of magnitude larger than the computation cost; thus, the CHs perform data aggregation to reduce the amount of data to be transmitted. The proposed work is a 1-connected network. If any node fails, the algorithm is run again to perform clustering afresh. The drawback is that a lot of energy is wasted in reclustering. In the future, the authors plan to develop a Q-connected network ensuring full connectivity with the minimum number of isolated nodes. The authors also plan to extend the proposed algorithm to heterogeneous wireless sensor networks in which the CH has more power than the connected nodes in order to perform data aggregation.

$N$: Total nodes in the network
$E_0$: Initial node energy

System Model and Assumptions
Assumptions. The following assumptions have been made for the proposed network.
(1) The nodes are homogeneous with an initial energy of 1 J.
(2) The nodes transmit the data to the BS in multiple hops.
(3) The hops are determined based upon the distance from the BS.
(4) The sensors used have a transmitting range of 100-150 m (outdoor) and 50-75 m (indoor). The transmitting range $R_t$ is taken as 150 m.

Figure 6: Simulation results and comparison chart.

Table 2: Radio characteristics. $E_{Tx\text{-}elec} = E_{Rx\text{-}elec} = E_{elec}$

System Model. The use of clusters for transmitting data limits the number of nodes that transmit to the BS, avoiding transmission to short distances.

Table 3: Neighbor information details for the first hop.
1  4  5  7  10  11  12  13  14

Data transmission is done using CHs till it reaches the BS. Each CH combines the data collected from its connected nodes and performs data aggregation, which reduces the amount of data to be transmitted. The aggregated data is sent to the BS through the intermediate CHs as per the routing table established earlier. The stepwise algorithm is described in the following section.

Function [unique nodes, unique count] = unique neighbor info(node, Neighbours)
where unique nodes ⊂ Neighbours(node)
for each node in Neighbours
{
    If unique count > 0 then
    {
        node is the CH
        set flag for the nodes in the set Neighbours(node)
        set CH id's (node) for the nodes in the set Neighbours(node)
    }
}
{
    Min count = min setflag count(Neighbours)
    node with Min count is the CH
    set flag for the nodes in the set Neighbours(node)
    set CH id's (node) for the nodes in the set Neighbours(node)
    notflag neighbours(Neighbours, Setflag)
    set Neighbours = left neighbours
}
min setflag count is a function which calculates the minimum of the neighbor count of all the nodes in Neighbours for which the flag is not set.

Table 5: Neighbor information details for the first hop.

$k$: Number of bits in one packet
$E_{th}$: Energy threshold value at which the CH selection restarts
$d_{ij}$: Distance between the $i$th node and the $j$th node
$E_{DA}$: Data aggregation energy
$N_i$: Set of neighboring nodes of the $i$th node
$R_s$: Sensing range
$R_t$: Transmitting range
CH id: Cluster head IDs for each node