Determination of Collection Points for Disjoint Wireless Sensor Networks

When connectivity in a Wireless Sensor Network (WSN) is broken, a subset of nodes serving as data collection points (CPs) can buffer the data from sensors and transfer them to mobile data collectors (MDCs), which restores the connectivity of the WSN. An open problem is how to decide the number and positions of CPs so that the MDC follows an optimal path. To address this problem, a CP selection method is proposed to reduce the traveling distance of MDCs. With this selection method, the changing rules and the stability of the MDC path are theoretically proved. A 100-node WSN is simulated to test the proposed method. The evaluation results show that the proposed method reduces both the total tour length and the maximum step length of the MDC compared with single-CP selection.


Introduction
WSNs have attracted much attention from the research and engineering communities in recent years because of their numerous applications [1][2][3]. The related techniques are widely used in harsh environments such as battlefield surveillance, border protection, and space exploration, where they not only provide a fully automated data gathering system but also avoid the risk of human operations and reduce economic cost.
Within a WSN, all sensors are expected to form a connected network to coordinate their actions in the execution of a task and to transmit the collected data to a base station (BS). However, most sensor nodes are battery-driven with limited processing capacity; they risk depleting their energy and becoming nonfunctional, or even being damaged by the harsh surroundings. In these cases, network communication is disconnected and data transmission is correspondingly restricted. To retain connectivity when some nodes fail, node redundancy methods deploy more nodes than necessary [4]. Other methods identify a set of nodes that can be repositioned and move them whenever network partitioning occurs [5][6][7][8]. Although these schemes partly solve the connectivity problem of WSNs, they are not suitable for large-scale damage under extreme environmental conditions [9].
Integrating mobile data collectors into a WSN can effectively solve the above communication problem and also improve network performance, for example by increasing efficiency and reducing energy consumption [9] and by providing higher quality and a longer lifetime for the nodes and the entire network [10]. In a representative work [11], the CPs are found by considering the relationship between the selected representative points (RPs) and the centers of all the clusters; the MDC then travels along the optimal path, collects the data of the CPs when it moves into the transmission range of the RPs, and transfers the data to the BS so as to restore the connectivity of the network. A CP can be a node or a virtual point within the transmission range of an RP; the MDC only needs to visit these CPs to achieve data collection. This work was improved by Kalyanasundaram's research [12], which adopts the Clustering Using Representatives (CURE) algorithm [13] to partition the remaining available nodes into clusters and then selects two CPs in each cluster for route planning of the MDC. Although both methods can restore the connectivity of a disjoint WSN, they consider neither the area in which CPs may lie nor the influence of CP selection on the mobile path.

Mobile Information Systems
Based on the above analysis, a novel CP selection method is proposed to find the area of the optimal CPs. On the basis of this selection method, the changing rules and the stability of the MDC path are theoretically proved in this paper. Firstly, the use of an MDC for restoring the connectivity of a disjoint WSN is investigated, and different numbers of CPs are selected from each cluster for path optimization to find the shortest tour of the MDC. Secondly, the number of CPs in each cluster and the region in which they may lie are decided when the optimal path is formed. Finally, the region partition method for optimal CPs is proposed, and the impact of CP selection on the optimal path is analyzed.
This paper is organized as follows. The next section presents the related work. Section 3 describes the assumed network model and its connectivity. Section 4 discusses the method of regional division. Section 5 proves the impact of CPs on the mobile path. Section 6 presents the simulation results. Section 7 gives the concluding remarks.

Related Work
As mentioned in [12], increasing attention has been paid to restoring connectivity after the failure of single or multiple nodes in WSNs. Among the different methods, mobile elements or data mules have been used to recover the connectivity among isolated segments [14][15][16]. The MDC in this paper is one such mobile device: it traverses the sensor field and collects data from each of the collection points. Given the challenges of limited energy and data integrity, one key problem for this kind of connectivity restoration method is the moving path schedule, which determines the traveling path of the MDCs.
By considering the connectivity of the selected RPs and the cluster centers, the collection points are obtained [11]. The disjoint WSN is recovered by having the MDC travel to collect the data of the CPs and transfer them to the BS. In IDM-KMDC, the number of available mobile elements is assumed to be less than the number of segments, and an MDC is assigned to each link on the minimum spanning tree of the segments [11]. FeSMoR [11], on the other hand, solves the problem using a mix of stationary and mobile nodes. In FOCUS [17], the available MDMs are far fewer than the segments. Considering that a CP can be a node or a virtual point and that the MDCs only need to visit these CPs to achieve data collection, the research is improved in [12] by using the Clustering Using Representatives (CURE) algorithm [13]. However, the problem of how to decide the number and positions of CPs for obtaining an optimal MDC path remains open [18], and the theoretical analysis of the convergence of the optimal path has not been solved [19]. Building on previous research [20], a novel CP selection method is proposed, and the stability of the optimal MDC path is theoretically proved in this paper.

Network Model and Connectivity
Recently, some approaches have been proposed to exploit mobility for data collection in WSNs [8]. Depending on the properties of sink mobility and the wireless communication used for data transfer, these include the mobile data collector based method, the mobile base station method, the rendezvous-based method, and so forth. This research focuses on the MDC based method. The WSN is composed of a set of static sensors and one MDC. The MDC is a mobile sink that visits sensors. Data are buffered at source sensors until the MDC visits the sensors and downloads the information over a single-hop wireless transmission.
In this paper, the available nodes are roughly divided into multiple clusters. Following cluster-based routing protocols, the nodes that belong to the same cluster remain well connected through multihop communication [2,6]. Communication between different clusters is blocked. Each sensor node can obtain its location through GPS or other positioning algorithms, and the movement of the MDC is controllable. It is assumed that the MDC undertakes the task of data collection by starting from any cluster and finally returning to the starting cluster, so the movement path forms a closed cycle.
The Fuzzy C-Means (FCM) algorithm is used to determine the center of each cluster in preparation for the selection of CPs [21]. It is assumed that all sensors have the same transmission range. Before formally discussing how the CPs are picked and where they may lie, the following definitions are given.
Definition 1. Let $C = \{C_1, C_2, C_3, \ldots, C_{n-1}, C_n\}$ be the set of $n$ clusters formed by FCM, let $\{O_1, O_2, O_3, \ldots, O_{n-1}, O_n\}$ be the set of the clusters' centers, and let $O$ be the center of $C$.
Definition 2. RPs are the static sensor nodes with the shortest Euclidean distance to $O$; $R$ is the transmission radius. Naturally, there may be more than one RP in each cluster.
Definition 3. Let $A_n^m$ be the point that is the $m$th CP in the cluster $C_n$; $m = 0$ means that each cluster chooses a single CP and that it is the one nearest to $O$.
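As a minimal sketch of the clustering step and of Definitions 1 and 2, the following Python code runs a basic Fuzzy C-Means loop and then picks, in each cluster, the nodes closest to the overall center. The function names, the fuzzifier value of 2, the synthetic node positions, and the use of the mean of the cluster centers as the overall center are all assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def fcm_centers(points, n_clusters, fuzzifier=2.0, n_iter=100, tol=1e-6, seed=0):
    """Basic Fuzzy C-Means: returns (cluster centers, membership matrix)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    u = rng.random((len(points), n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        w = u ** fuzzifier
        # Centers are membership-weighted means of the points.
        centers = (w.T @ points) / w.sum(axis=0)[:, None]
        # Distance from every point to every center (epsilon avoids division by zero).
        d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (fuzzifier - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return centers, u

def representative_points(points, labels, overall_center, per_cluster=2):
    """For each cluster, indices of the nodes closest to the overall center."""
    rps = {}
    for k in np.unique(labels):
        idx = np.flatnonzero(labels == k)
        dist = np.linalg.norm(points[idx] - overall_center, axis=1)
        rps[int(k)] = idx[np.argsort(dist)[:per_cluster]].tolist()
    return rps

# Hypothetical deployment: four groups of 25 nodes each (illustrative only).
rng = np.random.default_rng(1)
nodes = np.vstack([rng.normal(c, 3.0, size=(25, 2))
                   for c in [(20, 20), (80, 20), (80, 80), (20, 80)]])
centers, u = fcm_centers(nodes, 4)
labels = u.argmax(axis=1)                # hard assignment from fuzzy memberships
overall_center = centers.mean(axis=0)    # assumed reading of "the center of C"
rps = representative_points(nodes, labels, overall_center, per_cluster=2)
```

Taking the hard label as the largest membership is one common way to turn the fuzzy result into disjoint clusters; other choices are possible.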
An example WSN with four clusters is illustrated in Figure 1. The CP selection mode is shown by considering each CP's position and the possible routes. As shown in Figure 1, two CPs are chosen in each cluster. The two static sensor nodes with the shortest Euclidean distance to $O$ are chosen as RPs in each cluster. All the RPs are connected by the dashed line. The fields within the transmission range of the RPs contain the positions where CPs may lie. The crossing points between these fields and the connecting routes are then selected as candidate CPs.
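The crossing-point rule can be sketched as follows, assuming the candidate CP is taken where the straight route leaving an RP toward the next RP crosses the boundary of the first RP's transmission range. The function name `crossing_point` and the sample coordinates are hypothetical.

```python
import math

def crossing_point(rp, target, radius):
    """Point on the segment rp -> target at distance `radius` from rp.

    If the target already lies within the transmission range, the target
    itself is returned (assumption: no crossing exists in that case).
    """
    dx, dy = target[0] - rp[0], target[1] - rp[1]
    dist = math.hypot(dx, dy)
    if dist <= radius:
        return target
    scale = radius / dist
    return (rp[0] + scale * dx, rp[1] + scale * dy)

# RP at the origin, next RP at (30, 40), transmission radius 10 m.
cp = crossing_point((0.0, 0.0), (30.0, 40.0), 10.0)
```

The returned point lies exactly one transmission radius from the RP, so an MDC stopping there can still reach the RP over a single hop.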
Geometrically, the example in Figure 1 applies only when the clusters' locations form a convex side. When the clusters' locations form a concave side, as shown in Figure 2, the selection mode is different. Firstly, the cluster at the concave point is identified. Secondly, its center, denoted $O_4$, is determined by FCM. Then a perpendicular is drawn from $O_4$ to the nearest edge, and the remaining steps of CP selection are the same as in the example of Figure 1. The resulting selection of CPs is shown in Figure 2.
Figure 1: The selection mode of CPs when the cluster's location presents convex side.
In real situations, the connection diagram of the clusters is not limited to convex or concave polygons; more complex mesh structures also appear. However, based on geometric theory, a complex mesh structure can be decomposed into several convex and concave edges. Therefore, this research focuses only on the convex case and on the concave case with one concave point, and only one CP is selected in the cluster at the concave point.
The aim of deploying the MDC is to transmit data, so the moving path of the MDC needs to be planned. Assume that the MDC's moving path $T$ is a shortest cycle and that each cluster contains at least one CP. The problem of finding $T$ is equivalent to the Euclidean Traveling Salesman Problem, which is NP-hard. Therefore, the Dijkstra algorithm is employed to calculate the shortest path from a source node to all other nodes [22], and the proposed method applies the algorithm repeatedly. So the moving path $T$ can be described as follows: For example, $A_1^{\{0,1\}}$ expresses $A_1^0$ and $A_1^1$, which are the two CPs of cluster $C_1$. The simulation diagram of the global optimal path is shown in Figure 3.
In (1) and (2), $T_{\min}$ denotes the MDC's shortest distance, which begins from $A_1^m$, travels through all clusters, and eventually returns to $A_1^{\{0,1,2,\ldots,m\}}$. Following the process in Figure 4, the previous examples in Figures 1 and 2 are reorganized, and each cluster selects 2 CPs for path optimization. When the cluster's location presents a convex side, the formation process of the optimal path is shown in Figure 4. The closed path connected by the dashed line covers the MDC's moving path in the WSN. As defined in (2), the closed path connected by the solid line indicates the potential minimum-cost path obtained through the optimization method.
Similarly, when the cluster's location presents a concave side, the MDC's moving path (dashed line) and the indicated minimum-cost path (solid line) are shown in Figure 5.
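To make the path optimization concrete, here is a small brute-force sketch. It is not the Dijkstra-based procedure described above; it simply enumerates every cluster order and every choice of one candidate CP per cluster, which is only feasible for a handful of clusters. The assumption that the MDC stops at exactly one CP in each cluster, the function names, and the sample coordinates are all illustrative.

```python
import math
from itertools import permutations, product

def tour_length(points):
    """Length of the closed tour that visits the points in order and returns to the start."""
    return sum(math.dist(points[i], points[(i + 1) % len(points)])
               for i in range(len(points)))

def best_tour(cluster_cps):
    """cluster_cps: one list of candidate CPs per cluster.

    Returns (length, tour), where tour is a list of (cluster index, CP) pairs.
    """
    k = len(cluster_cps)
    best_length, best_path = math.inf, None
    # Fix cluster 0 as the start so rotations of the same cycle are not re-enumerated.
    for order in permutations(range(1, k)):
        seq = (0,) + order
        for choice in product(*(cluster_cps[c] for c in seq)):
            length = tour_length(choice)
            if length < best_length:
                best_length, best_path = length, list(zip(seq, choice))
    return best_length, best_path

# Four clusters at the corners of a square: one candidate CP each ...
single = [[(0, 0)], [(10, 0)], [(10, 10)], [(0, 10)]]
# ... versus two candidate CPs each (the second one lies slightly inward).
double = [[(0, 0), (1, 1)], [(10, 0), (9, 1)], [(10, 10), (9, 9)], [(0, 10), (1, 9)]]

single_len, _ = best_tour(single)
double_len, double_path = best_tour(double)
```

In this toy layout the two-candidate search finds a shorter cycle than the single-candidate one, because the inward candidates form a smaller square.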

Regional Division
In order to find the optimal area of the CPs, two additional definitions are given. The selection method of CPs varies according to the position of the clusters, so the area of optimal CPs changes as well.
For the convenience of regional division, the shape of each cluster is assumed to be circular, and the transmission range of a sensor node is reduced to a point; that is, RPs are equivalent to CPs. With the example of four clusters, the problem can be divided into the following two cases.

The Cluster's Location Presents Convex Side.
As shown in Figure 6, the shaded part is the area where the optimal CPs may lie. Take the selection of the area $S_1$ as an example. Because $A_1^0$ is the nearest point to $O$ (in other words, the distance between every other node in $C_1$ and $O$ is longer than $|OA_1^0|$), let $|OA_1^0|$ be the radius to draw a circular arc $\overset{\frown}{OA_1^0}$; the intersection of $\odot O_1$ and the outer part of $\overset{\frown}{OA_1^0}$ is the area where the rest of the nodes exist. Then let $|A_2^0A_1^0|$ and $|A_4^0A_1^0|$ be the radii to draw circles, respectively; the intersection with the above area is the selection area of optimal CPs. This method ensures that the local optimal path, which includes the path from $C_1$ to its adjacent clusters and the path from $C_1$ and $C_2$ to their adjacent clusters, exists. Similarly, the global optimal path from $C_1$, $C_2$, and $C_3$ to $C_4$ exists.
Let   be the  shadow part of the cluster and let   be the symbol of complementary set.So the shadow part  1 of  1 can be described as follows: ).In the same way, in Figure 4, ).Then   can be described as follows: (2)

The Cluster's Location Presents Concave Side.
In this case, the cluster at the concave point is determined first. One CP is selected for this cluster, and two CPs are selected for each of the other clusters. Using the same division method and principle as in the convex case, the area in which optimal CPs may lie is marked by the shaded parts in Figure 7.
It is noteworthy that although $A_n^0$ is a boundary point, it also belongs to the shaded area. Therefore, each shaded region must contain one or more sensor nodes. When each shaded region contains only the point $A_n^0$, the line connecting these points is the optimal moving path of the MDC.
Figure 6: The regional division in which the optimum CPs may stay when the cluster's location presents convex side.

Impact of CPs on Moving Path
When the selection of CPs changes, the MDC's moving path changes accordingly. Because more than one CP may participate in path optimization, the CPs in the best locations must be picked so as to minimize the moving path.
As shown in Figure 8, the moving path with a single CP per cluster is drawn by the solid line. Obviously, this path is not the shortest path. For example, choose any two points $A_1^1$ and $A_1^2$ from $C_1$, and arbitrarily choose $A_2^1$ from $C_2$ and $A_3^1$ from $C_3$. The Euclidean distance between two CPs is denoted by $A_n^mA_{n+1}^m$. We have (3)
From Figure 8, it can be found that $A_2^0A_3^0 < A_2^1A_3^1$, among other relations. Let $\Delta$ denote the difference between the lengths of the two paths. Then there are three cases, that is, $\Delta > 0$, $\Delta = 0$, and $\Delta < 0$. Thus, some points that are not included in the set $(A_1^0, A_2^0, A_3^0, A_4^0)$ can reduce the moving path. That is to say, the path formed by selecting two CPs in each cluster can be shorter than the path formed by selecting a single CP.
Figure 7: The regional division in which the optimum CPs may stay when the cluster's location presents concave side.
Figure 8: Impact on the optimal path of more than one CP when the cluster's location presents convex side.
Figure 9: Impact on the optimal path of more than one CP when the cluster's location presents concave side.
Figure 9 shows the case in which the location of each cluster presents a concave side. With a similar verification process, the conclusion is the same as in the convex case.
When the number of clusters is extended to $n$, the conclusion still holds. Synthesizing the above test cases and the reasoning process, the following theorem is obtained.

Theorem 6. For arbitrarily placed $n$ clusters, under the condition of internal communication within each of the clusters, there exists a solution in which not all CPs are $A_n^0$ that obtains a smaller value of the function $T$.
The specific proof of Theorem 6 is given as follows.
Proof is completed.

Performance Evaluation
Following the above theoretical analysis and basic testing, the selection of CPs directly changes the moving path of the MDCs. Based on Theorem 6, different situations are considered to evaluate the proposed CP determination method. The recovery performance of WSNs with MDCs is also compared and analyzed.

Testing Circumstance and Performance Indicators.
MATLAB R2012a is used as the simulation platform for performance evaluation. 100 nodes are randomly deployed in a 100 m × 100 m area. The number of clusters, formed by the FCM clustering method, ranges from 3 to 8. Communication among the nodes in the same cluster is normal. The transmission range of the nodes and the MDC is fixed at 10 m. For each specified number of clusters, two modes are compared: choosing two CPs in each cluster (CTCP) and IDM-KMDC [11]. As mentioned in [11], the main idea of IDM-KMDC is to select a single CP in each cluster after forming the assigned number of clusters and then to plan the path by using the minimum spanning tree algorithm [23]. To compare the performance of the two modes clearly, the following two performance indicators are used.
(1) Total Tour Length (TTL). TTL reports the total travel distance of the MDC. Since motion consumes energy, it reduces the lifetime of the MDC. Minimizing the travel distance is therefore an important design goal.
(2) Maximum Step Length (MSL). This metric reports the longest single step that the MDC has to make. If MSL is larger, the MDC might take longer to complete one tour and collect all the data, which increases the data collection latency. Therefore, it is desirable to minimize MSL. With the evaluation of CTCP, the performance of choosing multiple CPs in each cluster is tested. TTL and MSL are also used to indicate the effect of the number of CPs and the number of clusters in the following sections.
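The two indicators can be computed from an ordered list of visited CPs, as in the sketch below. It assumes that the tour is closed (the MDC returns to its first CP) and that a step is the straight-line distance between consecutive CPs; the function name `tour_metrics` and the sample coordinates are hypothetical.

```python
import math

def tour_metrics(tour):
    """Return (TTL, MSL) for a closed tour given as an ordered list of (x, y) points."""
    steps = [math.dist(tour[i], tour[(i + 1) % len(tour)])
             for i in range(len(tour))]
    return sum(steps), max(steps)

# Three CPs forming a right triangle with legs of 30 m and 40 m.
ttl, msl = tour_metrics([(0, 0), (30, 0), (30, 40)])
```

Here the steps are 30 m, 40 m, and 50 m, so TTL is their sum and MSL is the hypotenuse.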

Two CPs for Each Cluster.
With this testing circumstance, the values of TTL for the two modes are obtained and shown in Table 1. For all the experiments, the number of clusters ranges from 3 to 8. The trend of TTL is drawn in Figure 10. The performances of CTCP and IDM-KMDC in terms of maximum step length are compared in Figure 11.
As shown in Table 1, with CTCP, the total tour length of the MDC is reduced, so the lifetime of the MDC will be prolonged. When the number of clusters is increased from 3 to 8, similar results are obtained. The decreasing rates of TTL are also given in Table 1.
The results of CTCP and IDM-KMDC for the total tour length are shown in Figure 10. The comparison shows that CTCP can better restore network connectivity when the MDC moves at constant speed.
As mentioned previously, the maximum step length directly reflects the data collection latency of the MDC; that is, the smaller the MSL, the better the method. From the results shown in Table 2 and Figure 11, the data collection latency of CTCP is lower than that of IDM-KMDC, so the network collects data faster.

Multiple CPs for Each Cluster.
In Section 6.2, the CTCP mode obtained better performance on the TTL and MSL indicators. To further study the effect of CPs, the mode of multiple CPs for each cluster is evaluated. The number of CPs for each cluster varies from 1 to 6, and the number of clusters increases from 3 to 8.
As shown in Table 3, when each cluster chooses multiple CPs, TTL decreases monotonically as the number of CPs in each cluster increases. That is to say, as the number of selected CPs in each cluster increases, shorter paths hidden in the clusters are found gradually until the optimal path is formed; that is, TTL becomes stable.
Figure 12 shows the TTL testing results, and a similar conclusion is seen: TTL decreases monotonically as the number of CPs in each cluster increases. The correctness of the proposed method is thus partly verified. Moreover, for a specific number of clusters, unnecessary calculation can be avoided by choosing an appropriate number of CPs for path planning.
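One way to act on this observation is to stop adding CPs once the TTL improvement becomes negligible. The sketch below assumes a relative tolerance as the stopping rule; the function name, the 1% threshold, and the TTL values are placeholders chosen for illustration and are not measurements from Table 3.

```python
def smallest_stable_cp_count(ttl_by_cp_count, rel_tol=0.01):
    """ttl_by_cp_count[i] is the TTL obtained with i + 1 CPs per cluster.

    Returns the smallest CP count after which adding one more CP improves
    TTL by no more than rel_tol (relative to the current TTL).
    """
    for i in range(1, len(ttl_by_cp_count)):
        prev, cur = ttl_by_cp_count[i - 1], ttl_by_cp_count[i]
        if prev - cur <= rel_tol * prev:
            return i  # i CPs per cluster: the (i + 1)-th CP adds almost nothing
    return len(ttl_by_cp_count)

# Placeholder TTL values for 1..6 CPs per cluster (illustrative, monotonically decreasing).
example_ttl = [200.0, 180.0, 172.0, 171.0, 170.8, 170.8]
stable_count = smallest_stable_cp_count(example_ttl)
```

With these placeholder values the rule stops at three CPs per cluster, since a fourth CP shortens the tour by less than 1%.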

Computing Cost Comparison.
With the development of WSNs and the growing requirements of industry, the number of nodes will increase faster and faster. Computing cost will therefore be a key index for real applications.
To test the proposed CP determination method clearly, the computing costs for different numbers of CPs and clusters are compared and listed in Table 4. The results show that the computing cost increases with the number of CPs and the number of clusters. So it is important to reduce the TTL while keeping the computing cost within an acceptable range. Technically, the proposed determination method reduces the TTL of WSNs and indirectly constrains the computing cost of the whole network. From the above comparison of the different situations, CTCP is more suitable for a WSN that suffers from large-scale damage in harsh environmental conditions. The comparison results also verify that when two CPs are chosen in each cluster, the resulting path is better than that of a single CP.

Concluding Remarks
To deal with the problem of a disjoint WSN operating in a harsh environment, the MDC method is introduced to accomplish data collection. By analyzing the constraints of existing research, the method of regional division is proposed. The area in which the optimal CPs exist is determined, and a unified description of it is given. The results verify that TTL decreases monotonically as the number of CPs in each cluster increases and finally tends to be stable. The simulation comparison and testing results support the correctness of the proposed method.
In the future, this work will focus on establishing an experimental system model, strengthening the invulnerability of WSNs, and applying the method to industrial systems.

Figure 2: The selection mode of CPs when the cluster's location presents concave side.

Definition 4. Let $\overset{\frown}{OA}$ be the circular arc with $O$ being the center of the circle and $|OA|$ being the radius.
Definition 5. Let $\odot O$ be the circle with $O$ being the center of the circle, while $\odot OA$ is the circle with $O$ being the center of the circle and $|OA|$ being the radius.

Figure 3: The simulation diagram of global optimal path.

Figure 5: The formation process of optimal path when the cluster's location presents concave side.

Figure 10: The comparison of TTL for two modes with 3-8 clusters.

Figure 11: The comparison of MSL with different number of clusters.

Figure 12: Total tour length as a function of clusters and CPs.

Table 1: The variance of TTL.

Table 2: The variance of MSL.

Table 3: The variance of TTL for multiple CPs.

Table 4: The comparison of computing cost (s).