Hybrid Recovery Strategy Based on Random Terrain in Wireless Sensor Networks

Providing successful data collection and aggregation is a primary goal for a broad spectrum of critical applications of wireless sensor networks. Unfortunately, connectivity loss, which may occur when a network suffers from natural disasters or human sabotage, may cause data aggregation to fail. To tackle this issue, many strategies that deploy relay devices in target areas to restore connectivity have been devised. However, all of them assume either that the landforms of target areas are flat or that there are sufficient relay devices. Such assumptions are not realistic in practice. In this paper, we propose a hybrid recovery strategy based on random terrain (HRSRT) that takes both realistic terrain influences and the limited number of relay devices into consideration. HRSRT is proved to accomplish biconnectivity restoration while minimizing the energy cost of data collection and aggregation. In addition, both the complexity and the approximation ratio of HRSRT are explored. The simulation results show that HRSRT performs well in terms of overall and maximum energy cost.


Introduction
Wireless sensor networks (WSNs) have attracted great attention thanks to their wide range of industrial and social applications [1,2], such as biological detection, environment monitoring, and battlefield surveillance. Data collection and aggregation are the first priority of WSNs. The primary objective of such a task is to gather sensor readings from field sensors deployed over a geographic area (called the area of interest (AOI)) and then successfully deliver all gathered data to the sink node through multihop paths [3]. This implies the importance of both connectivity maintenance and optimal network topology discovery for WSNs [4]. However, natural disasters and human sabotage can jeopardize network connectivity and thereby compromise the process of data aggregation. For example, when a WSN is monitoring the activity of a volcano and connectivity is lost, the gathered data cannot reach the sink node for further analysis. Without such data, a volcanic eruption cannot be predicted, which may cause massive casualties and severe economic losses.
Given the significance of network connectivity, the problem of connectivity restoration has received increasing attention in recent years, and it is imperative to design both connectivity recovery strategies and routing algorithms for WSNs. Known solutions, which exploit relay devices such as stationary relay nodes (RNs) and mobile data collectors (MDCs), can be classified into two categories. The first employs RNs only, for the purpose of establishing a connected intersegment topology with stable communication paths between every pair of segments [5][6][7][8][9]. This category of work generally aims to minimize the number of RNs used during the restoration process. The second relies mainly on MDCs that repeatedly visit each individual segment for data collection and aggregation, with a few RNs involved [10][11][12][13][14][15][16][17][18]. However, all of these works assume either that the terrain of the AOI is flat or that the number of relay devices is unlimited.
Given that these assumptions are not realistic in real scenarios, our goal in this paper is to develop an efficient connectivity restoration strategy that takes both realistic terrain influences and quantitative limitations of relay devices into consideration.
Our Contribution. This paper presents a hybrid recovery strategy based on random terrain in WSNs, namely, HRSRT, which establishes a biconnected intersegment topology in a disconnected network with a limited number of RNs and MDCs while minimizing the energy cost for data collection and aggregation. HRSRT accomplishes this goal as follows: (1) To quantify realistic terrain influences, the AOI is mapped into a grid of equal-sized cells. Each cell is associated with a weight that represents the terrain influence within it. The weight of a path is calculated by accumulating the weights of the cells along it, and a weighted complete graph is then constructed from the minimum weight paths between segments. (2) A path planning algorithm (RTPP) is run on this graph to build a Hamilton cycle of minimum weight as the biconnectivity restoration tour for MDCs. The weight of this tour is proportional to the cost of data collection and aggregation during connectivity restoration. (3) According to the number of MDCs, two relay node deployment strategies, ORND and RND, are devised to merge intersected paths of the tour by carefully choosing candidate positions for RNs, so that the weight of the tour is greatly reduced.
The rest of the paper is organized as follows. Related work is covered in Section 2. The notions and terminologies are introduced in Section 3. The problem is described in Section 4. The algorithm HRSRT is elaborated in Section 5. Section 6 gives the theoretical analysis of the approximation ratio and complexity of HRSRT and presents the validation results. We conclude this paper in Section 7.

Related Work
There are two categories of approaches pursued for connectivity restoration [19]: the first category is to establish connectivity without terrain influences; the second one is to federate disconnected segments with consideration of terrain influences.
Many works disregard terrain influences during connectivity restoration and thus fall into the first category. Some of these works, which employ RN deployment only, are as follows. Cheng et al. formulate placing the fewest RNs to connect segments as finding the Steiner minimum tree with the minimum number of Steiner points and bounded edge length [5]. Lee and Younis propose grid based approaches, CORP [6] and ORC [7], both of which recursively deploy RNs until all segments are connected. In [20], recovery algorithms are proposed to minimize the deployment cost of sinks and relays and to guarantee that all sensors have two length-constrained paths to two sinks. Sitanayah et al. [8] find a minimal set of RNs which ensures $k$ length-bounded vertex-disjoint shortest paths to a sink for each sensor node. Lee et al. [9] focus on achieving a biconnected intercluster topology. Other works, which employ RNs, MDCs, and mobile nodes, are as follows. In [10], Senel and Younis devise a convex hull based recovery strategy, IDM-kMDC. It finds $k$ convex hulls of segments; then, the optimal tour for each convex hull is assigned an MDC to restore connectivity. In [13], a least-disruptive algorithm is designed, which considers the impact of topology change on network performance by selecting candidate mobile nodes based on routing tables. In [14], a -hop neighboring information based algorithm is presented, which drives the backup mobile nodes to their destinations while avoiding intersensor collisions. In [15], a localized hybrid timer based cut-vertex node failure recovery approach is proposed, which adopts cascaded movement to relocate the mobile nodes so that timely restoration is ensured. Joshi and Younis [21] establish balanced and optimized data collection and aggregation tours using the mobile nodes within the network. They first construct a minimum spanning tree and then successively split it around the center into partitions such that the segments of each partition form a convex hull. Eventually, all available MDCs are assigned to partitions to complete the recovery process. In [16], Liao et al. aim at providing target coverage and network connectivity through the minimum movement of mobile sensors. In [11], a delay-conscious recovery strategy, FeSMoR, is proposed to federate disjoint segments with a limited number of relay nodes. In [12], a convex hull based recovery scheme, MiMSI, is designed, which assigns MDCs for intersegment federation and deploys RNs for intercluster connection. In [18], a distributed algorithm, GSR, is designed to decompose the deployment area into its corresponding skeleton outline, along which mobile RNs are placed to complete connectivity restoration.
Other works take terrain influences into account during connectivity restoration and thus fall into the second category. Zhou et al. [22] propose an extended rapidly exploring random tree (RRT) based algorithm to plan a path along which a mobile node reaches its intended destination without crashing into any obstacles. Senturk and Akkaya [23] investigate how realistic terrain influences affect network connectivity recovery. They then design a terrain based restoration strategy, ReBAT [24], that considers different terrain types, such as forest, hill, swamp, and flat. ReBAT attempts to find the least cost paths between disjoint segments regardless of the subsequent data collection and aggregation. Truong et al. [17] propose a family of algorithms that consider the impact of obstacles on mobility and communication; these algorithms collaborate to restore connectivity with the fewest relay nodes while minimizing the mobility cost of agents. In [25], Mi et al. propose an obstacle-avoiding connectivity restoration strategy that avoids convex obstacles and intersensor collisions; however, they do not consider realistic terrain influences on the process of connectivity restoration.
In this paper, we focus on establishing a biconnected intersegment topology in a disconnected network with a limited number of RNs and MDCs while minimizing the energy cost for data collection and aggregation. It is worth mentioning that, unlike in our work, establishing a biconnected intersegment topology, the limited number of relay devices, and minimizing the energy cost for data collection and aggregation are seldom considered together in previous works.

Preliminary
Some notations used throughout this paper are given first; the important symbols with their definitions are collected in Notations.
Definition 1. A weighted complete graph $K_n = (V, E, W)$ is a complete graph on $n$ vertices in which each edge is associated with a weight, where $V$ and $E$ denote the sets of vertices and edges of $K_n$ and $W$ stands for the set of edge weights. More specifically, the weight of the edge joining two segments equals the weight of the minimum weighted path between them.
Definition 2 (see [26]). An Euler closed trail of a graph $G$ is a closed trail that traverses every edge of $G$ exactly once. A graph that has an Euler closed trail is called an Euler graph.
Definition 3 (see [26]). A Hamilton cycle of a graph $G$ is a closed trail that visits every vertex of $G$ exactly once. A Hamilton cycle of minimum weight is called a minimum weighted Hamilton cycle of $G$.
Figures 2 and 3 show examples of an Euler closed trail and a Hamilton cycle, respectively.
Definition 4 (see [26]). Given a graph $G = (V, E)$, a matching is a set of pairwise nonadjacent edges; that is, no two edges of the matching share a common vertex.
Definition 5 (see [26]). A perfect matching is a matching which matches all vertices of the graph $G$. That is, every vertex of the graph is incident to exactly one edge of the matching.
Definition 6. For the three edges of a triangle, the weighted triangle inequality, abbreviated as WTI, states that the sum of the weights of any two edges is at least the weight of the third.
Definition 7 (see [26]). A graph $G$ is said to be $k$-connected if, for each pair of vertices, there exist at least $k$ mutually independent paths connecting them; that is, $G$ is still connected even after the removal of any $k - 1$ vertices. A $k$-connected graph $G$ is said to be of $k$-connectivity.
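Definition 7 can be illustrated with a small brute-force check. The sketch below is only an illustration under stated assumptions: the function names `is_connected` and `is_k_connected` and the two example graphs are hypothetical and do not come from the paper, and the exhaustive removal of vertex subsets is practical only for small graphs. It tests the vertex-removal form of the definition and, by the usual convention, requires the graph to have more than $k$ vertices.

```python
from itertools import combinations


def is_connected(vertices, edges):
    """Return True if the undirected graph restricted to `vertices` is connected."""
    vertices = set(vertices)
    if not vertices:
        return True
    adj = {v: set() for v in vertices}
    for u, v in edges:
        # Ignore edges that touch a removed vertex.
        if u in vertices and v in vertices:
            adj[u].add(v)
            adj[v].add(u)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == vertices


def is_k_connected(vertices, edges, k):
    """Brute force: the graph must stay connected after removing any k-1 vertices."""
    vertices = list(vertices)
    if len(vertices) <= k:
        return False  # convention assumed here: more than k vertices are required
    for removed in combinations(vertices, k - 1):
        if not is_connected(set(vertices) - set(removed), edges):
            return False
    return True


# Hypothetical examples: a cycle (like a tour) and an open path over five segments.
cycle = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1)]
path = [(1, 2), (2, 3), (3, 4), (4, 5)]
cycle_is_biconnected = is_k_connected(range(1, 6), cycle, 2)
path_is_biconnected = is_k_connected(range(1, 6), path, 2)
```

A closed tour over the segments passes the check for $k = 2$, whereas an open path does not, because removing any interior vertex splits it.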

Problem Description
Wireless sensor networks are deployed in realistic environments for data collection and aggregation. Therefore, the connectivity of a WSN is easily compromised by natural disasters. Under such circumstances, a recovery strategy is required to restore connectivity in a realistic environment, which implies that terrain influences need to be taken into consideration. The terrain influences on connectivity restoration are closely related to the landforms of the AOI, such as forest, hill, swamp, and flat. In some realistic scenarios, there exists a shortcut between two sites; however, taking such a shortcut incurs a significant energy cost. Intuitively, the possibility of taking a detour should be considered. To quantify the terrain influences, the AOI is mapped into a grid of equal-sized cells (squares) with side length $(\sqrt{2}/2)R$, where $R$ denotes the transmission range. Besides, both segments and RNs are assumed to be centered at cells. The rationale is that RNs placed at the eight neighboring cells of a cell are reachable from it. More importantly, data sensed by the eight neighboring sensors can be collected by an MDC while it travels through that cell.
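The choice of cell size can be checked with a short calculation. The sketch below writes $R$ for the transmission range and $a$ for the cell side length (both symbols are assumed here for illustration) and compares the center-to-center distances of the eight neighboring cells with $R$.

```latex
% Assumed symbols: R = transmission range, a = cell side length.
a = \frac{\sqrt{2}}{2} R
\quad\Longrightarrow\quad
\begin{cases}
d_{\text{side}} = a = \frac{\sqrt{2}}{2} R \approx 0.707\,R \le R
  & \text{(north, east, south, and west neighbors)} \\[4pt]
d_{\text{diag}} = \sqrt{2}\,a = R \le R
  & \text{(four diagonal neighbors)}
\end{cases}
```

Under this reading, the diagonal neighbors sit exactly at the limit of the transmission range, so any larger cell would leave them unreachable.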
Similar to ReBAT [24], risk and elevation are taken into account, besides distance, when determining the optimal path. Specifically, each cell in Cells is associated with a random terrain type, the corresponding risk factor, and an elevation. Accordingly, we use a weight function to represent the terrain influence on travelling through a cell. In addition, Manhattan distance (MD) is used to estimate the cost of paths, and four directions (north, east, west, and south) are considered directly accessible when moving an MDC to an adjacent cell. We then measure the weight of a path as the sum of the weights of all visited cells except the two cells where its end segments are located. According to (2), we give the weight function of a path ( , ) as follows: ) and ( 3,4 ), respectively. Note that if there are several minimum weighted paths between two segments, the one with the shortest Manhattan distance is chosen. Furthermore, we assume in this paper that the weight of a path is the same in both directions. Let  be a data collection and aggregation tour; according to (3), () is given as follows: In [24] According to (4) and (5), it is easy to deduce that the energy cost is proportional to the terrain influence. For simplicity, we use the weight of a path and the weight of a tour to represent their respective energy costs. The performance comparison in energy cost will be conducted using (8) in Section 6.2.
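The path weight described above can be sketched as a shortest-path search over the grid. The following is a minimal illustration, not the authors' implementation: it assumes nonnegative cell weights, uses Dijkstra's algorithm with 4-directional moves, and excludes the source and destination cells from the sum. The function name `min_weight_path` and the example grid are hypothetical, and the Manhattan-distance tie-breaking rule is not implemented.

```python
import heapq


def min_weight_path(weights, src, dst):
    """Dijkstra over a 4-connected grid of nonnegative cell weights.

    The cost of a path is the sum of the weights of the cells it visits,
    excluding the source and destination cells. Returns (cost, path).
    """
    rows, cols = len(weights), len(weights[0])
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, cell = heapq.heappop(heap)
        if cell == dst:
            break
        if d > dist.get(cell, float("inf")):
            continue  # stale heap entry
        r, c = cell
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols:
                # Entering the destination cell is free by definition.
                step = 0 if (nr, nc) == dst else weights[nr][nc]
                nd = d + step
                if nd < dist.get((nr, nc), float("inf")):
                    dist[(nr, nc)] = nd
                    prev[(nr, nc)] = cell
                    heapq.heappush(heap, (nd, (nr, nc)))
    if dst not in dist:
        return float("inf"), []
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[dst], path


# Hypothetical terrain: the middle column is costly (weight 9), so the
# direct shortcut from (0, 0) to (0, 2) is more expensive than a detour.
grid = [
    [1, 9, 1],
    [1, 9, 1],
    [1, 1, 1],
]
cost, path = min_weight_path(grid, (0, 0), (0, 2))
```

In this example the straight route crosses one cell of weight 9, while the detour along the bottom row crosses five cells of weight 1, which mirrors the shortcut-versus-detour situation described in this section.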
In addition, the required restoration strategy should be allowed to use only a limited number of RNs and MDCs due to the fact that relay devices could be expensive.We then give the formal problem definition as follows.
Given  nodes with a transmission range of  that form  disjoint partitions in a squared region which consists of  ×  cells of size ( √ 2/2), the goal is to provide a random terrain based solution (distributed/centralized) which ensures that  partitions and the sink node will be biconnected by deploying a limited number of RNs and MDCs and meanwhile minimize the cost of data collection and aggregation.
This paper is dedicated to solving this problem by proposing a polynomial time algorithm, named HRSRT. It is worth mentioning that the tour constructed by HRSRT is not only the connectivity restoration tour but also the data collection and aggregation tour (see Figure 6).

The HRSRT Approach
HRSRT is a random terrain based recovery strategy. It aims to ensure that all disjoint segments, including the sink node, are biconnected while the energy cost of data collection and aggregation is minimized. During the restoration process, only a limited number of RNs and $m$ MDCs are employed. In fact, the energy cost is greatly reduced when more MDCs are involved. Thus, according to the value of $m$, HRSRT adopts the corresponding approach to achieve connectivity restoration as follows: (i) $m = 1$: if there is only one MDC available, then it needs to tour around all disjoint segments and the sink to collect and aggregate data. (ii) $m > 1$: if more than one MDC is available, then the tour for each MDC should be carefully chosen so that the total energy cost for data collection and aggregation is minimized.
The framework of HRSRT is shown in Figure 7. In this paper, we first introduce HRSRT as a centralized procedure; the distributed HRSRT is then elaborated in Section 5.3. The theoretical proof of the biconnectivity of the tour established by HRSRT is given in Section 6.1, which implies that HRSRT can restore connectivity with terrain influences taken into consideration.

HRSRT with 𝑚 = 1.
For $m = 1$, HRSRT works in two phases. In phase one, random terrain based path planning (RTPP) is carried out to initiate a connectivity restoration tour. In phase two, Optimized Relay Node Deployment (ORND) is adopted to reduce the weight of that tour. The pseudocode for HRSRT with $m = 1$ is shown in Algorithm 1.
(3) Return . on the cost of data collection and aggregation for MDCs.
To quantify such influences, the cost of travelling through a unit area (cell) is represented by a weight value. Therefore, the priority of a path planning strategy on realistic terrain is to establish a tour of minimum weight for MDCs. In this section, a random terrain based path planning strategy, called RTPP, is proposed to accomplish this goal under the constraint that only $m$ MDCs are available. Suppose there are $n$ disjoint segments. We take six steps to build a minimum weighted tour as follows: (1) According to the weight of each unit area, construct a complete weighted graph over the set of segments that remain disjoint. Each edge of this graph is the minimum weighted path between its two end segments. Note that if there are several such paths, the one with the shortest Manhattan distance between the two segments is chosen. The rationale is that an MDC moves toward an adjacent cell in one of the four directions mentioned above.
(3) Find the set  of odd-degree vertices in mst; that is, As shown in Figure 8, there are 4 RNs and one MDC available for a set  of disjoint segments, where  = { 1 ,  2 ,  3 ,  4 ,  5 }. Figure 8(b) shows the weighted complete graph  5 over .And each edge     ∈  5 is associated with a weight value (    ) that represents the minimum cost of travelling from   to   .A mst of  5 is built in Figure 8(c).And there are only two odd-degree vertices  1 ,  2 ∈ mst, so the perfect matching of  2 is  1  2 .As shown in Figure 8(d), the trail  =  1  2  3  4  5  1 is the Euler closed trail of mst ∪ { 1  2 }.Then, according to the order of , continuingly visit all vertices starting from  1 .The solid lines consist of the Hamilton cycle  =  1  2  3  4  5  1 .And  is chosen as the initial data collection and aggregation tour .It is worth mentioning that the tour  is 2-connected and the corresponding theoretical proof is given in Section 6.1.The pseudocode for RTPP is shown in Algorithm 2.
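The steps named in this section (minimum spanning tree, odd-degree vertices, perfect matching, Euler closed trail, Hamilton cycle) follow the pattern of the classical Christofides heuristic, so the sketch below illustrates that pattern rather than the authors' exact Algorithm 2. Everything in it is an assumption made for illustration: the function names, the exhaustive matching (usable only for a handful of segments), and the example weights, which are Manhattan distances between made-up coordinates instead of terrain-based path weights.

```python
from itertools import permutations


def prim_mst(nodes, w):
    """Minimum spanning tree edges of the complete graph with weights w[u][v]."""
    in_tree, edges = {nodes[0]}, []
    while len(in_tree) < len(nodes):
        u, v = min(((a, b) for a in in_tree for b in nodes if b not in in_tree),
                   key=lambda e: w[e[0]][e[1]])
        in_tree.add(v)
        edges.append((u, v))
    return edges


def min_perfect_matching(odd, w):
    """Exhaustive minimum-weight perfect matching on an even-sized vertex list."""
    if not odd:
        return 0, []
    first, rest = odd[0], odd[1:]
    best = (float("inf"), [])
    for i, partner in enumerate(rest):
        cost, pairs = min_perfect_matching(rest[:i] + rest[i + 1:], w)
        cost += w[first][partner]
        if cost < best[0]:
            best = (cost, [(first, partner)] + pairs)
    return best


def euler_circuit(nodes, edges, start):
    """Hierholzer's algorithm on a connected multigraph with even degrees."""
    adj = {v: [] for v in nodes}
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))
    used = [False] * len(edges)
    stack, circuit = [start], []
    while stack:
        u = stack[-1]
        while adj[u] and used[adj[u][-1][1]]:
            adj[u].pop()  # discard edges already traversed from the other end
        if adj[u]:
            v, idx = adj[u].pop()
            used[idx] = True
            stack.append(v)
        else:
            circuit.append(stack.pop())
    return circuit[::-1]


def christofides_tour(nodes, w):
    """MST + matching on odd-degree vertices + Euler trail + shortcutting."""
    nodes = list(nodes)
    mst = prim_mst(nodes, w)
    degree = {v: 0 for v in nodes}
    for u, v in mst:
        degree[u] += 1
        degree[v] += 1
    odd = [v for v in nodes if degree[v] % 2 == 1]
    _, matching = min_perfect_matching(odd, w)
    trail = euler_circuit(nodes, mst + matching, nodes[0])
    tour, seen = [], set()
    for v in trail:  # skip repeated vertices to obtain a Hamilton cycle
        if v not in seen:
            seen.add(v)
            tour.append(v)
    return tour + [tour[0]]


def tour_weight(tour, w):
    return sum(w[a][b] for a, b in zip(tour, tour[1:]))


def brute_force_optimum(nodes, w):
    """Optimal tour weight by enumeration; for checking tiny instances only."""
    first, rest = nodes[0], nodes[1:]
    return min(tour_weight([first, *p, first], w) for p in permutations(rest))


# Hypothetical segment coordinates; weights are Manhattan distances.
pts = {"s1": (0, 0), "s2": (4, 0), "s3": (5, 3), "s4": (2, 5), "s5": (0, 3)}
w = {u: {v: abs(pu[0] - pv[0]) + abs(pu[1] - pv[1]) for v, pv in pts.items()}
     for u, pu in pts.items()}
tour = christofides_tour(list(pts), w)
optimum = brute_force_optimum(list(pts), w)
```

Because the example weights satisfy the triangle inequality, the resulting tour weight stays within 1.5 times the brute-force optimum, which is the bound discussed in Section 6.1.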

Optimized Relay Node Deployment (ORND)
. ORND is a highly effective algorithm that aims to improve the initial data collection and aggregation tour   established by RTPP.Although there are only  RNs available, ORND attempts to place RNs at the optimal positions such that a number of intersected paths are merged to reduce (  ) and the final data collection and aggregation tour  is built (see Figure 9).Now we introduce how ORND works: (1) Check all paths on  and find out if there exist at least two paths, for example, Step (3) implies that if there exists only one pair of paths, for example,  * , and  * , , that have  consecutive cells in common, then () can be reduced by 2(  1 ,  ) at most due to the merging process of ORND.More specifically, if the RNs deployed along   1 ,  reach the cell   , then the MDC can directly collect the data sensed by   by simply travelling through the merged path  , =    ,  ∪    ,  , as shown in Figure 9, instead of travelling along paths  * , and  * , sequentially.
In Figure 10, it is obvious that although the initial tour is a Hamilton cycle  and CH 5 , the MDC does not have to go directly from   to   along the path  , due to the fact that there may exist at least two paths, that is,  3,4 and  4,5 , that can merge as one.In addition, a common cell   within the communication range of  4 is the only cell shared by both of  * 3,4 and  * 4,5 .Thus, such two paths merge as  3,5 by deploying a RN at cell   and all data sensed by  4 will be collected while a MDC is travelling through   .Figure 10(b) shows the final MDC tour , with () = 6 + 23 + 10 + 11 = 50.Note that although there are sufficient RNs, only one RN is needed.This is attributed to the fact that even if more RNs are deployed, () remains the same in this example.The pseudocode for ORND is shown in Algorithm 3.

Input: The initial data collection and aggregation tour   and  RNs.
Output: The final data collection and aggregation tour .
(1) for all paths on  do
(2) if there exist at least two paths  * , and  * , such that they have  consecutive cells in common except the cell where   is located then
(3) Start to deploy RNs along   1 ,  and do
end if
(9) end if
(10) end for
(11) Return 
Algorithm 3: Optimized Relay Node Deployment (ORND).

HRSRT with 𝑚 > 1.
For $m > 1$, the same path planning algorithm RTPP elaborated in Section 5.1.1 is employed. It takes the $n$ original disjoint segments as the input to establish a data collection and aggregation tour. Then, we devise a new RN deployment strategy (RND) to determine candidate positions for RNs on the tour such that the energy cost is minimized. Finally, a Path Allocation (PA) approach is developed to locate optimal tours for all MDCs (see Figure 11). Now we elaborate how a minimum weighted tour is built as follows: (1) For each   ∈ , deploy ⌈/2⌉ RNs along path  * ,−1 and populate (⌈/⌉ − ⌈/2⌉) RNs along path  * ,+1 , where  =  1  2 ⋅ ⋅ ⋅    1 . We use symbols   and   to mark the latest deployed RNs along  * ,−1 and  * ,+1 , respectively.
We call the strategies, described in steps (1) and (2), RND and PA, respectively.In fact, if there exists a path  * , ∈  such that the RNs are deployed along  * , as step (1), then () is reduced by ( *   ,  ∪  *   ,  ).This is because   =  * , \ ( *   ,  ∪  *   ,  ) is the tour allocated to a MDC, instead of  * , .Besides, all data sensed by   and   can be collected, while both ends of the tour   are reached by the MDC.As shown in Figure 12(c It is worth mentioning that HRSRT adopts different strategies based on different values of .The rationale is that if HRSRT employs RTPP, RND, and PA sequentially with only one MDC available, then the MDC still needs to tour around the entire set of segments to collect data along the same path.That implies the RNs deployment incurs no reduction in energy cost.Unlike RND and PA, ORND is designed to merge intersected paths into one through deploying RNs at optimal positions such that the weight of  is minimized, regardless of how many MDCs are available.Thus, ORND is an ideal choice for the single MDC case.The pseudocode for HRSRT with  > 1 is shown in Algorithm 4.

Input: A set  of  disjoint segments,  = { 1 ,  2 , . . .,   }.
Output: A data collection and aggregation tour .
(2) for each   ∈  do
(3) Deploy ⌈/2⌉ RNs along path  * ,−1 and mark the last deployed RN  
(4) Deploy (⌈/⌉ − ⌈/2⌉) RNs along path  * ,+1 and mark the last deployed RN  
(5) end for
(6) if  ≥  then
(7) for all  * ,  ∈  do
(8) Choose   =  * , \ ( *   ,  ∪  *   ,  ) as a MDC tour
(9) end for
(10) else
(11) for all ( − 1)  * ,  ∈  do
(12) Choose   =  * , \ ( *   ,  ∪  *   ,  ) as a MDC tour
(13) end for
(14) Choose   = ⋃  =   as a MDC tour for the last MDC
(15) end if
(16) for all   s do
(17)  =  ∪  
(18) end for
Algorithm 4: HRSRT with $m > 1$.

Distributed Implementation. This section describes how HRSRT is implemented in a distributed manner. When the network is partitioned into  disjoint segments, each segment   first chooses a sensor as its representative   and then broadcasts its location   . We assume that there are some mobile agents in this network. These mobile agents are sent, based on the original positions, to those segments that have lost contact. Eventually, all   s share each other's   s after the mobile agents return. Each   calculates the coordinate of the CoM of SP  using (6) [27]. Note that   and   are coordinates

(1) for each   ∈  do
(2) randomly choose a sensor as the representative  
(3) sends a mobile agent to locate   s, where  ≠ 
(4) calculate the coordinate of the CoM of SP 
(5) end for
(6) repeat
(7)   place a RN toward the CoM and let this RN as a new   to represent  
(8) if there are    s that meet each other, then
(9) they merge as one and choose the   closest to other    as the representative
(10) end if
(11) Update the list of disjoint segments
(12) until (All RNs assigned to each   are populated.)
(13) for each pair of remaining disjoint segments    and    do
(14) calculate (   ,    ) and store it in the database  in increasing order
(15) end for
(16) while  is not a circuit do
(17) Get the first element (

Then,   starts to populate relays toward the CoM. Note that each RN placed by   becomes the new representative of   . When two or more   s are within each other's communication range, the corresponding   's merge into a new segment, and the   closest to the CoM is chosen as the representative of this newly established segment. Then,   's (where  ≠ ) stop deploying RNs toward the CoM. Each   recursively deploys relays until all RNs assigned to it are placed.
The RN deployment is followed by path planning. Since the position of the CoM is calculated by each representative, the final positions of all segments are known to every representative. Then, the weights of the paths between all pairs of distinct segments are calculated and stored in the database in increasing order. Next, the shortest paths are iteratively added to the data collection and aggregation tour, provided that an added path does not create a cycle unless it completes the tour. Note that if the number of mobile data collectors is sufficient, then each segment is assigned an MDC. Otherwise, only as many segments as there are MDCs are assigned one. Finally, each MDC tours around all existing disjoint segments along the tour to collect and aggregate data (see Figure 13). The pseudocode for the distributed HRSRT is shown in Algorithm 5.
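The tour construction described in this paragraph resembles the greedy-edge heuristic: candidate paths are sorted by weight and accepted unless they would close a cycle prematurely. The sketch below illustrates that idea under stated assumptions and is not a transcription of Algorithm 5. The function name `greedy_tour_edges`, the union-find bookkeeping, the explicit degree limit of two per segment (implied by requiring a tour), and the Manhattan-distance example weights are all hypothetical.

```python
def greedy_tour_edges(nodes, w):
    """Build a closed tour by adding the lightest admissible edges first.

    An edge is admissible if both endpoints still have degree < 2 and it
    does not close a cycle, unless it is the final edge completing the tour.
    """
    nodes = list(nodes)
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    degree = {v: 0 for v in nodes}
    # The "database" of pairwise weights, stored in increasing order.
    pairs = sorted((w[u][v], u, v)
                   for i, u in enumerate(nodes) for v in nodes[i + 1:])
    tour = []
    for _, u, v in pairs:
        if degree[u] >= 2 or degree[v] >= 2:
            continue
        closes_cycle = find(u) == find(v)
        if closes_cycle and len(tour) < len(nodes) - 1:
            continue  # would create a cycle before the tour is complete
        parent[find(u)] = find(v)
        degree[u] += 1
        degree[v] += 1
        tour.append((u, v))
        if len(tour) == len(nodes):
            break
    return tour


# Hypothetical final segment positions; weights are Manhattan distances.
pts = {"s1": (0, 0), "s2": (4, 0), "s3": (5, 3), "s4": (2, 5), "s5": (0, 3)}
w = {u: {v: abs(pu[0] - pv[0]) + abs(pu[1] - pv[1]) for v, pv in pts.items()}
     for u, pu in pts.items()}
edges = greedy_tour_edges(list(pts), w)
visits = {v: sum(v in e for e in edges) for v in pts}
```

Every representative that holds the same sorted list of weights would arrive at the same set of edges, which is what allows the computation to be repeated independently at each segment.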

Performance Evaluation
6.1. Theoretical Analysis. The correctness, complexity, and approximation ratio of HRSRT are analyzed in this subsection. First we give the following theorems.

Theorem 8. All paths of a data collection and aggregation tour established by RTPP satisfy WTI.
Proof. We prove this theorem by contradiction. Suppose there are three paths for an MDC travelling among three segments such that the sum of the weights of two of the minimum weighted paths is less than the weight of the third. That implies that the cost of travelling directly between two of the segments is higher than that of travelling between them via the third segment, which contradicts the fact that RTPP only chooses minimum weighted paths for MDCs. Therefore, the minimum weighted path between those two segments should pass via the third segment, and the weighted triangle inequality is satisfied. So the theorem holds.
To state the following theorem clearly, we call a path between two segments a direct path if there is no other segment on it.

Theorem 9. The tour constructed by RTPP is either a Hamilton cycle or a closed trail.
Proof. RTPP strives to locate the minimum weighted closed trail (mwct) of a complete weighted graph. Without loss of generality, let the mwct visit the segments in index order and return to the first segment. Note that each edge of the mwct is the minimum weighted path between its two end segments. We distinguish between two cases to prove this theorem.
If all edges of the mwct are direct paths such that each segment is on it exactly once, then, according to Definition 3, the mwct is a minimum weighted Hamilton cycle (see Figure 3).
If at least one path of the mwct is not a direct path, then it is easy to verify that the mwct is a closed trail. We assume that this path starts from one segment and ends at another via a third segment. According to algorithm RTPP, there may exist two other paths that complete the trail and whose union differs from this path. Intuitively, the intermediate segment is then on the trail at least twice, which implies that the mwct is a closed trail (see Figure 2).
Proof. RTPP employs an induced subgraph of the complete weighted graph, defined on the set of odd-degree vertices, to construct a perfect matching. When the mst of the complete weighted graph is built, the set of odd-degree vertices is used to construct this subgraph for the establishment of a perfect matching. Since the mst is a tree, its number of edges is one less than its number of vertices. The sum of the degrees of all vertices in the mst is even, because it equals twice the number of edges [26]. It is easy to deduce that the number of odd-degree vertices is even.
For the rest of the paper, we use  2 to represent graph  || .
Note that a series of MDC based connectivity recovery algorithms, such as MiMSI [12], IDM-kMDC [10], and MINDS [21], make MDCs travel along the convex hull of $n$ disjoint segments. Specifically, if there is at least one segment not on $CH_n$, then the $n$ segments are clustered into several disjoint groups so that each group forms a convex hull with each of its segments on it. For simplicity, we call such algorithms $A$-$CH_n$s, and we call the tour along the convex hull of $n$ disjoint segments a $CH_n$ tour.

Theorem 11. HRSRT establishes a data collection and aggregation tour with less energy cost than 𝐴-𝐶𝐻 𝑛 s.
Proof. Consider the tour established by RTPP and the tour established by an $A$-$CH_n$. According to RTPP, the selection of perfect matchings on the odd-degree vertices could result in two different types of tours; that is, the RTPP tour is either a $CH_n$ tour or a non-$CH_n$ tour. We distinguish the following two cases to prove that the energy cost of the RTPP tour is less than that of the $A$-$CH_n$ tour, regardless of the number $m$ of MDCs.
Case 1 ( is a non-CH  tour).If there is at least one segment not on CH  , then   is a directed graph with either one of the following two structures.One is that a graph consists of at least two connected components bridged with two directed paths.The other is composed of at least two subtours that share a common vertex.We thus take Figure 15 as an example to explicate this case.Let the tour in Figure 15(d  If all  segments are on CH  , then it is easy to get   = CH  .According to the tour selection of RTPP, we have () ≤ (CH  ) = (  ), because  is a non-CH  tour.Again, (5) guarantees that Cost() ≤ Cost(  ).
Case 2 (the RTPP tour is a $CH_n$ tour and all $n$ segments are on $CH_n$). The weight of the RTPP tour is at most that of the $A$-$CH_n$ tour, since each edge of the former is the minimum weighted path between its two end segments, while each edge of the latter is a straight line between its two end segments regardless of terrain influences. As shown in Figure 16, the weight of the RTPP tour is 5 + 6 + 9 + 6 + 10 + 5 + 3 + 2 = 46, which is less than the weight 5 + 6 + 9 + 6 + 10 + 15 = 51 of the $A$-$CH_n$ tour. According to (5), it follows that the energy cost of the RTPP tour is at most that of the $A$-$CH_n$ tour.
Theorem 12. RTPP establishes a closed trail, the weight of which is less than 1.5 times that of the least weight Hamilton cycle.
Proof. Without loss of generality, we assume that the closed trail established by RTPP is a Hamilton cycle. For simplicity, let  2 represent the perfect matching of graph  2 . Then, the proof of the approximation ratio of the weight of this trail to that of the least weight Hamilton cycle is as follows.
First, we set  the minimum spanning tree of a weighted complete graph   . Let   represent the induced graph of   − {}, where  denotes an edge of . Obviously,   is still a spanning tree of   such that (  ) > (). That implies (  ) > (), because of () > 0.
Summing up all four steps above, the theorem holds because the following inequality holds:
Theorem 13. The data collection and aggregation tour constructed by HRSRT is 2-connected.
Proof.We distinguish between two cases to prove this theorem.
Case 1 ( = 1).HRSRT adopts RTPP first to calculate a minimum weight Hamilton cycle  as the connectivity restoration tour.Theorem 9 proves that  is either a Hamilton cycle or a closed trail.According to Definitions 3 and 7, it is intuitive that  is 2-connected.Then, ORND is employed to merge any pair of intersected paths  * , ,  * , ∈  so that  =  \ { * , ,  * , } ∪ { * , } through RNs deployment.It is easy to verify the fact that there still exist 2 mutually independent paths which implies the biconnectivity of tour .As shown in Figure 19, there exist  2  4 pairs of independent paths between any pair of segments.Even if a path is cut off, that is,  1,4 is disconnected,  1 and  4 are still connected via  2 and  3 .
Summing up the two cases above, it is easy to deduce that the data collection and aggregation tour constructed by HRSRT is 2-connected.
Theorem 14. The approximation ratio of HRSRT is 1.5.
Proof. We distinguish between two cases to prove this theorem.
Case 1 (there is only one MDC available). In this case, (4), (5), and Theorem 12 guarantee that the energy cost of tour  established by RTPP is less than 1.5 times that of the optimal tour. In addition, () can be reduced due to RN deployment by ORND.
Case 2 (there are at least two MDCs available). In this case, RND ensures that the sum of weights of all MDC tours is less than that of the tour  established by RTPP.
Summing up the two cases above, it follows that HRSRT is a 1.5-approximation algorithm.
Proof. We distinguish the following two cases to prove this theorem.
Case 1 (there is only one MDC available). In this case, HRSRT consists of algorithms RTPP and ORND. We analyze their complexity in turn. RTPP takes four steps to build the data collection and aggregation tour: the construction of a spanning tree, the establishment of an Euler graph through calculating the perfect matching, the localization of an Euler closed trail, and the discovery of a Hamilton cycle. The cost of obtaining a spanning tree of a graph is less than ( 2 ). The construction of an Euler graph through locating a perfect matching will not cost more than ( 3 ). In addition, the construction of an Euler closed trail requires an () algorithm, where  is a constant. Furthermore, the discovery of a Hamilton cycle is accomplished by depth first search such that its complexity will not exceed ( 2 ). Hence, RTPP is a polynomial time algorithm with the complexity of ( 3 ). ORND discovers the intersected paths of a data collection and aggregation tour  established by RTPP. Then, it locates the optimal positions for RNs. Since both the discovery of intersected paths and the localization of optimal positions can be done in constant time, ORND is a () algorithm, where  is a constant.
Case 2 (there are at least two MDCs available). In this case, HRSRT is composed of RTPP, RND, and PA. Both RND and PA are () algorithms.
Summing up the two cases above, the complexity of HRSRT is ( 3 ).
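The four steps attributed to RTPP above (spanning tree, perfect matching, Euler closed trail, Hamilton cycle) can be illustrated with a minimal sketch. It assumes the inter-segment weights are already available as a symmetric matrix `w` that obeys the weighted triangle rule; all function names are hypothetical. Two substitutions are made for brevity and should not be read as the authors' implementation: the matching is found by exhaustive search, which is only viable for a handful of odd-degree vertices, and the Hamilton cycle is obtained by skipping already visited vertices along the Euler closed trail.

```python
import math


def prim_mst(w):
    """Step 1: minimum spanning tree of a complete graph (Prim's algorithm)."""
    n = len(w)
    in_tree = {0}
    edges = []
    while len(in_tree) < n:
        u, v = min(
            ((a, b) for a in in_tree for b in range(n) if b not in in_tree),
            key=lambda e: w[e[0]][e[1]],
        )
        edges.append((u, v))
        in_tree.add(v)
    return edges


def min_perfect_matching(nodes, w):
    """Step 2: exhaustive minimum weight perfect matching (small inputs only)."""
    if not nodes:
        return 0.0, []
    first, rest = nodes[0], nodes[1:]
    best_cost, best_pairs = math.inf, []
    for i, partner in enumerate(rest):
        cost, pairs = min_perfect_matching(rest[:i] + rest[i + 1:], w)
        cost += w[first][partner]
        if cost < best_cost:
            best_cost, best_pairs = cost, [(first, partner)] + pairs
    return best_cost, best_pairs


def euler_closed_trail(n, edges, start=0):
    """Step 3: Hierholzer's algorithm on a connected even-degree multigraph."""
    adj = {v: [] for v in range(n)}
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))
    used = [False] * len(edges)
    stack, trail = [start], []
    while stack:
        v = stack[-1]
        while adj[v] and used[adj[v][-1][1]]:
            adj[v].pop()  # discard edges already traversed from the other end
        if adj[v]:
            nxt, idx = adj[v].pop()
            used[idx] = True
            stack.append(nxt)
        else:
            trail.append(stack.pop())
    return trail


def rtpp_like_tour(w):
    """Steps 1-4 combined; returns the visiting order of the vertices."""
    n = len(w)
    mst = prim_mst(w)
    degree = [0] * n
    for u, v in mst:
        degree[u] += 1
        degree[v] += 1
    odd = [v for v in range(n) if degree[v] % 2 == 1]
    _, matching = min_perfect_matching(odd, w)
    trail = euler_closed_trail(n, mst + matching)
    seen, tour = set(), []
    for v in trail:  # Step 4: shortcut repeated vertices
        if v not in seen:
            seen.add(v)
            tour.append(v)
    return tour


def tour_weight(tour, w):
    """Total weight of the closed tour, including the edge back to the start."""
    return sum(w[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))
```

In a terrain-aware setting, `w[i][j]` would hold the weight of the minimum weighted path between two segments rather than their Euclidean distance; the sketch is agnostic to how the matrix is produced.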

Validation Experiments.
The simulation environment, performance metrics, and experimental results are discussed in this subsection.

Experiment Setup.
We consider a disconnected MSN deployed in a region of size 2000 × 2000 m. The sensing range and transmission range of a sensor or a relay are set to 25 and 50 m, respectively. In order to represent different terrain types over the deployment region, it is divided into cells (i.e., squares of a certain size), where each cell is associated with a terrain type picked randomly from Table 1. In addition, the cell size is determined on the basis of the application area and the application requirements. We choose the size of 35 × 35 m for the cells. Then, topologies with varying numbers of sensors and segments are generated, and 50 topologies for each test case are considered. For each topology, terrain features are randomly added. Since obstacle-free environments are assumed for all baseline approaches, terrain features without obstacles are added.
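The cell-based terrain model described above can be sketched as follows. This is a minimal illustration under stated assumptions: the terrain labels are hypothetical placeholders (the actual types, risk rates, and elevations are those of Table 1), border cells that extend past the 2000 m boundary are simply kept, and the function names are not taken from the paper.

```python
import math
import random

AREA_SIDE_M = 2000  # side length of the deployment region
CELL_SIDE_M = 35    # side length of one terrain cell

# Hypothetical placeholder labels; the real entries come from Table 1.
TERRAIN_TYPES = ["type_a", "type_b", "type_c"]


def build_terrain_grid(area_side=AREA_SIDE_M, cell_side=CELL_SIDE_M,
                       types=TERRAIN_TYPES, seed=None):
    """Assign a uniformly random terrain type to every cell of the region."""
    rng = random.Random(seed)
    cells_per_side = math.ceil(area_side / cell_side)
    return [
        [rng.choice(types) for _ in range(cells_per_side)]
        for _ in range(cells_per_side)
    ]


def cell_of(x, y, cell_side=CELL_SIDE_M):
    """Map a position in meters to its (row, column) cell index."""
    return int(y // cell_side), int(x // cell_side)


grid = build_terrain_grid(seed=1)
row, col = cell_of(120.0, 980.0)
terrain_here = grid[row][col]
```

Fixing the seed makes a generated topology reproducible, which is convenient when the same terrain must be presented to several approaches.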

Performance Metrics and Baseline Approaches.
In our experiments, a partitioned WSN with varying numbers of segments has been considered. In addition, the parameters that affect the network characteristics are listed as follows.
Number of Relay Nodes (  ). Since RNs are deployed to federate disjoint segments, a great number of RNs will significantly shorten intersegment distances, so that the energy cost for data collection and aggregation is mitigated. In this paper,   is assumed insufficient to connect all disjoint segments. [12], which implies lower energy cost.
Communication Range of a Relay Node (). With the growth of , intersegment distances are effectively reduced, which contributes to a low cost data collection and aggregation tour.
We use the following three metrics to evaluate the overall performance of HRSRT.
Total Energy Cost. Energy cost incurred because of movement is considered. Our goal is to minimize this cost to extend the network lifetime.
Maximum Energy Cost. This metric shows the maximum energy cost of an MDC. This is directly related to the survival of a single MDC, which affects the lifetime of the repaired network.
Recovery Time. This is the time required to complete networkwide recovery. Minimizing the recovery time is not a goal, but it is affected by the path planning. The velocities of MDCs are set to 0.5 m/s.
We compare the performance of HRSRT with the following three baseline approaches.
FeSMoR [11]. This algorithm is designed to minimize the average end-to-end delay between every pair of segments in a damaged WSN with a limited number of relay nodes. FeSMoR works in two phases. The first phase is to construct a Euclidean Steiner Minimal Tree (ESMT) of segments to balance data traffic through stationary RN deployment. In the second phase, FeSMoR finds the edges that require multiple stationary relays and serve on the least number of - paths. Then, mobile relays are employed to replace stationary relays on those edges. If there are two adjacent edges that have leaf segments and each of them is assigned a mobile relay, then the two edges merge into one data collection and aggregation tour, referred to as a Steiner triangle, that requires only one mobile relay.
MiMSI [12]. It is a mixed recovery strategy that utilizes MDCs and stationary RNs for connecting a set of partitions. MiMSI first builds an ESMT in terms of the average node degree. Steiner Points of the ESMT and segments are then grouped into clusters based on proximity. Then, gateway nodes between every pair of neighboring clusters are determined. After that, MiMSI populates stationary relays at the positions where gateways are located for connecting two adjacent clusters. Finally, each cluster employs an MDC that tours around segments for data collection and aggregation. If there is only one MDC available, then segments of each cluster are federated by stationary RNs and each gateway node is repeatedly visited by the only MDC.
ReBAT [24]. It is a connectivity restoration strategy that considers realistic terrain influences. ReBAT operates in two phases. The first phase is to seek the set of locations for the mobile sensors to ensure the connectivity. In the second phase, a greedy-based heuristic constructs a CDS as the connected backbone of the network. Then, some dominatee nodes of the CDS are relocated to maintain the - connectivity. During the movement of the nodes, different terrain types associated with different frictions (i.e., risk values) are considered, such that each least cost path to the destination is found. Note that ReBAT only considers establishing a 1-connected network. In our experiments, as a baseline approach, ReBAT is slightly modified to build a data collection and aggregation tour by planning at least one extra path in the resulting 1-connected network after connectivity restoration. It is worth mentioning that if ReBAT can obtain the least cost path  * , that contains all segments, then in the worst scenario the weight of the data collection and aggregation tour () is less than or equal to two times that of  * , ; that is, () ≤ 2 × ( * , ). According to Theorem 12, it is easy to deduce that the extended ReBAT is a 2-approximation algorithm. Henceforth, we use ReBAT-EX to denote the extended ReBAT.
In summary, FeSMoR and MiMSI do not consider different terrain types. They attempt to minimize movement distance assuming a constant energy cost per meter. ReBAT strives to reestablish a 1-connected network, while HRSRT establishes a biconnected intersegment topology and a least cost tour for data collection and aggregation. In addition, we use Opt to represent the optimal energy cost for connectivity restoration.

Energy Cost Formulation.
We use the same formula as ReBAT [24] to measure the energy cost as follows:
Note that  represents a tour continuously visiting  cells,   is the total distance travelled from  −1 to   ,  is the constant cost for movement per meter on a flat topology, which is taken as 30 joules/meter, and   and   denote the product of risk and elevation, respectively, for the cell   shown in Table 1. In addition, the distance of travelling from a cell to its neighboring cell in one of the four directions equals ( √ 2/2); that is,   = ( √ 2/2). According to (4), (5), and ReBAT-EX, and HRSRT using (8) in terms of energy cost is presented in Section 6.2.4.

Simulation Results
Several configurations with different combinations of   ,   ,   , and  are simulated. We change the value of   and   from 4 to 20 with increment of 4, respectively, while   and  vary from 6 to 14 with increment of 2 and from 50 to 250 with increment of 50, respectively. For each individual experiment, we average the results over 30 runs. Note that our experiments employ MDCs instead of mobile sensors for connectivity restoration. For all baseline approaches, if there are insufficient RNs to establish connectivity, then MDCs are utilized to tour around all disjoint segments for data collection and aggregation, even if there is only one MDC available. Accordingly, the deployment of RNs will shorten the intersegment distances, which implies a reduction in energy cost for MDCs.
Total Energy Cost. It can be observed from Figure 20 that the energy cost of all approaches for varying   declines when more relays are deployed. The reason is that RN deployment continuously shortens the intersegment distances. This eventually reduces the energy cost for baseline approaches due to a shorter travel distance of the MDC, while the RN deployment of HRSRT is responsible for the drop of its energy cost. It is clear that HRSRT incurs significantly lower cost than FeSMoR and MiMSI, because of the effectiveness of the path planning. Furthermore, HRSRT incurs less than 3/2 of the cost of Opt, as expected.
Figures 21 and 22 give performance comparisons of HRSRT with baseline approaches for varying   with/without RN deployment. As   increases, more energy is consumed for all approaches. This is attributed to the fact that more intersegment links require recovery. However, if more and more disjoint segments are located in the area, then the intersegment distances are shortened significantly. That eventually mitigates the energy cost, since the MDC moves over a shorter distance. It is worth mentioning that ReBAT-EX is expected to have more energy cost than HRSRT, since HRSRT plans a data collection and aggregation tour through discovering a Hamilton cycle of distances. Consequently, overall movement energy cost of the MDC will be reduced for all approaches. Again, the effectiveness of the path planning is the key for HRSRT to succeed in establishing a data collection and aggregation tour of the least energy consumption, compared with FeSMoR and MiMSI.
The performance comparisons of HRSRT with baseline approaches for varying   without RN deployment are shown in Figure 24. The energy cost of all three baseline approaches falls as   increases. The reason is that employing more MDCs effectively shortens the total travel distance, which implies a fall in total energy cost. Although HRSRT and Opt consume constant energy due to the lack of RN deployment, the MDC paths planned by HRSRT consume less energy than those of FeSMoR, MiMSI, and ReBAT-EX. The reason is that HRSRT works on discovering the minimum cost data collection and aggregation tour over the weighted complete graph of disjoint segments. More importantly, the effectiveness of HRSRT becomes the major factor in the reduction of energy cost. Furthermore, the energy cost of HRSRT is less than 1.5 times that of Opt, as expected.
Maximum Energy Cost. As shown in Figure 25, the maximum energy costs of all approaches decline with varying   . This is because, if plenty of MDCs are employed for the Steiner triangles of FeSMoR, the clusters of MiMSI, and the - link of ReBAT-EX, respectively, then the overall travelling distance is shortened. That results in a drop in maximum energy cost. For HRSRT, although the growth of   does not contribute to a decline in total energy cost due to the lack of RN deployment, the maximum energy cost is reduced. That is attributed to more MDCs sharing the responsibility of connectivity recovery and data collection and aggregation. HRSRT outperforms all baseline approaches.
Recovery Time. Figure 26 shows the recovery time comparison of all approaches for varying   . The results indicate that the recovery times of all approaches first increase as more segments are involved and then decrease when the deployment of segments becomes dense. This is because the intersegment distances increase when more segments are required to be connected, which results in an increase in recovery time; however, densely populated segments shorten the intersegment distances, which contributes to a decrease in recovery time. It can be observed that the nonterrain-aware approaches, FeSMoR and MiMSI, require less recovery time than the terrain-aware approaches, HRSRT and ReBAT-EX. This is expected because direct paths are followed from source to destination when nonterrain-aware approaches are applied. Note that HRSRT is better than ReBAT-EX in recovery time.
As shown in Figure 27, the recovery time decreases for all approaches as more MDCs join the connectivity restoration. The reason is that all MDCs collaborate to restore the connectivity simultaneously, so that the recovery time is shortened. As expected, nonterrain-aware approaches achieve shorter recovery times than terrain-aware approaches, which aim at mitigating terrain influences on connectivity restoration; nevertheless, HRSRT still requires less recovery time than ReBAT-EX.

Conclusion and Future Work
Due to the significance of both connectivity maintenance and optimal network topologies discovery for WSNs, in
The maximum communication range of sensor nodes
  : The  th segment
  : The sensor node that represents the segment   
mst: A minimum spanning tree of graph 
(  ): The degree of   
SP  : The smallest polygon which includes  segments (see the shaded area in Figure 1(a))
CH  : The convex hull of SP  (see the perimeter that consists of dashed lines in Figure 1(b))
CoM: The center of mass of SP  .

Figure 4: Neighboring cells and data collection.

Figures 5(a) and 5(b) give the examples of ( 1,2) and ( 3,4 ), respectively. Note that if there are several minimum weighted paths  * , s, then choose the one with the shortest MD(  ,   ) as  * , . Furthermore, we assume ( , ) = ( , ) in this paper. Let  be a data collection and aggregation tour; according to (3), () is given as follows:

Figure 5: The weights of paths.

Figure 6: Network connectivity recovery with RNs and MDCs.

Figure 9: Two paths merged as one by Optimized Relay Node Deployment (ORND).

Figure 11: One path subdivided into three optimal paths by RNs deployment (RND) and Path Allocation (PA).

Figure 14: The example of weighted triangle rule.

Theorem 10. The weighted complete graph  || established by RTPP consists of 2 vertices.
, Senturk et al. give the energy cost function Cost(  ) for travelling through a cell   as follows, where  is a constant value: Cost (  ) =  (  ) × .
an introduced graph of   , to establish a minimum weighted perfect matching   || . Then, initiate an Euler graph   =   || ⋃ mst.
(4) Randomly choose a vertex   ∈   and draw an Euler closed trail    that begins with   .
(5) According to the order of    , consistently visit all vertices starting from   . If a vertex   has already been visited, then directly go to the next vertex   ∈    until all vertices are visited. Then, all the visited   s are put into .
(6) Locate CH  and check whether each segment   is on CH  . If there is at least one segment, for example,   , not on CH  , then find the edge (  ,   ) ∈ CH  closest to   and create the tour   = CH  \ (  , ) ∪ (  ,   ) ∪ (  ,   ). Otherwise, let   = CH  . Next, calculate min{(), (  )}, and choose the corresponding tour as the data collection and aggregation tour.

Table 1: Terrain types, risk rates, and elevations.
Number of Disjoint Segments (  ). Intuitively, the number of communication links between segments rises with   . Therefore, a larger   results in higher energy cost for data collection and aggregation.
Number of Mobile Data Collectors (  ). MDCs are employed to replace stationary RNs, due to   being insufficient. If there are plenty of MDCs for connectivity recovery, then travelling distance of data collection and aggregation is minimized
this paper, we have discussed a random terrain based connectivity recovery problem in a disconnected WSN under the constraint that there are only  relay nodes (RNs) and  mobile data collectors (MDCs) available. According to different values of , a hybrid connectivity restoration and routing strategy HRSRT is designed. For  = 1, two highly efficient algorithms, the random terrain based path planning (RTPP) and the Optimized Relay Node Deployment (ORND), constitute HRSRT. For  > 1, HRSRT is composed of RTPP, relay nodes deployment (RND), and optimal Path Allocation (PA). All four algorithms collaborate to accomplish the biconnectivity restoration of a disconnected network; meanwhile, the energy cost of data collection and aggregation is minimized. The performance of HRSRT is analyzed theoretically and validated through simulation. The simulation results show that HRSRT outperforms FeSMoR, MiMSI, and the extended version of ReBAT (namely, ReBAT-EX) in terms of the total/maximum energy cost. Our future work is to investigate the connectivity recovery problem with the consideration of the cost for RNs and MDCs deployment.
Notations
(  ,   ): The Euclidean distance between two segments   and   
MD(  ,   ): The Manhattan distance between two segments   and   
  ,  : A path from   to   , abbreviated as  , 
 * , : The minimum weighted path from   to   
 :