Optimal Multirobot Coverage Path Planning: Ideal-Shaped Spanning Tree

This paper seeks the optimal coverage path for multiple robots in a given area that includes obstacles. For the single-robot coverage path planning (CPP) problem, an improved ant colony optimization (ACO) algorithm is proposed to construct the best spanning tree and then obtain the optimal path, which helps minimize energy/time consumption. For the multirobot case, the DARP (Divide Areas based on Robots Initial Positions) algorithm is first used to divide the area into separate equal subareas, which transforms the mCPP problem into several CPP problems and reduces the computational complexity. In the second phase, a spanning tree is constructed in each subarea by the aforementioned algorithm. In the last phase, specific end nodes are exchanged among subareas to achieve ideal-shaped spanning trees, which further decreases the number of turns in the coverage path. The complete algorithms are shown to be approximately polynomial. Finally, simulation confirms the advantages of the complete algorithms: complete coverage, no backtracking, minimum length, zero preparation time, and the least number of turns.


Introduction
Since the last century, intelligent robots have gradually penetrated various industries [1]. Nowadays, many tasks require collaboration among multiple robots, which raises numerous robotic challenges; one of the fundamental problems is path planning for multiple robots. Usually, path planning determines an optimal path among "points" (e.g., from a start point to target points) while avoiding obstacles or no-fly zones. However, when every point of the given area is of interest or the reconnaissance points are uncertain, coverage path planning (CPP), which guarantees that robots pass over every point in a given area, should be carried out. In the literature, the CPP problem is directly related to many applications, such as search and rescue operations [2], inspection of industrial plants [3], and monitoring of hazardous environments [4].
As these applications show, the CPP problem involves the robots or their sensors and the area of interest. Therefore, one of the most common techniques is area decomposition. The approach adopted here is approximate cellular decomposition, which separates the area of interest into identical cells (sized according to the scope of the sensors) so that the robots can easily cover every point. The union of all cells can approximate a target area of arbitrary shape.
In the literature, the single-robot CPP problem has received a lot of attention over the past decade. The primary constraint is to completely cover the area of interest, and the next consideration is efficiency. Among the various techniques (see Section 2), the spanning tree coverage (STC) algorithm [5], which constructs a spanning tree (ST) for all available cells, guarantees complete coverage of the area of interest while generating a nonbacktracking path along the ST. Thus, the STC algorithm is one of the dominant approaches for this problem. Our approach builds on the idea of the STC algorithm and inherits its advantages. In addition, because robots' turns increase energy/time consumption, we propose a methodology to find the best ST so as to reduce the number of turns. For more details, see Section 4 of the paper.

Mathematical Problems in Engineering
With the development of technology, multiple robots can collaborate to solve the CPP problem, which is then referred to as the mCPP problem. To determine the optimal path for the mCPP problem, several objectives must be taken into consideration: completeness, a nonbacktracking path, the initial positions, the minimum coverage path, and energy/time consumption. Unfortunately, it is difficult to address the mCPP problem adequately. In particular, minimizing the covering time is NP-hard [6]. The proposed algorithms (see Section 2), which aim to overcome the NP nature of the problem, mostly focus on only one of the objectives. The authors in [7] made significant progress by proposing the DARP (Divide Areas based on Robots Initial Positions) algorithm, which is suitable for the mCPP problem and guarantees completeness, nonbacktracking, the initial positions, and the minimum coverage path. To reduce the energy/time consumption, the most direct and simplest method is to obtain "ideal-shaped" subareas and decrease the number of turns, for which an extension built on the DARP algorithm is a promising direction.
In the present paper, we attempt to solve the mCPP problem without overlooking any of the aforementioned objectives. Our contributions are threefold. Firstly, the available cells are divided into distinct classes based on the number, capabilities, and initial locations of the robots, by improving the DARP algorithm. Secondly, the idea of the ST is used to generate a complete, nonbacktracking, and minimum coverage path, and an improved ant colony optimization (ACO) algorithm is used to find the best ST so as to minimize the robots' turns and energy/time consumption. Finally, the number of turns is further decreased by exchanging specific nodes among subareas.
The paper is organized as follows. Section 2 reviews the related work on the mCPP problem. Section 3 gives the preparatory work and definitions for the mCPP problem. In Section 4, the improved ACO algorithm is proposed to construct the best ST in a given area, taking into account the number of turns. In Section 5, "ideal-shaped" subareas are obtained, with a comprehensive discussion on exchanging cells. The simulation results and comparison studies are presented in Section 6. Finally, Section 7 concludes the paper together with an outlook on future work.

mCPP Problem in Given Area.
The mCPP problem has been addressed extensively in previous investigations. The authors in [8] reviewed and evaluated several different methods, such as cellular methods, grid-based methods, and graph coverage, for known or unknown areas. This subsection presents the typical and dominant achievements related to our problem.
Exact cellular decomposition methods are representative of the classical and popular methods. Typically, such a method first decomposes the known area into subregions; the robot then sweeps the subregions using simple motions. Trapezoidal decomposition can produce coverage paths only for polygonal spaces, and because it creates only convex cells, it produces many cells and hence more sweeping paths [9]. Boustrophedon decomposition, proposed by Choset [10] and Pignon [11] in an attempt to overcome this limitation, yields shorter coverage paths in the same class of situations. Moreover, an optimal sweep direction, aimed at minimizing the number of turns, yields fewer lanes, as shown in [12, 13]. Furthermore, Morse decomposition allows spiral patterns, which simplify the flyable path and minimize the time [14]. However, it is not flexible enough to handle rectilinear environments, and the path may retrace itself. Overall, the exact cellular decomposition methods do not consider the robots' initial positions, and there may be additional trajectories or backtracks among subregions.
In the relatively young research field of the mCPP problem, spanning tree coverage (STC), first applied to multiple robots by Hazon et al. [15], is one of the dominant approaches. Their algorithm, called MSTC, guarantees that every cell of interest is visited only once. Unfortunately, the path of each robot depends on its initial position, and in the worst case the path of one robot may cover the whole given area. Later, the same authors, trying to alleviate this shortcoming, improved MSTC to OPT-MSTC [16] by restraining the path distance between adjacent robots. However, there is no guarantee regarding the initial positions. In [17], an alternative ST method, which restrains the number of cells in the known area and provides an upper bound on the algorithm, was proposed to control each robot's maximum path length. It performs better than MSTC and OPT-MSTC, though the nonbacktracking constraint may be broken. Worse still, all these methods disregard the number of turns or simple motions, which may result in more energy/time consumption.

Area Decomposition.
In [18, 19] a convex polygon decomposition problem is presented. The anchored area is divided into several convex polygon pieces by a sweep-line approach. As a result, each subarea contains its robot's initial position on its boundary and has a specified area related to that robot's capabilities. However, the approach performs well only under unrealistic conditions, such as a convex polygon without obstacles and initial positions on the boundary.
For concave polygons, the algorithm described in [13] first divides the concave polygon into separate convex polygons, following the principle that the decomposition line is parallel to one edge of the polygon. Then, subregions that are entirely adjacent or whose widths have the same direction are combined to avoid unnecessary back-and-forth motion. Unfortunately, this technique again considers neither the obstacles nor the initial positions, and there may be many repeated or additional paths when robots traverse to the next subregion.
Many state-of-the-art approaches that seem suitable for the area division problem rely on Lloyd's algorithm [20], Voronoi partitioning [21], K-means [22], heuristic algorithms [23], or alternate-offer protocols [24]. Although these approaches have their own characteristics, applying them directly to the mCPP problem commonly yields suboptimal results or additional paths.
The authors in [7] made a major contribution toward fully exploiting the robots' capabilities with a grid-based algorithm referred to as DARP (Divide Areas based on Robots Initial Positions). The proposed algorithm, which performs an area subdivision related to the robots and defines paths in a distributed manner, guarantees completeness (it covers the whole area), the initial positions (each subarea includes its robot's initial position, so the robots can start at once), a nonbacktracking path (the robots visit each cell only once), and the minimum coverage path. However, the generated path does not consider the number of robots' turns, which makes the paths harder for the robots to execute and increases energy/time consumption.
According to the above analysis, there is room for contributions that decrease the number of turns and the time consumption. To meet these requirements, this work improves the DARP algorithm by exchanging some cells among adjacent subareas. This simplifies the path and reduces the number of turns, so that the merits of DARP are inherited and the implementation efficiency of the robots is improved.

Basic Definitions and Assumptions
In coverage path planning, the primary task is to organize the area of interest so that suitable approaches, matched to the capabilities of the robots, can be chosen.
The camera or sensor footprint of a robot is usually a trapezium. For simplification, the speed and the altitude of the robot relative to ground level are considered fixed, and the projection of the robot's location is at the center of the footprint, which is assumed to be a square of side $D$.
For ease of understanding, the whole area of interest ($U$), including obstacles ($O$), is discretized into equal cells according to $D$. Furthermore, the area of interest to be covered is assumed to be a rectangle in $(x, y)$-coordinates. Then the set of cells to be covered ($L$) is represented as
$$L = \{(x, y) : x \in [1, rows],\ y \in [1, cols],\ (x, y) \notin O\},$$
where $rows$ and $cols$ are the numbers of rows and columns of cells, and the number of all cells is $n_a = rows \times cols$. The number of cells to be covered is $n = n_a - n_0$, where $n_0$ represents the number of obstacle cells.
Definition 1. Let $d_{ij}$ denote the distance between two cells $(x_i, y_i)$ and $(x_j, y_j)$. If $d_{ij} \le D$, the two cells are adjacent; thus the robot can travel from one to the other at each timestamp.
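The discretization above can be illustrated with a short Python sketch. It is only a minimal example under stated assumptions: cells are 1-based `(x, y)` index pairs as in the text, adjacency is taken to mean 4-neighbour adjacency (one step along a single axis), and the names `free_cells`, `adjacent`, and `neighbours` are hypothetical helpers, not part of the original method.

```python
from typing import List, Set, Tuple

Cell = Tuple[int, int]  # (x, y) cell indices, 1-based as in the text


def free_cells(rows: int, cols: int, obstacles: Set[Cell]) -> Set[Cell]:
    """Cells to be covered: every grid cell that is not an obstacle."""
    return {
        (x, y)
        for x in range(1, rows + 1)
        for y in range(1, cols + 1)
        if (x, y) not in obstacles
    }


def adjacent(a: Cell, b: Cell) -> bool:
    """Assumed 4-neighbour adjacency: exactly one step along one axis."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) == 1


def neighbours(cell: Cell, cells: Set[Cell]) -> List[Cell]:
    """Free cells adjacent to `cell` (order: up, down, left, right)."""
    x, y = cell
    candidates = ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1))
    return [c for c in candidates if c in cells]


# Example: a 4 x 5 grid with two obstacle cells.
obstacles = {(2, 2), (3, 4)}
cells = free_cells(4, 5, obstacles)
n_all = 4 * 5                    # number of all cells
n_cover = n_all - len(obstacles)  # number of cells to be covered
```

In this example `len(cells)` equals `n_cover`, mirroring the relation between the number of all cells and the number of obstacle cells stated above.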

Single Robot Coverage Path Planning
To solve the mCPP problem, we first consider covering the area of interest with only one robot. According to the above analysis, the single-robot coverage path planning problem can be transformed into calculating the minimum-length complete coverage path $P$ with the minimum number of turns:
$$\min \|P\| \quad \text{s.t.} \quad P \supseteq L \qquad (6)$$
where $\|P\|$ denotes the length of path $P$.

STC Algorithm.
The STC algorithm imposes a grid of cell size $2D$ on the given area of interest, where $D$ is the sensor size. For the resulting grid, any effective procedure can be used to construct an ST, starting from any initial position. Then, a closed complete coverage path that circumnavigates the ST edges is generated. Figures 1(a)-1(d) illustrate the basic steps to generate a closed coverage path.
The ST for a given work area varies with the construction method and the initial position; thus the closed complete coverage path may differ. Figures 1(d) and 1(f) present two different STs for the same work area. The path in Figure 1(f) has more turns, which requires more energy/time from the robots. Frequent changes of direction also work against simple motions. To reduce the number of turns, a better method for constructing the ST should be selected.
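As a rough illustration of the first STC steps, the following Python sketch groups cells into 2 x 2 blocks (the nodes of the coarse grid) and builds a spanning tree over them by depth-first search. Several details are assumptions made only for this example: cells are 0-indexed, a block is dropped as soon as it contains any obstacle cell, and the helper names `build_nodes` and `spanning_tree` are hypothetical. Any other tree construction could replace the depth-first search.

```python
from typing import List, Set, Tuple

Node = Tuple[int, int]  # (row, col) of a 2 x 2 block in the coarse grid


def build_nodes(rows: int, cols: int, obstacles: Set[Tuple[int, int]]) -> Set[Node]:
    """Group 0-indexed cells into 2 x 2 blocks; keep blocks free of obstacles."""
    nodes: Set[Node] = set()
    for r in range(0, rows - 1, 2):
        for c in range(0, cols - 1, 2):
            block = {(r, c), (r, c + 1), (r + 1, c), (r + 1, c + 1)}
            if not (block & obstacles):
                nodes.add((r // 2, c // 2))
    return nodes


def spanning_tree(nodes: Set[Node], root: Node) -> List[Tuple[Node, Node]]:
    """Iterative depth-first search returning the tree edges (parent, child)."""
    edges: List[Tuple[Node, Node]] = []
    visited = {root}
    stack = [root]
    while stack:
        r, c = stack[-1]
        for nxt in ((r - 1, c), (r, c + 1), (r + 1, c), (r, c - 1)):
            if nxt in nodes and nxt not in visited:
                visited.add(nxt)
                edges.append(((r, c), nxt))
                stack.append(nxt)
                break
        else:
            stack.pop()  # no unvisited neighbour: step back along the tree
    return edges


# Example: a 4 x 6 cell grid with one obstacle cell gives five nodes.
nodes = build_nodes(4, 6, {(0, 0)})
tree = spanning_tree(nodes, root=(1, 0))
```

For a connected set of nodes the tree has one edge fewer than the number of nodes; the coverage path is then obtained by circumnavigating these edges.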
Spanning trees are most often constructed with Kruskal's algorithm, Prim's algorithm, depth-first search, or breadth-first search. In the present formulation, the nodes are at the centers of groups of four cells, so the distances between adjacent nodes are all equal. All of these algorithms can generate a complete coverage ST for a given area, but its shape is essentially arbitrary and depends strongly on the robot's initial position. Regarding the number of turns, the path resulting from depth-first search is more likely to have the fewest turns compared with the others, but it still carries no optimality guarantee. For this reason, an improved ACO algorithm is proposed to construct a complete coverage ST with the minimum number of turns, so that we can benefit from the advantages of the ACO algorithm.

Improved ACO Algorithm for ST.
The ant colony optimization (ACO) algorithm, first proposed by Marco Dorigo [25], is inspired by the real-life behavior of a colony of ants seeking an optimal path between their colony and food. In their search, ants roam randomly, and once they find food, they mark the trails by laying pheromone, the amount of which influences whether other ants keep travelling randomly or follow the trails. The more pheromone a path has, the more likely other ants are to follow the trail and consequently lay more pheromone. Over time, however, the attraction of a trail decreases as the pheromone evaporates. As a result, the shorter trail, which costs less time, accumulates more pheromone and is finally reinforced. In addition, pheromone evaporation and random travelling help to avoid locally optimal solutions. In this way, the optimal solution can be found by exploring only a small part of the solution space.
To obtain an ST with the minimum number of turns, the improved ACO algorithm is proposed; its detailed procedures are as follows.
Under the pseudorandom transition rule, the next node $v_j$ is randomly selected according to a transition probability built from the pheromone $\tau_{ij}$, the heuristic information $\eta_{ij} = 1/d_{ij}$, and a direction parameter that tunes the importance of direction: the parameter equals 1 if nodes $v_{i-1}$, $v_i$, and $v_j$ are in a line and 0.5 otherwise. Thus the ants are guided first to the node that is in line with nodes $v_{i-1}$ and $v_i$. Consider the set of unvisited adjacent nodes when ant $k$ is at node $v_i$. If this set is nonempty and nodes $v_{i-1}$, $v_i$, and $v_j$ are not in a line, the number of turns is increased by one and the length of backtracks is reset to zero. If the set is empty, there is no unvisited adjacent node for node $v_i$. To improve efficiency, the distance between nodes $v_i$ and $v_j$ can be increased temporarily to discourage unwise transitions. At the same time, the length of backtracks and the backtrack number are each increased by one; the ant then steps back along its own path and keeps searching until it reaches the next unvisited node, increasing the backtrack number by one. Finally, the above procedure is repeated until all nodes are visited.
(iii) Pheromone Actualization. When all ants have finished a circuit, the pheromone information is updated according to the optimal path found in the current generation. Taking evaporation into account, the update combines the evaporation of the existing pheromone at rate $\rho$ with a deposit on the optimal path. The deposit is determined by the set of nodes included in the optimal path and by the number of turns of the optimal path found by the ants $k$ ($k = 1, 2, \ldots, m$), which is calculated from the number of turns and the global backtracks.
(iv) Stopping Rules. The algorithm stops when there is no further improvement in the solution or when the preset number of iterations is reached, and the optimal path is then output.
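The transition and pheromone steps can be sketched in Python as follows. This is not the exact formulation of the paper, whose equations are not reproduced here; it is a minimal sketch that assumes the standard ACO form, in which the selection weight is proportional to $\tau^{\alpha}\eta^{\beta}$ multiplied by the direction factor (1 in line, 0.5 otherwise), and it assumes a deposit of `1 / best_turns` on the edges of the best tree. The function names and the dictionary `tau` keyed by directed edges are hypothetical.

```python
import random
from typing import Dict, List, Optional, Sequence, Tuple

Node = Tuple[int, int]
Edge = Tuple[Node, Node]


def in_line(prev: Optional[Node], cur: Node, nxt: Node) -> bool:
    """True if moving cur -> nxt keeps the heading of prev -> cur."""
    if prev is None:  # first move: no heading yet, so no turn
        return True
    return (cur[0] - prev[0], cur[1] - prev[1]) == (nxt[0] - cur[0], nxt[1] - cur[1])


def transition_probs(prev: Optional[Node], cur: Node, candidates: Sequence[Node],
                     tau: Dict[Edge, float], alpha: float = 1.0,
                     beta: float = 1.0, dist: float = 1.0) -> List[float]:
    """Assumed rule: weight = tau^alpha * eta^beta * direction factor."""
    eta = 1.0 / dist  # heuristic information, eta = 1/d
    weights = []
    for nxt in candidates:
        direction = 1.0 if in_line(prev, cur, nxt) else 0.5
        weights.append((tau[(cur, nxt)] ** alpha) * (eta ** beta) * direction)
    total = sum(weights)
    return [w / total for w in weights]


def choose_next(prev: Optional[Node], cur: Node, candidates: Sequence[Node],
                tau: Dict[Edge, float], rng: random.Random) -> Node:
    """Draw the next node at random according to the transition probabilities."""
    probs = transition_probs(prev, cur, candidates, tau)
    return rng.choices(list(candidates), weights=probs, k=1)[0]


def update_pheromone(tau: Dict[Edge, float], best_edges: Sequence[Edge],
                     best_turns: int, rho: float = 0.15) -> None:
    """Evaporate everywhere, then deposit on the best tree (assumed 1/turns)."""
    for edge in tau:
        tau[edge] *= 1.0 - rho
    for edge in best_edges:
        tau[edge] += 1.0 / max(best_turns, 1)
```

With equal pheromone on both candidate edges, the in-line candidate is twice as likely to be chosen as the turning one, which is the bias toward straight segments that the direction parameter is meant to introduce.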

Performance Discussion.
This subsection presents the basic steps to generate a closed coverage path. A comparison with the depth-first search algorithm is also made to confirm the performance of the improved ACO algorithm. First, a given terrain including obstacles is discretized as in Figure 1(a). Second, every four cells are grouped into a large square-shaped cell, and the center of the large cell is defined as a node. Third, the ST is constructed by the improved ACO algorithm (the parameters are set as follows: $m = 30$, $N = 150$, $\alpha = 1$, $\beta = 1$, $\rho = 0.15$, and $\tau_{ij}(0) = 1$). The depth-first search algorithm is also applied for comparison. Last, a closed complete coverage path that circumnavigates the ST edges is generated.
The coverage path in Figure 1(d) contains 52 90-degree turns (one 180-degree turn is counted as two 90-degree turns), which is fewer than the 64 turns of the coverage path in Figure 1(f). The reason is that the ST resulting from the improved ACO algorithm is always aimed at minimizing the number of turns, whereas the ST resulting from the depth-first search algorithm tends to be arbitrary and depends on the robot's initial position. The path resulting from the improved ACO algorithm is therefore more likely to have fewer turns and hence lower energy/time consumption. Thus, the improved ACO algorithm is more suitable for constructing an ST when the number of turns is taken into account.
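The turn count used in this comparison can be computed mechanically from a cell path. The Python sketch below is an illustrative helper (the name `count_turns` and the path representation as a list of grid cells are assumptions); it follows the convention stated above that a 180-degree turn counts as two 90-degree turns.

```python
from typing import List, Sequence, Tuple

Cell = Tuple[int, int]


def count_turns(path: Sequence[Cell], closed: bool = True) -> int:
    """Count 90-degree turns along a path of grid cells.

    A 180-degree reversal is counted as two 90-degree turns.
    For a closed path the step from the last cell back to the first is included.
    """
    pts: List[Cell] = list(path)
    n = len(pts)
    if n < 3:
        return 0
    last = n if closed else n - 1
    steps = [
        (pts[(i + 1) % n][0] - pts[i][0], pts[(i + 1) % n][1] - pts[i][1])
        for i in range(last)
    ]
    following = steps[1:] + steps[:1] if closed else steps[1:]
    turns = 0
    for a, b in zip(steps, following):
        if a == b:
            continue  # heading unchanged
        reversal = a[0] == -b[0] and a[1] == -b[1]
        turns += 2 if reversal else 1
    return turns


# A closed loop around a 2 x 3 block of cells has four turns.
loop = [(0, 0), (0, 1), (0, 2), (1, 2), (1, 1), (1, 0)]
loop_turns = count_turns(loop)
```

Applied to the closed coverage paths generated around two different STs of the same area, such a counter gives the quantity compared in Figures 1(d) and 1(f).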

Multirobot Coverage Path Planning
Similar to the single-robot CPP problem, the mCPP problem can be defined as calculating the paths $P_i$ of minimum length such that
$$\min \max_{i} \|P_i\|$$
where $\|P_i\|$ denotes the length of path $P_i$.

DARP Algorithm.
The DARP (Divide Areas based on Robots Initial Positions) algorithm, proposed in [7], has two phases: dividing the available cells into as many distinct subareas as there are robots, and finding the optimal path for each subarea. The latter is handled by the approaches for the single-robot CPP problem. In other words, only the problem of dividing the available cells, without any concern about the optimal paths, has to be addressed. Thus, the optimal solution for the mCPP problem can be defined as
$$\min \max_{i \in \{1, \ldots, n_r\}} |L_i|$$
subject to
$$L_1 \cup L_2 \cup \cdots \cup L_{n_r} = L,$$
$$L_i \cap L_j = \emptyset, \quad \forall i \ne j,$$
$$|L_1| \approx |L_2| \approx \cdots \approx |L_{n_r}|,$$
$$\chi_i(t_0) \in L_i, \quad \forall i \in \{1, \ldots, n_r\},$$
$$L_i \text{ is connected}, \quad \forall i \in \{1, \ldots, n_r\},$$
where $L_i$ denotes the set of cells of the $i$th subarea (not the strict path), $n_r$ denotes the number of robots, and $\chi_i(t_0)$ represents the initial position of the $i$th robot. These constraints ensure, respectively, complete coverage, a nonbacktracking path, an absolutely fair division, subareas that include the related robots' initial positions, and continuous subareas.
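The five constraints can be checked for any candidate division with a few lines of Python. The sketch below is illustrative only: the name `check_division`, the 4-neighbour notion of connectivity, and the tolerance of one cell for the fair-division condition are assumptions introduced here.

```python
from collections import deque
from typing import Dict, Sequence, Set, Tuple

Cell = Tuple[int, int]


def is_connected(cells: Set[Cell]) -> bool:
    """Breadth-first search over 4-neighbours; empty sets count as disconnected."""
    if not cells:
        return False
    start = next(iter(cells))
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for nb in ((x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)):
            if nb in cells and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return len(seen) == len(cells)


def check_division(free: Set[Cell], subareas: Sequence[Set[Cell]],
                   starts: Sequence[Cell], tolerance: int = 1) -> Dict[str, bool]:
    """Evaluate the five division constraints for a candidate set of subareas."""
    union = set().union(*subareas)
    sizes = [len(s) for s in subareas]
    return {
        "complete": union == free,                      # union covers L
        "disjoint": sum(sizes) == len(union),            # no cell assigned twice
        "fair": max(sizes) - min(sizes) <= tolerance,    # roughly equal sizes
        "contains_start": all(p in s for p, s in zip(starts, subareas)),
        "connected": all(is_connected(s) for s in subareas),
    }


# Example: a 2 x 4 grid split into a left and a right half.
free = {(r, c) for r in range(2) for c in range(4)}
left = {(r, c) for r in range(2) for c in range(2)}
right = free - left
report = check_division(free, [left, right], starts=[(0, 0), (1, 3)])
```

A division that satisfies all five entries of the report reduces the mCPP problem to independent single-robot problems, one per subarea.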
To achieve the area subdivision, two requirements must be met.
(i) Equally Divide the Space. Each robot's subarea $L_i$ is computed from the assignment matrix $A$, which is constructed according to
$$A(x, y) = \arg\min_{i \in \{1, \ldots, n_r\}} E_i(x, y), \qquad E_i \leftarrow m_i E_i, \qquad m_i \leftarrow m_i + c\,(k_i - f),$$
where $E_i$ represents the evaluation matrix, which expresses the distance between the cells of $L$ and the $i$th robot's initial position, $m_i$ is a scalar correction factor, $c$ denotes a positive tunable parameter, $k_i$ is the cardinality of the set $L_i$, and $f$ denotes the global "fair share," which is computed as the number of available cells divided by the number of robots.

(ii) Build Spatial Connected Areas. The division described above cannot by itself guarantee the continuity of each subarea. To deal with such situations, the evaluation matrices are finally updated as
$$E_i \leftarrow C_i \otimes (m_i E_i),$$
where $\otimes$ denotes element-wise multiplication, $R_i$ denotes the set of cells connected with the initial position, $Q_i$ denotes the set of cells assigned to the robot but unconnected with the initial position, and $C_i$ is a parameter matrix, built from $R_i$ and $Q_i$, that gradually shapes a closed subarea for each robot. If the subarea of every robot is continuous, $C_i$ is set to the all-ones matrix. In a nutshell, the division is obtained by modifying the related evaluation matrices iteratively.
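The iterative character of step (i) can be sketched in Python as follows. This is a deliberately simplified illustration, not the DARP implementation of [7]: it assumes Euclidean distance for the evaluation matrices, accumulates the correction factor on top of the fixed distances, omits the connectivity term of step (ii) altogether, and clamps the correction factor to stay positive. The name `darp_divide` and its parameters are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Set, Tuple

Cell = Tuple[int, int]


def darp_divide(cells: Set[Cell], starts: Sequence[Cell], c: float = 0.01,
                max_iter: int = 500, tol: int = 1) -> Tuple[Dict[Cell, int], List[int]]:
    """Simplified fair division: assign each cell to the robot with the
    smallest corrected distance, then adjust the correction factors."""
    n_r = len(starts)
    fair = len(cells) / n_r  # global "fair share" f
    # Base evaluation matrices: distance from each cell to each start.
    base = [{cell: math.dist(cell, s) for cell in cells} for s in starts]
    m = [1.0] * n_r          # scalar correction factors m_i
    assign: Dict[Cell, int] = {}
    counts: List[int] = [0] * n_r
    for _ in range(max_iter):
        # Assignment matrix: argmin over robots of the corrected evaluation.
        assign = {
            cell: min(range(n_r), key=lambda i: m[i] * base[i][cell])
            for cell in cells
        }
        counts = [0] * n_r
        for owner in assign.values():
            counts[owner] += 1
        if max(counts) - min(counts) <= tol:
            break  # division is fair within the tolerance
        # Robots holding more than the fair share get a larger factor.
        m = [max(1e-6, m[i] + c * (counts[i] - fair)) for i in range(n_r)]
    return assign, counts


# Example: a 2 x 4 grid with robots in opposite corners of the top row.
grid = {(r, col) for r in range(2) for col in range(4)}
assignment, sizes = darp_divide(grid, starts=[(0, 0), (0, 3)])
```

Because the connectivity term is left out, this sketch may return disconnected subareas for unfavourable initial positions, which is exactly the situation step (ii) is designed to repair.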

Exchange Nodes.
The DARP algorithm divides the area of interest equally, guarantees the continuity of each subarea, and aims to provide the optimal assignment of cells; however, the shape of each subarea is arbitrary, because neither simple motion nor the number of turns is taken into consideration. As shown in Figure 2, the areas have the same number of cells but different shapes, resulting in different coverage paths with different numbers of turns.
To deal with such situations, a method that exchanges nodes among different subareas is introduced to minimize the number of turns in each subarea. With this method, the number of turns in each subarea should satisfy a condition relating $T_i$, the number of turns in the $i$th subarea, to $T_0$, the number of turns before exchanging nodes. To apply this method, two things must be settled: which nodes are exchangeable and how to exchange them. A detailed analysis follows.
(i) Exchangeable Nodes. The DARP algorithm yields $n_r$ subareas. To obtain the exchangeable nodes, an ST is first constructed for each subarea, from which the number of turns at each node can be determined.
Definition 3. In the ST, a node that is connected with only one adjacent node is called an end node.
Considering the continuity of the ST in a subarea, only an end node can be exchanged (given to the adjacent subarea). As shown in Figure 3, the end node is connected with only one other node. The end nodes can be divided into three types:
(a) I-shaped end node: there are two turns, and when this node is discarded, there are still two turns.
(b) L-shaped end node: there are four turns, but when this node is discarded, they decrease to two turns.
(c) T-shaped end node: there are four turns, but when this node is discarded, they decrease to none.
(ii) Exchange Nodes. With the above concepts in mind, the following rules must be obeyed so that the number of turns decreases when nodes are exchanged.
(a) Each exchange means that one subarea discards an end node and another subarea receives this node.
(b) When discarding an end node, a T-shaped end node has top priority, followed by an L-shaped end node. The discarded node, received by an adjacent subarea, is connected to the new ST and forms an end node of some shape: an I-shaped end node is the best outcome, an L-shaped end node is next, and a T-shaped end node is the worst choice for decreasing the number of turns. Then this adjacent subarea discards an end node to another subarea, and so on. After a circuit, the number of nodes belonging to each subarea is still about $n/n_r$; if the number of turns cannot meet (14), the current exchange is discarded and a new circuit is started.
(c) All exchangeable nodes in one subarea undergo an exchange before the procedure moves on to another subarea.
(d) The exchange stops when the total number of turns in the given area of interest no longer decreases or the maximum number of iterations is reached; the optimal path is then output.
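The identification and classification of end nodes can be sketched in Python. The geometric reading used here is an assumption made for illustration: an end node is classified by how many neighbours of its single parent lie perpendicular to the edge joining them (none gives I-shaped, one gives L-shaped, two gives T-shaped), which matches the turn reductions of zero, two, and four listed above. The names `build_tree`, `end_nodes`, and `classify_end_node` are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple

Node = Tuple[int, int]
Tree = Dict[Node, Set[Node]]


def build_tree(edges: Sequence[Tuple[Node, Node]]) -> Tree:
    """Undirected adjacency map of a spanning tree given its edges."""
    tree: Tree = {}
    for a, b in edges:
        tree.setdefault(a, set()).add(b)
        tree.setdefault(b, set()).add(a)
    return tree


def end_nodes(tree: Tree) -> List[Node]:
    """End nodes are the nodes connected with exactly one adjacent node."""
    return sorted(v for v, nbrs in tree.items() if len(nbrs) == 1)


def classify_end_node(tree: Tree, v: Node) -> str:
    """Return 'I', 'L', or 'T' for end node v (assumed geometric reading)."""
    (u,) = tree[v]                       # the single node v is attached to
    dr, dc = u[0] - v[0], u[1] - v[1]    # unit step from v to u
    # The two grid positions beside u, perpendicular to the edge v-u.
    sides = [(u[0] + dc, u[1] + dr), (u[0] - dc, u[1] - dr)]
    n_sides = sum(1 for s in sides if s in tree[u])
    return {0: "I", 1: "L", 2: "T"}[n_sides]


# Example: a straight line of three nodes with one node attached to its middle.
tree = build_tree([((0, 0), (0, 1)), ((0, 1), (0, 2)), ((0, 1), (1, 1))])
shapes = {v: classify_end_node(tree, v) for v in end_nodes(tree)}
```

Following rule (b), a subarea would offer its T-shaped end nodes first and its L-shaped end nodes next, while the receiving subarea would prefer to attach the node so that it becomes I-shaped.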

Overview of the Complete Algorithms.
Based on the above two subsections, the complete algorithms for the mCPP problem are summarized as follows. The algorithms in the present paper include three phases. In the first phase, the area of interest is equally divided into $n_r$ separate subareas by the DARP algorithm, so that the mCPP problem is reduced to $n_r$ single-robot CPP problems. Next, an ST is constructed for each subarea with the improved ACO algorithm, to determine the number of turns and the exchangeable nodes. After these two phases, complete coverage, a nonbacktracking path, an absolutely fair division, subareas that include the related robots' initial positions, and continuous subareas are all satisfied. In the last phase, the number of turns is decreased by exchanging specific nodes among subareas.
A flowchart of the proposed algorithm is presented in Figure 4.

Memory and Computational Complexity Analysis.
As described in the previous subsection, the complete algorithms include three phases with separate calculations, so the computational and memory complexity can be analyzed in three parts.
In the first phase, the memory complexity of the DARP algorithm is $O(n_r \times n)$. The maximum number of iterations (in the worst case) depends on the number of robots, the random initial deployments, and the grid size, which makes it practically infeasible to compute exhaustively. A series of simulations is therefore used to approximate the computational complexity; the resulting curve is found to be strictly bounded under the $n_r^3 \times n^2$ curve up to inputs $(n_r \times n)$ of practical interest.
In the second phase, the improved ACO algorithm constructs STs for all subareas; the number of subareas equals the number of robots $n_r$, and the grid size of each is about $n/n_r$. The memory and computational complexity are $O((n/n_r)^2 \times n_r)$ and $O((n/n_r)^4 \times n_r)$, i.e., $O(n^2)$ and $O(n^4)$, respectively.
In the last phase, the memory complexity is linear in the size of the input $(n_r \times n)$, i.e., $O(n_r \times n)$. The computational complexity of exchanging nodes depends on the number of subareas $n_r$ and the number of exchanges among them. There are at most $n/n_r$ exchangeable nodes in a subarea; thus $a \times n/n_r$ circuits (related to the optimal solution) are needed to complete exchanging these nodes, where $a$ is a constant. In the worst case, each circuit needs to exchange nodes $n_r$ times, and the computational complexity of this phase follows from these two factors. Since the number of robots $n_r$ and the grid size $n$ satisfy $1 \le n_r < n$, the memory and computational complexity of the complete algorithm are obtained by combining the three phases. In conclusion, the complete algorithms are approximately polynomial for inputs of practical size, but may lose their polynomial behavior for very large inputs $(n_r \times n)$.

Simulation Results
This section presents a simulation of the proposed algorithms. The simulation setup is the same as in [7] so that the results are comparable. More precisely,
(i) the size of the area of interest is [rows, cols] = 24 × 24;
(ii) the obstacle arrangement follows a random uniform distribution;
(iii) there are 9 robots in this area, and their initial positions are as shown in Figure 5(a).
Figure 5 presents a comparison between the proposed algorithms and the DARP + STC algorithms. Figure 5(a) illustrates the initialization. Figure 5(b) shows the terrain divided into separate subareas by the DARP algorithm (the numbers of gathered cells are [13 13 13 13 13 13 14 13 14]), each of which contains the related robot's initial position. Then, in Figure 5(c), the STs are constructed by the algorithm in [7] (Kruskal's or Prim's algorithm), which results in an arbitrary tree that does not take the number of turns into account. The total number of turns in the paths is 174 ([16 18 18 18 20 22 20 22 20]). In Figure 5(d), the improved ACO algorithm, which takes the number of turns into consideration, is adopted to construct the STs. The total number of turns in the paths is 164 ([16 18 18 16 18 18 20 20 20]), which is less than that in Figure 5(c). To further decrease the number of turns, some specific end nodes are exchanged among subareas, as shown in Figure 5(e). The total number of turns in the paths is 148 ([14 18 12 14 18 18 18 16 20]), where the number of turns in each subarea and in the whole terrain is the lowest of the three cases. Finally, the optimal paths around the STs are shown in Figure 5(f).
In [7], the DARP + STC algorithms are compared with the MFC and optimized MSTC algorithms (the state-of-the-art methods) in terms of cover time (path length) for all robots, which shows that the DARP + STC algorithms have an optimality guarantee. In the present paper, the proposed algorithms do not increase the number of cells of any subarea (that is, they do not increase the path length), and at the same time they decrease the number of turns. When covering a given cell, the energy/time consumed while moving in a straight line is less than that consumed while turning, so the proposed algorithms preserve this optimality while further reducing the consumption caused by turns.
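The reported totals can be verified directly from the per-subarea values listed above. The short Python check below only restates numbers given in this section; the dictionary labels and the percentage reductions relative to the DARP + STC baseline are added here for illustration.

```python
# Per-subarea values as reported for Figures 5(b)-5(e).
gathered_cells = [13, 13, 13, 13, 13, 13, 14, 13, 14]
turns = {
    "DARP + STC": [16, 18, 18, 18, 20, 22, 20, 22, 20],
    "DARP + improved ACO": [16, 18, 18, 16, 18, 18, 20, 20, 20],
    "after exchanging end nodes": [14, 18, 12, 14, 18, 18, 18, 16, 20],
}

# Total number of turns for each configuration.
totals = {name: sum(values) for name, values in turns.items()}

# Relative reduction (in percent) with respect to the DARP + STC baseline.
baseline = totals["DARP + STC"]
reduction = {
    name: round(100.0 * (baseline - total) / baseline, 1)
    for name, total in totals.items()
}

for name, total in totals.items():
    print(f"{name}: {total} turns ({reduction[name]}% fewer than baseline)")
```

The sums reproduce the totals of 174, 164, and 148 turns, corresponding to reductions of roughly 5.7% and 14.9% relative to the DARP + STC baseline.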

Conclusions
The proposed algorithms aim to find the optimal solution for the multirobot coverage path planning problem. In the preliminary analysis, the basic definitions and preparations are made so that the optimal performance can be achieved easily. In the first phase of the proposed algorithms, the area of interest is equally divided into separate subareas by the DARP algorithm, which takes the obstacles and the initial positions into consideration. Then, STs are constructed for the subareas by the improved ACO algorithm, which avoids paths with many turns. On the basis of these STs, some end nodes are exchanged to reshape the STs toward the "ideal shape," so as to further decrease the number of turns. With these STs, the coverage paths can be calculated. It is worth pointing out that the robots can start from their initial positions, and that the overall paths are not only complete, nonbacktracking, minimum coverage paths but also contain fewer turns. To our knowledge, these features have not appeared together in other methods.
The proposed method tries to find an optimal solution, but the number of turns in the paths is still restricted by the basic shapes of the subareas. Thus, future work could explore a succinct method that directly divides the terrain into ideal-shaped subareas, or other methods that achieve all the aforementioned features.

Figure 1: Comparison of two ST algorithms.
(i) Initialization. The parameters are initialized as follows:
$m$: quantity of the ants
$k$: index of the ants
$N$: maximum iteration
$t$: index of iteration number
$\alpha$: influence of the pheromone value
$\beta$: influence of heuristic information
$\tau_{ij}$: pheromone information between nodes $v_i$ and $v_j$
$\eta_{ij}$: heuristic information correlated with the distance between nodes $v_i$ and $v_j$
$\rho$: evaporation rate of pheromone
All the ants are randomly placed on the nodes.
(ii) Transition Rules. When the current node is $v_i$ and the number of iterations is $t$, ant $k$ ($k = 1, 2, \ldots, m$) selects the next node $v_j$ following the pseudorandom rule.

Figure 2: All are 32 cells but of different shapes.

Figure 5: (a) Initial cells' discretization, nodes, obstacles, and robots' positions. (b) Division of the terrain into subareas by the DARP algorithm. (c) STs constructed for the subareas by Kruskal's or Prim's algorithm. (d) STs constructed for the subareas by the improved ACO algorithm. (e) Exchange of end nodes among subareas. (f) Final paths around the STs.