Collaborative Pursuit-Evasion Strategy of UAV/UGV Heterogeneous System in Complex Three-Dimensional Polygonal Environment

The UAV/UGV heterogeneous system combines the air superiority of the UAV (unmanned aerial vehicle) with the ground superiority of the UGV (unmanned ground vehicle). The system can complete a series of complex tasks, one of which is pursuit-evasion decision making, so a collaborative strategy for the UAV/UGV heterogeneous system is proposed for a pursuit-evasion game in a complex three-dimensional (3D) polygonal environment that is large but bounded. Firstly, the system and task hypotheses are introduced. Then, an improved boundary value problem (BVP) is used to unify the terrain data of decision making and path planning. Under the condition that the evader knows the position of the collaborative pursuers at any time but the pursuers only have a line-of-sight view, a worst case is analyzed and the strategies of the evader and the pursuers are studied. According to the state of the evader, the strategy of the collaborative pursuers is discussed in three situations: the evader is in the visual field of the pursuers, the evader has just disappeared from the visual field of the pursuers, and the position of the evader is completely unknown to the pursuers. The simulation results show that the strategy does not guarantee that the pursuers will win the game in a complex 3D polygonal environment, but it is optimal in the worst case.


Introduction
Artificial intelligence (AI) is a frontier field that many researchers worldwide are competing to explore. AI operates at the level of decision making, which is higher than that of automatic control, and it gives an automatic system greater autonomy and intelligence. Improving adversarial ability is an effective way to increase the intelligence of a robot, as AlphaGo has shown, so many studies on game-theoretic decision making have been carried out [1].
UAV has high flexibility and its size is small, so it is popular as a civil and military agent. Most decision making problems of UAV are mainly concentrated in mission planning layer, including formation [2], cluster [3], and task assignment [4]. However, many intelligent algorithms are difficult to find suitable application or background. One of the reasons may be the fact that the studies are too abstract and idealized.
On the other hand, the UAV/UGV heterogeneous system is a new type of multiagent collaborative system that can realize complex collaborative tasks based on its powerful situation awareness, and many studies have achieved valuable results. Khaleghi et al. [5] proposed a dynamic data driven adaptive multiscale simulation (DDDAMS) for efficient surveillance and crowd control via UAVs and UGVs. The hardware-in-the-loop (HIL) real-time simulation demonstrates the effectiveness of DDDAMS. Shao et al. [6] designed a cooperative USV-UAV platform in which a UAV lands on a USV (Unmanned Surface Vehicle) by means of a hierarchical landing guide point generation algorithm. A heterogeneous system has many advantages, but how can it exhibit its intelligence in a game?
Pursuit-evasion is a typical adversarial game, and differential strategy is one of the early solutions. Classical differential game is based on noncooperative equilibrium [7] and participants move according to the equations described by Hamilton-Jacobi-Isaacs (HJI). Awheda and Schwartz [8] proposed a fuzzy reinforcement learning algorithm that uses Apollonius circle mechanism to define the capture region of learning pursuers. In differential game, most methods describe physical constraints by means of mathematical equations, but they are complicated to solve multivariate constraints, especially in complex environment.
The cops and robbers game is one of the most studied graph-based pursuit-evasion games. Goldstein and Reingold [9] pointed out that it is an EXPTIME-complete problem. It is therefore hard to find an efficient solution, and the game should not be treated as a problem of optimal control. If the evader randomly chooses its next position, the result is a mixed equilibrium. The study of Isler and Karnad [10] shows that a reduction in visibility can cause an exponential increase in the capture time.
In geometric environments, the lion and man game is one of the most discussed pursuit-evasion problems. By analyzing several lion and man games, Casini and Garulli [11] proposed an approach that relies on the computation of a suitable "center" at each move. Bhadauria and Isler [12] showed that three cops can capture the robber in any polygonal environment. However, how to make these methods work well in a more complex environment is still a problem that needs to be studied. This paper studies a collaborative pursuit-evasion game of a heterogeneous system in a complex 3D polygonal environment. A heterogeneous system consisting of a UAV and a UGV, based on previous works, is introduced first. According to the characteristics of heterogeneous systems, a novel collaborative pursuit-evasion game in a nonconvex three-dimensional polygonal environment is studied. Simulation results show that the proposed method can perform an optimal pursuit-evasion game in a complex 3D polygonal environment.

UAV/UGV Heterogeneous System and Previous Works.
In the pursuit-evasion game, the working process of the UAV/UGV heterogeneous system is described as follows: the UAV flies in the air with high speed and a top view. The UGV runs on the ground with low speed but can interact with the environment. Both the UAV and the UGV are equipped with a variety of sensors that make them fully aware of the environment, and the digital map is known to both of them. The UAV and the UGV can communicate with each other and share information such as location, speed, and strategy. The proposed structure of the UAV/UGV heterogeneous system is shown in Figure 1, and its prototype is developed with a quadrotor as the UAV and a Mecanum vehicle as the UGV. The UAV/UGV heterogeneous system has been established based on our previous works [13]. The UAV and the UGV are equipped with image, attitude, and altitude sensors. The sensors help the system to sense the environment and the motion of the target. The fusion of INS, GPS, and image data is used for relative positioning, and the map is already stored in the onboard computer. Through collaboration, the heterogeneous system can perform a series of complex tasks such as cluster formation, collaborative awareness [14], and collaborative decision making.

Collaborative Pursuit-Evasion Game and Hypothesis.
By using the framework of the heterogeneous system described in Section 2.1, two collaborative pursuers consisting of a UAV (denoted as P1) and a UGV (denoted as P2) pursue another UGV (denoted as E) in a complex 3D polygonal environment (Figure 2). The speed of P1 is higher than that of P2 and E, and the speed of P2 is equal to that of E. In the adversarial and complex environment, the heterogeneous system not only tries to capture the evader but also has to avoid several obstacles. In addition, P1 and P2 can communicate with each other. In the game, E is no longer a purposeless passive evader, which distinguishes the problem from moving target tracking; E is intelligent enough to take advantage of the terrain and the location of the pursuers.
In the game, P1 and P2 share information about the position and speed of E but do not know the strategy of E. The goal of E is to maximize its chances of survival and get rid of the pursuit. Usually, E does not know the strategy, position, and speed of P1 and P2. A more severe situation for the pursuers is that E can obtain the information of P1 and P2. So, in this paper, it is assumed that E knows the position and speed of the pursuers everywhere and at any time, but the pursuers lose the target when E hides behind obstacles, which is a great challenge for the pursuers. Thus, the pursuers only have a Line-of-Sight (LOS) view, and the evader has an obvious advantage. Since P1 and P2 cooperate with each other, if E is observed by P1, P2 will be notified, and vice versa. The terrain is large enough to prevent the game from ending too soon, but it has a boundary. Both the pursuers and the evader know the map and move in turns. When the projection of P1 onto the horizontal plane, or P2, can reach the position of E within a calculation period, the pursuers win the game. If E reaches the boundary of the map before being caught, the evader wins the game.
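The shared line-of-sight assumption above can be illustrated with a minimal sketch. It is a simplified 2D grid version, not the 3D LOS method used in the paper: the grid representation, the sampling rule, and the function names `has_line_of_sight` and `visible_to_team` are assumptions made here for illustration only.

```python
def has_line_of_sight(occupied, a, b):
    """Return True if no occupied cell lies on the sampled segment a -> b.

    occupied: 2D list of booleans (True = obstacle cell), an assumed map format.
    a, b: (row, col) grid cells of the observer and the target.
    """
    (r0, c0), (r1, c1) = a, b
    # Sample the segment at half-cell resolution (an illustrative choice).
    steps = 2 * max(abs(r1 - r0), abs(c1 - c0), 1)
    for k in range(steps + 1):
        t = k / steps
        r = round(r0 + t * (r1 - r0))
        c = round(c0 + t * (c1 - c0))
        if occupied[r][c]:
            return False
    return True


def visible_to_team(occupied, pursuers, evader):
    """E counts as observed if at least one pursuer sees it (shared information)."""
    return any(has_line_of_sight(occupied, p, evader) for p in pursuers)
```

Because the pursuers notify each other, the evader is treated as hidden only when every pursuer's sight line is blocked.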
According to the description of the collaborative pursuit-evasion game, the pursuit-evasion strategy is determined by the completeness of information and the probability of escape. Further, the degree of completeness of information is related to the visibility of the visual field, which depends on the state of E relative to P1 and P2. On the other hand, the probability of escape is related to the path of escape, which depends on the obstacles in the environment. The visibility of the visual field and the path of escape are interdependent, and together they determine the pursuit-evasion strategy. Assume that E(t) is the position of E at time t, and that M represents the information of obstacles and environment. The set of escape paths is then R(E(t), M). Define r_i as the ith path of escape; then the probability of E successfully escaping along path r_i at time t is p_E(E(t), t | r_i).
Then, the decision model S of the collaborative pursuit-evasion game is described by equation (1). Here, r_i ∈ R(E(t), M), and the symbol ∝ indicates that S is determined by the result of the expression on the right-hand side.

Move Generator Based on 3D Real-Time Path Planning
In this paper, a 3D path planning method is taken as a move generator to ensure that decision making can be carried out. Considering pursuit-evasion and path planning together is an effective guarantee for the implementation of decisions. According to equation (1), the proposed strategy needs to calculate the probability of different escape routes through a path planning method, so an appropriate path planning method is studied for the collaborative pursuit-evasion game.
Unlike the traditional classification, path planning methods are considered here from the perspective of data. Only if the data of decision making and path planning are unified can the two research directions be connected seamlessly, which is completely different from hybrid planning [15]. The data of path planning is terrain, and the way terrain is modeled determines which path planning algorithms are applicable. So finding a suitable path planning method for the strategy should start from the modeling of terrain. The graph-based approach is more suitable for global path planning, so an optimum solution can be found in some sense. However, a convincing theoretical optimum is hard to reach. For instance, different definitions of threat may lead to different optimal paths [16]. Different artificial settings make it difficult to find a unified optimal path for the Voronoi diagram. However, the advantage of the method is that it has a clear theory and its complexity can be computed (O(n log n), where n is the number of threats). The random sampling frequency of the Probabilistic Roadmap Method (PRM) is also determined by human factors [17]. The complexity of PRM depends on the difficulty of searching for a path but has almost no relationship with the map and dimension. In other words, these methods have their own advantages, but the optimality of their paths is doubtful. The grid-based method is studied frequently and there are many research results. A-Star is one of the well-known algorithms in this class, and many researchers have focused on improving the heuristic function to reduce time consumption [18]. However, it suffers from combinatorial explosion because of the grid and data structure. Developed from A-Star, D-Star can perform dynamic path planning, but the shortest path is still influenced by the density of the grid [19].
In addition, Ant Colony (AC) needs a grid to compute the matrix of pheromone concentration [20] and Genetic Algorithm (GA) needs a grid to perform genetic operations [21]. Both methods rely on a large number of iterations to approach a theoretical optimum. Another problem is that it is hard to give an accurate analysis of their complexity.
By comparison, the map used in potential field methods is simpler and potential field methods are often efficient. Common potential field methods use force to find shortest path, but they must endure local optimum [22]. Learning from the concept of fluid dynamic, stream function establishes a potential field that can avoid local minima and it also has been extended to three-dimensional space [23]. Due to the restriction of fluid dynamic, stream function has stagnation point, which will lead to the termination of planning.
According to the summary in the introduction, there are several solutions to the pursuit-evasion game, such as methods based on differential games, graphs, and polygons. Because it is difficult to obtain an analytical solution of a differential game in a complex environment, the proposed strategy is based on polygons. Then, the map is described by polygons, and polygonal path planning methods are suitable.
Here, a method based on the boundary value problem (BVP) described in our previous works [24] can be used. Of course, finding a more appropriate 3D path planning method would help to accelerate the calculation and reduce the complexity. The field of BVP is harmonic and it uses a grid map. BVP uses the gradient descent direction to determine the path that connects the start and end points. Since the target area is defined as the lowest potential and each grid has only one gradient direction, a path from any point to the target area will be found. Under the Dirichlet boundary conditions, the potential field of each grid is calculated by equation (2). Here, v is the deflection unit vector and ε is a coefficient. Adjusting these two parameters helps to improve the search and is equivalent to changing the actual potential field artificially. By using the Gauss-Seidel (GS) method, the classic BVP can be discretized. Taking the three-dimensional case as an example, the dynamic update of the grid potential field is calculated by equation (3). Each central grid is adjacent to twenty-six grids (similar to the center of a Rubik's cube), so the update of the discrete potential field is related to the adjacent twenty-six grids. In equation (3), a is the coefficient of the superimposed field. p_c^t and p_c^{t+1} are the potential fields of the central grid at the current moment and the next moment, respectively. The second term is the average potential field of the adjacent grids and the third term is the propagation field of the adjacent grids.
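The idea of a harmonic potential field with gradient-descent path extraction can be sketched as follows. This is a deliberately simplified 2D illustration, assuming a plain Laplace equation with a four-neighbor average; it omits the deflection term (v, ε), the superimposed field coefficient a, and the twenty-six-neighbor 3D update of the improved BVP. The names `relax_potential` and `descend` are hypothetical.

```python
import numpy as np


def relax_potential(occupied, goal, sweeps=1000):
    """Gauss-Seidel relaxation of Laplace's equation on a 2D grid.

    occupied: boolean array, True = obstacle or border (held at potential 1.0).
    goal: (row, col) interior cell held at potential 0.0 (Dirichlet conditions).
    """
    rows, cols = occupied.shape
    p = np.ones((rows, cols))
    p[goal] = 0.0
    for _ in range(sweeps):
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                if occupied[r, c] or (r, c) == goal:
                    continue
                # In-place update: each free cell becomes the mean of its 4 neighbors.
                p[r, c] = 0.25 * (p[r - 1, c] + p[r + 1, c]
                                  + p[r, c - 1] + p[r, c + 1])
    return p


def descend(p, occupied, start, goal, max_steps=10_000):
    """Follow the steepest descent of the potential from an interior start cell."""
    path = [start]
    cur = start
    while cur != goal and len(path) <= max_steps:
        r, c = cur
        candidates = [(r + dr, c + dc)
                      for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                      if not occupied[r + dr, c + dc]]
        if not candidates:
            break
        nxt = min(candidates, key=lambda rc: p[rc])
        if p[nxt] >= p[cur]:
            break  # no lower neighbor: the field has not converged here
        cur = nxt
        path.append(cur)
    return path
```

Because a converged harmonic field has no interior local minima, the descent does not get trapped the way a force-based potential field can, which is the property the text relies on.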

Strategy of Collaborative Pursuit-Evasion Game
The pursuit-evasion game is an important branch of game theory and also a challenging research field of artificial intelligence. Like many board games, the pursuit-evasion game is an EXPTIME-complete problem in terms of computational complexity. With the development of computers, some board games have obtained theoretical solutions, such as Connect Four (conquered in 1989), Gomoku (conquered in 1993), and 8 × 8 checkers (conquered in 2007). It has been proved that an EXPTIME-complete problem cannot be solved in polynomial time. Therefore, many studies have to seek an approximate solution. The classical pursuit-evasion game has achieved great success in barrier-free environments, but its conclusions are difficult to apply in environments with obstacles (nonconvex environments). On the other hand, the players are heterogeneous and the dimension of the planning space increases from 2D to 3D, so solving the collaborative pursuit-evasion game in a complex 3D environment with incomplete information becomes more difficult.

Strategy of Evader.
The general idea includes the following three points:
(1) E moves toward the nearest boundary
(2) E needs to minimize the probability of being discovered by P1 or P2
(3) E needs to maximize the distance from it to P1 and P2
Among the three points, the first point is the condition of victory, and the second and third points are the conditions for survival. E will try to win the game under the premise of ensuring survival. If P1 or P2 always knows the position of E, the strategy of E should follow the first and third points.
Then the problem becomes a bit simpler. A more complicated situation is that neither P1 nor P2 knows the position of E when it hides behind obstacles. In this scenario, a worst case for E is proposed as follows: once E is found by P1 or P2, E will always be visible until the end of the game (denoted as the Once Seen Until condition, OSU).
Since the speeds of P1, P2, and E are not all the same, with V_P1 > V_P2 = V_E, the map used by them should be unified first. According to Section 3, a grid map is preferred. Here, the length of each grid is determined by the fastest participant P1, which means that every calculation period (or unit grid) is tied to the movement of P1. That is to say, if P1 moves one or several grids, P2 and E may still stay in the same grid. The speeds of the participants are constant; otherwise, the density of the grid would have to change dynamically.
In the case of OSU, the strategy of the evader is as follows. E first calculates the set Path_E of shortest paths from its current position to each boundary grid. Then E calculates the probability of being discovered by P1 or P2 on each path. Assume that the position of the evader at time t is E(t) and its feasible positions at the next time t + 1 are E_1(t + 1), ..., E_k(t + 1), where the subscripts 1, ..., k index the feasible branches of the path at time t + 1. Suppose that l_j(t + 1) is the number of feasible branches that start from the jth waypoint, with j ∈ {1, ..., k}. Let L = \sum_{j=1}^{k} l_j(t + 1); then the probability that the evader selects the branch E_j(t + 1) is l_j(t + 1)/L. Next, the evader computes the probability of being discovered by P1 or P2 at each waypoint of each path. The visual field can be calculated based on the Line-of-Sight (LOS) method [25].
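As a small worked example of the branch selection rule l_j(t + 1)/L, the sketch below computes the selection probabilities from a list of branch counts. The function name `branch_probabilities` and the sample counts are illustrative assumptions, not values from the paper.

```python
from fractions import Fraction


def branch_probabilities(branch_counts):
    """Return l_j / L for each candidate waypoint j, where L is the sum of all l_j.

    branch_counts: list of l_j(t + 1), the number of feasible branches that
    start from each candidate waypoint E_j(t + 1).
    """
    total = sum(branch_counts)
    if total == 0:
        raise ValueError("no feasible branch from any candidate waypoint")
    return [Fraction(count, total) for count in branch_counts]
```

For instance, with three candidate waypoints that open 2, 3, and 5 further branches, L = 10 and the waypoint with the most onward options is chosen with probability 1/2.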
Since P1 and P2 share information about E, it is considered that if E is observed by P1, P2 will be notified, and vice versa. It should be noted that each waypoint E(i)M(j) does not mean that E moves one step, because it is related to the density of the grid. The density of the grid is based on the movement of P1 in a calculation period, so Table 1 shows the relationship between the movement of E and waypoints.
Suppose that E is located at the jth waypoint of the ith path, so E is at E(i)M(j). Then, the risk value R_{E(i)M(j)} of waypoint E(i)M(j) is calculated. By the path planning method in Section 3, all the escape paths after the jth waypoint need to be derived to determine whether E will be observed or caught in the future. If E(i)M(j) will not be observed by P1 or P2, R_{E(i)M(j)} = 0. If E(i)M(j) will be observed by P1 or P2 but will not be caught before E wins the game, R_{E(i)M(j)} = 1. If E(i)M(j) will be observed by P1 or P2 and will be caught before E wins the game, R_{E(i)M(j)} = 2, and all remaining risk values after the jth waypoint are set to 2 according to the OSU condition. Then, the escape path selected by E is given by equation (4). It should be noted that the idea of the risk value is similar to the probability in [26], but the calculation is simpler and is not limited to the case of equal speed between pursuer and evader.
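The risk labelling under the OSU condition can be sketched as below. The three risk levels and the propagation of the value 2 follow the description above; the selection rule used in `select_path` (smallest total risk) is only an assumption made for this illustration, and the function names and input format are hypothetical.

```python
def waypoint_risks(observed, caught):
    """Assign a risk value 0, 1, or 2 to each waypoint of one escape path.

    observed[j]: True if E is predicted to be seen by P1 or P2 at waypoint j.
    caught[j]:   True if E is predicted to be caught before winning from there.
    """
    risks = []
    doomed = False  # becomes True after the first waypoint with risk 2 (OSU)
    for seen, taken in zip(observed, caught):
        if doomed or (seen and taken):
            doomed = True
            risks.append(2)
        elif seen:
            risks.append(1)
        else:
            risks.append(0)
    return risks


def select_path(paths):
    """Pick the index of the path with the smallest total risk (assumed rule).

    paths: list of (observed, caught) pairs, one pair of lists per escape path.
    """
    totals = [sum(waypoint_risks(obs, cau)) for obs, cau in paths]
    best = min(range(len(paths)), key=totals.__getitem__)
    return best, totals
```

A path on which E is briefly seen but never caught therefore scores lower than a path on which a single capture prediction marks every later waypoint with 2.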

Strategy of Collaborative Pursuers.
In some studies, the UAV is taken as a provider of the UGV's visual field and only the UGV is used to pursue the evader. In that case, the UAV is similar to the aerial base station described in [27]. The collaborative strategy here refers to the situation where both the UAV (P1) and the UGV (P2) act as pursuers and the UAV flies at low altitude with terrain following/threat avoidance (TF/TA). P1 and P2 use the method described in Section 3 for path planning. So, according to the state of evader E relative to P1 and P2, the collaborative strategy is divided into three cases.
(1) Case 1: evader is in the visual field of P1 or P2 (in sight)

Complexity
Pursuers should follow the shortest path to catch E. However, at the same time, P1 and P2 need to keep E in their visual fields as far as possible. So the strategy of pursuers in Case 1 is as follows: pursuers should first ensure that E is most likely located in the collaborative visual field and then follow the path calculated by the method in Section 3. The expected position of pursuers at the next moment can be calculated by Algorithm 1.
In the case of incomplete information, Algorithm 1 takes both the visual field and the shortest path into account. Then, the probability that P1 moves in the vertical direction is increased so that P1 will have a better visual field.
(2) Case 2: evader has just disappeared from the visual field of pursuers (position known until recently)
Classic pursuit-evasion games often use a strategy that drives the evader to a bounded border, such as the lion and man game [28]. In the game of this paper, the evader wins when it reaches the boundary of the map before being caught, so a "collaborative intercept" strategy for the collaborative pursuers is proposed based on the characteristic V_P1 > V_P2 of the UAV/UGV heterogeneous system. The idea of the "collaborative intercept" strategy is as follows: in a large but bounded map, the pursuers collaboratively compress the escape space of E and try to turn the problem into a typical bounded pursuit-evasion game (similar to a lion and man problem [29,30]). Similar to the discussion in Section 4.1, a worst case is also proposed for the pursuers when Case 2 happens: once P1 and P2 lose the position of E, E will always be invisible until the end of the game (denoted as the Once Lose Until condition, OLU). Then, the "collaborative intercept" strategy is as follows: in the case of OLU, P2 takes the position where E disappears as the subgoal and continues linear pursuit, as in Figure 4(a), and P1 takes the position where E will probably appear as the subgoal to intercept it, as in Figure 4(b). Here, a cuboid obstacle is taken as an example. In Figure 4(a), E(t) and P2(t) are the current positions of E and P2, respectively. E(t + 1) is the next position of E; O1 is the cuboid obstacle. V_invisible represents the invisible point. When E moves from E(t) to E(t + 1), its state relative to P2 changes from visible to invisible. So the subgoal of P2 is the vertex S of the cuboid obstacle O1. Vertex S is similar to the blocking vertex in [26] and it has the following property.
Property I: if two points P2 and E(t + 1) are blocked by a polygon O1, the shortest path from P2 to E(t + 1) is a polygonal path whose inner vertices are vertices of O1.
In Figure 4(a), Property I indicates that the subgoal S must be on the shortest path between P2 and E. The role of the subgoal of P2 is to drive and force E so that the situation becomes more beneficial to the pursuers. When P1 performs 3D interception, the subgoal of P1 can be obtained according to our previous works [31,32]. Figure 4(b) shows the calculation of a 3D subgoal by taking a cuboid obstacle as an example. In Figure 4(b), E(t) and P1(t) are the current positions of E and P1, respectively. E(t + 1) is the next position of E. O1 is the cuboid obstacle and V_invisible represents the invisible point. D is the intersection of P1(t)E(t + 1) and O1. Denote the edge that is nearest to point D as FG and the foot of the perpendicular from D to FG as point A. On line segment FG, the intersection of plane E(t)P1(t)E(t + 1) and O1 is point B. When E moves from E(t) to E(t + 1), its state relative to P1 changes from visible to invisible. Then, we have Theorem 1.

Theorem 1.
Suppose that P1(t)E(t + 1) and FG are two nonintersecting lines in three-dimensional space. Point D is on P1(t)E(t + 1) and point A is the foot of the perpendicular from D to FG. B is an arbitrary point on FG and B ≠ A. So ‖P1(t)B‖ + ‖BE(t + 1)‖ > ‖P1(t)A‖ + ‖AE(t + 1)‖.
Because B is an arbitrary point on FG, Theorem 1 always holds.
Table 1: Relationship between the movement of E and waypoints in unit period.

Since the study assumes that the pursuers lose the position of E when E hides behind obstacles, E(t + 1) is actually unknown to P1 and point A cannot be calculated. In the "collaborative intercept" strategy, P1 takes the position where E will probably appear as the subgoal to intercept it. So, in Figure 4(b), the subgoals of P1 are S1 and S2. Plane E(t)P1(t)E(t + 1) first intersects the edge HI of O1, so the first subgoal is S1. If P1 still cannot see E after arriving at S1, P1 takes the second intersection S2 between plane E(t)P1(t)E(t + 1) and O1 as a new subgoal. It can be seen that S1, S2, and B are points on the vertical edges of O1 and in plane E(t)P1(t)E(t + 1). Taking the calculation of point S1 as an example, a general formula is given as follows.
In 3D space, assume that the equation of line HI is

$$\frac{x - x_1}{x_2 - x_1} = \frac{y - y_1}{y_2 - y_1} = \frac{z - z_1}{z_2 - z_1} = c. \tag{5}$$

Here, (x_1, y_1, z_1) is the coordinate of point H and (x_2, y_2, z_2) is the coordinate of point I. c is an intermediate variable. Reorganizing equation (5), the equation of line HI is

$$x = x_1 + c(x_2 - x_1), \quad y = y_1 + c(y_2 - y_1), \quad z = z_1 + c(z_2 - z_1). \tag{6}$$

Assume that the equation of the plane determined by points P1(t), E(t), and V_invisible is

$$AA\,x + BB\,y + CC\,z + DD = 0, \tag{7}$$

where AA, BB, CC, and DD are known coefficients. From equations (6) and (7), we have

$$c = -\frac{AA\,x_1 + BB\,y_1 + CC\,z_1 + DD}{AA(x_2 - x_1) + BB(y_2 - y_1) + CC(z_2 - z_1)}. \tag{8}$$

Substituting c from equation (8) into equation (6) gives the coordinates of the intersection S1, and S1 is the subgoal of P1. Our previous works study a variety of geometries, including rectangle, trapezoid, triangle, circle, and ellipse in 2D and cuboid, sphere, cone, and cylinder in 3D. It can be proved that the subgoal has the characteristic of the shortest path [31].
(3) Case 3: the position of evader is completely unknown to pursuers
P1 and P2 carry out a collaborative search that ensures the area covered by the collaborative visual field of the pursuers is as large as possible at the next moment. According to the conclusion of [33], it is difficult to find the optimal solution in the case of incomplete information. So reducing the sensing overlap between pursuers, which means distributing the pursuers, is beneficial to improving the efficiency of the search. Reference [34] points out that a strategy to capture the rash evader that hides behind a vertex exists if and only if a complete search algorithm (like min-max) can find a solution in the state space of the detection-phase and capture-phase representations up to a given discretization.

ALGORITHM 1: Expected position of P1 and P2 at the next moment.
(1) for all adjacent grids next_P1 of P1 do
(2) for all adjacent grids next_P2 of P2 do
(3) Suppose the current adjacent grids of P1 and P2 are next_P1(i) and next_P2(j), respectively. According to LOS and the method in Section 3, calculate the shortest path set from the current position of E to each boundary grid.
(4) Denote the shortest path set as Path_PE(i, j) ∈ {Path_PE(i, j)(1), Path_PE(i, j)(2), ...}.
(5) end for
(6) end for
(7) for Path_PE do
(8) For each path in Path_PE, compute the number of steps or calculation periods required by E to get rid of the collaborative visual field according to Table 1.
(9) Record the results as the risk values R_Path_PE(1, 1), ..., R_Path_PE(i, j), ...
(10) end for
(11) Find the indices (mi, mj) of the path Path_PE(i, j) with the lowest risk. In other words, the path with the lowest risk is the most likely escape path for E.
(12) next_P1(mi) and next_P2(mj) are chosen as the expected positions of P1 and P2 at the next moment, respectively.
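The Case 2 subgoal computation, that is, intersecting the obstacle edge HI with the plane through P1(t), E(t), and V_invisible, can be sketched as follows. The helper names `plane_through` and `line_plane_intersection`, the tolerance value, and the sample coordinates in the test are assumptions for illustration; the plane is taken in the form a·x + b·y + c·z + d = 0.

```python
def plane_through(p, q, r):
    """Coefficients (a, b, c, d) of the plane a*x + b*y + c*z + d = 0 through p, q, r."""
    ux, uy, uz = (q[0] - p[0], q[1] - p[1], q[2] - p[2])
    vx, vy, vz = (r[0] - p[0], r[1] - p[1], r[2] - p[2])
    # The normal vector is the cross product (q - p) x (r - p).
    a = uy * vz - uz * vy
    b = uz * vx - ux * vz
    c = ux * vy - uy * vx
    d = -(a * p[0] + b * p[1] + c * p[2])
    return a, b, c, d


def line_plane_intersection(h, i, plane, eps=1e-12):
    """Intersection of the line through points h and i with a plane.

    Returns None when the line is parallel to the plane (or lies in it).
    """
    x1, y1, z1 = h
    x2, y2, z2 = i
    a, b, c, d = plane
    denom = a * (x2 - x1) + b * (y2 - y1) + c * (z2 - z1)
    if abs(denom) < eps:
        return None
    # Parameter along the line, obtained by substituting the parametric
    # form of the line into the plane equation.
    s = -(a * x1 + b * y1 + c * z1 + d) / denom
    return (x1 + s * (x2 - x1), y1 + s * (y2 - y1), z1 + s * (z2 - z1))
```

The parametric form is used directly because a vertical obstacle edge has x_2 = x_1 and y_2 = y_1, which the parametric form handles without any division by zero.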

Simulation and Analysis
In the simulation, the map is designed as shown in Figure 5. In the environment, there are several cuboid, conical, and cylindrical obstacles. The starting points of P1, P2, and E are [50, 50, 1], [370, 150, 0], and [450, 370, 0], respectively. The relationship of speed between pursuers and evader is V_P1 = 1.5 * V_P2 and V_P2 = V_E. P2 and E move one unit grid in each calculation period. The proposed method solves a pursuit-evasion game in a complex nonconvex 3D environment with incomplete information. In particular, the pursuers and the evader have different speeds and awareness.
There is little research on the problem studied in this paper, so finding an appropriate comparison method is not easy. The main reasons are as follows:
(1) Most pursuit-evasion games are carried out in a barrier-free environment, and fewer in a nonconvex environment. Reference [35] introduced a hybrid system that can avoid obstacles in a complex area and play a differential game in an open area. Both [12] and [36] studied the case of a single obstacle, but their conclusions are difficult to generalize to a complex nonconvex environment.
(2) Another research direction of pursuit-evasion game is sensor limitation. Most of these studies focus on how to maximize the efficiency of search [37] or field of view [38]. Generally, these methods only provide the strategy of pursuer but rarely introduce the strategy of evader, so it is difficult to present a complete pursuit-evasion game in simulation.
(3) The study of pursuit-evasion games with two or more pursuers is still at the stage of theoretical discussion. Most of these games are derived from the conclusions of the single-pursuer pursuit-evasion game. There are no directly applicable algorithms for how two heterogeneous agents with different maneuverability can complete a collaborative pursuit-evasion game.
In summary, the method of [26] is used for comparative simulation. Therein, a visibility-based pursuit-evasion game is studied and a randomized strategy in any simply connected polygonal environment is proposed. The method is suitable for single and multiple agents with different speeds. For simplicity, we refer to the method of this paper as "METHOD I" and that of [26] as "METHOD II" in the subsequent analysis. In Figure 5, the red and green lines represent the paths of P1 and P2 in METHOD I, respectively. The black and purple lines represent the paths of P1 and P2 in METHOD II, respectively. The path of E is represented by a blue line. The game is divided into 6 stages. Only stages 4-6 of Figure 5(b) use METHOD II. This is because METHOD II uses a random strategy after losing the target, which ensures the completeness of the algorithm, but the evader can easily escape in a complex environment due to the lack of heuristic information. METHOD I uses the algorithm in Section 4.2, Case 3, after losing the target and maximizes the collaborative visual field of the pursuers, which improves the efficiency of the search. Therefore, in order to ensure the continuity of the pursuit-evasion process, P1 by METHOD II is only used in stages 4-6 of Figure 5(b) and marked by a black line. Similarly, P2 by METHOD II is also only used in stages 4-6 of Figure 5(b) and marked by a purple line.
There is no detailed strategy of the evader in METHOD II, so E, marked by the blue line, uses the algorithm of Section 4.1 in the comparative simulation. The 6 stages are as follows:
Stage 1: at the beginning, the position of the evader is completely unknown to the pursuers in Figure 5. So P1 and P2 move in the direction where the collaborative visual field covers the largest area. E knows the positions of the pursuers at any time. In order to reduce the probability of being discovered by the pursuers, E moves toward the northeast of the map in Figure 5 and takes the cuboid obstacle located at [400, 450] as a shelter.
Stage 2: E is still invisible to the pursuers in Figure 5. The pursuers maintain the original strategy of collaborative search. According to the search direction of the pursuers, E continues to move toward the boundary to win the game.
Stage 3: it is similar to stage 2, where the pursuers have not found E yet. However, P1 has searched almost a quarter of the map in Figure 5, because V_P1 > V_P2 = V_E. Since the situation becomes dangerous, E takes the cuboid obstacle located at [500, 550] as a second shelter while continuing to move toward the boundary.
Stage 4: in Figure 5(a), E moves toward the third obstacle located at [650, 650], which is closer to the boundary, that is, the condition of victory. Unfortunately, E is discovered by P2 in the process. Immediately, P2 informs P1 about the position of E. In METHOD I, the pursuers change the strategy and the situation changes from Case 3 to Case 1 according to Section 4.2. Both METHOD I and METHOD II use linear pursuit at this time, so P1 and P2 move toward E simultaneously in Figures 5(a) and 5(b).
Stage 5: when E rounds the third obstacle located at [650, 650] and continues to move toward the boundary, it disappears again from the visual field of the pursuers. At this time, METHOD I and METHOD II use different strategies.
In Figure 5(a), pursuers by METHOD I change the strategy from Case 1 to Case 2 according to Section 4.2 and perform a collaborative interception. ence, P 2 moves toward the direction where E disappears to continue linear pursuit, and P 1 moves toward the direction where E probably appears to intercept it. In Figure 5(b), both pursuers by METHOD II move toward the direction where E disappears. Stage 6: E is discovered again by pursuers. Since METHOD I and METHOD II use different pursuit strategies, the actions of evader are also changed. In Figure 5(a), in METHOD I, P 1 and P 2 use the proposed strategy of collaborative interception, so E has to move toward the northeast of the map under the eviction of pursuers. It means that the scope of E's action is further compressed and pursuers win finally. In Figure 5(b), in METHOD II, both P 1 and P 2 are on the same side of E, so E uses the obstacle located at [650, 650] to hide itself. en, E tries to get rid of pursuit by circling around the obstacle. Since P 1 is faster than E, evader still loses the game finally. However, as the speed of E gradually increases, if still using METHOD II in the map of Figure 5, the further simulation results show that evader will win the game when E reaches the critical speed 1.2V E � V p 1 . Table 2 shows comparison between METHOD I and METHOD II in terms of visual field and length of path. Since stages 1-3 all use METHOD I of the paper, Table 2 only lists the comparison of stages 4-6. In Table 2, item "visual field" means the average area covered by field of view in current stage. Item "distance" means the distance between pursuer and evader, which is presented by a range. e upper and lower bounds of range mean the maximum and minimum distances between pursuer and evader in current stage, respectively.
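The two Table 2 items defined above can be computed from sampled trajectories roughly as follows. This is a minimal sketch, assuming each stage is available as equally sampled lists of 2D positions and per-sample visible areas; the function names and the sample values are hypothetical and are not data from the paper.

```python
import math


def distance_range(pursuer_path, evader_path):
    """Return (min, max) pursuer-evader distance over one stage.

    Both paths are lists of (x, y) samples taken at the same time steps.
    """
    if not pursuer_path or len(pursuer_path) != len(evader_path):
        raise ValueError("paths must be non-empty and equally sampled")
    dists = [math.dist(p, e) for p, e in zip(pursuer_path, evader_path)]
    return min(dists), max(dists)


def mean_visual_field(areas):
    """Return the average visible area over the samples of one stage."""
    if not areas:
        raise ValueError("at least one area sample is required")
    return sum(areas) / len(areas)


# Hypothetical samples for one stage (illustration only).
pursuer = [(0.0, 0.0), (3.0, 0.0), (6.0, 0.0)]
evader = [(10.0, 0.0), (10.0, 4.0), (10.0, 8.0)]
lower, upper = distance_range(pursuer, evader)
average_area = mean_visual_field([100.0, 200.0, 300.0])
```

Reporting the pair (lower, upper) per stage mirrors the range format described for the item "distance", while the mean corresponds to the item "visual field".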
Comparing the item "visual field" between P 1 and P 2 in Table 2 shows that the average area covered by the field of view of P 2 is generally larger than that of P 1. This is because the visual field of P 1 is affected by its changing flight altitude. Hence, when P 1 flies in TF/TA mode in a complex 3D environment, the area observed by P 1 is smaller than that of P 2.
For P 1 under METHOD I in Table 2, after it receives the position of E from P 2, its strategy changes from Case 3 to Case 1 according to Section 4.2. So, in stages 4 and 5, P 1 immediately reduces its distance from E. In stage 6, since P 1 has to avoid the obstacle located at [650, 650], the item "distance" of P 1 under METHOD I increases for a short period of time but eventually goes back to 0. As for the item "visual field," Figure 5(a) shows that P 1 under METHOD I enters the area without obstacles and makes a linear pursuit in stage 6, so the visual field of stage 6 is larger than that of stages 4 and 5.
For P 2 under METHOD I in Table 2, Figure 5(a) shows that it discovers E in stage 4 and begins a linear pursuit, so the item "distance" of P 2 under METHOD I gradually decreases in stages 4 and 5. In stage 6 of Figure 5(a), the obstacle located at [650, 650] hinders the pursuit of P 2 and E moves toward the obstacle located at [750, 850], so the item "distance" of P 2 under METHOD I increases in stage 6. As for the item "visual field," Figure 5(a) shows that P 2 under METHOD I enters the gap between two obstacles in stage 5, so the visual field of stage 5 is smaller than those of stages 4 and 6.
For P 1 under METHOD II in Table 2, after it receives the position of E from P 2, Figure 5(b) shows that it moves directly toward the position of E in stages 4 and 5 and chases E around the obstacle located at [650, 650] in stage 6. So the item "distance" of P 1 under METHOD II gradually decreases. As for the item "visual field," comparing stages 4 and 5 in Figures 5(a) and 5(b), the area through which P 1 flies under METHOD II is more open than that under METHOD I, so the visual field of P 1 under METHOD II is larger than that of P 1 under METHOD I in stages 4 and 5. In stage 6, P 1 under METHOD II keeps pursuing around the obstacle located at [650, 650], so its visual field in stage 6 is smaller than that of P 1 under METHOD I.
For P 2 under METHOD II in Table 2, its visual field is larger than that of P 2 under METHOD I in stage 6. The item "distance" of P 2 under METHOD II is smaller than that of P 2 under METHOD I in stages 5 and 6. This is because the moving direction of E under METHOD II is different from that under METHOD I.
Overall, the length of path under METHOD I is shorter than that under METHOD II according to Table 2. Figure 5(b) shows that if the speed of P 1 under METHOD II is further reduced, the pursuers may lose the target again and the evader wins. Therefore, METHOD I is more robust and can ensure a higher winning probability for the pursuers even when the speeds of P 1 and E are relatively close.
Besides the speed difference, the winning probability of the pursuers is also related to the initial positions of both pursuers and evader. This is verified by Monte Carlo simulation: in the map shown in Figure 5, it is assumed that the number of wins and the initial distance between pursuers and evader are both normally distributed and mutually independent, with means $\mu_1 = \mu_2 = 0$ and unknown standard deviations $\sigma_1$ and $\sigma_2$. After 1000 Monte Carlo simulations with different initial positions of P 1, P 2, and E, the standard deviations of METHOD I and METHOD II satisfy $\sigma_1 = 1.38\sigma_2$. This means that the pursuers under METHOD I win more often when P 1, P 2, and E are in different initial positions.
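The structure of such a Monte Carlo experiment over random initial positions can be sketched as below. This is only a toy illustration under strong assumptions that are not taken from the paper: an obstacle-free square map, a single pursuer using linear pursuit, an evader that runs straight to its nearest boundary, and hypothetical speeds, capture radius, and function names. It estimates a win rate and does not reproduce the standard-deviation comparison reported above.

```python
import math
import random


def simulate_once(rng, size=1000.0, v_p=1.5, v_e=1.0, capture_radius=5.0):
    """Play one toy game; return True if the pursuer captures the evader."""
    px, py = rng.uniform(0.0, size), rng.uniform(0.0, size)
    ex, ey = rng.uniform(0.0, size), rng.uniform(0.0, size)
    # Assumed evader policy: head straight for the nearest map boundary.
    exits = [
        (ex, (-1.0, 0.0)),
        (size - ex, (1.0, 0.0)),
        (ey, (0.0, -1.0)),
        (size - ey, (0.0, 1.0)),
    ]
    _, (dx, dy) = min(exits, key=lambda item: item[0])
    max_steps = int(size / v_e) + 2  # the evader exits before this bound
    for _ in range(max_steps):
        gap = math.hypot(ex - px, ey - py)
        if gap <= capture_radius:
            return True  # pursuer wins by capture
        # Linear pursuit: step directly toward the evader's current position.
        step = min(v_p, gap)
        px += step * (ex - px) / gap
        py += step * (ey - py) / gap
        ex += v_e * dx
        ey += v_e * dy
        if not (0.0 <= ex <= size and 0.0 <= ey <= size):
            return False  # evader reaches the boundary and wins
    return False


def win_rate(trials, seed, v_p=1.5):
    """Estimate the pursuer win rate over random initial positions."""
    rng = random.Random(seed)
    wins = sum(simulate_once(rng, v_p=v_p) for _ in range(trials))
    return wins / trials


rate = win_rate(trials=200, seed=0)
```

Fixing the seed makes runs repeatable, so two strategies can be compared on the same set of initial positions by swapping the pursuer update inside `simulate_once`.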

Conclusion and Further Works
The paper studies a novel collaborative pursuit-evasion game in a nonconvex three-dimensional polygonal environment in which the pursuers are two heterogeneous agents, that is, a UAV and a UGV. The evader is intelligent and is able to use obstacles to hide, which blinds the pursuers. The challenge of the novel game is therefore the double constraint of movement and terrain. Since the agents have different speeds, the map is unified by a grid and the BVP method is used as the move generator. Then, the worst cases of both evader and pursuers are analyzed. According to the state of the evader relative to the pursuers, the collaborative strategy is divided into three situations:
(1) When the evader is in the visual field of the pursuers, an algorithm is proposed for maximizing the probability of discovering the evader
(2) When the evader just disappears from the visual field of the pursuers, a "collaborative intercept" strategy is proposed based on the lion and man problem
(3) When the position of the evader is completely unknown to the pursuers, the pursuers carry out a collaborative search
Further works include the following: the case in which the UAV only provides a visual field but does not participate in the pursuit, for which the optimal visual field and strategy need further research. In addition, the analysis of the impact of different initial positions, as well as experiments (Figure 6), is also important future work.
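The switching between the three situations can be pictured as a small state classifier. The sketch below is an assumption-laden illustration, not the paper's implementation: the strategy names attached to each case follow the simulation discussion (linear pursuit in Case 1, collaborative interception in Case 2, collaborative search in Case 3), and the `lost_horizon` threshold that separates "just disappeared" from "completely unknown" is hypothetical, since the text does not specify how that transition is decided.

```python
from enum import Enum


class Case(Enum):
    VISIBLE = 1    # Case 1: evader is in the pursuers' visual field
    JUST_LOST = 2  # Case 2: evader has just disappeared from view
    UNKNOWN = 3    # Case 3: evader position is completely unknown


# Strategy names as used in the simulation discussion.
STRATEGY = {
    Case.VISIBLE: "linear pursuit",
    Case.JUST_LOST: "collaborative interception",
    Case.UNKNOWN: "collaborative search",
}


def classify(visible_now, steps_since_seen, lost_horizon):
    """Map the pursuers' sensing state to one of the three cases.

    visible_now: True if any pursuer currently sees the evader.
    steps_since_seen: time steps since the last sighting, or None if the
        evader has never been seen.
    lost_horizon: hypothetical threshold after which a lost evader is
        treated as completely unknown.
    """
    if visible_now:
        return Case.VISIBLE
    if steps_since_seen is not None and steps_since_seen <= lost_horizon:
        return Case.JUST_LOST
    return Case.UNKNOWN


# Example: the evader rounded an obstacle three steps ago.
current = classify(visible_now=False, steps_since_seen=3, lost_horizon=10)
chosen = STRATEGY[current]
```

Because one pursuer's sighting is shared with the other, a single `visible_now` flag for the team is enough in this sketch.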
Data Availability
The data are confidential, so they were not uploaded.

Conflicts of Interest
The authors declare that they have no conflicts of interest.