Multi-UUV Cooperative Dynamic Maneuver Decision-Making Algorithm Using Intuitionistic Fuzzy Game Theory

In this paper, a multi-unmanned underwater vehicle (UUV) cooperative dynamic maneuver decision-making algorithm is proposed based on the combination of game theory and intuitionistic fuzzy sets. Underwater environments with weak connectivity, underwater noise, and dynamic uncertainties are fully considered through intuitionistic fuzzy sets, which addresses one of the main problems in making decisions underwater. Subsequently, the intuitionistic fuzzy multiattribute evaluation of a UUV maneuver strategy is conducted, and the intuitionistic fuzzy payment matrix of the cooperative dynamic maneuver game is obtained. Thereafter, the Nash equilibrium condition is proposed to satisfy the intuitionistic fuzzy total order, and the Nash equilibrium maneuver decision-making model under a dynamic underwater environment is established. Meanwhile, the modified particle swarm optimization method is presented to solve the established problem and find the optimal strategy. Finally, an example is used to verify the superiority of the proposed cooperative dynamic maneuver decision-making algorithm.


Introduction
Unmanned underwater vehicles (UUVs) are characterized by their small size, superior maneuverability, low cost, and good stealth. They can operate independently or under manned operation. The multi-UUV control algorithm has gained increased attention; it involves multi-UUV cooperative formation control, cooperative navigation, cooperative confrontation, etc. [1][2][3][4][5]. At present, research on multi-UUV collaborative formation and navigation has developed rapidly [2,6]. However, studies on multi-UUV cooperative confrontation are still quite limited. Multi-UUV cooperative confrontation can be applied to marine scientific investigation and military confrontation, including underwater multitarget tracking, surveillance, and detection, effectively increasing the radius of underwater operation and reducing underwater equipment losses and casualties.
Maneuver decision-making is key to multi-UUV cooperative confrontation because it acts as the basic action in each confrontation step [7]. Current studies on maneuver decision-making mainly focus on unmanned aerial vehicles (UAVs) and land unmanned system (LUS) clusters, among others [8]. There are many studies on single-agent control technology but only a few on multiagent decision-making technologies.
There are also various studies on unilateral strategy optimization but few on bilateral game theory. Moreover, most references study path planning for a single agent. For instance, in [4], the simulations give a numerical experiment with six agents. Therefore, a more scientific and accurate real-time confrontation strategy can be formulated by introducing cooperative game theory into the maneuvering and decision-making of unmanned system clusters [9]. However, Wang et al. have only studied AUV strategies and have not discussed the UAV number in their study.
Compared with land or air unmanned vehicle cluster confrontation, multi-UUV cooperative game theory has some unique features. First, a UUV has a low communication rate, poor information interaction ability, and weak perception because of the weak connectivity in underwater environments, which makes it difficult to locate a UUV cluster precisely and restricts the decision-making process. Second, in the confrontation process, both the antagonistic situation and the cooperation between UUVs in the cluster must be considered. In addition, underwater confrontation in a real-time situation is dynamic and lasts for several rounds, which makes it more complicated.
Dai et al. used game theory to realize the decision-making of noncommunication multirobot (three in one experiment) tasks [10], in which the joint probability distribution of the robots was established according to the distance information, and the dynamic game process with incomplete information was presented. An approximate dynamic programming method was proposed for the one-to-one air combat maneuver problem in [11]. The discrete simulation model of the UAV air combat was analyzed and validated through game theory by Poropudas et al. [12]. Suresh and Ghose synthetically considered the number of UAVs, weapon configuration, and ground defense system, discussed the tactical cooperation of UAVs in ground confrontation, and proposed a four-to-four UAV grouping algorithm based on Dubins' path [13]. The game characteristics between 22 multirobot patrol formation and patrolled objects were analyzed by Hernandez et al., and a distributed dynamic collaboration method based on game theory was proposed in [14]. Dahl et al. proposed the application of space chain scheduling to solve the cooperative game problem in three-to-three multirobot task allocation [15]. Wang et al. studied a cooperative game-based autonomous cluster aggregation strategy for the cluster aggregation behavior of a UAV cluster in implementing reentry target-oriented cooperative surveillance [16]. For underwater confrontation, Muhammed et al. proposed different kinds of game theories for cooperation among acoustic sensor nodes and compared their performances under different conditions [17]. However, these existing studies concentrated on unmanned aerial and land system clusters and have not fully considered the impact of underwater environmental characteristics.
This study focuses on two key factors in underwater maneuver decision-making, namely, its weak interconnection characteristic and dynamic confrontation process. Weak interconnection, including weak connectivity, underwater noise, and dynamic uncertainties, leads to uncertain payments in the maneuver decision-making process of UUVs. Classical game theory only discusses the game with clear payments [18]. However, in an actual underwater environment, the information provided is mostly fuzzy. If this fuzzy information is converted into a clear value directly, it will lead to distortion and loss of real information. Consequently, the maneuver decision-making algorithm will naturally lose its viability as a strategy choice. Therefore, in this study, a cooperative dynamic maneuver decision-making algorithm is proposed based on intuitionistic fuzzy game theory. Underwater environments with different kinds of uncertainties are fully considered through the intuitionistic fuzzy sets, which addresses one of the main problems of the underwater decision-making process. Meanwhile, the intuitionistic fuzzy multiattribute evaluation of the UUV maneuver strategy is performed, and the intuitionistic fuzzy payment matrix of a mobile game is obtained. The Nash equilibrium condition satisfying the intuitionistic fuzzy total order is proposed, and the Nash equilibrium maneuver decision-making model under a dynamic underwater environment is established. Finally, the modified particle swarm optimization method is used to solve the established problem and find the optimal strategy. The general diagram of the maneuver decision-making process is shown in Figure 1.
The rest of this paper is organized as follows. Section 2 presents the maneuver attribute evaluation process. Section 3 provides the decision-making model based on intuitionistic fuzzy game theory. Section 4 gives the main result on the existence of the Nash equilibrium. The cooperative dynamic maneuvering strategy optimization is presented in Section 5. Section 6 shows an example of multi-UUV confrontation. Finally, conclusions are drawn in Section 7.

Maneuver Attribute Evaluation
To establish the fuzzy payoff matrix, the evaluation of multi-UUV maneuver attributes is presented according to the information based on the situation of different confronting sides. The confrontation trajectory of a multi-UUV system can be regarded as a combination of multiple maneuver actions.
There are seven basic maneuver actions for UUVs, namely, keep the pace, speed up, speed down, left turn, right turn, pitch up, and pitch down. It should be noted that these actions might be limited according to the features of the UUV. The two confrontation sides are named "A" and "D," respectively. The maneuver strategy sets $S_A$ and $S_D$ of sides "A" and "D" are defined as

$$S_A = \{a_1, a_2, \ldots, a_7\}, \quad S_D = \{d_1, d_2, \ldots, d_7\}, \tag{1}$$

where $a_1$ and $d_1$ denote the strategy "keep the pace," $a_2$ and $d_2$ denote the strategy "speed up," $a_3$ and $d_3$ denote the strategy "speed down," $a_4$ and $d_4$ denote the strategy "left turn," $a_5$ and $d_5$ denote the strategy "right turn," $a_6$ and $d_6$ denote the strategy "pitch up," and $a_7$ and $d_7$ denote the strategy "pitch down." Four attributes are considered in the maneuver attribute set $M$:

$$M = \{H, V, E, F\}, \tag{2}$$

where $H$ is the distance factor, $V$ is the velocity factor, $E$ is the deflection angle, and $F$ denotes the depression angle.
The main difference between multi-UUV confrontation and other confrontations with autonomous robots is the information transmission mode. Due to the submarine environment, the information in a multi-UUV confrontation process is mainly received through underwater acoustics. The shallow water acoustic channel is a channel with time-space-frequency variation [19]. It has strong multipath interference, high environmental noise, large transmission loss, and a notable Doppler shift effect [19]. Therefore, the information provided in the multi-UUV confrontation process has strong uncertainties. It is difficult to accurately quantify the extent of the threat of each side during the decision-making process [20]. Hence, in this paper, each attribute is divided into seven levels by using an intuitionistic fuzzy language. In a practical confrontation, the fuzzy language can be transformed into a certain set to participate in the decision-making process.
Because the intuitionistic fuzzy set can measure the degree of fuzziness of the original information more comprehensively, the fuzzy language is transformed into intuitionistic fuzzy sets here [20,21]. The relationship between the fuzzy language and fuzzy sets is listed in Table 1. The importance level $l_{ij}$ of the $i$th maneuver attribute factor $m_i$ relative to the $j$th one $m_j$ is obtained according to experience and practical problems, as presented in Table 2 [22]. Therefore, the importance level matrix $L$ can be obtained as

$$L = (l_{ij})_{4 \times 4}, \tag{3}$$

where $l_{ji}$ is the inverse of $l_{ij}$ according to the definition of the importance level. The threat weight $w_i$ $(i = 1, 2, 3, 4)$ of each attribute is obtained as given in equation (4).
A multi-UUV confrontation model generally takes one of two forms: a pure strategy model or a mixed strategy model. When the probability of one of the mixed strategies is 1, it becomes a pure strategy model. In an actual confrontation, both sides need to determine their strategies according to the dynamic information of the confrontation process and then obtain the payoff matrix of both sides. According to equation (1), the dimensions of the maneuver strategy sets $S_A$ and $S_D$ are both $n = 7$. Thus, "A" selects its maneuver strategy from $a_1, a_2, \ldots, a_n$, and "D" selects from $d_1, d_2, \ldots, d_n$. The intuitionistic fuzzy evaluation $f_{ij} = (u_{ij}, v_{ij})$ $(i, j = 1, 2, \ldots, n)$ can be obtained according to Table 1 to quantitatively evaluate the chosen strategy, where $u_{ij}$ and $v_{ij}$ are the membership and nonmembership degrees [23]. Therefore, a fuzzy evaluation matrix of "A" is formed under each of the attributes $m_1, m_2, m_3, m_4$.
Definition 1. For the intuitionistic fuzzy sets $f_k = (u_k, v_k)$ $(k = 1, 2, 3, 4)$, the weighted arithmetic integration factor ($\mathrm{IFWA}_w$) is defined as

$$\mathrm{IFWA}_w(f_1, f_2, f_3, f_4) = \left(1 - \prod_{k=1}^{4} (1 - u_k)^{w_k}, \; \prod_{k=1}^{4} v_k^{w_k}\right),$$

where $w_k$ is the threat weight, which satisfies $0 \le w_k \le 1$ and $\sum_{k=1}^{4} w_k = 1$. Therefore, the fuzzy payoff matrix $F$ is obtained by aggregating, for each strategy pair, the four attribute evaluations with $\mathrm{IFWA}_w$.
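The aggregation in Definition 1 can be illustrated with a minimal Python sketch. It assumes the standard intuitionistic fuzzy weighted averaging form, $(1 - \prod_k (1 - u_k)^{w_k}, \prod_k v_k^{w_k})$; the function name `ifwa`, the four attribute evaluations, and the weights below are hypothetical values chosen for illustration and are not taken from Table 1 or Table 2.

```python
from math import prod


def ifwa(values, weights):
    """Aggregate intuitionistic fuzzy values (u, v) with weights that sum to 1.

    Assumed form: (1 - prod((1 - u_k) ** w_k), prod(v_k ** w_k)).
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    u = 1.0 - prod((1.0 - uk) ** wk for (uk, _), wk in zip(values, weights))
    v = prod(vk ** wk for (_, vk), wk in zip(values, weights))
    return u, v


# Hypothetical evaluations of one strategy pair under H, V, E, F (illustrative only).
evaluations = [(0.7, 0.2), (0.5, 0.4), (0.6, 0.3), (0.4, 0.5)]
# Hypothetical threat weights (illustrative only).
weights = [0.4, 0.3, 0.2, 0.1]

# One entry of the fuzzy payoff matrix for this strategy pair.
payoff_entry = ifwa(evaluations, weights)
print(payoff_entry)
```

Repeating this aggregation for every pair $(a_i, d_j)$ would fill the $n \times n$ payoff matrix. A useful sanity check is that aggregating four identical evaluations returns that same evaluation.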

Decision-Making Model Based on Intuitionistic Fuzzy Game Theory
According to the above preliminaries in Section 2, a multi-UUV cooperative dynamic maneuver decision-making model is built in this section based on intuitionistic fuzzy game theory. In actual confrontation, as the real-time information changes, it is difficult for both sides to obtain each other's strategy in advance, so it is quite hard to produce the optimal strategy. The main characteristic of game theory is that the action schemes adopted by the participants are interdependent, and the gains depend on the strategies adopted by both the participants and others. Then, the optimal solution can also be found under the condition that the information of the opponents is incomplete. The maneuver game under an uncertain underwater environment discussed in this paper essentially belongs to the category of two-person zero-sum games [12]. Each of the confronting sides is regarded as a player in the confrontation process. Because of the uncertainty of the underwater environment and the weakness of interconnection, the players' judgment of the situation is often ambiguous and uncertain. Therefore, the two-scale intuitionistic fuzzy set is used to solve such problems.
Let $F$ be the fuzzy payoff matrix, and let the players "A" and "D" choose the pure strategies $a_n \in S_A$ and $d_n \in S_D$ with probabilities $x_n$ and $y_n$ $(n = 1, 2, \ldots, 7)$, respectively. Denote $x = (x_1, x_2, \ldots, x_7)^T$ and $y = (y_1, y_2, \ldots, y_7)^T$; $x$ and $y$ are called the mixed strategies of "A" and "D." Then, denote

$$X = \left\{ x \;\middle|\; \sum_{n=1}^{7} x_n = 1, \; x_n \ge 0 \right\}, \quad Y = \left\{ y \;\middle|\; \sum_{n=1}^{7} y_n = 1, \; y_n \ge 0 \right\}$$

as the mixed strategy spaces of "A" and "D." Thus, $\Gamma = (A, X, D, Y; F)$ is the intuitionistic fuzzy two-player zero-sum matrix game.

Definition 2.
Under a fixed strategy pair $(x, y)$ with $x \in X$ and $y \in Y$, the expected return of player "A" is

$$E(x, y) = x^T F y, \tag{9}$$

evaluated according to the operations on intuitionistic fuzzy sets [24]. Besides, the expected return value of player "A" is given by equation (10). The membership degree and the nonmembership degree of the intuitionistic fuzzy expected return represent the acceptance and rejection of the strategy by the players, respectively. Owing to the nature of two-scale conflict, the score function method is usually used to rank the intuitionistic fuzzy sets.
Definition 3. Let $a_1 = (u_{a_1}, v_{a_1})$ and $d_1 = (u_{d_1}, v_{d_1})$ be intuitionistic fuzzy sets. The scores $\eta_{a_1} = u_{a_1} - v_{a_1}$ and $\eta_{d_1} = u_{d_1} - v_{d_1}$ represent the degree to which the chosen strategy satisfies the decision requirements, and the accuracies $h_{a_1} = u_{a_1} + v_{a_1}$ and $h_{d_1} = u_{d_1} + v_{d_1}$ represent the accuracy with which the chosen strategy meets the decision requirements. Then, the total order relation of these fuzzy sets is obtained as follows:
When $\eta_{a_1} < \eta_{d_1}$, $a_1$ is less than $d_1$, denoted as $a_1 \subseteq d_1$;
When $\eta_{a_1} = \eta_{d_1}$ and $h_{a_1} < h_{d_1}$, $a_1$ is less than $d_1$, denoted as $a_1 \subseteq d_1$;
When $\eta_{a_1} = \eta_{d_1}$ and $h_{a_1} = h_{d_1}$, $a_1$ is equal to $d_1$, denoted as $a_1 = d_1$.
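The total order above amounts to a lexicographic comparison, first by score and then by accuracy. The following Python sketch illustrates it; the function names and sample values are hypothetical, and exact floating-point equality is used for the tie cases, which a practical implementation would probably replace with a tolerance.

```python
def score(a):
    """Score of an intuitionistic fuzzy value a = (u, v): u - v."""
    return a[0] - a[1]


def accuracy(a):
    """Accuracy of an intuitionistic fuzzy value a = (u, v): u + v."""
    return a[0] + a[1]


def rank_key(a):
    """Sort key realizing the order: compare scores first, then accuracies."""
    return (score(a), accuracy(a))


def if_less(a, b):
    """True when a is strictly below b in the total order."""
    return rank_key(a) < rank_key(b)


# Hypothetical values; the last two share the score 0.25 and differ in accuracy.
values = [(0.5, 0.25), (0.375, 0.125), (0.25, 0.5)]
ranked = sorted(values, key=rank_key)
print(ranked)
```

Here the tie in score between (0.375, 0.125) and (0.5, 0.25) is broken by the accuracy, so the value with the larger $u + v$ ranks higher.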

Definition 4. In the intuitionistic fuzzy zero-sum game, if there exists a strategy pair $(x^*, y^*)$ with $x^* \in X$ and $y^* \in Y$ such that $x^T F y^* \subseteq x^{*T} F y^* \subseteq x^{*T} F y$ for all $x \in X$ and $y \in Y$, then the mixed strategy pair $(x^*, y^*)$ is called the Nash equilibrium strategy of the intuitionistic fuzzy game E.
The existence of the Nash equilibrium strategy is studied in the next section.

Main Result Discussion
To analyze the existence of the Nash equilibrium strategy of the intuitionistic fuzzy game E in equation (9), the following Lemma 1 is introduced first.
Lemma 1 (see [18]). There exists a Nash equilibrium of mixed strategy for a game E if the strategy space $S$ of the game E is a closed and convex set and the payoff function $\psi(\cdot)$ is continuous for any $s \in S$.
Based on Lemma 1, we obtain the existence of the Nash equilibrium strategy of the intuitionistic fuzzy game E, as described by the following Theorem 1.

Theorem 1. For the intuitionistic fuzzy game E, there exists a Nash equilibrium of mixed strategy.
Proof.
The strategy space $(X, Y)$ of an intuitionistic fuzzy game E is mixed. So, for any two mixed strategies $s_1, s_2 \in (X, Y)$ and $0 \le \lambda \le 1$, we have $\lambda s_1 + (1 - \lambda) s_2 \in (X, Y)$, which means that the strategy space $(X, Y)$ is a closed and convex set.
Besides, the expected return value (10) is the payoff function of the intuitionistic fuzzy game E. Equation (10) is continuous for any (x, y) ∈ (X, Y). Based on Lemma 1, there exists a Nash equilibrium of mixed strategy for the intuitionistic fuzzy game E.
This completes the proof of Theorem 1. □
Remark 1. Although the existence of the Nash equilibrium of mixed strategy for the intuitionistic fuzzy game E (9) can be ensured, it is difficult to obtain an analytical solution of the Nash equilibrium. Thus, most research studies try to calculate the numerical solution of the Nash equilibrium by using optimization algorithms. Based on Definition 4, the analysis of optimization algorithms is given as follows.
According to the definition of the Nash equilibrium of mixed strategy for the intuitionistic fuzzy game E in Definition 4, the optimal strategy of "A" is to maximize its intuitionistic fuzzy expected return. On the other side, the optimal strategy of "D" is to minimize its loss. Therefore, according to the maximum and minimum theorem of game theory [25], the nonlinear programming model (11) can be used here to find the optimal confrontation strategy, where $\max(\rho, \sigma)$ is the optimal expected return, which satisfies equation (11), and $\vartheta_j$ is the $j$th pure strategy of $y$ with the mixed strategy of $x$. The notations E − (·) and ⊇ are defined in Definitions 2 and 3, respectively. Based on Definition 4, the optimal expected return $\max(\rho, \sigma)$ and the optimal mixed strategy $x^*$ can be calculated.
Equivalently, for the mixed strategy $y$, the nonlinear programming model (12) is used, where $\min(\zeta, c)$ is the optimal expected return, which satisfies equation (12), and $\vartheta_j$ is the $j$th pure strategy of $x$ with the mixed strategy of $y$. Based on Definition 4, the optimal expected return $\min(\zeta, c)$ and the optimal mixed strategy $y^*$ can be calculated.
It is difficult to obtain the optimal solutions of equations (11) and (12) analytically. Thus, the method for solving the two optimization problems (11) and (12) is presented in Section 5.
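To make the max-min idea concrete, the following Python sketch searches for a maximin mixed strategy of "A" in a small game. It is only an illustration under several stated assumptions: the game has two strategies per side instead of seven, the 2 x 2 payoff values are hypothetical, the expected return against a pure strategy is computed with the weighted-averaging form $(1 - \prod_i (1 - u_{ij})^{x_i}, \prod_i v_{ij}^{x_i})$, only the score $u - v$ is compared (the accuracy tie-break is omitted), and a plain grid search stands in for the optimization method of Section 5.

```python
from math import prod


def expected_vs_pure(x, payoff, j):
    """Assumed intuitionistic fuzzy expected return of mixed strategy x
    against the j-th pure strategy of the opponent."""
    rows = range(len(x))
    u = 1.0 - prod((1.0 - payoff[i][j][0]) ** x[i] for i in rows)
    v = prod(payoff[i][j][1] ** x[i] for i in rows)
    return u, v


def score(a):
    """Score of an intuitionistic fuzzy value (u, v)."""
    return a[0] - a[1]


def maximin_grid(payoff, steps=100):
    """Grid search over x = (p, 1 - p) that maximizes the worst-case score."""
    best_x, best_value = None, float("-inf")
    for k in range(steps + 1):
        p = k / steps
        x = (p, 1.0 - p)
        # Worst case over the opponent's pure strategies.
        worst = min(score(expected_vs_pure(x, payoff, j))
                    for j in range(len(payoff[0])))
        if worst > best_value:
            best_x, best_value = x, worst
    return best_x, best_value


# Hypothetical 2 x 2 intuitionistic fuzzy payoff matrix (u, v); illustrative only.
payoff = [
    [(0.8, 0.1), (0.2, 0.7)],
    [(0.3, 0.6), (0.7, 0.2)],
]
best_x, best_value = maximin_grid(payoff)
print(best_x, best_value)
```

In this toy game, either pure strategy of "A" has a negative worst-case score, whereas the mixed strategy found by the search guarantees a positive one, which is the motivation for optimizing over mixed strategies.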

Cooperative Dynamic Maneuver Strategy Optimization
The intuitionistic fuzzy payoff matrix is obtained, and the planning model is established according to the above attribute evaluation. In this section, the optimal maneuvering strategy of a multi-UUV game is obtained through the modified particle swarm optimization (MPSO) method. Variable detection vectors are added to widen the particle exploration space in the proposed MPSO method. Moreover, the learning strategy is improved to help the particles jump out of the local optimum. Assuming that the problem is in $D$-dimensional space, the velocity vectors and position vectors are defined as

$$V_i = (V_{i1}, V_{i2}, \ldots, V_{iD}), \quad P_i = (P_{i1}, P_{i2}, \ldots, P_{iD}).$$

The update equations of velocity and position can be expressed as

$$V_{id}(t+1) = \alpha V_{id}(t) + c_1 \, \mathrm{rand}_1 \left( p^{\mathrm{Best}}_{id} - P_{id}(t) \right) + c_2 \, \mathrm{rand}_2 \left( g^{\mathrm{Best}}_{d} - P_{id}(t) \right),$$
$$P_{id}(t+1) = P_{id}(t) + V_{id}(t+1),$$

where $\alpha$ is the linearly declining inertial weight coefficient, $c_1, c_2$ are the acceleration coefficients, $\mathrm{rand}_1, \mathrm{rand}_2$ are random numbers generated from $[0, 1]$, $p^{\mathrm{Best}}_{id}$ represents the best location for the $i$th particle (individual optimum), and $g^{\mathrm{Best}}_{d}$ represents the best location in the whole population (global optimum).
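The baseline velocity and position updates can be sketched in Python as follows. This is a plain particle swarm optimizer with a linearly declining inertia weight, not the MPSO of this paper: the detection vector and the learning strategy are not included. The function names, the default coefficients ($c_1 = c_2 = 2$, inertia from 0.9 to 0.4), the position clamping, and the sphere test function are assumptions made for illustration, and the sketch minimizes a fitness value, so a return to be maximized would need its sign flipped.

```python
import random


def linear_inertia(t, t_max, alpha_start=0.9, alpha_end=0.4):
    """Inertia weight that declines linearly from alpha_start to alpha_end."""
    return alpha_start - (alpha_start - alpha_end) * t / t_max


def pso_minimize(fitness, dim, lower, upper, n_particles=20, iters=50,
                 c1=2.0, c2=2.0, seed=0):
    """Plain PSO; returns the best position, its fitness, and the best-value history."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lower, upper) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    p_best = [p[:] for p in pos]                 # individual optima
    p_best_val = [fitness(p) for p in pos]
    g_idx = min(range(n_particles), key=p_best_val.__getitem__)
    g_best, g_best_val = p_best[g_idx][:], p_best_val[g_idx]  # global optimum
    history = [g_best_val]
    for t in range(iters):
        alpha = linear_inertia(t, iters)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + individual term + global term.
                vel[i][d] = (alpha * vel[i][d]
                             + c1 * r1 * (p_best[i][d] - pos[i][d])
                             + c2 * r2 * (g_best[d] - pos[i][d]))
                # Position update, clamped to the search box (an added assumption).
                pos[i][d] = min(upper, max(lower, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < p_best_val[i]:
                p_best[i], p_best_val[i] = pos[i][:], val
                if val < g_best_val:
                    g_best, g_best_val = pos[i][:], val
        history.append(g_best_val)
    return g_best, g_best_val, history


def sphere(p):
    """Simple unimodal test function with its minimum at the origin."""
    return sum(c * c for c in p)


best, best_val, history = pso_minimize(sphere, dim=2, lower=-5.0, upper=5.0)
print(best, best_val)
```

Because the global optimum is only replaced by a better value, the recorded history never increases; this monotonicity is a convenient check when modifications such as a detection vector are added to the velocity update.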
In practice, the fitness function is generally multimodal. When a particle is trapped in a local optimum, the proposed parameter optimization algorithm should be able to change its original trajectory to adaptively explore a new solution space. To achieve this, the learning strategy is applied in the proposed MPSO method. There are two key points to be emphasized here. First, to improve the dynamic performance of PSO, a new velocity update equation is designed. Then, a backward learning strategy based on adaptive Gauss distribution is proposed to overcome the blindness in stochastic evolutionary search, which enables particles to escape from the local optimum. It should be noted that the proposed MPSO algorithm with the learning ability does not increase the time complexity compared with the original PSO algorithm. The detailed steps of the MPSO with the learning ability are shown in Figure 2.
In recent years, many studies have observed that if the particles converge too fast, they will shrink locally in several generations [26]. This phenomenon leads to a similar search behavior among individuals and loss of population diversity. If the particles are trapped in a local region, it will be difficult to have them jump out of the local optimum because of their similar search behavior and lack of adaptive detection ability. To improve the performance of the PSO algorithm, particles should be able to adaptively change the original trajectory and explore new spaces. The problem here is how to guide the particles to move to different regions, which might contain the global optimum, and explore the solution space more extensively. Therefore, in this section, an improved velocity update with an adaptive detection vector is proposed. The added detection vector $(R(t) - P_{id}(t))$ helps the particles cover a wider range of solutions with a larger probability through the adaptive variable detection radius $R(t)$. The radius $R(t)$ depends on the following quantities: $\mu \in [0, 1]$ is a random number, $P_d^{\max}$ and $P_d^{\min}$ are the upper and lower boundaries of the problem, $\lambda$ is a variable parameter, and $t$ represents the iteration number. The velocity update equation of the algorithm shows that the group members can explore unvisited regions with high probability in the solution space. A large detection radius enhances the exploratory behavior of the particle, enables it to leave the current region, and encourages it to search other regions. A small detection radius enhances exploitation by searching a small area near the optimum solution. Hence, the entire feasible solution space can be covered and explored as much as possible using the velocity update equation with the adaptive variable detection vector.

Example
In this section, an example is given to verify the effectiveness of the proposed decision-making algorithm. Suppose "A" and "D" are engaged in a two-vs-two underwater confrontation, which means that there are four UUVs: "A1" and "A2" on one side and "D1" and "D2" on the other. Each UUV has seven strategies according to equation (1). Both sides have the same control ability, and the time interval of the confrontation steps is 5 s. It is evident that "D" possesses some advantage at the beginning. Notably, the maximum number of maneuver steps should be decided according to the effectiveness of the UUV used in the confrontation.
There are 40 steps in the confrontation process, whose return values are shown in Figure 3. According to Section 5, the obtained return values show that the Nash equilibrium condition of the intuitionistic fuzzy game is satisfied. Based on Figure 3, this is a very weak dominant strategy equilibrium. In theory, the strategy sets of "A" and "D" are the same, such that their strategy equilibrium is only very weakly dominant.
To compare the confrontation performance, "A" employs the cooperative dynamic maneuver decision-making algorithm proposed in this study, and "D" employs the max-min decision-making algorithm during the multi-UUV confrontation process [25]. The three-dimensional confrontation process with five main stages is shown in Figures 4-8. The red dotted line represents the path of "A1," the red solid line represents "A2," and the blue dotted and solid lines represent "D1" and "D2," respectively. The "⋆" shows the initial position, and "△" shows the current position. The confrontation ends when the return value of one side reaches the absolute advantage.
For stage 1, the calculated optimal mixed strategy of "A" is presented in Table 3 and depicted in Figure 4; "D" possesses the dominant position, in which "D1" tries to attack "A1" and "D2" moves towards "A2". The optimal mixed strategy of "A" for stage 2 is calculated and listed in Table 4. As shown in Figure 5, "D1" and "D2" try to attack "A2," and "A1" attempts to turn to escape. Table 5 presents the optimal mixed strategy of "A" in stage 3; in Figure 6, "D1" and "D2" continue to attempt to attack "A2," but "A2" turns to escape, and "A1" turns to return to the confrontation. In stage 4, the optimal mixed strategy of "A" is shown in Table 6. "A2" turns continuously and escapes successfully, "A1" also turns and tries to move towards "D1" and "D2," and "D1" and "D2" turn back to "A2" in Figure 7. The situation changes here, in that "A" now possesses the dominant position. This is also validated in Figure 3, in which the return values change from negative to positive. In the end, both "A1" and "A2" possess the dominant positions, such that "A" achieves the absolute advantage and ends the confrontation, which is illustrated in Table 7 and Figure 8. The example validates the effectiveness of the proposed multi-UUV maneuver decision-making algorithm.

Conclusion
In this study, an intuitionistic fuzzy set is introduced into game theory to examine the cooperative dynamic maneuver decision-making algorithm for a multi-UUV system. The characteristics of the underwater environment, including different kinds of uncertainties, are expressed using intuitionistic fuzzy sets. The maneuver game model with intuitionistic fuzzy information is established, and the condition of the Nash equilibrium strategy is presented. Combined with the background and model characteristics, the optimal maneuver strategy is obtained using MPSO in each step of the dynamic confrontation process. Moreover, an example of a multi-UUV dynamic confrontation with several maneuver decision-making steps is utilized to show the superiority and effectiveness of the proposed maneuver decision-making algorithm.

Data Availability
The data used to support the findings of this study are currently under embargo while the research findings are commercialized. Requests for data, 6 months after publication of this article, will be considered by the corresponding author.

Conflicts of Interest
The authors declare that they have no conflicts of interest.

Authors' Contributions
The work presented in this paper was a collaboration of all authors. Lu Liu contributed the idea and wrote the paper. Lichuan Zhang did the strategy optimization and reviewed the paper. Shuo Zhang analyzed the performance of the multi-UUV system. Sheng Cao developed the simulation software.