Cooperative Task Allocation Method of MCAV/UCAV Formation

Unmanned Combat Aerial Vehicle (UCAV) cooperative task allocation under the limited control of a Manned Combat Aerial Vehicle (MCAV) is one of the important problems in the UCAV research field. We analyze the key technical and tactical indices that influence the task allocation problem and build a model that maximizes the objective function value while reflecting various constraints. A novel improved multigroup ant colony algorithm (IMGACA) is proposed to solve the model; the algorithm mainly includes a random sequence-based UCAV selection strategy, a constraint-based candidate task generation strategy, an objective function value-based state transition strategy, and a crossover operator-based local search strategy. Simulation results show that the built model is reasonable and that the proposed algorithm performs well in feasibility, timeliness, and stability.


Introduction
UCAVs equipped with communication systems have been widely used on the battlefield under fierce countermeasures and complex battlefield environments [1]. By forming task coalitions, MCAVs cooperate with UCAVs to execute combat tasks, and in this way each can exert its respective advantages [2]. Taking advantage of strong maneuverability, outstanding stealth, and low cost, UCAVs go deep into the battlefield front with its harsh environment and carry out various combat tasks such as battlefield detection, interference, attack, and assessment, while MCAVs mainly provide command and control (C2) from outside the combat zone and give full play to MCAV commanders' comprehensive judgment and rational decision-making ability.
The MCAV/UCAV cooperative task allocation problem is one of the key technologies in MCAV/UCAV cooperative engagement, and it is an NP-hard problem [3]. Solving it first takes task types, values, and locations into account, then establishes an allocation model that maximizes the allocation efficiency function while meeting all technical and tactical constraints, and finally applies an appropriate algorithm to solve the model. Currently, research on the problem focuses primarily on model building [4][5][6] and algorithm solving [7,8].
In terms of task allocation models, researchers have usually abstracted the task allocation problem into existing mature models such as the multiple travelling salesman problem (MTSP), mixed integer linear programming (MILP), and the vehicle routing problem (VRP). Besides, the Air Force Research Laboratory (AFRL) proposed a model called the capacitated vehicle routing problem with time window (CVRPTW) and applied it to reconnaissance mission planning of Global Hawks and Predators [9].
As far as task allocation algorithms are concerned, they mainly include centralized algorithms and distributed algorithms. The former include integer programming algorithms, swarm intelligence heuristic algorithms, and evolutionary algorithms; the latter include the contract net protocol and auction algorithms, which are based on market mechanisms and realize dynamic task allocation through negotiation. In [10], a novel task allocation method based on the genetic algorithm (GA) was proposed, which realized quick and effective allocation within a relatively short time. In [11], the balance of UCAVs' task load was considered, an improved particle swarm algorithm with adaptive inertia weight was introduced to solve the allocation model, and a business contract mechanism was finally adopted to realize task coordination. Choi et al. [12] put forward two decentralized algorithms to coordinate a fleet of autonomous vehicles: the consensus-based auction algorithm (CBAA) and its generalization to the multiallocation problem, the consensus-based bundle algorithm (CBBA). Both produce conflict-free feasible solutions that are robust to inconsistencies in situational awareness and to changes in communication topology, via synchronous exchange of information status. Furthermore, to solve the information transmission chaos that arises when the approach of [12] is used in an asynchronous communication system, an asynchronous consensus-based bundle algorithm (ACBBA) was proposed in [13]; however, it still could not realize cooperation between UCAVs.
Ant colony algorithm (ACA) was first presented by the Italian scholars Dorigo and Gambardella [14] to solve the travelling salesman problem (TSP), and it has since been applied to various kinds of combinatorial optimization problems [15][16][17]. The principle of ACA is to simulate the foraging mechanism of real ant colonies in nature and to introduce heuristic information into the pheromone release/evaporation mechanism, which speeds up the convergence of ACA.
The remainder of this paper is organized as follows. Section 2 builds a task allocation optimization model of the MCAV/UCAV formation that considers various constraints as a whole and analyzes the mathematical complexity of the model. Based on ACA, Section 3 proposes a novel improved multigroup ant colony algorithm (IMGACA), which includes a random sequence-based UCAV selection strategy, a constraint-based candidate task generation strategy, an objective function value-based state transition strategy, and a crossover operator-based local search strategy. These strategies enhance the algorithm's ability to eliminate conflicts and escape from local optima. In Section 4, three simulation cases are designed to verify that the algorithm is feasible, time-effective, and stable. Finally, Section 5 gives a summary and discusses future work.

Model Building
Basic elements in the MCAV/UCAV cooperative task allocation problem include MCAVs, UCAVs, and tasks. Under the centralized-distributed command and control structure of the MCAV, the UCAV formation attacks enemy targets cooperatively. The set of UCAVs is $U = \{U_1, U_2, \ldots, U_{N_V}\}$, where $N_V$ is the number of UCAVs in the set, and the attributes of $U_i$ ($i = 1, 2, \ldots, N_V$) mainly include location, value, task load $TL_i$, and velocity. The set of tasks is $T = \{T_1, T_2, \ldots, T_{N_T}\}$, where $N_T$ is the number of tasks in the set, and the attributes of $T_j$ ($j = 1, 2, \ldots, N_T$) mainly include location and value. There are no no-fly zones or unexpected obstacles.
Figure 1 shows a typical task execution process of an MCAV/UCAV task coalition. Tasks are mainly executed by UCAVs; the MCAV provides cooperative guidance support only when a UCAV requests cooperation.

Assumptions.
To simplify the problem, three assumptions are given.
Assumption 1. For the MCAV/UCAV cooperative engagement system, no platform is limited by flight range.
Assumption 2. All platforms move in a straight line; that is, there are no platform kinematical constraints.
Assumption 3. The number of tasks $N_T$ does not exceed the sum of the task loads of all UCAVs, that is, $N_T \le \sum_{i=1}^{N_V} TL_i$, so all tasks can be executed.

Definition of Decision Variables and Constraint Conditions.
To build the mathematical model for MCAV/UCAV cooperative task allocation, we first define the related decision variables.
(i) Allocation variable $x_{ij}$, the element of the task allocation matrix $X = (x_{ij})_{N_V \times N_T}$: $x_{ij} = 1$ denotes that $U_i$ executes $T_j$, while $x_{ij} = 0$ denotes the opposite.
(ii) Transfer variable $y^{i}_{kj}$, which represents the transfer status between tasks: $y^{i}_{kj} = 1$ denotes that $T_j$ is assigned to $U_i$ after it executes $T_k$, while $y^{i}_{kj} = 0$ denotes the opposite. In particular, if $k = j$, then $y^{i}_{kj} = 0$.
For $x_{ij}$ and $y^{i}_{kj}$, if $x_{ij} = 1$, there are two cases. In the first case, $T_j$ is assigned to $U_i$ after it executes $T_k$, which means $y^{i}_{kj} = 1$. In the second case, $U_i$ is assigned a task for the first time; that is, $T_j$ has no leading task, which means $y^{i}_{kj} = 0$. Assuming that a virtual task $T_0$ is assigned to all UCAVs at first, the constraint relationship between $x_{ij}$ and $y^{i}_{kj}$ can be expressed as formula (1). Meanwhile, after executing $T_k$, $U_i$ can be assigned only one task, which gives formula (2). In addition, in consideration of the task load constraint, the task number constraint, and so on, we give the task allocation constraints as formulas (3)-(5), where formula (3) represents the task load constraint, formula (4) states that a task can be executed by a UCAV only once, and formula (5) represents the task number constraint.
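The three allocation constraints described above, together with Assumption 3, can be checked mechanically for a candidate allocation matrix. The sketch below is only an illustration under stated assumptions: the function name `check_allocation`, the 0-based indexing, and the reading of formulas (3)-(5) as "row sum at most the task load", "column sum exactly one", and "total equals the number of tasks" are inferred from the prose and are not taken from the original formulas.

```python
from typing import List


def check_allocation(x: List[List[int]], task_load: List[int]) -> List[str]:
    """Return the violated constraints; an empty list means x is feasible.

    x[i][j] == 1 means UCAV i executes task j (0-based indices here).
    task_load[i] is the maximum number of tasks UCAV i may take.
    Assumes len(task_load) == len(x).
    """
    n_v = len(x)
    n_t = len(x[0]) if x else 0
    violations: List[str] = []

    # Assumption 3: the tasks must fit into the total task load.
    if n_t > sum(task_load):
        violations.append("assumption 3: more tasks than total task load")

    # Task load constraint (assumed reading of formula (3)).
    for i in range(n_v):
        if sum(x[i]) > task_load[i]:
            violations.append(f"task load exceeded for UCAV {i + 1}")

    # Each task is executed exactly once (assumed reading of formula (4)).
    for j in range(n_t):
        if sum(x[i][j] for i in range(n_v)) != 1:
            violations.append(f"task {j + 1} is not executed exactly once")

    # Task number constraint (assumed reading of formula (5)).
    if sum(sum(row) for row in x) != n_t:
        violations.append("number of allocated tasks differs from number of tasks")

    return violations
```

For two UCAVs with task loads 2 and 1, the matrix `[[1, 1, 0], [0, 0, 1]]` passes every check, while `[[1, 1, 1], [0, 0, 0]]` fails only the task load check for the first UCAV.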

MCAV Trust Degree.
Because UCAVs are heterogeneous, different UCAVs executing the same task are bound to differ in performance and result. Hence, when decomposing a mission into several tasks, the MCAV commander has a subjective preference, which is defined as the trust degree. The value of the trust degree depends on the task executed quality (TEQ) of the UCAVs. Drawing on the experience of [18], the TEQ of $U_i$ on $T_j$ is defined in terms of $U_i$'s kill probability to $T_j$ ($i = 1, 2, \ldots, N_V$; $j = 1, 2, \ldots, N_T$).

Attack Benefit Function.
The attack benefit is the benefit value gained during the execution of tasks, and it is related to the MCAV trust degree. This paper defines the attack benefit function on that basis.

Attack Cost Function.
When $U_i$ executes combat tasks, it has to pay corresponding costs such as threat cost, flight range cost, and ammunition cost; these three costs are defined below.
Given $T_j$'s damage probability to $U_i$ and the value of $U_i$, the threat cost of $T_j$ to $U_i$ is the first cost term. In addition, keeping the flight range of all UCAVs in the MCAV/UCAV cooperative engagement system as short as possible is also an important optimization objective; the flight range cost is defined from the distance between $U_i$ and $T_j$ and the maximum of these distances. Lastly, if different types of ammunition loaded on UCAVs have a similar attack effect on the same task, the ammunition with the lower cost should be chosen; the ammunition cost in task allocation is defined from the cost of the ammunition loaded on $U_i$. The total cost combines these three costs.

Allocation Model.
MCAV commanders always want the task completion time to be as short as possible, so in the MCAV/UCAV cooperative engagement system it is necessary to introduce the influence of task completion time into the model. Given the task completion time expected by the MCAV commanders, the task completion time and the aging factor are defined in terms of the completion time of each task $T_j$ and a regulatory parameter that lies strictly between 0 and 1.
From the above analysis, it can be seen that MCAV/UCAV cooperative task allocation is a multiobjective rather than a single-objective optimization problem. We adopt the linear weighted sum method to convert it into a single-objective problem; the objective function therefore carries two weighting coefficients, which we usually set to 1.

Mathematical Problems in Engineering
The task allocation model can be expressed as max always to be true.Then, the conclusion just drawn above can be extended to general case; that is, if Theorem 5. When task loads of UCAVs are unlimited, the upper bound of feasible solution number is   = 2(  ,   ).
Proof.According to Assumption 3,   ≤   , so it can be seen that   tasks are allocated to   task loads independently, and the permutation of allocation plan is (  ,   ).So if   tasks are allocated among  V UCAVs, there are two cases: (i) the most extreme case is that all tasks are allocated to one UCAV, and, in this case, the permutation of   tasks is   !; (ii) tasks are allocated to more than two UCAVs; based on Lemma 4, in this case, the permutation of   tasks is less than !.So the total feasible solution number is less than (  ,   ) × (! + !) = 2(  ,   ) × ! = 2(  ,   ).

Establishment of Job-Division Ant Colony.
There are obvious differences between the cooperative task allocation problem and the TSP, which traditional ACA can solve effectively. The main differences are listed as follows.
(i) Task execution has a parallel characteristic; that is, tasks are likely to be executed by UCAVs simultaneously, and it is difficult for traditional ACA to reflect the influence of such parallelism.
(ii) Platforms have a collaborative characteristic: task and resource collaboration exists among different UCAVs, and it is difficult for traditional ACA to handle this kind of mutually coupled collaborative relationship.
(iii) Task execution has an aftereffect characteristic; that is, due to the presence of various constraints, the current task execution order affects subsequent tasks, and it is difficult for traditional ACA to handle this kind of aftereffect.
So, based on the analysis of MCAV/UCAV cooperative task allocation, UCAVs are mapped to ant subgroups in the ant colony, ants that belong to different subgroups but have the same serial number make up ant subclusters, and each subcluster realizes implicit communication via pheromone interaction to generate the allocation plan.
Suppose the number of subgroups in the artificial ant colony is $N_{SG} = N_V$ and that the subgroup of $U_i$ contains $m$ ants. The ant subgroup of $U_i$ is $SG_i = \{\mathrm{ant}_{ik}, k = 1, 2, \ldots, m\}$, and all ant subgroups form the ant colony system $G = \{SG_i, i = 1, 2, \ldots, N_{SG}\}$. Note that $\mathrm{ant}_{ik}$ and $\mathrm{ant}_{jk}$, which belong to different ant subgroups $SG_i$ and $SG_j$, influence each other while generating their own task allocation plans. To handle this, we establish $m$ local tabu tables $\mathrm{Tabu} = \{\mathrm{tabu}_1, \mathrm{tabu}_2, \ldots, \mathrm{tabu}_m\}$; each tabu table records the indices of tasks that have already been executed by other ants, which solves the problem of repeated task execution.
In summary, the job-division ant colony can be shown as the ant matrix in Figure 2: each matrix row represents an ant subgroup, and each matrix column represents an ant subcluster $SC_k = \{\mathrm{ant}_{ik}, i = 1, 2, \ldots, N_V\}$.
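The ant matrix and the local tabu tables can be pictured with a small data structure. The following is a minimal sketch, assuming one tabu table per subcluster (per matrix column); the names `Ant`, `build_colony`, `subcluster`, and `try_assign` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Ant:
    ucav: int      # row index: the ant subgroup (one subgroup per UCAV)
    cluster: int   # column index: the ant subcluster
    tasks: List[int] = field(default_factory=list)  # tasks chosen so far


def build_colony(n_ucav: int, ants_per_subgroup: int) -> List[List[Ant]]:
    """Ant matrix: row i is a subgroup, column k is a subcluster."""
    return [[Ant(i, k) for k in range(ants_per_subgroup)] for i in range(n_ucav)]


def subcluster(colony: List[List[Ant]], k: int) -> List[Ant]:
    """All ants with serial number k, one per UCAV; together they build one plan."""
    return [row[k] for row in colony]


def try_assign(colony: List[List[Ant]], tabu: List[Set[int]],
               i: int, k: int, task: int) -> bool:
    """Give `task` to ant (i, k) unless the tabu table of subcluster k blocks it."""
    if task in tabu[k]:
        return False  # another ant of the same subcluster already took this task
    colony[i][k].tasks.append(task)
    tabu[k].add(task)
    return True
```

With `colony = build_colony(3, 4)` and `tabu = [set() for _ in range(4)]`, a task taken by one ant of subcluster 2 is refused to the other ants of that subcluster but remains available in subcluster 3, since each subcluster builds its own allocation plan.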

Random Sequence-Based UCAV Selection Strategy.
When a specific UCAV is selected to execute a task, a deterministic selection sequence leads the search algorithm to become trapped in a local optimum; so, to guarantee solution diversity during the solving process, a random sequence is adopted to select the UCAV. Moreover, to avoid conflict, UCAVs that have reached their maximum task load must be excluded.
When $\mathrm{ant}_{ik}$ chooses a candidate task from the set $\mathrm{allow}T(\mathrm{ant}_{ik})$, it should screen out those tasks that do not meet the constraints; only in this way can the computational scale and complexity be reduced.
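The random-sequence UCAV selection described in this subsection can be sketched in a few lines. This is an illustration only: the function name `selection_order` and its arguments are hypothetical, and UCAVs are indexed from 0.

```python
import random
from typing import List, Optional


def selection_order(task_load: List[int], assigned_count: List[int],
                    rng: Optional[random.Random] = None) -> List[int]:
    """Return the UCAV indices that can still take a task, in random order.

    task_load[i] is the maximum number of tasks of UCAV i and
    assigned_count[i] is the number of tasks it already holds.
    """
    rng = rng or random.Random()
    # Exclude UCAVs that have reached their maximum task load.
    eligible = [i for i, (cap, used) in enumerate(zip(task_load, assigned_count))
                if used < cap]
    # A random sequence avoids the bias of a fixed selection order.
    rng.shuffle(eligible)
    return eligible
```

Passing a seeded `random.Random` makes a run reproducible, which is convenient when comparing repeated simulations.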

Constraint-Based Candidate Task Generation Strategy.
Based on the above analysis, a candidate task set generating algorithm (CTSGA) is proposed; the details are shown in Algorithm 1.

Objective Function Value-Based State Transition Strategy.
When $\mathrm{ant}_{ik}$ selects a task in $\mathrm{allow}T(\mathrm{ant}_{ik})$, which task to select and how to select it are significant issues. In this paper, the pseudo-random proportional rule is considered a reasonable method; it is given by formula (15), where $\tau_{ij}(t)$ is the pheromone value from $i$ to $j$ in the $t$-th iteration and $\eta_{ij}(t)$ is the heuristic information value from $i$ to $j$ in the $t$-th iteration, while $\alpha$ and $\beta$ represent the pheromone factor and the heuristic information factor, respectively, reflecting the relative importance of pheromone and heuristic information in the solving process. $q_0$ is a constant between 0 and 1, and $q$ is a random number uniformly distributed in [0, 1].
In other words, ants select the best path for state transition with probability $q_0$ and explore a new path by roulette selection with probability $1 - q_0$. $J$ in formula (15) is a task selected by the random proportional rule of formula (16).
ACA has positive feedback, self-organization, and distributed characteristics; the positive feedback mechanism in particular tends to trap the search in a local optimum in later iterations [19]. To address this problem, an adaptive adjustment strategy for task selection is proposed: at the earlier stage of iteration, to improve computational convergence, we set a smaller candidate task size, while at the later stage of iteration, to ensure the diversity of solutions, we set a larger candidate task size.
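The pseudo-random proportional rule described above follows the usual ant colony system pattern: exploit the best-scoring candidate with probability $q_0$, otherwise draw by roulette. The sketch below assumes that pheromone and heuristic values for the transitions out of the current task are stored in dictionaries keyed by candidate task; the function name `choose_task` and the default values of `alpha`, `beta`, and `q0` are hypothetical and are not the paper's settings.

```python
import random
from typing import Dict, List, Optional


def choose_task(candidates: List[int], tau: Dict[int, float], eta: Dict[int, float],
                alpha: float = 1.0, beta: float = 2.0, q0: float = 0.9,
                rng: Optional[random.Random] = None) -> int:
    """Pick the next task among `candidates` by the pseudo-random proportional rule."""
    if not candidates:
        raise ValueError("candidate task set is empty")
    rng = rng or random.Random()
    # Combined attractiveness of each candidate: tau^alpha * eta^beta.
    score = {j: (tau[j] ** alpha) * (eta[j] ** beta) for j in candidates}

    if rng.random() <= q0:
        # Exploitation: take the candidate with the highest score.
        return max(candidates, key=lambda j: score[j])

    # Exploration: roulette selection with probability proportional to the score.
    threshold = rng.random() * sum(score.values())
    running = 0.0
    for j in candidates:
        running += score[j]
        if threshold <= running:
            return j
    return candidates[-1]  # guard against floating-point round-off
```

Setting `q0=1.0` makes the choice fully greedy, and lowering `q0` shifts weight toward roulette exploration.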
The concept of a task selection window, which can be used to shift between intensification and diversification [20], is applied to set the size of the candidate task set dynamically. When $\mathrm{ant}_{ik}$ selects a specific path with a certain probability, not all candidate tasks in $\mathrm{allow}T(\mathrm{ant}_{ik})$ have the chance to be selected as the next executed task; the specific steps are as follows.
Step 1. Sort all tasks in $\mathrm{allow}T(\mathrm{ant}_{ik})$ on the basis of the aggregation operator of the pheromone and heuristic information values.
Step 2. Judge whether the current iteration number has reached half of the total number of iterations; if so, set $\lambda_1 = 0.5$; otherwise, set $\lambda_2 = 0.9$.
Step 3. On the basis of formula (17), generate the updated size of the candidate task set, $\lceil \lambda \cdot |\mathrm{allow}T(\mathrm{ant}_{ik})| \rceil$, where $\lambda$ is the filtration factor, whose value is $\lambda_1$ or $\lambda_2$ depending on the iterative process, and $\lceil \cdot \rceil$ is the round-up operation. Note that if $\mathrm{allow}T(\mathrm{ant}_{ik})$ contains only one task, it is selected immediately.
After all ant subclusters finish a round of searching, the feasible solution space $\Theta_k$ ($k = 1, 2, \ldots, m$) is formed, and we use neighborhood heuristic knowledge to direct its local search; the steps are as follows.
Step 1. Sort all objective function values in the U-T allocation matrix where $x_{ij} = 1$, and find the $U_i$-$T_j$ pair that has the minimum objective function value.
Step 2. Find the UCAVs in U, other than $U_i$, whose remaining task load is greater than 0, put these UCAVs into the set FY, and record the number of elements of FY.
Step 3. Allocate $T_j$ at random to a UCAV in FY, denoted $U_{-i}$; if the variation of the objective function value satisfies $\Delta_{\mathrm{swap}}(U_i, U_{-i}, T_j) > 0$, change the execution platform of $T_j$ from $U_i$ to $U_{-i}$; otherwise, keep the allocation unchanged.
Step 4. Update FY, and repeat Steps 1-3 a fixed number of times; the repetition count is a positive integer, and its usual value is 3.
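Both step lists in this subsection, the task selection window and the reassignment-based local search, can be sketched compactly. The code below is a simplified illustration, not the paper's implementation: it assumes the objective contribution of each UCAV-task pair is available as a separable table `gain[i][j]`, so the variation of the objective is just the difference of two entries. The names `selection_window`, `local_search`, and `gain` are hypothetical, and the filtration factor is passed in directly rather than derived from the iteration counter.

```python
import math
import random
from collections import Counter
from typing import Dict, List, Optional, Sequence


def selection_window(candidates: List[int], score: Dict[int, float],
                     lam: float) -> List[int]:
    """Keep the ceil(lam * n) best-scoring candidates (task selection window)."""
    ranked = sorted(candidates, key=lambda j: score[j], reverse=True)
    return ranked[: math.ceil(lam * len(ranked))]


def local_search(assign: Dict[int, int], gain: Sequence[Sequence[float]],
                 task_load: Sequence[int], rounds: int = 3,
                 rng: Optional[random.Random] = None) -> Dict[int, int]:
    """Try to move the weakest task to another UCAV, `rounds` times.

    assign maps task index -> UCAV index; a copy is modified and returned.
    """
    rng = rng or random.Random()
    assign = dict(assign)
    if not assign:
        return assign
    for _ in range(rounds):
        # Step 1: the assigned pair with the minimum objective value.
        j = min(assign, key=lambda t: gain[assign[t]][t])
        i = assign[j]
        # Step 2: UCAVs other than i that still have remaining task load.
        used = Counter(assign.values())
        fy = [u for u in range(len(task_load)) if u != i and used[u] < task_load[u]]
        if not fy:
            break
        # Step 3: random reassignment, accepted only if the objective improves.
        u = rng.choice(fy)
        if gain[u][j] - gain[i][j] > 0:
            assign[j] = u
    return assign
```

Because a move is accepted only when it improves the objective, the search never worsens the plan it starts from; a real objective that couples tasks through completion time would need the full objective re-evaluated instead of a table lookup.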

Pheromone Update Strategy.
In the multigroup ant colony search system, only the iteration-best ant can update the global pheromone; the goal is to make the search process more targeted and to guide ants to search near the best solution of the current iteration. So when IMGACA updates the global pheromone, it mainly adopts the ant-density system update strategy, whose updating formula uses a constant deposit value and a global update factor $\rho_{\mathrm{glo}}$ satisfying $0 < \rho_{\mathrm{glo}} < 1$.
When updating the local pheromone, in order to guide ants to search near solutions whose task completion time is shorter, IMGACA mainly adopts the ant-quantity system update strategy, whose principle is to release more pheromone on such search paths.
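The two update styles can be illustrated as follows. This is a sketch under explicit assumptions, since the paper's updating formulas are not reproduced here: the global update deposits a constant amount `q` on every edge of the iteration-best plan (ant-density style), and the local update deposits `q` divided by the task completion time (ant-quantity style, with completion time standing in for path length). The evaporation-style blend, the parameter `rho_loc`, and all function names are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Edge = Tuple[int, int]  # (previous task, next task)


def global_update(tau: Dict[Edge, float], best_edges: Iterable[Edge],
                  rho_glo: float = 0.1, q: float = 1.0) -> None:
    """Ant-density style: constant deposit q on each edge of the iteration-best plan."""
    for edge in best_edges:
        tau[edge] = (1.0 - rho_glo) * tau[edge] + rho_glo * q


def local_update(tau: Dict[Edge, float], edge: Edge, completion_time: float,
                 rho_loc: float = 0.1, q: float = 1.0) -> None:
    """Ant-quantity style: a shorter completion time yields a larger deposit."""
    tau[edge] = (1.0 - rho_loc) * tau[edge] + rho_loc * q / completion_time
```

Under this sketch, an edge whose task finishes in half the time receives twice the deposit, which matches the stated principle of releasing more pheromone on paths with shorter task completion time.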

Simulation Experiment
Table 1 lists the task information including location and value.
Table 2 lists the UCAV information including location, value, velocity, and value of ammunition loaded on UCAVs.

Feasibility Analysis.
Set the number of UCAVs as 6 and that of tasks as 18; besides, the matrices representing the kill probability of UCAVs to tasks and the damage probability of tasks to UCAVs are AT and DA; they are, respectively, set to be as follows.
We compare simulation scenarios with limited task load (the task load vector is TL = [2, 2, 2, 2, 2, 8]) and unlimited task load for all UCAVs; the iteration number is 600, and Table 3 shows the simulation results in the two cases.
It is observed that when the task load of UCAVs is limited, the task allocation solution is not an optimal one. Figures 3-5 give, respectively, comparison results of the objective function value, task completion time, and flight range of all UCAVs between the two cases. It can be seen from Figure 3 that when the task load of UCAVs is unlimited, the final objective function value is superior to that of the limited case; this is because the feasible solution space of the former case is larger, so a better feasible solution is more likely to be found, although the convergence rate is slower. From Figures 4 and 5, we can conclude that the task completion time of the case with unlimited task load is much shorter, while the flight range is longer. This is because, with limited task load, the task allocation is unbalanced: tasks are mainly executed by $U_6$, and the task completion time is relatively long.

Timeliness Analysis.
Select the general situation in which task load is limited (the task load vector is TL = [4, 5, 4, 3, 4, 3]), set the simulation time to 50 s, and compare IMGACA with GA by running each algorithm 10 times and averaging the results.
From Table 4, we can see that within a simulation time of 50 s, IMGACA iterates 230.2 times on average, while GA iterates 842.3 times; GA's algorithm design is simpler, so it completes more iterations per unit time. Because IMGACA adopts a better strategy to carry out state transition, while GA adopts a random generation method, the initial value of IMGACA is superior to that of GA. As for the final objective function value, IMGACA reaches 492.61 and GA reaches 426.28; so, under the same time constraint, IMGACA outperforms GA in both search ability and efficiency.

Stability Analysis.
According to Theorem 5, the mathematical complexity of the model is high, so it is difficult for IMGACA and GA to converge to the global optimal solution. To demonstrate the stability of IMGACA, we design two cases: (i) at small problem scale, IMGACA converges rapidly to a fixed value (very likely the global optimal solution); (ii) at large problem scale, the variance of the final objective function values of IMGACA is small. Figure 6 shows the convergence process over 10 runs at small problem scale. From Figure 6, it can be seen that at small problem scale, for example, 3 UCAVs and 8 tasks (select $U_1$-$U_3$ and $T_1$-$T_8$ in Tables 1 and 2), IMGACA always converges to 188.74, and we can conclude that IMGACA has excellent convergence ability.
At large problem scale, we use the simulation scenario in Section 4.3 (6 UCAVs, 18 tasks). After calculation, the variance of the 10 final objective function values of IMGACA is 17.56, and that of GA is 135.40. Hence, compared to GA, IMGACA exhibits more stable convergence.

Conclusions
Aiming at the UCAV cooperative task allocation problem under the MCAV's limited control, we establish a task allocation model that comprehensively considers attack benefit, attack cost, and task completion time, and we propose a novel algorithm called IMGACA to solve the model. The simulations show that IMGACA performs well in feasibility, timeliness, and stability.
Future work will study the dynamic adjustment of the task allocation plan in the presence of parameter and event uncertainties.
By formulas (3), (4), and (5), in the MCAV/UCAV cooperative task allocation problem, constraints mainly include the load constraint of UCAVs and the number constraint of tasks. The candidate task set must avoid infeasible solutions produced by the blindness or randomness of the search algorithm, which would degrade the performance of the algorithm.

Input: $\mathrm{TA}(\mathrm{ant}_{-i,k})$, $\mathrm{allow}T(\mathrm{ant}_{ik}) = \emptyset$, $T$, $TL_i$
Output: $\mathrm{allow}T(\mathrm{ant}_{ik})$
procedure Select $\mathrm{ant}_{ik}$ and task set $T$
    if $|\mathrm{TA}(\mathrm{ant}_{ik})| < TL_i$ then
        if $T_j \notin \mathrm{TA}(\mathrm{ant}_{-i,k})$ then
            $\mathrm{allow}T(\mathrm{ant}_{ik}) = \mathrm{allow}T(\mathrm{ant}_{ik}) \cup \{T_j\}$, $\forall j \in [1, N_T]$
        end if
    end if
end procedure
Algorithm 1: CTSGA for $\mathrm{ant}_{ik}$ based on constraint condition.

Here $|\cdot|$ denotes the cardinality of a set, and $\mathrm{TA}(\mathrm{ant}_{ik})$ and $\mathrm{TA}(\mathrm{ant}_{-i,k})$ denote the sets of tasks that have been executed by $\mathrm{ant}_{ik}$ and by the other ants in $SC_k$ except $\mathrm{ant}_{ik}$, respectively.
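Algorithm 1 translates almost directly into code. The sketch below uses hypothetical names (`candidate_tasks`, `own_tasks`, `others_tasks`) and Python sets for the task sets; as a labeled assumption beyond the algorithm as printed, it also removes the tasks the ant itself has already executed, which Algorithm 1 leaves implicit.

```python
from typing import Iterable, Set


def candidate_tasks(own_tasks: Set[int], others_tasks: Set[int],
                    all_tasks: Iterable[int], load_limit: int) -> Set[int]:
    """Build the candidate task set for one ant (CTSGA-style filtering).

    own_tasks:    tasks already executed by this ant
    others_tasks: tasks already executed by the other ants of its subcluster
    load_limit:   task load of the UCAV this ant represents
    """
    # Load constraint: an ant whose UCAV is full has no candidates.
    if len(own_tasks) >= load_limit:
        return set()
    # Number constraint: skip tasks taken by other ants of the subcluster.
    # Assumption: tasks this ant already executed are skipped as well.
    return {t for t in all_tasks if t not in others_tasks and t not in own_tasks}
```

For an ant that holds task 0 with a task load of 2, while its subcluster peers hold tasks 1 and 2 out of five tasks, the candidate set is `{3, 4}`; once the ant reaches its load limit, the set is empty.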

Figure 2: Ant colony based on the job-division mechanism.

Figure 3: Comparison figure on objective function value.

Figure 4: Comparison figure on task completion time.

Figure 5: Comparison figure on flight range of all UCAVs.

Figure 6: Convergence process in small problem scale.
Crossover Operator-Based Local Search Strategy.
ACA has strong global search ability, but its search efficiency is low; therefore, a local optimizer is introduced to raise its solution precision and convergence rate.

Table 4: Comparison result between IMGACA and GA in 50 s.