UAV Task Allocation Based on Clone Selection Algorithm

With the continuous development of computer and network technology, large-scale, clustered operation of drones has gradually become a reality. Allocating the combat tasks of a UAV cluster reasonably and achieving intelligent optimization control of the cluster is one of the most challenging difficulties in UAV cluster combat. Finding the optimal solution to the task allocation problem has been proven to be NP-hard. This paper proposes an approach based on the clone selection algorithm (CSA) to simultaneously optimize four objectives in multi-UAV task allocation, i.e., maximizing the number of successfully allocated tasks, maximizing the benefits of executing tasks, minimizing resource costs, and minimizing time costs. Experimental results show that, compared with the genetic algorithm, the proposed method performs better on the UAV task allocation problem with multiple objectives.


Introduction
With the rapid development of Internet of Things and 5G communication technologies, UAV systems are increasingly used in the military field and UAV operations have become an important part of modern military operations. Single-UAV combat often lacks support, guarantee, and target cover, and it usually requires an overall force to complete combat missions. Task allocation is one of the most important problems that need to be solved in multi-UAV operations, and it directly affects the efficiency and profitability of operations. Finding the optimal solution of the task allocation problem has been proven to be NP-hard, and the solving difficulty increases exponentially with the scale of UAV cluster and tasks. Furthermore, task allocation is often a multiobjective optimization problem, which makes the model more complicated, e.g., simultaneously maximizing the number of tasks to be performed, maximizing the benefits of performing tasks, minimizing time costs, and minimizing resource consumption.
Many methods for solving the task allocation problem have been proposed, which can be roughly divided into four categories: graph theory [1], integer linear programming [2], state space search [3], and Artificial Intelligence (AI) methods such as genetic algorithm, particle swarm algorithm, simulated annealing, and ant colony algorithm [4][5][6][7][8]. Most methods in the first three categories are complete search algorithms. These algorithms can get the optimal solution, but they require a lot of computing resources and time cost, and it is impractical to apply them to a large-scale problem. AI methods cannot guarantee an optimal solution, but they usually can obtain a local-optimal solution [9] within a reasonable period of time. The above algorithms often optimize a certain goal, such as task revenue [10] or time/resource cost [11].
Artificial immune system (AIS) is an emerging research direction of computational intelligence. The clone selection algorithm (CSA) was proposed based on related immune principles [12][13][14]. This algorithm is widely used in function optimization [15] (e.g., multimodal optimization and continuous function optimization), pattern recognition [16,17] (e.g., binary character and face recognition), and scheduling problems [18]. Compared with complete search algorithms, CSA is easier to put into practice and to engineer, and it can also be used to solve multiobjective problems. The main difference between CSA and the GA [19][20][21] is the way the population evolves. In the GA, the population evolves through crossover and mutation; in the CSA, cell reproduction is asexual, with each offspring produced by a cell being an exact copy of its parent, and mutation and selection are then applied to these offspring. This paper proposes to use CSA to optimize four objectives in UAV task allocation, i.e., maximizing the number of successfully assigned tasks, maximizing the benefits of executing tasks, minimizing resource cost, and minimizing time cost, and comprehensively considers the time constraints, resource constraints, and functional constraints in real-world scenarios. Experimental results show that, compared with the brute-force search algorithm and the genetic algorithm, the proposed method achieves better performance on high-dimensional multiobjective task allocation problems.
The remaining sections of this paper are organized as follows: Section 2 describes the task allocation problem; Section 3 presents the details of the proposed method; Section 4 presents the experimental results and our analysis; Section 5 summarizes the proposed work.

Problem Description
Task allocation involves many objects and related attributes. This section will give a formal representation of the elements, goals, and constraints involved in task allocation [22,23] to facilitate later expression.

Task Allocation
(1) UAV: as the core of military operations, the set of UAVs is expressed as $U = \{u_1, u_2, \dots, u_m\}$, where $u_i$ represents UAV $i$ ($i = 1, \dots, m$) and $m$ is the number of UAVs. The initial positions of the UAVs are expressed as $Pu = \{pu_1, pu_2, \dots, pu_m\}$, where $pu_i$ represents the initial position of UAV $i$. The ammunition and fuel carried by each UAV are expressed as $ResU = \{resu_1, \dots, resu_m\}$, where $resu_i$ represents the amount of resources carried by UAV $i$.
(2) Task: as a combat task and an indivisible unit, the set of tasks is expressed as $T = \{t_1, t_2, \dots, t_n\}$, where $t_j$ represents task $j$ ($j = 1, \dots, n$) and $n$ is the number of tasks. The initial positions of the tasks are expressed as $Pt = \{pt_1, pt_2, \dots, pt_n\}$, where $pt_j$ represents the initial position of task $j$. The execution of each task consumes certain resources, expressed as $ResT = \{rest_1, \dots, rest_n\}$, where $rest_j$ represents the resources consumed by task $j$.
Performing each task yields a different task revenue, expressed as $Reward = \{reward_1, \dots, reward_n\}$, where $reward_j$ represents the revenue of executing task $j$. The validity period of each task is limited in time. Each task has an earliest start time $early_j$ and a latest start time $late_j$; the executable ranges of the tasks are expressed as $TR = \{[early_1, late_1], \dots, [early_n, late_n]\}$, where $[early_j, late_j]$ is the executable range of task $j$. Different tasks take different amounts of time to execute; the time consumed is expressed as $Time = \{time_1, \dots, time_n\}$, where $time_j$ represents the time consumed to execute task $j$.
(3) Execution sequence: a task is executed by only one drone. The execution sequences of the UAVs are represented as $Su = \{su_1, \dots, su_m\}$, where $su_i$ represents the execution sequence of UAV $i$. Each $su_i$ is composed of the corresponding tasks, for example $su_i = \{t_x, t_y, t_q, t_z, t_d\}$, where $0 < x, y, q, z, d \le n$. $|su_i|$ is the number of tasks executed by UAV $i$; in this example they are task $x$, task $y$, task $q$, task $z$, and task $d$. $su_{ij}$ is the $j$th task in the execution sequence of UAV $i$. According to the position of each task in the execution sequence, the tasks can be expressed as $t_x = su_{i1}$, $t_y = su_{i2}$, $t_q = su_{i3}$, $t_z = su_{i4}$, and $t_d = su_{i5}$.
(4) Task allocation: UAVs and tasks are the two subjects of allocation. Since a task can only be performed by one UAV, the allocation relationship can be expressed as $A = \{a_1, \dots, a_n\}$, where $a_j$ represents the assignment of task $j$. If task $j$ is not assigned to a drone, then $a_j = \mathrm{NULL}$; if it is assigned, then $a_j \neq \mathrm{NULL}$. The UAV it is assigned to can then be read off directly: $a_j = u_i$ means that task $j$ is assigned to UAV $i$ for execution.
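The relationship between the execution sequences $Su$ and the allocation $A$ can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `assignment_from_sequences`, the use of 0-based indices, and `None` standing in for NULL are choices made here, not part of the paper.

```python
def assignment_from_sequences(sequences, n_tasks):
    """Derive the allocation A = {a_1, ..., a_n} from the sequences Su.

    sequences[i] is the ordered list of task indices executed by UAV i.
    The result holds, for each task j, the index of its UAV, or None
    (standing in for NULL) when the task is unassigned.
    """
    assignment = [None] * n_tasks
    for uav, sequence in enumerate(sequences):
        for task in sequence:
            if assignment[task] is not None:
                # A task is executed by only one drone.
                raise ValueError(f"task {task} is assigned to more than one UAV")
            assignment[task] = uav
    return assignment


if __name__ == "__main__":
    # Two UAVs, four tasks; task 3 stays unassigned.
    print(assignment_from_sequences([[0, 2], [1]], 4))
```

Keeping $Su$ as the primary structure and deriving $A$ from it means the one-UAV-per-task rule is checked in a single place.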

Optimization Objective.
Based on the basic elements described above, we now define the objectives to be optimized in the UAV task allocation problem.
(1) Maximize the number of successfully assigned tasks: where the test on $a_j \neq \mathrm{NULL}$ returns 1 if the condition is true and 0 otherwise.
(2) Maximize the benefits of performing tasks:
(3) Minimize resource cost: where $\mathrm{arrive\_cost}(pu_i, su_{i1})$ represents the resource cost required for UAV $i$ to travel from its initial position to the first task in its task sequence.
Wireless Communications and Mobile Computing
(4) Minimize time consumption: where $\mathrm{time\_cost}(su_i, |su_i|)$ represents the time it takes UAV $u_i$ to execute the tasks in the sequence $su_i$.
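The formulas for the four objectives are not legible in this text. One plausible formalization, consistent with the surrounding definitions, is sketched below. It should be read as an assumption rather than as the authors' equations: the indicator notation $\mathbb{1}[\cdot]$, the summation of $\mathrm{arrive\_cost}$ over consecutive tasks in $f_3$, and the use of a sum (rather than, say, a maximum) over UAVs in $f_4$ are all choices made here.

```latex
\begin{align}
\max\ f_1 &= \sum_{j=1}^{n} \mathbb{1}[a_j \neq \mathrm{NULL}] \\
\max\ f_2 &= \sum_{j=1}^{n} \mathbb{1}[a_j \neq \mathrm{NULL}] \cdot reward_j \\
\min\ f_3 &= \sum_{i=1}^{m} \Big( \mathrm{arrive\_cost}(pu_i, su_{i1})
             + \sum_{k=2}^{|su_i|} \mathrm{arrive\_cost}(su_{i,k-1}, su_{ik})
             + \sum_{k=1}^{|su_i|} rest_{su_{ik}} \Big) \\
\min\ f_4 &= \sum_{i=1}^{m} \mathrm{time\_cost}(su_i, |su_i|)
\end{align}
```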

Constraints.
In real-world scenarios, task allocation problems are usually subject to constraints. In this paper, we mainly consider the following common constraints [16]:
(1) Time constraint: a UAV can only successfully execute a task when it starts within the executable time range of the task. For the execution sequence $su_i$ of UAV $u_i$, the constraint is as follows:
(2) Resource constraint: the tasks a UAV can execute are limited by its own resources. For the execution sequence $su_i$ of UAV $u_i$, the resource constraint is as follows:
(3) Functional constraints: in real scenarios, different types of UAVs have different functions and therefore perform different tasks. This constraint is explained from the perspective of both drones and tasks:
(i) From the perspective of UAVs, for $i = 1, \dots, m$, the constraint indicates that UAV $i$ can only execute task $x$, task $y$, task $q$, task $z$, and task $d$ due to functional limitations, where $0 < x, y, q, z, d \le n$.
(ii) From the perspective of tasks, for $j = 1, \dots, n$, the constraint indicates that task $j$ can only be executed by UAV $e$, UAV $f$, etc. due to functional limitations, where $0 < e, f \le m$.
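The three constraints can be checked together by walking a UAV's execution sequence. The following is a minimal sketch, not the paper's formulation: the Euclidean travel model, the `speed` and `cost_per_unit` parameters, the dictionary layout of `tasks`, and the rule that a UAV arriving early waits until $early_j$ are all assumptions made for illustration.

```python
import math


def sequence_is_feasible(uav_pos, uav_resources, allowed_tasks, sequence, tasks,
                         speed=1.0, cost_per_unit=1.0):
    """Check the time, resource, and functional constraints for one UAV.

    tasks maps a task id to a dict with keys "pos", "early", "late",
    "duration", and "rest" (resources consumed by the task).
    """
    clock = 0.0          # current time of the UAV
    used = 0.0           # resources consumed so far
    here = uav_pos       # current position of the UAV
    for task_id in sequence:
        # Functional constraint: the UAV must be able to perform the task.
        if task_id not in allowed_tasks:
            return False
        task = tasks[task_id]
        distance = math.hypot(task["pos"][0] - here[0], task["pos"][1] - here[1])
        # Time constraint: start within [early, late]; waiting is assumed allowed.
        start = max(clock + distance / speed, task["early"])
        if start > task["late"]:
            return False
        # Resource constraint: travel plus execution must fit the UAV's resources.
        used += distance * cost_per_unit + task["rest"]
        if used > uav_resources:
            return False
        clock = start + task["duration"]
        here = task["pos"]
    return True
```

A predicate of this shape is convenient because the same function can be called incrementally, one candidate task at a time, while a sequence is being built.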

The Proposed Method
The CSA is a kind of artificial immune system algorithm. It mainly contains ideas such as clone selection, receptor editing, and the antibody circulation supplement mechanism; it selects mature antibody cells through affinity and uses a limited gene library to identify endlessly changing antigens. The CSA simulates immune mechanisms such as the clone selection and amplification of antibodies, high-frequency mutation, and receptor editing during the immune response, which gives it strong self-learning, self-organization, and adaptive capabilities. It is widely used in optimization-related fields. In this paper, the allocation plan of the UAVs is encoded into antibody cells, and the final Pareto solution set is obtained through continuous iteration of the antibodies [24,25] (described in detail in Section 3.5).

Basic Framework.
The basic flow chart of the algorithm is shown in Figure 1.
It can be seen that the CSA model is relatively simple and convenient to implement. The main steps are described as follows:
(1) First, randomly initialize the UAV task allocation solutions, organize the execution sequence of each UAV, evaluate the affinity function of each antibody, and set the number of antibodies to $N$.
(2) Check the number of iterations: when it reaches the maximum number maxGen, the algorithm ends and outputs the allocation plan. Otherwise, the antibodies go through the cloning, mutation, and selection operations described below, and steps (2)-(5) are repeated.
This is a simple description of the algorithm's procedure, and each step is described in detail below.

Encoding of Antibody.
An important part of using evolutionary algorithms is to encode the real-world problem into antibodies, which are feasible solutions to the allocation problem. For the CSA, the concept of the antibody is very important, and the encoding of the antibody also determines the actual optimization effect of the algorithm. These solutions are composed of the execution sequences of the UAVs. These components are called "genes," and their practical meaning is the actual execution sequence allocated to each UAV. The encoding rule is described below [22]. For example, Table 1 shows an antibody encoding for 3 drones and 10 tasks.
The following information can be obtained from the encoding example in Table 1. From the perspective of the UAVs, the functional constraints are $\mathrm{function\_constraint}_{U_1} = \{1, 7, 2, 4\}$, $\mathrm{function\_constraint}_{U_2} = \{3, 6, 4, 5\}$, and $\mathrm{function\_constraint}_{U_3} = \{8, 9, 2, 10, 1\}$. From the perspective of the tasks, $\mathrm{function\_constraint}_{T_2} = \{1, 3\}$ and $\mathrm{function\_constraint}_{T_3} = \{2\}$. This encoding is built directly on the functional constraints of the drones, which avoids many useless calculations and checks and improves efficiency. It also gives $su_1 \subseteq \{1, 7, 2, 4\}$, $su_2 \subseteq \{3, 6, 4, 5\}$, and $su_3 \subseteq \{8, 9, 2, 10, 1\}$; that is, the execution sequence finally allocated to a UAV must be a subset of that UAV's functional constraints. This encoding also has a shortcoming. Take $t_2$: because it appears in the functional constraints of both $u_1$ and $u_3$ (in simple terms, there is a many-to-many relationship between UAVs and tasks), it is difficult for the algorithm to choose which UAV would execute it better. The encoding handles this simply: the UAV with the lower index gets priority. That is, if $u_1$ satisfies the resource and time constraints, it executes $t_2$, and the $t_2$ entry in the functional constraint of $u_3$ becomes invalid; if $u_1$ does not satisfy the constraints, $u_3$ is then checked against the conditions for executing $t_2$.
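The lower-index-first rule for shared tasks can be illustrated with a short decoding routine. This is a sketch under assumptions: the function name `decode_antibody` and the `can_add` callback (a stand-in for the time and resource checks) are hypothetical, and the paper does not specify its decoding procedure at this level of detail.

```python
def decode_antibody(antibody, can_add):
    """Turn an antibody into execution sequences, lower-indexed UAVs first.

    antibody[i] is the gene list of UAV i: an ordering of the tasks in its
    functional constraint. can_add(i, sequence, task) reports whether UAV i
    can still execute `task` after the tasks already in `sequence`.
    """
    taken = set()        # tasks already claimed by a lower-indexed UAV
    sequences = []
    for uav, genes in enumerate(antibody):
        sequence = []
        for task in genes:
            if task in taken:
                continue                 # the entry became invalid for this UAV
            if can_add(uav, sequence, task):
                sequence.append(task)
                taken.add(task)
        sequences.append(sequence)
    return sequences


if __name__ == "__main__":
    # Functional constraints of the three UAVs from the example in the text.
    antibody = [[1, 7, 2, 4], [3, 6, 4, 5], [8, 9, 2, 10, 1]]
    # With no binding time or resource limits, UAV 1 keeps tasks 1, 2, and 4.
    print(decode_antibody(antibody, lambda uav, seq, task: True))
    # -> [[1, 7, 2, 4], [3, 6, 5], [8, 9, 10]]
```

If the first UAV cannot take task 2, the task is not marked as taken, so the third UAV is checked for it next, matching the fallback described above.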

Cloning, Mutation, and Selection.
The original CSA selects the antibodies to clone according to their affinity. Most antibodies with higher affinity are cloned for mutation in order to produce better individuals; a small number of antibodies with poor affinity are also cloned, to prevent the algorithm from falling into a local optimum and to improve the quality of the solution. Such a strategy is in line with the realistic model and has a certain optimization effect. However, because the problem here is a multiobjective optimization problem, every solution may evolve into a Pareto solution. Therefore, under this problem model, every antibody in the original population is cloned according to its affinity to produce the subsequent individuals.
The mutation operation is performed on the cloned individuals. The main operation of mutation is to randomly change part of the genes in the antibody. The algorithm mutates according to affinity. It is assumed that individuals with higher affinity are higher-quality solutions, so clones with higher affinity have a lower mutation rate and clones with lower affinity have a higher mutation rate. The reason is that individuals with high affinity are already well optimized: a small amount of variation preserves the quality of the solution while still exploring its neighborhood. Individuals with low affinity, by contrast, undergo more variation so that they explore a larger part of the solution space. In other words, individuals with higher affinity change fewer gene segments, while individuals with lower affinity change more. The detailed mutation parameters are given in the experimental part.
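Affinity-dependent mutation of clones can be sketched as follows. Several details are assumptions made for this illustration and are not taken from the paper: affinity is taken to be normalized to $[0, 1]$ with larger meaning better, a mutation is a swap of two genes within one UAV's gene list (which keeps every task inside its functional constraint), the number of clones is a fixed parameter, and the names `swaps_for_affinity` and `clone_and_mutate` are hypothetical.

```python
import random


def swaps_for_affinity(affinity, max_swaps):
    """Higher affinity -> fewer gene swaps; at least one swap per clone."""
    return max(1, round((1.0 - affinity) * max_swaps))


def clone_and_mutate(antibody, affinity, n_clones=3, max_swaps=4, rng=None):
    """Return mutated copies of `antibody`; the original is left untouched."""
    rng = rng or random.Random()
    n_swaps = swaps_for_affinity(affinity, max_swaps)
    clones = []
    for _ in range(n_clones):
        clone = [list(genes) for genes in antibody]      # asexual copy
        mutable = [genes for genes in clone if len(genes) >= 2]
        for _ in range(n_swaps):
            if not mutable:
                break
            genes = rng.choice(mutable)
            a, b = rng.sample(range(len(genes)), 2)
            genes[a], genes[b] = genes[b], genes[a]      # reorder two tasks
        clones.append(clone)
    return clones
```

Swapping positions changes the order in which a UAV considers its tasks, and therefore which tasks survive decoding, without ever producing an antibody that violates the functional constraints.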

Affinity Function.
The four objectives in Section 2 are normalized, and the corresponding constraints are incorporated into the calculation of the affinity function. When evaluating each individual $x$, the following affinity function can be used:
(1) The value of the first objective can be calculated by the following formula: where $x_i = (i_1, \dots, i_b)$ and $e_1(x_i)$ is calculated from a test that judges whether task $v_{i_w}$ can be added to the current execution sequence of $u_i$; $x_i^E$ (initially the empty set) is the current execution sequence of $a_i$, and $x_i^E + i_w$ means adding $i_w$ to $x_i^E$.
(2) The value of the second objective can be calculated by the following formula:
(3) The value of the third objective can be calculated by the following formula: where $\max_r$ represents the maximum amount of resources carried by a drone, $e_3(x_i) = \sum_{w=1}^{k} h(i_w) \times (\mathrm{travel\_cost}(p_{last}, p_{v_{i_w}}) + rescost_{i_w})$, and $p_{last}$ represents the location of the last task assigned to $u_i$ (when $w = 1$, $p_{last}$ represents the location of $u_i$).
(4) The value of the fourth objective can be calculated by the following formula: where $\max_t$ is the longest completion time and $e_4(x_i) = \sum_{w=1}^{k} h(i_w) \times (\mathrm{travel\_time}(p_{last}, p_{v_{i_w}}) + timecost_{i_w})$.
It is worth noting that not all tasks encoded in an antibody can be successfully assigned to drones, and the order in which drones perform tasks is implicit in the antibody. Through the above transformation, an individual can be passed to the four functions in formulas (10)-(15) for evaluation.
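The surviving expression for $e_3(x_i)$ can be turned into a short routine. This is a sketch under assumptions: $h(i_w)$ is read as a 0/1 test of whether the task can be added to the current sequence, travel cost is taken to be Euclidean distance, normalization by $\max_r$ is left out because its formula is not legible here, and the name `resource_term` and the `can_add` callback are hypothetical.

```python
import math


def resource_term(genes, start_pos, tasks, can_add):
    """Accumulate travel cost plus task resource cost over the accepted tasks.

    genes is the gene list (i_1, ..., i_k) of one UAV, start_pos its initial
    position, and tasks maps a task id to a dict with keys "pos" and "rest".
    can_add(sequence, task) plays the role of h(i_w).
    """
    total = 0.0
    p_last = start_pos       # for w = 1 this is the location of the UAV
    sequence = []
    for task_id in genes:
        if not can_add(sequence, task_id):
            continue         # h(i_w) = 0: the task contributes nothing
        pos = tasks[task_id]["pos"]
        travel = math.hypot(pos[0] - p_last[0], pos[1] - p_last[1])
        total += travel + tasks[task_id]["rest"]
        p_last = pos         # location of the last task assigned so far
        sequence.append(task_id)
    return total
```

The time term $e_4(x_i)$ has the same shape, with travel time and execution time in place of travel cost and resource cost.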

Multiobjective Optimal Solution Set.
In the multiobjective case, each individual has multiple attributes (each task is identified as an attribute in the coding here), so two individuals cannot be compared by a simple ordering of values. Therefore, this section briefly introduces the basis of multiobjective optimization. An individual $x$ is said to dominate an individual $y$ if the following two conditions hold:
(1) For all subgoals (the four objectives in this article), $x$ is no worse than $y$, that is, $f_k(x) \le f_k(y)$ ($k = 1, 2, \dots, r$).
(2) There is at least one subgoal on which $x$ is strictly better than $y$, that is, $\exists l \in \{1, 2, \dots, r\}$ such that $f_l(x) < f_l(y)$.
In this case, $x$ is called nondominated and $y$ is dominated; this is written $x \prec y$. If the conditions hold in neither direction, there is no dominance relationship between $x$ and $y$.
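The two dominance conditions translate directly into code. The sketch below assumes that every objective value is expressed so that smaller is better, as in the conditions above; the function name `dominates` is chosen here for illustration.

```python
def dominates(x, y):
    """Return True if objective vector x dominates y (all objectives minimized)."""
    if len(x) != len(y):
        raise ValueError("objective vectors must have the same length")
    no_worse = all(a <= b for a, b in zip(x, y))           # condition (1)
    strictly_better = any(a < b for a, b in zip(x, y))     # condition (2)
    return no_worse and strictly_better
```

For example, `dominates((0.2, 0.3, 0.1, 0.4), (0.2, 0.5, 0.1, 0.4))` holds, while two vectors that each win on a different objective do not dominate each other in either direction.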

Pareto Solution Set.
Using the dominance relationship above, all individuals in a population can be sorted; however, some solutions have no dominance relationship with each other, and such solutions occupy the same rank. These properties yield a solution set in which each individual $z$ satisfies:
(1) Some individuals in the population are dominated by it, expressed as $z \prec p$, $p \in A$.
(2) No other individual dominates it; that is, there is no dominance relationship between the two, expressed as $z \nprec q$, $q \nprec z$, $q \in B$.
In (1) and (2), $A$ and $B$ are sets of individuals in the population. At the end of the experiment, we select the Pareto-based multiobjective optimal solution set as the task allocation plan; the subobjectives compared between individuals are those of the affinity function.
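Extracting the nondominated set from a population can be sketched with a simple pairwise comparison. This is a minimal quadratic-time illustration, assuming all objectives are minimized; the name `pareto_front` is chosen here and the paper does not state which procedure it uses.

```python
def pareto_front(points):
    """Return the points that no other point in the list dominates."""

    def dominates(x, y):
        # x is no worse everywhere and strictly better somewhere.
        return (all(a <= b for a, b in zip(x, y))
                and any(a < b for a, b in zip(x, y)))

    front = []
    for i, candidate in enumerate(points):
        dominated = any(dominates(other, candidate)
                        for k, other in enumerate(points) if k != i)
        if not dominated:
            front.append(candidate)
    return front
```

Identical objective vectors do not dominate each other, so duplicates are all kept; a caller that wants a compact set can remove them afterwards.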

Experimental Results
This section will demonstrate the effectiveness of the method proposed in this article through experimental comparison and conclusion analysis.

Experimental Environment.
The experimental environment of this article is as follows: Windows 10 Professional 64-bit operating system, Intel i7-7600U CPU clocked at 2.80 GHz, 8 GB of memory, and Visual Studio 2010 as the programming environment.
To facilitate comparative experiments, the agent and task test data are randomly generated in the preparation phase. The initial position of each UAV is $pu_i = (x, y)$ ($i = 1, \dots, m$), where $x$ and $y$ are random integers in $[0, 100)$; the earliest possible execution time $early_j$ is a random number in $[0, 100)$, and the latest execution time $late_j$ is a random number in $[early_j, 150)$; the time required to execute a task, $time_j$, is a random number in $[1, 10)$; the resources consumed to execute a task, $rest_j$, is a random number in $[0, 100)$; and the revenue of a task, $reward_j$, is a random number in $[0, 1)$ [22]. For functional constraints, this article assumes that each task can be performed by one or more drones. After a certain number of iterations, the set of UAVs and tasks changes, and we update the antibodies accordingly using the change information. In order to facilitate the experiment, it is assumed that the time cost and resource cost of transmission  Table 2.
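The random test data described above can be generated along the following lines. This is a sketch under assumptions: the function name `generate_instance`, the dictionary layout, and the way functional constraints are drawn (each task gets a random nonempty subset of UAVs) are choices made here, and the task position range is assumed to match the UAV range because the text does not state it.

```python
import random


def generate_instance(m, n, seed=None):
    """Generate m UAV positions, n tasks, and per-UAV functional constraints."""
    rng = random.Random(seed)
    uav_positions = [(rng.randrange(100), rng.randrange(100)) for _ in range(m)]
    tasks = []
    for _ in range(n):
        early = rng.uniform(0, 100)
        tasks.append({
            # Assumed range: the text gives none for task positions.
            "pos": (rng.randrange(100), rng.randrange(100)),
            "early": early,
            "late": rng.uniform(early, 150),
            "duration": rng.uniform(1, 10),
            "rest": rng.uniform(0, 100),
            "reward": rng.random(),
        })
    # capable[i] is the set of tasks UAV i can perform; every task is
    # given at least one capable UAV.
    capable = [set() for _ in range(m)]
    for task_id in range(n):
        for uav in rng.sample(range(m), rng.randint(1, m)):
            capable[uav].add(task_id)
    return uav_positions, tasks, capable
```

Passing a fixed `seed` makes the instance reproducible, which keeps a comparison between two algorithms on the same data straightforward. Note that `uniform` may, through floating-point rounding, return the upper endpoint, so the half-open intervals are only approximated.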

Comparison of Results of Different Algorithms.
In this paper, the CSA and the GA are run on the same experimental test data. The algorithm environment and parameters are as given in the previous subsection. Four sets of task allocation test data of different scales are used to compare the final Pareto sets. Figure 2 shows the distribution of the Pareto solution sets of the two algorithms for $(m = 5, n = 10)$, $(m = 20, n = 50)$, $(m = 50, n = 200)$, and $(m = 200, n = 1000)$, respectively. Table 3 compares the dominance of the Pareto solution sets.
It can be observed from Figure 2 that, in the comparative experiments of different scales, the CSA produces more nondominated solutions and its solutions are spread over a wider area, which means better diversity. From the analysis in Table 3, the solutions of the CSA are generally better than those of the GA, and this trend becomes more obvious as the scale increases. The main reason is that the mutation strategy of the CSA is better than that of the GA: the CSA varies the amount of mutation dynamically with the affinity of each individual, whereas the GA only performs a small amount of mutation on top of crossover, and its variation range is often small, which limits exploration. Dynamically changing the variation range helps improve the efficiency of the search. The experimental results are consistent with this explanation.

Results on Different Ratios of Tasks to UAVs.
In this section, the results of the CSA on different ratios of tasks to UAVs are presented, i.e., $n = 100$, 150, 200, and 250 with $m = 50$. Since there are many Pareto solutions and it is impossible to compare all of them, only one solution is kept from each group of similar solutions (two solutions are treated as similar if they differ by no more than 0.05 in each of $f_1$, $f_2$, $f_3$, $f_4$). Table 4 compares the experimental results of different ratios.
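The thinning of similar Pareto solutions can be sketched as a greedy filter. This is an illustration under assumptions: the name `thin_similar` is hypothetical, and keeping the first solution encountered in each group is a choice made here, since the text does not say which representative is retained.

```python
def thin_similar(solutions, tol=0.05):
    """Keep one representative from each group of similar objective vectors.

    Two vectors count as similar when they differ by at most `tol`
    in every objective.
    """
    kept = []
    for candidate in solutions:
        is_similar = any(
            all(abs(a - b) <= tol for a, b in zip(candidate, representative))
            for representative in kept
        )
        if not is_similar:
            kept.append(candidate)
    return kept
```

Because the filter is greedy, the result depends on the input order; sorting the solutions first gives a deterministic choice of representatives.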
According to Section 3.4, the smaller the target values of $f_1$, $f_2$, $f_3$, $f_4$, the better the optimization effect. Table 4 shows that, in the comparative experiments of different ratios, when the number of UAVs stays the same and the task scale increases rapidly, the problem becomes harder to solve. This leads to fewer nondominated solutions and lower quality (the values of $f_1$, $f_2$, $f_3$, $f_4$ continue to increase), which is in line with objective facts. Consequently, in order to maximize the number of tasks successfully assigned, maximize the benefits of task execution, minimize the resource costs, and minimize the time costs, the ratio of drones to tasks should be kept as balanced as possible: when many constraints must be considered, the number of drones should be increased as far as possible to optimize the allocation plan.

Conclusion
To solve the problem of UAV combat task allocation, this paper proposes a clone selection algorithm based on the artificial immune system. The algorithm simultaneously optimizes four objectives, namely, maximizing the number of tasks successfully assigned, maximizing the benefits of task execution, minimizing the resource costs, and minimizing the time costs. The effectiveness of the method is demonstrated by experimental comparison. In future work, the comparison algorithm will be improved mainly through the high-dimensional multiobjective optimization strategies of the genetic algorithm (for example, the niche strategy and hyperplane-based reference points), and gene recombination will be added to the clone selection algorithm to further improve the quality of solutions.

Data Availability
All the data can be generated according to the steps described in our paper, and readers can also ask for the data by contacting cxj_dna@yeah.net.

Conflicts of Interest
The authors declare that they have no conflicts of interest.