A Hybrid Discrete Grey Wolf Optimizer to Solve Weapon Target Assignment Problems

We propose a hybrid discrete greywolf optimizer (HDGWO) in this paper to solve theweapon target assignment (WTA)problem, a kind of nonlinear integer programming problems. Tomake the original grey wolf optimizer (GWO), which was only developed for problems with a continuous solution space, available in the context, we first modify it by adopting a decimal integer encoding method to represent solutions (wolves) and presenting a modular position update method to update solutions in the discrete solution space. By this means, we acquire a discrete grey wolf optimizer (DGWO) and then through combining it with a local search algorithm (LSA), we obtain the HDGWO. Moreover, we also introduce specific domain knowledge into both the encoding method and the local search algorithm to compress the feasible solution space. Finally, we examine the feasibility of the HDGWO and the scalability of the HDGWO, respectively, by adopting it to solve a benchmark case and ten large-scale WTA problems. All of the running results are compared with those of a discrete particle swarm optimization (DPSO), a genetic algorithm with greedy eugenics (GAWGE), and an adaptive immune genetic algorithm (AIGA).Thedetailed analysis proves the feasibility of theHDGWO in solving the benchmark case and demonstrates its scalability in solving large-scale WTA problems.


Introduction
As a classic military operational problem, the purpose of weapon target assignment (WTA) is to find an optimal or a satisfactory assignment solution, which determines the target attacked by each weapon, so as to maximize the total damage expectancy of hostile targets or minimize the loss expectancy of own-force assets.Since first proposed in the 1950s to support operation planning, command officers training, and weapon selection and acquisition [1], the WTA problem has attracted the attention of relevant researchers for decades.From the mathematical point of view, it is essentially an NPcomplete problem [2] with nonlinear objective functions, discrete decision variables and multiple constraints.These intricate features indicate that seeking out the optimal solution for small-scale WTA problems is relatively realistic, but becomes impracticable for large-scale ones.This means in such circumstances searching for a satisfactory or suboptimal solution may be more efficient.
In the WTA study, the process of solving WTA problems can generally be divided into two phases, establishing a WTA model, and finding an optimal or a suboptimal solution for the model through an appropriate algorithm.In modeling, a variety of factors need to be considered, such as the type and quantity of equipment involved, the combat capability of each equipment, the characteristics of the battlefield, and the focus of the commander.In addition, for the convenience of modeling, certain assumptions may be made.Conventional WTA models include up to three types of entities, platforms, weapons, and targets.However, considering that weapons are increasingly dependent on the information provided by sensors in modern warfare, Bogdanowicz et al. [3,4] believed that sensors should be considered in WTA models and established the sensor-WTA (S-WTA) model for the first time.They simplified the S-WTA problem through the sensor/target/weapon decomposition to the following scenario, each sensor-weapon pair can be assigned to at most one target, and each target can be engaged by at most one sensor-weapon pair.Furthermore, through the sensor/weapon/target augmentation, they translated the derived problem into a symmetric optimization problem with an input consisting of the same numbers of sensors, weapons, and targets, along with the benefit matrix, and presented the Swt-opt algorithm derived from the auction algorithm to optimally assign sensors and weapons to targets.Since then, the S-WTA problem has been successively studied by some scholars.Li et al. [5,6] proposed an improved Swtopt algorithm to solve the same problem by combining the consensus algorithm, so that the shortcoming of the Swtopt algorithm, i.e., highly depending on perfect network topologies, can be overcome.Later, they put forward a decentralized cooperative auction algorithm for a similar S-WTA problem where the number of targets that can be struck varied in different operational phases [7].Chen et al. [8] proposed a particle swarm optimization based on genetic operators to solve the S-WTA problem where each sensor can guide only one weapon once and each target can be engaged by multiple weapons once.Mu et al. [9] proposed a multiscale quantum harmonic oscillator algorithm to solve the S-WTA problem in intelligent minefields considering the probability of detection and killing.Xin et al. [10] constructed an efficient marginal-return-based heuristic to solve the S-WTA problem considering the situation where multiple sensors/weapons can be assigned to one target, but each sensor can detect only one target at the same time and each weapon can shoot only one target simultaneously.The proposed heuristic exploited the marginal return of each sensor-weapon-target triplet and dynamically updated the threat value of all targets.It relied only on simple lookup operations to choose each assignment triplet, thus resulting in very low computational complexity.
Both the conventional WTA model and the S-WTA model can be classified as follows.For ease of presentation, the two models will not be distinguished, both called the WTA model.A WTA model is called to be dynamic [10][11][12], if the operational process is time related.This indicates there are always several operational stages or new situations usually arise during operation.Otherwise, the model is static [3-6, 8, 9, 13-16].If two or more objectives, such as maximizing the damage expectancy of hostile targets and minimizing the consumption of own ammunition and mission completion time, are considered, the WTA model is multiobjective [17,18].If there is only one objective, it is single-objective [3][4][5][6][7][8][9][10][11]19].Depending on the type of combat scenario, the WTA model may be asset-based [11] or target-based [17].In defensive missions, the asset-based model is often established to minimize the loss expectancy of own-force assets, while in offensive missions, the target-based model is always adopted to maximize the total damage expectancy of hostile targets.In addition to modeling, another research aspect for WTA is to develop algorithms to solve the problem optimally in the least possible time.These algorithms can be generally divided into two categories: exact algorithms and heuristic algorithms.Exact algorithms are developed according to the specific mathematical properties of WTA problems and can gradually eliminate the nonlinearity of the problem through transformation, decomposition, and other processing means [16].By this way, the model is translated to a linear one, and then classical operations research methods can apply, such as the dynamic programming method [12], the branch and bound method [20,21], the branch and price method [22], the mixed integer linear program (MILP) algorithm [17], and the Lagrange relaxation method [14].These methods have demonstrated the feasibility and effectiveness on solving static and dynamic WTA problems but may become difficult to apply when a large number of weapons and targets are involved [23].
With the rapid development of heuristic algorithms, more complex WTA problems are able to be well solved.Most research to date on solving WTA problems by heuristic algorithms either constructs some specific search rules based on the properties of the problem to achieve solutions rapidly or introduces some local search mechanisms into the original algorithms to improve the solution quality.These algorithms, including auction algorithms [3][4][5][6]15], improved genetic algorithms [18,[24][25][26], clonal selection algorithms [27,28], particle swarm algorithms [8,13,29], tabu search algorithms [30], rule-based constructive heuristic algorithms [10,31], and other intelligent optimization algorithms, have shown evident advantages over traditional methods in terms of computation time and solution accuracy and however still suffer from some drawbacks, such as easily falling into premature convergence and local optimum [32].Furthermore, considering the variability of the battlefield environment, decisions always need to be made immediately; that is, WTA problems need to be resolved in a very short time.Therefore, we propose in this paper a hybrid discrete grey wolf optimizer (HDGWO), which is an integration of a discrete grey wolf optimizer (DGWO) with a local search algorithm (LSA).Moreover, in order to enhance the efficiency of the algorithm, we introduce the specific domain knowledge into the encode methods and the LSA.
The rest of this paper is organized as follows.The mathematical model of a typical WTA problem for an offensive mission is formulated in Section 2. Section 3 is a brief introduction to the original grey wolf optimizer (GWO) presented in [33], and Section 4 is a detailed introduction to the proposed HDGWO.In Section 5, we first analyze the feasibility of HDGWO in solving small-scale WTA problems by comparing it with the discrete particle swarm optimization (DPSO) [13], the genetic algorithm with greedy eugenics (GAWGE) [24], and the adaptive immune genetic algorithm (AIGA) [32] and then investigate the scalability of HDGWO by solving ten different scale WTA problems.Finally, we make a conclusion and look ahead to the future research in Section 6.

Problem Formulation
In general, a typical WTA problem for an offensive mission can be formulated as the following nonlinear integer programming model [13,32]:

𝑎:
The control parameter determining which rule will be adopted to update solutions;  max : The maximum iteration number; : The elitist retention proportion; Δ  : The perturbation value, used to modify the value of decision variables in local search; : The selection probability, determining whether a solution is to be selected for local search; subject to The symbols in the model are explained in Table 1.For clarity, we also list the symbols employed in the following in this table.
In the above model, the objective is to maximize the damage expectancy of all targets.Thus, we construct the above objective function (1), which consists of two parts connected by a plus sign.The part before the plus sign represents the total expectancy of the eliminated value from all targets, and the latter part indicates the penalty for the solution that violates the constraint set (3).If a solution violates the constraint set (3), then the value of the latter part will become negative, which reduces the objective value.Thus, the solution will be very likely to be discarded in the next iteration.The constraint set (2) limits the number of missiles launched by each weapon platform to no more than the available number.The constraint set (4) means that decision variables are nonnegative integers.The constraint set (5) reflects the feasibility of engagement between weapon platforms and targets.When   is unable to attack   , i.e.,   = 0, it should not launch any missiles to   , i.e.,   = 0.

Grey Wolf Optimizer (GWO)
The grey wolf optimizer (GWO) is a swarm intelligence algorithm first proposed in 2014 by Seyedali Mirjalili et al. and mimics the leadership hierarchy and hunting mechanism of grey wolves in nature [33].There are four types of wolves from top to bottom, denoted by , , , and , respectively, in the leadership hierarchy.The former three types of wolves, , , and , guide the hunting process, which is followed by the  wolves.The GWO algorithm employs two operators to achieve optimization: (1) encircling prey and (2) hunting, as introduced below [33].
3.1.Encircling Prey.Grey wolves will first encircle prey during the hunting process.The encircling behavior is modeled by formulas ( 6)- (7) as follows: The position update process in GWO [33]. where where elements of  →  are linearly decreased from 2 to 0 throughout the iteration process and  →  1 and  →  2 are random vectors in [0, 1].

Hunting.
In the GWO algorithm, the , , and  wolves are supposed to have more knowledge about the potential location of prey.Therefore, the first three best solutions obtained so far (i.e., the ,  and  wolves in the current iteration) are saved and employed to assist the other wolves in updating their positions.The process is mathematically described by formulas ( 10)-( 12) as follows:  is the position vector of a grey wolf.The position update process according to , , and  in a 2D search space is shown in Figure 1.The , , and  wolves first estimate the position of the prey, and then other wolves update their positions randomly around the prey [33].
From formulas ( 6)-( 12), we find that vectors  →  and  →  are two parameters to control the exploration and exploitation ability of the GWO algorithm.When || > 1, the algorithm devotes half of the iterations to explore the search space; when || < 1, the rest of the iterations are dedicated to exploitation [34].As can be seen in formula (9), elements of  →  are in the interval [0, 2], which provide random weights for prey in order to stochastically emphasize ( > 1) or deemphasize ( < 1) the effect of prey in determining the distance in formula ( 6).This mechanism gives the GWO algorithm a

Solution Weapon platforms
Targets The decimal integer encoding method for position vectors [32].
better random search ability throughout the optimization process, which is beneficial to exploration and local optimum avoidance [33].

Hybrid Discrete Grey Wolf Optimizer (HDGWO) for WTA
The two operators introduced above are originally developed for solving continuous optimization problems.Thus, we cannot directly employ the GWO algorithm to address WTA problems, because the search space is discrete and decision variables are nonnegative integers.And in the field of WTA research, the application of GWO has not yet been found.
For the above reasons, we first modify the original GWO to a discrete one by a decimal interencoding method and a novel position update method and then combine it with a local search algorithm (LSA).Thus, we propose a hybrid discrete grey wolf optimizer (HDGWO).The details of the HDGWO are described in the following subsections.

Decimal Integer Encoding Method.
The encoding method is an important procedure before applying the proposed HDGWO to a particular problem.It constructs a bridge between solution space of the considered problem and the search space of the search process.Therefore, designing an appropriate encoding method is a very essential issue that affects the performance of the algorithm [35].
Considering that all decision variables are decimal nonnegative integers (see constraint (4) in Section 2), we adopt the decimal integer encoding method in [32] to represent the position vector of a grey wolf (in the following sections we will use the term "solution" instead of "position vector").For the ith weapon platform   , we use a vector containing N elements to represent its assignment scheme, where the jth element corresponds to the number of missiles launched by   to   .So for M weapon platforms, we have M similar vectors.Then, we connect these vectors head to tail in the order of the assignment scheme of  1 to the assignment scheme of   and obtain a solution, as shown in Figure 2 [32].
During the construction of an initial solution, the constraint sets (2), (4), and (5) will be handled, so the search space could be effectively compressed.We do not consider the constraint set (3) here, because on one hand, we have taken into account it when constructing the objective function (1); on the other hand the initialization method would become very complicated if the constraint set (3) should be satisfied.Although we cannot guarantee that all the solutions generated are feasible, the penalty function in the objective function (1) can help to eliminate the solution that violates the constraint set (3) throughout the iteration process.In addition, to improve the quality of initial solutions, we introduce the specific domain knowledge that available missiles should be first assigned to the target with the largest probability to be destroyed.The pseudocode of the initialization method is shown in Algorithm 1.

Fitness Function.
The fitness function is used to evaluate the goodness of a solution with respect to the original objective function.For the convenience of solution comparison, we set the fitness function as the same as the objective function introduced in Section 2. The fitness value of solution  is computed as follows: 4.3.Searching for Prey.On the basis of the encoding method, the initialization method, and the fitness function, we can evaluate the goodness of each solution (wolf) and seek out the first three best solutions (the ,  and  wolves).Then the next core procedure is to search for prey.In HDGWO, taking into account the constraints in Section 2 and the characteristics of the encoding method, we propose a novel position update method, namely, the modular position update method.In addition, we also adopt the elitist retention mechanism which is usually used in the genetic algorithm, and present a local search algorithm (LSA) to avoid local optimum.The details are successively introduced in the subsequent sections. .However, the specific position update mechanism is much different from that of the original GWO.Here we propose a modular position update method, which treats the assignment scheme of each weapon platform as a module.We first define the concept of a module as the assignment scheme of one weapon platform and then denote a solution in the tth iteration by

Modular Position
is the ith module of  →  () (i.e., the assignment scheme of Generate a random integer, denoted by n rand, between 1 and q max; Assign n rand weapons to  IN() ; Update q max= q max-n rand; endif endfor endfor endfor Algorithm 1: Pseudo-code of the initialization method.
).Based on this, the modular position update method is formulated as follows: where  →   ( + 1) is the ith module of  →  ( + 1) (i.e., the assignment scheme of   in the (t+1)th iteration),  → ) is a reassignment scheme of   ,  is a random number in (0, 1), and  is a control parameter.The mechanism of the modular position update method is illustrated in Figure 3, and the pseudocode of the method is described in Algorithm 2.
As can be seen from Figure 3  is generated through a reassignment method, which is similar to the initialization method in Section 4.1 and is also based on the constraint sets (2), (4), and (5).Unlike the initialization method, the reassignment method directly assigns a random number of missiles to each target in the order of  1 to   .
Whether  →   ( + 1) is generated through the roulette wheel selection method or the reassignment method, the constraint sets ( 2), (4), and ( 5) are always satisfied during the position update process.
In the proposed HDGWO, the control parameter  determines which rule will be adopted.Based on the search mechanism of the original GWO in the continuous solution space, we set  as follows [35]: where  and  max are the current iteration number and the maximum iteration number, respectively.It can be seen that as  increases,  linearly decreases from 1 to 0. Therefore, the modular position update method generates  →   ( + 1) ().Then in the second half, the modular position update method acquires a larger probability to generate  →   ( + 1) through the reassignment method, hence giving other solutions an increasing probability to diverge from the prey.In summary, throughout the process of searching for prey, the proposed HDGWO mainly focuses on exploitation in the first half of the iterations (when 0.5 ≤  ≤ 1), and the second half of the iterations is mainly devoted to exploring the search space (when 0 ≤  ≤ 0.5).

Elitist Retention Mechanism.
The elitist retention mechanism is employed to select a certain proportion, denoted by , of the best solutions (wolves) in the current iteration to be a part of the next iteration directly without their positions updated.By this means, excellent solutions (wolves) are able to be preserved during the iterations  x il and thus the convergence speed of the algorithm gets enhanced.

Local Search Algorithm (LSA).
In the proposed HDGWO, we proposed a local search algorithm (LSA) based on a perturbation operator [36] to exploit the neighborhood of a solution.The mechanism of the perturbation operator is illustrated in Figure 4. Firstly randomly select a solution to be modified, denoted by   .Secondly randomly select one weapon platform   from the total M ones, of which the assignment scheme will be adjusted.Thirdly, randomly choose two of its targets   and   (,  = 1, 2, ⋅ ⋅ ⋅ , ;  ̸ = ), satisfying   =   = 1 and   ,   > 0. Finally, perform the perturbation operation.If   ≥   , modify   to   − Δ  and   to   + Δ  , and vice versa.Note that Δ  is called the perturbation value, used to adjust the value of decision variables, and is set as 1 in this paper.After the perturbation operation, we obtain a new solution    based on   .
Considering that local search is relatively time consuming, and the search process in HDGWO is guided by the ,  and  solutions (wolves), we just apply the LSA to these three solutions with a selection probability  that determines whether a solution is to be selected as the base solution   .The pseudocode of the LSA is described in Algorithm 3.

Programming Procedures.
The pseudocode of the proposed algorithm is specified in Algorithm 4.

Simulation Results and Analysis
This section aims to validate HDGWO in solving WTA problems and includes two parts.In the first part, we adopt the instance in [32], an asset-based defensive WTA problem with four weapon platforms and five targets, as a benchmark case, and then compare the HDGWO with the discrete particle swarm optimization (DPSO) [13], the genetic algorithm with greedy eugenics (GAWGE) [24], and the adaptive immune genetic algorithm (AIGA) [32] by solving this benchmark case, so that the feasibility of the HDGWO to solve WTA problems could be demonstrated.To evaluate the solution quality, we also solve the case by the global solver of Lingo 11.0, a professional software package for nonlinear programming problems, and use the running result as a reference.In the second part, in order to examine the scalability of HDGWO, we first employ the above four algorithms to solve ten different scale WTA problems produced by a test case generator, and then compare the results of all algorithms from three aspects, that is, the comparison of the solution quality, the comparison of the

Input:
Combat scenario parameters, including M, N, Necessary parameters of HDGWO, including the population size   , the maximum iteration number  max , the elitist retention proportion , the penalty factor vector R = [  ] 1× , the perturbation value Δ  and the selection probability .

Output:
The optimal or suboptimal assignment scheme.
Initialize the solutions in the first iteration; Step .
Calculate the fitness value of all solutions in the tth iteration by formula (13), and sort solutions in descending order according to their fitness value, denoted by  → (),  → (), . ..,  → ().

Step .
Return the solution with the best fitness value in   ( + 1).
computation time and the comprehensive comparison based on a modified performance index (MPI).In this part, we also use the running results of Lingo 11.0 after 10000 iterations as references.All algorithms are coded in Matlab R2015a and all experimental tests are implemented on a computer with Intel Core i7-4790, 3.6 GHz, 8 GB RAM, Windows 7 operating system.

Analysis of Feasibility.
Although the combat scenario in [32] is a ground-based air defensive one, the WTA model in math is essentially the same as that introduced in Section 2. The parameters to describe the combat scenario are the number of weapon platforms is M=4, the number of targets is N=5, the missile number vector is Q=[4, 3, 2, 5], the value vector of targets is V=[3, 1, 5, 4, 8], the engagement constraint factor matrix is E= [1, 1, 0, 1, 0; 1, 0, 1, 1, 1; 0, 1, 1, 1, 0; 1, 1, 1, 0, 1], and the lethality probability matrix is P= [0.2, 0.6, 0.1, 0.9, 0.7; 0.8, 0.5, 0.5, 0.7, 0.3; 0.3, 0.7, 0.8, 0.6, 0.5; 0.7, 0.5, 0.4, 0.2, 0.9].The necessary input parameters of the HGDWO, the DPSO, the GAWGE, and the AIGA are shown in Table 2.Note that, in Table 2, we only list part of input parameters of the DPSO, the GAWGE, and the AIGA; the parameters specific to these three algorithms (i.e., the inertia weight of the DPSO, the crossover probability of the GAWGE, etc.) are consistent with those in the corresponding references and will not be listed in this paper.We independently run each algorithm above for 100 trials and record the maximum fitness value in each trial, as illustrated in Figure 5. Then we solve the benchmark case by the global solver of Lingo 11.0.After iterating 37869 times in 13s, the global solver found the global optimal solution, as illustrated in Figure 6.The corresponding objective value was 20.736.In this paper, we take the solution and the corresponding objective value obtained by Lingo 11.0 as the reference solution and the reference objective value, as shown in Table 3 and Figure 6, respectively.Then we define the successful trials of each algorithm as the trials which can obtain the reference objective value, as shown in Figure 7. Table 4 gives three statistical indicators: the average value of the maximum fitness value (AVMFV) of each algorithm for 100 trials with its standard deviation, the average value of the computation time (AVCT) of each algorithm for 100 trials with its standard deviation, and the number of the successful trials (NST).
It can be inferred from Figures 5 and 6 that the maximum fitness value is 20.736, which is consistent with that in [32].As reflected in Figure 7 and Table 4, among the four algorithms, the proposed HDGWO wins the second largest AVMFV and NST with the least AVCT, indicating that the proposed    algorithm is able to obtain a relatively good solution in the least computation time.To be specific, in terms of the solution quality, the AVMFV of the HDGWO is, respectively, -0.002%, 0, and 0.005% higher than those of the DPSO, the GAWGE, and the AIGA.Here the negative sign "-" means the HDGWO is inferior to the DPSO.The similar superiority and inferiority can also be reflected in the NST of all algorithms.Furthermore, regarding the computation time, the AVCT of the HDGWO is about 88.1%, 93.2%, and 34.5% less than those of the DPSO, the GAWGE, and the AIGA, respectively.Therefore, it can be concluded that the HDGWO is narrowly inferior to the DPSO in terms of the solution quality, but has a great advantage over the DPSO in terms of the computation time.Therefore, we can conclude that the HDGWO proposed in this paper achieves a better trade-off between the solution quality and the computation time than the DPSO, the GAWGE, and the AIGA, and thus is feasible to solve small-scale WTA problems.

Analysis of Scalability.
In this subsection, we analyze the scalability of the HDGWO by employing it to solve large-scale WTA problems and compare the running results with those of the DPSO, the GAWGE, and the AIGA.For this purpose, we first develop a test case generator to produce ten cases as introduced as follows.

Test Case Generator
(1) Generate the number of air-to-ground missiles available for the ith weapon platform   as   = ( max ),  = 1, 2, ⋅ ⋅ ⋅ , .The function randi(x), where x is a positive integer, returns a pseudorandom scalar integer between 1 and x.In this paper, we set  max as 8.
(4) Generate the engagement constraint factor as The function sign() returns 1 if x is a positive number, and -1 otherwise.  denotes the probability that   is equal to 0, and is set as 0.2.

Comparison of Solution
Quality.We execute each algorithm for 50 trials for each independently.Based on the running results, we then calculate the following indicators, as shown in Table 5, for the comparison of the solution quality: the average value of the maximum fitness value (AVMFV) of each algorithm for 50 trials with its standard deviation and the task completion ratio (TCR), and the improvement ratio of the solution quality (IRSQ) of one algorithm to that of another algorithm.TCR and IRSQ are defined as formula (16) and formula (17), respectively.In fact, in formula ( 16), the denominator should be the fitness value of the optimal solution, but it is too difficult to obtain for large-scale WTA problems.Therefore, we use the objective bounds obtained by the B-and-B (branch and bound) solver of Lingo 11.0 after 10000 iterations instead, denoted by Lingo ref, which will not affect the ranking of the four algorithms on the TCR indicator.The objective bounds for each case are shown in Table 5.In addition, it may be noted that in this paper our purpose is to compare the HDGWO with the DPSO, the GAWGE, and the AIGA, so we only need to analyze the IRSQ of the HDGWO.
We can see from Table 5 that the proposed HDGWO lags behind the DPSO and the GAWGE and thus achieves the third ranking on both the AVMFV and TCR indicators in all ten cases, which indicates that it is inferior to the DPSO and the GAWGE, but superior to the AIGA in terms of the solution quality.Then we analyze the disadvantage of the HDGWO relative to the DPSO and the GAWGE though ISRQ 1 and ISRQ 2 .The AVMFV of the HDGWO was only 3.07% and 5.81% smaller than those of the DPSO and the GAWGE in the worst case (i.e., in Case 10, ISRQ 1 =-3.07%,ISRQ 2 =-5.81%), respectively.And in all ten cases, the average values of ISRQ 1 and ISRQ 2 are separately -1.55% and -3.76%, indicating that the solution quality of the HDGWO is 1.55% and 3.76% worse than those of the DPSO and the GAWGE on average, respectively.Furthermore, we find that the TCR indicator of HDGWO in all cases exceeds 90% and averages 95.76%, which means that the solution obtained by the HDGWO is of satisfactory quality, even in Case 10 which involves 200 weapon platforms and 200 targets.Therefore, based on the above analysis a conclusion can be made that the proposed HDGWO has a relatively good scalability to provide a satisfactory solution for large-scale WTA problems, even though it is a little inferior to the DPSO and the GAWGE.

Comparison of Computation Time.
Another indicator for evaluating the performance of an algorithm is related to the computation time.In this paper, we calculate the average value of the computation time (AVCT) of each algorithm for 50 trials in all ten cases and the reduction ratio of the computation time (RRCT) of one algorithm to that of another algorithm by formula (18), as shown in Table 6.Considering our purpose in this paper, we only calculate the RRCT of the HDGWO.
As reflected in Table 6, on the AVCT indicator, the HDGWO is ranked second, just behind the AIGA in the first five cases (from Case 1 to Case 5), while it achieves the first ranking in the last five cases (from Case 6 to Case 10).This means that, in the aspect of the computation time, the HDGWO is inferior to the AIGA in the first five cases, but is superior to the AIGA in the last five cases and both the DPSO and the GAWGE in all cases.Since the HDGWO is inferior to both the DPSO and the GAWGE in terms of the solution quality, then we mainly investigate the advantage of the HDGWO over the DPSO and the GAWGE in the aspect of the computation time in this subsection.Note that a negative RRCT indicates HDGWO is less time consuming.By analyzing RRCT 1 and RRCT 2 we can easily find that the AVCT of the HDGWO is at least 75% smaller than those of the DPSO and the GAWGE, and on average it is 82.87% and 89.07%smaller, respectively.Thus, it can be inferred that on average the HDGWO consumes less than one-fifth of the computation time of the DPSO and one-ninth of the computation time of the GAWGE, but only reduces the solution quality by 1.55% and 3.76%, respectively.This means that the advantage of the HDGWO over the DPSO and the GAWGE in terms of the computation time is much more significant than the advantages of the DPSO and the GAWGE over the HDGWO in the aspect of the solution quality.

Comprehensive Comparison Based on MPI.
In this subsection, we make a comprehensive comparison based on the modified performance index (MPI), which is derived from the performance index (PI) defined by Bharti [37].The PI can be used to test the relative performance of different algorithms.Later, Mohan and Deep et al. [38][39][40][41] adopted the indicator to compare the performance of algorithms in their research.The PI of an algorithm is calculated as follows: where   is the total number of problems solved by the algorithm,   is the number of successful runs of the jth problem,   is the total number of runs of the jth problem,   is the minimum of the average computation time used by all algorithms in obtaining the solution for jth problem,   is the average computation time used by an algorithm in obtaining the solution for the jth problem,   is the minimum of the average number of function evaluations of successful runs used by all algorithms in obtaining the solution for the jth problem,   is the average number of function evaluations of successful runs used by the algorithms in obtaining the solution for the jth problem, and  1 ,  2 , and  3 ( 1 +  2 +  3 = 1 and 0 ≤  1 ,  2 ,  3 ≤ 1) are the weights assigned to the percentage of success, the computation time, and the average number of function evaluations of successful runs, respectively.Since it is too difficult to obtain the optimal solutions for large-scale WTA problems, the number of successful runs of the jth problem   is equal to 0, indicating that PI is equal to 0. Therefore, the original PI cannot be directly used to compare the performance of the four algorithms.Considering that the performance of an algorithm needs to be comprehensively evaluated in terms of both the solution quality and the computation time, we modify the original PI from the perspective of multiattribute decision making, where the solution quality is a benefit-type attribute and the computation time is a cost-type attribute.Then we normalize the values of the two attributes to the [0, 1] interval and aggregate the normalized values in the way similar to the formula (19).The modified performance index (MPI) of the ith algorithm is calculated as follows: where is the total number of problems solved by the ith algorithm,    is the  of the ith algorithm in solving the jth problem,    is the  of the ith algorithm in solving the jth problem, and  1 and  2 are the weights of the solution quality and the computation time, satisfying  1 +  2 = 1 and 0 ≤  1 ,  2 ≤ 1.Note that  and  are the average value of the maximum fitness value and the average value of the computation time, respectively, as defined in Sections 5.2.2 and 5.2.3.
It can be seen from Figure 8 that as the weight of the solution quality  increases, both the MPIs of the HDGWO and the AIGA decrease while those of the DPSO and the GAWGE increase.These trends are consistent with our previous analysis.The larger the weight of the solution quality , the more obvious the advantages of the DPSO and the GAWGE over the HDGWO and the AIGA in terms of the solution quality.Since no algorithms can keep the first ranking on the MPI indicator under different , then we further calculate the average value and the standard   deviation of the MPI for each algorithm.The average value of the MPI reflects the overall performance of the algorithm under different weights, and the standard deviation of the MPI measures the stability of the overall performance of the algorithm as the weight changes.The average value and the standard deviation of the MPI for each algorithm are shown in Table 7.
As reflected in Table 7, the average value of the MPI of the HDGWO is the largest and the corresponding standard deviation is the second smallest, only larger than that of the DPSO.This means that, from an overall point of view, the HDGWO has better performance than the other three algorithms and has the ability to maintain the performance at a relatively stable level, as the weight changes.Therefore, when the weight of the solution quality is unknown, the HDGWO should be recommended first among the four algorithms.But when the weight of the solution quality has been specified, the algorithm is recommended according to the following rules: the HDGWO is preferred if 0 ≤  < 0.6533; otherwise, the GAWGE is preferred.

Conclusion and Future Research
In this paper, we present the hybrid discrete grey wolf optimizer (HDGWO) to effectively solve WTA problems on the basis of the original grey wolf optimizer (GWO), which was developed only for optimizing the problems with a continuous solution space.To make GWO available, we adopt the decimal integer encoding method to represent solutions and propose the modular position update method to update solutions in the discrete solution space.Thus, we obtain the discrete grey wolf optimizer (DGWO) and then by combining it with the local search algorithm (LSA), we acquire the HDGWO.To evaluate the feasibility of HDGWO in solving WTA problems, we first employ it to solve the benchmark case in [32] and compare the running results with those of the DPSO, the GAWGE, and the AIGA.To analyze the scalability of the HDGWO, we compare three kinds of indicators that, respectively, measure the solution quality, the computation time, and the comprehensive performance of algorithms based on the running results of the HDGWO, the DPSO, the GAWGE, and the AIGA in solving ten different scale WTA problems produced by the test case generator.The simulation results prove the feasibility of the HDGWO in solving smallscale WTA problems and demonstrate the scalability of the HDGWO in solving large-scale WTA problems.In addition, the rules for selecting the best one among the four algorithms are also provided based on the comparison of the modified performance index.
We only implement the proposed HDGWO for the static WTA problems in the current work.In future research, we will concentrate on applying HDGWO to solve more complex WTA problems, such as the dynamic problems and consider more constraints in modeling, such as the necessary damage degree to targets, dynamic engagement feasibility factors, etc.Furthermore, we will also consider to employ the HDGWO to optimize similar resource assignment problems in other areas, such as industrial production and transportation, etc.

Figure 3 :
Figure 3: The mechanism of the modular position update method.

Figure 4 :
Figure 4: The mechanism of the perturbation operator.

Figure 5 :
Figure 5: The maximum fitness value in each trial of the HDGWO, the DPSO, the GAWGE, and the AIGA.

Figure 6 :
Figure 6: The running result of the global solver of Lingo 11.0.

Figure 7 :
Figure 7: The number of successful trials of the HDGWO, the DPSO, the GAWGE, and the AIGA.
and E = [  ] × , and the population size   .  = 1 →   Set the initial value of the n p th solution to a zero-vector [0] 1× .  can attack  IN() through the roulette wheel selection method in the first half of the iterations and thus obliges other solutions and E = [  ] × .  and   in turn; Calculate   =   /(  +   +   ), ( = , , ); Calculate the fitness value of   and    , denoted by   and   respectively; if   <   Replace   with    ,   =    ; endif (), denoted by   ,  ; for j=1→N if q max<=0 // determine if all weapons of   have been assigned break // terminates the execution of the for loop, i.e. for j=1→N endif if E(i, j))==1 //   can attack   Generate a random integer, denoted by n rand, between 1 and q max; Assign n rand weapons to   ; Update q max= q max-n rand;Input: Combat scenario parameters, including M, N, V = [V  ] 1× , P = [  ] × and E = [  ] × .()=   ; endif endfor Algorithm 3: Pseudo-code of the local search algorithm.

Table 2 :
The necessary input parameters of the HDGWO, the DPSO, the GAWGE, and the AIGA.

Table 4 :
Three statistical indicators: the AVMFV, the AVCT, and the NST.

Table 5 :
Three indicators of the HDGWO, the DPSO, the GAWGE and the AIGA for each case.

Table 6 :
The AVCT indicators of the HDGWO, the DPSO, the GAWGE and the AIGA for each case.

Table 7 :
The avg. and the std. of the MPI for the HDGWO, the DPSO, the GAWGE, and the AIGA.