An Improved Differential Evolution Algorithm Based on Dual-Strategy

Differential Evolution (DE) has shown excellent performance in solving optimization problems over continuous space and has been widely used in many fields of science and engineering. How to avoid local optima and how to improve the convergence performance of DE are hot topics for many researchers. In this paper, an improved differential evolution algorithm based on dual-strategy (DSIDE) is proposed. The DSIDE algorithm has two strategies: (1) an enhanced mutation strategy based on “DE/rand/1,” which takes into account the influence of reference individuals on mutation and has strong global exploration and convergence ability; (2) a novel adaptive strategy for the scaling factor and crossover probability based on fitness value, which has a positive impact on population diversity. The DSIDE algorithm is compared with seven other state-of-the-art DE variants on 30 benchmark functions. Furthermore, Wilcoxon's rank-sum test, the Friedman test, and the Kruskal–Wallis test are utilized to analyze the results. The experimental results show that the proposed DSIDE algorithm can significantly improve the global optimization performance.


Introduction
Differential Evolution (DE) is an optimization technique proposed by Storn and Price [1] in 1995, which was initially used to solve Chebyshev polynomials. Later, it was demonstrated that DE is also an effective method for solving complex optimization problems. Similar to other intelligent evolutionary algorithms, DE is a stochastic parallel optimization algorithm based on swarm intelligence, which guides the optimization search by imitating the heuristic swarm intelligence generated by cooperation and competition among individuals in the population.
In DE, the population consists of several individuals, each of which represents a potential solution to an optimization problem. DE generates offspring individuals through mutation, crossover, and selection, and the offspring individuals are expected to be closer to the optimal solution. As the number of generations increases during evolution, the population diversity deteriorates, leading to premature convergence or evolutionary stagnation, which is undoubtedly fatal to an algorithm that depends on the differences within the population. In addition, the performance of DE is affected by its control parameters [2,3]. For different optimization problems, these control parameters often require a large number of repeated experiments to be tuned to appropriate values that achieve a better optimization effect.
To address these shortcomings in DE, many improvements have been proposed, most of which focused on control parameters and mutation strategies.
Population size NP, scaling factor F, and crossover probability CR are three crucial control parameters in DE. Experiments reported in the literature show that the performance of DE can be improved by adjusting these control parameters. Omran et al. [4] proposed a self-adaptation scheme (SDE), in which F was adaptive and CR was generated by a normal distribution. Liu and Lampinen [5] proposed a fuzzy adaptive differential evolution algorithm (FADE), which used a fuzzy logic controller to adjust F and CR dynamically, with successfully evolved individuals and their fitness values as the input parameters of the controller. Brest et al. [6] developed a new adaptive DE algorithm, named jDE, which applied F and CR at the individual level: if a better individual was produced, these parameters were retained; otherwise, they were adjusted according to two constants. Noman et al. [7] proposed an adaptive differential evolution algorithm (aDE), which was similar to jDE [6], except that the updating of parameters in aDE depended on whether the offspring was better than the average individual in the parent population. Asafuddoula et al. [8] used roulette wheel selection to choose a suitable CR value for each individual in each generation of the population. Tanabe and Fukunaga [9] proposed the success-history-based parameter adaptation for differential evolution (SHADE), which generated new F and CR pairs by sampling the space near stored parameter pairs. Later, they came up with an improved version called L-SHADE [10], which built on SHADE and adopted a linear population size reduction strategy (LPSR) to reduce the population size NP continuously by a linear function. Zhu et al. [11] proposed an adaptive population tuning scheme (APTS) that dynamically adjusted the population size, in which redundant individuals were removed from the population or "excellent" individuals were generated. Zhao et al. [12] proposed a self-adaptive DE with population adjustment scheme (SAPA) to tune the size of the offspring population, which contained two kinds of population adjustment schemes. Pan et al. [13] proposed a parameter adaptive DE algorithm for real-parameter optimization, in which better control parameters F and CR were more likely to survive and produce good offspring. An enhanced DE with novel parameter control, referred to as DE-NPC, was proposed by Meng et al. [14]. The update of F and CR was based on the location information of the population and the success probability of CR, respectively, and a combined parabolic-linear population size reduction scheme was adopted. Di Carlo et al. [15] proposed a multipopulation adaptive version of the inflationary DE algorithm (MP-AIDEA), the parameters F and CR of which were adjusted together with the local restart bubble size and the number of local restarts of Monotonic Basin Hopping [16]. Li et al. [17] presented an enhanced adaptive differential evolution algorithm (EJADE), in which a CR sorting mechanism and a dynamic population reduction strategy were introduced.
To improve the optimization performance and balance the trade-off between global exploration and local exploitation, researchers have carried out a lot of work on mutation strategies in DE. Das et al. [18] proposed an improved algorithm based on the "DE/current-to-best/1" strategy, which made full use of the optimal individual information in the neighborhood to guide the mutation operation. Zhang and Sanderson [19] proposed an adaptive differential evolution algorithm (JADE), which adopted the "DE/current-to-pbest/1" mutation model, used suboptimal solutions to improve population diversity, and employed Cauchy and normal distributions to generate F and CR. Qin et al. [20] proposed a self-adaptive DE (SaDE), which adopted four mutation strategies to generate mutation individuals. The selection of the mutation strategy was affected by previous performance. A DE algorithm (CoDE) using three mutation strategies and three parameter settings in random combination was presented by Wang et al. [21]. Epitropakis et al. [22] proposed a novel framework that specified the selection probability in the mutation operation based on the distance between each individual and the mutation individual, thereby guiding the population toward the global optimum. Mallipeddi et al. [23] proposed the EPSDE algorithm, which was characterized by a stochastic selection of mutation strategies and parameters from a candidate pool consisting of three basic mutation strategies and preset parameters. Xiang et al. [24] proposed an enhanced differential evolution algorithm (EDE), which adopted a new combined mutation strategy composed of "DE/current/1" and "DE/pbest/1." Cui et al. [25] proposed a DE algorithm based on adaptive multiple subgroups (MPADE), which divided the population into three subgroups according to fitness values, each subgroup having its own mutation strategy. Wu et al. [26] presented a DE with a multipopulation-based ensemble of mutation strategies (MPEDE), which had three mutation strategies, three indicator subgroups, and one reward subgroup. After several evolutionary generations, the reward subgroup was dynamically assigned to the best-performing mutation strategy. Parameters with an adaptive learning mechanism for the enhancement of differential evolution (PALM-DE) was presented by Meng et al. [27]. Unlike the external archive of the mutation strategy in JADE [19] and SHADE [9], the inferior solution archive in the PALM-DE mutation strategy used a timestamp mechanism. In [28], Meng et al. introduced a novel parabolic population size reduction scheme and an enhanced timestamp-based mutation strategy to tackle the weakness of the previous mutation strategy. Wei et al. [29] proposed the RPMDE algorithm, which designed the "DE/M_pbest-best/1" mutation strategy, used the optimal individual group information to generate new solutions, and adopted a random perturbation method to avoid falling into local optima. Duan's DPLDE [30] algorithm used population diversity and population fitness to determine the individuals participating in the mutation operation, thus influencing the mutation strategy. Tian and Gao [31] proposed NDE, which employed two neighborhood-based mutation operators and an individual-based selection probability to adjust the search performance of each individual appropriately. Wang et al. [32] proposed a DE algorithm based on particle swarm optimization (DEPSO), which utilized an improved "DE/rand/1" mutation strategy and a PSO mutation strategy. Meng and Pan [33] presented a hierarchical archive-based mutation strategy with depth information of evolution for the enhancement of differential evolution (HARD-DE), in which the depth information was the linkage of more than three different generations of populations and was incorporated into the mutation strategy. A hybrid differential evolution algorithm based on the "DE/target-to-ci_mbest/1" mutation operation of CIPDE [34] and the "DE/target-to-pbest/1" mutation operation of JADE [19] was introduced by Pan et al. [35]. Meng et al. [36] proposed depth information-based DE with adaptive parameter control (Di-DE), the mutation strategy of which contained a depth information-based external archive.
As mentioned above, mutation strategies and control parameters affect the performance of DE, and "DE/rand/1" is widely used due to its strong global exploration ability and good population diversity. Many researchers have refined this mutation strategy. In this paper, an enhanced mutation strategy based on "DE/rand/1" is proposed by introducing a reference factor. In addition, the scaling factor and crossover probability are adapted according to the maximum, minimum, and average fitness values of the population and the fitness value of the individual, so as to adjust the population diversity effectively. The remainder of the paper is organized as follows. Section 2 describes the basic DE algorithm. Section 3 provides the details of the proposed DSIDE. In Section 4, the proposed DSIDE is compared and analyzed experimentally against seven advanced DE algorithms, and the effectiveness of the enhanced mutation strategy and the novel adaptive strategy for control parameters in DSIDE is studied. Section 5 summarizes the work of this paper and points out future research directions.

The Basic Differential Evolution Algorithm
An unconstrained optimization problem is to find the extremum of a function, which can be expressed as follows:

$$\min f(X), \quad X = (x_1, x_2, \ldots, x_D), \quad x_j^L \le x_j \le x_j^U, \quad j = 1, 2, \ldots, D, \tag{1}$$

where $f(\cdot)$ denotes the fitness value, $D$ represents the dimension of the problem, and $x_j^L$ and $x_j^U$ are the minimum and maximum values of $x_j$, respectively. The process of solving optimization problems in DE is divided into initialization, mutation, crossover, and selection.

Initialization.
To establish a starting point, an initial population must be created in the search space. Without loss of generality, the jth component (j = 1, 2, ..., D) of the ith individual (i = 1, 2, ..., NP) in the original population can be expressed as follows:

$$x_{i,j} = L + \mathrm{rand} \cdot (U - L), \tag{2}$$

where rand returns a uniformly distributed random number between 0 and 1, and $L$ and $U$ represent the lower and upper bounds of the solution space, respectively.
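As a minimal sketch of the initialization step described above, the following Python code draws each component uniformly between the lower and upper bounds. The function name `init_population`, its parameter names, and the use of a single scalar bound pair for all dimensions are illustrative assumptions, not part of the original algorithm description.

```python
import random


def init_population(np_size, dim, lower, upper, rng=None):
    """Create np_size individuals with dim components each.

    Every component is lower + rand * (upper - lower), with rand
    uniform in [0, 1). Names here are hypothetical.
    """
    if rng is None:
        rng = random.Random()
    return [
        [lower + rng.random() * (upper - lower) for _ in range(dim)]
        for _ in range(np_size)
    ]


# Example: 5 individuals in a 3-dimensional search space [-100, 100]^3.
pop = init_population(np_size=5, dim=3, lower=-100.0, upper=100.0,
                      rng=random.Random(0))
```

Passing a seeded `random.Random` instance makes a run reproducible, which is convenient when several independent runs are compared.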

Mutation.
The mutation strategy of the DE algorithm can be expressed as "DE/x/y," where "DE" means differential evolution algorithm, "x" represents the reference vector in the mutation operation, and "y" denotes the number of differential vectors in the mutation operation. The most common mutation strategy is to randomly select two different individuals in the population, scale their vector difference, and then add the result to another random individual. The obtained mutation individual $V_i^{G+1}$ is as follows:

$$V_i^{G+1} = X_{r1}^G + F \cdot (X_{r2}^G - X_{r3}^G), \tag{3}$$

where r1, r2, and r3 are randomly generated, mutually different integers ranging from 1 to NP that also differ from i; G represents the current generation number; and F denotes the scaling factor, which controls the amplification of the differential vector. The mutation strategy shown in equation (3) is known as "DE/rand/1".
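The "DE/rand/1" mutation described above can be sketched in a few lines of Python. The function name `mutate_rand_1` and the list-of-lists population layout are assumptions made for illustration; indices here are 0-based, whereas the text counts from 1.

```python
import random


def mutate_rand_1(pop, i, f, rng):
    """DE/rand/1: v = x_r1 + f * (x_r2 - x_r3).

    r1, r2, r3 are mutually distinct and all differ from the target
    index i, so the population needs at least 4 individuals.
    """
    candidates = [k for k in range(len(pop)) if k != i]
    r1, r2, r3 = rng.sample(candidates, 3)
    dim = len(pop[i])
    return [pop[r1][j] + f * (pop[r2][j] - pop[r3][j]) for j in range(dim)]


# Example with a tiny hand-written population.
pop = [[0.0, 0.0], [1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
mutant = mutate_rand_1(pop, i=0, f=0.5, rng=random.Random(0))
```

Using `rng.sample` on the index list without `i` enforces all the distinctness conditions in one step.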

Crossover.
The purpose of the crossover operation is to generate the trial vector $U_i^{G+1}$. Binomial crossover and exponential crossover are the two main crossover operators. In this paper, binomial crossover is adopted, and its expression is as follows:

$$U_{i,j}^{G+1} = \begin{cases} V_{i,j}^{G+1}, & \text{if } \mathrm{rand} \le CR \text{ or } j = j_{\mathrm{rand}}, \\ X_{i,j}^{G}, & \text{otherwise}, \end{cases} \tag{4}$$

where $X_{i,j}^G$ denotes the jth component of the ith individual in the current population; $CR \in [0, 1]$ is called the crossover probability, which determines the contribution of the mutation vector $V_i^{G+1}$ to the trial vector; and $j_{\mathrm{rand}}$ is a uniformly distributed random integer, ensuring that at least one component of the trial vector $U_i^{G+1}$ is inherited from the mutation vector $V_i^{G+1}$.
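A short Python sketch of binomial crossover follows. The function name `binomial_crossover` and the variable name `j_rand` for the guaranteed index are assumptions chosen for illustration.

```python
import random


def binomial_crossover(target, mutant, cr, rng):
    """Build a trial vector component by component.

    A component comes from the mutant when a fresh uniform draw is
    <= cr, or when its index equals the randomly chosen j_rand;
    otherwise it is copied from the target.
    """
    dim = len(target)
    j_rand = rng.randrange(dim)  # index that always takes the mutant value
    return [
        mutant[j] if (rng.random() <= cr or j == j_rand) else target[j]
        for j in range(dim)
    ]


# Example: with cr = 0.5, roughly half of the components come from the mutant.
trial = binomial_crossover([0.0] * 6, [1.0] * 6, cr=0.5, rng=random.Random(0))
```

The forced index means the trial vector can never be an exact copy of the target unless the mutant happens to agree with it in that component.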

Selection.
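In DE, the greedy selection strategy is utilized to compare the trial vector $U_i^{G+1}$ with the target vector $X_i^G$, and the one with the better fitness value is selected as the offspring individual $X_i^{G+1}$:

$$X_i^{G+1} = \begin{cases} U_i^{G+1}, & \text{if } f(U_i^{G+1}) \le f(X_i^G), \\ X_i^G, & \text{otherwise}, \end{cases} \tag{5}$$

where $f(\cdot)$ stands for the fitness value.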
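Putting the four steps of this section together, a basic DE loop for a minimization problem might look like the sketch below. The function names, the default values `f=0.5` and `cr=0.9`, the Sphere test function, and the absence of any bound repair after mutation are all assumptions for illustration rather than settings taken from this paper.

```python
import random


def sphere(x):
    """Simple test function with minimum 0 at the origin."""
    return sum(v * v for v in x)


def basic_de(fitness, dim, lower, upper, np_size=20, f=0.5, cr=0.9,
             g_max=100, seed=0):
    """Basic DE (DE/rand/1 + binomial crossover + greedy selection)."""
    rng = random.Random(seed)
    # Initialization: uniform sampling inside the bounds.
    pop = [[lower + rng.random() * (upper - lower) for _ in range(dim)]
           for _ in range(np_size)]
    fit = [fitness(x) for x in pop]
    for _ in range(g_max):
        new_pop, new_fit = [], []
        for i in range(np_size):
            # Mutation: three distinct indices, all different from i.
            r1, r2, r3 = rng.sample([k for k in range(np_size) if k != i], 3)
            mutant = [pop[r1][j] + f * (pop[r2][j] - pop[r3][j])
                      for j in range(dim)]
            # Crossover: at least one component comes from the mutant.
            j_rand = rng.randrange(dim)
            trial = [mutant[j] if (rng.random() <= cr or j == j_rand)
                     else pop[i][j] for j in range(dim)]
            # Selection: keep the trial only if it is not worse.
            trial_fit = fitness(trial)
            if trial_fit <= fit[i]:
                new_pop.append(trial)
                new_fit.append(trial_fit)
            else:
                new_pop.append(pop[i])
                new_fit.append(fit[i])
        pop, fit = new_pop, new_fit
    best = min(range(np_size), key=fit.__getitem__)
    return pop[best], fit[best]


best_x, best_f = basic_de(sphere, dim=5, lower=-100.0, upper=100.0, g_max=50)
```

Because selection is greedy, the fitness of each population slot can only stay the same or improve from one generation to the next, so the best fitness found never gets worse.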

DSIDE Algorithm
In DSIDE, the crossover and selection operations are the same as in the basic DE, as shown in equations (4) and (5), respectively. Next, the improved mutation strategy and the adaptive strategy are introduced.

An Enhanced Mutation Strategy.
From equation (3), it can be seen that the reference individual $X_{r1}^G$ plays an important role in regulating the balance of the evolutionary process. In the early stage of evolution, when most individuals are far away from the optimal solution, a larger $X_{r1}^G$ is conducive to jumping out of local optima. However, in the later stage of evolution, most individuals gradually approach the global optimal solution, and a larger $X_{r1}^G$ may cause individuals to deviate from the correct direction of evolution, which does not favor global convergence. On this basis, we propose an improved mutation strategy, defined by equations (6) and (7).
In equation (6), $\alpha_i$ ($\in [0, 1]$), $F_i$, and $CR_i$ are the reference factor, scaling factor, and crossover probability for each target individual $X_i^G$, respectively; G denotes the current generation number. In equation (7), r is a random number on the interval [0, 1], and $G_{\max}$ represents the maximum generation number. From equation (7), it can be observed that the value of $\alpha_i^G$ is relatively large at the initial evolutionary stage, which ensures a wide search range. As the number of generations increases, the value of $\alpha_i^G$ decreases and the search scope shrinks.
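The idea of weighting the reference individual by a factor that shrinks over the generations can be illustrated with the Python sketch below. Both formulas in it are assumptions made only for illustration: the mutation is taken to have the form v = alpha * x_r1 + f * (x_r2 - x_r3), and the schedule `1 - r * g / g_max` is a hypothetical stand-in that merely reproduces the described behaviour (large early, smaller later, within [0, 1]). Neither should be read as equation (6) or equation (7) of DSIDE.

```python
import random


def reference_factor(g, g_max, rng):
    """Hypothetical decreasing schedule, NOT the paper's equation (7).

    Equals 1 at g = 0 and tends to shrink as g approaches g_max,
    while always staying inside [0, 1].
    """
    r = rng.random()  # random number in [0, 1)
    return 1.0 - r * g / g_max


def mutate_with_reference_factor(pop, i, alpha, f, rng):
    """Assumed form: v = alpha * x_r1 + f * (x_r2 - x_r3)."""
    r1, r2, r3 = rng.sample([k for k in range(len(pop)) if k != i], 3)
    dim = len(pop[i])
    return [alpha * pop[r1][j] + f * (pop[r2][j] - pop[r3][j])
            for j in range(dim)]


rng = random.Random(0)
pop = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
alpha = reference_factor(g=10, g_max=1000, rng=rng)
mutant = mutate_with_reference_factor(pop, i=0, alpha=alpha, f=0.5, rng=rng)
```

With `alpha = 1` this assumed form reduces to plain "DE/rand/1", which makes the role of the reference factor easy to isolate in experiments.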

A Novel Adaptive Strategy for Control Parameters.
During the mutation operation of equation (3), the scaling factor affects the reference individual through the differential vector $(X_{r2}^G - X_{r3}^G)$, which is called the "perturbation." A larger F produces a larger "perturbation," which helps to maintain the population diversity but reduces the search efficiency of the algorithm. A smaller F helps to improve the convergence speed, but the population diversity is lost faster, and the algorithm easily falls into local optima and converges prematurely. During the crossover operation of equation (4), CR determines the contribution of the mutation vector to the trial vector. A larger CR facilitates the expansion of the search space, thus accelerating the convergence. However, the mutation individuals tend to become identical in the later evolutionary stage, which works against the maintenance of diversity. A smaller CR does not benefit the exploration of the search area. Therefore, F and CR should be adjusted adaptively to explore the global space more thoroughly in the early stage of evolution and to exploit the local area near the optimal solution in the later stage of evolution. Based on these points, a novel adaptive strategy is proposed, which dynamically adjusts the control parameters according to the fitness value, as shown in equations (8) and (9), where $f_i^G$ is the fitness value of the target individual $X_i^G$, $f_{\max}^G$ and $f_{\min}^G$ are the maximum and minimum fitness values at the current generation G, and $f_{\mathrm{mean}}^G$ is the average fitness value of the current population. The reference factor $\alpha_i^G$, scaling factor $F_i^G$, and crossover probability $CR_i^G$ are updated before each evolution. The entire process of the DSIDE algorithm is shown in Algorithm 1.

Benchmark Functions.
Unlike deterministic algorithms, evolutionary algorithms are difficult to prove superior to other algorithms because theoretical knowledge about them is limited. Therefore, benchmark functions are utilized to evaluate the performance of evolutionary algorithms. In this section, the performance of DSIDE is tested on 27 benchmark functions [37][38][39] listed in Table 1, where D is the dimension of the problem. $f_1$ to $f_{11}$ are unimodal functions. $f_{12}$ has one minimum and is discontinuous.
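To make the distinction between a smooth unimodal function and a discontinuous one concrete, the sketch below gives common textbook forms of the Sphere and Step functions in Python. Whether these exact forms and names correspond to particular entries of Table 1 is an assumption; they are shown only as typical examples of the two categories.

```python
import math


def sphere(x):
    """Sphere function: smooth and unimodal, minimum value 0 at the origin."""
    return sum(v * v for v in x)


def step(x):
    """Step function: piecewise constant, hence discontinuous.

    The minimum value 0 is reached on the whole plateau where every
    component lies in [-0.5, 0.5).
    """
    return sum(math.floor(v + 0.5) ** 2 for v in x)


values = (sphere([1.0, 2.0]), step([0.2, -0.3]), step([0.6, 0.0]))
```

A flat plateau such as the one in `step` gives a search algorithm no gradient-like information inside each cell, which is one reason such functions are used to probe robustness.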

Comparison with 7 Improved DE Algorithms.
Here, we mainly compare the overall optimization performance of jDE [6], JADE [19], SaDE [20], CoDE [21], EPSDE [23], MPEDE [26], DEPSO [32], and the proposed DSIDE algorithm. Experiments are carried out on benchmark functions $f_1$ to $f_{30}$ at 30 D and 100 D, respectively. The parameters of the other algorithms are the same as in their original papers. The population size NP is set to 100 for all algorithms, and 30 independent runs with a maximum of 1000 evolutionary generations are conducted. Tables 2 and 3 show the mean and standard deviation (mean ± std) of the fitness error over 30 runs at 30 D and 100 D, respectively. The symbols "+," "≈," and "−" after each "mean ± std" pair denote "Better Performance," "Similar Performance," and "Worse Performance," respectively, all of which are measured under Wilcoxon's signed-rank test at a significance level of α = 0.05. Furthermore, Wilcoxon's rank-sum test, the Friedman test, and the Kruskal-Wallis test [39,40] in Tables 4-6 are employed to further test the optimization performance of all algorithms. The best results in the tables are shown in bold. In addition, representative convergence curves of all algorithms are given in Figures 1 and 2.

ALGORITHM 1: DSIDE.
(1) Initialize the original population pop and calculate the fitness values; NP = 100, G = 1, G max = 1000;
(2) while G ≤ G max do
(3)   for i = 1 to NP do
(4)     Calculate α i in equation (7);
(5)     Calculate F i in equation (8);
(6)     Calculate CR i in equation (9);
(7)     Implement mutation in equation (6);
(8)     Implement crossover in equation (4);
(9)     Implement selection in equation (5);
(10)  end for
(11)  G = G + 1
(12) end while

[Table 1: Benchmark functions. The table body is not recoverable; surviving entries include Step, Schaffer's F6, Levy and Montalvo 1, and Levy and Montalvo 2.]

Table 4 reports the results of Wilcoxon's rank-sum test for the 30 D and 100 D problems. R+ is the sum of positive ranks, for cases in which the first algorithm performs better than the second, and R− is the sum of negative ranks, for cases in which the first algorithm performs worse than the second. As shown in the table, for every pairwise comparison, the R+ value obtained by DSIDE is higher than the R− value. This indicates that DSIDE significantly outperforms the other compared DE algorithms. Tables 5 and 6 use the Friedman and Kruskal-Wallis statistical tests, respectively, to compare the performance of each algorithm on the 30 D and 100 D problems. It can be seen that the test results obtained by DSIDE are the smallest at both high and low dimensions, indicating that DSIDE has the best performance among the compared algorithms.
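The rank sums R+ and R− described above can be computed as in the following Python sketch, which ranks the absolute paired differences and sums the ranks by sign (this is the signed-rank construction). The function name `signed_rank_sums`, the convention that a lower error is better, the averaging of tied ranks, and the dropping of zero differences are assumptions; other conventions for zeros exist, and the paper does not state which one it uses.

```python
def signed_rank_sums(errors_a, errors_b):
    """Return (r_plus, r_minus) for paired errors of algorithms A and B.

    r_plus sums the ranks of pairs where A has the lower error,
    r_minus sums the ranks of pairs where A has the higher error.
    Zero differences are dropped; tied magnitudes share an average rank.
    """
    diffs = [b - a for a, b in zip(errors_a, errors_b)]  # > 0: A is better
    nonzero = [d for d in diffs if d != 0]
    order = sorted(range(len(nonzero)), key=lambda k: abs(nonzero[k]))
    ranks = [0.0] * len(nonzero)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next magnitude ties with this one.
        while (end + 1 < len(order)
               and abs(nonzero[order[end + 1]]) == abs(nonzero[order[pos]])):
            end += 1
        average_rank = (pos + end) / 2 + 1  # ranks are 1-based
        for k in order[pos:end + 1]:
            ranks[k] = average_rank
        pos = end + 1
    r_plus = sum(r for r, d in zip(ranks, nonzero) if d > 0)
    r_minus = sum(r for r, d in zip(ranks, nonzero) if d < 0)
    return r_plus, r_minus


# Hypothetical paired errors on four functions (illustrative numbers only).
r_plus, r_minus = signed_rank_sums([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 2.0, 8.0])
```

With n nonzero differences, the two sums always add up to n(n + 1)/2, so a large R+ necessarily comes with a small R−.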

Mathematical Problems in Engineering
So far, all the nonparametric tests, including Wilcoxon's rank-sum, Friedman, and Kruskal-Wallis test, support the conclusion that DSIDE is superior to other competing algorithms.
Furthermore, we compare the convergence curves of each algorithm on benchmark functions at 30 D and 100 D. All convergence curves are studied and analyzed from the aspects of convergence precision and whether they converge to the global optimum or not. Some representative convergence curves are depicted in Figures 1 and 2.
As shown in Figures 1(a) and 1(b), in the convergence curves of $f_1$ at 30 D and 100 D, only DSIDE converges to the global optimum, and its average convergence accuracy is much higher than that of the other algorithms at the same generation. The convergence curves of $f_7$ are shown in Figures 1(c) and 1(d). Although its convergence precision is not always the best during the evolution process, only DSIDE reaches the global optimum. Figures 1(e) and 1(f) show the convergence curves of $f_{13}$ at 30 D and 100 D, respectively. None of the algorithms finds the optimal solution, but the average convergence accuracy of DSIDE is much higher than that of the other algorithms at the same generation, and DSIDE obtains the best value. Figures 1(g) and 1(h) show the convergence curves of $f_{14}$ at 30 D and 100 D, respectively. None of the algorithms obtains the global minimum. JADE performs the best on the low-dimensional problem, while DSIDE is the best on the high-dimensional problem. In Figures 1(i) and 1(j), [...] on $f_{16}$ and consumes fewer generations and converges quickly.
In Figure 2(a), DSIDE, JADE, DEPSO, and EPSDE obtain the optimum on $f_{18}$ at 30 D. In Figure 2 [...]
In general, the comparative analysis of the above experiments shows that DSIDE not only obtains the global optimal value most often on these benchmark functions but is also superior to the other algorithms in terms of convergence speed and convergence accuracy.

Efficiency Analysis of Proposed Algorithmic Components.
So far, the above experiments exhibit the combined effect of the proposed DSIDE. In this section, the efficiency of the proposed algorithmic components is analyzed, including the enhanced mutation strategy with the reference factor and the adaptive strategy for the scaling factor and crossover probability. Some variants of DSIDE are listed as follows:
(i) To verify the effectiveness of the enhanced mutation strategy with reference factor α, DSIDE variants adopt dynamic F and CR together with a constant reference factor of α = 0.3 or α = 0.6, or a random real number in [...]
To evaluate and compare the performance of the DSIDE variants, the Friedman test, Kruskal-Wallis test, and Wilcoxon's rank-sum test are adopted, and the test results are shown in Figure 3(a), Figure 3(b), and Table 7, respectively. The following observations can be made.
(1) From Figure 3, we can observe that DSIDE and DSIDE-6 rank first and second, while the performance of the other DSIDE variants is relatively low. The combined effect of the proposed algorithmic components is the best.
(2) From Table 7, the integrated DSIDE performs significantly better than the DSIDE variants with a larger reference factor and a larger scaling factor (DSIDE-2 and DSIDE-5), as well as the DSIDE variants with different crossover probabilities (DSIDE-7, DSIDE-8, and DSIDE-9). The performance of the integrated DSIDE shows no significant difference from that of DSIDE-1 with a smaller reference factor, DSIDE-3 with a random reference factor, and DSIDE-4 with a smaller scaling factor when the significance level of Wilcoxon's rank-sum test is 0.1, but the opposite holds when the significance level is 0.05. At the same time, there is no performance difference between DSIDE and DSIDE-6 with a random scaling factor, regardless of the significance level.
The validity of the proposed mutation strategy and adaptive strategy for control parameters is demonstrated by the above experimental comparisons. It is noted that the contribution of the adaptive strategy for crossover probability is larger than that of the enhanced mutation strategy and the adaptive strategy for the scaling factor. That is to say, although the enhanced mutation strategy with the reference factor and the adaptive strategy for the scaling factor are effective, DSIDE is less sensitive to a smaller or randomly varying reference factor and scaling factor.

Conclusions
DSIDE's innovation lies in two strategies: the enhanced mutation strategy and the novel adaptive strategy for control parameters. On the one hand, the enhanced mutation strategy considers the influence of the reference individual on the overall evolution. It introduces the reference factor, which is beneficial to global exploration in the early stage of evolution and to global convergence in the later stage. On the other hand, the novel adaptive strategy for control parameters dynamically adjusts the scaling factor and crossover probability according to the fitness value, which has a positive impact on maintaining the population diversity. DSIDE is compared with seven other DE algorithms, the results are evaluated by three nonparametric statistical tests, and the convergence curves are analyzed. Experimental results show that the proposed DSIDE can effectively improve the optimization performance. In addition, the efficiency analysis of the proposed algorithmic components has been carried out, which further supports the comprehensive effect and validity of DSIDE.
So far, DE variants have been applied to various fields, such as target allocation [41], text classification [42], image segmentation [43], and neural networks [44][45][46][47]. In future work, the proposed DSIDE algorithm will be applied to the parameter optimization of neural networks and may further be applied to air traffic control systems for flight trajectory prediction [48,49].

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.