Adaptive Differential Evolution Algorithm with Simulated Annealing for Security of IoT Ecosystems

With the wide application of the Internet of Things (IoT) in the real world, the impact of security on its development is becoming increasingly important. Recently, many advanced technologies, such as artificial intelligence (AI), computational intelligence (CI), and deep learning methods, have been applied in different security applications. For intrusion detection systems (IDS) in the IoT, this paper develops an adaptive differential evolution algorithm based on simulated annealing (ASADE) to deal with feature selection problems. The mutation, crossover, and selection processes of the self-adaptive DE algorithm are modified to avoid becoming trapped in a local optimal solution. In the mutation process, the mutation factor is changed based on the hyperbolic tangent function curve. A linear function of the generation number is incorporated into the crossover operation to control the crossover factor. In the selection process, this paper adopts the Metropolis criterion of the SA algorithm to accept a poor solution as the optimal solution. To test the performance of the proposed algorithm, numerical experiments were performed on 29 benchmark functions from CEC2017 and six typical benchmark functions. The experimental results indicate that the proposed algorithm is superior to the other four algorithms.

The Artificial Intelligence Internet of Things (AIoT) makes the intercommunication of various networks and systems more efficient [25][26][27][28][29]. Deep learning has also made many contributions to the realization of AIoT [30][31][32] and other fields [33,34]. Many scholars have adopted different computational intelligence techniques, including fuzzy systems, neural networks [35][36][37], swarm intelligence [38], differential evolution algorithms [39], and other evolutionary computation methods [40], to resolve differential optimization problems. The optimization and improvement of the differential evolution (DE) algorithm have become a trend in IoT applications. Xue et al. [41] adopted a self-adaptive differential evolution algorithm (SaDE) to deal with feature selection problems. For nonuniform IoT node deployments, to solve nonlinear real-parameter problems, Ghorpade et al. [42] proposed an enhanced particle swarm optimization algorithm and adopted differential crossover quantum in this algorithm. In heterogeneous resource allocation, to minimize service cost and service time, Fang et al. [43] proposed a dynamic multiobjective evolutionary algorithm to allocate IoT services. Iwendi et al. [44] proposed a metaheuristic optimization approach for energy efficiency in IoT networks. Yang et al. [45] proposed an intelligent trust cloud management method for secure and reliable communication in Internet of Medical Things (IoMT) systems. Qureshi et al. [46] proposed enhanced differential evolution (EDE) and adaptive EDE algorithms to effectively improve the topology robustness of the IoT network while keeping the node degree distribution unchanged.
In 1995, Storn and Price first proposed the DE algorithm on the basis of the genetic algorithm to solve global optimization problems over continuous spaces [47]. As a metaheuristic algorithm, the DE algorithm uses the individuals in the population to represent the solutions of the problem and updates the individuals through the mutation, crossover, and selection operations. Due to its easy implementation, high convergence speed, and superior robustness, the DE algorithm has been widely used in many fields, including dynamic optimization problems [48], constrained optimization problems [49], multiobjective optimization problems [50], and engineering design problems in practical applications [51]; it can also be used as a scheduling algorithm in CPS [52].
To further improve the performance of the DE algorithm, many scholars have developed different optimized DE algorithms, which adopt adaptive mutation and crossover strategies to optimize the mutation and crossover processes. Mohamed and Suganthan [53] proposed an enhanced DE algorithm and introduced a new triangular mutation operator and two adaptive schemes to change the values of the mutation factor and crossover factor. Mohamed and Mohamed [54] proposed a new DE algorithm, namely, AGDE, to prepare two candidate pools of the crossover factor and adaptively update the parameter value. Mohamed et al. [49] proposed an enhanced DE algorithm (EDDE) to solve constrained engineering optimization problems. EDDE uses information from individuals with different fitness function values in the population to generate a mutation vector. Wu et al. [55] realized an ensemble of three DE variants (EDEV). In each generation of the ensemble, the optimal variant is obtained by competition among the three variants, and the final evolution is carried out by the optimal variant. Elquliti and Mohamed [56] proposed the nonlinear integer goal programming problem with binary and real variables and developed an improved real-binary differential evolution (IRBDE) algorithm for solving constrained optimization problems. In the developed algorithm, a new binary mutation strategy is introduced to deal with binary variables. Fu et al. [57] proposed an adaptive DE algorithm with an aging leader and challenger mechanism to solve optimization problems. Sun et al. [58] proposed a hybrid adaptive DE algorithm, namely, HADE, which develops a mutation process with a disturbance factor and adjusts the crossover factor according to the fitness function values. Huynh et al. [59] added a Q-learning model to generate the values of the mutation factor and the crossover factor in the DE algorithm. Zeng et al. [60] combined the DE algorithm with the SA algorithm to generate a new individual with a Markov chain length of L times in the mutation operation.
In this study, we propose an adaptive simulated annealing differential evolution (ASADE) algorithm based on the SA algorithm and the DE algorithm. In the ASADE algorithm, the mutation factor is modified with reference to the hyperbolic tangent function curve, the crossover factor varies linearly with the generation number, and the selection operation is combined with the Metropolis criterion of the SA algorithm. In the early evolution stages of the proposed algorithm, the mutation factor and crossover factor maintain relatively large values, which enhances the ability of the algorithm to escape from a local optimal solution. In the middle evolution stages, the values of the mutation factor and crossover factor decrease, so the algorithm converges faster and trades off global and local search abilities. In the later stage, the mutation factor and crossover factor maintain relatively small values; the search continues until the optimal solution is found. ASADE was tested on the benchmark suite of the 2017 IEEE Congress on Evolutionary Computation (CEC2017) [61] and on six typical benchmark functions. The experiments and comparisons show that ASADE is superior to two typical population-based algorithms and two optimized DE algorithms. This paper is arranged as follows: the DE algorithm and SA algorithm are introduced in Section 2. Section 3 proposes the ASADE algorithm. The experimental results are discussed in Section 4. Finally, the conclusions of this paper are presented in Section 5.

Related Algorithms
The DE algorithm is a direct search algorithm based on biological ideas for solving global optimization problems. It uses evolutionary processes such as the mutation and crossover operations to obtain a new individual as a new solution to the optimization function. The DE algorithm focuses on the diversity of solutions and the effectiveness of convergence. Compared with other optimization algorithms, the DE algorithm has fewer control parameters, faster convergence speed, stronger robustness in optimization results, and wider application in various fields.
The DE algorithm includes four processes: initialization, mutation, crossover, and selection. In the initialization process, the initial parameters include the population size ($NP$), mutation factor ($F$), crossover factor ($CR$), maximum number of evolutionary generations ($G_m$), number of variables ($D$), and range of the variables in individuals ($[x_{\min}, x_{\max}]$). The fitness function for the specific problem ($f(X)$) and the initial population ($X_0$), which serves as the target population in the first generation, are also obtained. In the mutation and crossover processes, each individual in the target population is mutated and crossed to generate a trial individual, and the fitness function value of each individual is obtained. In the selection process, the fitness function values are compared, and the better individuals from the target population and the trial population are selected to form the target population of the next generation. Algorithm 1 presents the algorithmic process of the DE algorithm.
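The four processes described above can be illustrated with a minimal Python sketch. It assumes the common DE/rand/1/bin variant (the text does not name the mutation strategy), clips mutants to the variable range, and uses the sphere function as a stand-in fitness function; the function name `differential_evolution` and all default parameter values are illustrative choices, not settings taken from the paper.

```python
import random

def differential_evolution(f, bounds, NP=20, F=0.5, CR=0.9, G_m=200, seed=0):
    """Basic DE minimizing f. ASSUMPTION: DE/rand/1/bin strategy."""
    rng = random.Random(seed)
    D = len(bounds)
    # Initialization: NP individuals drawn uniformly from [x_min, x_max].
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(NP)]
    fit = [f(x) for x in pop]
    for _ in range(G_m):
        new_pop, new_fit = list(pop), list(fit)
        for i in range(NP):
            # Mutation: combine three distinct individuals other than i.
            r1, r2, r3 = rng.sample([j for j in range(NP) if j != i], 3)
            mutant = [pop[r1][d] + F * (pop[r2][d] - pop[r3][d]) for d in range(D)]
            # ASSUMPTION: out-of-range values are clipped to the bounds.
            mutant = [min(max(v, lo), hi) for v, (lo, hi) in zip(mutant, bounds)]
            # Crossover: each gene comes from the mutant with probability CR;
            # j_rand guarantees that at least one mutant gene is used.
            j_rand = rng.randrange(D)
            trial = [mutant[d] if (rng.random() < CR or d == j_rand) else pop[i][d]
                     for d in range(D)]
            # Selection: keep the better of the target and trial individuals.
            f_trial = f(trial)
            if f_trial <= fit[i]:
                new_pop[i], new_fit[i] = trial, f_trial
        pop, fit = new_pop, new_fit
    best = min(range(NP), key=lambda i: fit[i])
    return pop[best], fit[best]

def sphere(x):
    """Stand-in fitness function with global minimum 0 at the origin."""
    return sum(v * v for v in x)

best_x, best_f = differential_evolution(sphere, [(-5.0, 5.0)] * 3)
```

With fixed $F$ and $CR$ this is the baseline that the adaptive schemes of Section 3 modify.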

SA Algorithm.
The SA algorithm is a stochastic intelligent optimization algorithm based on the Monte Carlo method for solving optimization problems; its name comes from the annealing and cooling process in metallurgy [62]. The algorithm treats a feasible solution of the optimization problem as a particle in a solid. The particle reaches the final ground state in the process of cooling and annealing, and the internal energy is reduced to its minimum value, which is similar to the process of finding the optimal value of the problem. In the process of particle cooling, at a high temperature, a new state whose energy differs significantly from that of the current state is more likely to be accepted as an important state, while at a low temperature, only a new state with a smaller energy difference from the current state tends to be accepted as an important state. As the temperature tends toward a constant, no new state can be accepted. This criterion for accepting a new state is called the Metropolis criterion. According to the Metropolis criterion, the probability that a particle will go to equilibrium at temperature $T$ is $e^{-\Delta E/(kT)}$, where $E$ represents the internal energy, $\Delta E$ is the change in energy, and $k$ is the Boltzmann constant. A new solution is accepted or rejected according to this probability while searching for the solution of the problem. The SA process can be described in the following steps.
Step 1. Set the initial parameters: the objective function $f$, initial temperature $T$, cooling function $T_k$, and Markov chain length $L$. In a single evolution, the number of iterations used to generate new solutions is set by the Markov chain length.
Step 2. Randomly choose an initial feasible solution $X$, which can be regarded as a particle in a solid. At this point, the optimal value of the objective function is $f(X)$.
Step 3. Generate a new solution $L$ times, where $L$ is the Markov chain length; this is called the Markov process. The method of generating new solutions is as follows: (2) Part 2 Assuming that the resulting new solution is $P(p_1, p_2, \ldots, p_k, \ldots, p_l, \ldots, p_n)$, for each of the unknown variables, $p_i$ ($i = 1, 2, \ldots, n$) can be expressed as where $v \in [0, 1]$ and $w$ is a random number between 0 and 1.
Step 4. After generating a new solution, decide whether to accept it according to the Metropolis criterion. The criterion equation is as follows: where $\Delta E = |f(X_{\text{new}}) - f(X)|$ and $\mu$ is the cooling coefficient. If $p > \mathrm{rand}(0, 1)$, accept the new solution; otherwise, reject it. Decrease the Markov chain length by 1, and repeat the process of producing new solutions until the Markov chain length equals 0. Choose the solution corresponding to the greatest $p$ value as the optimal solution of the iteration.
Step 5. Obtain the current optimal solution from Step 4, execute the cooling function $T_k$, and determine whether the temperature remains unchanged. If so, output the current optimal solution. If not, go to Step 3.
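The five steps can be sketched in Python as follows. Several details are assumptions rather than the paper's definitions: new solutions are produced by a Gaussian perturbation (the generation equations of Step 3 are not reproduced in this text), improvements are always accepted and worse solutions are accepted with probability $e^{-\Delta E/T}$, cooling is geometric ($T \leftarrow \mu T$), the loop stops at a temperature threshold `T_stop`, and the best solution seen so far is returned. All names and default values are hypothetical.

```python
import math
import random

def simulated_annealing(f, x0, T0=1.0, mu=0.9, L=50, T_stop=1e-3, step=0.5, seed=0):
    """Minimize f starting from x0 (illustrative sketch, see assumptions above)."""
    rng = random.Random(seed)
    x, fx = list(x0), f(x0)            # Step 2: initial solution
    best_x, best_f = list(x), fx
    T = T0                             # Step 1: initial temperature
    while T > T_stop:                  # ASSUMPTION: stop at a temperature threshold
        for _ in range(L):             # Step 3: Markov chain of length L
            # ASSUMPTION: Gaussian perturbation as the new-solution generator.
            cand = [v + rng.gauss(0.0, step) for v in x]
            fc = f(cand)
            dE = abs(fc - fx)
            # Step 4: Metropolis criterion. Improvements are always accepted;
            # worse candidates are accepted with probability exp(-dE / T).
            if fc < fx or math.exp(-dE / T) > rng.random():
                x, fx = cand, fc
                if fx < best_f:
                    best_x, best_f = list(x), fx
        T *= mu                        # Step 5: ASSUMPTION geometric cooling
    return best_x, best_f

def sphere(x):
    return sum(v * v for v in x)

sa_x, sa_f = simulated_annealing(sphere, [3.0, -2.0])
```

At high temperature the chain accepts many worse candidates and wanders widely; as $T$ shrinks, the acceptance probability for the same $\Delta E$ falls and the search becomes nearly greedy.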

The ASADE Algorithm
The DE algorithm is a stochastic direct search evolutionary algorithm. In the process of evolution, the mutation operation and crossover operation greatly impact the diversity of the solutions. In the selection operation, different optimal solutions can affect the optimization process of the next evolution. Based on the literature [63], the ASADE algorithm is proposed.

Adaptive Mutation.
Previous research [64] found that the mutation factor is closely related to the search step size.
In the early stages of evolution, within the scope of the global feasible solutions, a large mutation factor allows the solutions to be searched widely, the structure of the solutions is more directional and diversified, and it is easier to escape from a local optimal solution. In the middle and late stages of evolution, when the range of the global optimal solution has been found, a small mutation factor helps to search accurately for better solutions, and the DE algorithm performs more effectively. According to the analysis of the range of the mutation factor, a hyperbolic tangent function over the interval [-4, 4] is adopted in this paper to adjust the value of the mutation factor, and its equation is as follows:

Wireless Communications and Mobile Computing
where $G_m$ is the maximum evolution generation, $G$ is the current generation, and $F_{\max}$ and $F_{\min}$ are the maximum and minimum values of the mutation factor, respectively. Taking the maximum evolution generation as 1000, the variation trend of the mutation factor in this paper is compared with that in the study by Sun et al. [58], and the graph is plotted in Figure 1.
In Figure 1, in the mutation process of the proposed algorithm, the mutation factor is maintained at approximately 0.75 in the first 350 generations. Through a period of global searching, the algorithm can get rid of the local optimal solution continually and find the range of the global optimal solution. In the other two cases, the rate of decreasing the mutation factors at the early stages of evolution is fast, which shortens the algorithm's global search time and makes the global search less effective. From 350 to 700 generations, the mutation factor of the proposed algorithm decreases from 0.75 to 0.25, and at the same time, the search scope also reduces from global to local. After 700 generations, the mutation factor remains at 0.25. After performing a local search for a sufficiently long period of time, the better optimal solution is found in the local scope.
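To make the plateau, drop, plateau shape concrete, the sketch below shows one plausible hyperbolic tangent schedule. The functional form is an assumption: the paper's own equation is not reproduced in this text, and the only stated ingredients are a tanh curve over [-4, 4], $G$, $G_m$, $F_{\max}$, and $F_{\min}$. The default values 0.75 and 0.25 are taken from the plateau levels reported for Figure 1; the function name `mutation_factor` is hypothetical.

```python
import math

def mutation_factor(G, G_m, F_max=0.75, F_min=0.25):
    """ASSUMED schedule: F falls from about F_max to about F_min along tanh."""
    # Map generation G in [0, G_m] linearly onto the tanh argument range [4, -4].
    t = 4.0 - 8.0 * G / G_m
    # tanh(t) goes from about +1 to about -1, so F moves between the two bounds.
    return (F_max + F_min) / 2.0 + (F_max - F_min) / 2.0 * math.tanh(t)

# Sample the schedule for G_m = 1000 generations.
schedule = [mutation_factor(G, 1000) for G in range(0, 1001, 100)]
```

Under this assumed form the factor stays near $F_{\max}$ early on, passes through the midpoint $(F_{\max} + F_{\min})/2$ at $G = G_m/2$, and settles near $F_{\min}$ late in the run, which is qualitatively the behavior described for Figure 1.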

Adaptive Crossover.
In the DE algorithm, the mutation factor has a great impact on the global search, while the crossover factor increases the diversity of solutions and can affect the search range, although its influence is slightly smaller. To improve the efficiency of the solution, we introduce a linear function of the number of generations in equation (5) to express the crossover factor. The value range of the crossover factor $CR$ is $CR_{\min} < CR < CR_{\max}$.
Input: the initialization parameters $NP$, $F$, $CR$, $G_m$, $D$, $[x_{\min}, x_{\max}]$, and $f(X)$
Output: the optimal solution of the problem
Population initialization
generation = 0
Select from the trial population $U_k$ and the target population
Algorithm 1: The algorithmic description of DE.

In the early stage of evolution, the value of the crossover factor is relatively large, and the diversity of feasible solutions is relatively rich, while in the later stage of evolution, the crossover factor gradually decreases. According to the crossover process in the DE algorithm, as the crossover factor decreases, the probability that a value randomly generated in [0, 1] is smaller than the crossover factor decreases, so the probability that the crossover vector takes its components from the parent vector increases and the diversity of the population decreases.
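A linearly decreasing crossover factor can be sketched as below. The exact form of equation (5) is not reproduced in this text, so the expression is an assumption consistent with the description (large early, gradually smaller later, bounded by $CR_{\min}$ and $CR_{\max}$). The defaults $CR_{\max} = 1$ and $CR_{\min} = 0.9$ are the ASADE settings reported in the experiments; the function names are hypothetical.

```python
import random

def crossover_factor(G, G_m, CR_max=1.0, CR_min=0.9):
    """ASSUMED linear schedule: CR falls from CR_max at G = 0 to CR_min at G = G_m."""
    return CR_max - (CR_max - CR_min) * G / G_m

def binomial_crossover(target, mutant, CR, rng):
    """Take each gene from the mutant when a uniform draw is below CR."""
    j_rand = rng.randrange(len(target))  # forces at least one mutant gene
    return [m if (rng.random() < CR or j == j_rand) else t
            for j, (t, m) in enumerate(zip(target, mutant))]

rng = random.Random(0)
early = binomial_crossover([0.0] * 5, [1.0] * 5, crossover_factor(0, 1000), rng)
```

With $CR = 1$ every uniform draw in [0, 1) falls below the factor, so the trial vector is the mutant; as $CR$ shrinks, more genes are copied from the parent.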
SA Selection.
In the selection process of the DE algorithm, the fitness function values of individuals are compared between the trial population and the target population, and the individuals with smaller fitness function values are selected to form the target population of the next generation. After the selection process, the individual with the smallest fitness function value is the optimal value obtained by this evolution. In the process of selection, a poor solution in the trial population is easily ignored. Taking such a poor solution into account could produce more diverse mutation vectors and thereby affect the value of the optimal solution. In this paper, the Metropolis criterion of the SA algorithm is introduced into the selection process to accept a poor solution as an individual in the next population. Selecting the poor solution helps the algorithm escape from a local optimal solution during evolution and mutate in a wider direction in the next evolutionary step. The selection operation with the Metropolis criterion in the proposed algorithm is defined as follows:
If $p_i > \mathrm{rand}(0, 1)$,
Else
End
where the equation of $p_i$ is as follows:
where $\Delta E = |f(X_i^k) - f(U_i^k)|$ and $T_k$ is the temperature in the $k$th generation; the initial temperature and cooling function are set as follows:
where $k$ is the generation number and $\mu$ is the cooling coefficient. The temperature decreases as evolution proceeds until it remains constant, and each drop depends only on the value of the previous temperature.
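The selection rule can be illustrated with the following Python sketch. Because the equations for $p_i$, the initial temperature, and the cooling function are not reproduced in this text, the sketch makes three labeled assumptions: a better trial individual is accepted with probability 1, a worse one with probability $e^{-\Delta E/T_k}$, and cooling is geometric, $T_k = \mu T_{k-1}$ (consistent with each drop depending only on the previous temperature). Function names and the value of `mu` are hypothetical.

```python
import math
import random

def acceptance_probability(f_target, f_trial, T):
    """ASSUMED Metropolis form for p_i (minimization)."""
    if f_trial < f_target:
        return 1.0                      # better trial: always accepted
    dE = abs(f_target - f_trial)        # Delta E = |f(X_i^k) - f(U_i^k)|
    return math.exp(-dE / T)            # worse trial: accepted with this probability

def sa_selection(pop, trial_pop, f, T, rng):
    """Build the next target population from target and trial populations."""
    next_pop = []
    for x, u in zip(pop, trial_pop):
        p = acceptance_probability(f(x), f(u), T)
        # If p_i > rand(0, 1) the trial individual enters the next population,
        # otherwise the target individual is kept.
        next_pop.append(u if p > rng.random() else x)
    return next_pop

def cool(T, mu=0.95):
    """ASSUMED geometric cooling: T_k = mu * T_(k-1)."""
    return mu * T
```

For example, with $\Delta E = 1$ a worse trial is accepted with probability about 0.37 at $T = 1$ but only about 0.000045 at $T = 0.1$, so poor solutions are admitted mainly in the early, hot generations.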

Numerical Experiment and Result Analysis
The experiment in this paper used a 64-bit Windows 10 operating system. The processor is an Intel(R) Core (TM) i5-5200U CPU @ 2.20 GHz with an Intel(R) HD Graphics 5500 GPU. Python 3.5.2 is selected as the experimental code language, and the experiment is run in PyCharm software to complete the experimental process.
Experiment Setup.
The performance of the ASADE algorithm in this paper was tested on the IEEE Congress on Evolutionary Computation 2017 test suite (CEC2017). The CEC2017 test suite includes 29 benchmark functions; a detailed introduction and description of CEC2017 and its specific functions can be found in [61]. Tables 1 and 2.
To compare the results of different algorithms on the test functions, this paper used the Friedman test and the Wilcoxon signed-rank test to analyze and compare the solution quality. Both tests use $\alpha = 0.05$ as the significance level. The Friedman test generates the final ranks of the different algorithms on the test functions; the Wilcoxon signed-rank test compares the specific differences between two algorithms on the test functions of CEC2017. When the solutions of the first algorithm are compared with those of the comparison algorithm, "R+" is the sum of ranks for the functions on which the first algorithm's solutions are greater than the second algorithm's solutions in the row, "R-" is the sum of ranks for the opposite situation, a plus (+) sign indicates the number of CEC2017 functions on which the first algorithm's solutions are greater than the second algorithm's solutions, a minus (-) sign indicates the number of CEC2017 functions in the opposite situation, and the approximation sign (≈) gives the number of remaining functions. p values less than the significance level are marked in italic. SPSS 26.0 was used as the experimental tool for the statistical tests.
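The quantities reported by the two tests can be illustrated with a small pure-Python sketch that computes Friedman mean ranks and the Wilcoxon rank sums R+ and R-. The numbers in `results` are made-up illustrative values, not results from this paper, and the function names are hypothetical. Tied values receive their mean rank, and zero differences are dropped before ranking, which is one common convention; p values are not computed here.

```python
def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def friedman_mean_ranks(results):
    """results[f][a] is the error of algorithm a on function f."""
    per_function = [average_ranks(row) for row in results]
    n_alg = len(results[0])
    return [sum(r[a] for r in per_function) / len(results) for a in range(n_alg)]

def wilcoxon_rank_sums(first, second):
    """Return (R+, R-) for paired results of two algorithms."""
    # ASSUMPTION: zero differences are discarded before ranking.
    diffs = [a - b for a, b in zip(first, second) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    r_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)   # first > second
    r_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)  # first < second
    return r_plus, r_minus

# HYPOTHETICAL errors of three algorithms (columns) on three functions (rows).
results = [[0.1, 0.3, 0.2],
           [0.0, 0.5, 0.5],
           [0.2, 0.1, 0.4]]
mean_ranks = friedman_mean_ranks(results)
r_plus, r_minus = wilcoxon_rank_sums([row[0] for row in results],
                                     [row[1] for row in results])
```

For a minimization problem, a smaller mean rank and an R+ smaller than R- both indicate that the first algorithm tends to reach lower values than its competitor.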

ASADE Parametric Study.
The ASADE algorithm optimized the mutation, crossover, and selection processes in the DE algorithm. To analyze the impact of the adaptive mutation, adaptive crossover, and SA selection on the performance of the ASADE algorithm, experiments were conducted. Three different versions of ASADE were tested and compared against the proposed version on 29 functions of CEC2017 on D = 10, D = 30, D = 50, and D = 100.
(1) Version 1. To test the individual effect of adaptive mutation on the performance of the ASADE algorithm, an ASADE version with adaptive crossover, SA selection, and a basic mutation strategy was tested.

The statistical test results of the ASADE algorithm against its alternate versions (ASADE-1, ASADE-2, and ASADE-3) on CEC2017 are presented in Tables 3 and 4.

Table 3 shows the average ranks of the four ASADE versions calculated by the Friedman test. In the table, the p values obtained by the Friedman test for each dimension are 0.003, 0.000, 0.000, and 0.000, which are all less than 0.05. It can be concluded that the performance of these ASADE versions differs significantly. Compared with ASADE-1, ASADE-2, and ASADE-3, the ranks of 2.02, 1.24, 1.17, and 1.10 obtained by the ASADE algorithm across the dimensions are all the smallest, and the mean rank value of 1.38 is also the smallest, which indicates that ASADE is better than the other three versions in all dimensions. In addition, ASADE-3 ranks second, followed by ASADE-2 and ASADE-1. This indicates that the adaptively modified mutation factor in the mutation process plays a key role in the ASADE algorithm. The ASADE algorithm integrates three optimization strategies to obtain the best optimization effect.

Table 4 presents the comparison results between ASADE and the other three versions on different dimensions according to the Wilcoxon test. The best results are distinguished in italic. In the table, the p value of the comparison between ASADE and ASADE-1 is less than 0.05 in each dimension, "R+" is less than "R-", and the number of functions marked "-" is greater than the number marked "+". ASADE is therefore significantly better than ASADE-1 in performance. Additionally, the p values of the comparisons of ASADE with ASADE-3 and ASADE-2 on D = 10 are 0.545 and 0.940, respectively, both of which are greater than 0.05. However, as the dimension increases, the p value tends toward 0.000. This indicates that the optimization of the adaptive crossover process and the SA selection process has an increasing influence on the performance of the ASADE algorithm.
The final comparison results are recorded in the last column of the table; a plus (+) sign indicates that the former algorithm is superior to the compared algorithm. According to the last column, ASADE is superior to the compared algorithm in 83% of the rows.

Comparison against State-of-the-Art Algorithms.
The proposed ASADE algorithm was compared with four evolutionary algorithms, i.e., PSO, DE, HADE [58], and the adaptive DE with disturbance factor algorithm (ADE-D) [65], on six typical benchmark functions; the functions are presented in Table 5. These benchmark functions have many local optimal values, a large search space, and strong deception.
The function values are all positive, and the global optimal values of these functions are all zero; therefore, the fitness function is defined as the function itself. The closer the solution is to zero, the closer it is to the global optimal value. The results, including the best (BST), worst (WST), and average (AVG) values and the number of evolution generations (NEG) needed to reach the specified convergence precision for each function, are recorded in Table 6; values smaller than $10^{-8}$ are taken as zero, and the smallest values are marked in italic. The parameter settings of DE, PSO, HADE, and ADE-D can be found in the original papers, and the parameter values of ASADE are $NP = 100$, $G_m = D \times 1000$, $F_{\max} = 0.8$, $F_{\min} = 0.05$, $CR_{\max} = 1$, and $CR_{\min} = 0.9$.

In Table 6, the solutions of the ASADE algorithm are the smallest for all benchmark functions, and only ASADE obtains the optimal solutions on the Rastrigin and Salomon functions. The NEG values of the ASADE algorithm on the six functions are 134, 576, 236, 191, 121, and 285; among them, the probability of finding the global optimal solution before 500 generations is 83.33%, while the corresponding probabilities for DE, PSO, HADE, and ADE-D are 0%, 16.67%, 0%, and 66.67%, respectively. Table 7 shows the outcome of the Friedman test. The average ranks of ASADE on 10, 30, and 50 dimensions are 2.00, 1.92, and 1.75, respectively. The mean rank of ASADE is 1.89, which is the smallest among the five algorithms. The second and third best algorithms are ADE-D and HADE, with mean ranks of 2.09 and 2.61, respectively.
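The two summary figures quoted for ASADE follow directly from the listed values, as the short check below shows. It reads the 83.33% figure as the share of the six functions whose NEG is below 500 (an interpretation, since the text calls it a probability) and the mean rank as the plain average over the three dimensions; the variable names are illustrative.

```python
# NEG of ASADE on the six benchmark functions, as listed in the text.
neg_asade = [134, 576, 236, 191, 121, 285]

# Share of functions solved before generation 500 (5 of 6).
share_before_500 = 100.0 * sum(1 for g in neg_asade if g < 500) / len(neg_asade)

# Mean of the Friedman average ranks for D = 10, 30, and 50.
mean_rank = (2.00 + 1.92 + 1.75) / 3

print(round(share_before_500, 2), round(mean_rank, 2))
```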
When D = 30, the convergence curves of ASADE and the four compared algorithms for the six benchmark functions are depicted in Figure 2. In Figure 2, ASADE finds the best objective function value faster than the other algorithms; compared with ADE-D, ASADE is the first to find the optimal value for all functions. On the Rastrigin and Ackley functions, DE, PSO, and HADE easily fall into local optimal solutions, while ASADE shows a continually monotonic downward trend until it finds the smallest solution. Therefore, we can conclude that, compared with DE, PSO, HADE, and ADE-D, ASADE converges faster and obtains more accurate solutions on these functions.

Conclusion
In the development of the Internet of Things (IoT), intrusion detection systems (IDS) play a vital role in data security. IDS datasets suffer from high dimensionality caused by irrelevant and redundant features, and feature selection is employed to reduce the dimensions. An adaptive simulated annealing differential evolution algorithm (ASADE) is proposed to generate multiple candidate solutions and find the global optimum in the feature selection process. The ASADE algorithm optimizes the basic DE algorithm in three aspects. First, in the process of mutation, the hyperbolic tangent function is used as the trend function of the mutation factor to balance global exploration and local exploitation abilities in the evolution process. Second, we adopt a linearly varying crossover factor in the crossover operation; as the evolutionary time increases, the crossover ability gradually changes from strong to weak. Finally, in the selection process, the Metropolis criterion of the SA algorithm is used to accept a poor solution as the optimal solution, which gives the DE algorithm an enhanced ability to enrich population diversity and escape from a local optimum. To test the performance of the ASADE algorithm, we analyze the effectiveness of three ASADE versions on the CEC2017 test suite and compare ASADE with four evolutionary algorithms on six typical benchmark functions. The experimental results demonstrate that the performance of the ASADE algorithm is improved compared with the other algorithms. In the future, a population reduction strategy, success-history slots, and a uniform or Cauchy distribution for the parameters can be applied to the ASADE algorithm to improve its performance. In addition, more and more security problems of the IoT ecosystem could be transformed into nonlinear real-parameter optimization problems, and the ASADE algorithm could be applied to resolve them with high accuracy and fast convergence.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.