Modified Backtracking Search Optimization Algorithm Inspired by Simulated Annealing for Constrained Engineering Optimization Problems

The backtracking search optimization algorithm (BSA) is a population-based evolutionary algorithm for numerical optimization problems. BSA has a powerful global exploration capacity while its local exploitation capability is relatively poor. This affects the convergence speed of the algorithm. In this paper, we propose a modified BSA inspired by simulated annealing (BSAISA) to overcome the deficiency of BSA. In the BSAISA, the amplitude control factor (F) is modified based on the Metropolis criterion in simulated annealing. The redesigned F could be adaptively decreased as the number of iterations increases and it does not introduce extra parameters. A self-adaptive ε-constrained method is used to handle the strict constraints. We compared the performance of the proposed BSAISA with BSA and other well-known algorithms when solving thirteen constrained benchmarks and five engineering design problems. The simulation results demonstrated that BSAISA is more effective than BSA and more competitive with other well-known algorithms in terms of convergence speed.


Introduction
Optimization is an essential research objective in the fields of applied mathematics and computer science. Optimization algorithms mainly aim to obtain the global optimum for optimization problems. There are many different kinds of optimization problems in the real world. When an optimization problem has simple and explicit gradient information or requires a relatively small budget of allowed function evaluations, classical optimization techniques such as mathematical programming can often achieve efficient results [1]. However, many real-world engineering optimization problems may have complex, nonlinear, or nondifferentiable forms, which make them difficult to tackle with classical optimization techniques. The emergence of metaheuristic algorithms has overcome the deficiencies of classical optimization techniques to some extent, as they do not require gradient information and have the ability to escape from local optima. Metaheuristic algorithms are mainly inspired by a variety of natural phenomena and/or biological social behavior. Among these metaheuristic algorithms, swarm intelligence algorithms and evolutionary algorithms are perhaps the most attractive [2]. Swarm intelligence algorithms [3] generally simulate the intelligent behavior of swarms of creatures; examples include particle swarm optimization (PSO) [4], ant colony optimization (ACO) [5], cuckoo search (CS) [6], and the artificial bee colony (ABC) algorithm [7]. These types of algorithms are generally inspired by a series of complex behavior processes in swarms with mutual cooperation and self-organization, in which "cooperation" is the core concept. The evolutionary algorithms (EAs) [8,9] are inspired by the mechanism of natural evolution, in which "evolution" is the key idea.
Examples of EAs include genetic algorithm (GA) [10], differential evolution (DE) [11][12][13][14], covariance matrix adaptation evolution strategy (CMAES) [15], and the backtracking search optimization algorithm (BSA) [16].
BSA is an iterative population-based EA, which was first proposed by Civicioglu in 2013. BSA has three basic genetic operators: selection, mutation, and crossover. The 2 Computational Intelligence and Neuroscience main difference between BSA and other similar algorithms is that BSA possesses a memory for storing a population from a randomly chosen previous generation, which is used to generate the search-direction matrix for the next iteration. In addition, BSA has a simple structure, which makes it efficient, fast, and capable of solving multimodal problems. BSA has only one control parameter called the mix-rate, which significantly reduces the sensitivity of the initial values to the algorithm's parameters. Due to these characteristics, in less than 4 years, BSA has been employed successfully to solve various engineering optimization problems, such as power systems [17][18][19], induction motor [20,21], antenna arrays [22,23], digital image processing [24,25], artificial neural networks [26][27][28][29], and energy and environmental management [30][31][32].
However, BSA has a weak local exploitation capacity and its convergence speed is relatively slow. Thus, many studies have attempted to improve the performance of BSA and some modifications of BSA have been proposed to overcome the deficiencies. From the perspective of modified object, the modifications of BSA can be divided into the following four categories. It is noted that we consider classifying the publication into the major modification category if it has more than one modification: (i) Modifications of the initial populations [33][34][35][36][37][38] (ii) Modifications of the reproduction operators, including the mutation and crossover operators [39][40][41][42][43][44][45][46][47] (iii) Modifications of the selection operators, including the local exploitation strategy [48][49][50][51] (iv) Modifications of the control factor and parameter [52][53][54][55][56][57].
The research on controlling parameters of EAs is one of the most promising areas of research in evolutionary computation; even a little modification of parameters in an algorithm can make a considerable difference [58]. In the basic BSA, the value of the amplitude control factor (F) is the product of three and a standard normal random number (i.e., F = 3 ⋅ randn), which is often too large or too small according to its formulation. This may give BSA a powerful global exploration capability at the early iterations; however, it also affects the later exploitation capability of BSA. Based on these considerations, we focus mainly on the influence of the amplitude control factor (F) on the BSA, that is, the fourth of the categories defined above. Duan and Luo [52] redesigned an adaptive F based on the fitness statistics of population at each iteration. Wang et al. [53] and Tian et al. [54] proposed an adaptive F based on Maxwell-Boltzmann distribution. Askarzadeh and dos Santos Coelho [55] proposed an adaptive F based on Burger's chaotic map. Chen et al. [56] redesigned an adaptive F by introducing two extra parameters. Nama et al. [57] proposed a new F that adaptively changes in the range of (0.45, 1.99) and a new mix-rate that randomly changes in the range of (0, 1). These modifications of F have achieved good effects in the BSA.
Different from the modifications of F in BSA described above, a modified version of BSA (BSAISA) inspired by simulated annealing (SA) is proposed in this paper. In the BSAISA, F is redesigned based on iterations by learning from a characteristic of SA: SA can probabilistically accept a higher energy state, and the acceptance probability decreases with the decrease in temperature. The redesigned F can adaptively decrease as the number of iterations increases without introducing extra parameters. This adaptive variation tendency provides an efficient tradeoff between early exploration and later exploitation capability. We verified the effectiveness and competitiveness of BSAISA in terms of convergence speed in simulation experiments using thirteen constrained benchmarks and five engineering design problems.
The remainder of this paper is organized as follows. Section 2 introduces the basic BSA. As the main contribution of this paper, a detailed explanation of BSAISA is presented in Section 3. In Section 4, we present two sets of simulation experiments in which we implemented BSAISA and BSA to solve thirteen constrained optimization and five engineering design problems. The results are compared with those obtained by other well-known algorithms in terms of the solution quality and function evaluations. Finally, we give our concluding remarks in Section 5.

Backtracking Search Optimization Algorithm (BSA)
BSA is a population-based iterative EA. BSA generates trial populations to take control of the amplitude of the search-direction matrix, which provides a strong global exploration capability. BSA equiprobably uses two random crossover strategies to exchange the corresponding elements of individuals in populations and trial populations during the process of crossover. Moreover, BSA has two selection processes. One is used to select the population from the current and historical populations; the other is used to select the optimal population. In general, BSA can be divided into five processes: initialization, selection I, mutation, crossover, and selection II [16].
2.1. Initialization. BSA generates the initial population P and the initial old population oldP using

$$P_{i,j} \sim U(\mathrm{low}_j, \mathrm{up}_j), \qquad \mathrm{oldP}_{i,j} \sim U(\mathrm{low}_j, \mathrm{up}_j), \tag{1}$$

where $P_{i,j}$ and $\mathrm{oldP}_{i,j}$ are the $j$th elements (in the problem dimension $D$) of the $i$th individuals (in the population of size $N$) of P and oldP, respectively; $\mathrm{low}_j$ and $\mathrm{up}_j$ are the lower boundary and the upper boundary of the $j$th dimension, respectively; and $U$ is the uniform distribution.
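The initialization step can be illustrated with a minimal Python sketch. It assumes only that every element is drawn uniformly between the bounds of its dimension; the function name `init_population`, the bounds, and the population size are hypothetical choices for illustration, not values from the paper.

```python
import random

def init_population(n_pop, low, up, rng):
    """Draw every element uniformly between the bounds of its dimension."""
    dim = len(low)
    return [[rng.uniform(low[j], up[j]) for j in range(dim)]
            for _ in range(n_pop)]

rng = random.Random(0)                         # fixed seed, for reproducibility only
low, up = [-5.0, 0.0, 10.0], [5.0, 1.0, 20.0]  # hypothetical bounds, D = 3
P = init_population(4, low, up, rng)           # current population, N = 4
oldP = init_population(4, low, up, rng)        # historical population, drawn independently
```

Both populations are drawn independently, so oldP starts as a second random sample rather than a copy of P.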

Selection I.
BSA's selection I process is the beginning of each iteration. It aims to reselect a new oldP for calculating the search direction based on the population P and the historical population oldP. The new oldP is reselected through the "if-then" rule in

$$\text{if } a < b \text{ then } \mathrm{oldP} := P \mid a, b \sim U(0, 1), \tag{2}$$

where := is the update operation; $a$ and $b$ represent random numbers between 0 and 1. The update operation (see (2)) ensures BSA has a memory. After oldP is reselected, the order of the individuals in oldP is randomly permuted by

$$\mathrm{oldP} := \mathrm{permuting}(\mathrm{oldP}). \tag{3}$$
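A minimal sketch of selection I follows, assuming populations are stored as lists of rows. The name `selection_one` and the toy populations are hypothetical; the rule itself is the "if a < b" update followed by a random permutation of the individuals.

```python
import random

def selection_one(P, oldP, rng):
    """Return the new historical population used for the search direction."""
    a, b = rng.random(), rng.random()
    source = P if a < b else oldP          # "if a < b then oldP := P"
    new_oldP = [row[:] for row in source]  # copy so later updates of P do not alias
    rng.shuffle(new_oldP)                  # random permutation of the individuals
    return new_oldP

rng = random.Random(1)
P = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
oldP = [[7.0, 7.0], [8.0, 8.0], [9.0, 9.0]]
new_oldP = selection_one(P, oldP, rng)
```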

Mutation.
The mutation operator is used for generating the initial form of the trial population $M$ with

$$M = P + F \cdot (\mathrm{oldP} - P), \tag{4}$$

where F is the amplitude control factor of the mutation operator, used to control the amplitude of the search direction. The value F = 3 ⋅ randn, where randn ∼ N(0, 1) and N is the standard normal distribution.
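The mutation step can be sketched as below. This assumes the standard BSA form in which the search direction is (oldP − P) and a single F = 3 ⋅ randn is drawn per call; the function name `mutate` and the sample populations are hypothetical.

```python
import random

def mutate(P, oldP, rng):
    """Initial trial population: P + F * (oldP - P), with F = 3 * randn."""
    F = 3.0 * rng.gauss(0.0, 1.0)  # one amplitude factor for the whole population
    M = [[p + F * (o - p) for p, o in zip(p_row, o_row)]
         for p_row, o_row in zip(P, oldP)]
    return M, F

rng = random.Random(2)
P = [[1.0, 2.0], [3.0, 4.0]]
oldP = [[0.0, 0.0], [5.0, 5.0]]
M, F = mutate(P, oldP, rng)
```

When oldP equals P the search direction vanishes and the trial population is just P, whatever value F takes.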

Crossover.
In this process, BSA generates the final form of the trial population T. BSA equiprobably uses two crossover strategies to manipulate the selected elements of the individuals at each iteration. Both strategies generate different binary integer-valued matrices (map) of size N ⋅ D to select the elements of individuals that have to be manipulated.
Strategy I uses the mix-rate parameter (mix-rate) to control the number of elements of individuals that are manipulated by using ⌈mix-rate ⋅ rand ⋅ D⌉, where mix-rate = 1. Strategy II manipulates only one randomly selected element of each individual by using randi(D), where randi ∼ U(0, D).
At the end of the crossover process, if some individuals in T have overflowed the allowed search space limits, they will need to be regenerated by using (1).
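The two crossover strategies and the boundary repair can be sketched as follows. This is a sketch under stated assumptions: map entries equal to 1 keep the element of P and entries equal to 0 take the element of the mutant, as in the commonly described BSA crossover; out-of-range elements (rather than whole individuals) are redrawn uniformly. The name `crossover` and all sample values are hypothetical.

```python
import math
import random

def crossover(P, M, low, up, rng, mix_rate=1.0):
    """Build the final trial population T from P and the mutant M."""
    n, d = len(P), len(P[0])
    cmap = [[1] * d for _ in range(n)]  # 1: keep P's element, 0: take M's element
    if rng.random() < rng.random():
        # Strategy I: manipulate ceil(mix_rate * rand * D) random elements per individual
        for i in range(n):
            k = min(d, math.ceil(mix_rate * rng.random() * d))
            for j in rng.sample(range(d), k):
                cmap[i][j] = 0
    else:
        # Strategy II: manipulate exactly one random element per individual
        for i in range(n):
            cmap[i][rng.randrange(d)] = 0
    T = [[P[i][j] if cmap[i][j] else M[i][j] for j in range(d)] for i in range(n)]
    # Boundary control: redraw any element that left the search space
    for row in T:
        for j in range(d):
            if not (low[j] <= row[j] <= up[j]):
                row[j] = rng.uniform(low[j], up[j])
    return T

rng = random.Random(3)
low, up = [0.0] * 4, [1.0] * 4
P = [[0.2, 0.4, 0.6, 0.8] for _ in range(5)]
M = [[0.9, 0.1, 5.0, 0.3] for _ in range(5)]  # third element is out of bounds
T = crossover(P, M, low, up, rng)
```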

Selection II.
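In BSA's selection II process, the fitness values of $P_i$ and $T_i$ are compared and used to update the population P based on a greedy selection. If $T_i$ has a better fitness value than $P_i$, then $P_i$ is updated to be $T_i$. The population is updated by using

$$P_i := \begin{cases} T_i, & \text{if } \mathrm{fitness}(T_i) \text{ is better than } \mathrm{fitness}(P_i), \\ P_i, & \text{otherwise.} \end{cases} \tag{5}$$

If the best individual of P ($P_{\mathrm{best}}$) has a better fitness value than the global minimum value obtained, the global minimizer will be updated to be $P_{\mathrm{best}}$, and the global minimum value will be updated to be the fitness value of $P_{\mathrm{best}}$.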
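The greedy update and the tracking of the global minimizer can be sketched as below. The sketch assumes a minimization problem, so "better" means a lower fitness value; `selection_two`, `sphere`, and the sample populations are hypothetical illustrations.

```python
def selection_two(P, T, fitness, g_best=None, g_best_val=float("inf")):
    """Greedy selection between P and T, then update of the global minimizer."""
    new_P = [t if fitness(t) < fitness(p) else p for p, t in zip(P, T)]
    best = min(new_P, key=fitness)  # best individual of the updated population
    if fitness(best) < g_best_val:
        g_best, g_best_val = best, fitness(best)
    return new_P, g_best, g_best_val

def sphere(x):
    """Toy objective: sum of squares."""
    return sum(v * v for v in x)

P = [[1.0, 1.0], [0.5, 0.0], [2.0, -2.0]]
T = [[0.0, 1.0], [3.0, 0.0], [1.0, -1.0]]
new_P, g_best, g_best_val = selection_two(P, T, sphere)
```

In this toy run the first and third trial individuals replace their parents, while the second parent survives and becomes the global minimizer.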

Modified BSA Inspired by SA (BSAISA)
As mentioned in the introduction of this paper, the research work on the control parameters of an algorithm is very meaningful and valuable. In this paper, in order to improve BSA's exploitation capability and convergence speed, we propose a modified version of BSA (BSAISA) where the redesign of F is inspired by SA. The modified details of BSAISA are described in this section. First, the structure of the modified F is described, before we explain the detailed design principle for the modified F inspired by SA. Subsequently, two numerical tests are used to illustrate that the redesigned F improves the convergence speed of the algorithm. We introduce a self-adaptive ε-constrained method for handling constraints at the end of this section.
3.1. Structure of the Adaptive Amplitude Control Factor F. The modified F is a normally distributed random number, where its mean value is an exponential function and its variance is equal to 1. In BSAISA, we redesign the adaptive F to replace the original version using

$$F_i \sim N\left(\exp\left(-\frac{1/|\Delta I_i|}{1/G}\right),\ 1\right), \tag{6}$$

where $i$ is the index of individuals, $F_i$ is the adaptive amplitude control factor that corresponds to the $i$th individual, $|\Delta I_i|$ is the absolute value of the difference between the objective function values of $P_i$ and $T_i$ (individual differences), $N$ is the normal distribution, and $G$ is the current iteration. According to (6), the exponential function (mean value) decreases dynamically with the change in the number of iterations ($G$) and the individual differences ($|\Delta I_i|$). Based on the probability density function curve of the normal distribution, the modified F can be decreased adaptively as the number of iterations increases. Another characteristic of the modified F is that it does not introduce any extra parameters.
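A small sketch can make the behavior of the adaptive factor concrete. It assumes the mean has the form exp(−(1/|ΔI|)/(1/G)) = exp(−G/|ΔI|), which follows from reading 1/|ΔI| as the energy difference and 1/G as the temperature in the SA acceptance probability; the function names and the handling of |ΔI| = 0 (using the limiting mean 0) are assumptions of this sketch, not details given in the text.

```python
import math
import random

def mean_of_F(delta_I, G):
    """Mean of the modified F: exp(-(1/|dI|) / (1/G)) = exp(-G / |dI|)."""
    d = abs(delta_I)
    if d == 0.0:
        return 0.0  # assumed limiting value as |dI| tends to 0
    return math.exp(-G / d)

def adaptive_F(delta_I, G, rng):
    """Sample F_i from a normal distribution with the mean above and variance 1."""
    return rng.gauss(mean_of_F(delta_I, G), 1.0)

rng = random.Random(0)
early = mean_of_F(1.0, 1)    # first iteration
late = mean_of_F(1.0, 100)   # much later iteration: the mean has shrunk
sample = adaptive_F(1.0, 1, rng)
```

For a fixed individual difference the mean shrinks as G grows, and for a fixed G it shrinks as the individual difference shrinks, which is the downward trend the section describes.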

Design Principle of the Modified Amplitude Control Factor F.
The design principle of the modified F is inspired by the Metropolis criterion in SA. SA is a metaheuristic optimization technique based on physical behavior in nature. SA based on the Monte Carlo method was first proposed by Metropolis et al. [59] and it was successfully introduced into the field of combinatorial optimization for solving complex optimization problems by Kirkpatrick et al. [60].
The basic concept of SA derives from the process of physical annealing of solids. An annealing process occurs when a metal is heated to a molten state at a high temperature and then cooled slowly. If the temperature is decreased quickly, the resulting crystal will have many defects and will be only metastable; the most stable crystalline state will not be achieved at all. In other words, this may form a higher energy state than the most stable crystalline state. Therefore, in order to reach the absolute minimum energy state, the temperature needs to be decreased at a slow rate. SA simulates this process of annealing to search for the global optimal solution in an optimization problem. However, accepting only the moves that lower the energy of the system is like extremely rapid quenching; thus SA uses a special and effective acceptance method, that is, the Metropolis criterion, which can probabilistically accept the hill-climbing moves (the higher energy moves). As a result, the energy of the system evolves into a Boltzmann distribution during the process of the simulated annealing. From this point of view, it is no exaggeration to say that the Metropolis criterion is the core of SA.
The Metropolis criterion can be expressed by the physical significance of energy, where the new energy state will be accepted when the new energy state is lower than the previous energy state, and the new energy state will be probabilistically accepted when the new energy state is higher than the previous energy state. This feature allows SA to escape from being trapped in local minima, especially in the early stages of the search. It can also be described as follows.
(i) If $\Delta E = E_j - E_i \le 0$, then the new state $j$ is accepted and the energy with the displaced atom is used as the starting point for the next step, where $E$ represents the energy of the atom. Both $i$ and $j$ are the states of atoms, and $j$ is the next state of $i$.
(ii) If $\Delta E > 0$, then calculate the probability $P(\Delta E) = \exp(-\Delta E / (kT))$, and generate a random number $\theta$, which is uniformly distributed over (0, 1), where $k$ is Boltzmann's constant (in general, $k = 1$) and $T$ is the current temperature. If $P(\Delta E) > \theta$, then the new energy will be accepted; otherwise, the previous energy is used to start the next step.

Analysis 1. The Metropolis criterion states that SA has two characteristics: (1) SA can probabilistically accept the higher energy and (2) the acceptance probability of SA decreases as the temperature decreases. Therefore, SA can reject and jump out of a local minimum with a dynamic and decreasing probability to continue exploiting the other solutions in the state space. This acceptance mechanism can enrich the diversity of energy states.
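The two cases of the Metropolis criterion, and the effect of temperature noted in Analysis 1, can be illustrated with a short sketch. The function name `metropolis_accept` and the temperatures and trial counts are hypothetical choices for the illustration.

```python
import math
import random

def metropolis_accept(delta_E, T, rng, k=1.0):
    """Accept downhill moves always; accept uphill moves with probability exp(-dE/(kT))."""
    if delta_E <= 0:
        return True
    return math.exp(-delta_E / (k * T)) > rng.random()

rng = random.Random(0)
# Count how often the same uphill move (dE = 1) is accepted at two temperatures.
hot = sum(metropolis_accept(1.0, 10.0, rng) for _ in range(2000))
cold = sum(metropolis_accept(1.0, 0.1, rng) for _ in range(2000))
```

At the high temperature the acceptance probability is exp(−0.1), about 0.90, while at the low temperature it is exp(−10), so uphill moves are almost never accepted.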

Analysis 2.
As shown in (4), F is used to control the amplitude of population mutation in BSA; thus F is an important factor for controlling population diversity. If F is excessively large, the diversity of the population will be too high and the convergence speed of BSA will slow down. If F is excessively small, the diversity of the population will be reduced, so it will be difficult for BSA to obtain the global optimum and it may be readily trapped by a local optimum. Therefore, adaptively controlling the amplitude of F is a key to accelerating the convergence speed of the algorithm and maintaining its population diversity.
Based on Analyses 1 and 2, it is clear that if F can dynamically decrease, the convergence speed of BSA will be able to accelerate while maintaining the population diversity. On the other hand, SA possesses the characteristic that its acceptance probability can be dynamically reduced. Based on these two considerations, we propose BSAISA with a redesigned F, which is inspired by SA. More specifically, the new F (see (6)) is redesigned by learning from the formulation ($P(\Delta E)$) of the acceptance probability, and its formulation has been shown in the previous subsection.
For the two formulas of the modified F and $P(\Delta E)$, the individual difference ($|\Delta I|$) of a population or the energy difference ($\Delta E$) of a system will decrease as the number of iterations increases in an algorithm, and the temperature of SA tends to decrease, while the iteration of BSA tends to increase. As a result, one can observe the correspondence between the modified F and $P(\Delta E)$, where the reciprocal of the individual difference ($1/|\Delta I|$) corresponds to the energy difference ($\Delta E$) of SA, and the reciprocal of the current iteration ($1/G$) corresponds to the current temperature ($T$) of SA. In this way, the redesigned F can be decreased adaptively as the number of iterations increases.

Numerical Analysis of the Modified Amplitude Control Factor F.
In order to verify that the convergence speed of the basic BSA is improved with the modified F, two types (unimodal and multimodal) of unconstrained benchmark functions are used to test the changing trends in F, the population variances, and the best function values as the number of iterations increases. The two functions are Schwefel 1.2 and Rastrigin, and their detailed information is provided in [61]. The two functions and the user parameters including the populations (N), dimensions (D), and maximum iterations (Max) are shown in Table 1. Three groups of test results are compared in the tests: (1) the comparative curves of the mean values of the modified F and original F for Schwefel 1.2 and Rastrigin (Figure 1), (2) the comparative curves of the population variances of BSAISA and BSA (Figure 2), and (3) the comparative curves of the best function values of BSAISA and BSA (Figure 3). The following observations can be made.
(1) According to the trends of F in Figure 1, both the original F and modified F are subject to changes from the normal distribution. The mean value of the original F does not tend to decrease as the number of iterations increases. By contrast, the mean value of the modified F exhibits a clear and fluctuating downward trend as the number of iterations increases.
(2) According to Figure 2, the population variances of BSAISA and BSA both exhibit a clear downward trend as the number of iterations increases. The population variances of BSAISA and BSA are almost the same during the early iterations. This illustrates that the modified F does not reduce the population diversity in the early iterations. In the middle and later iterations, the population variances of BSAISA decrease more quickly than those of BSA. This illustrates that the modified F improves the convergence speed.
(3) As can be seen from Figure 3, as the number of iterations increases, the best objective function value of BSAISA drops faster than that of BSA. This shows that the modified F improves the convergence speed of BSA. Moreover, BSAISA can find a more accurate solution at the same computational cost.
Summary. Based on the design principle and numerical analysis of the modified F, the modified F exhibits an overall fluctuating and downward trend, which matches the concept of the acceptance probability in SA. In particular, during the early iterations, the modified F is relatively large. This allows BSAISA to search in a wide region, while maintaining the population diversity of BSAISA. As the number of iterations increases, the modified F gradually exhibits a decreasing trend. This accelerates the convergence speed of BSAISA. In the later iterations, the modified F is relatively small. This enables BSAISA to fully search in the local region. Therefore, it can be concluded that BSAISA can adaptively control the amplitude of population mutation to change its local exploitation capacity. This may improve the convergence speed of the algorithm. Moreover, the modified F does not introduce extra parameters, so it does not increase the sensitivity of BSAISA to the parameters.
Note. "N" means populations, "D" means dimensions, and "Max" means maximum iterations.

A Self-Adaptive ε-Constrained Method
In general, a constrained optimization problem can be formulated as the minimization problem

$$\min f(X) \quad \text{subject to} \quad g_i(X) \le 0,\ i = 1, \ldots, m; \qquad h_j(X) = 0,\ j = 1, \ldots, n,$$

where $X = (x_1, x_2, \ldots, x_D) \in R^D$ is a $D$-dimensional vector, $m$ is the total number of inequality constraints, and $n$ is the total number of equality constraints. The equality constraints are transformed into inequality constraints by using $|h_j(X)| - \delta \le 0$, where $\delta$ is a very small degree of violation, and $\delta = 10^{-4}$ in this paper. The maximization problems are transformed into minimization problems using $-f(X)$. The constraint violation of a solution $X$ is denoted by $\varphi(X)$.
Several constraint-handling methods have been proposed previously, where the five most commonly used methods comprise penalty functions, feasibility and dominance rules (FAD), stochastic ranking, ε-constrained methods, and multiobjective concepts. Among these five methods, the ε-constrained method is relatively effective and widely used. Zhang et al. [62] proposed a self-adaptive ε-constrained method (SAε) to combine with the basic BSA for constrained problems. It has been verified that SAε has a stronger search efficiency and convergence than the fixed ε-constrained method and FAD. In this paper, SAε is combined with BSAISA for constrained optimization problems. It comprises the following two rules: (1) if the constraint violations of two solutions are smaller than a given ε value or two solutions have the same constraint violations, the solution with a better objective function value is preferred and (2) if not, the solution with a smaller constraint violation is preferred. For two solutions with objective function values $f_1, f_2$ and constraint violations $\varphi_1, \varphi_2$, SAε can be expressed as

$$(f_1, \varphi_1) \text{ is better than } (f_2, \varphi_2) \iff \begin{cases} f_1 < f_2, & \text{if } \varphi_1, \varphi_2 \le \varepsilon, \\ f_1 < f_2, & \text{if } \varphi_1 = \varphi_2, \\ \varphi_1 < \varphi_2, & \text{otherwise,} \end{cases}$$

where $\varepsilon$ is a positive value that represents a tolerance related to constraint violation. The $\varepsilon$ value is self-adaptive and is formulated as a function of $t$, the number of the current iterations. First, an initial value $\varepsilon_0$ is set from the constraint violation of the initial population. If the initial value $\varepsilon_0$ is bigger than Th1, the corresponding constraint violation at iteration $t$ will be assigned to $\varepsilon_2(t)$; otherwise $\varepsilon_0$ will be assigned to $\varepsilon_2(t)$. Then, if $\varepsilon_2(t) < \varepsilon_1(t-1)$ and $\varepsilon_2(t)$ is bigger than Th2, $\varepsilon_2(t)$ will be assigned to $\varepsilon_1(t)$; otherwise $\varepsilon_1(t-1)$ will be assigned to $\varepsilon_1(t)$. Finally, $\varepsilon$ is updated as in (14). The detailed information of SAε can be acquired from [62], and the related parameter settings of SAε (the same as [62]) are presented in Table 2.
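The two comparison rules of the ε-constrained method can be written as a small predicate. This is a sketch for a minimization problem; the name `eps_better` is hypothetical, and treating "smaller than a given ε value" as "less than or equal to ε" is an assumption of the sketch.

```python
def eps_better(f1, phi1, f2, phi2, eps):
    """Return True if solution 1 is preferred over solution 2 at tolerance eps.

    f1, f2: objective function values (lower is better).
    phi1, phi2: constraint violations (0 means feasible).
    """
    if (phi1 <= eps and phi2 <= eps) or phi1 == phi2:
        return f1 < f2    # rule (1): compare objective values
    return phi1 < phi2    # rule (2): compare constraint violations

# Slightly infeasible but within tolerance: the better objective value wins.
within_tol = eps_better(1.0, 0.05, 2.0, 0.0, eps=0.1)
# Violation above tolerance: the smaller violation wins despite a worse objective.
outside_tol = eps_better(1.0, 0.5, 2.0, 0.0, eps=0.1)
```

With a large ε the comparison behaves like an unconstrained one, and as ε shrinks toward zero it increasingly favors feasibility, which is why the schedule of ε matters.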
To illustrate the changing trend of the self-adaptive ε value, BSAISA with SAε is used to solve a well-known benchmark constrained function G10 in [61]. The related parameters are set as $N = 30$, $\theta = 0.3N$, cp = 5, $T_c = 2333$, Th1 = 10, and Th2 = 2. The changing trend of the ε value is shown in Figure 4. Three sampling points, that is, ε(500) = 0.6033, ε(1000) = 0.1227, and ε(2000) = 1.194e−4, are marked in Figure 4. As shown in Figure 4, the ε value declines very fast at first. After it becomes smaller than about 2, it declines exponentially. This changing trend of the ε value could help the algorithm to sufficiently search infeasible domains near feasible domains.
The pseudocode for BSAISA is shown in Pseudocode 1. In Pseudocode 1, the modified adaptive F is shown in lines (14)-(16). When BSAISA deals with constrained optimization problems, the code in line (8) and line (40) in Pseudocode 1 should consider the objective function value and constraint violation simultaneously, and SAε is applied to choose a better solution or the best solution in line (42) and lines (47)-(48).

Experimental Studies
In this section, two sets of simulation experiments were executed to evaluate the effectiveness of the proposed BSAISA. The first experiment set was performed on 13 well-known benchmark constrained functions taken from [63] (see Appendix A). These thirteen benchmarks have different properties as shown in Table 3, including the number of variables (D), objective function types, the feasibility ratio (ρ), constraint types and number, and the number of active constraints in the optimum solution. The second experiment was conducted on 5 engineering constrained optimization problems chosen from [64] (see Appendix B). These five problems are the three-bar truss design problem (TTP), pressure vessel design problem (PVP), tension/compression spring design problem (TCSP), welded beam design problem (WBP), and speed reducer design problem (SRP), respectively. These engineering problems include objective functions and constraints of various types and natures (quadratic, cubic, polynomial, and nonlinear) with various numbers and types of design variables (continuous, integer, mixed, and discrete). The recorded experimental results include the best function value (Best), the worst function value (Worst), the mean function value (Mean), the standard deviation (Std), the best solution (variables of best function value), the corresponding constraint value, and the number of function evaluations (FEs). The number of function evaluations can be considered as a convergence rate or a computational cost.
Note. "$T_{\max}$" means the maximum iterations; "N" means populations.
Note. "D" is the number of variables. "ρ" represents feasibility ratio. "LI," "NI," "LE," and "NE" represent linear inequality, nonlinear inequality, linear equality, and nonlinear equality, respectively. "Active" represents the number of active constraints at the global optimum.
In order to evaluate the performance of BSAISA in terms of convergence speed, the FEs reported in this paper are the best FEs, that is, those corresponding to the obtained best solution. FEs is calculated as the product of the population size (N) and the number of iterations (Ibest) at which the best function value is first obtained (i.e., FEs = N * Ibest). For example, if 2500 is the maximum number of iterations for one minimization problem and f(1999) = 0.0126653, f(2000) = 0.0126652, and f(2500) = f(2000) = 0.0126652, the Ibest value should be 2000. However, BSAISA needs to evaluate the initial historical population (oldP), so its actual FEs should be increased by N (i.e., FEs = N * Ibest + N).
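The FEs bookkeeping described above can be sketched in a few lines. The function name `best_fes` is hypothetical, the convergence history is a synthetic one built to match the worked example, and the population size of 30 used with it is an assumption, since the example does not state one.

```python
def best_fes(best_history, n_pop):
    """FEs = N * Ibest + N, where Ibest is the first iteration reaching the final best value.

    best_history[g - 1] holds the best function value after iteration g.
    """
    final_best = best_history[-1]
    i_best = next(g for g, value in enumerate(best_history, start=1)
                  if value == final_best)
    return n_pop * i_best + n_pop, i_best

# Synthetic history: f(1999) = 0.0126653, f(2000) = ... = f(2500) = 0.0126652
history = [0.0126653] * 1999 + [0.0126652] * 501
fes, i_best = best_fes(history, n_pop=30)  # n_pop = 30 is an assumed value
```

The extra N accounts for evaluating the initial historical population; with N = 30 and 11665 iterations the same formula gives 30 ⋅ 11665 + 30 = 349,980.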

Parameter Settings.
For the first experiment, the main parameters for the 13 benchmark constrained functions are the same: the population size (N) is set as 30 and the maximum number of iterations ($T_{\max}$) is set as 11665. Therefore, BSAISA's maximum number of function evaluations (MFEs) equals 349,980 (nearly 350,000). The 13 benchmarks were executed by using 30 independent runs.

Simulation on
In order to further verify the competitiveness of BSAISA with respect to convergence speed, we compared BSAISA with some classic and state-of-the-art approaches in terms of best function value and function evaluations. The best function value and the corresponding FEs of each algorithm on 13 benchmarks are presented in Table 7, where the optimal results are in bold on each function. These compared algorithms are listed below: (1) Stochastic ranking (SR) [63] (2) Filter simulated annealing (FSA) [65] (3) Cultured differential evolution (CDE) [66] (4) Agent based memetic algorithm (AMA) [64] (5) Modified artificial bee colony (MABC) algorithm [67] (6) Rough penalty genetic algorithm (RPGA) [68] (7) BSA combined with the self-adaptive ε-constrained method (BSA-SAε) [62].
To compare these algorithms synthetically, a simple evaluation mechanism is used: the best function value (Best) is the primary criterion, and the function evaluations (FEs) are secondary. More specifically, (1) if one algorithm has a better Best than those of others on a function, there is no need to consider FEs and the algorithm is superior to other algorithms on this function. (2) If two or more algorithms have found the optimal Best on a function, the algorithm with the lowest FEs is considered as the winner on the function. (3) Record the number of winners and the number of the optimal function values for each algorithm on the set of benchmarks, and then rank all algorithms accordingly.
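The evaluation mechanism can be sketched as a small scoring routine. This is an illustrative reading under stated assumptions: all problems are minimizations, every algorithm reports a numeric FEs value, and the final score is the sum of the number of wins and the number of optimal function values found. The function name `rank_algorithms`, the tolerance, and the two-algorithm data set are hypothetical and are not results from Table 7.

```python
def rank_algorithms(results, optimal, tol=1e-9):
    """results[alg][func] = (best_value, fes); optimal[func] = known optimum."""
    algs = list(results)
    score = {a: 0 for a in algs}
    for func, opt in optimal.items():
        best_val = min(results[a][func][0] for a in algs)
        # Rules (1)-(2): best value first, lowest FEs breaks ties.
        contenders = [a for a in algs if results[a][func][0] <= best_val + tol]
        winner = min(contenders, key=lambda a: results[a][func][1])
        score[winner] += 1
        # Rule (3): also count every algorithm that reached the optimum.
        for a in algs:
            if abs(results[a][func][0] - opt) <= tol:
                score[a] += 1
    ranking = sorted(algs, key=lambda a: -score[a])
    return ranking, score

# Hypothetical results for two algorithms on two functions
results = {
    "A": {"g1": (-15.0, 1000), "g2": (0.80, 5000)},
    "B": {"g1": (-15.0, 800), "g2": (0.79, 3000)},
}
optimal = {"g1": -15.0, "g2": 0.75}
ranking, score = rank_algorithms(results, optimal)
```

In this toy data B wins g1 on FEs and g2 on the function value, and both algorithms reach the optimum of g1, so B scores 3 and A scores 1.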
From Table 7, it can be observed that the rank of these 8 algorithms is as follows: BSAISA, CDE, BSA-SAε, MABC, SR, RPGA, FSA, and AMA. Among the 13 benchmarks, BSAISA wins on 6 functions and it is able to find the optimal values of 10 functions. This is better than all other algorithms; thus BSAISA ranks first. The second algorithm, CDE, performs better on G02, G07, G09, and G10 than BSAISA but worse on G01, G03, G08, G11, G12, and G13. BSA-SAε obtains the optimal function values of 10 functions but requires more function evaluations than BSAISA and CDE, so it ranks third. MABC ranks fourth. It obtains the optimal function values of 7 functions, which are fewer in number than those of the former three algorithms. Both SR and RPGA have found the same number of the optimal function values, while the former is the winner on G04, so SR is slightly better than RPGA. As for the last two algorithms, FSA and AMA just perform well on three functions, while FSA is the winner on G06, so FSA is slightly better than AMA.
Based on the above comparison, it can be concluded that BSAISA is effective and competitive in terms of convergence speed.

Simulation on Engineering Design Problems.
In order to assess the optimization performance of BSAISA in real-world engineering constrained optimization problems, 5 well-known engineering constrained design problems including three-bar truss design, pressure vessel design, tension/compression spring design, welded beam design, and speed reducer design are considered in the second experiment.
Note. "Known optimal" denotes the best known function values in the literature. "Bold" means the algorithm has found the best known function values. The same as Table 6.

Three-Bar Truss Design Problem (TTP).
The three-bar truss problem is one of the engineering minimization test problems for constrained algorithms. The best feasible solution is obtained by BSAISA at $X$ = (0.788675, 0.408248) with the objective function value $f(X)$ = 263.895843 using 8940 FEs. The comparison of the best solutions obtained from BSAISA, BSA, differential evolution with dynamic stochastic selection (DEDS) [69], hybrid evolutionary algorithm (HEAA) [70], hybrid particle swarm optimization with differential evolution (POS-DE) [71], differential evolution with level comparison (DELC) [72], and mine blast algorithm (MBA) [73] is presented in Table 8. Their statistical results are listed in Table 9.
From Tables 8 and 9 ...
Figure 7 depicts the convergence curves of BSAISA and BSA for the three-bar truss design problem, where the value of f(x*) on the vertical axis equals 263.895843. As shown in Figure 7, BSA reaches the global optimum at about 700 iterations, while BSAISA reaches it at only about 400 iterations. It can be concluded that BSAISA converges faster than BSA on this problem.
Note. "NA" means not available. The same as Tables 8, 10, 11, 12, 13, 14, 15, and 17. "RK" represents the comprehensive ranking of each algorithm on the set of benchmarks. "Nu." represents the sum of the number of winners and the number of the optimal function values for each algorithm on the set of benchmarks.

Pressure Vessel Design Problem.
For this problem, BSAISA is compared with nine algorithms: BSA, BSA-SA [62], DELC, POS-DE, genetic algorithms based on dominance tournament selection (GA-DT) [73], modified differential evolution (MDE) [74], coevolutionary particle swarm optimization (CPSO) [75], hybrid particle swarm optimization (HPSO) [76], and artificial bee colony algorithm (ABC) [77]. The comparison of the best solutions obtained by BSAISA and other reported algorithms is presented in Table 10. The statistical results of the various algorithms are listed in Table 11.
As shown in Table 10, the solution sets obtained by all algorithms satisfy the constraints for this problem. BSAISA, BSA-SA, ABC, DELC, and HPSO find the same good objective function value, 6059.7143, which is slightly worse than MDE's function value of 6059.7016. It is worth mentioning that MBA's best solution was obtained at x = (0.7802, 0.3856, 40.4292, 198.4964) with f(x) = 5889.3216 and the corresponding constraint values g(x) = (0, 0, −86.3645, −41.5035) in [78]. Although MBA finds a far better function value than MDE, its first two variables (i.e., 0.7802 and 0.3856) are not integer multiples of 0.0625, so MBA's results are not listed in Table 10 to ensure a fair comparison. From Table 11, except for MDE with the function value of 6059.7016, BSAISA offers better function values than GA-DT, CPSO, ABC, and BSA. In addition, BSAISA is far superior to the other algorithms in terms of FEs. However, the Std value obtained by BSAISA is relatively poor compared with the others for this problem. Figure 8 depicts the convergence curves of BSAISA and BSA for the pressure vessel design problem, where the value of f(x*) on the vertical axis equals 6059.7143. As shown in Figure 8, BSAISA finds the global optimum at about 800 iterations and obtains a far more accurate function value than BSA. Moreover, BSAISA converges much faster than BSA.
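The exclusion of MBA's solution rests on a simple arithmetic condition: the first two variables must be integer multiples of 0.0625. A minimal sketch of that check follows; the helper name and the tolerance are assumptions, and 0.8125 (that is, 13 × 0.0625) is included only as an illustrative admissible value, not as a value quoted from the text.

```python
import math


def is_multiple_of(value: float, step: float = 0.0625, tol: float = 1e-9) -> bool:
    """Return True if `value` is an integer multiple of `step` (within `tol`)."""
    ratio = value / step
    return math.isclose(ratio, round(ratio), rel_tol=0.0, abs_tol=tol)


# 0.7802 and 0.3856 are MBA's reported values; 0.8125 is an illustrative
# admissible value (13 * 0.0625) added for contrast.
for thickness in (0.7802, 0.3856, 0.8125):
    ratio = thickness / 0.0625
    print(f"{thickness}: {ratio:.4f} steps, admissible = {is_multiple_of(thickness)}")
```

Both MBA values fall between consecutive multiples of 0.0625, which is why its lower objective value is not directly comparable with the other results.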

Tension Compression Spring Design Problem (TCSP).
This design optimization problem has three continuous variables and four nonlinear inequality constraints. The best feasible solution is obtained by BSAISA at x = (0.051687, 0.356669, 11.291824) with f(x) = 0.012665 using 9440 FEs. This problem has been solved by other methods as follows: GA-DT, MDE, CPSO, HPSO, DEDS, HEAA, DELC, POS-DE, ABC, MBA, BSA-SA, and Social Spider Optimization (SSOC) [79]. The comparison of the best solutions obtained from various algorithms is presented in Table 12. Their statistical results are listed in Table 13. From Tables 12 and 13, the vast majority of algorithms find the best function value 0.012665 for this problem, while GA-DT and CPSO fail to find it. With regard to the computational cost (FEs), BSAISA requires only 9440 FEs to reach the global optimum, which is superior to all other algorithms except MBA with 7650 FEs. However, the Worst, Mean, and Std values of BSAISA are better than those of MBA. Consequently, for this problem, it can be concluded that BSAISA is clearly superior in terms of FEs to all other algorithms except MBA and that BSAISA is more robust than MBA. Figure 9 depicts the convergence curves of BSAISA and BSA for the tension compression spring design problem, where the value of f(x*) on the vertical axis equals 0.012665. From Figure 9 it can be observed that both BSAISA and BSA fall into a local optimum in the early iterations but are able to escape from it. However, BSAISA converges clearly faster than BSA.

Welded Beam Design Problem.
... Table 14. The comparison of their statistical results is presented in Table 15.
From Tables 14 and 15, except that the constraint value of PSO is not available, the solution sets obtained by all algorithms satisfy the constraints for the problem. Most of the algorithms, including BSAISA, BSA, BSA-SA, MDE, HPSO, DELC, POS-DE, ABC, and SSOC, are able to find the best function value 1.724852, while GA-DT, CPSO, and MBA fail to find it. It should be admitted that DELC is superior ... Table 15. Figure 10 depicts the convergence curves of BSAISA and BSA for the welded beam design problem, where the value of f(x*) on the vertical axis equals 1.724852. Figure 10 shows that BSAISA converges remarkably faster than BSA.

Speed Reducer Design Problem (SRP).
This speed reducer design problem has eleven constraints, six continuous design variables (x1, x2, x4, x5, x6, x7), and one integer variable (x3) ... Table 17. As shown in Tables 16 and 17, the solution sets obtained by all algorithms satisfy the constraints for this problem. BSAISA, BSA, DEDS, and DELC are able to find the best function value 2994.471066 while the others do not. Among these four algorithms, DEDS, DELC, and BSA require 30000, 30000, and 25640 FEs, respectively, whereas BSAISA requires only 15860 FEs to reach the same best function value. MBA fails to find the best known function value; thus BSAISA is better than MBA on this problem, even though MBA requires fewer FEs. As for the Std, among the four algorithms that achieve the best known function value, BSAISA is worse than the others. However, it should be noted that the main purpose of the experiment is to compare the convergence speed of BSAISA with that of other algorithms. From this point of view, it can be concluded that BSAISA performs better than the other algorithms in terms of convergence speed. Figure 11 depicts the convergence curves of BSAISA and BSA for the speed reducer design problem, where the value of f(x*) on the vertical axis equals 2994.471066. Figure 11 shows that BSAISA converges faster than BSA.

Comparisons Using Sign Test.
Sign Test [80] is one of the most popular statistical methods used to determine whether two algorithms are significantly different. Recently, Miao et al. [81] utilized the Sign Test to compare the performance of their proposed modified algorithm with that of the original one. In this paper, the two-tailed Sign Test with a significance level of 0.05 is adopted to test for significant differences between the results obtained by different algorithms, and the test results are given in Table 18. Best and FEs are the two most important criteria for evaluating algorithms in this paper; they are therefore chosen as the objectives of the Sign Test. The signs "+," "≈," and "−" indicate, respectively, that BSAISA performs significantly better than, almost the same as, or significantly worse than the algorithm it is compared to. The null hypothesis is that the performances of BSAISA and the compared algorithm are not significantly different. As shown in Table 18, the p values supporting the null hypothesis of the Sign Test for six pairs of algorithms (BSAISA-SR, BSAISA-FSA, BSAISA-AMA, BSAISA-MABC, BSAISA-RPGA, and BSAISA-BSA-SA) are 0.006, 0.003, 0.000, 0.012, 0.001, and 0.039, respectively, so the null hypothesis can be rejected. This illustrates that the optimization performance of the proposed BSAISA is significantly better than that of these six algorithms. The p value of BSAISA-CDE is equal to 0.581, so the null hypothesis cannot be rejected. However, according to the related sign values ("+," "≈," and "−") in Table 18, BSAISA is slightly worse than CDE on 5 problems but wins on the other 8 problems, which illustrates that the proposed BSAISA is competitive with CDE. Overall, the p values and sign values support the superiority of BSAISA over the other well-known algorithms on the constrained optimization problems.
Note. The columns of "+," "≈," and "−" indicate the number of functions where BSAISA performs significantly better than, almost the same as, or significantly worse than the compared algorithm, respectively. "p value" denotes the probability value supporting the null hypothesis.
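The p value reported for the BSAISA-CDE pair can be reproduced from the win and loss counts given in the text (8 wins, 5 losses). The sketch below assumes the usual exact two-tailed Sign Test, in which ties are discarded and the remaining outcomes follow a Binomial(n, 0.5) distribution under the null hypothesis; the function name is hypothetical, and the paper does not state how its p values were computed.

```python
from math import comb


def sign_test_p_value(wins: int, losses: int) -> float:
    """Exact two-tailed Sign Test p value.

    Assumption: ties are dropped beforehand, and under the null hypothesis
    each remaining comparison is a win with probability 0.5.
    """
    n = wins + losses
    if n == 0:
        return 1.0  # no informative comparisons
    k = max(wins, losses)
    # Probability of an outcome at least as extreme as k in one tail.
    one_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # Double for the two-tailed test and cap at 1.
    return min(1.0, 2.0 * one_tail)


# BSAISA versus CDE: 8 wins and 5 losses, as stated in the text.
p_cde = sign_test_p_value(wins=8, losses=5)
print(f"p value = {p_cde:.3f}")  # prints "p value = 0.581"
```

Because 0.581 is far above the 0.05 significance level, an 8-to-5 split over thirteen problems is not enough evidence to reject the null hypothesis, which matches the conclusion drawn above.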
On the one hand, all experimental results suggest that the proposed method improves the convergence speed of BSA. On the other hand, the overall comparative results of BSAISA and other well-known algorithms demonstrate that BSAISA is more effective and competitive for constrained and engineering optimization problems in terms of convergence speed.

Conclusions and Future Work
In this paper, we proposed a modified version of BSA inspired by the Metropolis criterion in SA (BSAISA). The Metropolis criterion may probabilistically accept a higher energy state and the acceptance probability can decrease as the temperature decreases, which motivated us to redesign the amplitude control factor F so it can adaptively decrease as the number of iterations increases. The design principle and numerical analysis of the redesigned F indicate that the change in F could accelerate the convergence speed of the algorithm by improving the local exploitation capability. Furthermore, the redesigned F does not introduce extra parameters. We successfully implemented BSAISA to solve some constrained optimization and engineering design problems. The experimental results demonstrated that BSAISA has a faster convergence speed than BSA and it can efficiently balance the capacity for global exploration and local exploitation. The comparisons of the results obtained by BSAISA and other well-known algorithms demonstrated that BSAISA is more effective and competitive for constrained and engineering optimization problems in terms of convergence speed.
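The behavior of the Metropolis criterion that motivated this redesign can be illustrated with the standard acceptance rule from simulated annealing. The sketch below shows only that generic rule, with hypothetical names; it is not the redesigned F of BSAISA, whose exact expression is given earlier in the paper.

```python
import math


def metropolis_acceptance(delta_e: float, temperature: float) -> float:
    """Standard Metropolis acceptance probability for an energy change delta_e.

    Illustrative only: this is the generic simulated annealing rule, not the
    amplitude control factor used in BSAISA.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    if delta_e <= 0:
        return 1.0  # a lower (or equal) energy state is always accepted
    return math.exp(-delta_e / temperature)


# The same uphill move becomes less likely to be accepted as the system cools.
for temperature in (10.0, 1.0, 0.1):
    p = metropolis_acceptance(delta_e=1.0, temperature=temperature)
    print(f"T = {temperature:>4}: acceptance probability = {p:.6f}")
```

At high temperature a worse state is accepted almost as readily as a better one, while at low temperature it is almost always rejected, which is the decreasing behavior the redesigned F is meant to mirror over the iterations.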
This paper suggests that the proposed BSAISA is superior in terms of convergence speed and computational cost. The downside of the proposed algorithm is that its robustness is not clearly superior to that of the compared algorithms. Our future work will therefore investigate the robustness of BSAISA further on the basis of the current research. Niche techniques can effectively maintain the population diversity of evolutionary algorithms [82,83]. How to combine BSAISA with niche techniques to improve the robustness of the algorithm deserves to be studied in the future.

Conflicts of Interest
The authors declare that they have no conflicts of interest.