Discrete and Continuous Optimization Based on Hierarchical Artificial Bee Colony Optimizer

This paper presents a novel optimization algorithm, namely, hierarchical artificial bee colony optimization (HABC), to tackle complex high-dimensional problems. In the proposed multilevel model, the higher-level species can be aggregated by the subpopulations from lower level. In the bottom level, each subpopulation employing the canonical ABC method searches the part-dimensional optimum in parallel, which can be constructed into a complete solution for the upper level. At the same time, the comprehensive learning method with crossover and mutation operator is applied to enhance the global search ability between species. Experiments are conducted on a set of 20 continuous and discrete benchmark problems. The experimental results demonstrate remarkable performance of the HABC algorithm when compared with other six evolutionary algorithms.


Introduction
Swarm intelligence (SI) is a new paradigm of artificial intelligence system whereby small entities are working together to produce an intelligent behaviour from simple rules [1][2][3][4]. Artificial bee colony algorithm (ABC) is one of the most popular members of the family of swarm intelligence, which simulates the social foraging behavior of a honeybee swarm [5,6]. Due to its simple arithmetic and good robustness, the ABC algorithm has been widely used in solving many numerical optimizations [7][8][9] and engineering optimization problems [10][11][12][13][14].
However, when solving complex multimodal problems, ABC algorithm suffers from the following drawbacks [8].
(1) It is easily trapped into local minimums in the early generations, which leads to low population diversity in successive generations. (2) With the dimension increasing, the information exchange of each individual is limited in a random dimension, resulting in a slow convergence rate.
(3) Due to the random selection of the neighbor bee and dimensions, food sources with higher fitness are not utilized, which influences the ability of global search.
Moreover, compared to the huge in-depth studies of other evolutionary and swarm intelligence algorithms, such as evolutionary algorithm (EA) [15][16][17][18] and particle swarm optimization (PSO) [19][20][21][22][23], how to improve the diversity of swarm or overcome the local convergence of ABC is still a challenging to the researchers in optimization domain.
In order to improve the performance of ABC on complex and high-dimensional problems, in this paper, a novel optimization algorithm called hierarchical ABC optimization (HABC) is proposed to extend the topology of original ABC algorithm from flat (one level) to hierarchical (multiple levels), which adopts multipattern cooperative evolutionary strategies.
It is noted that the hierarchical coevolutionary scheme has been incorporated in these intelligent algorithms. For instance, a hierarchical version of PSO was proposed in [24] on IEEE Trans on Evol Comp, in which a tree type hierarchy was incorporated in PSO. Chen et al. [25] proposed 2 Journal of Applied Mathematics a hierarchical swarm model where each subswarm evolved based on original PSO. Peng and Lu [26] presented a similar PSO framework, in which several swarms evolve in parallel with mutation operator. This paper further extends the above studies based on hierarchical frameworks with more complex strategies, such as variables decomposing approach, random grouping of variables, and cross and mutation operation. In particular, instead of simply employing original SI algorithm in parallel like other hierarchical algorithms, the HABC employing the variables decomposing strategy with random grouping technique divides the complex vectors into smaller components assigned to different subswarms, which achieves a significant improvement of solving high-dimensional problem.
In terms of neighborhood concept, Kennedy and Mendes [27] analyzed effects of different topologies-based PSO algorithms. In 2004, a fully informed PSO using topologies and index-based neighborhood was proposed in [28]. Recently, Qu et al. [29] developed a neighborhood-based mutation strategy and integrated it with various niching DE algorithms, in which various topology types were considered. A distancebased locally informed particle swarm model was proposed in [30], which eliminates the need to specify any niching parameter and enhance the fine search ability of PSO. Obviously, many topology patterns can be used in different levels of our mode. In this work, we employ the most common topology, ring type, as the main structure of the proposed model. The proposed HABC model is inherently different from others in the following aspects.
(1) The cooperative coevolving approach based on divide-and-conquer strategy enhances the local search ability (exploitation); that is, by applying this method, the complex high-dimensional vectors can be decomposed into smaller components that are assigned to the lower hierarchy. (2) The traditional evolution operators, namely, the crossover and mutation, are applied to interaction of different species instead of the traditional individual exchange between populations. In this case, the neighbor bees with higher fitness can be chosen to crossover and mutation, which effectively enhances the global search ability and convergence speed to the global best solution (exploration).
Extensive studies based on a suit of 20 benchmark functions (including both continuous and discrete cases) are conducted to evaluate the performance of HABC. For comparison purposes, we also implemented the particle swarm optimization algorithm (PSO), cooperative particle swarm optimization algorithm (CPSO), artificial bee colony algorithm (ABC), cooperative artificial bee colony algorithm (CABC), covariance matrix adaptation evolution strategy (CMA-ES), and HABC variants on these functions. The experimental results are encouraging: the proposed HABC algorithm achieved remarkable search performance in terms of solution accuracy and convergence rate on almost all benchmark functions.
The rest of the paper is organized as follows. Section 2 describes the canonical ABC algorithm. In Section 3, the proposed hierarchical artificial bee colony (HABC) model is given. Section 4 tests the algorithm on the benchmarks and illustrates the results. Finally, Section 5 outlines the conclusions.

Canonical ABC Algorithm
The ABC algorithm is a relatively new SI algorithm by simulating the foraging behaviors of honey bee swarm, initially proposed by Karaboga and further developed by Basturk and Li et al. [8,9,31,32]. In ABC, the colony of artificial bees are classified as three types: employed bees, onlookers, and scouts. Employed bees exploit the specific food sources; meanwhile, they pass the food information to onlooker bees. Onlooker bees choose good food sources based on the received information and then further exploit the food near their selected food source. The employed bee will become a scout when its food source has been abandoned. The fundamental mathematic representations are listed as follows.
In initialization phase, a group of food sources are generated randomly in the search space using the following equation: where = 1, 2, . . . , SN and = 1, 2, . . . , . SN is the number of food sources. is the number of variables, that is, problem dimension. min and max are the lower upper and upper bounds of the th variable, respectively. In the employed bees' phase, the neighbor food source (candidate solution) can be generated from the old food source of each employed bee in its memory using the following expression: where is a randomly selected food source and must be different from ; is a randomly chosen indexes; is a random number in range [−1, 1].
In the onlooker bees' phase, a onlooker bee selects a food source depending on the probability value associated with that food source, and can be calculated as follows: where fitness is the fitness value of th solution.
In scout bees' phase, if a food source cannot be improved further through a predetermined cycle (called "limit" in ABC), the food source is supposed to be abandoned. The employed bee subsequently become a scout. A new food source will be produced randomly in the search space using (1).
The employed, onlooker, and scout bees' phases will recycle until the termination condition is met.

Hierarchical Artificial Bee Colony Algorithm
The HABC integrates a two-level hierarchical coevolution scheme inspired by the concept and main ideas of multipopulation coevolution strategy and crossover and mutation operation. The flowchart of the HABC is shown in Figure 1. It includes four key strategy approaches: variables decomposing approach, random grouping of variables, background vector calculating approach, and crossover and mutation operation, which is presented as follows.
3.1. Hierarchical Multipopulation Optimization Model. As described in Section 2, we can see that the new food source is produced by a perturbation coming from a random single dimension in a randomly chosen bee. This causes an individual to may have discovered some good dimensions, while the other individuals that follow this bee are likely to choose worse vectors in -dimensions and abandon the good ones. On the other hand, when solving complex problems, single population-based artificial bee algorithms suffer from the following drawback of premature convergence at the early generations. Hence, the HABC contains two levels, namely, the bottom level and top level, to balance exploring and exploiting ability. In Figure 1, in the bottom level, with the variables decomposing strategy, each subpopulation employs the canonical ABC method to search the part-dimensional optimum in parallel; that is, in each iteration, subpopulations in the bottom level generate best solutions, which are constructed into a complete solution species that are updated to the top level. In the top level, the multispecies community adopts the information exchange mechanism based on crossover operator, by which each species can learn from its neighborhoods in a specific topology. The vectors decomposing strategy and information exchange crossover operator can be described in detail as follows.

Variables Decomposing
Approach. The purpose of this approach is to obtain finer local search in single dimensions inspired by the divide-and-conquer approach. Notice that two aspects must be analyzed: (1) how to decompose the whole solution vector, and (2) how to calculate the fitness of each individual of each subpopulation. The detailed procedure is presented as follows.
Step 1. The simplest grouping method is permitting adimensional vector to be split into subcomponents, each corresponding to a subpopulation of -dimensions, with individuals (where = * ). The th subpopulation is denoted as , ∈ [1 ⋅ ⋅ ⋅ ].
Step 2. Construct complete evolving solution , which is the concatenation of the best subcomponents' solutions by following: where ⋅ represents the personal best solution of the th subpopulation.
Step 3. For each component , ∈ [1 ⋅ ⋅ ⋅ ], do the following. Step 4. Memorize the best solution achieved so far, and compare the best solution with and memorize the best one.

Random Grouping of Variables.
To increase the probability of two interacting variables allocated to the same subcomponent, without assuming any prior knowledge of the problem, according to the random grouping of variables proposed by [21], we adopt the same random grouping scheme by dynamically changing group size. For example, for a problem of 100 dimensions, we can define that = {2, 5, 10, 20, 100} , ⊂ .
Here, if we randomly decompose the -dimensional object vector into subcomponents at each iteration (i.e., we construct each of the subcomponents by randomly selecting -dimensions from the -dimensional object vector), the probability of placing two interacting variables into the same subcomponent becomes higher, over an increasing number of iterations [21].

The Information Exchange Mechanism Based on Crossover
Operator between Multispecies. In the top level, we adopt crossover operator with a specific topology to enhance the information exchange between species, in which each species can learn from its symbiotic partner in the neighborhood. The key operations of this crossover procedure are described in Figure 2.
Step 5 (Select elites to the best-performing list (BPL)). First, a set of competent individuals from current species 's neighborhood (i.e., ring topology) is selected to construct the best-performing list (BPL) with higher fitness that has larger probability to be selected. The size of BPL is equal with the number of current species . These individuals of BPL are regarded as elites. The selection operation tries to mimic the maturing phenomenon in nature, where the generated offspring will become more suitable to the environment by using these elites as parents. Step 6 (Crossover and mutation between species). To produce well-performing individuals, parents are selected from the BPL's elites only for the crossover operation. To select parents effectively, the tournament selection scheme is used. Firstly, two enhanced elites are selected randomly, and their fitness values are compared to select the elites. Then, the one with better fitness value is viewed as parent. Then, another parent is selected in the same way. Two offsprings are created by performing crossover on the selected parents. This paper adopts two representative crossover methods: singlepoint crossover and arithmetic crossover. For single-point crossover, the offspring is produced as the traditional GA crossover method [33]. For arithmetic crossover method, the offspring is produced by (6) as follows: where new is newly produced offspring, parent1 and parent2 are randomly selected from BPL.
Step 7 (Update with greedy selection strategy). Not all current species are replaced by the elites from BPL; we set a selecting rate CR to determine the replaced individuals. Assuming that species size of is , then the replaced individuals number is * CR. For the selected individual, S-dim P 21 best with S-dimensions and M individuals,D = · · · · · · · · · · · · S * K Figure 3: Flowchart of the HABC algorithm.
, the newly produced offspring new is then compared with , applying a greedy selection mechanism, in which the best one is remained. We can choose four selecting approaches: selecting the best individuals (i.e., * CR individuals), a medium level of individuals, the worst individuals, and random individuals.
Hence, there are eight HABC variants according to different crossover and selection approaches, as listed in Table 1.
In the next experiments, we will study the effect of different crossovers, selection modes, and CRs. With applying the effective social learning strategy, the information exchange between species is enhanced and the food sources with higher fitness are fully utilized.
In summary, in order to facilitate the below presentation and test formulation, we define a unified parameters for HABC model in Table 2. According to the process description as mentioned above, the flowchart of HABC algorithm is summarized in Figure 3, while the pseudocode for HABC algorithm is presented in Algorithm 1. Each sub-population with S dimensions (where S is randomly chosen from a set G, and = * ). WHILE (the termination conditions are not met) for each species i, Then ⋅ = ⋅ end end end WHILE Select elites form neighborhood of BPL = the top best individuals of the ring topology { −1 , , +1 } Crossover and Mutation by (4) Update with applying Greedy selection mechanism from end find the global best solution gbest from the whole population P memorize the best solution of each Set := + 1; end WHILE Algorithm 1: Pseudocode for the HABC algorithm.

Experimental Study
In the experimental studies, to fully evaluate the performance of the HABC algorithm fairly, we employ a set of 20 benchmark functions as listed in the appendix, which can be classified as basic continuous benchmarks ( 1 -8 ), CEC2005 benchmarks ( 9 -15 ), and discrete benchmarks ( 16 ∼ 20 ) [34]. The number of function evaluations (FEs) is adopted as the time measure criterion substitute the number of iterations.

Experimental Settings.
Eight variants of HABC based on different crossover methods and CR values were executed with six state-of-the-art EA and SI algorithms for comparisons: artificial bee colony algorithm (ABC) [4], cooperative  Selection rate for replacing the offspring to the selected individuals The hierarchical interaction topology of HABC The objective optimization goals artificial bee colony algorithm (CABC) [35], canonical PSO with constriction factor (PSO) [36], cooperative PSO (CPSO) [20], standard genetic algorithm (GA) [33], and covariance matrix adaptation evolution strategy (CMA-ES) [37]. CABC adopts the similar cooperative approach into original ABC algorithm. CPSO is a cooperative PSO model, cooperatively coevolving multiple PSO subpopulations. GA is the classical stochastic search technique mimicking the process of natural selection; the principle of CMA-ES is to apply the information of successful search steps to adjust the covariance matrix of the mutation distribution within an iterative procedure.
The population size and total generation number of all involved algorithms in this experiment are set as 50 and 100000. For the continuous testing functions used, the dimensions are all set as 100D. For the five discrete testing functions, the dimensions are set as Goldberg-30D, Bipolar-60D, Mulenbein-120D, Clerc's problem 1-120D, and Clerc's problem 2-120D.
According to the original literatures about the control parameters for the other algorithms involved, in continuous optimization experiment, the initialization conditions of CMA-ES are the same as in [37], and the number of offspring candidate solutions generated per time step is = 4 ; for ABC and CABC, the limit parameter is set to be SN× , where is the objective function dimension and SN is the employed bees number. The split factor for CABC and CPSO is equal to the dimensions [20,35]. For PSO and CPSO, the learning rates 1 and 2 were both set as 2.05 and the constriction factor = 0.729. In discrete optimization experiment, HABC is compared with the binary PSO version, ABC and GA. GA employs single point crossover operation (i.e., crossover rate is 0.8 and mutation rate is 0.01). For discrete HABC and discrete ABC, the update process used in the employ and onlooker stages is changed based on (1). The learning factor of producing new solution is calculated using (7). Then, the position update equation is defined by (8) for discrete problems. Discrete HABC variant adopts single-point crossover method. Consider  Figure 4, it is visible that our proposed ABC got faster convergence rate and better optimal solutions on the involved test functions with increased . However, the performance improvement is not very remarkable using this parameter.  Table 3, the performances of the continuous algorithms involved are ordered as HABC AW > HABC AB > HABC AR > HABC AM > HABC SR > HABC SM > HABC SB > HABC SW and for the discrete versions are ordered as HABC SB > HABC SM > HABC SW > HABC SR.

Choices of Crossover Mode and CR
Then, the HABC AW for continuous functions and HABC SB for discrete functions are implemented to determine CR value. Form Tables 4 and 5, we can find that HABC variant with CR equal to 1 performs best on four functions among all five functions while CR equal to 0.05 gets best result on one function. According to the results with different CRs, we chose CR equal to 1 as an optimal value for the next experiments.

Effects of Dynamically Changing Group Size .
Obviously, the choice of value for split factor (i.e., subpopulation number) had a significant impact on the performance of the proposed algorithm. In order to vary during a run, we defined = {2, 5, 10, 50, 100} for 100D function optimization and set randomly to choose one element of . Then, the variant of HABC with dynamically changing is compared with that with fixed split number on four benchmark functions for 30 sample runs. From the results listed in Table 6 and Figure 5, we can observe that the performance is sensitive to the predefined value. HABC, with the dynamically changing , consistently obtained superior performance to other variants except 2 . Moreover, it is not very easy to obtain the prior knowledge about the optimal value of the most real-world cases, so the random grouping scheme can be a suitable solution.

Comparing HABC with Other State-of-the-Art Algorithms on Benchmark Problems.
To validate the effectiveness of the proposed algorithm, the basic benchmark functions ( 1 -8 ) and CEC2005 functions ( 9 -15 ) are employed [38]. HABC (adopting HABC AW for continuous problems, CR = 1) is tested on a set of benchmark functions ( 1 ∼ 15 ) in comparison with CABC, CPSO, CMA-ES, ABC, PSO, and GA algorithms. Table 7 demonstrates the corresponding results of mean and standard deviation for each algorithm on 1 ∼ 15 with 100 dimensions. Figure 6 shows the convergence characteristics in terms of the best mean run of each algorithm for 1 ∼ 15 with 100 dimensions.

Results on Basic Benchmark Continuous Functions.
On the unimodal basic benchmark functions ( 1 -4 ), from Table 7 and Figures 6(a)-6(d), HABC converged faster than all other algorithms. HABC was able to consistently find the minimum to functions 1 , 2 , and 3 within 100000 FEs. Statistically, HABC has significantly superior performance on continuous unimodal functions 1 ∼ 3 . On 4 , HABC, CABC, and CMA-ES have almost the same average value and CMA-ES is a little better. According to the rank in Table 7, the performance order of the algorithms involved is CMA-ES > HABC > CABC > ABC > CPSO > PSO > GA.
On the multimodal functions ( 5 ∼ 8 ), from Table 7    function 5 and 8 , and CABC also can consistently find the minimum of 5 within relatively more FEs while other algorithms perform poorer. On 6 and 7 , the HABC and CABC obtain similar performance. This can be explained that the multipopulation cooperative coevolution strategy integrated by HABC and CABC enhance the local search ability, contributing to their better performances in the multimodal problems. According to the rank values in Table 7, the performance order of the algorithms tested is HABC > CABC > ABC > CPSO> CMA-ES > PSO > GA.

Results on CEC2005 Continuous
Functions. Seven shifted and rotated functions 9 ∼ 15 are viewed as the most difficult functions to tackle. From Table 7, HABC outperformed CMA-ES on five out of the seven functions, except 10 and 13 . CMA-ES also outperformed CABC on six functions, except 14 . HABC can find the global optimum for 9 and 14 within 10000 Fes, because HABC balances the exploration and exploitation by decomposing high-dimensional problems and using crossover and mutation operations to maintain its population diversity, which is a key contributing factor. On the other hand, CMA-ES converged extremely fast. However, CMA-ES either converged very fast or tended to be trapped into local minima very quickly, especially on multimodal functions. According to the rank values, the performance order of the algorithms involved is HABC > CMA-ES > CABC > ABC > CPSO > PSO > GA.

Results on Discrete Functions.
In this experiment, the HABC SB version (CR = 0.1) is employed on a set of discrete benchmark functions 16 ∼ 20 . The results obtained by HABC SB, ABC, GA, and PSO on each function are listed in Table 8, consisting of mean and standard deviations found over 30 runs. Figure 7 shows the search progress of the average values found by the four algorithms over 30 runs on 16 ∼ 20 . From the results in Table 8, we can observe that HABC gets the best means, and converges faster than other algorithms on all discrete cases. According to the rank values presented in Table 9; the search performance order of the algorithms involved is HABC > PSO > ABC > GA.

Algorithm Complexity Analysis.
Algorithm complexity analysis is presented briefly as follows. If we assume that the computation cost of one individual in the HABC is , the cost of the crossover operator is and the total computation cost of HABC for one generation is * * * + * . However, because the heuristic algorithms used in this paper cannot ensure comprehensive convergence, it is very difficult to give a brief analysis in terms of time for all algorithms. Through directly evaluating the algorithmic time response on different objective functions, the average computing time in 30 sample runs of all algorithms is given in Figure 8. From the results in Figure 8, it is observed that the HABC takes the most computing time in all compared algorithms and its time increasing rate is the highest one. This can be explained that the multipopulation cooperative coevolution strategy integrated by HABC enhanced the local search ability at cost of increasing the computation amount. In summary, it is concluded that, compared with other algorithms, the HABC requires more computing time to achieve better results.

Conclusion
In this paper, we propose a novel hierarchical artificial bee colony algorithm (HABC), extending single-level to multilevel aiming to improve the performance of solving complex and high-dimensional problem. The concept and main idea is extending single artificial bee colony (ABC) algorithm to hierarchical and cooperative mode by combining the multipopulation cooperative coevolution approach based on vector decomposing strategy and the comprehensive learning  method. At the same time, the discrete version of HABC is presented in this paper. We demonstrate that the hierarchical cooperative framework proposed is useful to improve ABC's performance to tackle high-dimensional optimization problems (up to 100 dimensions). The effects of using random grouping scheme has shown that the performance is sensitive to the grouping number, and without any prior knowledge, the random grouping can be a suitable solution. Subsequently, a group of appropriate parameters consisting of crossover modes, selection methods, and CRshas been determined for both continuous and discrete optimizations, respectively.
Then, HABC was compared with other SI and EA algorithms on a set of 20 benchmark functions (including          continuous and discrete cases). Our results show that, for all the test functions, HABC is more competitive than other classical powerful algorithms. HABC is the only one to consistently find the minimum of 1 (Sphere), 2 (Rosenbrock), 3 (Quadratic), 5 (Rastrigin), and 8 (Griewank), though CMA-ES got significantly better performance on the unimodal functions ( 1 -4 ). On CEC2005 functions, HABC exhibits more significant advantages in terms of accuracy, convergence rate, and robustness. In the future work, we will extend the optimization problem to higher-dimensional (up to 1000 dimensions) objective function solved by our proposed HABC. 1 ∼ 8 were adopted widely in evolutionary computation domain to show solution quality and convergence rate. 9 ∼ 15 are shifted and rotated functions selected from CEC2005 functions; their global optimum values are different to each other. In particular, function 12 has no bounds, and its initialization range is [0, 600]. The discrete functions 16 ∼ 20 were used in Clerc's literature [39,40] and can be found at http://clerc.maurice.free.fr/pso/. The fitness of a bit string is the sum of the results of separately applying the following function to consecutive groups of three components each: If the string size (the dimension of the problem) is , the maximum value is /3 for the string 111 ⋅ ⋅ ⋅ 111. In practice, we will then use as fitness the value /3 − so that the problem is now to find the minimum 0. The maximum value is /6. In practice, we will use as fitness the value /6 − so that the problem is now to find the minimum 0.

(18) Mulenbein's order-5
The fitness is the sum of the results of applying the following function to consecutive groups of five components each: The maximum value is 3.5 /5. In practice, the value 3.5 /5 − is used as the fitness so that the problem is now to find the minimum 0. The maximum value is /3. In practice, we will then use as fitness the value /3 − so that the problem is now to find the minimum 0. The maximum value is /3. In practice, we will use as fitness the value /3 − so that the problem is now to find the minimum 0.

B. Parameters of the Test Functions
The dimensions, initialization ranges, global optima * , and the corresponding fitness value ( * ) of each function are listed in Table 9.