Hierarchical Artificial Bee Colony Optimizer with Divide-and-Conquer and Crossover for Multilevel Threshold Image Segmentation

Introduction
Image segmentation is the process of partitioning a digital image into multiple regions or objects. Among the existing segmentation methods, multilevel thresholding is a simple but effective tool that can extract several distinct objects from the background [1][2][3][4]. Searching for the optimal set of thresholds that yields the desired thresholded classes is therefore an essential issue. Several multiple-threshold selection approaches have been proposed in the literature; [3, 5, 6] proposed methods derived from optimizing an objective function, originally developed for bilevel thresholding and later extended to the multilevel case [1, 2]. However, these methods, being based on exhaustive search, suffer from a common drawback: their computational complexity rises exponentially when extended from bilevel to multilevel thresholding.
Recently, to address this issue, swarm intelligence (SI) algorithms, as powerful optimization tools, have been introduced to the field of image segmentation owing to their predominant abilities in coping with complex nonlinear optimization problems [7][8][9][10]. Akay [11] employed two successful population-based algorithms, particle swarm optimization (PSO) and the artificial bee colony (ABC) algorithm, to speed up multilevel threshold optimization. Ozturk et al. [12] proposed an image clustering approach based on ABC to find well-distinguished clusters. Sarkar and Das [13] presented a 2D-histogram-based multilevel thresholding approach to improve the separation between objects, which demonstrated high effectiveness and efficiency.
It is worth noting that, owing to their simple arithmetic and good robustness, ABC-based approaches have attracted increasing attention from researchers [14][15][16][17][18] and are widely used in various optimization problems [19][20][21][22][23][24]. However, when solving complex problems, the ABC algorithm still suffers from the major drawbacks of being easily trapped in local minima and losing population diversity [20].
Compared with other evolutionary and swarm intelligence algorithms [25][26][27][28][29][30][31], how to improve swarm diversity or overcome the local convergence of ABC remains a challenging issue in the optimization domain. Thus, this paper presents a novel hierarchical optimization scheme based on divide-and-conquer and crossover strategies, namely HABC, which extends the ABC framework from flat (one level) to hierarchical (multiple levels). Note that such hierarchical schemes have been applied in other optimization algorithms [32][33][34][35][36][37][38]. The essential differences between our proposed scheme and the others lie in the following aspects.
(1) The divide-and-conquer strategy with random grouping is incorporated in this hierarchical framework, decomposing complex high-dimensional vectors into several smaller part-dimensional components that are assigned to the lower level. This enhances the local search ability (exploitation).
(2) An enhanced information exchange strategy, namely crossover, is applied to the interaction between different populations to maintain population diversity. In this case, neighbor bees with higher fitness can be chosen for crossover, which effectively enhances the global search ability and the convergence speed toward the global best solution (exploration).
The rest of the paper is organized as follows. Section 2 describes the canonical ABC algorithm. Section 3 presents the proposed hierarchical artificial bee colony (HABC) model. Section 4 presents the experimental studies of the proposed HABC and the other algorithms, with descriptions of the involved benchmark functions, experimental settings, and experimental results. Its application to image segmentation is presented in Section 5. Finally, Section 6 outlines the conclusion.

Canonical ABC Algorithm
The artificial bee colony (ABC) algorithm, proposed by Karaboga in 2005 [19] and further developed by Karaboga and Basturk [20, 21] for real-parameter optimization, is one of the most recently introduced swarm-based optimization techniques. It simulates the intelligent foraging behavior of a honeybee swarm.
The entire bee colony contains three groups of bees: employed bees, onlookers, and scouts. Employed bees explore specific food sources and, meanwhile, pass the food information to onlooker bees. The number of employed bees equals the number of food sources; in other words, each food source owns exactly one employed bee. Onlooker bees then choose good food sources based on the received information and further exploit the food near their selected sources. A food source with higher quality has a larger opportunity to be selected by onlookers. There is a control parameter called "limit" in the canonical ABC algorithm. If a food source is not improved within "limit" trials, it is assumed to be abandoned, and the employed bee associated with that food source becomes a scout that searches for a new food source randomly. The fundamental mathematical representations are listed as follows.
Step 1 (initialization phase). In the initialization phase, a group of food sources is generated randomly in the search space using the following equation:

x_ij = x_j^min + rand(0, 1) * (x_j^max - x_j^min),    (1)

where i = 1, 2, ..., SN and j = 1, 2, ..., D. SN is the number of food sources; D is the number of variables, that is, the problem dimension; x_j^min and x_j^max are the lower and upper bounds of the jth variable, respectively.
Step 2 (employed bees' phase). In the employed bees' phase, a neighbor food source (candidate solution) v_i is generated from the old food source x_i of each employed bee in its memory using the following expression:

v_ij = x_ij + φ_ij * (x_ij - x_kj),    (2)

where x_k is a randomly selected food source that must be different from x_i, j is a randomly chosen index, and φ_ij is a random number in the range [−1, 1].
Step 3 (onlooker bees' phase). In the onlooker bees' phase, an onlooker bee selects a food source depending on the probability value p_i associated with that food source, calculated as follows:

p_i = fitness_i / Σ_{n=1}^{SN} fitness_n,    (3)

where fitness_i is the fitness value of the ith solution.
Step 4 (scout bees' phase). In the scout bees' phase, if a food source cannot be improved further within a predetermined number of cycles (called "limit" in ABC), the food source is supposed to be abandoned. Its employed bee subsequently becomes a scout.
A new food source is then produced randomly in the search space using (1). The employed, onlooker, and scout bees' phases are repeated until the termination condition is met.
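The four phases above can be sketched in a compact form. This is a minimal illustrative implementation, not the paper's exact code; all function and parameter names are our own, and a minimization objective is assumed.

```python
import random

def abc_minimize(f, dim, bounds, sn=20, limit=100, max_evals=10000):
    """Minimal canonical ABC sketch: init, employed, onlooker, scout phases."""
    lo, hi = bounds
    foods = [[lo + random.random() * (hi - lo) for _ in range(dim)] for _ in range(sn)]
    fits = [f(x) for x in foods]
    trials = [0] * sn          # per-source stagnation counters for the "limit" rule
    evals = sn

    def neighbor(i):
        # v_ij = x_ij + phi * (x_ij - x_kj) on one randomly chosen dimension j
        k = random.choice([s for s in range(sn) if s != i])
        j = random.randrange(dim)
        v = foods[i][:]
        v[j] += random.uniform(-1, 1) * (foods[i][j] - foods[k][j])
        v[j] = min(max(v[j], lo), hi)
        return v

    while evals < max_evals:
        # employed bees: one greedy neighbor move per food source
        for i in range(sn):
            v = neighbor(i); fv = f(v); evals += 1
            if fv < fits[i]:
                foods[i], fits[i], trials[i] = v, fv, 0
            else:
                trials[i] += 1
        # onlooker bees: probabilistic choice favoring better sources
        weights = [1.0 / (1.0 + ft) for ft in fits]
        for _ in range(sn):
            i = random.choices(range(sn), weights=weights)[0]
            v = neighbor(i); fv = f(v); evals += 1
            if fv < fits[i]:
                foods[i], fits[i], trials[i] = v, fv, 0
            else:
                trials[i] += 1
        # scout bee: abandon the most stagnant source once "limit" is exceeded
        worst = max(range(sn), key=lambda i: trials[i])
        if trials[worst] > limit:
            foods[worst] = [lo + random.random() * (hi - lo) for _ in range(dim)]
            fits[worst] = f(foods[worst]); trials[worst] = 0; evals += 1

    best = min(range(sn), key=lambda i: fits[i])
    return foods[best], fits[best]
```

On a smooth low-dimensional objective such as the sphere function, this sketch converges near the optimum within a few thousand evaluations.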

Hierarchical Artificial Bee Colony Algorithm

Hierarchical Multipopulation Optimization Model.
As shown in Figure 1, the HABC framework contains two levels to balance exploration and exploitation. In the bottom level, with the variable-decomposing strategy, each subpopulation employs the canonical ABC method to search for its part-dimensional optimum in parallel. That is, in each iteration, the K subpopulations in the bottom level generate K best solutions, which are assembled into a complete solution that is passed up to the top level. In the top level, the multispecies community adopts an information exchange mechanism based on a crossover operator, by which each species can learn from its neighbors in a specific topology. The vector-decomposing strategy and the crossover-based information exchange operator are described in detail as follows.

Variables Decomposing Approach.
The purpose of this approach, inspired by divide-and-conquer, is to obtain a finer local search in single dimensions. Two aspects must be analyzed: (1) how to decompose the whole solution vector and (2) how to calculate the fitness of each individual of each subpopulation. The detailed procedure is as follows.
Step 1. The simplest grouping method permits a D-dimensional vector to be split into K subcomponents, each corresponding to a subpopulation of s dimensions with M individuals (where D = K * s). The jth subpopulation is denoted P_j.

Step 2. Construct the complete evolving solution Gbest as the concatenation of the best subcomponents' solutions:

Gbest = (P_1.g_best, P_2.g_best, ..., P_K.g_best),

where P_j.g_best represents the personal best solution of the jth subpopulation.
Step 3. Evolve each subpopulation with the canonical ABC, evaluating each part-dimensional individual by substituting it into the corresponding positions of Gbest and computing the objective value of the resulting complete vector.

Step 4. Memorize the best solution achieved so far; compare the current best solution with Gbest and keep the better one.
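The fitness calculation raised in point (2) above, scoring a part-dimensional individual, is conventionally done in cooperative coevolution by splicing the candidate into its slice of the context vector Gbest and evaluating the full objective. A minimal sketch follows; the function and variable names are ours, not the paper's.

```python
def evaluate_subcomponent(f, gbest, start, candidate):
    """Score a part-dimensional candidate by splicing it into the complete
    context vector Gbest and evaluating the full objective f on the result."""
    trial = list(gbest)                              # copy, do not mutate Gbest
    trial[start:start + len(candidate)] = candidate  # overwrite this group's slice
    return f(trial)
```

For example, with a sphere objective and Gbest = [1, 1, 1, 1], a candidate [0, 0] placed at offset 2 is scored on the spliced vector [1, 1, 0, 0].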
Random Grouping of Variables. To increase the probability of allocating two interacting variables to the same subcomponent, without assuming any prior knowledge of the problem, we adopt the random grouping scheme with dynamically changing group size proposed by [29]. For example, for a problem of 50 dimensions, we can define the candidate set K = {2, 5, 10, 25, 50} and draw the split factor from K.
Here, if we randomly decompose the D-dimensional object vector into K subcomponents at each iteration (i.e., we construct each of the K subcomponents by randomly selecting s dimensions from the D-dimensional object vector), the probability of placing two interacting variables into the same subcomponent becomes higher over an increasing number of iterations [30].
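The random grouping step can be sketched as follows: draw a split factor from the candidate set, shuffle the dimension indices, and cut them into equal-sized groups. This is an illustrative reading of the scheme; names are ours.

```python
import random

def random_grouping(dim, k_candidates):
    """Randomly partition the D dimension indices into K equal groups,
    with K drawn from the candidate set (e.g. {2, 5, 10, 25, 50} for D=50)."""
    k = random.choice([k for k in k_candidates if dim % k == 0])  # split factor K
    idx = list(range(dim))
    random.shuffle(idx)            # random assignment of dimensions to groups
    s = dim // k                   # group size s, with D = K * s
    return [idx[i * s:(i + 1) * s] for i in range(k)]
```

Because the partition is redrawn every iteration, two interacting dimensions that were separated in one iteration have a fresh chance of landing in the same group in the next.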

The Information Exchange Mechanism Based on Crossover
Operator between Multispecies. In the top level, we adopt a crossover operator with a specific topology to enhance information exchange between species, in which each species can learn from its symbiotic partners in the neighborhood.
The key operations of this crossover procedure are described in Figure 2.
Step 1 (select elites for the best-performing list (BPL)). Individuals from the current species' neighborhood (i.e., a ring topology) with higher fitness have a larger probability of being selected into the best-performing list (BPL) as elites; the size of the BPL equals the current population size.
Step 2 (crossover operation). Step 2.1. Parents are selected from the BPL's elites using the tournament selection scheme: two elites are selected randomly and their fitness values are compared; the one with better fitness is taken as a parent. The other parent is selected in the same way.
Step 3 (update with different selection schemes). If the population size is M, then the number of replaced individuals is M * CR (CR is a selection rate). The greedy selection mechanism is adopted when replacing a selected individual. There are three replacement approaches: selecting the best M * CR individuals, a medium level of individuals, or the worst individuals. To maintain population diversity, we randomly select one replacement approach at every iteration. In summary, in order to facilitate the presentation and test formulation below, we define unified parameters for the HABC model in Table 1. According to the process description above, the flowchart of the HABC algorithm is summarized in Figure 3, while its pseudocode is presented in Pseudocode 1.
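The tournament selection and replacement logic of Steps 2 and 3 can be sketched as below. The text does not pin down the crossover operator itself, so uniform crossover is used here as an illustrative assumption, and all names are ours.

```python
import random

def tournament_parent(bpl, fitness):
    """Pick the better of two randomly chosen elites from the BPL."""
    a, b = random.sample(range(len(bpl)), 2)
    return bpl[a] if fitness(bpl[a]) < fitness(bpl[b]) else bpl[b]

def crossover_update(species, bpl, fitness, cr=1.0):
    """Replace a CR fraction of individuals with elite offspring (minimization).

    One of three replacement schemes (best / medium / worst slots) is chosen
    at random each call, and greedy selection keeps a child only if it improves
    the slot it targets.
    """
    n_replace = int(len(species) * cr)
    order = sorted(range(len(species)), key=lambda i: fitness(species[i]))
    scheme = random.choice(["best", "medium", "worst"])
    if scheme == "best":
        targets = order[:n_replace]
    elif scheme == "worst":
        targets = order[-n_replace:]
    else:
        start = max(0, len(order) // 2 - n_replace // 2)
        targets = order[start:start + n_replace]
    for i in targets:
        p1 = tournament_parent(bpl, fitness)
        p2 = tournament_parent(bpl, fitness)
        # uniform crossover: each gene taken from either parent (our assumption)
        child = [a if random.random() < 0.5 else b for a, b in zip(p1, p2)]
        if fitness(child) < fitness(species[i]):   # greedy selection
            species[i] = child
    return species
```

With CR = 1, as tuned in Section 4, every slot is a candidate for replacement, which matches the paper's finding that the most aggressive exchange rate performs best.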

Experimental Study
In the experimental studies, according to the no free lunch (NFL) theorem [40], a set of eight basic benchmark functions and twelve CEC2005 benchmark functions is employed to fully and fairly evaluate the performance of the HABC algorithm [27], as listed in the Appendix. The number of function evaluations (FEs) is adopted as the time measure criterion in place of the number of iterations.
In all the experiments in this section, the values of the common parameters used in each algorithm, such as population size and total generation number, are chosen to be the same. The population size is set to 50 and the maximum evaluation number to 100000. For the fifteen continuous test functions used in this paper, the dimension is set to 50.
All the control parameters of the EA and SI algorithms are set to the defaults of their original literature: the initialization conditions of CMA-ES are the same as in [43], and the number of offspring candidate solutions generated per time step is λ = 4; for ABC and CABC, the limit parameter is set to SN × D, where D is the dimension of the problem and SN is the number of employed bees. The split factor for CABC and CPSO is equal to the dimension [28, 41]. For canonical PSO and CPSO, the learning rates c1 and c2 are both set to 2.05 and the constriction factor χ = 0.729. For EGA, an intermediate crossover rate of 0.8, a Gaussian mutation rate of 0.01, and a global elite operation with a rate of 0.06 are adopted [39]. For the proposed HABC, the species number N, the split factor K, and the selection rate CR are tuned first in the next section.

Effects of Species Number N.
The species number of the top level in HABC needs to be tuned. Three 50D basic benchmark functions (f1-f3) and five 50D benchmark functions (f9-f13) are employed to investigate the impact of this parameter. CR is set to 1 and all functions are run for 30 sample times. As shown in Table 2, the proposed HABC obtains better optimal solutions and stability on the involved test functions as N increases. However, the performance improvement from this parameter is not very remarkable.

Effects of Selection Rate CR. Table 3 compares the performance of HABC variants with different crossover rates CR. All functions, with 30 dimensions, are run 30 times. From Table 3, we can observe that the HABC variant with CR = 1 performs best on four of the five functions, while CR = 0.05 gives the best result on one function. Therefore, the selection rate CR in each crossover operation is set to 1 as the optimal value in all following experiments.

Effects of Dynamically
Changing Group Size K. Obviously, the choice of the split factor K (i.e., the subpopulation number) has a significant impact on the performance of the proposed algorithm. In order to vary K during a run, we define K = {2, 5, 10, 25, 50} for 50D function optimization and let K randomly take one element of this set at each iteration; the HABC with dynamically changing K is then compared with the fixed-split-number variants on these benchmark functions over 30 runs, and the results are reported in Table 4, where the best results are shown in bold. Figures 4(a)-4(h) present the average convergence rates of each algorithm on each basic benchmark. On the unimodal basic benchmark functions (f1-f4), from Table 5 and Figure 4, HABC converges faster than all the other algorithms. HABC is able to consistently find the minimum of functions f1, f2, and f3 within 100000 FEs. Statistically, HABC has significantly superior performance on these unimodal functions. On the multimodal functions (f5-f8), HABC markedly outperforms the other algorithms in most cases. For example, HABC quickly finds the global minimum of functions f5 and f8, and CABC can also consistently find the minimum of f5 within relatively more FEs, while the other algorithms perform more poorly.

In order to further investigate the efficacy and robustness of the proposed HABC, the analysis of variance (ANOVA) test is employed to determine the statistical characteristics of each algorithm relative to the others. In this work, the graphical ANOVA analyses are presented through box plots, which capture many important aspects of a distribution. The box plots shown in Figures 5(a)-5(l) demonstrate the statistical performance of each involved algorithm on the classical test suite over 30 runs. From this box-plot representation, it is clearly visible that HABC achieves a good variance distribution of compromise solutions on all classical functions. Note that the CABC algorithm also exhibits robustness on most of the classical functions.

Algorithm Complexity Analysis.
A brief algorithm complexity analysis is presented as follows. If we assume that the computation cost of one individual in HABC is Cost_a and the cost of the crossover operator is Cost_c, then the total computation cost of HABC for one generation is N × K × M × Cost_a + N × Cost_c, with N species, K subpopulations, and M individuals per subpopulation. However, because the heuristic algorithms used in this paper cannot guarantee convergence, it is very difficult to give a concise time analysis for all algorithms. By directly measuring the runtime on different objective functions, the average computing time over 30 sample runs of all algorithms is given in Figure 6. From Figure 6, it is observed that HABC takes the most computing time among all compared algorithms, and its time growth rate is also the highest. This can be explained by the multipopulation cooperative coevolution strategy of HABC, which enhances local search ability at the cost of an increased amount of computation. In summary, compared with the other algorithms, HABC requires more computing time to achieve better results.

Multilevel Threshold for Image Segmentation by HABC

Entropy Criterion Based Fitness Measure.
The Otsu multithreshold entropy measure [45] has been popularly employed in determining whether an optimal threshold method can provide image segmentation with satisfactory results. Here, it is used as the objective function for the involved algorithms, and its process can be described as follows.
Let the gray levels of a given image range over [0, L − 1] and h(i) denote the occurrence of gray level i. Normalizing the histogram gives the probabilities p_i = h(i)/N, where N is the total number of pixels. For a single threshold t, the objective is to

Maximize σ_B²(t) = ω_0 ω_1 (μ_0 − μ_1)²,    (8)

where

ω_0 = Σ_{i=0}^{t−1} p_i,   ω_1 = Σ_{i=t}^{L−1} p_i,
μ_0 = (1/ω_0) Σ_{i=0}^{t−1} i p_i,   μ_1 = (1/ω_1) Σ_{i=t}^{L−1} i p_i,    (9)

and the optimal threshold is the gray level that maximizes (8). Expanding this logic to multilevel thresholding with thresholds t_1 < t_2 < ... < t_{M−1} partitioning the histogram into M classes, the objective becomes

σ_B²(t_1, ..., t_{M−1}) = Σ_{k=0}^{M−1} ω_k (μ_k − μ_T)²,    (12)

where ω_k and μ_k are the probability and mean of the kth class, μ_T is the total mean of the image, and M − 1 is the number of thresholds. Equation (12) is used as the objective function for the proposed HABC-based procedure, which is to be maximized. A close look at this equation shows that it is very similar to the expression for the uniformity measure.
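The multilevel between-class variance of (12) can be computed directly from a histogram. This is a sketch under our reading of the standard multilevel Otsu criterion; names are ours.

```python
def otsu_between_class_variance(hist, thresholds):
    """Between-class variance for thresholds t1 < ... < t_{M-1} over a
    grayscale histogram -- the quantity the optimizer maximizes."""
    total = sum(hist)
    probs = [h / total for h in hist]                     # p_i = h(i) / N
    mu_total = sum(i * p for i, p in enumerate(probs))    # total mean mu_T
    edges = [0] + list(thresholds) + [len(hist)]          # class boundaries
    var_b = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        w = sum(probs[lo:hi])                             # class probability omega_k
        if w == 0:
            continue                                      # empty class contributes nothing
        mu = sum(i * probs[i] for i in range(lo, hi)) / w # class mean mu_k
        var_b += w * (mu - mu_total) ** 2
    return var_b
```

On a bimodal histogram, a threshold placed in the valley between the two modes yields a strictly larger variance than one placed inside a mode, which is exactly the property the optimizer exploits.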

Probabilistic Rand Index as a Qualitative Measure.
Consider a set of manually segmented (ground truth) images {S_1, S_2, ..., S_K} corresponding to an image X = {x_1, x_2, ..., x_i, ..., x_N}, where the subscript indexes one of the N pixels. Let S_test be the segmentation to be compared with the manually labeled set, and denote the label of pixel x_i by l_i; each label can take values in a discrete set of segment indices. We model the label relationship for each pixel pair by an unknown underlying distribution. One may visualize this as a scenario in which each human segmenter provides information about the segmentation S_k of the image in the form of a binary indicator, equal to 1 when a pair of pixels (x_i, x_j) shares a label in S_k. The set of all perceptually correct segmentations defines a Bernoulli distribution over this indicator, giving a random variable with expected value denoted p_ij. Hence, the set {p_ij} for all unordered pairs (i, j) defines a generative model of correct segmentations for the image X.
Consider the Probabilistic Rand Index (PRI) [46]. Let c_ij denote the event of a pair of pixels x_i and x_j having the same label in the test image S_test. Then the PRI can be written as

PR(S_test, {S_k}) = (1 / C(N, 2)) Σ_{i<j} [c_ij p_ij + (1 − c_ij)(1 − p_ij)].

This measure takes values in [0, 1], where 0 means S_test and {S_1, S_2, ..., S_K} have no similarities (i.e., when S_test consists of a single cluster and each segmentation in {S_1, S_2, ..., S_K} consists only of clusters containing single points, or vice versa) and 1 means all segmentations are identical.
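The PRI above can be sketched directly from its definition: for every pixel pair, score the agreement between the test segmentation's same/different-label decision and the empirical probability of that decision across the human segmentations. Names are ours; the sketch assumes segmentations are given as flat label lists.

```python
from itertools import combinations

def probabilistic_rand_index(test_labels, ground_truths):
    """PRI of a test segmentation against a set of human segmentations,
    all given as flat per-pixel label lists of equal length."""
    n = len(test_labels)
    pairs = list(combinations(range(n), 2))
    total = 0.0
    for i, j in pairs:
        c = test_labels[i] == test_labels[j]  # c_ij in the test segmentation
        # p_ij: fraction of human segmenters who give i and j the same label
        p = sum(g[i] == g[j] for g in ground_truths) / len(ground_truths)
        total += p if c else (1.0 - p)
    return total / len(pairs)
```

A test segmentation identical to the single ground truth scores exactly 1, and any segmentation scores within [0, 1].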

Experiment Setup.
The experimental evaluations of segmentation performance by HABC are carried out on the Berkeley Segmentation Dataset (BSDS). The BSDS consists of 300 natural images, manually segmented by a number of different subjects. The ground-truth data for this large collection shows the diversity, yet high consistency, of human segmentation (available at http://www.eecs.berkeley.edu/Research/Projects/CS/vision/bsds/BSDS300/html/dataset/images.html). These datasets involve a suite of popular standard images [47][48][49], shown in Figure 7, which are used to evaluate all involved algorithms. The numbers of thresholds M − 1 investigated in the experiments are 2, 3, 4, 5, 7, and 9, and all experiments are repeated 30 times for each image and each M − 1 value. The population size is 20 and the maximum number of FEs is 2000. Figure 7 shows the original images and their histograms. The basic parameter settings of the algorithms, namely HABC, ABC, PSO, EGA, and CMA-ES, are the defaults given in Section 4.1. It is noteworthy that the split factor K of HABC should be adjusted according to the dimension of the image segmentation problem. In this experiment, for the 2-dimensional problem, K ⊂ {1, 2}; for the 3-dimensional problem, K ⊂ {1, 3}; for the 4-dimensional case, K ⊂ {1, 2, 4}; for the 5-dimensional case, K ⊂ {1, 5}; for the 7-dimensional case, K ⊂ {1, 7}; for the 9-dimensional case, K ⊂ {1, 3, 9}. In order to assess the segmentation quality, the Probabilistic Rand Index (PRI) is chosen as the qualitative measure.

Experimental Results of Multilevel Threshold
Case 1 (multilevel threshold results with M − 1 = 2, 3, 4). Table 7 presents the fitness function values, mean computation time, and corresponding optimal thresholds (with M − 1 = 2, 3, 4) obtained by the Otsu method. It is noteworthy that CPU time is also an important issue in real-time applications. From Table 8, there are no obvious differences in CPU time between the involved population-based methods, which are all significantly superior in terms of time complexity for high-dimensional image segmentation problems.
As can be seen from Table 9, the proposed HABC algorithm generally performs close to the Otsu method in terms of fitness value when M − 1 = 2, 3, 4, whereas the performance of HABC in time complexity is significantly superior to its Otsu counterpart. Furthermore, the HABC-based algorithm achieves the best performance among the population-based methods in most cases. This means that HABC can obtain an appropriate balance between exploration and exploitation. However, compared with the other population-based algorithms, the PRI results for low-dimensional segmentation obtained by HABC show no obvious enhancement. This can be explained as follows: HABC, endowed with the crossover operation, performs a better global search in higher-dimensional search spaces, and its hierarchical cooperation strategy emphasizes fine exploitation around the promising area. Moreover, the differences between HABC and the other algorithms become more evident as the segmentation level increases.
Case 2 (multilevel threshold results with M − 1 = 5, 7, 9). Regarding the high-dimensional segmentation problems with M − 1 = 5, 7, 9, Table 10 reports the average fitness value, the standard deviation, and the PRI obtained by each population-based algorithm, where larger fitness values, smaller standard deviations, and higher PRIs indicate better achievement.
From Table 10, owing to the crossover method and its fast convergence rate, HABC demonstrates the best performance in terms of efficiency and stability on the high-dimensional cases. Furthermore, as the level of segmentation increases, the fitness of HABC increases faster than that of the other methods; in particular, ABC achieved almost no improvement for 5, 7, and 9 levels of segmentation. From the experimental results in Tables 8 and 9, the PRI results for high-dimensional segmentation are significantly better than those for the low-dimensional cases. Meanwhile, HABC also achieves better statistical PRI results than its counterparts on most tested datasets. This can be explained by the fact that, based on the Otsu method, segmentation with more thresholds achieves greater consistency with human segmentation, and the HABC algorithm demonstrates powerful performance in searching high-dimensional spaces. Figures 8, 9, 10, 11, 12, and 13 show the original images, the multilevel threshold segmentations of those images, and the corresponding ground-truth human segmentations. From the results shown in Figures 8-13, it is clearly visible that the HABC-based method is well suited to such multilevel segmentation problems.

Conclusion
In this paper, we propose a novel hierarchical artificial bee colony algorithm, called HABC, to improve performance on complex problems. The main idea of HABC is to extend the single artificial bee colony (ABC) algorithm to a hierarchical and cooperative mode by combining a multipopulation vector-decomposing strategy with a comprehensive learning method. With the divide-and-conquer strategy, complex vectors can be decomposed into smaller components that are more easily solved. Through the crossover-based comprehensive learning, the information exchange between individuals is significantly enhanced. In addition, due to the lack of prior knowledge in most applications, the random grouping technique is employed as a suitable solution. The experimental results demonstrate that the proposed HABC achieves performance superior to the other classical powerful algorithms.
Finally, the HABC algorithm is applied to real-world image segmentation problems. The results obtained by the HABC-based method on each image indicate a significant improvement compared with several other popular population-based methods. In our future work, we aim at finding a simpler and more efficient optimization framework using HABC's merits and then applying it to more complex image processing (up to 30 dimensions) and computer vision problems.

Figure 2 :
Figure 2: The information exchange mechanism based on crossover operation.

Figure 6 :
Figure 6: Computing time of all algorithms on different benchmark problems.

Figure 7 :
Figure 7: Test images and their histograms.

Table 2 :
HABC's results on nine test functions with different swarm number N.

Table 3 :
Results of HABC on benchmark functions (f 1 -f 3 & f 9 -f 13 ) with different CRs. In bold are the best results.

Each case is run for 30 sample times. From the results listed in Table 4, we can observe that the performance is sensitive to the predefined K value. HABC, using a dynamically changing K value, consistently gave a better performance than the other variants except on f 2 and f 10 . Moreover, in most real-world problems we have no prior knowledge about the optimal s value, so the random grouping scheme is a suitable solution.
4.3. Comparing HABC with Other State-of-the-Art Algorithms on Benchmark Problems

4.3.1. Results on Basic Benchmark Continuous Functions. The means and standard deviations obtained by all involved algorithms on the 50-dimensional classical test suite over 30 runs are reported in Table 5.

Table 4 :
Performance of HABC on 50-D (f 1 -f 3 & f 9 -f 13 ) with the different grouping number .In bold are the best results.
4.3.2. Results on CEC2005 Continuous Functions. To validate the effectiveness of the proposed algorithm, a suite of CEC2005 benchmarks (f 9 -f 20 ) is employed [44]. According to Section 4.2, we adopt the tuned parameter setting of HABC, CR = 1, N = 10, and K drawn from the candidate set, in comparison with the CABC, CPSO, CMA-ES, ABC, PSO, and GA algorithms. From the experimental results in terms of means and standard deviations in Table 6, HABC outperformed CMA-ES on eight of the twelve functions. CMA-ES also outperformed CABC on most functions. HABC can find the global optimum of f 9 , f 11 , f 13 , f 17 , f 18 , and f 20 within 10000 FEs; this is due to the fact that HABC balances exploration and exploitation well. The overall ranking is HABC > CABC > ABC > CPSO > PSO > GA.

Table 5 :
Performance of all algorithms on basic benchmark functions f 1 -f 8 .

Table 6 :
Performance of all algorithms on CEC 2005 functions f 9 -f 20 .

Table 7 :
Objective values and thresholds by the Otsu method and their PRI.

Table 8 :
The mean CPU time of the compared population-based methods on Otsu algorithm.