The estimation of distribution algorithm (EDA) aims to explicitly model the probability distribution of high-quality solutions to the underlying problem. By iteratively filtering quality solutions from competing ones, the probability model eventually approximates the distribution of globally optimal solutions. In contrast to classic evolutionary algorithms (EAs), the EDA framework is flexible and able to handle inter-variable dependence, which usually imposes difficulties on classic EAs. The success of EDA relies on building the probability model effectively and efficiently. This paper enhances EDA with strategies from the adaptive memory programming (AMP) domain, which has produced several improved forms of EAs under the CyberEA framework. The experimental results on benchmark TSP instances support our anticipation that the AMP strategies can enhance the performance of classic EDA by deriving a better approximation of the true distribution of the target solutions.
Since the 1960s, the idea of bioinspired computation has created fruitful implementations of evolutionary algorithms (EAs). Among them, the genetic algorithm (GA) is one of the most remarkable notions that draws on Darwinian theory. Researchers have developed successful applications using GAs in domains where classic analytical and numerical methods are not suitable. A typical example is optimization problems with black-box evaluation of solution quality. It is common in real-world practice that no analytical expression is available for computing the solution quality; instead, a black-box procedure running a simulation model of the problem is applied to estimate the fitness of a trial solution. The use of the evolution metaphor and black-box estimation has promoted the popularity of the GA as a viable approach for tackling complex and unstructured problems.
Holland [
The estimation of distribution algorithm (EDA) intends to explicitly model the probability distribution of uniformly sampled solutions from those whose fitness value exceeds a threshold. Through elitist selection to form the next population, the fitness threshold becomes increasingly strict over generations until the elite individuals in the population cannot be further improved. EDA thus iteratively approximates the probability distribution of the global optima. The probability model can render the dependence between variables, and the linkage relationship is preserved during sampling on the model. The variable-dependence modeling is performed through the factorization technique. Mutually dependent variables are grouped as a factor. The solution is considered as a set of non-overlapping factors where each variable depends on the remaining variables contained in the same factor but is independent of variables belonging to the other factors. Then the conditional probabilities for each factor are computed, and their product forms the final aggregate probability model. The success of EDA relies on whether the probability model is learned efficiently and accurately by the algorithm.
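As a concrete illustration, the basic select-estimate-sample loop described above can be sketched with a univariate marginal model on binary strings (a UMDA-style simplification that ignores the variable-dependence factorization; all names and parameter values here are illustrative, not the paper's):

```python
import random

def univariate_eda(fitness, n_bits, pop_size=50, elite_frac=0.5, generations=100):
    """Minimal EDA loop: select elites, estimate bit-wise marginals, resample."""
    pop = [[random.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)              # greedy selection on fitness
        elites = pop[: int(pop_size * elite_frac)]
        # Estimate the marginal probability of a 1 at each position.
        p = [sum(x[i] for x in elites) / len(elites) for i in range(n_bits)]
        # Sample a new population from the learned model.
        pop = [[1 if random.random() < p[i] else 0 for i in range(n_bits)]
               for _ in range(pop_size)]
    return max(pop, key=fitness)

random.seed(0)
best = univariate_eda(fitness=sum, n_bits=20)            # OneMax: count the 1-bits
```

On a simple OneMax objective the marginals drift towards 1 within a few dozen generations; richer models (trees, Bayesian networks) replace the per-bit marginals in more capable EDAs.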
The original idea behind the EDA dates back to 1994 [
We construct our method primarily by drawing on notions from the domain of adaptive memory programming (AMP) [
The remainder of this paper is organized as follows. Section
The flow diagram of the EDA framework is illustrated in Figure
Flow diagram of the EDA framework.
As previously noted, there are three learning issues in previous EDA research. The first concerns selecting an appropriate probability model that fits the characteristics of the addressed problem. Modeling the variable (in)dependence relationships in a problem is a difficult task and can be an optimization process in itself. Grahl et al. [
The second issue relates to effective methods for learning the parameters of the adopted probability model. Methods for constructing the probability model can be divided into nonparametric and parametric approaches. Nonparametric approaches are adopted for combinatorial optimization problems. These approaches, such as the probability vector, the marginal histogram model, tree structures, directed acyclic graphs, and Bayesian networks, take the relative frequency of each bit value stored in a frequency table as the probability parameters. The complexity of computation and memory storage depends on the cardinality of each variable factor and on the learning schemes themselves. For example, PBIL employs competitive learning to shift the probability vector towards the highly evaluated sampled solutions. The BOA uses previously built Bayesian networks to predict the next network structure. On the other hand, parametric methods are adopted for modeling continuous optimization problems, and these methods usually rely on variants of the normal probability density function (pdf). Extensions of some discrete models have been developed for this purpose. PBILc [
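For instance, the PBIL-style competitive shift of the probability vector mentioned above amounts to an exponential moving average towards the best sampled solution (a sketch; the learning-rate value is illustrative):

```python
def pbil_update(p, best_sample, lr=0.1):
    """Shift each probability towards the corresponding bit of the best sample."""
    return [(1 - lr) * pi + lr * bi for pi, bi in zip(p, best_sample)]

p = pbil_update([0.5, 0.5, 0.5], [1, 0, 1])   # approximately [0.55, 0.45, 0.55]
```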
Finally, the third issue of EDA research observes that the performance of EDA algorithms can be significantly improved if they are hybridized with appropriate search heuristics from other domains. Handa [
Glover [
In addition to responsive strategies, another feature of AMP is the use of adaptive memory mechanisms, which are further classified as short-term and long-term memory. The short-term memory stores values found in the recent status, such as the current population and the forbidden solutions in the tabu list. The long-term memory monitors status changes over a longer duration. Previous research has suggested, for example, tracking the frequency of dominant variable values, the reference solutions with both quality and diversity properties, and the duration over which the best solution has not improved. The reference set stores long-term elite solutions subject to a threshold on the minimal mutual distance between these elite solutions. The reference set has a fixed size; the worst solution is replaced whenever a new solution of better quality satisfies the distance threshold.
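The reference set update outlined above might look as follows (a sketch; the fitness and distance functions and the threshold values are placeholders, not the paper's definitions):

```python
def update_refset(refset, candidate, fitness, distance, max_size, min_dist):
    """Admit a candidate only if it respects the mutual-distance threshold;
    evict the worst member when the fixed size is exceeded."""
    if any(distance(candidate, r) < min_dist for r in refset):
        return list(refset)                  # too close to an existing elite
    refset = refset + [candidate]
    if len(refset) > max_size:
        refset.sort(key=fitness, reverse=True)
        refset = refset[:max_size]           # drop the worst solution
    return refset

hamming = lambda a, b: sum(x != y for x, y in zip(a, b))
rs = update_refset([[1, 1, 0], [0, 0, 1]], [1, 1, 1],
                   fitness=sum, distance=hamming, max_size=2, min_dist=1)
```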
In contrast to the design conception of previous EDA variants, our proposed method enhances EDA from a system point of view. Recall the flow diagram of the EDA framework (Figure
The CyberEDA strategies for enhancing EDA components.
EDA component  CyberEDA strategy

Selection  Diversification generation; Improvement method

Learning  Thresholding with short-term memory; Incremental learning with long-term memory; Probability model rebuilding with long-term memory

Sampling  Weighting sampling scheme

Replacement  Generational population update with short-term memory; Restart population rebuilding with long-term memory; Partial population rebuilding with long-term memory
In the original SS/PR template, the diversification generation method is used for constructing the initial or rebuilt populations and for updating the reference set. The aim is to preserve contrasting solutions in the hope of producing promising solutions that are unobtainable through greedy selection. Analogously, we apply the diversification generation method in the EDA selection component to obtain
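One common way to realize such a diversification generation method is a greedy max-min distance construction over a pool of raw candidates (a sketch; the candidate pool and distance function are placeholders):

```python
def diversify(candidates, distance, size):
    """Greedily pick candidates maximizing the minimal distance to those chosen."""
    chosen = [candidates[0]]
    while len(chosen) < size:
        chosen.append(max(candidates,
                          key=lambda c: min(distance(c, s) for s in chosen)))
    return chosen

picked = diversify([0, 1, 5, 6, 10], distance=lambda a, b: abs(a - b), size=3)
```

Each pick maximizes the distance to the nearest already-chosen solution, so the result spreads over the candidate space rather than clustering around the best fitness.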
The improvement method is a local search heuristic that has been included as a mandatory component in several effective metaheuristic algorithms such as Scatter Search, GRASP, and memetic algorithms. The improvement method pushes trial solutions towards nearby local optima, and the schemata contained in the local optima are helpful for constructing promising solutions. We anticipate that the improvement method can also enhance the performance of EDA. The learned probability model is inevitably biased due to the limited population size and the sampling process. The classic EDA intends to reduce this bias through successive generations of greedy competition among individuals; however, this may be a lengthy process. The improvement method can expedite the competition by starting it from a population of diverse local optima and by maintaining a reference set that contains quality, diverse solutions. Consequently, in the proposed CyberEDA, the improvement method is performed at two places. First, it is executed right after the diversification generation method, so the members of the initial or rebuilt populations are drawn towards local optima. Second, it is applied to the new member(s) of the reference set during the reference set update. The reference solutions are used for partial population rebuilding upon the critical event, as will be noted. There are several effective combinatorial improvement methods in the AMP literature, such as the insertion method, n-opt, ejection chains, and path relinking, to name a few. These methods can be used separately as a single improvement method, or they can be combined into a compound one with a more complex neighborhood structure.
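As one concrete instance, a basic 2-opt descent (a member of the n-opt family cited above) can serve as the improvement method for TSP tours (a sketch; the distance-matrix representation is illustrative):

```python
def two_opt(tour, dist):
    """Reverse tour segments as long as a reversal shortens the tour (local search)."""
    n = len(tour)
    improved = True
    while improved:
        improved = False
        for i in range(n - 1):
            # Skip j = n-1 when i = 0: those two edges share city tour[0].
            for j in range(i + 2, n - (1 if i == 0 else 0)):
                a, b = tour[i], tour[i + 1]
                c, d = tour[j], tour[(j + 1) % n]
                if dist[a][c] + dist[b][d] < dist[a][b] + dist[c][d]:
                    tour[i + 1 : j + 1] = reversed(tour[i + 1 : j + 1])
                    improved = True
    return tour
```

Starting from a crossed tour of four cities on a unit square, the descent removes the crossing and reaches the optimal perimeter tour.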
The probability model learning component in CyberEDA makes use of memory manipulations as contemplated in the AMP domain. The CyberEDA has two types of memory structures. The short-term memory tallies the probability distribution describing the new population. The long-term memory keeps records of both the probability model learned through successive populations and the AMP parameters used to guide the evolution between successive restarts. The two types of memory manipulations are described as follows.
To reduce the sampling bias incurred by the limited size of the population and the sampling process, we propose a thresholding technique to manipulate the short-term memory, which records the frequencies of the solutions selected from the current population. The thresholding technique restrains the highest frequency by an adaptive threshold and uniformly distributes the extra counts to the remaining frequency bins. As a result, the solution with the highest frequency does not dominate the probability model, and possible oversampling of this solution is avoided. The adaptive threshold is tuned according to the number of executed function evaluations, which indicates the completion percentage of the planned evolution. Let
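The thresholding manipulation can be sketched as follows (the cap value is illustrative; in the paper it is adapted to the fraction of the evaluation budget already spent):

```python
def apply_threshold(freq, cap):
    """Cap the largest frequency and spread the excess uniformly over the other bins."""
    k = max(range(len(freq)), key=freq.__getitem__)
    excess = freq[k] - cap
    if excess <= 0:
        return list(freq)
    share = excess / (len(freq) - 1)
    return [cap if i == k else f + share for i, f in enumerate(freq)]

freq = apply_threshold([10, 2, 2, 2], cap=4)   # total count is preserved
```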
In classic EDA, the experience obtained through evolution is retained in two memory structures: the sample population and the probability model. These two structures are interrelated and are updated along the evolution. In the machine learning domain, incremental learning suggests targeting the objective function by combining immediate experience with previous experience. The previous experience is discounted in favor of the latest experience from the most immediate observations. The incremental learning technique has proved effective in many applications [
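In equation form, the incremental update can be written as a convex combination of the accumulated model and the frequencies observed in the current selected population (the notation here is ours, not the paper's):

```latex
% \hat{p}^{(t)}: model after generation t; \tilde{p}^{(t)}: frequencies tallied from
% the current selected population; \lambda \in (0,1): learning rate that discounts
% previous experience in favor of the most immediate observations
\hat{p}^{(t)} = (1-\lambda)\,\hat{p}^{(t-1)} + \lambda\,\tilde{p}^{(t)}
```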
Similar to the tuning of the adaptive threshold, we use a decreasing value for
One of the prominent search strategies in the AMP domain is the restart strategy with long-term memory. A challenging issue common to all evolutionary algorithms is premature convergence, in which the best, yet unacceptable, solution found so far has not improved for a large number of function evaluations. By referring to long-term memory, the restart strategy aims to guide the search towards uncharted space. The restart strategy reinitiates a new search session by rebuilding the reference solutions, and the new session may use a different set of search scheme parameters than the previous one. The rebuilt solutions must have contrasting features not contained in previously found solutions, and the new parameter values can create a different neighborhood concept for the new search session. The general concept of the restart search strategy is shown in Figure
General concept of the AMP restart search strategy.
In the context of EDA, restart means probability model rebuilding. Its necessity emerges from a critical event in which none of the newly sampled solutions can replace any solution in the population for a sufficiently large number of generations. The critical event is detected by referring to the long-term memory, which keeps a record of the population replacement status along generations. The restart strategy for probability model rebuilding works in two aspects. First, the population is rebuilt using new solutions that possess mutual diversity. This process is referred to as population rebuilding and will be articulated in the description of the CyberEDA replacement component. Second, the system parameters can be adaptively tuned according to the status of the long-term memory. These parameters include either or both the probability model parameters (i.e., probabilities) and the search strategy parameters. The probabilities can be either retained or reset and learned from scratch in the new restart session. In the latter case, we need not worry about performance deterioration due to abandoning the probabilities, because the elite solutions found in the search history have been tallied in the reference set. On the contrary, reconstructing the probability model may be beneficial for finding potential solutions that were overlooked in previous search trajectories. As indicated in Figure
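A skeleton of this restart logic, with the long-term memory reduced to a stagnation counter, might read as follows (all names are placeholders, not the paper's code):

```python
def run_with_restarts(step, rebuild, stagnation_limit, budget):
    """Advance the search and rebuild its state after prolonged stagnation.

    `step` runs one generation and returns the best cost it observed;
    `rebuild` resets the population and, optionally, the probability model.
    """
    best, stalled = float("inf"), 0
    for _ in range(budget):
        cost = step()
        if cost < best:
            best, stalled = cost, 0
        else:
            stalled += 1                      # long-term memory: stagnation counter
        if stalled >= stagnation_limit:       # critical event detected
            rebuild()
            stalled = 0
    return best
```

Because the elite solutions survive in the reference set, `rebuild` may safely discard the probabilities and relearn them from scratch.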
Most EDA algorithms perform probabilistic sampling according to the learned probability model to generate new solution samples, and every sample contributes equally to the update of the probability model. We consider a more aggressive form of sampling in the CyberEDA, which we call the weighting sampling scheme. Inspired by the success of the Cyber Swarm Algorithm [
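One plausible reading of the weighting sampling scheme is to accumulate model frequencies with fitness-derived weights instead of unit counts (a sketch; the exact weighting used in the paper may differ):

```python
def weighted_frequencies(samples, weights):
    """Accumulate marginal frequencies with per-sample weights instead of unit counts."""
    total = sum(weights)
    n_vars = len(samples[0])
    return [sum(w * s[i] for s, w in zip(samples, weights)) / total
            for i in range(n_vars)]

p = weighted_frequencies([[1, 0], [0, 1]], weights=[3, 1])   # -> [0.75, 0.25]
```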
The CyberEDA applies the generational population update in the same way as most EDA algorithms. The sampled solutions obtained from the sampling component compete for survival with all solutions in the current population. The competition is a greedy selection on fitness, such that the best solutions form the next population. The generational population update is essential for continually raising the lower bound on the fitness of the solutions in the current population; hence the estimated distribution converges towards that of the global optima. As the sampled solutions are temporarily stored in the short-term memory, they are removed after the replacement component completes unless they satisfy the quality and diversity test for updating the reference set.
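This generational update reduces to truncation selection over the union of the current population and the new samples (a sketch, with fitness maximized for simplicity):

```python
def generational_update(population, samples, fitness, pop_size):
    """Greedy survival: keep the best `pop_size` of (population + new samples)."""
    merged = sorted(population + samples, key=fitness, reverse=True)
    return merged[:pop_size]

next_pop = generational_update([[1, 0], [0, 0]], [[1, 1]], fitness=sum, pop_size=2)
```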
In contrast to the generational population update, which is conducted at every generation using the sampled solutions stored in the short-term memory, population rebuilding is activated upon critical events detected by reference to the long-term memory. The CyberEDA facilitates two types of population rebuilding, namely, restart population rebuilding and partial population rebuilding, as follows. First, restart population rebuilding is part of the noted restart strategy, which reinitiates a new search session. It is activated at the critical event when the previous search session stagnates in improving the best solution for
Second, the partial population rebuilding is activated at a less critical event in which the best solution has not improved for
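The partial rebuilding step could be sketched as replacing the worst fraction of the population with reference-set solutions (the fraction and the random draw are assumptions for illustration):

```python
import random

def partial_rebuild(population, refset, fitness, frac=0.5):
    """Replace the worst `frac` of the population with reference-set solutions."""
    survivors = sorted(population, key=fitness, reverse=True)
    n_replace = int(len(population) * frac)
    kept = survivors[: len(population) - n_replace]
    injected = random.sample(refset, min(n_replace, len(refset)))
    return kept + injected

pop = partial_rebuild([[0, 0], [1, 0], [0, 1], [1, 1]],
                      refset=[[1, 1], [0, 1]], fitness=sum)
```

Unlike a full restart, the surviving half of the population retains the current session's progress while the injected reference solutions restore diversity and quality.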
To evaluate the performance of the proposed CyberEDA algorithm, we have chosen the well-known traveling salesman problem (TSP) as the optimization benchmark. Twenty problem instances covering problem sizes from 48 to 400 cities are selected from the TSPLIB. The platform for conducting the experiments is a personal computer equipped with a 2.26 GHz CPU and 4.0 GB RAM. All programs are coded in the C# language. We have determined the parameter values for all the compared algorithms through experimental design: for each parameter, several typical values are tested on a subset of the dataset, and the value leading to the best result is adopted in subsequent experiments. In particular, for the parameter setting of all the compared EDA-based algorithms, the marginal histogram model with bivariate interactions is employed as the probability model; that is, the joint probability of any two cities observed in succession within the visiting route is estimated. The population size
As for the time complexity of all the compared EDA variants, most researchers in the EDA domain use the number of solution fitness evaluations as the measure of time complexity, because the fitness evaluation is often the most time-consuming component of the designed algorithms. We adopted the same approach in our experiments and fixed the number of fitness evaluations to 200,000 for all the compared EDA variants. By fixing this number, the effectiveness of each competing algorithm in obtaining the global optimal solutions can be observed. As for the space complexity, the CyberEDA uses an extra long-term memory structure, the reference set, which preserves the information diversity of the population and the probability model. The size of the reference set is negligible compared with the population size, which varies from 48 to 400 depending on the problem instance.
In this section we present the comparative performance of the classic EDA and the CyberEDA. Because the CyberEDA is coupled with various AMP strategies, we have conducted preliminary experiments to identify the principal strategies which significantly affect its performance. We found that the diversification generation, improvement method, incremental learning, and weighting sampling scheme are the principal strategies and, when employed simultaneously, can improve the performance of the classic EDA by at least 30% over all of the test problem instances. We adopt these principal strategies as the defaults of our baseline CyberEDA, which we denote as
All the compared EDA variants are stochastic methods: each run of the same algorithm may produce different results depending on the random values generated. To conduct a fair performance comparison, each compared algorithm is executed 30 times on every problem instance with different initial random seeds, and the mean and the standard deviation of the TSP tour cost calculated over these runs are used as performance indices. The mean cost reflects the quality of the solutions obtained by an algorithm, and the standard deviation indicates the stability with which the algorithm obtains such solutions. The experimental results on the 20 TSP problem instances are listed in Table
The mean and the standard deviation (Std) for the cost of the obtained tour.
Problem  EDA (Mean, Std)  (Mean, Std)  (Mean, Std)  (Mean, Std)  CyberEDA (Mean, Std)
att48  50532  4271  35038  672  35637  623  34846  675  35046  622
berlin52  9732  816  8098  202  8167  211  8115  221  8132  225
eil51  658  20  449  8  452  8  449  9  448  7
eil76  1173  17  605  11  609  18  607  25  601  9
pr76  250519  4768  122428  8785  121878  3295  120713  4832  119689  2712
st70  1559  38  767  21  750  11  766  19  754  19
kroA100  75500  1327  39739  1691  37008  1786  37251  1453  35487  1244
kroB100  73802  1975  39947  1550  37401  1151  37734  1852  36923  982 
kroC100  74348  1416  38927  1453  37210  1331  37266  1526  37236  1152 
lin105  53509  1499  25150  1676  23509  1149  24103  2360  22939  1613 
pr124  306891  6291  159876  4234  161007  2992  177010  7723  158744  3098 
pr144  383242  5754  197845  5023  192221  4601  207098  10282  192230  5584 
pr152  483534  11007  238234  5873  239714  5537  257858  14711  238367  5983 
rat195  11538  172  6945  154  6898  90  6941  203  6877  143 
kroA200  170498  1531  95000  1552  94595  1560  93986  1408  94286  1345 
kroB200  166499  4315  94211  1691  93910  1388  93975  1492  93944  1371 
pr226  837048  11014  436945  10364  436573  10587  437816  13274  436872  10477 
tsp225  22220  320  13543  278  13091  230  13429  171  13043  245 
lin318  331904  3233  228198  2588  227380  3391  227597  3222  227394  3024 
rd400  122917  1105  88641  1242  88727  3975  88361  970  87380  1012 
In this section we present the epoch analysis for the critical events observed during the simulation runs. Recall that the CyberEDA uses two distinct levels of critical event to trigger the restart population rebuilding and the partial population rebuilding. The two distinct levels indicate two different lengths of stagnation in improving the best solution. The restart population rebuilding is designed for escaping the longer stagnation by creating a new random population with diversification control. The epoch analysis is shown in Figure
Epoch analysis of the restart population rebuilding.
On the other hand, the partial population rebuilding is designed to make the current search session more effective by replacing part of the current population with diverse and quality solutions produced from the reference set, a novel notion in the SS/PR template. The partial population rebuilding is activated at a less critical event than that used by the restart population rebuilding. Figure
Epoch analysis of the partial population rebuilding.
We further conduct a statistical analysis using 95% confidence intervals. Thirty runs are executed for the CyberEDA algorithm. Figure
Convergence analysis with 95% confidence interval.
Investigating variable (in)dependence relationships through evolutionary successions is critical for preserving important schemata against disruptive genetic operations. Estimation of distribution algorithms (EDAs) constitute a promising domain that learns probability models of variable (in)dependence. Our CyberEDA algorithm draws on strategies from adaptive memory programming (AMP) to improve the performance of each EDA component. Notable strategies, namely, diversification generation, improvement method, thresholding, incremental learning, the weighting sampling scheme, probability model rebuilding, and population rebuilding, were exploited in this paper. The experimental results on benchmark TSP instances support our anticipation that the AMP strategies can enhance the performance of classic EDA by deriving a better approximation of the true distribution of the target solutions. This study reveals a promising research direction: developing a more effective EDA by combining various AMP strategies that adaptively respond to different scenarios (such as the problem size and the landscape profile) of the optimization problems.
This research is partially supported by the National Science Council of ROC under Grant NSC 98-2410-H-260-018-MY3.