Adaptive operator selection (AOS) and adaptive parameter control are widely used to enhance the search power of many multiobjective evolutionary algorithms. This paper proposes a novel bandit-based adaptive selection strategy for the multiobjective evolutionary algorithm based on decomposition (MOEA/D), named latest stored information based adaptive selection (LSIAS). The strategy adopts an improved upper confidence bound (UCB) method, in which operator usage rates and the abandonment of extreme fitness improvements are introduced to improve the performance of UCB. A sliding window stores recent valuable information about the operators, such as factors, probabilities, and efficiency. Four commonly used DE operators are chosen for the AOS, and two kinds of assist information on operators are selected to improve the operators' search power. The operator information is updated with the help of LSIAS, and the resulting algorithmic combination is called MOEA/D-LSIAS. Compared to several well-known MOEA/D variants, LSIAS demonstrates superior robustness and fast convergence on various multiobjective optimization problems. The comparative experiments also demonstrate the improved search power of operators with different assist information on different problems.
Multiobjective optimization, which concerns optimizing problems with multiple and often conflicting objectives, is a common task faced by scientists and engineers. In principle, there is no single best solution to a multiobjective optimization problem (MOP), but rather a set of Pareto-optimal solutions. This paper considers the following continuous MOP:
Over the past decades, a number of multiobjective evolutionary algorithms (MOEAs) have been proposed. The first MOEA, the vector evaluated genetic algorithm, has been applied to MOPs since the 1980s [
A major difficulty for most MOEAs is how to improve search efficiency, that is, how to design operators with a higher probability of searching high-dimensional spaces effectively. There are two main approaches: enhancing an operator with adaptive parameter control, and using multiple operators with adaptive operator selection (AOS). The simulated binary crossover (SBX) [
The major intention of adaptive parameter control is to address an essential problem known as the exploration versus exploitation (EvE) dilemma. Exploitation means searching the local space deeply, that is, making full use of the current operator with its current parameters. Exploration, by contrast, means searching unfamiliar areas, realized either by other operators or by configuring the current operator with different parameters. In short, the EvE dilemma can be viewed as seeking a trade-off between search power in unfamiliar and familiar areas.
The EvE dilemma is of great significance to the existence of AOS methods. It has been intensively studied in the game theory community in the context of the multi-armed bandit (MAB) problem, which was first proposed in 1952 [
Whether enhancing an operator with adaptive parameter control or using multiple operators with AOS, statistical data about the operators is undoubtedly valuable. Inspired by the sliding time window, we present a novel adaptive method that stores information about the operators, called the latest stored information based adaptive selection (LSIAS) strategy. The stored information includes operator names, operator efficacies, and operator parameters such as neighborhood sizes and scaling factors (these parameters will be referred to as assist information on operators for convenience). In this paper, two kinds of assist information are taken into account, and they are used within MOEA/D with dynamical resource allocation (MOEA/D-DRA) [
The rest of this paper is organized as follows. The background and some works regarding the AOS and the adaptive parameter control are described in Section
MOEA/D decomposes an MOP into a series of single-objective subproblems, which makes it well suited to evaluating the performance of operators. There are three common decomposition methods: the weighted sum approach, the Tchebycheff approach, and the boundary intersection method, which are all described in [
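As an illustration, the weighted sum and Tchebycheff aggregations can be sketched in a few lines (a minimal sketch; the function and variable names are ours, not from the paper):

```python
def tchebycheff(f, weights, z_star):
    """Tchebycheff aggregation: max_i w_i * |f_i(x) - z_i*|.

    f       -- objective values of one solution
    weights -- weight vector of one subproblem
    z_star  -- ideal point (component-wise best objective values found)
    """
    return max(w * abs(fi - zi) for w, fi, zi in zip(weights, f, z_star))

def weighted_sum(f, weights):
    """Weighted-sum aggregation: sum_i w_i * f_i(x)."""
    return sum(w * fi for w, fi in zip(weights, f))
```

Minimizing either scalar value for a given weight vector drives the corresponding subproblem toward the Pareto front.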
Differential evolution (DE) is a parallel direct search method, and many different mutation strategies for DE exist. Here, several frequently used DE operators [

DE/rand/1: v_i = x_{r1} + F (x_{r2} - x_{r3})

DE/rand/2: v_i = x_{r1} + F (x_{r2} - x_{r3}) + F (x_{r4} - x_{r5})

DE/target-to-rand/1: v_i = x_i + F (x_{r1} - x_i) + F (x_{r2} - x_{r3})

DE/target-to-rand/2: v_i = x_i + F (x_{r1} - x_i) + F (x_{r2} - x_{r3}) + F (x_{r4} - x_{r5})

DE/best/1: v_i = x_{best} + F (x_{r1} - x_{r2})

DE/best/2: v_i = x_{best} + F (x_{r1} - x_{r2}) + F (x_{r3} - x_{r4})

DE/target-to-best/1: v_i = x_i + F (x_{best} - x_i) + F (x_{r1} - x_{r2})

where r1, r2, r3, r4, and r5 are mutually distinct indices randomly selected from the population and different from i, x_{best} is the best individual in the current population, and F is the scaling factor.
After the differential evolution, the crossover operation and the polynomial mutation are usually applied to further perturb the mutant vector
The polynomial mutation provides a deep mutation for the resulting offspring vector.
The description above is a common DE mutation process. It has been validated that these DE operators possess different search powers. "DE/rand/1" and "DE/rand/2" show strong exploration ability. "DE/target-to-rand/1," "DE/best/1," and "DE/best/2" exhibit strong exploitation ability and are useful for unimodal problems. However, "DE/target-to-rand/1" is more suitable for rotated problems than the other DE operators [
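A few of the mutation strategies above, together with the polynomial mutation, can be sketched as follows (a minimal illustration using plain Python lists; the polynomial mutation shown is a simplified form that omits the distance-to-boundary term of the full operator):

```python
import random

def de_rand_1(pop, F):
    """DE/rand/1: v = x_r1 + F * (x_r2 - x_r3)."""
    r1, r2, r3 = random.sample(range(len(pop)), 3)
    return [pop[r1][j] + F * (pop[r2][j] - pop[r3][j])
            for j in range(len(pop[r1]))]

def de_rand_2(pop, F):
    """DE/rand/2: v = x_r1 + F*(x_r2 - x_r3) + F*(x_r4 - x_r5)."""
    r1, r2, r3, r4, r5 = random.sample(range(len(pop)), 5)
    return [pop[r1][j] + F * (pop[r2][j] - pop[r3][j])
            + F * (pop[r4][j] - pop[r5][j])
            for j in range(len(pop[r1]))]

def de_target_to_best_1(pop, i, best, F):
    """DE/target-to-best/1: v = x_i + F*(x_best - x_i) + F*(x_r1 - x_r2)."""
    r1, r2 = random.sample([k for k in range(len(pop)) if k != i], 2)
    return [pop[i][j] + F * (best[j] - pop[i][j])
            + F * (pop[r1][j] - pop[r2][j])
            for j in range(len(pop[i]))]

def polynomial_mutation(x, lower, upper, eta_m=20.0, pm=None):
    """Simplified polynomial mutation: each variable is perturbed with
    probability pm by a polynomially distributed delta."""
    if pm is None:
        pm = 1.0 / len(x)          # common default: ~one variable mutated
    y = list(x)
    for j in range(len(x)):
        if random.random() < pm:
            u = random.random()
            if u < 0.5:
                delta = (2.0 * u) ** (1.0 / (eta_m + 1.0)) - 1.0
            else:
                delta = 1.0 - (2.0 * (1.0 - u)) ** (1.0 / (eta_m + 1.0))
            y[j] = y[j] + delta * (upper[j] - lower[j])
            y[j] = min(max(y[j], lower[j]), upper[j])   # clip to bounds
    return y
```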
Each operator possesses different search power. Although self-adaptive parameter adjustment improves an operator's search power, it is limited by that operator's best possible performance. AOS offers stronger search power by choosing different operators in different situations.
There are mainly two kinds of AOS methods: probability-based methods, such as probability matching (PM) [
Two credit assignment methods are often used. One evaluates dynamic statistical information about operators; the other evaluates search power using various complex statistics to detect outlier production. The former takes recent assist information on operators into account, employing some of it as rewards that decide the credit assignment of operators [
Assume that there are
Each operator receives a different credit after credit assignment, which is the key to operator selection. As (
Besides the two methods mentioned above, there are other kinds of adaptive operator selection methods, such as gradient-based methods and multiple-trial-vector-comparison-based methods. Schütze et al. [
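As a concrete illustration of the probability-based family, a minimal probability matching rule might look like this (the minimum-probability formulation follows the standard PM scheme; names and defaults are illustrative, not taken from this paper):

```python
import random

def probability_matching(quality, p_min=0.05):
    """Probability matching: map operator quality estimates to selection
    probabilities, keeping a minimum probability p_min for every operator
    so that no operator is ever starved of trials."""
    k = len(quality)
    total = sum(quality)
    if total == 0:
        return [1.0 / k] * k    # no information yet: select uniformly
    return [p_min + (1.0 - k * p_min) * q / total for q in quality]

def roulette_select(probs):
    """Roulette-wheel selection of an index according to probs."""
    r, acc = random.random(), 0.0
    for i, p in enumerate(probs):
        acc += p
        if r <= acc:
            return i
    return len(probs) - 1
```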
In this section, we present an improved bandit-based method for MOPs, named the latest stored information based adaptive selection strategy. This method pays particular attention to the dynamic nature of AOS. It is mainly composed of two parts: credit assignment and operator selection.
Credit assignment involves two main tasks: calculating the credit value of applied operators, and assigning the credit fairly.
For the first task, the fitness change is adopted as the reward of a successfully applied operator, referred to as the Fitness Improvement Rate (FIR). Because the convergence levels of individuals differ greatly across search stages, the FIR is normalized as follows:

FIR_i = (pf_i - cf_i) / pf_i,

where pf_i is the fitness value of the parent and cf_i is the fitness value of the offspring.
A sliding window with length
Supposing that the operator number is
An illustration of the sliding window operating in FIFO fashion.
An illustration of the index of FIR in credit value computation.
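The FIFO sliding window and the trimmed credit computation described in this section can be sketched as follows (an illustrative sketch; the paper's exact per-configuration bookkeeping may differ):

```python
from collections import deque

class CreditWindow:
    """FIFO sliding window of (operator_id, fir) records. The credit of an
    operator is the sum of its rewards after discarding the most extreme
    fraction of best and worst rewards in the window."""

    def __init__(self, length):
        self.window = deque(maxlen=length)   # oldest records fall out first

    def record(self, op, fir):
        self.window.append((op, fir))

    def credits(self, n_ops, trim=0.05):
        rewards = sorted(self.window, key=lambda r: r[1])
        cut = int(len(rewards) * trim)       # number trimmed at each end
        kept = rewards[cut:len(rewards) - cut] if cut else rewards
        credit = [0.0] * n_ops
        for op, fir in kept:
            credit[op] += fir
        return credit
```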
Since rewards guide the operator selection methods, occasional unexpectedly extreme fitness improvement values produced by operators can distort the selection. To guard against this, we discard the best 5% and the worst 5% of the rewards, and the remaining rewards are denoted as
Based on the credit assignment described above, each configuration of operator and parameters gets its FRR, and the best configuration is then selected. Following MAB algorithms, the selection of the operator and parameters is defined in the UCB style as

op_t = argmax_i ( FRR_{i,t} + C sqrt( 2 ln(sum_j n_{j,t}) / n_{i,t} ) ),

where FRR_{i,t} is the credit of configuration i, n_{i,t} is the number of times configuration i has been applied, and C controls the exploration-exploitation trade-off.
As (
The pseudocode of operator and parameter selection is given in Algorithm
Initialize the latest stored information sliding window; then repeatedly compute the credit of each stored configuration and apply the improved UCB rule to select the operator together with its parameters.
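A classic FRRMAB-style UCB selection rule, which LSIAS builds upon, can be sketched as follows (the scaling constant C and the handling of unused operators are common conventions, not details from this paper; the improved UCB of LSIAS additionally incorporates operator usage rates):

```python
import math

def ucb_select(frr, n_uses, c=5.0):
    """FRRMAB-style UCB selection: balance exploitation (frr, the credit of
    each operator) against exploration (operators applied less often receive
    a larger confidence bonus). Unused operators are tried first."""
    for i, n in enumerate(n_uses):
        if n == 0:
            return i                       # force at least one trial each
    total = sum(n_uses)
    scores = [frr[i] + c * math.sqrt(2.0 * math.log(total) / n_uses[i])
              for i in range(len(frr))]
    return scores.index(max(scores))
```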
MOEA/D provides a decomposition method that turns an MOP into a series of single-objective subproblems. These subproblems give an exact value with which to measure the performance of every operator. Therefore, the main reason we choose MOEA/D is that metrics for evaluating operators are easy to obtain.
Since MOEA/D was presented in 2007, a great deal of research has been carried out to improve its performance and extend its range of application, and any improvement on MOEA/D can be of practical interest. Here we choose a famous improved version of MOEA/D as the framework, MOEA/D-DRA, the champion of the CEC 2009 MOEA contest, and investigate how to enhance it with LSIAS.
In this paper, we use the Tchebycheff approach with objective normalization instead of the classic Tchebycheff approach. It is known that objective normalization performs better than the classic method, especially when the objective space becomes more complex.
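A normalized Tchebycheff aggregation can be sketched as follows (a minimal sketch assuming each objective's range is estimated from the current ideal and nadir points; the epsilon guard against a zero range is our addition):

```python
def normalized_tchebycheff(f, weights, z_ideal, z_nadir, eps=1e-12):
    """Tchebycheff aggregation with objective normalization: each objective
    is rescaled by its current range [z_ideal_i, z_nadir_i] before
    aggregation, so objectives of very different magnitudes stay comparable."""
    return max(
        w * (fi - zi) / max(zn - zi, eps)
        for w, fi, zi, zn in zip(weights, f, z_ideal, z_nadir)
    )
```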
MOEA/D minimizes all these
Suppose that the MOP is decomposed into
a population of
An important advantage of MOEA/D and its improved versions is that a better solution
Select
Randomly select several solutions from
Generate a new solution after using genetic operators;
Compare the new one with
To combine LSIAS with MOEA/D-DRA, we set the content of the latest stored information and define the reward calculation. In this paper, we choose four kinds of operator-related information as the content of the latest stored information, which are an operator pool, the scaling factor
An illustration of latest stored information in the sliding window.
In MOEA/D-DRA, each vector
The scaling factor
Assume that
In MOEA/D-DRA, each new offspring is compared with
The pseudocode of MOEA/D-LSIAS is given in Algorithm
Initialize the population
Set
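The comparison-and-replacement step used throughout the framework can be sketched as follows (an illustrative sketch using a plain Tchebycheff aggregation; the cap n_r on the number of replaced neighbors follows the common MOEA/D-DE convention and is an assumption here):

```python
def tcheb(f, w, z):
    """Tchebycheff aggregation used for the subproblem comparison."""
    return max(wi * abs(fi - zi) for wi, fi, zi in zip(w, f, z))

def update_neighbors(pop, fvals, offspring, f_off, neighbors, weights, z_star,
                     n_r=2, agg=tcheb):
    """Neighborhood replacement step: the offspring replaces at most n_r
    neighboring solutions whose aggregated subproblem value it improves."""
    replaced = 0
    for i in neighbors:
        if replaced >= n_r:
            break                      # cap replacements to preserve diversity
        if agg(f_off, weights[i], z_star) < agg(fvals[i], weights[i], z_star):
            pop[i] = list(offspring)   # replace solution i and its objectives
            fvals[i] = list(f_off)
            replaced += 1
    return replaced
```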
In this section, several experiments are conducted to analyze the performance of our algorithm. In Section
Three types of benchmark functions are adopted here to demonstrate the effectiveness and robustness of MOEA/D-LSIAS. There are 22 well-known benchmark problems in total, including the ZDT problems [
The widely used ZDT problems are bi-objective test problems, including ZDT1-4 and ZDT6. These problems are relatively easy to solve because they lack features such as variable linkage and multimodality. Therefore, the UF and DTLZ problems, whose various features make up for this deficiency, are also covered in the experiments. Note that UF1-7 are bi-objective test problems, while DTLZ1-7 and UF8-10 are three-objective test problems. All three types of benchmark functions are widely used for the evaluation of MOEAs.
There are many performance measures for comparing algorithms, such as the Inverted Generational Distance (IGD) [
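For reference, the IGD value of an obtained solution set against a sampled true Pareto front can be computed as follows (a minimal sketch):

```python
import math

def igd(reference_front, obtained_set):
    """Inverted Generational Distance: mean Euclidean distance from each
    point of the sampled true Pareto front to its nearest obtained solution.
    Lower is better; IGD captures both convergence and diversity."""
    def dist(a, b):
        return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))
    return sum(min(dist(r, s) for s in obtained_set)
               for r in reference_front) / len(reference_front)
```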
In this paper, the proposed algorithm, MOEA/D-LSIAS, is compared with various improved versions of MOEA/D, including MOEA/D-DE, MOEA/D-DRA, MOEA/D-FRRMAB, MOEA/D-STM, MOEA/D-UCBT, and MOEA/D-ARG. The basic experimental settings of these algorithms are shown in Table
The basic experimental parameter settings of the compared algorithms.
MOEA/D-DE
MOEA/D-DRA
MOEA/D-STM
MOEA/D-FRRMAB
MOEA/D-UCBT
MOEA/D-ARG
MOEA/D-LSIAS: CR = 1.0, …
Tables
Comparative results of all the compared algorithms on the ZDT test problems regarding IGD (per-problem ranks, where rank 1 is best).

Problem    MOEA/D-DE  MOEA/D-DRA  MOEA/D-FRRMAB  MOEA/D-STM  MOEA/D-UCBT  MOEA/D-ARG  MOEA/D-LSIAS
ZDT1       7          3           4              2           5            6           1
ZDT2       5          3           7              4           6            2≈          1
ZDT3       7          5           2              6           1            4           3
ZDT4       6          7           5              1           4            2           3
ZDT6       1          7           5              6           4            2           3
Rank Sum   26         25          23             19          20           16          11
+/−/≈      1/4/0      0/5/0       1/4/0          1/4/0       1/4/0        2/2/1

"+," "−," and "≈," respectively, indicate that the result of the algorithm is better than, worse than, and similar to that of MOEA/D-LSIAS based on Wilcoxon's rank sum test at the given significance level.
Comparative results of all the compared algorithms on the ZDT test problems regarding HV (mean values and ranks, where rank 1 is best; "…" denotes an entry not available).

Problem        MOEA/D-DE  MOEA/D-DRA  MOEA/D-FRRMAB  MOEA/D-STM  MOEA/D-UCBT  MOEA/D-ARG  MOEA/D-LSIAS
ZDT1   Mean    3.6251     3.6519      3.6466         3.6523      3.6471       3.6219      …
       Rank    6          3           5              2           4            7           1
ZDT2   Mean    3.2992     3.3181      3.2892         3.3107      3.2826       …           3.3231
       Rank    5          3           6              4           7            1           2
ZDT3   Mean    4.7789     4.7713      …              4.7865      4.8063       4.7897      4.7995
       Rank    6          7           1              5           2            4           3
ZDT4   Mean    3.6201     3.6054      3.6258         3.6432      3.6204       …           3.6389
       Rank    6          7           4              2           5            1           3
ZDT6   Mean    3.0420     3.0294      3.0367         3.0313      3.0375       3.0379      …
       Rank    2≈         7           5              6           4            3           1
Rank Sum       25         27          21             19          22           16          10
+/−/≈          0/4/1      0/5/0       1/4/0          1/4/0       1/4/0        2/3/0

"+," "−," and "≈," respectively, indicate that the result of the algorithm is better than, worse than, and similar to that of MOEA/D-LSIAS based on Wilcoxon's rank sum test at the given significance level.
Comparative results of all the compared algorithms on the DTLZ test problems regarding IGD (per-problem ranks, where rank 1 is best).

Problem    MOEA/D-DE  MOEA/D-DRA  MOEA/D-FRRMAB  MOEA/D-STM  MOEA/D-UCBT  MOEA/D-ARG  MOEA/D-LSIAS
DTLZ1      7          2           6              4           5            3           1
DTLZ2      2          3           7              5           6            1           4
DTLZ3      7          6           5              3           4            2           1
DTLZ4      7          6           4              1           3            5           2
DTLZ5      6          7           4              2≈          5            3           1
DTLZ6      7          6           3              5           2            4           1
DTLZ7      6          7           4              1           5            2           3
Rank Sum   42         37          33             21          30           20          13
+/−/≈      1/6/0      1/6/0       0/7/0          2/4/1       0/7/0        2/5/0

"+," "−," and "≈," respectively, indicate that the result of the algorithm is better than, worse than, and similar to that of MOEA/D-LSIAS based on Wilcoxon's rank sum test at the given significance level.
Comparative results of all the compared algorithms on the DTLZ test problems regarding HV (mean values and ranks, where rank 1 is best; "…" denotes an entry not available).

Problem        MOEA/D-DE  MOEA/D-DRA  MOEA/D-FRRMAB  MOEA/D-STM  MOEA/D-UCBT  MOEA/D-ARG  MOEA/D-LSIAS
DTLZ1  Mean    0.9556     0.9689      0.9597         0.9644      0.9598       0.9641      …
       Rank    7          2≈          6              4           5            3           1
DTLZ2  Mean    7.3994     7.3902      7.3763         …           7.3755       7.3808      7.4058
       Rank    3          4           6              1           7            5           2
DTLZ3  Mean    7.3846     7.3897      7.3924         7.4046      7.3965       …           7.4124
       Rank    7          6           5              3           4            1           2
DTLZ4  Mean    7.3975     7.3924      7.4027         7.4118      7.4008       7.4103      …
       Rank    6          7           4              2           5            3           1
DTLZ5  Mean    6.0502     6.0433      6.0833         6.0908      6.0852       …           6.0910
       Rank    6          7           5              3≈          4            1           2
DTLZ6  Mean    6.0613     6.0737      6.1042         6.0985      …            6.1007      6.1022
       Rank    7          6           2              5           1            4           3
DTLZ7  Mean    13.224     13.299      13.303         13.453      13.310       13.371      …
       Rank    7          6           5              2≈          4            3           1
Rank Sum       43         38          33             20          30           20          12
+/−/≈          0/7/0      0/6/1       1/6/0          1/4/2       1/6/0        2/5/0

"+," "−," and "≈," respectively, indicate that the result of the algorithm is better than, worse than, and similar to that of MOEA/D-LSIAS based on Wilcoxon's rank sum test at the given significance level.
Comparative results of all the compared algorithms on the UF test problems regarding IGD (per-problem ranks, where rank 1 is best).

Problem    MOEA/D-DE  MOEA/D-DRA  MOEA/D-FRRMAB  MOEA/D-STM  MOEA/D-UCBT  MOEA/D-ARG  MOEA/D-LSIAS
UF1        7          6           4              2≈          5            3           1
UF2        7          6           3              5           2            4           1
UF3        7          3           4              6           5            1           2
UF4        7          6           4              5           3            2           1
UF5        7          6           5              3           4            1           2
UF6        6          5           3              2           4            7           1
UF7        7          6           4              5           3            1           2
UF8        7          6           4              1           3            5           2
UF9        6          7           2              1           3            5           4
UF10       7          2≈          5              4           6            3           1
Rank Sum   68         53          38             34          38           32          17
+/−/≈      0/10/0     0/9/1       1/9/0          2/7/1       1/9/0        3/7/0

"+," "−," and "≈," respectively, indicate that the result of the algorithm is better than, worse than, and similar to that of MOEA/D-LSIAS based on Wilcoxon's rank sum test at the given significance level.
Comparative results of all the compared algorithms on the UF test problems regarding HV (mean values and ranks, where rank 1 is best; "…" denotes an entry not available).

Problem        MOEA/D-DE  MOEA/D-DRA  MOEA/D-FRRMAB  MOEA/D-STM  MOEA/D-UCBT  MOEA/D-ARG  MOEA/D-LSIAS
UF1    Mean    3.5883     3.5937      3.6157         …           3.6159       3.6189      3.6225
       Rank    7          6           5              1           4            3           2
UF2    Mean    3.6094     3.6158      3.6355         3.6202      3.6343       3.6269      …
       Rank    7          6           2              5           3            4           1
UF3    Mean    3.5457     3.5753      3.5671         3.5966      3.5688       3.6007      …
       Rank    7          4           6              3           5            2           1
UF4    Mean    3.1729     3.1667      3.2067         3.1818      3.2072       3.2229      …
       Rank    6          7           4              5           3            2≈          1
UF5    Mean    2.7922     2.8537      2.9587         …           2.9228       3.0537      3.1661
       Rank    7          6           4              1           5            3           2
UF6    Mean    2.7671     2.8021      2.9891         3.0071      2.9920       2.8444      …
       Rank    7          6           4              2           3            5           1
UF7    Mean    3.4147     3.4276      3.4396         3.4534      3.4392       3.4615      …
       Rank    7          6           4              3           5            2           1
UF8    Mean    6.3576     6.3327      …              6.6325      6.7813       6.4792      6.5927
       Rank    6          7           1              3           2            5           4
UF9    Mean    6.8127     7.0236      7.0345         …           7.0327       7.1082      7.1056
       Rank    7          6           4              1           5            2           3
UF10   Mean    3.6443     3.7456      3.7198         3.6211      3.7203       …           3.8102
       Rank    6          3           5              7           4            1           2
Rank Sum       67         57          39             31          39           29          18
+/−/≈          0/10/0     0/10/0      1/9/0          4/6/0       1/9/0        2/7/1

"+," "−," and "≈," respectively, indicate that the result of the algorithm is better than, worse than, and similar to that of MOEA/D-LSIAS based on Wilcoxon's rank sum test at the given significance level.
In Table
Table
The DTLZ test problems are three-objective optimization problems and are clearly harder than the ZDT test problems. Tables
Table
In Table
Table
Table
To investigate the effectiveness of different types of assist information, a further experiment with different configurations of operator information is conducted. In Section
Method 1 (M1): AOS with no assist information,
Method 2 (M2): AOS with adaptive selection of neighborhood,
Method 3 (M3): AOS with adaptive control of factor
Figure
Box plots for the comparison of MOEA/D-LSIAS and the compared algorithms on ZDT1, DTLZ1, and UF1 regarding IGD and HV.
Regarding the HV metric, M2, M3, and MOEA/D-LSIAS work better than M1 on ZDT1. M1 performs better than M2 and M3 on DTLZ1 and UF1, but is clearly outperformed by LSIAS. These results indicate that assist information is not unconditionally helpful for AOS; rather, a good configuration of assist information is what matters. The degree of improvement depends on the configuration of the assist information, and AOS aided by different assist information can display different search power on different test problems.
To further investigate the dynamic behavior, we analyze the usage of operators with and without different assist information. The search process is divided into 50 phases; each phase comprises 500 function evaluations for ZDT1 and 6000 function evaluations for UF1 and DTLZ1. The usage count in every phase is then calculated during the search. The usage of operators and of operators with different assist information during the whole search process on ZDT1, DTLZ1, and UF1 is shown in Figures
Operator and assist information usage counts during the whole search process on ZDT1.
Operator and assist information usage counts during the whole search process on DTLZ1.
In Figure
In Figure
As observed from Figure
Operator and assist information usage counts during the whole search process on UF1.
In this paper, a novel AOS method called LSIAS is introduced. In LSIAS, the best and worst operator usage rates are used to improve the UCB method. To counter the influence of unexpectedly large or small fitness improvements, a credit assignment method that abandons extreme fitness improvements is introduced. LSIAS is also an adaptive selection strategy, used to select operators and their assist information adaptively. Every recently used operator, together with all of its assist information and its efficiency, is dynamically stored in a sliding window. Based on the efficiency, the best configuration of operator and assist information for each search phase is dynamically chosen.
Since decomposition-based MOEAs make it easy to evaluate the efficiency of operators together with their assist information, a variant of MOEA/D is adopted to investigate the performance of the proposed LSIAS. Four DE mutation operators and two kinds of assist information, namely neighborhoods and scaling factors, are used within the MOEA/D framework. We conduct extensive experimental studies on three kinds of benchmark functions. The experiments show that LSIAS is robust and effective, and that its adaptive selection of operators and assist information can significantly improve the performance of MOEA/D.
In future work, the adaptive selection of operators with assist information will be applied to many-objective optimization problems and constrained optimization problems. Furthermore, since indicator-based MOEAs can also provide exact estimates for operators with assist information, the performance of LSIAS within such algorithms can be studied as well.
The authors declare that they have no conflicts of interest.
This work was supported by the National Natural Science Foundation of China (Grants nos. 7170129 and 71771216).