Adaptive Weighted Strategy Based Integrated Surrogate Models for Multiobjective Evolutionary Algorithm

Although the integrated model has good convergence ability, it is difficult to solve the multimodal problem and noisy problem due to the lack of uncertainty evaluation. Radial basis function model performs best for different degrees of nonlinear problems with small-scale and noisy training datasets but is insensitive to the increase of decision-space dimension, while Gaussian process regression model can provide prediction fitness and uncertainty evaluation. Therefore, an adaptive weighted strategy based integrated surrogate models is proposed to solve noisy multiobjective evolutionary problems. Based on the indicator-based multiobjective evolutionary framework, our proposed algorithm introduces the weighted combination of radial basis function and Gaussian process regression, and U-learning sampling scheme is adopted to improve the performance of population in convergence and diversity and judge the improvement of convergence and diversity. Finally, the effectiveness of the proposed algorithm is verified by 12 benchmark test problems, which are applied to the hybrid optimization problem on the construction of samples and the determination of parameters. The experimental results show that our proposed method is feasible and effective.


Introduction
With the continuous upgrading of application requirements, many problems in our real-world have been abstracted into high-dimensional multiobjective optimization problems [1]. For example, regarding the Internet of Vehicles, problems such as intrusion detection faced by the in-vehicle control network, software product selection in software engineering, selection and distribution of relief supplies, and green production optimization in coal industry can be modeled as high-dimensional objective optimization problems [2]. ese practical problems may involve one or more conflicting optimization objectives, which are frequently constrained by harsh constraints. e selection of optimization schemes and the performance of optimization algorithms have become constant challenges as the complexity of the problem has increased [3]. erefore, in recent years, how to use multiobjective evolutionary algorithm to solve high-dimensional objective optimization problems has attracted more and more attention.
e existing multiobjective evolutionary algorithms are generally divided into the following three categories: Pareto domination-based [4,5], indicator-based [6], and decomposition-based [7] multiobjective evolutionary algorithm. e diversity maintenance mechanism of most Pareto domination-based multiobjective evolutionary algorithms is only applicable to Pareto-front optimization problems with regular distribution [8]. When the distribution of Pareto-front is irregular or complex, those algorithms are often difficult to obtain solutions that consider proximity, distribution, and malleability. For index-based multiobjective evolutionary algorithms, they usually choose the optimal solution that makes more contributions to the index, which makes the dominated solution more likely to be favored than the nondominated solution; decompositionbased multiobjective evolutionary algorithms often have difficulty determining the orientation vectors that are matching with the shape of irregular Pareto-front, which leads to hardly solve the complex Pareto-front multiobjective optimization problems.
us, to solve the previously mentioned problems, a variety of effective highdimensional multiobjective evolutionary algorithms (MaOEAs) have emerged [9]. Some improved MaOEAs choose more loose Pareto dominance criteria or simultaneously adopt Pareto dominance criteria and convergence criteria to alleviate the selection pressure [10]. Although this kind of algorithms can better ensure the convergence of the obtained Pareto optimal solution set, it is likely to ingrain a Pareto-front with poor distribution. On the other hand, mops problems are often mixed with noise in the process of fitness evaluation, which may mislead the search direction and reduce the optimization efficiency [11]. From this point of view, noise interferes with the steps of multiobjective optimization, such as determining nondominated individuals, diversity preservation, and elitism, so that the optimal solution of multiobjective optimization problems cannot be obtained, and the optimization problems cannot be well solved [11]. erefore, how to deal with the noise in multiobjective optimization problem becomes very critical. However, noisy multiobjective optimization problems are general in practical engineering fields. It can be seen from the previously mentioned analysis that most of the existing classical multiobjective evolutionary algorithms show bound limitations in dealing with high-dimensional objective optimization problems [12]. erefore, many scholars combine multiobjective optimization with denoising model to solve the noise in the multiobjective optimization problems, resulting in noise multiobjective optimization algorithm. Because the existence of noise affects the inaccuracy of individual selection, adjusting the selection process is a fateful method to deal with the noise multiobjective optimization problem [13]. General selection methods include nondominated sorting selection, probabilistic sorting selection, threshold selection, and so on. Optimization problems in real-world are often affected by noise. And with different distributions, constraints, and central trends, noise may appear in random model parameters, objective functions, and decision variables. Many literatures have recently adopted evolutionary computation method to solve the optimization problems in practical applications from different perspectives [14]. However, for the noise optimization problems existing in engineering problems, even if the variables are set to constant, different target values will be obtained from multiple target evaluation processes due to the noise doping in the evaluation process of fitness function. In this case, the individuals entering the next generation are often not highquality individuals [15]. To solve these problems, many researchers have proposed a variety of noise optimization algorithms.
Intuitively, a common method to reduce noise interference is to average multiple object values obtained from multiple function evaluation processes and then use the final average value to approximate the real object value. Eskandari and Geiger [16] used a fixed number of samples and averaged the target values of them and then provided a new sufficient condition for the convergence of the evolutionary algorithm that is with fixed number of samples. If there are many samples, there will be a colossal computational cost in function evaluation. Basseur and Zitzler [17] proposed a strategy to reduce the evaluation times for everyone and not the average for all individuals; they averaged the time of some of the best individuals. So, a better solution can be found by averaging a small number of function evaluations.
is simple method can significantly reduce the number of function evaluations. However, self-adaptive sampling of data can reduce the influence of noise, but sometimes it may have a great computational cost.
erefore, some scholars have adopted one threshold method in the deterministic selection of evolutionary strategies. If and only if the fitness value of a descendant is better than that of its parent, the descendant will be accepted [18]. is kind of method calculates the probability that an individual dominates all individuals, the probability that an individual is dominated by all individuals, and the probability that it has no dominance relationship with all individuals, to calculate the rank value of this individual. is mechanism greatly improves the correctness of individual ranking and proves that the hierarchical ranking scheme used by NSGA in the case of no-noise may have defects in dealing with noisy problems [19]. Shim et al. proposed a regularity model-based noise multiobjective optimization algorithm according to regularity model-based multiobjective estimation of distribution algorithm [20], where the nondominated solution is used to establish the Paretosolution model and samples the sample solution from it. With the continuous optimization of the evolution process, the established model continues to approach the real Paretosolution, and the quality of the sample solution on the model is gradually improved, which proves that the model has a certain ability to resist noise. Hong et al. proposed the strategy that dividing the search space into several nonoverlapping hyper-spheres and moving individual solutions in each sphere, which improved the average performance of the spheres [21]. is local model can filter noise and increase the robustness of the algorithm. By combining cooperative evolutionary frame and differential evolutionary algorithm, namely, cooperative differential coevolution algorithm, this model can effectively solve large-scale multiobjective optimization problems [22]. Multilevel cooperative coevolution algorithm adopts a technology called random grouping to group interactive variables into a subcomponent. In addition, another technique called self-adaptive weighting is used to adaptively adjust the subcomponents. In the cooperative coevolution with variable interaction learning [23], a new coevolution framework is proposed. Initially, all variables are regarded as independent variables, and each variable is placed in a separate group. And in the iterative process, the relationship between variables is gradually found and merged accordingly. DG2 is an improved differential grouping algorithm, which has better efficiency and grouping accuracy [24]. In the algorithm of using variable analysis method to reduce the dimension of search space, multiobjective evolutionary algorithm based on decision variable analysis controls variable analysis to identify the conflict between objective functions [25]. e algorithm carries out which variables affect the diversity of the generated solutions and which variables play an 2 Computational Intelligence and Neuroscience important role in the overall convergence.
rough the analysis of interdependent variables, the original multiobjective optimization problems are transformed into a series of sublevel multiobjective optimization problems, which can improve the convergence of most difficult multiobjective optimization problems. In order to reduce the dimension of search space through problem transformation, the weighted optimization framework is one of the most representative algorithms, where decision variables are divided into multiple groups, and each group is assigned a weight vector [21][22][23][24]. us, the optimization of original decision variables can be transformed into the optimization of weight vector. After the problem is transformed, the decision variable space of new problems will be greatly reduced. Large-scale multiobjective optimization framework also adopts the problem transforming strategy, which firstly decompose Pareto-solution into two directions associated with the weight vector, and then the weight vector is taken as input to construct a series of subproblems and is designed to track the corresponding points on the Paretosolution so as to reduce the dimension of decision variable space [18,21,22,24].
Although the integrated model has good convergence ability, it is difficult to solve the multimodal problem and noisy problem due to the lack of uncertainty evaluation. erefore, an adaptive weighted strategy based integrated surrogate model is proposed to solve noisy multiobjective evolutionary problems in this paper. Based on the indicatorbased multiobjective evolutionary framework, our proposed algorithm introduces the weighted combination of radial basis function and Gaussian process regression, and U-learning sampling scheme is adopted to improve the performance of population in convergence and diversity and judge the improvement of convergence and diversity. Finally, the effectiveness of the proposed algorithm is verified by 12 benchmark test problems, which are applied to the hybrid optimization problem on the construction of samples and the determination of parameters. e experimental results show that our proposed method is feasible and effective.

Multiobjective Optimization and Its Noisy Problem
For a n multidimensional decision variable, the mathematical model of multiobjective optimization problem with mdimension can be defined as follows: where x � (x 1 , x 2 , . . . , x n ) ∈ Ω; x is the decision variable and Ω is the decision-space Ω � n i�1 [L i , U i ]; L i and U i are the upper and lower boundaries of the x i ; F(x) is the object vector, representing the mapping relationship of Ω ⟶ R m ; G(x) and H(x) are the constraints of the problem. Given x 1 � (x 1 1 , x 1 2 , . . . , x 1 n ) and x 2 � (x 2 1 , x 2 2 , . . . , x 2 n ), they are two decision vectors satisfying constraints in the object space, and x 1 Pareto dominates x 2 , denoted as With the increasing demand of engineering problems, the results of solving multiobjective optimization problems are not only satisfied with a Pareto solution set or a Pareto optimal solution based on the preference of decisionmakers. Sometimes, due to the influence of surrounding environmental factors or noise interference, people prefer to get a more robust Pareto optimal solution set or Pareto optimal solution. e so-called robustness is simply the sensitivity of the objective function to the small disturbance in the decision parameters [25]. If a global optimal solution is very sensitive to the variable disturbance, the final optimal solution obtained in reality may correspond to a different optimal value from the theoretical solution, which means a Pareto solution set with poor robustness is generated. To facilitate analysis and description, we will introduce the noisy model for the multiobjective optimization problem and proposed a simple and novel integrated surrogateassisted model.
Since the evolutionary algorithm is less affected by noise in the early stage, Gaussian regression value can be used as the denoising object value. In practical engineering applications, the observed response usually contains noise, namely, y(x) � f(x) + n. Generally, the noise is assumed to be zero mean Gaussian distribution ε ∼ N(0, Σ n ), where the common form of Σ n is Σ n � σ n I, and I is the identity matrix. For the responses of all observed samples, the variance of their noise is the same. at means the noise is an independent and identically distributed Gaussian distribution.
To reduce the influence of noise, a noise estimation and denoising method is proposed. e joint Gaussian distribution of the prediction y(x) and noisy response Γ at x is written as follows: where the variance matrix is C � σ 2 R + Σ n and the covariance vector is c � σ 2 r T (x).

Surrogate-Assisted Evolutionary Algorithm
Surrogate-assisted model is to construct a relatively simple function from a complex function by collecting feature points and optimize the new function to obtain the optimal solution of complex function, which is shown in Figure 1.
Since the surrogate-assisted model can only represent the real model to a certain extent, so its optimal solution can not directly represent the optimal solution of the original objective function. It is necessary to update the surrogateassisted model according to a certain criterion. e framework of offspring generation is shown in Figure 2.
e search strategy of surrogate-assisted evolutionary algorithm (SAEAs) in the optimization process largely depends on a surrogate model [21,26]. e reason is that the surrogate model assumes that it can provide sufficiently accurate function estimation. Since the surrogate model cannot provide the same properties as the original objective function, it may even produce the optimal value that does not actually exist. erefore, how to select an appropriate Computational Intelligence and Neuroscience surrogate model is a very important step in surrogateassisted evolutionary algorithm. In practical application, the conventional operation is to use the surrogate model together with object-function evaluation.

Gaussian Process Regression.
Gaussian process regression is a Bayesian statistical method for modeling functions, which is very suitable for objective function evaluation. Gaussian process does not need predefined data-structure and can approximate the nonlinear, discontinuous, and multimodal functions, which provides an uncertainty measurement in the form of standard deviation for the prediction function [32]. Gaussian process is the extension of multivariate Gaussian distribution to infinite dimensional random process, where any combination of finite dimensions is a Gaussian distribution. Since Gaussian distribution is a distribution of random variables, Gaussian process can be completely determined by its mean function and covariance function. erefore, Gaussian process has also been widely used in surrogate-assisted evolutionary algorithms.
Given a dataset containing n samples (x i , y i ), i ∈ 1, 2, . . . , n, the fitness y for any candidate solution x is regarded as K is the matrix, whose element is is the input value of the sample point, and κ(x) is denoted as covariance, where C(·) is the covariance function. e covariance function can be written as follows: σ f and σ l are denoted as hyperparameter. erefore, the predicted values and variances of candidate solutions x can be derived as follows: where I is the identity column vector.
In (5) and (6), y(x) and s 2 (x) are the mean prediction function and variance prediction function of Gaussian process regression, respectively. It can be seen that the prediction value of the variance prediction function at the  point x is zero. e greater the distance from the existing sample point, the greater the prediction variance. erefore, Gaussian process regression is an interpolation model about sample points. In addition, the distribution at prediction point x should meet the following requirements: where Φ(x) is the Gaussian cumulative density function. P(y(x) ≤ t) indicates the probability that y(x) is less than or equal to t at given x. erefore, it can be given the confidence interval of y(x) with probability 1 − α at the point x.
Although Gaussian process has been widely used in solving expensive multiobjective optimization problems, it has encountered some bottlenecks in solving high-dimensional expensive problems. In addition, the existing infilling criterion [33] cannot be well applied to high-dimensional problems, and the time of constructing Gaussian process model will also increase relatively with the increase of training samples. Compared with low-dimensional problems, constructing Gaussian process model on high-dimensional problems requires more training samples. erefore, the construction of GP model on high-dimensional problems becomes more time-consuming. [25,31] is a scalar function whose value only depends on the distance from the origin. It is generally defined as a monotonic function about the radial distance between the sample and the data center. RBF kernel is one of the commonly used kernel functions. In essence, the radial basis function model is formed by linear superposition. Its formation process is denoted as follows: first, input a sample set, then select the corresponding radial basis function, and finally calculate the weighted sum of the radial basis function values between unknown points and all sample points, to obtain the predicted response value of the test point.

Radial Basis Function. Radial basis function (RBF)
Since the independent variable of radial basis function is the Euclidean distance between the testing set and the sample set, this can well transform the multidimensional problem into a one-dimensional problem with only the independent variable of Euclidean distance. Given n different samples x 1 , x 2 , . . . , x n ∈ R D and their corresponding function value f(x 1 ), f(x 2 ), . . . , f(x n ) where n and D are arbitrary integers. us, we can have the following: where the coefficient λ i , i � 1, 2, . . . , n denotes the weight of the first i − th basis function; | · | is the Euclidean norm in R D ; the degree of p is not greater than m from the polynomial space. It can be expressed as the linearity of the where there are many choices of kernel function φ(x) in radial basis function, such as linear, cubic, thin plate spline, and Gaussian. In our model, we adopted the Gaussian as kernel function, which can be written as follows: where δ > 0 and is a constant. e unknowns (λ 1 , λ 2 , . . . , λ n ) ∈ R D in the radial basis function are obtained by solving the following: In a system of linear equations, φ represents a matrix of size It can be seen that the RBF model established through the previously mentioned analysis can replace the expensive function to predict the individual fitness value in the process of algorithm optimization and performs best for different degrees of nonlinear problems with small-scale and noisy training datasets. erefore, radial basis function is used to approximate the original problem in this paper and assist evolutionary algorithm optimization, to reduce the consumption of computing resources.

Integrated Surrogate Models for Noisy MOPs
As we all know, radial basis function model performs best for different degrees of nonlinear problems on small-scale and noisy training datasets but is insensitive to the increase of decision-space dimension, while Gaussian process regression model can provide prediction fitness and uncertainty evaluation. erefore, an adaptive weighted strategy based integrated surrogate models is proposed to solve noisy multiobjective evolutionary problems in this paper.

Proposed Adaptive Weighted Strategy.
e choice of surrogate-assisted model is very important to the good performance of surrogate-assisted evolutionary algorithm. RBF model and GPR model are the two most popular surrogate-assisted models. RBF model performs best for different degrees of nonlinear problems on small-scale training datasets and is insensitive to the increase of decision-space dimension [34]. GPR model can provide prediction fitness and uncertainty evaluation. erefore, the combination of these two kinds of information can prevent the search from falling into local optimization. ese characteristics are very attractive and promising for solving high-cost or expensive optimization problems. Although the integrated model has good convergence ability, it is difficult to solve the multimodal problem and noisy problem due to the lack of uncertainty evaluation [35].
Integrated surrogate models (ISM) are a kind of ensemble of surrogate models (EM). It is an integrated model composed of a series of surrogate-assisted models by Computational Intelligence and Neuroscience weighting and combining. It can make use of the advantages of single surrogate-assisted model to effectively improve the robustness of prediction. e mathematical expression of surrogate models is described as follows: where w i (x) is denoted as weight and obeys N i�1 w i � 1. y and y i (x) represent the predicted values of EM and the integrated surrogate-assisted model, respectively.
e key step of constructing EM is to calculate the weight of each surrogate-assisted model. e prediction accuracy is positively proportional to the weight coefficient. According to the method of calculating weight, the existing weight strategies can be divided into two categories, including weighted average and point-by-point weight.
e main principle of calculating the weight coefficient in the weighted average is to assign a weight to each subsurrogate-assisted model according to its global performance. Compared with the global error measurement, the weight of a single surrogate-assisted model for the point-by-point weight is calculated using the local error measurement, which means that the weight of each subsurrogate-assisted model will change in the whole sample space. Compared with the average weight method, the point-by-point weight method allows flexible adjustment of local weight coefficients in the sample space, which can better capture the local characteristics of the objective function, but it will increase the running time and is affected by noise.
Although the integrated model can obtain more accurate prediction values than a single model, and the effectiveness of the integrated model has been proved by theory, it is very difficult to fully meet the theoretical conditions of integrated surrogate model in practical applications, so it is difficult to ensure better generalization ability than a single model [36]. In our paper, an adaptive integration surrogate-assisted model is designed, which is shown in Table 1 Step 3 is used to train RBF model and Gaussian process model, respectively. In Step 5, the performance of each surrogate-assisted model is tested on the test set, and the error of each surrogate-assisted model is calculated, respectively. In Step 6, the weight of each surrogate-assisted mode is updated. e weight of each surrogate-assisted model is equal to the ratio of the test error of the surrogateassisted model in the test set to the sum of the test errors of all surrogate-assisted models. Update the corresponding weight according to the latest test error, where the sum of the weights of all surrogate-assisted models is 1. e method of updating the weight of the surrogate-assisted model by testing the error can adjust the weight at any time according to the performance of the surrogate-assisted model, so that the weight of the surrogate-assisted model with smaller error is greater, so as to enhance the prediction accuracy and prediction stability of the integrated surrogate-assisted model Finally, each surrogate-assisted model is used to predict and output the new predicted samples in Step 7, and then, the weighted sum of all the predicted output values is used as the output value of the final integrated model. It is worth noting that the training time of the integrated surrogate-assisted model can be reduced because not all samples are used in the training of the integrated model. e next parent population need to be selected by some convergence selection strategy in the offspring generation, where we use crossover and mutation to produce the offspring solution. Table 2 shows the main steps of offspring selection.

Indicator-Based Multiobjective Evolutionary Algorithm.
e indicator-based multiobjective evolutionary algorithm adopts a fitness assignment scheme to rank the population members according to their usefulness regarding the optimization goal. e scheme is unique and simple and does not use the traditional diversity protection strategy, which makes the evolutionary algorithm have a good convergence and suitable for solving problems with a high dimension [36]. However, the algorithm performs poorly for some problems in maintaining the diversity preservation mechanism.
Performance indicator is a function that can use some preference information to assign a real number to any approximate solution set in many multiobjective evolutionary problems, so that the relative advantages and disadvantages of any two approximate solution can be judged according to the real number corresponding to each approximate solution set.
Since the binary performance index can assign a real function I(x i , x j ) to any pair of approximate solution sets (x i , x j ) in the object space, it can be directly used for fitness calculation, but there is a prerequisite that the used index must obey Pareto rule. Fitness allocation is to grade individuals in the population according to their utilization value in the process of seeking optimization objectives [37]. erefore, the formula for calculating individual fitness using performance indicators in this paper is as follows: where k is a scaling factor greater than 0.5. e experimental results show that the algorithm can achieve better results when k � 0.05; c is the maximum of the absolute values of all indicators, namely, c � max x∈S |I(x i , x j )|. e main steps of the indicator-based multiobjective evolutionary algorithm are expressed as follows.
(1) First, initialize the population Q, take it as the initial population, and set an empty population P; set a variable g that holds the evolutionary generation.

Computational Intelligence and Neuroscience
(2) e individuals in P and Q are merged into R, and the individuals in R are processed by the indicatorbased fitness assignment scheme. (3) Execute the environment selection operation, and continuously repeat the following two steps: (a) select the individual with the smallest fitness value from R and delete it; (b) update the fitness of the remaining individuals; repeat the previously mentioned two steps until the number of remaining individuals in R is equal to the size of P, and then put the remaining individuals in R into the P. (4) Judge whether the variable g is greater than the maximum evolutionary generation or meets other termination conditions. If yes, stop evolution and output the noninferior solution in P, otherwise continue to execute. (5) Use tournament selection to select individuals from P and copy them into Q. (6) Perform cross-mutation operation on the individual in Q to generate offspring individuals and replace the parent individual in Q with a new generation of individuals; add 1 to the evolutionary generation (g � g + 1) and turn to step (2).

U-Learning Sampling
Approach. e larger the sample space, the higher the accuracy of the surrogate-assisted model. However, the construction of the surrogate-assisted model based on the larger sample space takes longer, so many scholars have successively studied the sampling size of the initial sample space of the surrogate-assisted model [37]. It is believed that, in general, the sampling size of the initial sample space should be 10 times the dimension of the variable of the multiobjective optimization problem.
For simulation-based optimization, we usually update the surrogate-assisted model step by step by iteratively selecting promising sampling points until the stop condition is met. A good sampling strategy should consider both global and local search. In recent years, more and more methods with novel ideas have been proposed [25,35]. In addition, some researchers focus on selecting multiple sampling points in each iteration. ese new acquisition points can be simulated in parallel if parallel computing resources are available, and the number of iterations will be greatly reduced.
To select the training sample points of the model efficiently, it is usually necessary to adopt an adaptive method to select the appropriate learning function and convergence conditions. In this paper, U-learning (Table 3) function U(x) � |u(x)/σ(x)| is adopted as a sample point tool, which represents the probability that the positive and negative states of the sample output response y � f(x) are misclassified. e smaller the value of U-learning function, the greater the probability of sample points being misclassified. erefore, the sample points with smaller value are selected as new training points x new � argmin x∈S U(x), where S is the candidate sample pooling.
Literature [37] proves that the probability of sample points being correctly classified is 97.7% if U(x) is equal to 2, so it can be considered that the constructed Integrated surrogate-assisted model has more than 97.7% probability of Step 6: outputs the final predicted value y � k w i y i , where y i is the predicted value of the model M i at x; Table 2: Main steps of offspring selection. Input: sample-set D, the predicted sample x Output: offspring population p 0 ; Step 1: all individuals in sample-set D are evaluated by weight model; Step 2: N/2: N/2 individuals are selected from P p and P o ; Step 3: N/2 individuals are selected from all the remaining individuals in P p and P o by the MSE of all m surrogate for uncertainty selection; Step 4: combine two groups of individuals as new P p for updating. correct prediction of sample points when U(x) > 2. erefore, it is determined that the convergence condition of sample point is min

Comparison Models.
To verify the effectiveness of our proposed algorithm, we limit the number of real evaluations of the test problem to simulate the scenario of solving expensive MOPs. In addition, the existing popular surrogateassisted evolutionary algorithms are also selected to evaluate its effectiveness. Next, we will briefly describe these selected comparison algorithms.
parEGO [38]: it is the first time to use EGO to solve multiobjective optimization problems parEGO. It divides the MOP problem into several subproblems evenly. In other words, the MOP problem is transformed into a single-objective optimization problem, randomly selects one subproblem at a time, and takes the expected improvement index (EI) as the sample selection strategy to select the solution for real evaluation. In addition, parEGO also limits the size of the training set to reduce modeling time. SMS-EGO [39]: it uses the lower confidence bound to delimit the dominant relationship of the solution, and divide the obtained solution into non-ε dominant solution, ε dominant solution, and dominant solution. For non-ε dominant solution, SMS-EGO calculates its S-metric as the fitness. For ε dominant solution and dominant solution, SMS-EGO assigns a penalty value as fitness. e farther away from the nondominant solution, the greater the penalty. Based on this selection strategy, the solution for real evaluation is selected. Although SMS-EGO performs well, its running time is particularly long because it needs to calculate a lot of Smetric. MOEA/D-EGO [40]: it clusters the solutions of the evolutionary algorithm and then uses EI as the selection strategy to select a solution in each cluster for real evaluation. In addition, it also uses the fuzzy clustering method to establish the surrogate-assisted model, which reduces the modeling time while using all the real evaluated solutions. Compared with parEGO, MOEA/D-EGO runs relatively faster because it selects solutions in batches each time, rather than just selecting a single solution.
K-RVEA [41]: the more contribution of this algorithm is in the model management strategy. It assigns the candidate solution obtained this time and the candidate solution obtained last time to a group of reference vectors, respectively. If the change in the number of inactive reference vectors is less than a certain threshold, uncertainty strategy is used as the basis for solution selection. Otherwise, the angle penalty distance is used as the basis for solution selection. In addition, when updating the model, K-RVEA will filter the solution to limit the size of the training set, to reduce the modeling time. CSEA [42]: it uses the artificial neural network to predict the dominance relationship between candidate solutions and reference solutions. e uncertainty information in prediction is considered together with the dominance relationship to select promising solutions using the real objective functions.

Measurement Indicators.
When the Pareto optimal solution set is obtained by the optimization algorithm, the advantages and disadvantages need to be analyzed by comparing the performance indexes. e performance index mainly evaluates the proximity between the nondominated solution set and the real optimal Pareto-front and the distribution and diversity of the solution set [43][44][45]. erefore, some measurement indicators are proposed to evaluate the quality of the solutions obtained during optimization, to evaluate the quality of the algorithm. e performance evaluation indicators of the solution set of multiobjective optimization algorithm is mainly divided into convergence, uniformity, and spread. Convergence [46] reflects the difference between the solution set obtained by the optimization algorithm and the real Pareto-front. It is generally hoped that the solution set obtained is as close to the real Pareto front as possible. Uniformity reflects the degree of uniformity of the distribution of individuals in the solution set. Generally, it is hoped that the solution set obtained will be distributed as evenly as possible on Pareto front. Spread reflects the wide distribution degree of the whole solution set in the object space. Generally, it is hoped that the obtained solution set will be distributed on Pareto front as widely and completely as possible. erefore, the performance evaluation indicators selected in this paper are shown as follows.
Generational distance (GD) is the average minimum distance from each point in the solution set P to the real Computational Intelligence and Neuroscience solution set P * . e smaller the generational distance value, the better the convergence.

GD(P, P)
where P is the solution set obtained by the evolutionary optimization algorithm and P * is a set of uniformly distributed reference points sampled in the real Pareto front; dis(x, y) represents the Euclidean distance between the points y in the solution set P and the point x in the sample reference set P * . Inverted generational distance (IGD) [47]represents the mean value of the nearest individual from the reference point. e smaller the inverted generational distance value, the better the convergence performance.
Spacing is to measure the minimum distance standard deviation from each solution to other solutions. e smaller the spacing value, the more uniform the solution set.
where d is denoted as the mean of all d i . Diversity metric (DM) [1,3,37] is designed to measure the spread of the obtained solution set.
where d f and d l represent the Euclidean distance between the extreme solution and the boundary solution of the obtained nondominated solution-set. Hyper volume (HV) [1,3] is a performance metric for indicating the quality of a nondominated approximation set, where the super volume of the area formed by the nondominated front obtained after the optimization and the previously reference points. e larger the HV value, the better the comprehensive performance of the evolutionary optimization algorithm.

Parameter Settings.
For the sake of fairness, all comparison algorithms use the original default parameters. Since the comparison algorithms selected in this paper are based on surrogate-assisted evolutionary framework [12]. erefore, the common parameter setting is consistent, for example, the number of initial population data is set to 100. In ZDT, the number of decision variables is set to 12, while that of DTLZ is set to 10. e number of object variables of ZDT and DTLZ are set to 2 and 3, respectively. e maximum real evaluation times of ZDT and DTLZ are set to 200 and 300, respectively. e setting probability of simulated binary crossover is 1.0 and its distribution index is 20. Polynomial mutation is selected as mutation operator, its probability is 1/d, and the distribution index is 20. e number of reference vectors is 300 for two targets and 595 for three targets. Each algorithm runs 30 times independently for each test problem.
e key parameters of our proposed adaptive integrated surrogate model are set as follows. Population size is 100, and its fitness scaling factor is set to 0.05. In this section, the number of individuals for sampling η is set to 5. e maximum number of generations is set to 30. It is worth noting that the improved weighting strategy in this paper can adjust the weight to realize the prediction based on noise data, where the variance of noise is set to 0.2.
In this paper, all experiments on the test function were run independently for 10 times, and the average value and standard deviation (STD) [31] of the results were collected and compared. All algorithms are implemented on MAT-LAB 2019a and run on Intel (R) core (TM) i7-3770 CPU @ 3.40 GHz and 8 GB of RAM on personal computers.

Qualitative and Quantitative Analysis.
In the comparative experiment, the selected test function is DTLZ and ZDT [12,18,22]. ey are designed for multiobjective optimization problems. One of its most important features is the extensible adaptability dimension, such as variable dimension and objective function dimension. All the problems in this set of test problems are continuous n-dimensional multiobjective optimization problems with box constraints, which are scalable in the fitness. In addition, the Pareto front of each problem is different.

Comparison of Nondominated Solution for Different
Algorithms. ZDT1 and ZDT2 are relatively simple test problems. When the number of decisions is small (n � 10), they can achieve good convergence results. However, the performance of MOEA/D-EGO and K-RVEA decreased sharply with the increase of the number of variables. When n � 20, parEGO performs slightly better than MOEA/D-EGO and K-RVEA, as shown in Figure 3, but IGD exceeding 10 means that its convergence is still very poor. Figure 4 shows the comparison of nondominated solution sets of different algorithms to obtain the optimal IGD value on ZDT1 (n � 50). When n � 50, parEGO, MOEA/D-EGO, and K-RVEA failed to find any solution on Pareto front. On the contrary, the performance of our proposed algorithm has been very stable from n � 10 to n � 50.

Comparison of Measurement Indicators for Different
Algorithms. Tables 4 and 5 respectively, show the inverted generational distance (IGD) and hyper volume (HV) of different comparison algorithms, where the number in brackets represents the standard deviation of the index, and bold indicates that the value is the best on this test problem. In addition, we use the average results of 30 independent runs for performance analysis. e symbols "+," "−," and "≈" Computational Intelligence and Neuroscience indicate that our proposed weighted integrated surrogate model is statistically significantly superior to, inferior to, and almost equivalent to comparison models, respectively. e significance level of rank sum test is 0.05. Table 4, only our proposed algorithm reached the near convergence state on DTLZ 1, while parEGO and MOEA/D-EGO performed poorly. On DTLZ2, K-RVEA performs best, and all algorithms can converge. On DTLZ3, all algorithms fail to converge, and our proposed algorithm is close to convergence. e performance of DTLZ4 is similar to that of DTLZ2. Our proposed algorithm is close to the convergence state, while other comparison algorithms have converged; on DTLZ5, all comparison algorithms can converge, where MOEA/D-EGO performed better. In addition, all comparison algorithms failed to converge on DTLZ6.

As shown in
In Table 5, on the DTLZ1 problem, only our proposed algorithm performs best, but it fails to converge and it is relatively close. On DTLZ2, parEGO performs best and completes convergence, and our proposed algorithm is still close to convergence. DTLZ3 fails to reach the convergence state. e performance of DTLZ4 is similar to that of DTLZ2. On the DTLZ5 problem, the performance of all algorithms is similar, and only parEGO is slightly worse. All algorithms on DTLZ6 fail to converge. Overall, our proposed algorithm in this paper performs better on the five-dimensional problem, especially on the more difficult DTLZ1 and DTLZ3 problems. Compared with the other two classical algorithms, our proposed algorithm still has an order of magnitude advantage in IGD performance. Our proposed algorithm is not satisfactory on the relatively simple DTLZ2 and DTLZ4 problems. In most cases, it is inferior to K-RVEA and CSEA. It is believed that our proposed algorithm has high convergence rate in population evolution, and has high IGD in most cases, which is consistent with the iterative curve.
0/12/0 1/11/0 0/11/1 0/11/1 0/10/2 -  Figures 5 and 6 show the convergence curves of our proposed algorithm and comparison algorithms on ZDTand DTLZ problems, where the abscissa is the real function evaluation times (Fes) and the ordinate is the IGD index. It is worth noting that, due to space constraints, we have only selected some problems for analysis. For ZDT test problem, the results of IGD and HV show that our proposed algorithm performs better when the experimental settings are consistent. e convergence curve shows that the convergence effect of our proposed algorithm is better than the comparison algorithm in most cases. Only on ZDT 1-3 problems, the convergence effect of our proposed algorithm is as small as parEGO before 170 real evaluations. Since the ZDT1-3 problems are relatively simple, our proposed algorithm is easy to find a better solution than the existing population in the initial stage. If the comparison algorithms select the solution according to the convergence, the convergence speed will be very fast. However, it is relatively easy to find a better solution than the existing population in the region close to the real PF, so the model needs to be able to better simulate the region near the real PF. Different from parEGO, our proposed algorithm selects solutions based on diversity, which is more inclined to increase the diversity of solution set, so that the surrogate-assisted model can better simulate the region near the optimal solution of the current population.
erefore, the convergence effect of the first 30th times of the convergence curve of our proposed algorithm is as small as that of parEGO, MOEA/D-EGO, and CSEA, and the later convergence effect is better than it. For the DTLZ test problem, IGD, HV results and convergence curves show that our proposed algorithm performs better on most test problems when the experimental settings are consistent. For the DTLZ4 test problem, our proposed algorithm has the same effect as K-RVEA, but our proposed algorithm is not as good as MOEA/D-EGO for the DTLZ6 test problem.

Ablation Analysis for Noisy Treatment.
Based on the indicator-based multiobjective evolutionary framework, our proposed algorithm introduces the weighted combination of radial basis function and Gaussian process regression. In other words, two different surrogate-assisted models are linearly combined to improve the optimization performance. To analyze the performance of this optimization strategy, we used ablation analysis to explain this difference. e comparative experimental curve is shown in Figure 7, where RBF-EA, GPR-EA, and Both-EA denote a radial basis function, Gaussian process regression, and the weighted combination, respectively.
Since DTLZ belong to the test problems that are sensitive to noises, we only choose DTLZ to analyze the performance of our algorithm. e results are shown in Table 6. N-Both-EA is denoted as the version without noise treatment. It can be seen that the processing results of noisy data and clean data are quite different; that is to say, noise has a great impact on the performance of the evolutionary algorithm. However, our proposed algorithm has higher precision than other comparison algorithms, which is attributed to our proposed algorithm with the ability of noise removal. erefore, it is very necessary to deal with the noises for multiobjective evolutionary algorithm.  e previously mentioned analysis shows that the designed model in this paper has higher performance than that using only a single strategy, which is helpful to improve the performance. In addition, through the comparison between our proposed algorithm and the other comparison algorithms, it is found that our proposed algorithm achieves better performance than the compared algorithms on noisy DTLZ and ZDT problems.

Conclusion
Radial basis function model performs best for different degrees of nonlinear problems on small-scale and noisy training datasets but is insensitive to the increase of decision-space dimension, while Gaussian process regression model can provide prediction fitness and uncertainty evaluation. erefore, an adaptive weighted strategy based integrated surrogate models is proposed to solve noisy multiobjective evolutionary problems in this paper. Based on the indicator-based multiobjective evolutionary framework [7], our proposed algorithm introduces the weighted combination of radial basis function and Gaussian process regression, and U-learning sampling scheme is adopted to improve the performance of population in convergence and diversity and judge the improvement of convergence and diversity. Finally, the effectiveness of the proposed algorithm is verified by 12 benchmark test problems, which are applied to the hybrid optimization problem on the construction of samples and the determination of parameters. e experimental results show that our proposed method is feasible and effective [48-51].

Data Availability
e labeled datasets used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest
e authors declare no conflicts of interest.