A Kriging Model-Based Expensive Multiobjective Optimization Algorithm Using R2 Indicator of Expectation Improvement

. Most of the multiobjective optimization problems in engineering involve the evaluation of expensive objectives and constraint functions, for which an approximate model-based multiobjective optimization algorithm is usually employed, but requires a large amount of function evaluation. Aiming at eﬀectively reducing the computation cost, a novel inﬁlling point criterion EIR2 is proposed, whose basic idea is mapping a point in objective space into a set in expectation improvement space and utilizing the R2 indicator of the set to quantify the ﬁtness of the point being selected as an inﬁlling point. This criterion has an analytic form regardless of the number of objectives and demands lower calculation resources. Combining the Kriging model, optimal Latin hypercube sampling, and particle swarm optimization, an algorithm, EIR2-MOEA, is developed for solving expensive multi-objective optimization problems and applied to three sets of standard test functions of varying diﬃculty and comparing with two other competitive inﬁll point criteria. Results show that EIR2 has higher resource utilization eﬃciency, and the resulting nondominated solution set possesses good convergence and diversity. By coupling with the average probability of feasibility, the EIR2 criterion is capable of dealing with expensive constrained multiobjective optimization problems and its eﬃciency is successfully validated in the optimal design of energy storage ﬂywheel.


Introduction
Science and engineering practice possess a large number of multiobjective optimization problems (MOP) [1] whose objectives often conflict with each other and need to be optimized simultaneously, where the Pareto set is desired. Evolutionary Algorithm (EA) [2] has proven to be very suitable for MOP, some typical algorithms include NSGA-II [3], SPEA2 [4], MOEA/D [5], and MOPSO [6]. However, the success of these algorithms depends heavily on a large number of function evaluations and can only be applied in cases that function evaluation is cheap computationally. To handle some expensive MOPs, it is a desire to develop some efficient algorithms.
To reduce the number of expensive evaluation required, approximate/surrogate model [7], which mimic the inputoutput relationship of expensive function but is cheap-toevaluate, is often employed. Polynomial response surfaces [8], support vector machines [9], neural networks [10], and Kriging [11] are commonly used approximate models.
Currently, there are two ways to integrate the approximate model with EA. e first one is to first construct a static global approximate model based on all sample points already evaluated, and the EA algorithm then searches for the optimal solution. However, with limited computational resources, it is hard to guarantee the approximate model's accuracy over the entire design space. e second one utilizes an approximate model in a dynamic manner by first constructing a rough approximate model with a fraction of sample points generated by some Design of Experiments (DoE) [12] techniques, and then maximizing some infilling point criterion (IPC) [13], a new point is located by EA and is evaluated by expensive function. is process is repeated until resources are exhausted. e second manner has the advantage of high efficiency in using computation resources and is used in this paper. e infilling point criterion [14] plays a critical role in the approximate model-based optimization method. Expectation Improvement (EI), proposed in Efficient Global Optimization (EGO) [15] by Jone, may be one of the most researched method in the literature. It uses the Kriging model as an approximate model and considers not only the predicted value but also modeling uncertainty, achieving a good balance between exploration and exploitation in the optimization process. Other IPCs include statistical lower bound [16] and probability of improvement [17].
However, those criterions are proposed originally only for single optimization and have to be modified for MOPs. Knowles first introduced EI into the field of multiobjective optimization and presented ParEGO [18], where the Kriging model is used and multiple objectives are converted into a single objective through a parameterized scalarizing weight vector, and then followed by the direct application of EGO, the weight vector changed randomly as iteration proceeded. Joan pointed out ParEGO is apt to favor model accuracy rather than finding nondominated solutions and demands more iterations before convergence and hence proposed Pareto Expected Improvement index(PEI) [19], which considers both EI and the probability of being Pareto optimal solution. Multi-EGO [20] builds the Kriging model and calculates separately EI for each objective and calls MOGA [21] as a solver to output a set of candidate points, from which a point is selected as an infilling point. MOEA/ D-EGO [22] borrowed the ideas from ParEGO, and it generated not a single but a set of weight vectors that distributed uniformly in the objective space to form single objective optimization subproblems involving EI, optimizing jointly those subproblems produces multiple infill points and in this way parallelly infilling is achieved. Recently, some IPCs compatible with multiobjective optimization were proposed, such as MaxMin improvement [23] and expectation hypervolume improvement [24]. Svenson defined MaxMin improvement in the analytic form when the number of the objective function is two and suggested using the Monte Carlo method (MCM) [25] to calculate MaxMin value when the number of the objective is more than two.
In this paper, a novel IPC is proposed for expensive multiobjective optimization. e outstanding feature of the IPC is to map one point in the objective space to a set in the expectation improvement space, where the fitness of the point is defined as a Quality indicator (QI) [26] value of the set. In this way, QI plays a role in guiding optimization towards an infilling point, rather than evaluating the nondominated solution set after optimization. ere are a large number of candidate QIs in the literature, such as IGD [27,28], ε-indicator [29], SDE indicator [30], and Hypervolume (HV) [31]. However, these QIs do not meet the requirements of easy-using and low-computation cost to some extent. For example, both IGD and ϵ-indicator demand Pareto front (PF) as the reference set, which is impossible in practical problems; HV does not require PF but a reference point, for which, if set unreasonably, will mislead the optimization search direction [32]. Besides, HV possesses high computational complexity, which increases exponentially with the number of objectives. Some QIs have a specific bias, for example, ε-indicator is more convergenceoriented, and as a result, optimization process may suffer from premature convergence, while SDE prefers diversity [33] which may make the optimization period too long [34]. R2 indicator [35,36], having some desire properties in common with HV but is cheaper to evaluate, attracted widespread interest in recent years. In addition, it is weakly monotonic [37] and does not require PF to be provided; hence, it is adopted in this article. e rest of paper is organized as follows. In Section 2, the mathematical background of Kriging, EI, and R2 indicator is reviewed. en, the proposed EIR2 indicator and framework of the whole optimization algorithm are presented in Section 3. To evaluate the performance of the proposed IPC, a series of experiments over three benchmark test function suites against two competitors IPCs is conducted in Section 4, their results are measured by three performance metrics and then analyzed in Section 5, and followed by Section 6, where the proposed EIR2 is modified and applied to a real-world constrained expensive MOP, an optimization design of an energy storage flywheel. Lastly, Section 7 gives the conclusion and future research direction.

Multiobjective Optimization Problem.
Without loss of generality, a multiobjective optimization problem is formulated as follows: where F(x) represents the objective function vector which contains m objective functions, g i and h j stand for inequality and equality constraint function with the total number p and q, respectively, and x � [x 1 , · · · , x d ] T stands for design vector with lower bound x min and upper bound x max . A solution u is said to dominate another one v if and only if there is no x satisfying x ≺ x * (such x * is also referred to as nondominated by x). e set of all the Pareto optimal solutions is called the Pareto Set (PS), whose image in the objective space is called Pareto Front (PF).
In practice, objective functions and constraint functions may come from finite element simulation; hence, they have no explicit analytic expressions and are usually computationally expensive to evaluate. Under this condition, we are interested in identifying a set of nondominated solutions, called approximated Pareto Set/Front, to represent the true Preto Set/Front, meanwhile consuming as less computing resources as possible.

Kriging Model and Expectation Improvement.
Kriging model is an unbiased estimation model with smallest estimated variance, excellent high-dimensional, and nonlinear fitting capability. It assumes the relationship between input x and predicted output y as follows: where μ is a constant and z(x) ∼ N(0, σ 2 ) is the Gaussian random variable, which represent global trend and local deviation of the output separately. Having a set of known observations S � (x i , y i ), i � 1, · · · , N , our job is to search for the best values of μ and σ when predicting at a new point x. In the design space, the relationship between two different points x i and x j is characterized by correlation functions, such as the Gaussian function: where hyperparameter θ � [θ 1 , · · · , θ d ] T controls the smoothness of prediction function and is usually determined through maximizing a likelihood function of R by some intelligent optimization algorithms such as particle swarm optimization. According to optimization theory, analytical form of μ and σ are obtained: where y � [y 1 , · · · , y d ] T is known as the output vector and R is a matrix of covariance between any two points in S, that is, . Finally, at the new point x, we get the predicted value and variance: where r is a correlation vector between point x and all points in S. Unlike other approximate models such as response surface [38] and neural network [39], which can only give out the predicted value, Kriging is capable of outputting predicted variance as a byproduct, which can be regarded as a predicting uncertainty at x. With Kriging, the value at a new point x is treated as a Gaussian random variable, that is, y(x) ∼ N(μ, s 2 ). An improvement quantity is designed as I(y(x), y min ) � max(y(x) − y min , 0), where y min is the smallest y value in S. It is noted that only those y(x) that are less than y min produce improvement. e expectation of improvement (EI) [40] has a closed form, that is, where Φ and ϕ are the probability distribution and probability density functions, respectively.

R2
Indicator. e R2 indicator [41] is a unitary indicator, which was proposed to evaluate the performance of the nondominated solution set Q, given as where U is a discrete and finite set of utility functions u, for which there are many options to choose, such as weight sums and penalty-based boundary intersection [42].

Expectation Improvement R2
Indicator. e main difficulty in extending EI infilling criterion to MOPs lies in the fact that we have to specify y min before calculating the EI value, which is impractical as no such solution exists as being optimal with respect to every objective, but a set of nondominated solution found during previous optimization process. It is now highly desired to find an infilling point having large EI values with respect to all points in the current nondominated set.
is motivates us to treat each nondominated solution as current optima and then calculate the EI value in each objective, resulting in a vector function: where p � [p 1 , · · · , p m ] T is a point in current nondominated set P and EI j (x, p) � EI(y j (x), p j ). By this function, we obtain a vector consisting of all EI values for which p j is deemed as y min in objective j. Obviously, we need to build m Kriging models, each corresponding to one objective function. After continually applying function H to all nondominated solutions, we get a set composed of |P| points, defined as where q � [q 1 , · · · , q m ] T . In essence, we mapped point x in design space to a set Q(x) in another space, called Mathematical Problems in Engineering expectation improvement space, where the R2 indicator of Q(x) is defined as fitness of x being selected as the infilling point. e mapping from objective space to the EI space is illustrated in Figure 1.
Combining EI and R2 indicator, a novel indicator for MOP, Expectation Improvement R2 Indicator (EIR2), is defined as follows: As EIR2 is a scalar indicator, some single optimization algorithm can be employed to determine the optimal solution as the infilling point. Besides, EIR2 actually defines an indicator template in which other utility functions can be specified. In this paper, Tchebyche function is employed as utility function because it can tackle MOPs with convex or concave Pareto front, defined as where λ � [λ 1 , · · · , λ m ] T is weight vector satisfying λ j ≥ 0 and m j�1 λ j � 1. ese weight vectors are uniformly distributed in the objective space and are collected in the weight set Λ.
e point that maximizes the EIR2 indicator is recognized as an infilling point for expensive evaluation.

Weight Vectors Generating.
e utility function is actually an optimization subproblem parameterized by the weight vector, by which the optimization process can be controlled. If the geometry characteristics of PF are known, such as being convex/concave, a set of weight vectors can be distributed consistent with the PF in the objective space, by which a more accurate approximate PF with uniform distribution is more likely to be obtained. However, there is often no information about the PF in priori, which is common for practical MOP; hence, a better choice in this situation is to treat all objectives equally important and generate a set of weight vectors evenly distributed in the objective space.
To help generate weight vector λ, an auxiliary set is introduced where H is an userspecified positive integer. We can chose randomly m − 1 element from L as the first m − 1 elements of λ and assign the value of λ m to ensure m i�1 λ i � 1. Exhausting every possible combination, there are totally C m− 1 H+m− 1 weight vectors generated. Figure 2 illustrates all 21 weight vectors generated using parameter m � 3 and H � 5.

Comparison with Other IPCs.
e novelty of EIR2 is that fitness assessment of point is carried out not in the objective space but in the EI space, where the R2 indicator is calculated for the mapping set of the point. Since the EI indicator has a closed function form, so does EIR2 regardless of the number of objective functions. In addition, the computational complexity of EIR2 is O(m * |P| * |Λ|), which is considerably less than that of hypervolume [26].
ParEGO also uses the Chebyshev function; however, its purpose is to integrate multiple objectives into one single objective before applying the EGO algorithm. erefore, in essence, the infilling point criterion used in ParEGO is EI.
With regard to the MaxMin indicator, despite analytic form being existing, its calculation has to resort to dividing the objective space into many blocks on which tedious multivariable integration is carried out. Even worse, the number of blocks increases exponentially with the growth of the number of objective. So, in practice, we only apply this analytic form when m � 2 while using the Monte Carlo method when m > 2. Figure 3 shows contour plots of three IPCs, namely, ParEGO, MaxMin, and EIR2 applied in test function TEST2 defined in Table 1, where lighter color denotes higher value. In these plots, 20 known sample points are represented by red points, and blue points represent the true Pareto set, which are composed of three disconnected point subsets. It is easy to find that ParEGO can identify only one Pareto subset out of three, while MaxMin and EIR2 successfully locate the three Pareto subsets. In the region, where no Pareto solution exists, the function value of EIR2 is almost constant, while the function landscape of MaxMin shows a ladder-like trend.
Deutz [43] recently proposed an IPC, also called EIR2, which is however very different from ours, proposed in this article. e main difference is that the R2 indicator is calculated in different spaces. Deutz's R2 indicator is defined in objective space and the expectation of R2 indicators is carried out through multivariate integration. erefore, the analytic form can only be obtained when m � 2 but does not exist when m ≥ 3, where the Monte Carlo method has to be adopted. In contrast, EIR2 in this paper calculates the R2 indicator in the EI space and hence having the analytic form for any m. Most importantly, our EIR2 involves no multivariable integration and hence is cheap to evaluate.

Overview of the EIR2-EMOA.
Combining the devised EIR2 infilling point criterion with the Kriging model, optimal Latin hypercube sampling method [44], and particle swarm optimization [45], an optimization algorithm for multiobjective problems with expensive functions is proposed, referred to as EIR2-MOEA. e detailed procedure is shown in Algorithm 1.
In the initial stage of EIR2-MOEA, a set of sample points of small size need to be selected for establishing the initial Kriging model. As a design of experiment method, the optimal Latin Hypercube Sampling (LHS) method is utilized as it has flexibility in setting the number of sampling points and can ensure these points filling in the entire design space.
When maximizing the EIR2 value and looking for optimal hyperparameters of the Kriging model, an optimization method is required. However, the landscapes of these two functions are nonlinear, for which traditional gradient-based optimization algorithms are prone to falling into local minima if the initial point is not selected wisely and has to be rerun multiple times. Particle swarm optimization algorithm, as an intelligent optimization algorithm, does not need information about gradient and has fewer parameter settings, high efficiency of search, and lower complexity, hence is employed to find the optimal parameters.

Experiment Setup
In order to assess the performance of the proposed algorithm, we applied it to three sets of standard test function suites and compared it with another two algorithms, Par-EGO and MaxMin.

Test Problems.
ree benchmark function suites, namely, simple set, ZDT set, and DTLZ set, are selected to verify the performance of the IPCs in different aspects, whose definitions are shown in Table 1. e simple set includes three biobjective optimization problems TEST1, TEST2, and FON with the number of variables less than three, whose corresponding PFs are convex, concave, and discontinuous, respectively. e ZDT set contains three biobjective test functions, ZDT1, ZDT2, and ZDT3, with convex, concave, and discontinuous PF, respectively. We set the number of variables to 5 to test IPCs' performance on problems having more decision variables. DTLZ2, DTLZ5, and DTLZ7 form the DTLZ test set, which all have three objectives and five decision variables, and they are selected to assess the performance of the IPCs when the number of objectives is large.

Performance Metric.
We hope that the approximated Pareto front found close enough to the true Pareto front and is evenly distributed in the objective space, that is, to meet the requirement on convergence and diversity. For this purpose, we choose three performance metrics to quantify the performance of the approximated Pareto front.
(1) Inverted Generational Distance (IGD) [28]: where T and P are true and approximated Pareto set, respectively, |T| is the number of solutions in T, and d(x, p) represents the Euclidean distance between x and p in objective space. IGD measures the average distance between the true and approximated Pareto set. erefore, the smaller IGD value is preferred. (2) HyperVolume (HV) [31]: (14) where r � [r 1 , · · · , r m ] T represents reference point which is dominated by all solutions in P and Vol Mathematical Problems in Engineering stands for Lebesgue measure. Geometrically, HV is the volume enclosed by the approximated Pareto set P and the reference point r. HV can reflect both convergence and uniformity. If the approximate Pareto set performances well with respect to both aspects, then the HV value will be high.
(3) Nondominated solution ratio (NR): where |P| is the number of nondominated solutions and N max the maximum number of evaluations by expensive function. NR is used to measure the proportion of the nondominated solutions out of the total sample points evaluated by expensive functions. In other words, NR represents the efficiency of IPC in using computing resources. e larger the NR value, the more nondominated solutions found by the IPC and the higher the utilization efficiency of computing resources.

Parameter Setting.
We hope to evaluate the performance of each IPC with limited computing resources, that is, for each test function we set the maximum number of evaluations involving expensive functions N max , shown in Table 2. e size of the initial sample set is specified to be a fraction of N max , namely, αN max , where α is a proportional coefficient which is 1/3 in this paper. To ensure fairness, all three IPCs start with the same initial sample point set generated by optimal LHS in each run, and ten runs are executed for each test function. Reference points are required for calculating the HV value. All biobjective optimization problems share the same reference point [1.    Mathematical Problems in Engineering As for particle swarm optimization, the population size is set to 100, the evolution generation is 100, and the scaling factor is 1.2. Other parameters of MaxMin and ParEGO are taken as the recommended values in original papers.

Result and Discussion
For each test function, ten independent runs are performed, and the HV, IGD, and NR indicators of the resulting approximated Pareto set are calculated. Finally, their mean and standard deviation of the ten runs are calculated and listed, respectively, in Tables 3-5. For each test function, among the three IPCs, the winning one is shown in boldface with a gray background.

Simple
Tests. TEST1, TEST2, and FON are all biobjective test functions but have, respectively, convex, piecewise continuous, and concave Pareto front. e comparison of mean and std value for HV, IGD, and NR metric can be found at the top part of Tables 3-5, respectively. It is shown that ParEGO achieved the worst performance on all three performance metrics, while EIR2 and MinMax achieve similar results but both significantly better than ParEGO. Specifically, for TEST1 and TEST2, EIR2 is better than MaxMin in terms of HV and IGD but is slightly worse concerning NR. For FON, EIR2 ranked first on all three performance metrics. Figure 4 plots the approximated Pareto front obtained by the three IPCs on TEST1, TEST2, and FON test function, respectively, with its HV value being median in ten runs. It is evident that the number of nondominated solutions obtained by ParEGO is small and they are not evenly distributed, which is more obvious on TEST2 and FON. is phenomenon can be foreseen because ParEGO lacks a mechanism to guarantee the uniform distribution of the solutions, that is, its weight vector is randomly generated at each iteration. In contrast, the approximated Pareto fronts obtained by EIR2 and MaxMin seem f 1 � (1 + g)cos(πθ 1 /2)cos(πθ 2 /2) f 2 � (1 + g)cos(πθ 1 /2)sin(πθ 2 /2) f 3 � (1 + g)sin(πθ 1 /2) θ i � (π/(4(1 + g))) (

Disconnected
Mathematical Problems in Engineering 7 to have completely covered the true Pareto fronts and have good distribution performance, regardless of the nature of the true Pareto front of the test function.
Overall, EIR2 and MaxMin have comparable performance and are much better than ParEGO on simple test functions with a low number of variables.

ZDT Tests.
e middle part of Tables 3-5 presents the IGD, HV, and NR indicator comparison on ZDT test suites. Figure 5 illustrates the approximated Pareto fronts obtained by each IPC on ZDT1∼ZDT3 whose HV value being median in ten runs. It is clear that EIR2 is the best in terms of HV, IGD, and NR indicators for all ZDT test functions. Specifically, the HV value and IGD value of EIR2 are much better than that of ParEGO and MaxMin. It is worth noting that the advantages of EIR2 are quite obvious for IGD and NR indicators. For example, for ZDT3 EIR2 gets an NR value of 0.5438, while the other two algorithms only have 0.1924 and 0.3686; meanwhile, it gets an IGD value of 0.0143 while the other two 0.0544 and 0.0591.
According to Figure 5, it is seen that both ParEGO and MaxMin successfully converged to the true Pareto front of

Input:
N max : Maximum number of evaluations by expensive function α: Ratio of the number of initial samples to T max H: Integer used to generate weight vector Output: P: Nondominated solution set (1) Initialization: Obtain design variable limits x max and x min , number of objective m, etc. according to MOP to be solved. Set S � ∅ and P � ∅.
(2) Generate αN max sample points in design space [x min , x max ] using optimal Latin hypercube sampling.
(3) for k � 1 to αN max do (4) Calculate the expensive objective function values y k � F(x k ) for sample point x k (5) S � S ∪ (x k , y k ) , i.e., add sample point to S (6) end for (7) Generate c � C m− 1 H+m− 1 weight vectors in objective space and save them in set Λ � λ 1 , · · · , λ c . (8) Find non-dominated solutions in S and put them into P (9) for k � 1 to (1 − α)N max do (10) for k � 1 to m do (11) Using PSO to find the best hyperparameter θ (12) Build Kriging model of the m-th objective function based on S (13) end for (14) According to EIR2 indicator, apply PSO to find the best infilling point x inf (15) Calculate expensive objective function   Mathematical Problems in Engineering ZDT1 and ZDT2; however, they failed in making them evenly distributed. is phenomenon is even worse on ZDT3 whose true Pareto front consists of five segmented curves. e majority of solutions found by ParEGO and MaxMin are clustered in one segment, while only a few solutions located in other segments. In contrast, the approximated Pareto set obtained by EIR2 are uniformly and equally distributed among all five segments.
In summary, EIR2 is capable of finding more nondominated solutions than ParEGO and MaxMin using the same amount of computing resources and keeping them evenly distributed along the approximated Pareto front.

DTLZ Tests.
e DTLZ test function suites are used aiming at testing an algorithm's ability to three-objective optimization problems. e metric comparisons are listed at the bottom part of Tables 3-5, respectively. Figure 6 shows each IPC's results on DTLZ2, DTLZ5, and DTLZ7 with their HV value being median in ten runs. Same as the ZDT case, on the three test functions, the EIR2 is still significantly ahead of its two competitors in terms of all three performance metrics. To be more specific, take DTLZ2 as an instance, the NR value of EIR2 is 0.3529, which is nearly three times 0.1262 of MaxMin and more than twice 0.1652 of ParEGO, which shows again EIR2 is more efficient in using computation resources compared to other IPCs. e larger HV value and smaller IGD value obtained by EIR2 also indicate its capacity in achieving strong convergence ability. e same conclusion is also reached by inspecting Figure 6. Specifically, ParEGO has difficulty in converging to the true Pareto front, while MaxMin, despite converged to PF, suffered from being not able to spread its approximated Pareto front uniformly distributed along the true Pareto front which is a 1/8 sphere of radius 1, and there are only a few solutions in the middle part and most of the solutions are found in three edges. In contrast, the approximated PF obtained by EIR2 is adequate to converge to and cover entirely and evenly the 1/8 spheres surface.

Engineering Application
In addition to boundary constraints, most MOPs also contain equality or inequality constraints. We assume that constraint functions are expensive; hence, the Kriging model need to be established for each constraint. On the purpose of measuring the degree to which the solution satisfies the constraints, probability of feasibility (PoF) [46,47] is introduced as follows: where g i (x) and s i (x) is the predicted value and standard deviation of constraint function g i (x). Based on PoF, the average probability of feasibility (APoF) is defined as follows: where v is the number of all inequality constraints. e APoF value of an infeasible solution is low, but it will increase as more constraints get satisfied. Clearly, the value of APoF varies in interval [0, 1]. e closer the value of APoF is to 1, the higher the probability that the solution is feasible, and hence more preferred.
In combination with APoF, EIR2 is extended to constrained MOPs (18) In this section, the proposed CEIR2 indicator is applied to the optimization design of energy storage flywheel [48] to test its effectiveness.
e energy storage flywheel [49] uses a flywheel in highspeed rotating to store kinetic energy and convert it into electrical energy when needed. Given a constant speed, the maximum energy storage is proportional to the moment of inertia of flywheel. Larger moment of inertia can be obtained by adjusting the geometric parameters of the flywheel, but at the same time on the risk of increasing the total mass. In this paper, our job is to maximize the flywheel rotational inertia as well as minimize total mass. Figure 7 is a cross-sectional profile of an energy storage flywheel.
ere are six decision variables: r 1 ∼ r 4 and t 1 and t 2 , representing the radial sizes and thickness of the hub and spoke of the flywheel, respectively. ree inequality constraints involve restrictions on radial stress, radial deformation, and hoop stress. e overall optimization problem is expressed as follows: where x � [r 1 , r 2 , r 3 , r 4 , t 1 , t 2 ], whose upper and lower boundary are listed in Table 6. I(x) and M(x) represent the rotational inertia and total mass of the flywheel, respectively. σ max r , σ max θ , and Δ max r , respectively, represent the maximum radial stress, the maximum hoop stress, and the maximum radial deformation of the outer ring when the flywheel rotates, accompanied by their corresponding allowable values, [σ r ], [σ θ ], and [Δ r ].
Although formulas of the objective functions I(x) and M(x) can be obtained directly from the geometric parameters, the calculation of the constraint functions has to resort to finite element analysis, considering the complexity of flywheel structure and external force conditions; hence, this problem is a constrained expensive multiobjective optimization problem. e material used for the flywheel is Ti-6Al-4V titanium alloy, with corresponding allowable stress [σ r ] � [σ θ ] � 800 MPa. We set [Δ r ] � 10 mm at the working rotation speed ω � 20, 000 rpm. ANSYS is employed for calculating values of objective and constraint functions.
As for parameter setting, the maximum number of function evaluations is 360, α is set to 1/4, and other parameters remain unchanged as for the test function case. e resulting feasible nondominated solutions are obtained by applying CEIR2 to this engineering problem, as shown in Figure 8. For comparison, we use the optimal LHS method to get 900 sample points at once and then evaluate values of the expensive objective and constraint functions,   out of which the feasible nondominated solutions are also shown in Figure 8.
CEIR2 found a total of 44 feasible nondominated solutions, which have a large distribution range in the objective space, namely, I(x) ∼ [16.1, 24.7] kg and M(x) ∼ [0.68, 0.89] kg · m 2 , and uniformly distributed along the approximated PF. In comparison, the optimal LHS spend 900 expensive function evaluations but found only 11 feasible nondominated solutions, which are inferior to that of CEIR2 both in terms of convergence and diversity, indicating CEIR2's advantage in using computing resources.

Conclusion and Future Work
In this paper, a novel infilling point criterion EIR2 is proposed, which is combined with the Kriging approximate model, optimal Latin hypercube sampling, and PSO algorithm for dealing with expensive multiobjective optimization problems. e basic idea of EIR2 is to map a point in the objective space to a set in the EI space, and the R2 indicator of the set is defined as the fitness of this point. Due to its analytic form, the computation required for this criterion is very low. e proposed algorithm is applied to three suites of standard test functions. e results show that the algorithm can get more nondominated solutions with higher convergence and dispersion under the equal computing resources. Combined with the average probability of feasibility indicator, EIR2 can also be applied to constrained expensive MOPs. In the engineering optimization of energy storage flywheel, the nondominated solutions having a wider range and diversity are obtained exhausting relatively low computation cost.
Future work will have to extend EIR2 to a parallel version that is capable of finding multiple infilling points [50][51][52][53][54] in one iteration with the help of parallel computers to further accelerate the speed of the optimization process. Besides, regarding the parameterization of the utility function, we use a fixed set of uniformly distributed weight vectors. is strategy runs smoothly if the geometry of PF is relatively regular, such as a simplex-like shape, but may struggle [55] if the true PF turns out to be irregular, such as having a sharp peak of a low tail or discontinuous, because multiple weight vectors cloud corresponds to one same solution. As a result, the diversity of the nondominated solution set cannot be guaranteed. A potential promising strategy [56][57][58] is to dynamically adjust the weight vectors during the optimization process so that the solutions are directed towards each part of the PF. Future work will be incorporating this strategy to our IPC to further improve its ability to maintain diversity.