Comparing Solutions under Uncertainty in Multiobjective Optimization

Due to various reasons, the solutions in real-world optimization problems cannot always be exactly evaluated, but are sometimes represented with approximated values and confidence intervals. To address this issue, the comparison of solutions has to be done differently than for exactly evaluated solutions. In this paper, we define new relations under uncertainty between solutions in multiobjective optimization that are represented with approximated values and confidence intervals. The new relations extend the Pareto dominance relations, can handle constraints, and can be used to compare solutions both with and without confidence intervals. We also show that by including confidence intervals in the comparisons, the possibility of incorrect comparisons due to inaccurate approximations is reduced. Without considering confidence intervals, the comparison of inaccurately approximated solutions can result in promising solutions being rejected and worse ones preserved. The effect of the new relations on the comparison of solutions in a multiobjective optimization algorithm is also demonstrated.


Introduction
Multiobjective optimization is the process of simultaneously optimizing two or more conflicting objectives. Problems with multiple objectives can be found in various fields, from product design and process optimization to financial applications. Their specificity is that the result is not just one solution, but a set of solutions representing trade-offs between the objectives.
Multiobjective evolutionary algorithms (MOEAs) are known for efficiently solving these kinds of problems [1]. However, MOEAs can also be used for solving optimization problems with uncertain objective values. The reason for uncertainty can be noise, robustness requirements, fitness approximations, or time-varying fitness functions. When solving uncertain optimization problems, it is better if the algorithm takes the uncertainty into account.
Uncertain solutions can be represented with approximated values and the variances of these approximations. From the variance, the confidence interval of the approximation can be calculated. This interval indicates the region in which the exactly evaluated solution is most likely to appear. The confidence interval width indicates the certainty of the approximation: if the confidence interval is narrow, we can be more certain about the approximation, and vice versa. Since confidence intervals offer additional information on the approximations, they can be effectively used to compare solutions, and an algorithm using confidence intervals can perform better by exploiting this additional information [2]. During optimization that does not consider confidence intervals, an approximated solution may be incorrectly identified as the better of two compared solutions. Often the solution that is incorrectly determined as worse is then discarded. Similarly, a promising solution can get discarded if a worse solution is incorrectly determined as being better. In both cases, good solutions are lost due to a comparison of solutions that only considers approximated values.
To prevent these unwanted effects, we propose new relations for comparing solutions under uncertainty, where, in addition to the approximated values of a solution, their confidence intervals are considered. These relations cover all possible combinations that can occur when comparing solutions represented with confidence intervals. The new relations also take into consideration the feasibility of solutions, including the uncertainty of feasibility due to the uncertainty of the solutions. During the optimization process, some solutions are exactly evaluated and others approximated; therefore, the relations under uncertainty also cover the comparison of approximated solutions with exactly evaluated solutions. The relations under uncertainty can be used to compare solutions in any multiobjective optimization algorithm dealing with solutions represented with confidence intervals.
The structure of this paper is as follows. In Section 2 we describe the existing techniques for comparing solutions under uncertainty reported in the literature. In Section 3 we recall the Pareto dominance relations for comparing exactly evaluated solutions in multiobjective optimization. In Section 4 we generalize these relations to solutions represented with approximated values and confidence intervals. Section 5 presents the possible use of the new relations under uncertainty for comparing two solutions in any multiobjective optimization algorithm. Section 6 presents an empirical proof of concept by demonstrating that the use of the new relations under uncertainty reduces the number of incorrect comparisons. Section 7 concludes the paper with a summary of the work done.

Existing Techniques for Comparing Solutions
Comparison of solutions is an essential step of the optimization process. Comparing solutions helps determine which solution is better and therefore appropriate to drive the optimization process further, and which one is worse and should be replaced with a better solution. The comparison of solutions in single-objective optimization is straightforward: either both solutions have the same objective value, or one solution is better than the other, which means that deciding which solution is better is trivial.
In multiobjective optimization, we wish to simultaneously optimize several conflicting objectives. Here, one solution can be better in some objectives and worse in others. Consequently, the comparison of solutions, and therefore the whole optimization process, becomes more challenging.
When solving real-world optimization problems, it is often not possible to determine the objective values without uncertainties. The nature of the uncertainties depends on the problem. In [2], four types of uncertainty sources are mentioned. The first is noisy fitness functions, where the same input parameters return different objective values. The second is the requirement for solution robustness, where the quality of the obtained optimal solutions should be robust against environmental changes or deviations from the optimal point. The third is approximated fitness, where the fitness functions suffer from approximation errors. The fourth and final type is time-varying fitness functions, where the optimum of the problem to be solved changes over time and, thus, the optimizer should be able to track the optimum continuously.
Regardless of the origin of the uncertainty, the techniques for comparing solutions under uncertainty and determining their dominance status are similar. Two different approaches are used when comparing solutions under uncertainty. The first is to take the approximated value and variance, transform them into a single value, and then compare these single values. The second is to calculate the confidence interval and then directly compare the solutions represented with confidence intervals.
An example of the first approach can be found in [3], where probabilistic dominance is defined and, for comparing solutions, the probability of dominance is used rather than outright dominance. If the probability that one solution dominates the other is higher than the specified degree of confidence, then this solution is said to dominate the other. This probabilistic dominance allows the use of the usual deterministic elitist algorithms with a certain degree of confidence in the results. The methods to calculate the probability of dominance vary, depending on the type of uncertainty.
Similarly, in [4] the authors define the dominance relation between solutions based on the probabilities of one solution's objective being better than the same objective of another solution. For solutions with more objectives, hypercuboids are defined and, similarly, comparing their volumes and center points can determine the probability of one solution being better than the other. To select diverse solutions, the paper also redefines the crowding distance defined in [5], based on the location and the volume of the hypercuboids of these solutions.
Another example of this approach is presented in [6], where each solution is inherently associated with a probability distribution over the objective space. A probabilistic model that combines quality indicators and uncertainty is created and then used to calculate the expected value of each solution.
In the second approach, the solutions represented with approximated values and confidence intervals are compared to determine the relation between them.
In [7, 8], the authors tackle a noisy optimization problem with an algorithm that evaluates every solution multiple times (and, if necessary, performs additional evaluations to reduce the uncertainty) and calculates the mean value and standard deviation of these evaluations. A modified Pareto dominance relation is defined for comparing solutions in uncertain environments. The Pareto dominance relation is modified in such a way that a solution x dominates a solution y if, for every objective, the mean value plus the standard deviation of x dominates the mean value minus the standard deviation of y. If this is not the case, the solutions are nondominated. To avoid having too many nondominated solutions, the promising solutions are additionally evaluated to make the standard deviation smaller.
In [9], a robust multiobjective evolutionary algorithm was developed for solving optimization problems in which solutions should be invariant to small input changes. The uncertain parameters are represented with intervals, which results in the solution objectives also being represented with intervals. The algorithm then compares worst-case scenario values of the objectives, that is, the values at the border of an interval.
In [10], the authors tackled noisy optimization problems with a modified NSGA-II algorithm [5] for handling solutions with uncertainty. The procedure for obtaining the rank of solutions is transformed so that it also considers the variance of the solutions. Dominated solutions can also be ranked on the Pareto frontier if the distance to any nondominated solution, calculated from the fitness values and variances of the solutions, is smaller than a threshold called the neighborhood restriction factor. During the optimization process, this factor becomes smaller and the number of evaluations of nondominated solutions increases, resulting in a smaller variance and a more precise set of nondominated solutions.
A concept of comparing solutions with uncertain objectives represented with intervals is presented in [11]. The authors define an extension of Pareto dominance based on a theory of probabilistic dominance. They present a case where objective values are continuously and uniformly distributed inside the interval, and by comparing the distributions the probability of dominance is calculated. The approach is then implemented in the modified SPEA [12] algorithm.
Another concept of comparing solutions under uncertainty is presented in [13], where new Pareto relations are defined using a possibilistic framework. The solutions, characterized by a particular possibility distribution, are represented with triangular possibility distributions: a triplet consisting of the most plausible value and the lower and upper borders of the distribution that represent the least plausible values. Based on this representation, strong Pareto dominance, weak Pareto dominance, and indifference are defined and used on a vehicle routing problem with uncertain demands.
A more theoretical approach to solution comparison under uncertainty is presented in [14] for optimization problems where the uncertainty of the solutions cannot be reduced by sampling methods. The solutions are represented with intervals, and new relations are defined for comparing those intervals. The authors define certain and uncertain domination criteria for comparing intervals. On this basis, they suggest a strong Pareto dominance relation for cases when the dominance relation can be determined, and a weak Pareto dominance relation when the dominance relation cannot be determined because of uncertainty. In the latter case, expected values are assumed for every solution and these values are then compared.
In [15], a partial order approach is suggested to enable the comparison of solutions represented with confidence intervals. This approach does not differentiate between the cases in which the upper border of one interval dominates the lower border of another interval and the cases in which the intervals partly overlap. A very similar approach to handling solutions represented with intervals, called imprecise Pareto relations, is presented in [16].
Bounding boxes representing multiobjective solutions with confidence intervals are defined in [17] (they are described in greater detail in Section 4). The authors presented various comparison strategies, but in all of them the comparison of bounding boxes is simplified to the comparison of bounding box bounds. The individuals are compared to all solutions in the population; individuals with a small probability of being competitive are rejected, while individuals with a high probability of being better are exactly evaluated.
To our knowledge, none of these methods systematically covers all cases that can occur when comparing (constrained) multiobjective solutions with confidence intervals, which is the main contribution of this paper.
Because the comparison of solutions under uncertainty is based on the comparison of solutions without uncertainty, the latter concept is described first.

Relations without Uncertainty
A constrained multiobjective optimization problem (CMOP) consists of finding the minimum of the function

f(x) = (f_1(x), f_2(x), ..., f_m(x))

subject to

(i) boundary constraints: x_i^min ≤ x_i ≤ x_i^max, i = 1, ..., n,
(ii) constraints on decision values: g_j(x) ≤ 0, j = 1, ..., p,
(iii) constraints on objectives: h_k(f(x)) ≤ 0, k = 1, ..., q,

where n is the number of variables, m is the number of objectives, p is the number of constraints on decision variables, and q is the number of constraints on objectives.
The boundary constraints define the search region of an optimization problem by setting the lower bounds x_i^min and the upper bounds x_i^max for the variables. Inside the search region, the constraints on decision values further define the feasibility of solutions. An example of such a constraint would be that the sum of two variables should not exceed a predefined value. Since these constraints can be complex, the region they define can also be complex. As a result, the red contour in Figure 1 that represents this region is drawn as a complex shape. The constraints on objectives limit the feasibility of the objective values. An example of a constraint on objectives would be to set a maximum budget and a minimum top speed in the optimization problem of finding a fast and cheap car. The constraints on objectives are typically not very complex; hence the region defined by these constraints is fairly simple. We call this region the feasible objective value region; in Figure 1 it is surrounded by the blue and green lines.
If all constraints are satisfied, we say that the solution is feasible; otherwise it is infeasible. All feasible solutions in the decision space constitute the feasible region. The mapping of this region into the objective space is called the feasible region image; it is marked with black hatching in Figure 1. The feasible solutions of an optimization problem that are the best with regard to all objectives create a front of solutions called the Pareto optimal front, which is indicated by the green line in Figure 1.
This problem formulation is used to describe the relations between the solutions both without and with uncertainty. In this section we consider the case in which all solutions of a CMOP are exactly evaluated, that is, without uncertainty. The above-defined relations are usually used only when solving problems without constraints, where all solutions are feasible. For cases where the feasibility of solutions is unknown, the Pareto dominance relation is slightly modified, as suggested in [5].
Definition 5 (constrained dominance). Solution x constrained-dominates solution y, x ≺_c y, if any of the following conditions is true.
(1) Solution x is feasible and solution y is not.
(2) Solutions x and y are both infeasible, but solution x has a smaller overall constraint violation.
(3) Solutions x and y are feasible and solution x Pareto dominates solution y.
The effect of using the constrained dominance principle is that any feasible solution is better than any infeasible solution, and that of two infeasible solutions the one closer to the feasible region is better.
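As an illustration, the constrained dominance check can be sketched in a few lines of code. This is a minimal sketch with our own naming, not a reference implementation; it assumes minimization and a nonnegative overall constraint violation, where a violation of zero means the solution is feasible.

```python
def pareto_dominates(a, b):
    """True if objective vector a Pareto dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def constrained_dominates(obj_a, viol_a, obj_b, viol_b):
    """Constrained dominance (Definition 5): viol_* is the overall
    constraint violation, with 0 meaning the solution is feasible."""
    feas_a, feas_b = viol_a == 0, viol_b == 0
    if feas_a and not feas_b:      # (1) feasible beats infeasible
        return True
    if not feas_a and not feas_b:  # (2) smaller violation wins
        return viol_a < viol_b
    if feas_a and feas_b:          # (3) both feasible: Pareto dominance
        return pareto_dominates(obj_a, obj_b)
    return False                   # a infeasible, b feasible

```
For example, a feasible solution constrained-dominates an infeasible one regardless of how good the infeasible solution's objective values are.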

Relations under Uncertainty
In this section we consider the case where the objective values of the solutions are represented with approximated values and a confidence interval for each approximation. In such a case, the standard relations described previously are not suitable and must be adapted to accommodate the uncertainty. Every solution x is represented with a vector of approximated objective values z = (z_1, z_2, ..., z_m) and a confidence vector ε = (ε_1, ε_2, ..., ε_m). For the objective z_i, the confidence interval is equal to [z_i − ε_i, z_i + ε_i]. In order to be able to compare the solutions represented in this way, the relations between the solutions under uncertainty are defined on the bounding boxes (BBs) of their objective values. From the vectors of the approximated values and the confidence intervals, the bounding box of an objective vector z is constructed as BB(z, ε) = [z_1 − ε_1, z_1 + ε_1] × ⋅⋅⋅ × [z_m − ε_m, z_m + ε_m] (Figure 2). This definition of the BB presumes that the confidence intervals are symmetric. This is not always the case, for example, because of a nonsymmetric form of noise. Instead of considering just the confidence vector ε, we could define a lower-bound confidence vector ε^l = (ε^l_1, ..., ε^l_m) and an upper-bound confidence vector ε^u = (ε^u_1, ..., ε^u_m). For the objective z_i, the confidence interval would then be equal to [z_i − ε^l_i, z_i + ε^u_i], and the definition of the bounding box would change accordingly. However, since the relations under uncertainty are indifferent to the shape and size of the bounding boxes, we can, for the sake of simplicity, presume that the confidence intervals are always symmetric.
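Constructing such a bounding box is straightforward in code. The following sketch (illustrative names, not from the paper's implementation) returns the box as a list of per-objective intervals, in both the symmetric and the asymmetric variant:

```python
def bounding_box(z, eps):
    """BB(z, eps): per-objective intervals [z_i - eps_i, z_i + eps_i]."""
    return [(zi - ei, zi + ei) for zi, ei in zip(z, eps)]

def bounding_box_asym(z, eps_lo, eps_hi):
    """Asymmetric variant: intervals [z_i - eps_lo_i, z_i + eps_hi_i]."""
    return [(zi - lo, zi + hi) for zi, lo, hi in zip(z, eps_lo, eps_hi)]

```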
In addition to bounding boxes, where every objective has its own confidence interval, multiobjective solutions with uncertainty can also be represented with ellipsoids. Representation with ellipsoids restricts all objectives from obtaining their worst-case values simultaneously. But since the comparison of multiobjective solutions is performed by comparing pairs of objectives, where the confidence of each objective is inspected, we adopt the approach with bounding boxes.
We handle relations under uncertainty without constraints and with constraints separately.
To test whether BB(z¹, ε¹) probably dominates BB(z², ε²), it is enough to check whether the worst-case corner point of the first bounding box, (z¹_1 + ε¹_1, ..., z¹_m + ε¹_m), Pareto dominates the best-case corner point of the second, (z²_1 − ε²_1, ..., z²_m − ε²_m). If it does, then BB(z¹, ε¹) probably dominates BB(z², ε²).
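This corner test translates directly into code. A minimal sketch (our naming; minimization assumed):

```python
def pareto_dominates(a, b):
    """True if vector a Pareto dominates b (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def probably_dominates(z1, e1, z2, e2):
    """BB(z1, e1) probably dominates BB(z2, e2) if the worst-case
    corner of the first box dominates the best-case corner of the second."""
    worst_corner_1 = [z + e for z, e in zip(z1, e1)]  # upper corner of BB1
    best_corner_2 = [z - e for z, e in zip(z2, e2)]   # lower corner of BB2
    return pareto_dominates(worst_corner_1, best_corner_2)

```
With disjoint boxes such as BB((1, 1), (0.25, 0.25)) and BB((2, 2), (0.25, 0.25)), the first probably dominates the second; as soon as the boxes overlap, the test fails and the relation has to be resolved differently.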
Finally, when none of the presented relations under uncertainty apply, two solutions are in an undetermined relation.
If solution x has the bounding box BB(z¹, ε¹), solution y has the bounding box BB(z², ε²), and the two bounding boxes are in an undetermined relation, we say that solution x is in an undetermined relation with solution y. This means it is expected that either one solution weakly dominates the other or that the solutions are incomparable. In Figure 3, x⁵ is in an undetermined relation with x², x³, and x⁴.
Two implications hold between the relations under uncertainty. If solution x probably dominates solution y, then solution x is also probably nondominated by solution y. Similarly, probable incomparability implies probable Pareto nondominance. If all the solutions are exactly evaluated, that is, all their corresponding confidence interval widths equal zero, the relations presented in this section directly translate to those described in Section 3.

Relations under Uncertainty with Constraints
Similarly to the Pareto dominance relations (Section 3), the relations under uncertainty without constraints (Section 4.1) are usually applied only if all solutions are feasible. To compare solutions represented with BBs where the feasibility of the solutions is uncertain, we need to define a measure of feasibility for solutions represented with BBs. Since BBs are defined in the objective space, we only need to check the feasibility of BBs against the constraints on objectives that define the feasible objective value region F. We assume that, before checking these constraints, the solution has already met the constraints on decision values and the boundary constraints. In the unlikely case of very complex constraints on objectives, it can be difficult to implement and calculate the intersection between a BB and F. However, the procedure can be simplified by checking the feasibility only for the points at the vertices of the BB. If all points are feasible, we say that the solution is probably feasible; if all points are infeasible, the solution is probably infeasible; and if some points are feasible and others are not, the solution has undetermined feasibility. We can afford this simplification since the widths of the confidence intervals are relatively small and we can presume that the vertices represent the whole BB sufficiently well.
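The vertex-based feasibility check can be sketched as follows (illustrative names; is_feasible stands in for the problem's constraints on objectives):

```python
from itertools import product

def bb_feasibility(box, is_feasible):
    """Classify a bounding box given as [(lower_i, upper_i), ...] by
    evaluating the feasibility predicate at every vertex of the box."""
    verdicts = [is_feasible(v) for v in product(*box)]
    if all(verdicts):
        return "probably feasible"
    if not any(verdicts):
        return "probably infeasible"
    return "undetermined"

```
For a box [(0, 1), (0, 1)] and the constraint z_1 + z_2 ≤ 1, the vertex (0, 0) satisfies the constraint while (1, 1) violates it, so the feasibility is undetermined.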
To compare feasible and infeasible solutions represented with BBs, we define the following four relations under uncertainty with constraints.
(1) The bounding box BB(z¹, ε¹) is probably feasible and the bounding box BB(z², ε²) is probably infeasible.
(2) The bounding boxes BB(z¹, ε¹) and BB(z², ε²) are both probably infeasible, but the objective vector z¹ has a smaller overall constraint violation.
(3) The bounding boxes BB(z¹, ε¹) and BB(z², ε²) are both probably infeasible, and both objective vectors z¹ and z² have the same overall constraint violation.
Two solutions are probably constrained-incomparable when their corresponding bounding boxes are probably constrained-incomparable. When two solutions are in an undetermined constrained relation, the three following outcomes are possible: (1) the first solution dominates the second one, (2) the second solution dominates the first one, or (3) the solutions are incomparable. We present a possible scenario to clarify why solutions can be in an undetermined constrained relation due to their feasibility. Suppose we compare a solution x with undetermined feasibility and a solution y with probable feasibility, and x is probably nondominated by y. If we were to exactly evaluate solution x and it turned out to be infeasible, solution y would dominate solution x. This implies that the solutions can be in any relation; hence, by definition, they are in an undetermined constrained relation. Similarly, there are other cases in which solutions are in an undetermined constrained relation and we need to exactly evaluate at least one of the solutions. All relations for comparing solutions mentioned in this paper are summarized in Tables 1, 2, and 3.

Comparing Solutions under Uncertainty
In iterative optimization algorithms, the process of gradual solution improvement is based on solution comparisons. By comparing solutions, the algorithm finds which solutions are better and promotes them further, while those found worse are discarded.
In this section, we show the use of the constrained relations under uncertainty for comparing two solutions represented with BBs. This comparison can be implemented in any multiobjective optimization algorithm. However, since every algorithm applies a specific search strategy, we present how the relations under uncertainty can be used instead of the Pareto dominance relations.
Nevertheless, it should be noted that a straightforward use of the relations under uncertainty instead of the Pareto dominance relations is not always possible. When the confidence intervals (at least one of them) overlap, confidence interval reduction procedures have to be applied in order to determine the result of the comparison. These additional procedures can, for example, be exact evaluations in the case of surrogate-based optimization, or, in the case of optimization with noisy objectives, additional evaluations that reduce the width of the confidence interval. In cases where the width of the confidence interval cannot be changed and the relation between the solutions is unknown, another approach needs to be taken, for example, comparing the approximated values instead of the BBs.
When comparing solution x with confidence vector ε¹ and solution y with confidence vector ε², we consecutively check the four possibilities listed below.
(1) If x probably constrained-dominates y, we can consider x and y to be in the Pareto dominance relation (x ≺ y).
Here the solution x is probably better than the solution y; therefore, no confidence interval reduction is necessary, as it would probably not change the dominance relation.
(2) If x and y are probably constrained-incomparable, we can consider them incomparable (x ‖ y).
In this case, solutions x and y are probably constrained-incomparable. Even if both solutions were exactly evaluated, they would probably still be incomparable and the algorithm would probably still keep both solutions. Hence, no confidence interval reduction is needed in this case.
(3) If x probably weakly constrained-dominates y, the algorithm checks ε¹. If ε¹ ≠ 0, the algorithm performs confidence interval reduction on x and compares the solutions again. If ε¹ = 0, the algorithm performs confidence interval reduction on solution y and compares the solutions again.
In this case, the solution x is probably better in at least one objective and probably not worse in the others. In order to determine whether solution x dominates solution y or the two are incomparable, confidence interval reduction needs to be performed on (at least) one solution. Because x is more promising, its confidence intervals are checked first. If their widths are different from zero, which means that the solution is approximated, the algorithm performs confidence interval reduction on x and then compares the solutions again. If the confidence interval widths are equal to zero, which means that solution x is exactly evaluated, then, in order to be able to compare the solutions, the algorithm performs confidence interval reduction on y and compares the solutions again.
(4) If x and y are in an undetermined constrained relation, the algorithm checks the feasibility of the solutions. If both solutions have undetermined feasibility, the algorithm randomly chooses one solution and performs confidence interval reduction on it. If only one solution has undetermined feasibility, the algorithm performs confidence interval reduction on that solution and compares the solutions again. If both solutions are probably feasible, the algorithm checks the confidence vector of a randomly picked solution. If it is not equal to zero, the algorithm performs confidence interval reduction on this solution and compares the solutions again. If the confidence vector is equal to zero, the algorithm performs confidence interval reduction on the other solution and compares the solutions again.
In this case, the only way to find out which solution is better is to perform confidence interval reduction on (at least) one solution. Because solutions near the borders of the feasible region are usually better, the algorithm first checks and performs confidence interval reduction on such solutions. If both solutions are probably feasible, the algorithm checks whether the first solution is exactly evaluated. If it is not, the algorithm performs confidence interval reduction on it. If it is, the algorithm performs confidence interval reduction on the other solution and then compares the solutions again.
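The overall procedure can be sketched as a loop that retries the comparison after each confidence interval reduction. The sketch below is restricted to the unconstrained case for brevity; it models confidence interval reduction as an exact evaluation that sets the confidence radii to zero, and it uses one plausible test for probable incomparability (each solution certainly better in at least one objective). All names and simplifications are ours, not the paper's implementation.

```python
class Sol:
    """A solution with approximated objectives z and confidence radii eps.
    For simplicity, the 'exact' value revealed by reduction equals z."""
    def __init__(self, z, eps):
        self.z, self.eps = list(z), list(eps)

    @property
    def ci_width(self):
        return max(self.eps)

def _dominates(a, b):
    """Pareto dominance of plain vectors (minimization)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def _probably_dominates(a, b):
    upper_a = [z + e for z, e in zip(a.z, a.eps)]  # worst-case corner of a
    lower_b = [z - e for z, e in zip(b.z, b.eps)]  # best-case corner of b
    return _dominates(upper_a, lower_b)

def _probably_incomparable(a, b):
    # Each solution is certainly better in at least one objective.
    a_better = any(za + ea < zb - eb
                   for za, ea, zb, eb in zip(a.z, a.eps, b.z, b.eps))
    b_better = any(zb + eb < za - ea
                   for za, ea, zb, eb in zip(a.z, a.eps, b.z, b.eps))
    return a_better and b_better

def compare(a, b):
    """Return 'a', 'b', or 'incomparable'."""
    while True:
        if _probably_dominates(a, b):
            return "a"
        if _probably_dominates(b, a):
            return "b"
        if _probably_incomparable(a, b):
            return "incomparable"
        # Undetermined: reduce the CI of an approximated solution and retry.
        target = a if a.ci_width > 0 else b
        if target.ci_width == 0:
            return "incomparable"  # both exact and mutually nondominated
        target.eps = [0.0] * len(target.eps)  # 'exact evaluation'

```
The loop terminates because each pass through the undetermined branch either zeroes one solution's confidence radii or returns; after at most two reductions both solutions are exact and one of the three outcomes applies.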

Empirical Proof of Concept
In this section we test the hypothesis that using the new relations under uncertainty reduces the number of incorrect comparisons. In the following experiment we compared multiobjective solutions with uncertainty, where the uncertainty comes from solution approximations obtained with surrogate models. To be able to count the incorrect comparisons, every solution comparison was performed both with the relations under uncertainty and with the Pareto dominance relations. In addition to comparing approximated solution values, we also compared the exact solution values. This allowed us to monitor the accuracy of the comparison of uncertain solutions.
Since we did not want to use random solutions for the comparisons, we decided to perform solution comparisons as executed by the NSGA-II algorithm [5]. In every generation, the NSGA-II algorithm creates a new set of solutions, adds them to the current ones, and then performs selection on the union to select the most promising solutions. The selection procedure includes comparing every solution with all other solutions to determine its dominance status. On these comparisons we compared the relations under uncertainty with the Pareto dominance relations.
The comparison was performed on three benchmark problems: the Poloni optimization problem [18] and two problems from [1], called OSY and SRN. All of them are two-objective problems.
Gaussian process (GP) modeling [19] was used to build surrogate models for the solution approximations. For the confidence interval width of an approximation we used two standard deviations (2σ), which corresponds to about 95% of the normal distribution of the approximations. To test the correlation between the surrogate model accuracy and the number of incorrect comparisons, five different models of increasing accuracy were built, each on a larger number of solutions.
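The confidence radius is thus derived directly from the model's predictive standard deviation. A minimal sketch (the per-objective predictive means and standard deviations are assumed to come from the GP model; names are ours):

```python
def ci_from_prediction(means, stds, k=2.0):
    """Per-objective (approximation, confidence radius) pairs, with the
    radius k standard deviations wide (k=2 covers about 95% of a
    normal distribution)."""
    return [(m, k * s) for m, s in zip(means, stds)]

```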
The algorithm parameter values used for testing were the same for all three problems.They were set as follows: (i) population size: 100, (ii) number of generations: 100, (iii) number of runs: 30.
For every problem and every model, we calculated the number of incorrect comparisons for each comparison technique. In addition, we calculated the average confidence interval width and, for the relations under uncertainty, also the number of cases where confidence interval reduction procedures (in our case, exact evaluations of approximated solutions) had to be performed in order to compare the solutions.
The results averaged over 30 runs are presented in Tables 4, 5, and 6. These results show that by increasing the number of solutions used for building the surrogate model, the accuracy of the model increases and the number of incorrect comparisons decreases. The reason for the high number of incorrect comparisons with the models built on a smaller number of solutions is that the solutions used for building the surrogate models do not cover the decision space well enough. Due to the lack of information, the solution approximations can be incorrect by a large margin. This can also result in the exact solution values falling outside the bounding boxes, which is reflected in some incorrect comparisons encountered even with the relations under uncertainty.
With an increasing number of solutions used for building the surrogate model, the average confidence interval width also gets narrower. The narrower the confidence intervals, the smaller the bounding boxes and the number of required confidence interval reductions.
Examining the number of incorrect comparisons for the two relation types, we can see that with the Pareto dominance relations the number of incorrect comparisons is from 3 to 243 times higher than with the relations under uncertainty, regardless of the accuracy of the surrogate model. As we can see, in order to reduce the number of incorrect comparisons, we have to perform additional confidence interval reductions. This in turn increases the total optimization time; hence a balance between the number of incorrect comparisons and the time spent performing additional confidence interval reductions needs to be found.

Conclusion
In this paper we have presented new relations for comparing solutions under uncertainty. The uncertainty can derive from noisy fitness functions, the requirement for robust solutions, surrogate approximations, or time-varying fitness functions. The relations under uncertainty are defined on bounding boxes that are based on approximated values and confidence intervals.

Figure 1: The objective space of a constrained multiobjective optimization problem.

Figure 2: The bounding box of an objective vector.

Figure 3: Approximated solutions presented in the objective space using bounding boxes.

Definition 16 (undetermined constrained relation). The bounding box BB(z¹, ε¹) is in an undetermined constrained relation with the bounding box BB(z², ε²) if the two bounding boxes are not in any other constrained relation under uncertainty. Again, two solutions are in an undetermined constrained relation when their corresponding bounding boxes are in an undetermined constrained relation.
Definition 4 (incomparability). The objective vectors z¹ and z² are incomparable, z¹ ‖ z², if and only if z¹ does not weakly dominate z² and z² does not weakly dominate z¹. Again, if z¹ and z² are incomparable, the corresponding solutions are incomparable.

Table 2: Relations under uncertainty without constraints.

Table 3: Relations under uncertainty with constraints.

Table 4: Comparison of the newly defined relations with the Pareto dominance relations on the Poloni problem (average values over 30 runs).

Table 5: Comparison of the newly defined relations with the Pareto dominance relations on the OSY problem (average values over 30 runs).

Table 6: Comparison of the newly defined relations with the Pareto dominance relations on the SRN problem (average values over 30 runs).