Multiobjective Optimization Using Cross-Entropy Approach

A new approach for multiobjective optimization is proposed in this paper. The cross-entropy method for single-objective (SO) optimization is adapted to multiobjective (MO) optimization by defining an adequate sorting criterion for selecting the best candidate samples. The selection relies on the nondominated sorting concept and the crowding distance operator. The effectiveness of the approach is tested on several academic problems (e.g., Schaffer, Fonseca, Fleming, etc.). Its performance is compared with that of other multiobjective algorithms. Simulation results and comparisons based on several performance metrics demonstrate the effectiveness of the proposed method.


Introduction
Optimization is a basic tool for decision making in many engineering areas. In these fields, there are often many conflicting objectives to satisfy. Usually, all objectives are converted into one single-objective (SO) function. The goal is to find the maximum or minimum of this SO function subject to the physical constraints of the system. The result reflects a compromise between all objectives, and the function is formulated so as to achieve the desired compromise.
The conversion from MO to SO is done either by aggregating all objectives in a weighted function or by retaining one objective and adding the others to the constraints. The weaknesses and limitations of this aggregation are as follows.
(1) It requires prior knowledge about the relative importance of the objectives and about the limits on the objectives that are converted into constraints.
(2) The aggregated function leads to only one solution.
(3) Trade-offs between objectives cannot be easily evaluated.
(4) The solution may not be attainable unless the search space is convex.
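Limitations (2) and (4) can be illustrated with a small sketch. The two-objective toy problem below (minimize $f_1 = t$ and $f_2 = 1 - t^2$ for $t \in [0, 1]$) and the function name `weighted_sum_minimizer` are assumptions chosen for illustration; they do not come from the test problems used later. Because the trade-off curve is not convex, every weighting returns one of the two end points and the interior compromises are never found.

```python
def weighted_sum_minimizer(w, grid_size=1001):
    """Minimize w*f1 + (1-w)*f2 over t in [0, 1] by a plain grid search.

    Hypothetical toy problem: f1(t) = t, f2(t) = 1 - t**2 (nonconvex front).
    """
    best_t, best_val = None, float("inf")
    for k in range(grid_size):
        t = k / (grid_size - 1)
        f1, f2 = t, 1.0 - t * t
        val = w * f1 + (1.0 - w) * f2
        if val < best_val:
            best_t, best_val = t, val
    return best_t

# Each weight gives a single solution; collect them over nine weightings.
solutions = {weighted_sum_minimizer(w / 10) for w in range(1, 10)}
print(sorted(solutions))
```

Only the extreme points of the front appear in `solutions`, whatever the weight, which is the behavior a Pareto-based method is meant to avoid.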
Aggregation is therefore not recommended for systems with conflicting objectives. Instead, all possible solutions for all objectives need to be known simultaneously. In the business world this is called "trade-off analysis". There are several areas in engineering where trade-off analysis is necessary, such as the following.
(1) The design of controllers while reducing their cost, which are two conflicting objectives.
(2) The placement of more functional blocks on a chip while minimizing the chip area and/or power dissipation.
(3) The search for the vehicle that has the highest range while at the same time consuming the minimum amount of energy.
MO problems are more difficult to solve than SO problems. In SO optimization the solution is unique, whereas in MO optimization there is a set of acceptable trade-off optimal solutions, called the Pareto front. In fact, MO optimization is considered the analytical stage of the multicriteria decision making (MCDM) procedure; it consists of determining all solutions of the MO problem that are optimal in the Pareto sense [4]. The preferred solution is then selected from this set by the designer or decision maker (DM).
The Pareto set has several advantages. It contains the solutions that are optimal from an "overall" standpoint and allows the DM to make an informed decision by seeing a wide range of compromises (trade-offs). This characteristic is helpful because it provides better knowledge of the system with respect to the consequences of decisions [5].
Many metaheuristic methods use Pareto ranking to determine the probability of replication of an individual. The basic idea is to find the set of nondominated individuals in the population. Such methods include the Niched Pareto Genetic Algorithm (NPGA) [6], the Nondominated Sorting Genetic Algorithm (NSGA) [7], the Strength Pareto Evolutionary Algorithm (SPEA) [8], the Nondominated Sorting Genetic Algorithm-II (NSGA-II) [9], and, more recently, Multiobjective Particle Swarm Optimization (MOPSO) [10][11][12].
Among deterministic methods, only one, the Normal Boundary Intersection (NBI) method [13], generates Pareto optimal solutions for a general nonlinear multiobjective optimization problem.
In 1997, the cross-entropy (CE) (or Kullback-Leibler CE) method was proposed by Rubinstein for solving rare event simulation problems [14]; it was afterward extended to combinatorial problems and to continuous single-objective optimization. Several randomized optimization algorithms based on the CE method have been proposed in the literature and have been shown to perform well on many optimization problems, often outperforming other randomized algorithms [15]. In this decade, the CE method has been applied in several engineering applications (see, e.g., [16][17][18]).
In [19] the authors extended the negative log-likelihood (NLL) approach, in combination with a multiobjective evolutionary algorithm (MOEA), to solve a multiobjective optimization problem. In [20] the CE method was extended to multiobjective optimization in general and to multiobjective water distribution systems design in particular; this was done by converting the MO optimization into an SO optimization via a weighting method, so that the Pareto set is generated by using different weightings of the objectives. More recently, [21] proposed an extension of the CE method to MO optimization in which the CE parameters are tuned based on information collected from nondominated solutions clustered with the fuzzy c-means (FCM) algorithm, and in [22] a cross-entropy method used with Pareto ranking was developed.
Motivated by the satisfactory results of the CE method in SO optimization, this paper presents an original use of the CE method to solve MO optimization problems. A novel approach for computing the Pareto optimal front is proposed, based on rare event simulation, nondominated sorting, and a crowding operator [9] to select the elite solutions. It is simpler than the algorithm proposed in [21].
This document is divided into three parts. The first part defines the basics: rare event simulation (RES), nondomination, the crowding distance computation, and the crowding distance operator. The second part presents the proposed algorithm. The last part tests the proposed algorithm on well-known test problems and compares it with other metaheuristic methods and with a competing MOCE approach.

The Multiobjective Optimization via CE Method
The CE method was designed to solve rare event simulation problems and was then extended to SO optimization problems. This section proposes a new extension to MO optimization. The first part is devoted to the presentation of the CE method for SO optimization and is inspired by [16].
2.1. From Rare Event Simulation to SO Optimization.
The CE method is a method to evaluate the probability that a rare event occurs. Let $X$ be a random variable taking its values in some space $\mathcal{X}$ with a probability density function (pdf) $f(\cdot)$, and let $S(\cdot)$ be a real-valued function defined on $\mathcal{X}$. The problem is to evaluate $\ell = P_{X \sim f}(S(X) \geq \gamma)$, where $\gamma$ is a given threshold. An estimator of this probability is given by (1), where $I(S(X_i) \geq \gamma)$ is the indicator function of the event and $N$ is the number of samples:

$$\hat{\ell} = \frac{1}{N} \sum_{i=1}^{N} I(S(X_i) \geq \gamma). \tag{1}$$

(The indicator function $I(S(X_i) \geq \gamma)$ is equal to 1 when $S(X_i) \geq \gamma$ is true and 0 otherwise.) If this probability is very low, this estimator requires a great number of samples. For example, in order to estimate $\ell = 10^{-6}$ with a relative error $\varepsilon = 0.01$, a sample size of $N \simeq 1/(\varepsilon^2 \ell) = 10^{10}$ is required. The basic idea of CE is to introduce an importance sampling pdf $g(\cdot)$ to reduce the required number of samples. The new estimator given by (2), where the samples are drawn from the density $g(\cdot)$, can then be used:

$$\hat{\ell} = \frac{1}{N} \sum_{i=1}^{N} I(S(X_i) \geq \gamma) \frac{f(X_i)}{g(X_i)}. \tag{2}$$

Obviously the best estimator would be based on the ideal importance sampling pdf given by (3); since $\ell$ is constant, using this "ideal" importance sampling pdf (3) would lead to an estimator (2) having zero variance:

$$g^*(x) = \frac{I(S(x) \geq \gamma)\, f(x)}{\ell}. \tag{3}$$

The main idea of the CE method for rare event simulation is to find, inside an a priori given set $\mathcal{G}$ of pdfs defined on $\mathcal{X}$, the element $g(\cdot)$ such that its distance to the "ideal" sampling distribution is minimal.
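The gain brought by importance sampling can be checked numerically. The sketch below is only an illustration under assumed settings that are not taken from the paper: the nominal pdf is an exponential with mean 1, the score is the sample itself, the threshold is 20 (so the exact probability is $e^{-20}$), and the importance sampling pdf is an exponential with mean 20. The function names are hypothetical.

```python
import math
import random

def crude_mc(gamma, n, rng):
    """Crude estimator: fraction of samples X ~ Exp(mean 1) with X >= gamma."""
    return sum(rng.expovariate(1.0) >= gamma for _ in range(n)) / n

def importance_sampling(gamma, n, rng, mean_g):
    """Importance sampling estimator: draw from g = Exp(mean mean_g),
    weight each hit by the likelihood ratio f(x)/g(x)."""
    total = 0.0
    for _ in range(n):
        x = rng.expovariate(1.0 / mean_g)  # expovariate takes the rate 1/mean
        if x >= gamma:
            f = math.exp(-x)                      # nominal pdf, mean 1
            g = math.exp(-x / mean_g) / mean_g    # sampling pdf, mean mean_g
            total += f / g
    return total / n

rng = random.Random(0)
gamma = 20.0
exact = math.exp(-gamma)
est_crude = crude_mc(gamma, 10_000, rng)
est_is = importance_sampling(gamma, 10_000, rng, mean_g=gamma)
print(exact, est_crude, est_is)
```

With the same budget of 10,000 samples, the crude estimator almost never observes the event, whereas the weighted estimator is expected to land within a few percent of the exact value.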
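A particularly convenient measure of distance between two pdfs $g(\cdot)$ and $h(\cdot)$ on $\mathcal{X}$ is the Kullback-Leibler distance, which is also termed the cross-entropy between $g(\cdot)$ and $h(\cdot)$. The Kullback-Leibler distance, which is not a "distance" in the formal sense since it is, for example, not symmetric, is defined as follows:

$$\mathcal{D}(g, h) = \int_{\mathcal{X}} g(x) \ln \frac{g(x)}{h(x)} \, dx. \tag{4}$$

The CE method reduces the problem of finding an appropriate importance sampling pdf to the optimization problem given by

$$\arg\min_{g \in \mathcal{G}} \mathcal{D}(g^*, g). \tag{5}$$

One can show through simple mathematical derivations that solving (5) is equivalent to solving (6), which does not explicitly depend on $\ell$ anymore:

$$\arg\max_{g \in \mathcal{G}} E_{X \sim f}\left[ I(S(X) \geq \gamma) \ln g(X) \right]. \tag{6}$$

Using a given importance sampling with a pdf $h(\cdot)$ ($h \in \mathcal{G}$), it is possible to get an estimation of the solution of (6) by solving its stochastic counterpart (7), where the set of samples $X_1, X_2, \ldots, X_N$ is drawn according to $h(\cdot)$ and $W(x) = f(x)/h(x)$ is the likelihood ratio between $f(\cdot)$ and $h(\cdot)$:

$$\arg\max_{g \in \mathcal{G}} \frac{1}{N} \sum_{i=1}^{N} I(S(X_i) \geq \gamma)\, W(X_i) \ln g(X_i). \tag{7}$$

Under specific assumptions on $\mathcal{X}$, $f(\cdot)$, and $\mathcal{G}$ it is possible to solve (7) analytically. For example, when $\mathcal{X} \subset \mathbb{R}^n$, $\mathcal{G}$ is the set of $n$-dimensional exponential distributions with independent components specified by (8), where $v$ is the vector of parameters and $v[j]$ the $j$th component of $v$, and $f$ also belongs to this set, the parameters of the solution of (7) are given by (9):

$$\mathrm{Exp}_n(x, v) = \prod_{j=1}^{n} \frac{1}{v[j]} \exp\left(-\frac{x[j]}{v[j]}\right), \tag{8}$$

$$v[j] = \frac{\sum_{i=1}^{N} I(S(X_i) \geq \gamma)\, W(X_i)\, X_i[j]}{\sum_{i=1}^{N} I(S(X_i) \geq \gamma)\, W(X_i)}. \tag{9}$$

Of course the solution of (7) is a better estimation of the solution of (6) when the number of samples such that $S(X) \geq \gamma$ is high. It is then necessary to adopt an iterative approach to compute the pdf $h$ so as to favor the occurrence of the desired event. If $\ell$ is very low, it is even necessary to introduce an iteration that increases the value of $\gamma$.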
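As a hedged illustration of the analytic update for independent exponential components, the sketch below assumes the mean parametrization of the exponential pdf, in which case the update is the likelihood-ratio-weighted mean of the samples that reach the level. The function name `ce_update` and the one-dimensional example (nominal mean 1, score equal to the sample, level 5, samples drawn from the nominal pdf so that all weights equal 1) are assumptions made for the example.

```python
import random

def ce_update(samples, scores, weights, gamma):
    """Weighted mean of the samples with score >= gamma, per component.

    samples: list of n-dimensional points, scores: S(X_i),
    weights: likelihood ratios W(X_i) = f(X_i) / h(X_i).
    """
    n_dim = len(samples[0])
    numerator = [0.0] * n_dim
    denominator = 0.0
    for x, s, w in zip(samples, scores, weights):
        if s >= gamma:
            denominator += w
            for j in range(n_dim):
                numerator[j] += w * x[j]
    if denominator == 0.0:
        raise ValueError("no sample reached the level gamma")
    return [value / denominator for value in numerator]

rng = random.Random(1)
samples = [[rng.expovariate(1.0)] for _ in range(100_000)]
scores = [x[0] for x in samples]      # assumed score: S(x) = x
weights = [1.0] * len(samples)        # samples drawn from f itself, so W = 1
v = ce_update(samples, scores, weights, gamma=5.0)
print(v)
```

By the memoryless property of the exponential distribution, the conditional mean above the level 5 is 6, so the updated parameter should be close to that value; the new pdf therefore places far more mass on the event.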
This approach evaluates the probability that the function $S(X)$ is greater than or equal to a given value by drawing samples with a pdf that evolves in such a way that the number of events increases, that is, such that the value of $S(X)$ is large. It is then attractive to use the approach to evaluate the maximal value of the function.
Let us consider the single objective optimization problem specified by (10), where $\mathcal{X} \subset \mathbb{R}^n$ and $S(x)$ is a real-valued objective function. The basic idea of CE for SO optimization is to use a stochastic approach based on importance sampling in order to get a good evaluation of $x^*$:

$$x^* = \arg\max_{x \in \mathcal{X}} S(x). \tag{10}$$

Starting from a prior pdf to draw samples, the method iteratively computes a series of pdfs that increase the probability of drawing a sample near the global optimal solution. With respect to the previous problem, the main difference is that the event used to iteratively compute the pdf is not given by the problem but has to be chosen. Generally, this is done by choosing a given number nbElite and considering that the relevant event is that the sample belongs to the nbElite best samples according to the objective function $S(x)$. At each step the new pdf is computed according to (11), where Elite is the set of the nbElite best samples; the solution can be computed analytically in some cases:

$$\arg\max_{g \in \mathcal{G}} \sum_{X_i \in \mathrm{Elite}} \ln g(X_i). \tag{11}$$

For example, when $\mathcal{X} \subset \mathbb{R}^n$ and $n$-dimensional exponential distributions with independent components specified by (8) are chosen, the parameters of the solution of (11) are given by a formula close to (9).

The rare event simulation is thus used to maximize an objective function. The use of this method in MO optimization likewise amounts to solving (11). The approach can also be used for minimization by modifying the criterion for membership in the Elite set.
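The iterative scheme for SO optimization can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: it assumes independent exponential components in the mean parametrization, so that the new parameter is simply the mean of the elite samples, and it uses a hypothetical two-dimensional objective with its maximum at (2, 0.5). The names `ce_maximize` and `objective` are introduced for the example.

```python
import random

def ce_maximize(objective, n_dim, v0, n_samples, rho, n_iter, rng):
    """Cross-entropy search with independent exponential sampling pdfs."""
    v = [v0] * n_dim
    nb_elite = max(1, int(rho * n_samples))
    best_x, best_s = None, float("-inf")
    for _ in range(n_iter):
        # Draw the samples component by component (rate = 1 / mean).
        samples = [[rng.expovariate(1.0 / v[j]) for j in range(n_dim)]
                   for _ in range(n_samples)]
        samples.sort(key=objective, reverse=True)
        elite = samples[:nb_elite]          # the nb_elite best samples
        if objective(elite[0]) > best_s:
            best_x, best_s = elite[0], objective(elite[0])
        # Assumed update: component-wise mean of the elite set.
        v = [sum(x[j] for x in elite) / nb_elite for j in range(n_dim)]
    return best_x, best_s, v

def objective(x):
    # Hypothetical test function, maximal (value 0) at x = (2, 0.5).
    return -((x[0] - 2.0) ** 2 + (x[1] - 0.5) ** 2)

best_x, best_s, v = ce_maximize(objective, n_dim=2, v0=1.0,
                                n_samples=1000, rho=0.01, n_iter=20,
                                rng=random.Random(0))
print(best_x, best_s, v)
```

For minimization, the same loop applies after sorting in ascending order, which is one way to change the membership of the Elite set.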

Multiobjective Optimization.
In MO problems a set of real-valued objective functions ($S_i(x)$, $i = 1, \ldots, m$) is considered. The aim is to find a set of best trade-offs between the various objectives, called the Pareto optimal solutions $\Omega$ [7], where "best" is understood according to the nondominated criterion [23]. The dominance operator ($\prec$) used in this criterion is defined by (12); that is, a solution $x$ is said to dominate another solution $y$ if $x$ is better than $y$ for at least one objective $S_j(\cdot)$ and is not worse for any other $S_i(\cdot)$. The Pareto optimal solutions are then the elements that are not dominated by any other element. With the maximization convention of (10),

$$x \prec y \iff \forall i:\ S_i(x) \geq S_i(y) \ \text{and}\ \exists j:\ S_j(x) > S_j(y), \qquad \Omega = \{x \in \mathcal{X} : \nexists\, y \in \mathcal{X},\ y \prec x\}. \tag{12}$$

The Pareto front is then defined as the subset of $\mathbb{R}^m$ that is the image of the Pareto optimal solutions under the objective functions (13). The aim of this part is to adapt the SO optimization procedure to find points of this Pareto front for a given MO problem:

$$PF = \{(S_1(x), \ldots, S_m(x)) : x \in \Omega\}. \tag{13}$$

In order to apply the CE approach to MO optimization, the same approach as for a single objective is used. At each step a random sample is generated according to a pdf, and a new pdf is computed as the solution of (11) with respect to a subset Elite.
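The dominance test and the extraction of the nondominated elements of a finite sample can be written in a few lines. The sketch below assumes that all objectives are maximized (for minimization the inequalities are reversed); the function names and the sample points are hypothetical.

```python
def dominates(a, b):
    """True if objective vector a dominates b (maximization assumed):
    a is not worse on any objective and strictly better on at least one."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def nondominated(points):
    """Elements of points that are not dominated by any other element."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

points = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1), (3, 1)]
front = nondominated(points)
print(front)
```

In this example the points (2, 2), (1, 1), and (3, 1) are each dominated by (3, 3), so only the first three points remain.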
The nondominated criterion is used to build the Elite set. From the initial sample $X^{(1)}$ it is possible to compute the Pareto optimal solutions $\Omega_1$ and then a new set $X^{(2)}$ by removing the elements of $\Omega_1$ from $X^{(1)}$ ($X^{(2)} = X^{(1)} - \Omega_1$). The concept of Pareto optimal solutions within $X^{(2)}$ leads to a new set $\Omega_2$. By iteration, it is possible to compute sets $\Omega_k$ such that each set $\Omega_{k-1}$ dominates the set $\Omega_k$. According to the number of elements to be selected for the Elite set, the first sets $\Omega_k$ are included. If the last set cannot be added in full (because this would exceed the required number of samples), the nondominance operator cannot be used any more. The selection is then based on the crowding operator $\prec_n$ [9].

Algorithm 1: A cross-entropy approach for the computation of the nondominated solutions.
Input: The number of performance functions $m$; the performance functions $S_1, \ldots, S_m : U \to \mathbb{R}$, where $U$ is the set of feasible solutions; the starting CE parameter $v_0$; and two parameters, $C$ and $\rho$.
Output: An element $v_{\mathrm{output}} \in \mathbb{R}_+^n$ and the Pareto front $X_{\mathrm{nbElite}}$.
Step 1. Set $t$ equal to 1 and the components $v_t[j]$ of the $n$-dimensional vector $v_t$ equal to $v_0$. Set nbElite equal to the largest integer less than or equal to $\rho \times C \times n$.
Step 2. Set $X_t$ equal to the empty set.
Step 3. Draw independently $C \times n$ elements according to the exponential pdf $\mathrm{Exp}_n(\cdot, v_t)$ and put them in $X_t$.
Step 5. Set $k$ to 1 and the elite set of samples $X_{\mathrm{nbElite}}$ to the empty set.
Step 6. Compute the set $X_{\mathrm{nbElite}}$ based on nondominance and the crowding operator.
Step 7. Calculate the crowding distance in $\Omega_k$ and order the elements according to the $\prec_n$ operator.
Step 8. Choose the first (nbElite $- |X_{\mathrm{nbElite}}|$) elements of $\Omega_k$ and add them to the elite set.
Step 9. If the stopping conditions are met, then return $v_{\mathrm{output}} = v_t$ and stop. Otherwise go to Step 10.
Step 10. Set $v_{t+1}[j]$ to the solution of (11) computed on the elite set, for $j = 1, 2, \ldots, n$, and $t \leftarrow t + 1$. Go to Step 2.
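The successive sets obtained by repeatedly removing the current nondominated elements can be computed with a simple peeling loop. The sketch below assumes maximization of all objectives and works on objective vectors only; `peel_fronts` and the sample data are hypothetical names and values used for illustration.

```python
def dominates(a, b):
    """a dominates b when all objectives are maximized."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def peel_fronts(scores):
    """Return the successive nondominated sets as lists of indices.

    The first list holds the nondominated elements of the whole sample,
    the second those of the remainder, and so on.
    """
    remaining = list(range(len(scores)))
    fronts = []
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(scores[j], scores[i]) for j in remaining)]
        fronts.append(front)
        kept = set(front)
        remaining = [i for i in remaining if i not in kept]
    return fronts

scores = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1), (3, 1)]
fronts = peel_fronts(scores)
print(fronts)
```

With an elite size of 4, for instance, the first set (three elements) would be taken in full and one element of the second set would then be chosen with the crowding operator.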

CE for Multiobjective Optimization Algorithm
The main algorithm is very simple; it is based on rare event simulation with an original event, as shown in Algorithm 1.
The selection of the Elite solutions is done in two steps. In the first step, nondominated sorting is applied to generate all possible fronts, and the best fronts are added in full to the Elite set $X_{\mathrm{nbElite}}$. In the second step, the remaining elements needed to reach the total size of $X_{\mathrm{nbElite}}$ are chosen from the next front based on the crowding operator. More details about these operators (crowding distance, nondominated sorting) can be found in [9].
The overall complexity of this method is that of Algorithm 1, which is $O(mN^2)$ for $m$ objectives and $N$ samples.
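The whole procedure can be sketched compactly. The code below is an illustrative reading of the algorithm under several assumptions that are not confirmed by the text: all objectives are maximized, the sampling pdf has independent exponential components in the mean parametrization (so the update is the mean of the elite set), and a fixed number of iterations serves as the stopping condition. The Schaffer-type objectives are negated to fit the maximization convention, and all function names are hypothetical.

```python
import random

def dominates(a, b):
    """a dominates b when all objectives are maximized."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def peel_fronts(scores):
    """Yield successive nondominated fronts as lists of sample indices."""
    remaining = list(range(len(scores)))
    while remaining:
        front = [i for i in remaining
                 if not any(dominates(scores[j], scores[i]) for j in remaining)]
        yield front
        kept = set(front)
        remaining = [i for i in remaining if i not in kept]

def crowding_distance(scores, front):
    """Crowding distance of each index in a front (larger = more isolated)."""
    dist = {i: 0.0 for i in front}
    for k in range(len(scores[front[0]])):
        order = sorted(front, key=lambda i: scores[i][k])
        low, high = scores[order[0]][k], scores[order[-1]][k]
        dist[order[0]] = dist[order[-1]] = float("inf")  # keep the extremes
        if high > low:
            for prev, cur, nxt in zip(order, order[1:], order[2:]):
                dist[cur] += (scores[nxt][k] - scores[prev][k]) / (high - low)
    return dist

def select_elite(scores, nb_elite):
    """Whole fronts first; the last front is truncated by crowding distance."""
    elite = []
    for front in peel_fronts(scores):
        room = nb_elite - len(elite)
        if len(front) > room:
            dist = crowding_distance(scores, front)
            front = sorted(front, key=lambda i: dist[i], reverse=True)[:room]
        elite.extend(front)
        if len(elite) >= nb_elite:
            break
    return elite

def moce(objectives, n_dim, v0, n_samples, rho, n_iter, rng):
    """Cross-entropy loop with a nondominated, crowding-based elite set."""
    v = [v0] * n_dim
    nb_elite = max(1, int(rho * n_samples))
    elite = []
    for _ in range(n_iter):
        xs = [[rng.expovariate(1.0 / v[j]) for j in range(n_dim)]
              for _ in range(n_samples)]
        scores = [tuple(f(x) for f in objectives) for x in xs]
        elite = [xs[i] for i in select_elite(scores, nb_elite)]
        # Assumed update: component-wise mean of the elite samples.
        v = [sum(x[j] for x in elite) / len(elite) for j in range(n_dim)]
    return elite, v

# Schaffer-type problem, negated so that both objectives are maximized.
objectives = [lambda x: -x[0] ** 2, lambda x: -(x[0] - 2.0) ** 2]
elite, v = moce(objectives, n_dim=1, v0=1.0, n_samples=200, rho=0.1,
                n_iter=10, rng=random.Random(0))
print(sorted(round(x[0], 3) for x in elite))
```

The returned elite samples are expected to spread over the interval [0, 2], which is the Pareto optimal set of this problem; the peeling used here is written for clarity rather than speed.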

Simulation Results
In this section, we first describe the case studies used to compare the CE approach for computing the Pareto front with the CE-based method proposed in [21] and with other metaheuristic methods such as NSGA-II, the Pareto Archived Evolution Strategy (PAES), and the Strength Pareto Evolutionary Algorithm (SPEA). The CE with RES approach was implemented in two steps. In the first step, the goal is to tune the CE parameters; these parameters are obtained by trial and error, and the best values found are $C \times n = 1000$ and $\rho = 0.01$. In the second step, we use $C \times n = 10000$ and $\rho = 0.01$, where the goal is to build meaningful curves of nondominated solutions. All runs are repeated 10 times with different starting points, and the reported outcomes correspond to the best solutions. For problems with constraints or bounds, the acceptance-rejection method is used.

Performance Metrics.
Since our approach is Pareto-based, it yields a set of nondominated solutions $\Omega$. Here, we introduce several metrics based on the obtained nondominated solutions to measure the search quality of the algorithm; they can be used for comparison between different algorithms [25].

Overall Nondominated Vector Generation (ONVG).
For an obtained nondominated solution set Ω, the metric ONVG is defined as |Ω| which is the number of distinct nondominated solutions.

C Metric (CM).
The C metric is used to compare two nondominated solution sets $\Omega$ and $\Omega'$ obtained by two algorithms; it maps the ordered pair $(\Omega, \Omega')$ to the interval $[0, 1]$:

$$C(\Omega, \Omega') = \frac{\left|\{x' \in \Omega' : \exists\, x \in \Omega,\ x \prec x'\}\right|}{|\Omega'|}.$$

If all the solutions in $\Omega'$ are dominated by those in $\Omega$, then $C(\Omega, \Omega') = 1$. Conversely, if all the solutions in $\Omega$ are dominated by those in $\Omega'$, then $C(\Omega, \Omega') = 0$. It should be noted that the sum of $C(\Omega, \Omega')$ and $C(\Omega', \Omega)$ is not always equal to 1, because there may be some solutions in $\Omega$ and $\Omega'$ that do not dominate each other.
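Both ONVG and the C metric are straightforward to compute from two sets of objective vectors. The sketch below assumes maximization of all objectives and takes the C metric as the fraction of the second set that is dominated by at least one element of the first; the function names and the two small fronts are hypothetical.

```python
def dominates(a, b):
    """a dominates b when all objectives are maximized."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def onvg(front):
    """Number of distinct nondominated solutions in the set."""
    return len(set(front))

def c_metric(front_a, front_b):
    """Fraction of front_b dominated by at least one element of front_a."""
    dominated = sum(1 for b in front_b
                    if any(dominates(a, b) for a in front_a))
    return dominated / len(front_b)

front_a = [(3, 3), (4, 1)]
front_b = [(2, 2), (1, 4), (4, 0)]
c_ab = c_metric(front_a, front_b)
c_ba = c_metric(front_b, front_a)
print(onvg(front_a), onvg(front_b), c_ab, c_ba)
```

Here two of the three elements of the second set are dominated by the first set while none of the first set is dominated by the second, so the two values do not sum to 1.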

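Maximum Spread (MS).
This metric measures how well the true Pareto front is covered by the obtained nondominated solutions in the set $\Omega$, through the hyperboxes formed by the extreme function values observed in the true Pareto front and in $\Omega$:

$$MS = \sqrt{\frac{1}{m} \sum_{i=1}^{m} \left[ \frac{\min(S_i^{\max}, F_i^{\max}) - \max(S_i^{\min}, F_i^{\min})}{F_i^{\max} - F_i^{\min}} \right]^2},$$

where $S_i^{\max}$ and $S_i^{\min}$ are the maximum and minimum values of the $i$th objective in $\Omega$, and $F_i^{\max}$ and $F_i^{\min}$ are the maximum and minimum values, respectively, of the $i$th objective in the true Pareto front.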
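A possible computation of the maximum spread is sketched below. It assumes the commonly used form of the metric, in which, for each objective, the overlap between the range covered by the obtained set and the range of the true front is divided by the range of the true front, and the root mean square of these ratios is taken. The function name and the sample fronts are hypothetical.

```python
import math

def maximum_spread(obtained, true_front):
    """Root mean square, over objectives, of the covered fraction of the
    true front's range (assumed definition of the MS metric)."""
    m = len(true_front[0])
    total = 0.0
    for i in range(m):
        f_min = min(p[i] for p in true_front)
        f_max = max(p[i] for p in true_front)
        s_min = min(p[i] for p in obtained)
        s_max = max(p[i] for p in obtained)
        overlap = min(s_max, f_max) - max(s_min, f_min)
        total += (overlap / (f_max - f_min)) ** 2
    return math.sqrt(total / m)

true_front = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
half_front = [(0.0, 1.0), (0.5, 0.5)]
ms_full = maximum_spread(true_front, true_front)
ms_half = maximum_spread(half_front, true_front)
print(ms_full, ms_half)
```

A set that reaches both extremes of the true front gives a value of 1, while a set covering half of each objective range gives 0.5.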

Average Quality (AQ).
This metric was proposed to measure the quality of the solution set; it was originally expressed in the form of a weighted Tchebycheff function. However, that function may hide certain aspects of the quality of the solution set, because poor performance with respect to proximity could be compensated by good performance in the distribution of solutions. Therefore, diversity indicators of spread and space are added to the formulation to overcome this limitation.

Running Time (𝑡).
To reflect the efficiency of a multiobjective optimization algorithm, running time is also used as a metric.

Computation Results.
To evaluate the quality of the results, we use the convergence metric and the diversity metric described in [9]. Table 2 shows the convergence metric Υ obtained using the algorithms CE with RES, CE with FCM, NSGA-II, SPEA, and PAES. The data for CE with FCM are taken from [21] and the data for the other metaheuristic algorithms are taken from [9]. CE with RES converges better on the problems SCH, ZDT1, ZDT4, and ZDT6.
Table 3 shows the diversity metric Δ for the five algorithms. The diversity of the proposed method is small compared with that of the other methods because it uses an exponential pdf. The best diversity is obtained by the CE based on FCM, where it is ensured by clustering. In future work we plan to use other distribution functions, such as the normal distribution.
Table 1 shows a comparison of our method with NSGA-II. To properly evaluate both algorithms and highlight the contribution of our approach, the metrics of the previous subsection are calculated (we chose NSGA-II for the comparison given its established success). We conclude that both approaches have almost identical performance. However, our method is more efficient in computation time (measured on an Intel Core i3 M370 CPU @ 2.40 GHz). The comparison of the proposed method with the true Pareto front is shown in Figures 1-10. Our method gives solutions that are close to the true Pareto fronts.

Conclusion
In this paper, a simple original algorithm for solving multiobjective optimization problems, based on the cross-entropy method (particularly rare event simulation), is developed and tested on several cases. The method is fast and very simple to implement in economic and engineering fields.

Table 1: Comparison based on the other metrics.

Table 2: Mean (first row) and variance (second row) of the convergence metric Υ.