Multi-Objective Service Composition Using Enhanced Multi-Objective Differential Evolution Algorithm

In recent years, the optimization of multi-objective service composition in distributed systems has become an important issue. Existing work uses a smaller set of Pareto-optimal solutions to represent the Pareto front (PF). However, it does not support complex mapping of the Pareto-optimal solutions to the quality of service (QoS) objective space, which limits its ability to provide a representative set of solutions. We propose an enhanced multi-objective differential evolution algorithm to seek a representative set of solutions with good proximity and distributivity. Specifically, we propose a dual strategy that adjusts the usage of different creation operators to maintain the evolutionary pressure toward the true PF. Then, we propose a reference vector neighbor search to perform a fine-grained search. The proposed approach has been tested on a real-world dataset, where it locates a representative set of solutions with good proximity and distributivity.


Introduction
Service composition became popular after the introduction of service-oriented architecture (SOA), as it allows complex and distributed software systems to be composed of web services through open standards. QoS attributes [1] (e.g., reliability or throughput) provide the quality criteria for selecting and composing web services, thus establishing QoS-aware service composition (QOSC). Since QoS requirements usually involve multiple conflicting objectives, QOSC is a multi-objective optimization problem (MOP) whose goal is to find a set of Pareto-optimal solutions.
Existing work [2][3][4] has explored multi-objective evolutionary algorithms (MOEAs), which allow a set of feasible solutions to approximate the Pareto-optimal set by analyzing a set of non-dominated solutions after one run and maintaining good solution diversity during the search [5,6]. However, this approach has two limitations. First, it includes elitist preservation in the selection strategy. Therefore, there may be a lack of evolutionary pressure to explore optimal solutions, especially as the number of service requests increases. Second, it does not explicitly consider fine-grained search. This can lead to overlapped or unstructured searches, resulting in an uneven distribution of Pareto-optimal solutions.
In this work, we tackle these issues by proposing an enhanced multi-objective differential evolution (EnMODE) algorithm that searches for a representative set of feasible solutions approximating the Pareto-optimal solutions in terms of proximity and distributivity. Our main contributions are summarized as follows:
(1) Sufficient evolutionary pressure: We propose a dual strategy to adjust the usage of "rand/2/bin" and "current-to-best/1/bin" as the iterations proceed. "rand/2/bin" expands the evolution ability by performing new exploration around two different solutions. At the same time, "current-to-best/1/bin" improves the evolution robustness by performing guided exploration around the current best. The dynamic execution of "rand/2/bin" and "current-to-best/1/bin" provides sufficient evolutionary pressure as the population evolves.
(2) Fine-grained search: The reference vectors are used as neighborhood axes to divide the MOP, accumulating the non-dominated compositions around the nearest reference vector and the dominated compositions around the essential non-dominated compositions. The reference vector neighbor search simplifies the MOP by systematically breaking it down into similar sub-problems and directing the search to the Pareto-optimal set.
We have evaluated the enhanced multi-objective differential evolution algorithm on a real-world dataset, which shows that the proposed method has better proximity and distributivity than the baselines. The rest of the paper is organized as follows. Section 2 reviews related work. Section 3 introduces the multi-objective service composition model. We outline the proposed approach for the MOP in Section 4. Section 5 evaluates the proposed method in terms of proximity degree and uniformity. Finally, Section 6 concludes the paper and outlines future work.

Related Work
This section outlines the related work on multi-objective service composition (MOSC). Section 2.1 focuses on Pareto-based multi-objective service composition (PMOSC), which finds a set of Pareto-optimal solutions based on the hypothesis that users cannot accurately predefine weights or priorities for multidimensional objectives. Section 2.2 presents utility-based multi-objective service composition (UMOSC), which computes the utility value of a solution based on the hypothesis that the weights or priorities can be accurately specified. The related work is summarized in Section 2.3.

Pareto-Based Multi-Objective Service Composition.
The intuitive method is to explore all Pareto-optimal solutions exhaustively. Since the Pareto-optimal set may include all possible solutions, whose number grows exponentially with the size of the service requests, the optimization cost of such a method would be prohibitive.
To solve this problem, Guo et al. [7] proposed a computationally efficient dropout neural network as a scalable alternative to the Gaussian process model for assisting the solution of expensive high-dimensional multi-objective and many-objective optimization problems. Li et al. [8] modeled the energy-efficient job-shop scheduling problem as a many-objective model with five objectives, i.e., makespan, total tardiness, total idle time, total worker cost, and total energy. They adopted a novel fitness evaluation mechanism based on fuzzy correlation entropy to solve this many-objective optimization problem. Cruz et al. [9] proposed an evolutionary algorithm-based search strategy for choosing an efficient design of an ensemble of Convolutional Neural Networks (CNNs), which includes not only the network architectures but also the voting policy. During the search, both the combination of CNNs with different architectures and the most suitable policy used by the ensemble for generating the unified response are taken into consideration.
Zhou et al. [3] proposed a multi-population differential artificial bee colony optimizer for PMOSC. The optimization problem is divided into several sub-problems to reduce the search scale. Different search behaviors are considered in the artificial bee colony algorithm to steer the solution set toward the Pareto-optimal set. The work of [10] integrated hyper-heuristics with genetic programming to solve multi-objective dynamic service composition optimization. A set of Pareto-optimal solutions is provided to satisfy varied preferences. Wang et al. [11] proposed an improved whale optimization algorithm that divides the population into several sub-populations. A Pareto strategy is presented to improve the optimization. Yang et al. [12] adopted a multi-objective immune algorithm to implement PMOSC. The global ranking is incorporated into the evolution of multiple populations to obtain better generations.
Chen et al. [13] proposed an objective space partition-based adaptive multi-objective evolutionary algorithm to maintain diversity while strengthening convergence. The proposed approach defines the forward population distance as a metric to dynamically identify efficient subspaces and adaptively allocate computational resources to each subspace. In [14], an enhanced decomposition-based evolutionary many-objective optimization algorithm is proposed to solve irregular many-objective optimization problems. Local search is performed on external archives to alleviate the adverse effects of inappropriate weight vectors and strengthen the performance. Dai et al. [15] proposed a problem-specific multi-objective evolutionary algorithm in which a decomposition scheme decomposes PMOSC into multiple scalar sub-problems. The evolutionary operators search for Pareto solutions in terms of maximizing the service quality and minimizing the overhead. Seada and Deb [16] developed a unified evolutionary optimization algorithm, U-NSGA-III, to solve mono-, multi-, and many-objective optimization problems. The ability of U-NSGA-III to solve different types of problems equally efficiently and sometimes better, with the added flexibility brought in through population size control, remains a hallmark achievement. Dhiman et al. [17] proposed a novel hybrid many-objective evolutionary algorithm named Reference Vector Guided Evolutionary Algorithm (H-RVEA). It decomposes the optimization problem into several sub-problems by reference vectors, and uses an adaptation strategy to adjust the reference vector distribution.
Lin et al. [18] proposed an adaptive immune-inspired multi-objective algorithm. This method embeds three differential evolution (DE) strategies with distinct features into multi-objective immune algorithms. At each generation, one of them is adaptively selected based on the current search stage. This adaptive selection lets the three DE strategies cooperate effectively, significantly improving search capability and population diversity. Kumar et al. [19] proposed a new hybrid optimization method based on differential evolution and the sine cosine algorithm. This method adapted multi-objective versions of evolutionary optimization-based methods to automatically mine a reduced set of high-quality numerical association rules. Altay and Alatas [20] proposed an enhanced version of the multi-
We can conclude that the multi-operator variants [18][19][20] enhance the diversity of the candidate solutions. The difference is that their execution probability is not a quantitative benchmark that accurately reflects the current search stage, whereas we calculate the probability as the search proceeds. As a result, the multi-operator variants have difficulty tackling MOPs with different characteristics, which our proposed method addresses.

Utility-Based Multi-Objective Service Composition.
Different approaches have been developed to find the composition with the best utility, such as graph search [21,22] and evolutionary algorithms [23,24]. Rodríguez-Mier et al. [25] used a Service Match Graph to represent all matches between the relevant services. On this basis, they proposed a hybrid local/global search to find the optimal solution. Siebert et al. [26] transformed the service composition problem into the subgraph isomorphism problem. A message-efficient localized algorithm is proposed to compose the component services according to the information from the collaboration candidate. There are also existing works on evolutionary algorithms for finding the optimal solution. For example, Hossain et al. [27] extended particle swarm optimization algorithms to improve global and local optimization. The particles search the service space with guidance from the individual extreme value and the population extreme value. Martín et al. [28] proposed an ant colony optimization algorithm, in which a set of ants find the shortest path according to the pheromone mechanism. Some works used machine learning technologies to find the optimal solution. Wang et al. [29] integrated reinforcement learning with multi-agent techniques for finding the optimal solution. Game theory and a fictitious play process are combined to help improve performance. Peng et al. [30] and Wang et al. [31] used a restricted Boltzmann machine to learn the probability information of the global optimization contribution of each concrete service. This information helps guide the search for solutions. Palade and Clarke [32] adopted collaborative agent communities to approximate the optimal solution.

Summary.
The existing work on UMOSC can find an optimal or near-optimal solution effectively by maximizing or minimizing the utility value, which is computed from specified weights. However, it is not easy to determine the weights in practice. One reason is that information on user preferences over multiple attributes may be lost. Even when the user preference is known, it is hard to express it as accurate quantitative values.
The work on PMOSC finds a set of Pareto-optimal solutions under the assumption of unknown weights, but its performance, such as proximity and distributivity, needs to be improved. In this paper, we focus on PMOSC and provide a hybrid approach to search for a smaller set of solutions with good proximity and distributivity.

Multi-Objective Service Composition Model
Definition 2. (QoS vector for a composition) A composition is represented as $cs = (ws_1, ws_2, \ldots, ws_n)$, where $ws_i$ ($1 \le i \le n$) is the concrete service that instantiates the $i$-th abstract service. The QoS vector for $cs$ is defined as $Q(cs) = (f_1(cs), f_2(cs), \ldots, f_M(cs))$, where $f_m(cs)$ is the aggregation value of the $m$-th QoS attribute over all concrete services in $cs$.
As shown in Table 1, the aggregation value is computed based on the aggregation function. Other QoS attributes share similar aggregation functions, e.g., the cost computation in the case of sequential execution has a similar summation aggregation function.

Pareto-Optimality
Definition 3. (Pareto-dominance) Given two compositions $cs_a$ and $cs_b$, their QoS vectors are denoted by $Q(cs_a)$ and $Q(cs_b)$. $cs_b$ is said to Pareto-dominate $cs_a$ if (i) for every attribute, $cs_b$ has a QoS value better than or equal to that of $cs_a$, and (ii) for at least one attribute, $cs_b$ has a strictly better QoS value than $cs_a$.
For brevity, the relation that $cs_b$ dominates $cs_a$ is denoted by $cs_a \prec cs_b$. Through the notion of Pareto-dominance, we can also determine that $cs_b$ and $cs_a$ are mutually non-dominated if neither $cs_a \prec cs_b$ nor $cs_b \prec cs_a$.
Definition 4. (Pareto-optimality) Given a set of compositions $CS$, a composition $cs \in CS$ is a Pareto-optimal solution if it is feasible and not dominated by any other feasible composition $cs'$, i.e., $\nexists\, cs' \in CS: cs' \succ cs$.
Consider a set of feasible compositions with the relations $cs_4 \succ cs_1$, $cs_4 \succ cs_2$, $cs_4 \succ cs_3$, $cs_6 \succ cs_5$, $cs_6 \succ cs_7$, and $cs_6 \succ cs_8$. Since no other compositions dominate $cs_4$ and $cs_6$, they are Pareto-optimal.
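Definitions 3 and 4 can be illustrated with a minimal sketch. It assumes that every QoS attribute has been oriented so that smaller values are better (attributes to be maximized, such as throughput, would be negated or inverted first); the function names `dominates` and `non_dominated` are illustrative and do not come from the paper.

```python
from typing import List, Sequence


def dominates(q_b: Sequence[float], q_a: Sequence[float]) -> bool:
    """Return True if q_b Pareto-dominates q_a (assumption: all objectives minimized)."""
    no_worse = all(b <= a for b, a in zip(q_b, q_a))
    strictly_better = any(b < a for b, a in zip(q_b, q_a))
    return no_worse and strictly_better


def non_dominated(vectors: List[Sequence[float]]) -> List[int]:
    """Return the indices of the vectors that no other vector in the list dominates."""
    return [
        i
        for i, q in enumerate(vectors)
        if not any(dominates(other, q) for j, other in enumerate(vectors) if j != i)
    ]


# Hypothetical QoS vectors with two minimized attributes.
example = [(1.0, 5.0), (2.0, 2.0), (3.0, 3.0), (5.0, 1.0)]
pareto_indices = non_dominated(example)
```

In this example the vector (3.0, 3.0) is dominated by (2.0, 2.0), while the other three vectors are mutually non-dominated.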

Problem Statement.
The MOP of service composition can be defined as optimizing the QoS vector $Q(cs) = (f_1(cs), f_2(cs), \ldots, f_M(cs))$ subject to $cs \in CS$, where $cs$ represents a composition $(ws_1, ws_2, \ldots, ws_n)$ and $CS$ is the set of composite services. Considering multiple QoS attributes, service composition optimization is regarded as an MOP. Due to conflicting objectives and unavailable preferences, it is generally not possible to find a single solution with the best values for all objectives. An intuitive method to address this problem is to explore all Pareto-optimal solutions. However, the size of the solution space grows exponentially as service requests increase, so finding all Pareto-optimal solutions is prohibitively expensive. Therefore, it is more desirable to approximate the Pareto-optimal set to allow runtime multi-objective service composition. This work aims to solve multi-objective service composition by seeking a representative set of solutions with good proximity and distributivity in the QoS objective space.

The EnMODE for Multi-Objective Service Composition
4.1. Initialization.
The initialization of the proposed approach covers two aspects: the population and the reference vectors.
For the population, an individual corresponds to a composite service composed of several concrete services. Each concrete service is randomly chosen from the service candidates. A unique identifier $id(ws)$ is used to identify the concrete service $ws$. An individual $cs$ is then represented as $cs = (id_1(ws), id_2(ws), \ldots, id_D(ws))$, where $D$ is the number of abstract services. The QoS vector of $cs$ is represented as $Q(cs) = (f_1(cs), f_2(cs), \ldots, f_M(cs))$, where $M$ is the number of objectives.
For the reference vectors, the key steps are as follows. First, the reference points are generated by sampling points on a hyperplane. Then the reference points are mapped onto the PF to generate the reference vectors. A reference vector starts from the origin of the objective space and ends at a reference point. Let $H$ be the parameter that controls the division of each objective axis; a reference vector $\lambda_i = (\lambda_{i1}, \lambda_{i2}, \ldots, \lambda_{iM})$ is generated by selecting each $\lambda_{im}$ from $\{0/H, 1/H, \ldots, H/H\}$ such that $\sum_{m=1}^{M} \lambda_{im} = 1$. Usually, the number of reference vectors equals the population size $N$. The initial reference vectors are stored in $\lambda^0$, i.e., $\lambda^0 = (\lambda^0_1, \ldots, \lambda^0_N)$.
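The lattice condition on the reference vectors can be made concrete with a short sketch. It enumerates every vector whose components lie in $\{0/H, \ldots, H/H\}$ and sum to 1, using a stars-and-bars enumeration; the function name `reference_vectors` is hypothetical, and how the paper reduces or maps this lattice to exactly $N$ vectors is not shown here.

```python
from itertools import combinations
from typing import List, Tuple


def reference_vectors(num_objectives: int, divisions: int) -> List[Tuple[float, ...]]:
    """Enumerate all vectors with components in {0/H, ..., H/H} that sum to 1."""
    m, h = num_objectives, divisions
    vectors = []
    # Stars and bars: place m-1 bars among h+m-1 slots; gaps between bars are the counts.
    for bars in combinations(range(h + m - 1), m - 1):
        counts = []
        previous = -1
        for bar in bars:
            counts.append(bar - previous - 1)
            previous = bar
        counts.append(h + m - 2 - previous)
        vectors.append(tuple(c / h for c in counts))
    return vectors


# Three objectives with H = 4 divisions per axis.
lattice = reference_vectors(3, 4)
```

The number of lattice points is the binomial coefficient $\binom{H+M-1}{M-1}$, so the choice of $H$ determines how many candidate reference vectors are available for a given $M$.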

Offspring Creation with Dual Strategy.
According to [33], the state of the search space varies with the evolutionary process. At the early stage, the available information about the search (e.g., the individuals with better QoS) is limited. This information accumulates as the iterations increase. In the later stage, many individuals might be close to the true PF, so it is less likely that better individuals will be found even if the search runs longer. Such a change would leave the population without evolutionary pressure. To solve this problem, a dual strategy is needed to tune the use of creation operators and provide sufficient evolutionary pressure throughout the evolutionary stages. First, two creation operators, namely "rand/2/bin" and "current-to-best/1/bin", are selected to manipulate individuals. "rand/2/bin" provides stronger disturbance and creates offspring based on different individuals. These characteristics enable it to expand the evolution ability. "current-to-best/1/bin" creates the offspring by searching around the current best, which improves the evolutionary robustness. These two creation operators regard the abstract service as the operational dimension to generate a new individual. The new individual $cs' = (id_1(ws'), \ldots, id_D(ws'))$ is produced as follows.
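(2) current-to-best/1/bin
where $r_1$, r

Table 1 entries (column headers not recoverable):
(attribute name not recoverable): $\sum_{i=1}^{n} q(ws_i)$, $\max_{i=1}^{n} q(ws_i)$, $k \cdot q(ws_i)$, $\max_{i=1}^{n} q(ws_i)$
Throughput: $\min_{i=1}^{n} q(ws_i)$, $\sum_{i=1}^{n} q(ws_i)$, $\min_{i=1}^{n} q(ws_i)$, $\max_{i=1}^{n} q(ws_i)$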

Computational Intelligence and Neuroscience
Due to the finiteness of the service candidates, the identifier of the concrete service might exceed its limit. Therefore, the identifier needs to be reset, which is conducted according to the following formula.
where $id_i^d(ws)_{\max}$ and $id_i^d(ws)_{\min}$ represent the upper and lower bounds of the identifier, respectively.
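As a rough illustration of the two creation operators and the identifier reset, the sketch below uses the textbook forms of DE/rand/2 and DE/current-to-best/1 with binomial crossover. Several points are assumptions rather than the paper's definitions: the pairing of $F_1, F_2$ with "rand/2/bin" and $F_3, F_4$ with "current-to-best/1/bin", the crossover rate `cr`, rounding to the nearest integer identifier, and resampling out-of-range identifiers uniformly within the bounds. All function names are hypothetical.

```python
import random
from typing import List

Individual = List[int]  # one concrete-service identifier per abstract service


def mutate_rand_2(pop: List[Individual], i: int, f1: float, f2: float,
                  rng: random.Random) -> List[float]:
    """Textbook rand/2 mutation: random base vector plus two scaled differences."""
    r1, r2, r3, r4, r5 = rng.sample([k for k in range(len(pop)) if k != i], 5)
    return [pop[r1][d] + f1 * (pop[r2][d] - pop[r3][d]) + f2 * (pop[r4][d] - pop[r5][d])
            for d in range(len(pop[i]))]


def mutate_current_to_best_1(pop: List[Individual], i: int, best: Individual,
                             f3: float, f4: float, rng: random.Random) -> List[float]:
    """Textbook current-to-best/1 mutation: move toward the best, plus one difference."""
    r1, r2 = rng.sample([k for k in range(len(pop)) if k != i], 2)
    return [pop[i][d] + f3 * (best[d] - pop[i][d]) + f4 * (pop[r1][d] - pop[r2][d])
            for d in range(len(pop[i]))]


def binomial_crossover(target: Individual, mutant: List[float], cr: float,
                       rng: random.Random) -> List[float]:
    """Take each gene from the mutant with probability cr; one gene always comes from it."""
    forced = rng.randrange(len(target))
    return [mutant[d] if (rng.random() < cr or d == forced) else target[d]
            for d in range(len(target))]


def repair(trial: List[float], lower: int, upper: int, rng: random.Random) -> Individual:
    """Round to an integer identifier; resample uniformly if out of bounds (assumed rule)."""
    fixed = []
    for value in trial:
        ident = int(round(value))
        if ident < lower or ident > upper:
            ident = rng.randint(lower, upper)
        fixed.append(ident)
    return fixed


rng = random.Random(0)
population = [[rng.randint(0, 9) for _ in range(4)] for _ in range(8)]
mutant = mutate_rand_2(population, 1, 0.8, 0.8, rng)
offspring = repair(binomial_crossover(population[1], mutant, 0.7, rng), 0, 9, rng)
```

"rand/2" needs at least six individuals in the population because the five random indices must differ from each other and from the target index.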
Second, these two operators are adjusted by a dual strategy, which controls the execution probability of "rand/2/bin" and "current-to-best/1/bin" during the evolutionary stages. There is no clear division between the various evolutionary stages throughout the evolution process. Therefore, we use the ratio of the current iteration to the total number of iterations to distinguish between the different evolutionary stages. Based on this ratio, we represent their probabilities as $p_{rand}$ and $p_{best}$, and compute them as follows.
where $i$ is the number of the current iteration and $iter_{max}$ is the total number of iterations. To exhibit the dynamic changes of $p_{rand}$ and $p_{best}$ more clearly, their tendencies are illustrated in Figure 1. At the early stage, "rand/2/bin" is run with a high probability to explore more high-quality individuals. As the evolution continues, the execution probability of "rand/2/bin" dynamically decreases, whereas the execution probability of "current-to-best/1/bin" dynamically increases. Combining the exploration of "rand/2/bin" with the exploitation of "current-to-best/1/bin" speeds up convergence while preventing premature convergence. In the later stage, "current-to-best/1/bin" is preferred to exploit the local information.
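The behaviour described above can be sketched under a simple assumption. The exact formulas for $p_{rand}$ and $p_{best}$ are not recoverable from the text, so the sketch below assumes a linear schedule in the ratio $i / iter_{max}$ that reproduces only the described tendencies (decreasing $p_{rand}$, increasing $p_{best}$); the paper's actual curves in Figure 1 may differ. The function names are hypothetical.

```python
import random
from typing import Tuple


def operator_probabilities(iteration: int, iter_max: int) -> Tuple[float, float]:
    """Return (p_rand, p_best) under an assumed linear schedule in iteration / iter_max."""
    ratio = iteration / iter_max
    p_best = ratio        # grows as the search proceeds
    p_rand = 1.0 - ratio  # shrinks as the search proceeds
    return p_rand, p_best


def choose_operator(iteration: int, iter_max: int, rng: random.Random) -> str:
    """Pick the creation operator for one offspring according to the schedule."""
    p_rand, _ = operator_probabilities(iteration, iter_max)
    return "rand/2/bin" if rng.random() < p_rand else "current-to-best/1/bin"


# With iter_max = 200, the two operators are equally likely at iteration 100.
halfway = operator_probabilities(100, 200)
```

Any monotone schedule with the same end points would show the same qualitative behaviour; the linear form is chosen here only because it is the simplest one.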

Reference Vector Neighbor Search.
This search first performs a two-stage clustering to implement a fine-grained search under the guidance of the reference vectors and the elites among the non-dominated individuals, and then selects the solution set toward the true PF.

Two-Stage Clustering.
In the two-stage clustering, the first-stage clustering uses the reference vectors as anchors to gather the non-dominated individuals. The second-stage clustering groups the dominated individuals under the direction of the elites in the first-stage clusters.
For the first-stage clustering, we first search for the non-dominated individuals in the union of the offspring and the current population. Then, the non-dominated individuals are clustered by computing their closeness degree to the reference vectors. The closeness degree can be measured by the perpendicular distance from the individual $cs$ to a reference vector $\lambda$. The perpendicular distance is computed as follows.
where $d_1$ represents the distance along $\lambda$, which is computed by $d_1 = Q(cs)\lambda^T$, and $\| \cdot \|_2$ represents the $l_2$ norm of the vector. Each non-dominated individual is attached to the nearest reference vector by comparing the distance values. Reference vectors with attached non-dominated individuals are labeled active, while reference vectors without attached individuals are labeled inactive. Since more than one non-dominated individual may attach to the same reference vector, we need to sort them within a cluster. We evaluate an individual by its proximity and distributivity. The proximity is reflected by the distance along the closest $\lambda$. The smaller the value, the closer the individual is to the true PF. The distributivity is measured by the perpendicular distance between the non-dominated individual $cs$ and $\lambda$. This distance represents the distribution error between $cs$ and $\lambda$. The smaller the value, the closer $cs$ is to $\lambda$. These two criteria are integrated into the following formula.

P(cs)
where $\alpha$ is a parameter that controls the balance between proximity and distributivity. We can sort the non-dominated individuals in ascending order of the compromise value $P(cs)$.
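The first-stage clustering can be sketched as follows. Two assumptions are made because the corresponding formulas are not recoverable from the text: the perpendicular distance is taken as $d_2 = \|Q(cs) - d_1 \lambda / \|\lambda\|_2\|_2$ with $\lambda$ normalized to unit length before the projection, and the compromise value is taken in the penalty-based boundary intersection form $P(cs) = d_1 + \alpha d_2$. The QoS vectors are assumed to be normalized and minimized. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def distances(q: Sequence[float], lam: Sequence[float]) -> Tuple[float, float]:
    """Return (d1, d2): distance along lam and perpendicular distance to lam."""
    norm = math.sqrt(sum(l * l for l in lam))
    unit = [l / norm for l in lam]
    d1 = sum(qi * ui for qi, ui in zip(q, unit))
    d2 = math.sqrt(sum((qi - d1 * ui) ** 2 for qi, ui in zip(q, unit)))
    return d1, d2


def compromise(q: Sequence[float], lam: Sequence[float], alpha: float = 5.0) -> float:
    """Assumed compromise value P = d1 + alpha * d2 (smaller is better)."""
    d1, d2 = distances(q, lam)
    return d1 + alpha * d2


def nearest_reference(q: Sequence[float], refs: List[Sequence[float]]) -> int:
    """Index of the reference vector with the smallest perpendicular distance to q."""
    return min(range(len(refs)), key=lambda k: distances(q, refs[k])[1])


refs = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
cluster_of = nearest_reference((0.9, 0.1), refs)
```

Sorting the members of each cluster by `compromise` in ascending order then places the individual with the best trade-off at the head of the cluster.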
For the second-stage clustering, we also need guidance for the search of the dominated individuals toward proximity and distributivity. Some characteristics of the second-stage clustering are summarized below.
(iii) The dominated individuals in the same cluster are sorted in ascending order of their closeness to the center.
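A minimal sketch of the second-stage clustering follows, assuming that each cluster center is given as a QoS vector and that closeness is the Euclidean distance in objective space; the function name `second_stage` and the sample vectors are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def second_stage(dominated: List[Sequence[float]],
                 centers: List[Sequence[float]]) -> Dict[int, List[Sequence[float]]]:
    """Assign each dominated individual to its closest center, then sort each cluster."""
    clusters: Dict[int, List[Sequence[float]]] = {c: [] for c in range(len(centers))}
    for q in dominated:
        closest = min(range(len(centers)), key=lambda k: math.dist(q, centers[k]))
        clusters[closest].append(q)
    for c, members in clusters.items():
        # Ascending distance to the center: the head is the closest dominated individual.
        members.sort(key=lambda q: math.dist(q, centers[c]))
    return clusters


centers = [(0.0, 0.0), (10.0, 10.0)]
dominated = [(1.0, 1.0), (8.0, 7.0), (2.0, 2.0), (9.0, 9.0)]
clusters = second_stage(dominated, centers)
```

`math.dist` requires Python 3.8 or later.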

Population Selection.
We propose a cyclic selection to distribute a new population evenly close to the true PF. The idea is that, by selecting the feasible individuals with the best values in different clusters, we have a higher chance of obtaining a set of solutions with good proximity and distributivity. More specifically:
(i) For each first-stage cluster, the feasible head of the sorted list of non-dominated individuals is selected.
(ii) The feasible head of each sorted list of dominated individuals is selected in order.
(iii) If the number of selected individuals is less than the population size, other feasible non-dominated individuals are selected in order.
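One possible reading of the three selection steps is sketched below. The round-robin order in step (iii), the `feasible` predicate, and the function name `cyclic_selection` are assumptions made for illustration; the text does not specify how "in order" is resolved across clusters.

```python
from typing import Callable, Hashable, List, Optional


def cyclic_selection(nd_clusters: List[List[Hashable]],
                     dom_clusters: List[List[Hashable]],
                     size: int,
                     feasible: Callable[[Hashable], bool] = lambda _: True) -> List[Hashable]:
    """Select up to `size` individuals from sorted clusters, heads first."""
    selected: List[Hashable] = []

    def first_unused(items: List[Hashable]) -> Optional[Hashable]:
        return next((x for x in items if feasible(x) and x not in selected), None)

    # Steps (i) and (ii): heads of non-dominated clusters, then of dominated clusters.
    for cluster in nd_clusters + dom_clusters:
        head = first_unused(cluster)
        if head is not None and len(selected) < size:
            selected.append(head)

    # Step (iii): cycle through the non-dominated clusters until the population is full.
    progress = True
    while len(selected) < size and progress:
        progress = False
        for cluster in nd_clusters:
            candidate = first_unused(cluster)
            if candidate is not None and len(selected) < size:
                selected.append(candidate)
                progress = True
    return selected


# Hypothetical individuals identified by labels; each inner list is already sorted.
chosen = cyclic_selection([["a1", "a2"], ["b1"]], [["c1", "c2"]], size=4)
```

If fewer than `size` feasible individuals are available in the listed clusters, the sketch simply returns what it has found.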

Experiment Design
5.1.1. Dataset.
Given a workflow with a set of tasks, there are concrete services with similar functions but different QoS values for each task. Therefore, there are a large number of composition instances. For each test case, the concrete services are randomly assigned using the QWS dataset (https://qwsdata.github.io/), which records the QoS measurements of real-world web services. We focus on the response time, availability, throughput, successability, and reliability attributes. All experimental results are collected on a 3.4 GHz PC with 8 GB RAM.

Comparative Approaches
(i) MODE: A basic multi-objective differential evolution algorithm, used to verify the impact of the dual strategy and fine-grained search on the performance of our proposed algorithm.
(ii) MOGP: A multi-objective genetic programming algorithm proposed in [34]. It is a powerful evolutionary metaheuristic for finding the best tradeoffs between more than two objectives.
(iii) MS-DABC: An improved artificial bee colony algorithm proposed in [3]. It achieves competitive performance by combining a synergistic mechanism, a diversity maintenance strategy, and a well-maintained external archive with the artificial bee colony algorithm.
(iv) NSGA-III-DDR: An improved evolutionary multi-objective optimization algorithm using a reference-point based non-dominated sorting approach, proposed in [35]. It introduces a distance dominance relationship into NSGA-III. The algorithm not only considers diverse solutions but also retains good convergence.

Parameter Setting.
The shared parameters for all algorithms are as follows: the number of epochs is set to 200, and the population size $N$ is set to 100. The individual parameters for each algorithm are as follows. For EnMODE, $F_1$ and $F_2$ are both set to 0.8; $F_3$ and $F_4$ are both set to 0.4; the parameter that controls the division on the objective axis is $H = 8$; the parameter that controls the proximity and distributivity is $\alpha = 5$; and the maximum number of iterations is $iter_{max} = 200$. For the comparative algorithms, the crossover rate and mutation rate are set to 0.7 and 0.3, respectively.
GD [36] measures the proximity degree of the obtained solution set toward the true Pareto-optimal set. It is computed using the quadratic mean of the Euclidean distances from the $N$ compositions in the obtained solution set to the closest composition in the true Pareto-optimal set. Based on [37], we identify the true Pareto-optimal set by selecting $N$ compositions from the union of the solutions obtained by EnMODE and the comparative algorithms. GD is computed as follows.

$GD(CS, CS^*) = \sqrt{\frac{1}{N} \sum_{i=1}^{N} d(cs_i, CS^*)^2}$
where $CS$ and $CS^*$ represent the obtained solution set and the true Pareto-optimal set, respectively, and $d(cs_i, CS^*)$ is the Euclidean distance from $cs_i$ to the closest composition in $CS^*$. The smaller the value of GD, the better the proximity degree.
SP [38] measures the uniformity of the distribution of the obtained solution set. It is computed using the variance of the distances between compositions in the obtained solution set.

SP(CS)
where $d(cs_i, cs_j)$ represents the Euclidean distance from $cs_i$ to the closest $cs_j$, with $cs_j \in CS$ and $j \neq i$, and $\bar{d} = \frac{1}{N} \sum_{i=1}^{N} d(cs_i, cs_j)$. The smaller the value of SP, the more uniform the obtained solution set.
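Both metrics can be sketched in a few lines. GD follows the verbal definition above (quadratic mean of nearest distances). For SP the text only states that it is based on the variance of nearest-neighbour distances, so the sketch assumes the square root of the population variance (divisor $N$); the paper's exact normalization may differ. The function names are hypothetical.

```python
import math
from typing import List, Sequence


def generational_distance(obtained: List[Sequence[float]],
                          reference: List[Sequence[float]]) -> float:
    """Quadratic mean of the distance from each obtained vector to the reference set."""
    squared = [min(math.dist(q, r) for r in reference) ** 2 for q in obtained]
    return math.sqrt(sum(squared) / len(obtained))


def spacing(obtained: List[Sequence[float]]) -> float:
    """Assumed SP: standard deviation (divisor N) of nearest-neighbour distances."""
    nearest = [
        min(math.dist(q, other) for j, other in enumerate(obtained) if j != i)
        for i, q in enumerate(obtained)
    ]
    mean = sum(nearest) / len(nearest)
    return math.sqrt(sum((mean - d) ** 2 for d in nearest) / len(nearest))


# Evenly spaced points give a spacing of zero.
even = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
sp_even = spacing(even)
gd_example = generational_distance([(0.0, 3.0), (4.0, 0.0)], [(0.0, 0.0)])
```

`spacing` needs at least two vectors, since each vector must have a nearest neighbour other than itself.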
The values of the evaluation metrics are affected by the number of abstract services and concrete services. Thus, different parameter configurations are used: the number of abstract services varies from 5 to 50 with a step of 5, and the number of concrete services varies from 100 to 1000 with a step of 100. The experimental results on different test cases show that EnMODE has better GD values than the competing approaches MOGP, NSGA-III-DDR, and MS-DABC. A likely reason is that, as populations evolve, there is less evolutionary pressure, whereas the dual strategy proposed in Section 4 provides sufficient evolutionary pressure for the population at different stages.

Analysis of the
In the early stage, the exploration ability of "rand/2/bin" is exercised with a high probability. More and more services are utilized to compose different value-added services, which yields higher-quality compositions with a higher chance of success. As the number of iterations increases, "current-to-best/1/bin" is gradually utilized to search around the potential high-quality region. The optimized information guides the exploration toward the potential optimized region, which makes it possible for EnMODE to converge to the true PF. "rand/2/bin" is used simultaneously to explore more new compositions. The usage of "current-to-best/1/bin" is increased during the later stage so that the creation operator uses more information to generate high-quality compositions. Even if the search status changes as the numbers of concrete services and abstract services increase, EnMODE provides sufficient evolutionary pressure to reduce the influence. Because MOGP, NSGA-III-DDR, and MS-DABC ignore the evolutionary pressure, they have worse proximity than EnMODE.

Analysis of Uniform Distribution Problem.
The SP results of each algorithm over 10 test cases for different numbers of concrete services are shown in Figure 4. EnMODE has the best SP values on the test cases with 300, 400, 500, 700, 800, 900, and 1000 concrete services, and NSGA-III-DDR has the best SP values on the test cases with 100, 200, and 600 concrete services. Overall, the EnMODE algorithm is the most competitive, while the baseline MODE, MOGP, NSGA-III-DDR, and MS-DABC have worse SP values. As the number of concrete services grows, the SP values of every algorithm increase, but the growth rate of EnMODE is lower than that of the other algorithms. The SP results of each algorithm over 10 test cases for different numbers of abstract services are displayed in the corresponding figure. We can conclude from the experimental results on different test cases that EnMODE has a more uniform distribution in terms of SP values than the baseline MODE. EnMODE, with the dual strategy and fine-grained search, thus improves the performance of the algorithm, especially on the uniform distribution problem.
From the experimental results, we also conclude that EnMODE has the most uniform distribution in terms of SP values. We also found that the numbers of concrete services and abstract services influence the SP values. Still, EnMODE has a narrower range of variation in SP values than the other algorithms. As stated in Section 4, the reference vector neighbor search uses two-stage clustering to downsize the problem and then conducts a fine-grained search. The first stage of clustering divides the non-dominated compositions under the guidance of the reference vectors so that similar compositions gather together and distinct ones are separated. The second stage of clustering assembles the dominated compositions around the elites of the first stage in the same way. Two-stage clustering thus achieves a naturally organized decomposition. A cyclic selection is used to distribute the new generation evenly close to the true PF. Even if the numbers of concrete services and abstract services grow, EnMODE provides a fine-grained search that makes the distribution of the obtained solution set over the whole extent of the current PF more uniform. NSGA-III-DDR is only slightly less uniform, possibly because of its good distance dominance relationship. MOGP and MS-DABC drive the evolutionary process with genetic operators, which makes it difficult for them to place the solution set evenly close to the true PF.

Summary of Results.
Based on the experimental results, we have verified that the EnMODE algorithm finds a smaller set of solutions with better proximity and distributivity. Compared with MODE, NSGA-III-DDR, MS-DABC, and MOGP, the reference vector neighbor search gives EnMODE better proximity and distributivity. In addition, as the numbers of concrete and abstract services increase, EnMODE is less affected than the other algorithms.

Conclusion and Future Work
This paper proposes a novel multi-objective differential evolution algorithm as the search scheme. The proposed approach implements a naturally organized decomposition of the MOP, and guides the search of multiple sub-problems around the active reference vectors and high-quality non-dominated compositions. Experimental results verify that the proposed approach is more likely to find a representative set of solutions with good proximity and distributivity.
Future work is expected to investigate the impact of the number of QoS objectives on the optimization problem of MOSC. Furthermore, the proposed approach will be improved to adapt to workflow changes.

Data Availability
The experimental data used to support the findings of this study are available upon request to the author.

3.1. QoS Vectors
Definition 1. (QoS vector) Assuming a web service $ws$ has $M$ attributes, the quality of $ws$ is described by these $M$ attributes, which are considered an $M$-dimensional QoS vector. Thus, the QoS vector for $ws$ is defined as $Q(ws) = \{a_1(ws), a_2(ws), \ldots, a_M(ws)\}$, where $a_m(ws)$ represents the $m$-th QoS attribute value of $ws$ (for $1 \le m \le M$).
(i) For each cluster in the first stage, the top individual is taken as the center of the second-stage clustering.
(ii) The dominated individuals are assigned to the closest center according to the Euclidean distance between the individuals and the centers.

Figure 1: The tendencies of $p_{rand}$ and $p_{best}$.

Figure 2: GD value on test cases with different numbers of concrete services.

Figure 4: SP value on test cases with different numbers of concrete services.

Table 1: Examples of QoS aggregation functions.
Proximity Problem. Figure 2 shows the GD values obtained by the five algorithms on 10 test cases with different numbers of concrete services. From the figure, we can see that the GD values obtained by EnMODE on all test cases are smaller than the GD values obtained by the other algorithms. More specifically, the EnMODE has