Multi-Objective Approach for Energy-Aware Workflow Scheduling in Cloud Computing Environments

We address the problem of scheduling workflow applications on heterogeneous computing systems such as cloud computing infrastructures. In general, cloud workflow scheduling is a complex optimization problem that requires considering different criteria so as to meet a large number of QoS (Quality of Service) requirements. Traditional research in workflow scheduling mainly focuses on optimization constrained by time or cost, without paying attention to energy consumption. The main contribution of this study is a new approach for multi-objective workflow scheduling in clouds, together with a hybrid PSO algorithm that optimizes the scheduling performance. Our method is based on the Dynamic Voltage and Frequency Scaling (DVFS) technique to minimize energy consumption. This technique allows processors to operate at different voltage supply levels at the expense of clock frequency. Using multiple voltage levels involves a compromise between the quality of schedules and energy. Simulation results on synthetic and real-world scientific applications highlight the robust performance of the proposed approach.


Introduction
Cloud computing presents an interesting technology that facilitates the execution of scientific and commercial applications. It provides, on demand, flexible and scalable services to customers through a pay per use basis. It can usually provide three kinds of services: IaaS (Infrastructure as a Service), PaaS (Platform as a Service), and SaaS (Software as a Service). These services are offered with different service levels so as to meet the needs of various customer groups. Although many cloud services have a similar functionality (e.g., computing services, storage services, network services, etc.), they differ from each other by non-functional qualities termed QoS (Quality of Service) parameters, such as service time, service cost, service availability, service energy consumption, service utilization, and so forth.
These QoS parameters may be defined and proposed by different SLAs (Service Level Agreements). An SLA specifies the QoS requirements of negotiated resources, the minimum expectations and limits that exist between consumers and providers. Applying such an SLA represents a binding contract. Lack of such agreements can lead applications to move away from the cloud and will compromise the future growth of cloud computing.
Several scientific applications, such as those in bioinformatics, chemistry, and astronomy, contain a great number of tasks that have precedence constraints. They can be modeled as DAGs (Directed Acyclic Graphs). These scientific workflows typically involve complex data of different sizes and long-running computer simulations. They need high computation power and the availability of large infrastructures, which grid and, more recently, cloud computing environments provide with different QoS levels.
Due to the importance of workflow applications, several research projects have been conducted to develop workflow management systems with scheduling algorithms. The projects: Condor Dagman [1], Gridbus toolkit [2], Iceni [3], Pegasus [4], and so forth, are designed for grids, whereas cloudbus toolkit [5], SwinDeW-C [6], VGrADS [7], and so forth, are developed for clouds. These systems can be viewed as a type of platform service facilitating the automation of scientific and commercial applications on the grid and cloud by masking their orchestrations and executions. 2 The Scientific World Journal In order to effectively schedule the tasks and data applications on these cloud environments, workflow management systems require more elaborated scheduling strategies to meet QoS constraints and the precedence relationships between workflow tasks. The study of workflow scheduling is becoming an important challenge in the area of cloud computing.
Workflow scheduling in the cloud is a difficult problem. It becomes even more difficult when several factors must be considered, namely, (1) the various QoS requirements of customers, such as service response time, service cost, and so forth; (2) the heterogeneity, dynamicity, and elasticity of cloud services; (3) the various ways of combining these services to execute workflow tasks; (4) the transfer of large volumes of data, and so forth. Moreover, workflow scheduling is a combinatorial problem for which it is impossible to find the globally optimal solution by using simple algorithms or rules. It is well known to be NP-complete [8], and its difficulty grows with the problem size.
The workflow scheduling problem has been widely studied in many previous works [9][10][11][12]. Most of these works have concentrated on only two QoS parameters, namely, the deadline and the budget. In this paper, we extend these works to handle multiple QoS requirements. We address QoS-based workflow scheduling, which aims to minimize the cost and the total execution time of user applications as specified in the SLA. Furthermore, the scheduler must also be able to schedule workflow tasks so as to maximize the provider's profits by minimizing energy consumption while preserving the users' QoS preferences. We achieve this by using an iterative method called Multi-Objective Discrete Particle Swarm Optimization (MODPSO) combined with the Dynamic Voltage and Frequency Scaling (DVFS) technique. The latter allows a compromise between system performance and energy consumption.
The proposed approach is assessed by simulation runs on a set of synthetic and real-world scientific applications. Simulation results show that this new multi-objective algorithm significantly improves on the performance of related approaches.
The remainder of this paper is organized as follows. Section 2 reviews several related works. Section 3 presents the problem modeling of the QoS based workflow scheduling. Section 4 describes in detail our scheduling heuristic called DVFS-MODPSO. Section 5 shows an experimental evaluation of our heuristic. Section 6 concludes the paper and discusses some future works.

Related Work
The workflow scheduling problem in heterogeneous computing systems is an NP-hard optimization problem [8]: no polynomial-time algorithm is known for it, and the amount of computation needed by exact methods to find optimum solutions increases exponentially with the problem size. Previous works have proposed many heuristic and meta-heuristic based approaches [13][14][15][16] to solve this problem. One of the most widely used heuristics for scheduling workflow applications is the Heterogeneous Earliest Finish Time (HEFT) algorithm developed by Topcuoglu et al. [17]. HEFT is a static scheduling algorithm that attempts to minimize execution time (makespan). It preserves the workflow precedence constraints and produces a good schedule length.
Most of these previous works have focused on minimizing the workflow execution time without considering the users' budget constraint. However, with the market-oriented business model of cloud computing environments, where users are billed for their consumption of resources, several works that consider users' budget and deadline have been proposed [18][19][20][21]. In [22], a study is presented that shows how to schedule scientific workflow applications with budget and deadline constraints onto computational grids using genetic algorithms. The authors of [6] proposed an improved cost-based scheduling algorithm for efficiently mapping tasks to available resources in the cloud. In [9], a particle swarm optimization (PSO) based heuristic is used to minimize the execution cost of scheduling workflow applications on cloud resources.
Besides makespan and cost, energy consumption is becoming more and more important in cloud computing environments. Cloud providers must adopt measures not only to meet the users' QoS requirements, but also to ensure that their profit margin is not dramatically reduced by high energy consumption. Energy efficiency can conflict with the other QoS requirements (makespan, cost), so incorporating energy consumption into workflow scheduling adds another layer of complexity. Therefore, recent works have concentrated on developing energy-aware scheduling algorithms. They have examined various techniques such as dynamic power management, Dynamic Voltage and Frequency Scaling (DVFS), and resource hibernation [23][24][25][26]. The authors of [27] presented an online dynamic power management strategy with many power-saving states. They proposed a min-min based energy-aware scheduling algorithm to minimize energy consumption in heterogeneous computing systems. In [26], a dynamic slack allocation technique is presented, which tries to use idle (slack) time slots of processors to lower the supply voltage (frequency/speed). These slack time slots occur due to earlier completion and/or dependencies of tasks. Several DVS-based approaches for slack allocation have been proposed for both independent [28][29][30][31][32][33][34][35] and precedence-constrained [36][37][38][39][40][41][42] tasks. In [43], an energy-aware scheduling algorithm and a detailed discussion of slack time computation are presented. This scheduling algorithm reduces voltages during the communication phases between parallel tasks on homogeneous processors. In [44], an Energy Conscious Scheduling heuristic (ECS) is proposed. The heuristic is devised with relative superiority as a novel objective function, which takes into account energy and makespan. ECS is used to improve the bi-objective genetic algorithm proposed in [45]. The latter has been extended in [46] to a parallel model.
All of the works presented above have focused on optimizing either one or two objectives, but none of them considers the relationships between several objectives, namely, the relationship between energy, makespan, and cost. They do not take into account how each of these criteria can affect the others. To address these gaps, we propose a Multi-Objective Discrete Particle Swarm Optimization algorithm combined with the DVFS technique (DVFS-MODPSO) to optimize all three objectives at the same time. Our new approach provides a set of solutions named Pareto solutions (i.e., non-dominated solutions), enabling the user to select the desired tradeoff.
To the best of our knowledge, none of the previous scheduling approaches deals with the three-dimensional makespan/cost/energy optimization when scheduling workflow applications on heterogeneous computing environments such as the cloud; this constitutes the key novelty of our work.

Problem Modeling
In this section, we describe our system model in a formal way. Our ultimate goal is to distribute workflow tasks among cloud services so as to optimize both the users' QoS criteria and the cloud providers' profits by reducing the energy consumption of their services. Therefore, we first present the cloud computing model. Then, we describe our workflow model and the QoS parameters we deal with. We conclude this section by describing the scheduling model, formalized as the multi-objective optimization problem we solve.

Cloud Computing Model.
The cloud computing system used in this work is a set of resources offered by a cloud provider to run client applications. Our cloud model is inspired by the model described in [45]. We assume that the cloud is hosted in data centers composed of heterogeneous machines. These data centers deliver a variety of services hosted on thousands of IT servers, which are made available as subscription-based services in a pay-as-you-go model. In our model, the cloud computing system consists of a set $P = \{p_1, p_2, \ldots, p_m\}$ of heterogeneous processors which are fully interconnected. The processors have varied processing capabilities delivered at different processing prices (see ec in Table 1). Each processor $p_j \in P$ is DVFS-enabled; that is, it can operate with different VSLs (Voltage Scaling Levels, i.e., different clock frequencies). For each processor $p_j \in P$, a set $V_j$ of VSLs is randomly and uniformly distributed among three different sets of VSLs (Table 1). We consider that processors consume energy during periods of inactivity; that is, when a processor is idling, it is assumed that the lowest voltage is supplied [44]. Because clock frequency transitions usually take a negligible amount of time, these overheads are not considered in this paper; including them would have no bearing on the overall model of the proposed study.
Additionally, each processor $p_j \in P$ has a set of links $L_j = \{l_{j,1}, l_{j,2}, \ldots, l_{j,m}\}$, $1 \le k \le m$, where $l_{j,k} \in \mathbb{R}^{+}$ is the available bandwidth, measured in megabits per second (Mbps), of the link between processors $p_j$ and $p_k$, with $l_{j,j} = 1$. We assume that a message can be transmitted from one processor to another while a task is executed on the recipient processor. Finally, communication between tasks executed on the same processor is neglected. Table 1 shows the DVFS levels, relative speeds (R.Speed), and execution costs (ec) for three processor classes (TURION MT-34, OMAP, PENTIUM M).

Workflow Application Model.
We model a cloud workflow application as a Directed Acyclic Graph (DAG), denoted as $G(T, E)$. The set of nodes $T = \{t_1, \ldots, t_n\}$ represents the tasks in the workflow application; the set of arcs $E$ denotes precedence constraints and the control/data dependencies between tasks. An arc is of the form $e_{i,j} = (t_i, t_j) \in E$, where $t_i$ is called the parent task of $t_j$, $t_j$ is the child task of $t_i$, and $d_{i,j}$ is the data produced by $t_i$ and consumed by $t_j$. We assume that a child task cannot be executed until all of its parent tasks have been completed. In a given task graph, a task with no parent is referred to as an entry task, and one without any child is called an exit task. Since our algorithm requires a single entry task and a single exit task, we add two dummy tasks $t_{entry}$ and $t_{exit}$, which have zero execution time, to the beginning and the end of the workflow, respectively. These dummy tasks are connected with zero-weight arcs to the actual entry and exit tasks, respectively. We assume that each task $t_i \in T$ has an associated basic execution time which is an independent value for each machine. We denote by $w_{i,j}$ the basic computation time of task $t_i$ on compute resource $p_j$ at maximum speed and voltage (i.e., it corresponds to Level 1 in Table 1). The average execution time of task $t_i$ is defined as:
$$\bar{w}_i = \frac{1}{m} \sum_{j=1}^{m} w_{i,j} \tag{1}$$
The real computation time of task $t_i$ on machine $p_j$ using relative execution speed $rs_j$ is defined as:
$$w^{*}_{i,j} = \frac{w_{i,j}}{rs_j} \tag{2}$$
We also assume that every edge $(t_i, t_j) \in E$ is associated with a value $tr_{i,j}$ representing the time needed to transfer data from $t_i$ to $t_j$. The transfer time can be calculated according to the bandwidth $l_{x,y}$ between the resources executing these tasks ($p_x$ and $p_y$, resp.) as follows:
$$tr_{i,j} = \frac{d_{i,j}}{l_{x,y}} \tag{3}$$
However, a communication time is only required when two tasks are assigned to different processors; when both tasks are assigned to the same processor, the communication time is neglected, that is, $tr_{i,j} = 0$. In general, the execution costs (ec) and transmission costs (trc) are inversely proportional to the execution times and transmission times, respectively.
We define $pred(t_i)$ as the set of all predecessors of $t_i$ and $succ(t_i)$ as the set of all successors of $t_i$. An ancestor of node $t_i$ is any node that is contained in $pred(t_i)$, or any node that is also an ancestor of any node contained in $pred(t_i)$.
The Earliest Start Time and the Earliest Finish Time of a task $t_i$ on a processor $p_j$ are represented as $EST(t_i, p_j)$ and $EFT(t_i, p_j)$, respectively. $EST(t_i)$ and $EFT(t_i)$ represent the earliest start and finish times on any processor, respectively.
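The three timing quantities described above (average execution time, real execution time under a reduced relative speed, and transfer time) can be sketched in a few lines. This is only an illustrative sketch: the function names are hypothetical, the relative speed is assumed to be a fraction in (0, 1] of the maximum speed, and the data size and bandwidth are assumed to be expressed in compatible units (for example, megabits and Mbps).

```python
def average_execution_time(basic_times):
    """Mean of one task's basic execution times over all processors."""
    return sum(basic_times) / len(basic_times)


def real_execution_time(basic_time, relative_speed):
    """Basic execution time stretched by the relative speed of the chosen VSL.

    relative_speed is assumed to be a fraction of the maximum speed (0 < rs <= 1).
    """
    if not 0 < relative_speed <= 1:
        raise ValueError("relative_speed must be in (0, 1]")
    return basic_time / relative_speed


def transfer_time(data_size, bandwidth, same_processor):
    """Data size divided by link bandwidth; zero when both tasks share a processor."""
    return 0.0 if same_processor else data_size / bandwidth
```

For instance, a task whose basic time is 10 time units takes 20 time units when its processor runs at half of its maximum speed, which is the makespan price paid for the lower voltage.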

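$AV(t_i, p_j)$ is defined as the earliest time when processor $p_j$ will be available to begin executing task $t_i$. Hence,
$$EST(t_i, p_j) = \max\left\{ AV(t_i, p_j),\ \max_{t_k \in pred(t_i)} \left( EFT(t_k, p_k) + tr_{k,i} \right) \right\} \tag{4}$$
$$EFT(t_i, p_j) = EST(t_i, p_j) + w^{*}_{i,j} \tag{5}$$
where $p_k$ is the processor on which the predecessor $t_k$ is scheduled. Note that the Actual Start Time and Actual Finish Time of a task $t_i$ on a processor $p_j$, denoted as $AST(t_i, p_j)$ and $AFT(t_i, p_j)$, can be different from its earliest start time $EST(t_i, p_j)$ and earliest finish time $EFT(t_i, p_j)$ if the actual finish time of another task scheduled on the same processor is later than $EST(t_i, p_j)$ [44]. Figure 1 depicts a workflow application with 10 tasks, and Table 2 provides its details (given in [17]). The values presented in the last column of the table represent the priority of the tasks. The priority of task $t_i$, represented by $Pr(t_i)$, is computed recursively by traversing the DAG upward, starting from the exit task $t_{exit}$, as follows:
$$Pr(t_i) = \bar{w}_i + \max_{t_j \in succ(t_i)} \left\{ tr_{i,j} + Pr(t_j) \right\} \tag{6}$$
where the maximum is taken over the successors of $t_i$; for the exit task, which has no successor, the priority reduces to $\bar{w}_{exit}$.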
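The recursive priority and the earliest start time described above can be illustrated as follows. This is a hedged sketch rather than the authors' implementation: the function names and the dictionary-based graph representation are hypothetical, the priority of a task is assumed to be its average execution time plus the largest "transfer time plus successor priority" over its successors, and the transfer times are assumed to be given per edge (the caller passes 0 for tasks placed on the same processor).

```python
from functools import lru_cache


def priorities(succ, avg_w, tr):
    """Upward priority of every task.

    succ:  dict task -> list of successor tasks
    avg_w: dict task -> average execution time
    tr:    dict (parent, child) -> transfer time
    """
    @lru_cache(maxsize=None)
    def pr(task):
        # Largest (transfer + priority) over successors; 0 for the exit task.
        tail = max((tr[(task, c)] + pr(c) for c in succ.get(task, [])), default=0.0)
        return avg_w[task] + tail

    return {t: pr(t) for t in avg_w}


def earliest_start(task, proc_available, pred, finish, tr):
    """Earliest start: the processor must be free and all parent data must have arrived."""
    ready = max((finish[p] + tr[(p, task)] for p in pred.get(task, [])), default=0.0)
    return max(proc_available, ready)


# Tiny diamond-shaped workflow with made-up numbers: a -> {b, c} -> d
succ = {"a": ["b", "c"], "b": ["d"], "c": ["d"]}
avg_w = {"a": 2, "b": 3, "c": 4, "d": 1}
tr = {("a", "b"): 1, ("a", "c"): 2, ("b", "d"): 1, ("c", "d"): 1}
ranks = priorities(succ, avg_w, tr)
order = sorted(ranks, key=ranks.get, reverse=True)  # schedule highest priority first
```

Sorting tasks by decreasing priority yields an order that respects the precedence constraints, since a parent always has a strictly larger priority than its children when execution times are positive.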

QoS Parameter Models
3.3.1. Energy Model. Among the main system-level energy-saving techniques, Dynamic Voltage Scaling (DVS) operates on a simple principle: it decreases the supply voltage (and thus the clock frequency) of the CPU so as to consume less power.
In this work, we use an energy model derived from the power consumption model of digital complementary metal-oxide semiconductor (CMOS) logic circuits [44]. Under this model, the processor power is dominated by the dynamic power, which is given by:
$$P_{dynamic} = A\, C_{ef}\, V^{2} f \tag{7}$$
where $A$ is the number of switches per clock cycle, $C_{ef}$ denotes the effective charged capacitance, $V$ is the supply voltage, and $f$ denotes the operational frequency. Equation (7) shows that the supply voltage is the dominant factor; hence, its reduction would be most influential in lowering power consumption.
The energy consumption of the execution of a workflow application used in this paper is defined as:
$$E_{ex} = \sum_{i=1}^{n} A\, C_{ef}\, V_i^{2} f_i\, w_i^{*} = \sum_{i=1}^{n} K\, V_i^{2} f_i\, w_i^{*}$$
where $K = A\, C_{ef}$ is assumed constant for a given machine; $V_i$ and $f_i$ are the voltage supply and frequency of the processor on which task $t_i$ is executed, respectively; and $w_i^{*}$ is the real completion time of task $t_i$ on the scheduled processor. In the idle time, the processor turns into sleep mode and thus the voltage supply and relative frequency are at the lowest level. So, the energy consumption during idle periods of processors is defined as:
$$E_{idle} = \sum_{j=1}^{m} \sum_{idle_{j,k} \in IDLE_j} K\, V_{min_j}^{2}\, f_{min_j}\, L_{j,k}$$
where $IDLE_j$ is the set of idling slots on machine $p_j$, $V_{min_j}$ ($f_{min_j}$) is the lowest supply voltage (frequency) on $p_j$, and $L_{j,k}$ is the amount of idling time for $idle_{j,k}$. Then the total energy $E_{total}$ utilized by the cloud system for completion of the workflow application can be defined as follows:
$$E_{total} = E_{ex} + E_{idle}$$
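The energy model above (execution energy proportional to the squared voltage, the frequency, and the real execution time, plus an idle term at the lowest voltage and frequency) can be sketched as follows. The function names and the numeric values are hypothetical; the machine constant is called `k` here and set to 1 for illustration, and voltages and frequencies are treated as relative values.

```python
def execution_energy(executions, k=1.0):
    """Sum of k * V^2 * f * t over (voltage, frequency, real_time) tuples."""
    return sum(k * v ** 2 * f * t for v, f, t in executions)


def idle_energy(idle_slot_lengths, v_min, f_min, k=1.0):
    """Energy spent in idle slots, assuming the lowest voltage and frequency."""
    return sum(k * v_min ** 2 * f_min * length for length in idle_slot_lengths)


def total_energy(executions, idle_slot_lengths, v_min, f_min, k=1.0):
    """Execution energy plus idle energy for one machine."""
    return execution_energy(executions, k) + idle_energy(idle_slot_lengths, v_min, f_min, k)


# Made-up example: one task at full voltage/frequency for 10 time units,
# one task at half voltage and half frequency for 20 time units,
# and a single idle slot of 4 time units.
executions = [(1.0, 1.0, 10.0), (0.5, 0.5, 20.0)]
energy = total_energy(executions, [4.0], v_min=0.5, f_min=0.5)
```

In this toy example the second task runs twice as long as it would at full speed but consumes a quarter of the first task's energy, which illustrates the makespan/energy compromise introduced by DVFS.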

Cost Model.
As a result of the marketization of current services, most cloud providers have set a price for their services. They have fixed the price for transferring a basic data unit (e.g., per MB) between two services and the price for processing per basic time unit (e.g., per hour). The cost $C_{total}$ of running a cloud workflow is defined in Formula (12). It consists of the processing cost $C_{ex}$ and the data transfer cost $C_{tr}$:
$$C_{total} = C_{ex} + C_{tr} \tag{12}$$
The processing cost for a given task $t_i$ depends on the real completion time of $t_i$ on the scheduled processor ($w_i^{*}$) and on the hourly price of this processor ($ec_j$). Thus, $C_{ex}$ is given by:
$$C_{ex} = \sum_{i=1}^{n} w_i^{*} \cdot ec_j \tag{13}$$
where $p_j$ is the processor on which $t_i$ is scheduled. The data transfer cost ($C_{tr}$) is described as follows:
$$C_{tr} = \sum_{i=1}^{n} \sum_{j=1}^{n} d_{i,j} \cdot trc_{x,y} \tag{14}$$
where $d_{i,j}$ characterizes the output file size from task $t_i$ to task $t_j$, and $trc_{x,y}$ represents the cost of communication from the processor $p_x$ where $t_i$ is mapped to another processor $p_y$ where $t_j$ is mapped. The cost of communication is added to the overall cost only when two tasks have a data dependency between them (i.e., $d_{i,j} > 0$). For two or more tasks running on the same processor, the transfer cost is neglected.
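A small sketch of the cost model described above follows. The data structures and names are hypothetical: each scheduled task is given as a pair (real execution time in hours, hourly price of its processor), and transfer prices are given per MB for each ordered pair of distinct processors.

```python
def processing_cost(scheduled_tasks):
    """Sum of real execution time (hours) times the hourly price of the processor."""
    return sum(hours * price for hours, price in scheduled_tasks)


def transfer_cost(edges, placement, price_per_mb):
    """Sum of data size times transfer price, only for tasks on different processors.

    edges:        dict (parent, child) -> data size in MB
    placement:    dict task -> processor
    price_per_mb: dict (source processor, target processor) -> price per MB
    """
    total = 0.0
    for (parent, child), size in edges.items():
        src, dst = placement[parent], placement[child]
        if src != dst and size > 0:  # same-processor transfers are free
            total += size * price_per_mb[(src, dst)]
    return total


def total_cost(scheduled_tasks, edges, placement, price_per_mb):
    """Processing cost plus data transfer cost."""
    return processing_cost(scheduled_tasks) + transfer_cost(edges, placement, price_per_mb)


# Made-up example: tasks a and c share processor p1, task b runs on p2.
tasks = [(2.0, 0.5), (1.0, 0.25)]
edges = {("a", "b"): 100.0, ("a", "c"): 50.0}
placement = {"a": "p1", "b": "p2", "c": "p1"}
prices = {("p1", "p2"): 0.01}
cost = total_cost(tasks, edges, placement, prices)
```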

Scheduling Model.
We are given (1) a cloud provider that offers a set of heterogeneous processors and (2) a user workflow application composed of a set of tasks that have to be executed on these processors. The workflow scheduling problem is to construct a mapping $S$ of tasks to processors (without violating precedence constraints) that minimizes the following conflicting objectives: makespan, cost, and energy consumption. Therefore, the workflow scheduling problem can be formulated as a mathematical optimization problem:
$$\min_{S}\ \big( makespan(S),\ C_{total}(S),\ E_{total}(S) \big), \qquad makespan(S) = AFT(t_{exit}) \tag{15}$$

Workflow Scheduling Based on Discrete Particle Swarm Optimization
This section starts with a brief overview of multi-objective combinatorial optimization and of the particle swarm optimization algorithm. Afterwards, our new Multi-Objective Discrete Particle Swarm Optimization combined with the DVFS technique is presented.
In general, a multi-objective optimization problem (MOP) has no single solution that is optimal with regard to all objectives. This is also the case for the multi-objective optimization problem addressed in this paper. As given in (15), there are three conflicting objectives: minimizing execution time, minimizing execution cost, and minimizing energy consumption. In such problems, the desired solution is considered to be the set of potential solutions which are all optimal in some objectives. This set is known as the Pareto optimal set. We provide some definitions of the Pareto concepts used in MOPs as follows (without loss of generality, we suppose that the objectives are to be minimized):
(i) Pareto dominance. For two decision vectors $x_1$ and $x_2$, dominance (denoted by $\prec$) is defined as follows:
$$x_1 \prec x_2 \iff \forall i:\ f_i(x_1) \le f_i(x_2)\ \wedge\ \exists j:\ f_j(x_1) < f_j(x_2)$$
where $f_i$ denotes the $i$th objective function. The decision vector $x_1$ is said to dominate $x_2$ if and only if $x_1$ is as good as $x_2$ considering all objectives and $x_1$ is strictly better than $x_2$ in at least one objective.
(ii) Pareto optimality. A decision vector $x_1$ is said to be Pareto optimal if and only if
$$\nexists\, x_2 :\ x_2 \prec x_1$$
(iii) Pareto optimal set. The Pareto optimal set is the set of all Pareto optimal decision vectors.
(iv) Pareto optimal front. The Pareto optimal front is the image of the Pareto optimal set in the objective space.
A decision vector $x_1$ is said to be non-dominated regarding a given set if $x_1$ is not dominated by any decision vector in the set.
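The dominance relation and the notion of a non-dominated set defined above translate directly into code. The sketch below assumes that every objective is to be minimized and that each solution is represented by a tuple of objective values (here, hypothetically, makespan, cost, and energy); the function names and the sample values are illustrative only.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def non_dominated(points):
    """Keep the points that are not dominated by any other point of the set."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Made-up objective vectors: (makespan, cost, energy)
candidates = [(10, 5, 8), (12, 4, 9), (11, 6, 9), (10, 5, 8.5)]
front = non_dominated(candidates)
```

Here `(12, 4, 9)` stays in the front even though it is slower and more energy-hungry than `(10, 5, 8)`, because it is cheaper: neither of the two dominates the other.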
A Pareto optimal decision vector cannot be improved in any objective without causing degradation in at least one other objective. A decision vector is said to be Pareto optimal when it is not dominated in the whole search space.

The Standard PSO. PSO is a population-based stochastic optimization technique developed by Kennedy and Eberhart in 1995 [47]. It is inspired by the social behavior of insect colonies, bird flocks, fish schools, and other animal societies. It is also related to evolutionary computation. PSO has attracted significant attention from many researchers due to both its simplicity of use and its optimization via social behavior. In fact, PSO performs well and has a low computational cost. It is effective and easy to implement, as it uses numerical encoding.
A particle in PSO is analogous to a fish or bird moving in the $D$-dimensional search space. All particles have fitness values indicating their performance, which are problem specific, and velocities which direct the flight of the particles. Each particle position at any given time is influenced by both its own best position, called $pBest$, and the position of the best particle in the problem space, referred to as $gBest$. Therefore, particles tend to fly towards a better search area during the search process. A particle's status in the search space is characterized by two elements, namely its velocity and position, which are updated in every generation as follows:
$$\begin{aligned} V_i^{k+1} &= \omega V_i^{k} + c_1 r_1 \left( pBest_i - X_i^{k} \right) + c_2 r_2 \left( gBest - X_i^{k} \right) \\ X_i^{k+1} &= X_i^{k} + V_i^{k+1} \end{aligned} \tag{21}$$
where $V_i^{k+1}$ is the velocity of particle $i$ at iteration $k+1$, $V_i^{k}$ is the velocity of particle $i$ at iteration $k$, $\omega$ is the inertia weight, $c_1$ and $c_2$ are the acceleration coefficients (cognitive and social coefficients), $r_1$ and $r_2$ are random numbers between 0 and 1, $X_i^{k}$ is the current position of particle $i$ at the $k$th iteration, $pBest_i$ is the best previous position of the $i$th particle, $gBest$ is the position of the best particle in the swarm, and $X_i^{k+1}$ is the position of the $i$th particle at iteration $k+1$.
The procedure for standard PSO is as follows:
(1) Initialize a population of particles with random positions and velocities in the search space.
(2) Evaluate the objective values of all particles, set $pBest$ of each particle equal to its current position, and set $gBest$ equal to the position of the best initial particle.
(3) Update the velocity and the position of each particle according to (21).
(4) Map the position of each particle into the solution space and evaluate its fitness value according to the desired optimization fitness function.
(5) For each particle, compare its current objective value with its $pBest$ value. If the current value is better, then update $pBest$ with the current position and objective value.
(6) Determine the particle of the current population with the best objective value. If this objective value is better than that of $gBest$, then update $gBest$ with the current best particle.
(7) If the stopping criterion is met, then output $gBest$ and its objective value; otherwise, go to Step (3).
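The seven steps above can be sketched for a continuous, single-objective problem as follows. This is a generic illustration of standard PSO, not the discrete multi-objective variant proposed in this paper. The function names, the parameter values (inertia weight 0.7, acceleration coefficients 1.5, 30 particles), the sphere test function, and the clamping of positions to the search bounds are all assumptions made for the example.

```python
import random


def pso(objective, dim, bounds, swarm_size=30, iterations=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize `objective` over a box; returns (best position, best value)."""
    rng = random.Random(seed)
    lo, hi = bounds
    span = hi - lo
    # Step 1: random positions and velocities.
    X = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(swarm_size)]
    V = [[rng.uniform(-span, span) for _ in range(dim)] for _ in range(swarm_size)]
    # Step 2: initial personal bests and global best.
    pbest = [x[:] for x in X]
    pbest_val = [objective(x) for x in X]
    g = min(range(swarm_size), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iterations):  # Step 7: fixed iteration budget as stopping criterion
        for i in range(swarm_size):
            for d in range(dim):
                # Step 3: velocity and position update.
                r1, r2 = rng.random(), rng.random()
                V[i][d] = (w * V[i][d]
                           + c1 * r1 * (pbest[i][d] - X[i][d])
                           + c2 * r2 * (gbest[d] - X[i][d]))
                # Clamping to the bounds is an added safeguard (assumption).
                X[i][d] = min(hi, max(lo, X[i][d] + V[i][d]))
            # Steps 4 and 5: evaluate and update the personal best.
            value = objective(X[i])
            if value < pbest_val[i]:
                pbest[i], pbest_val[i] = X[i][:], value
        # Step 6: update the global best.
        g = min(range(swarm_size), key=lambda i: pbest_val[i])
        if pbest_val[g] < gbest_val:
            gbest, gbest_val = pbest[g][:], pbest_val[g]
    return gbest, gbest_val


def sphere(x):
    """Simple convex test function with its minimum (0) at the origin."""
    return sum(v * v for v in x)


best, best_val = pso(sphere, dim=2, bounds=(-5.0, 5.0))
```

Note that the loop returns to the velocity update rather than to the initialization of the personal bests, so that the memory accumulated in the personal and global bests is preserved across iterations.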
The original design of PSO is appropriate for finding solutions to continuous optimization problems. However, as the workflow scheduling discussed in this paper is both a discrete and multi objective problem in nature, we propose an effective approach to address this problem by using a discrete version of the Multi-Objective PSO (MODPSO) combined with DVFS technique. The key issues of our approach consist of: (1) defining the position and velocity of the particles based on the features of discrete variables of the workflow scheduling; (2) solving the multi-objective aspect of the problem by modifying PSO so as to generate a set of nondominated solutions satisfying the different objectives under consideration instead of one solution. For the sake of clarity, the variables and rules of DVFS-MODPSO for solving workflow scheduling can be depicted as follows:

Handling Workflow
The position of a particle represents a feasible solution to the scheduling problem. It consists of a set of ⟨task($t_i$), service($p_j$), VSL($v_k$)⟩ triplets. Each triplet means that a task ($t_i$) is assigned to a processor ($p_j$) with a voltage scaling level ($v_k$). It also indicates that the position satisfies the precedence constraints between tasks.
The process of generating a new position for a selected particle in the swarm is depicted in the following formulas:
The operator definitions are as follows:
(i) The subtract operator (⊖). The difference between two particle positions, designated as $X_1$ and $X_2$, is defined as a set of triplets in which each triplet ⟨task, service, VSL⟩ shows whether the contents of the corresponding elements in $X_1$ are different from those of $X_2$ or not. If so, that triplet gets its values (service and VSL) from the position that has the lowest value of the VSL. For those triplets that have the same content in $X_1$ and $X_2$, their corresponding VSLs are decreased. (Note that scaling the VSL causes the energy, makespan, and cost to fluctuate.)
(ii) The multiply operator (⊗). The multiplication between a number and a velocity is defined as a set of triplets, obtained as follows: a threshold $\sigma \in [0,1]$ is defined, and a random number $r$ is generated for each triplet ⟨task, service, VSL⟩; $r$ and $\sigma$ are then compared: when $r \ge \sigma$, the VSL of the triplet is decreased; otherwise, it is increased. This operator adds exploration ability to the algorithm.
(iii) The add operator (⊕). The addition of two positions is defined as retaining the dominant one, that is, the position that is not dominated by the other.
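The subtract and multiply operators described above can be sketched as follows. Several points are assumptions made for this illustration and are not fixed by the text: a position is represented as a list of (task, service, VSL) tuples in the same task order, the VSL is stored as an integer level index clamped to the range 0 to `max_level`, "decreasing" or "increasing" a VSL means moving the index by one, and the VSL is decreased when the random number is greater than or equal to the threshold. The function names are hypothetical.

```python
import random


def subtract(x1, x2):
    """Sketch of the subtract operator on two positions of equal length."""
    result = []
    for (task, s1, v1), (_, s2, v2) in zip(x1, x2):
        if (s1, v1) != (s2, v2):
            # Differing triplets: take service and VSL from the lower-VSL position.
            result.append((task, s1, v1) if v1 <= v2 else (task, s2, v2))
        else:
            # Identical triplets: decrease the VSL (clamped at level 0).
            result.append((task, s1, max(0, v1 - 1)))
    return result


def multiply(threshold, velocity, max_level, rng=random):
    """Sketch of the multiply operator: randomly decrease or increase each VSL."""
    result = []
    for task, service, vsl in velocity:
        r = rng.random()  # r lies in [0, 1)
        if r >= threshold:
            vsl = max(0, vsl - 1)
        else:
            vsl = min(max_level, vsl + 1)
        result.append((task, service, vsl))
    return result


a = [("t1", "p1", 2), ("t2", "p2", 1)]
b = [("t1", "p3", 1), ("t2", "p2", 1)]
diff = subtract(a, b)
```

With a threshold of 0 every VSL is decreased and with a threshold of 1 every VSL is increased, so the threshold controls how strongly the operator pushes the swarm towards one end of the voltage range.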
Pseudocode 1 outlines the general steps of the DVFS-MODPSO algorithm.
The algorithm begins by initializing the positions and velocities of the particles. To obtain the position of a particle, the VSL (voltage and frequency) of each resource is first randomly initialized; then the HEFT algorithm is applied to generate a feasible and efficient solution that minimizes the makespan. The process is repeated several times to initialize the positions of all particles of the swarm. Initially, the velocity and the best position of each particle are set to the particle itself. The algorithm maintains an external archive EA to store the non-dominated particles found after the evaluation process (based on Pareto dominance using the objectives mentioned in Section 3.3). After all these initializations, in the main loop of the algorithm, the new velocity and position of each particle are calculated after selecting the best overall position in the external archive and possibly performing a mutation; then the particle is evaluated and its corresponding $pBest$ is updated. The external archive is updated after every iteration. Once the termination condition is reached, the external archive containing the Pareto front is returned as the result.
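The maintenance of the external archive described above can be sketched with a common archive-update rule: a candidate enters the archive only if no archived member dominates (or equals) it, and any members it dominates are removed. This rule is an assumption about how the archive is kept non-dominated, since the text does not spell it out; the function names and the objective tuples (makespan, cost, energy) are illustrative.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in one."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def update_archive(archive, candidate):
    """Return a new archive that stays mutually non-dominated."""
    if any(dominates(member, candidate) or member == candidate for member in archive):
        return list(archive)  # candidate brings nothing new
    survivors = [m for m in archive if not dominates(candidate, m)]
    return survivors + [candidate]


# Made-up evaluations arriving one after another.
archive = []
for evaluated in [(10, 5, 8), (12, 4, 9), (11, 6, 9), (9, 4, 7)]:
    archive = update_archive(archive, evaluated)
```

In this run the last candidate dominates everything found before it, so the archive collapses to that single solution; in a real run with conflicting objectives the archive typically holds many mutually non-dominated schedules.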

Experimental Evaluations
In this section, we describe the overall setup of our experimentation effort and the results we have obtained from it to validate the new proposed approach.
In our experiments, we simulate a synthetic workflow application described in [17] (see Figure 1) and two common workflow structures: parallel and hybrid structures. We chose two well known real world applications, namely a neuroscience workflow [48] for our parallel application and a protein annotation workflow [49] developed by London e-Science Centre for our hybrid workflow application. Figure 2 shows their simplified representations. The number indicated in the parentheses next to each node represents the length of the task (the number of instructions to execute) in Millions of Instructions (MI). The input and output files of each task range from 10 MB to 1024 MB.
We have performed experiments on 3, 4, 5, 6 and 12 resources whose characteristics are shown in Table 1. We choose the pricing model associated with Amazon EC2 (http://aws.amazon.com/ec2) for the processing costs of the different classes of resources, and the pricing model given by Amazon CloudFront (http://aws.amazon.com/cloudfront/) for the costs of transferring data unit between resources.
As for DVFS-MODPSO, the parameter settings are as follows: the swarm size is 50 particles and the maximum number of iterations is 100.
We evaluated the performance of our proposed DVFS-MODPSO on all the workflow instances described previously. Due to the lack of works considering both the heterogeneous configuration of our cloud and the three QoS metrics (makespan, cost, energy) at the same time, we compared our results with those of our own implementation of the HEFT heuristic. HEFT is one of the most widely used heuristics for scheduling DAGs in distributed heterogeneous computing systems, and it optimizes the makespan. As can be seen from these figures, unlike the HEFT algorithm, which gives one solution, our proposed approach provides a set of non-dominated solutions. These results illustrate the basic multi-objective definitions provided in Section 4.1. To compare the two approaches and to analyze the effectiveness of our proposal in terms of makespan, cost, and energy consumption, we follow the methodology described in [43]: we compare the solution provided by HEFT with only one solution of the Pareto front computed by our proposed multi-objective DVFS-MODPSO.
The evaluation process consists of the following steps. For each workflow instance, we first solve the problem using HEFT in order to get one solution. Then, we solve it a second time using our proposed approach to obtain a set EA of Pareto solutions. Next, we select one solution from the set EA: the one closest to the HEFT solution in the sense of Euclidean distance. Finally, the HEFT solution and the selected Pareto solution are compared. The results are reported in Tables 3 and 4. They show the improvement achieved by our proposed approach according to the structure of the workflow applications and the number of processors. Table 3 and Figure 6 show that our proposed approach improves on average the result obtained by HEFT for the synthetic workflow instance. The makespan is reduced by 0.05%, the cost is reduced by 9.93%, and the energy consumption is reduced by 26.40%. Figure 6 shows detailed improvements of the proposed approach according to the number of processors. Table 4 and Figure 7 illustrate the gain obtained by our approach according to the kind of real-world application. Table 4 also shows the detailed improvements when scaling the number of processors. As can be seen, the QoS metrics are improved over HEFT on average by 0.95% for the makespan, 10.8% for the cost, and 8.12% for the energy consumption when using the hybrid workflow application, and by 2.95% for the makespan, 22.15% for the cost, and 20.9% for the energy consumption when using the parallel workflow.
These evaluations confirm the results obtained from the synthetic workflow. This means that by using our DVFS-MODPSO, we are able to improve not only the cost and energy but also the makespan, on which the HEFT algorithm is supposed to provide good results. These results remain true for all kinds of workflow applications.
In our experiments, we limited the number of resources to 12, as we found that, for these kinds of workflow applications, some resources are not used at all. Figure 8 shows the resource loads when scheduling the synthetic workflow on 12 resources. Furthermore, even when the number of resources increases, the total cost and total energy consumption do not always decrease, as illustrated in Figure 9. From this figure, we can see that the QoS metrics (1) first decrease as the number of resources increases up to 6. This can be explained by the fact that when the number of resources increases, fewer tasks are executed on each resource; therefore, tasks can extend their execution times and the resources have more chances to scale down their voltages and frequencies, which can be very effective in reducing total energy consumption. (2) Beyond 6 resources, the QoS metrics begin to rise; the reason is that execution times are dominated by interprocessor communications, which decreases the opportunities for scaling down the voltages and frequencies of resources. Consequently, the threshold number of resources that minimizes the QoS metrics can be obtained.
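The selection step of the evaluation procedure described above (picking, from the set of Pareto solutions, the one closest to the HEFT solution in Euclidean distance, then reporting the relative improvement on each metric) can be sketched as follows. The function names and the numbers are hypothetical and do not come from the reported tables; the distance is computed on raw objective values, since the text does not mention any normalization of the three metrics.

```python
import math


def closest_solution(pareto_set, reference):
    """Pareto solution with the smallest Euclidean distance to the reference."""
    return min(pareto_set, key=lambda s: math.dist(s, reference))


def improvement_percent(reference, selected):
    """Relative reduction (in %) of each objective with respect to the reference."""
    return tuple(100.0 * (r - s) / r for r, s in zip(reference, selected))


# Made-up objective vectors: (makespan, cost, energy)
heft_solution = (100.0, 50.0, 200.0)
pareto_set = [(99.0, 45.0, 150.0), (120.0, 30.0, 100.0)]
chosen = closest_solution(pareto_set, heft_solution)
gains = improvement_percent(heft_solution, chosen)
```

A positive gain means that the selected Pareto solution improves on HEFT for that metric; a negative value would indicate a degradation.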

Conclusion
In this paper, we propose a new algorithm called DVFS-Multi-Objective Discrete Particle Swarm Optimization (DVFS-MODPSO) for workflow scheduling in distributed environments such as cloud computing infrastructures. DVFS-MODPSO simultaneously optimizes several conflicting objectives, namely, the makespan, cost, and energy, in a discrete space. It produces a set of non-dominated solutions, thus providing more flexibility for users to assess their preferences and select a schedule that meets their QoS requirements. Our approach exploits the heterogeneity and the marketization of cloud resources in order to find solutions that optimize makespan and cost. Furthermore, it uses the Dynamic Voltage and Frequency Scaling (DVFS) technique to reduce energy consumption. We have evaluated our algorithm by simulating it with both synthetic and real-world scientific workflow applications having different structures. The results show that the proposed DVFS-MODPSO is able to produce a set of good Pareto optimal solutions. The results also show that our approach provides significant improvement not only in terms of the cost and the energy consumption but also in terms of the makespan, for which the HEFT algorithm is supposed to give better results. Multi-objective optimization in cloud workflow scheduling is not yet a mature domain. Most of the existing works attempt to minimize either the makespan or the cost. In future work, we plan to consider other objectives, such as reliability and security, in addition to the energy consumption. We also intend to apply our algorithm in a real-world cloud or to integrate it into existing cloud toolkits such as Cloudbus in order to compare it with other algorithms.