Energy-Aware Real-Time Task Scheduling for Heterogeneous Multiprocessors with Particle Swarm Optimization Algorithm

Energy consumption in computer systems has become an increasingly important issue. High energy consumption has already damaged the environment to some extent, especially in heterogeneous multiprocessors. In this paper, we first formulate and describe the energy-aware real-time task scheduling problem in heterogeneous multiprocessors. Then we propose a particle swarm optimization (PSO) based algorithm, which can successfully reduce the energy cost and the time for searching feasible solutions. Experimental results show that the PSO-based energy-aware metaheuristic uses 40%–50% less energy than the GA-based and SFLA-based algorithms and spends 10% less time than the SFLA-based algorithm in finding the solutions. Besides, it can also find 19% more feasible solutions than the SFLA-based algorithm.


Introduction
Multiple processing in heterogeneous computing platforms adapts to different types of computing needs. Using multiple processing platforms will improve the system performance and satisfy the increase in energy consumption. However, assigning real-time tasks to a multiprocessor implementation proves to be an NP-hard problem. The problems of real-time task allocation in a heterogeneous environment have been studied extensively in the existing literature. However, most of the studies focus on the performance metric of minimizing the maximum utilization, and these problems can be mapped to the traditional makespan problem [1].
Energy consumption has become a major problem in computer systems. The processor consumes most of the energy, especially in embedded systems, where excessive energy consumption causes serious pollution and waste of resources in the natural environment [2,3]. Therefore, how to reduce processor energy consumption has become a widespread concern. We need to shift the focus from reducing the maximum utilization to reducing energy consumption, under the premise of meeting the specified task deadlines.
Although the problem is NP-hard, there are many approximation algorithms for real-time task allocation in a heterogeneous processor environment. These include traditional real-time task scheduling algorithms such as the deadline-monotonic (DM) algorithm [4], the rate-monotonic (RM) algorithm [5], the least-laxity-first (LLF) algorithm [6], the earliest-deadline-first (EDF) algorithm [5], and the linear programming-based (LP) algorithm [7], as well as swarm intelligence algorithms such as ant colony optimization (ACO) [8], the genetic algorithm (GA) [9][10][11], and the shuffled frog-leaping algorithm (SFLA) [12,13]. Most of the algorithms in these studies do not consider energy consumption. Besides, the number of feasible solutions and energy saving are in conflict. Therefore, we need to find a new algorithm to solve this multiobjective optimization problem.
The heuristic algorithm in [14] is an adaptive algorithmic structure that can be adapted to a relatively wide range of problems. Though many heuristic algorithms exist, the particle swarm optimization (PSO) algorithm has emerged as a novel heuristic algorithm in recent years. This algorithm is inspired by the social behavior of a group of migratory birds that try to reach an unknown destination. Each bird is referred to as a particle. Each particle has a fitness value determined by the function to be optimized and a speed that determines its flight direction and distance; the particles then follow the current optimal particle to search the solution space for the optimal solution. Compared with the genetic algorithm, the PSO algorithm has no processes such as reproduction, crossover, and mutation; it evolves only through simple operations, which makes it easy to implement and more efficient.
The aim of this work is to propose a new algorithm to solve the problem of real-time task scheduling in a heterogeneous processor environment, under the premise of meeting all task deadlines to reduce energy consumption.
The main objectives of this work are as follows.
(1) Formulate the real-time task scheduling problem based on energy awareness and add more constraints. Put the energy consumption as the utility into the constraint condition.
(2) Based on the PSO algorithm, propose an algorithm that solves the energy-aware real-time task scheduling problem, finding as many feasible solutions as possible before the specified deadlines while minimizing the energy consumption.
(3) Through a series of comprehensive experiments, compare the proposed algorithm with the existing traditional algorithms and the other heuristic algorithms, and improve the algorithm so as to achieve the purpose of the optimization.
This paper is organized as follows. Section 2 presents the state of the art of the task scheduling problem in multiprocessor platforms. Section 3 formulates the problem of real-time task scheduling in heterogeneous processors based on energy awareness. Section 4 gives an overview of the PSO algorithm and introduces the proposed energy-aware real-time task scheduling algorithm based on it. We analyze the performance and results of the proposed algorithm in Section 5. The final part gives the conclusion and summary and provides directions for future work.

Related Works
Baruah [15] studies task scheduling in heterogeneous multiprocessor platforms; some improvements have been made to the ACO heuristic algorithms, and the improved algorithm performs very well in finding feasible solutions under time constraints. Braun et al. [14] summarize 11 heuristic algorithms that can be applied to task scheduling in heterogeneous multiprocessor platforms to reduce the execution time. However, Braun et al. assume that each task has an accurate execution time on each machine and has no time constraints. In their experimental results, the GA performs well. So far, the available heuristic algorithms include the GA, ACO algorithm, SA algorithm, PSO algorithm, and SFLA [12,13], and these algorithms have been applied to task scheduling in multiprocessor platforms. Baruah [16] converts the task scheduling problem in multiprocessor platforms into an ILP problem and proposes an approximate polynomial time algorithm. However, the LP problem has a lot of feasible solutions, and a polynomial time algorithm is not guaranteed to find the fundamental solution. Similarly, Leung and Whitehead [4] convert the task scheduling problem in multiprocessor platforms, in which tasks can be divided and have priorities, into an LP problem. However, they assume that each task can be arbitrarily segmented, and this assumption is of limited use in practice.
Baruah [17] puts forward a polynomial time algorithm for the task scheduling problem in multiprocessor platforms in which tasks are preemptive and transitive. Its purpose is to achieve task scheduling under a series of real-time task constraints in heterogeneous processor platforms. However, it ignores the communication overhead and the fact that a task may be divisible.
Task scheduling in multiprocessor platforms needs not only to increase the number of feasible solutions found but also to reduce the energy consumption of each feasible solution found. At present, many researchers study task scheduling in multiprocessor platforms with the aim of reducing energy consumption. In general, however, the PSO algorithm has not been used in these studies.
Cheng et al. [8] propose an improved ACO algorithm for task scheduling in multiprocessor platforms, which can find sufficient feasible solutions while satisfying the time constraints. But this algorithm is not PSO related and does not improve energy consumption. Baruah [17] describes a non-ACO algorithm for multiprocessor scheduling applications. The algorithm can effectively reduce energy consumption and scheduling time, but it requires a premise: all the tasks must have the same computation time. In addition, the algorithm is not PSO-based. Aydin and Yang [18] propose the worst-fit-decreasing algorithm for multiprocessor task scheduling to reduce energy consumption and meet the deadlines. However, this heuristic algorithm is not PSO-based. Zhu et al. [19] put forward a corresponding effective algorithm for multiprocessor task scheduling, but it is not PSO-based either.
Evolutionary and swarm intelligence optimization algorithms, such as the ACO algorithm, evolution strategies, and the GA, can solve multiobjective combinatorial optimization problems and obtain good solutions, but these algorithms are complex and have low efficiency. So, looking for a more effective task scheduling and allocation algorithm is very important.
The particle swarm optimization (PSO) algorithm [20,21] is a new global optimization algorithm; like the other swarm intelligence algorithms, it belongs to the family of intelligent evolutionary computation techniques. It randomly initializes a population and then evaluates it according to the fitness function, so as to determine whether to search further. However, the PSO-based algorithm has no operations such as reproduction, crossover, and mutation; it evolves only through simple arithmetic and is simple and easy to implement. As an important tool of optimization, the PSO-based algorithm can be applied in cloud computing and information retrieval [22,23].

No. | Parameter | Description
4 | $c_1$ | The learning factor of the particle
5 | $c_2$ | The learning factor of the particle
6 | $w$ | The inertia weight of the particle
7 | Rand() | The random function between 0 and 1
8 | Fitness | The quality evaluation standard of the particle

Problem Formulation
Each task is assigned to a particular processor such that neither the computing capacity of the processor nor the deadline of the task is exceeded. In general, the computation time and deadline of each task are known, although some real-time tasks now change dynamically. A series of periodic tasks is assigned to a series of heterogeneous processors without exceeding any deadline. The problem is NP-hard. We solve this problem based on particle swarm optimization (PSO).

Heterogeneous Multiprocessors Platforms.
HMP = $\{P_1, P_2, \ldots, P_m\}$ consists of $m$ heterogeneous processors. Each processor $P_j$ executes only one instruction in each clock cycle, and its speed depends on the type of task. $s_{i,j}$ is the clock frequency, that is, the speed at which $P_j$ performs a specific task $T_i$. $e_{i,j}$ refers to the execution time of $T_i$ on $P_j$: $e_{i,j} = c_i / s_{i,j}$, where $c_i$ is the number of clock cycles needed to execute task $T_i$.

Periodic Task Set.
PTS = $\{T_1, T_2, \ldots, T_n\}$ consists of $n$ real-time tasks. $T_i$ is described by a pair $(e_i, p_i)$, where $e_i$ is the worst-case execution time (WCET) and $p_i$ is the task period. $T_i$ generates an infinite sequence of task instances; each instance runs for at most $e_i$ time units, and successive instances are separated by $p_i$ time units. The deadline of each instance of $T_i$ is $p_i$ time units after its arrival.

Real-Time Task Scheduling, Energy Utilization, and Energy Consumption.
We build a task scheduling situation matrix $X_{n \times m}$ (see Table 2). Matrix element $x_{i,j}$ indicates whether task $T_i$ is assigned to processor $P_j$: the value 0 indicates that task $T_i$ is not assigned to processor $P_j$, and the value 1 indicates that it is.
The energy consumption matrix in the real-time task scheduling problem on heterogeneous processors is denoted by $U_{n \times m}$; its element $u_{i,j}$ is computed as $u_{i,j} = e_{i,j} / p_i$, which shows the energy consumption it takes to execute task $T_i$ on processor $P_j$. $u_{i,j}$ is a real number in $(0, 1) \cup \{+\infty\}$; if task $T_i$ cannot run on processor $P_j$, then $u_{i,j}$ is set to $+\infty$.
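The construction of the matrix can be illustrated with a minimal sketch. It assumes that the cycle counts, periods, and per-processor speeds are given as plain Python lists and that a speed of 0 marks a processor on which the task cannot run; the function name `utilization_matrix` and this encoding are illustrative assumptions, not part of the original formulation.

```python
import math

def utilization_matrix(cycles, periods, speeds):
    """Build u[i][j] = e[i][j] / p[i], with e[i][j] = c[i] / s[i][j].

    Assumption: speeds[i][j] <= 0 means task i cannot run on processor j,
    in which case the entry is set to +infinity.
    """
    u = []
    for c_i, p_i, speed_row in zip(cycles, periods, speeds):
        row = []
        for s_ij in speed_row:
            if s_ij <= 0:
                row.append(math.inf)
            else:
                e_ij = c_i / s_ij          # execution time of task i on processor j
                row.append(e_ij / p_i)     # share of the period that the task occupies
        u.append(row)
    return u

# Two tasks, two processors (hypothetical numbers).
U = utilization_matrix(
    cycles=[10.0, 20.0],
    periods=[100.0, 50.0],
    speeds=[[1.0, 2.0], [0.0, 4.0]],
)
```

In this example the second task cannot run on the first processor, so that entry is infinite and any assignment using it is infeasible.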
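The energy consumption of $T_i$ on processor $P_j$ in each cycle is given by (1), in which the two coefficients are constants. Thus the energy consumed by $T_i$ on $P_j$ grows in proportion to the cycle count of the task, scaled by a factor that depends on the processor speed, and the energy consumption is linear. The total energy consumption on the $m$ processors is given by (2), and the theoretical maximum energy consumption value is defined by (3).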

Calculation of the Fitness Function.
The fitness function is defined as the ratio of the actual energy consumption to the theoretical maximum energy consumption. By (1), the two constant coefficients are set to 1, and the theoretical maximum energy consumption MaxEC is calculated accordingly. The problem has two objectives. The first is to assign each task to a specific processor such that the utilization of each processor does not exceed its maximum utilization. The second concerns energy consumption: to find a feasible solution that minimizes the energy consumption on the corresponding processors.
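A minimal sketch of such a ratio-based fitness is given below. It assumes an assignment is stored as a list in which entry `i` is the processor index of task `i`, and it assumes that the theoretical maximum is the sum, over tasks, of the largest finite per-processor energy; both the encoding and this choice of maximum are illustrative assumptions rather than the paper's exact definitions.

```python
import math

def max_energy(energy):
    """Assumed theoretical maximum: each task placed on its most expensive
    processor among those it can run on (finite entries only)."""
    return sum(max(e for e in row if math.isfinite(e)) for row in energy)

def fitness(assignment, energy, max_ec):
    """Ratio of actual energy to the theoretical maximum (lower is better).

    Returns +infinity when a task sits on a processor it cannot use.
    """
    actual = sum(energy[i][j] for i, j in enumerate(assignment))
    return actual / max_ec

# Hypothetical 2-task, 2-processor energy matrix.
energy = [[1.0, 3.0], [2.0, 4.0]]
max_ec = max_energy(energy)
best = fitness([0, 0], energy, max_ec)
worst = fitness([1, 1], energy, max_ec)
```

Because the denominator is the same for every candidate, comparing fitness values is equivalent to comparing total energy; the ratio only normalizes the value to at most 1 for feasible assignments.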

PSO Algorithm for Energy-Aware Real-Time Task Scheduling Problem
4.1. Introduction to the PSO Algorithm. The PSO algorithm [21] was introduced by Kennedy and Eberhart and further developed by Eberhart and Shi. It is a kind of evolutionary computation technique. The PSO algorithm is inspired by the social behavior of a flock of migratory birds trying to reach an unknown destination. In the PSO algorithm, each solution is a bird in the flock and is called a particle. Every particle has a fitness value, which is determined by the function to be optimized, and a speed, which determines its flight direction and distance; the particles then search the solution space for the optimal solution by following the current optimal particle. The PSO algorithm and the GA are both iterative methods, and a particle is similar to a chromosome in the GA. But unlike the GA, the evolutionary process in the PSO algorithm does not generate new members from parent members; each particle only changes its own social behavior as it moves towards the destination.
In fact, the PSO algorithm imitates the communication of birds when they are flying together. Each bird moves in a certain direction, and through communication the birds determine which of them is in the best position. Each bird then moves from its current position, at a particular speed, towards the best bird. From its new location, each bird examines the search space again, and the process repeats until the flock reaches the desired destination. It is important to note that the process involves both interaction and intelligence within the community: the birds learn from their own experience (local search) and from the experience of the surrounding particles (global search).
The PSO algorithm is initialized with a group of random particles. The $i$th particle is represented by its position, a point in an $n$-dimensional space, where $n$ is the number of variables. Throughout the PSO algorithm, each particle $i$ maintains three variables: the current position of the particle $CP_i$, the best position the particle has reached in previous iterations $BP_i$, and the flight speed of the particle $V_i$. These three variables are represented in component form as follows:

$$CP_i = (cp_{i1}, cp_{i2}, \ldots, cp_{ij}, \ldots, cp_{in}),$$
$$BP_i = (bp_{i1}, bp_{i2}, \ldots, bp_{ij}, \ldots, bp_{in}),$$
$$V_i = (v_{i1}, v_{i2}, \ldots, v_{ij}, \ldots, v_{in}).$$

In each time period, the position $CP_g$ of the best particle $g$ is determined as the one with the best fitness among all particles. Each particle then updates its own speed $V_i$ to catch up with the best particle $g$ as follows:

$$V_i' = w V_i + c_1\,\mathrm{rand}()\,(BP_i - CP_i) + c_2\,\mathrm{Rand}()\,(CP_g - CP_i). \tag{6}$$

Using the new speed, we update the position of the particle as follows:

$$CP_i' = CP_i + V_i', \tag{7}$$

in which $-V_{\max} \le V_i \le V_{\max}$. We call $c_1$ and $c_2$ the learning factors, which are two constants; rand() and Rand() are two random functions with values in $[0, 1]$; $V_{\max}$ is the maximum velocity limit of the particle; $w$ is an inertia weight that controls the influence of the current speed. In formula (6), the second component represents the particle's own thinking, comparing its current position with its own best position. The third component of formula (6) represents the cooperation between the particles, comparing the current position of a particle with the position of the best particle.
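The velocity and position updates described above can be sketched for a single particle as follows. This is a generic continuous PSO step under the stated update rule, with positions and velocities held in plain lists; the function name `pso_step` and the argument names are illustrative assumptions.

```python
import random

def pso_step(cp, bp, gbest, v, w, c1, c2, v_max, rng=random):
    """One PSO update for a single particle.

    cp    : current position (list of floats)
    bp    : best position this particle has reached so far
    gbest : position of the best particle in the swarm
    v     : current velocity
    Returns (new_position, new_velocity).
    """
    new_cp, new_v = [], []
    for x, b, g, vel in zip(cp, bp, gbest, v):
        nv = (w * vel
              + c1 * rng.random() * (b - x)    # cognitive term: own best position
              + c2 * rng.random() * (g - x))   # social term: best particle
        nv = max(-v_max, min(v_max, nv))       # keep the speed within [-v_max, v_max]
        new_v.append(nv)
        new_cp.append(x + nv)
    return new_cp, new_v

# A particle already at both best positions only keeps its (clamped) inertia.
pos, vel = pso_step([1.0], [1.0], [1.0], [100.0], w=1.0, c1=2.0, c2=2.0, v_max=2.0)
```

Clamping is applied to the new velocity before the position is moved, so a single step can never displace a coordinate by more than `v_max`.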

Applying the PSO Algorithm to eRTSP
4.2.1. Building Energy Matrix and Time-Consuming Matrix.
The eRTSP problem can be represented as a bipartite graph. There are two types of nodes: PTS and HMP. A task is mapped to a node of PTS, and a processor is mapped to a node of HMP. There is an edge between two nodes if and only if the task can be assigned to the corresponding processor without exceeding its maximum computing power limit. The cost of this assignment relates directly to the energy consumption of the task on the processor.
Therefore, in general, we construct an $n \times m$ energy matrix, where $n$ is the number of tasks, $m$ is the number of processors, and $u_{i,j}$ is the energy utilization of task $i$ on the $j$th processor. Each value of the matrix lies in $(0, 1) \cup \{+\infty\}$; if a task cannot be assigned to a particular processor, we set the corresponding element of the matrix to $+\infty$. Now we define the constraints: in each row exactly one element can be selected, and the accumulated value of the selected elements in each column cannot exceed 1.
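The two constraints can be checked with a short sketch. It assumes the assignment is encoded as a list holding one processor index per task, which enforces the row constraint by construction, so only infinite entries and the column sums need to be tested; the name `is_feasible` and the tolerance are illustrative assumptions.

```python
import math

def is_feasible(assignment, u, tol=1e-9):
    """Check an assignment against the matrix constraints.

    assignment[i] is the processor chosen for task i (one element per row).
    The accumulated value of each column must not exceed 1, and no task may
    use an entry marked +infinity.
    """
    m = len(u[0])
    load = [0.0] * m
    for i, j in enumerate(assignment):
        if not math.isfinite(u[i][j]):
            return False                 # task i cannot run on processor j
        load[j] += u[i][j]
    return all(l <= 1.0 + tol for l in load)

# Hypothetical 3-task, 2-processor matrix.
u = [[0.6, 0.3], [0.5, math.inf], [0.4, 0.7]]
ok = is_feasible([1, 0, 0], u)        # loads: 0.9 on processor 0, 0.3 on processor 1
overloaded = is_feasible([0, 0, 1], u)  # load 1.1 on processor 0
blocked = is_feasible([0, 1, 1], u)   # task 1 cannot run on processor 1
```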
As with the energy consumption matrix, we can build a matrix recording the running time of each task on the corresponding processor. Each element in the matrix

The Update of the Velocity, Position, and Inertia Weight of the Particle.
The velocity of the particles is the critical factor for the positions of the particles. It affects the overall convergence of the PSO algorithm and the efficiency of its global search. We take (6) as the velocity update. The position update of a particle gives the next position of the task. For the position update, we use formula (7) given above, $CP_i' = CP_i + V_i'$. When $V_i' > 0$, the processor number needs to be adjusted, and then $CP_i' = V_i'$; otherwise, the position of the particle remains unchanged, that is, $CP_i' = CP_i$. The parameter $w$ in the PSO algorithm balances global searching against local searching: as the number of iterations increases, $w$ decreases linearly. In the formula for updating $w$, $t$ is the current iteration number and $t_{\max}$ is the total number of iterations.
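A linearly decreasing inertia weight and the discrete position rule can be sketched as follows. The start and end values `w_start` and `w_end`, the rounding of the velocity to a processor index, and the upper bound `m - 1` are assumptions made for illustration; the text only states that the weight decreases linearly with the iteration count and that a positive velocity replaces the position.

```python
def inertia_weight(t, t_max, w_start=0.9, w_end=0.4):
    """Linearly decrease the inertia weight from w_start (t = 0) to w_end (t = t_max).

    w_start and w_end are illustrative defaults, not values from the paper.
    """
    return w_start - (w_start - w_end) * t / t_max

def update_position(cp, v_new, m):
    """Discrete position rule: a positive velocity becomes the new processor
    index (rounded and capped at m - 1); otherwise the position is unchanged."""
    if v_new > 0:
        return min(m - 1, round(v_new))
    return cp

w_first = inertia_weight(0, 1000)
w_last = inertia_weight(1000, 1000)
moved = update_position(cp=1, v_new=2.6, m=4)
kept = update_position(cp=1, v_new=-0.5, m=4)
```

A large weight early on favors exploration of the whole search space, while a small weight late in the run favors refinement around the best positions found so far.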

Optimization of the Energy Consumption.
When we find a feasible solution with the PSO algorithm, we often need to optimize it further to achieve the second objective, the energy consumption target. That is, we improve a feasible solution with high energy consumption by reassigning a task to another processor or by exchanging the processors on which two tasks run, so as to reduce the overall energy consumption.
In the initial state, if the utilization of a processor is greater than 1, we extract the task with the maximal energy consumption on that processor, run this task on the processor with the lowest utilization, and check whether the utilization of that processor is now greater than 1. If it is not greater than 1, then the corresponding coordinate of this task is updated.
Thereafter, the speed of the particles is updated from the velocity update formula (6), using the calculated local and global optimum positions of each task. Subsequently, we check the speed of the particles, provided that the utilization stays below the maximum utilization of each processor: if the speed $V$ is greater than $V_{\max}$, $V$ is set to $V_{\max}$; if the speed is less than 0, then the speed is set to 0.
In the optimization, we first back up the current solution and then analyze the following three cases.
(1) Particle.v > 0 and Particle.v ≤ $V_{\max}$. Let Particle.x equal Particle.v and calculate the utilization of the corresponding processor. If the utilization is guaranteed to be less than 1, we recalculate the fitness value. If the energy consumption ratio has decreased, we modify the original solution and update the value of Particle.x. If the energy consumption ratio has not decreased, we do not change the original solution.
(2) Particle.v > $V_{up}$. We set Particle.x to $V_{up}$ and recalculate the utilization of the processor corresponding to $V_{up}$. If the utilization is guaranteed to be less than 1, we recalculate the fitness value and observe whether the energy consumption ratio has decreased. If it has, we alter the original assignment; if not, we leave the original assignment unchanged.
(3) Particle.v < 0. The general idea is the same as in the second case: we set Particle.x to 0 and recalculate the utilization of the corresponding processor. If the utilization is less than 1, we recalculate the fitness value and observe whether the energy consumption ratio has decreased. If it has, we change the original scheduling plan; if not, we leave the original assignment unchanged.
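The three cases share one pattern: clamp the velocity to a valid processor index, test the utilization of that processor, and keep the move only if the energy decreases. A minimal sketch of this pattern for a single task is shown below; truncating the velocity to an integer index, treating a utilization of exactly 1 as acceptable, and comparing the per-task energy directly (equivalent to comparing the fitness ratio when its denominator is fixed) are illustrative assumptions.

```python
def try_move(assignment, task, v, v_up, u, energy):
    """Try to move one task to the processor indicated by its velocity.

    Case (1): 0 < v <= v_up  -> candidate processor is v
    Case (2): v > v_up       -> candidate processor is v_up
    Case (3): v < 0          -> candidate processor is 0
    The move is accepted only if the candidate processor stays within its
    utilization bound and the task's energy decreases.
    """
    candidate = list(assignment)                 # back up: work on a copy
    new_proc = int(max(0, min(v, v_up)))
    old_proc = assignment[task]
    load = u[task][new_proc] + sum(
        u[i][new_proc] for i, j in enumerate(assignment)
        if j == new_proc and i != task
    )
    if load <= 1.0 and energy[task][new_proc] < energy[task][old_proc]:
        candidate[task] = new_proc
    return candidate

# One task, three processors with decreasing energy (hypothetical numbers).
u = [[0.5, 0.5, 0.5]]
energy = [[3.0, 2.0, 1.0]]
case1 = try_move([0], 0, 1.0, 2, u, energy)
case2 = try_move([0], 0, 5.0, 2, u, energy)
case3 = try_move([0], 0, -1.0, 2, u, energy)
```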
If the fitness value no longer decreases or remains the same, we quit the iteration; otherwise, the PSO algorithm returns after 1000 iterations.

Experiment and Result Analysis
In this section, we first determine the parameters of the PSO algorithm for solving eRTSP. After that, we solve the eRTSP problem with the PSO algorithm and compare the performance of the PSO algorithm, GA, and SFLA on eRTSP in terms of solution quality and energy consumption. The results are obtained from a large number of randomly generated problem sets. The problem sets cover many different situations, and each problem instance is initialized with $m$ processors and $n$ tasks.

The Parameter of the PSO Algorithm.
In the PSO algorithm, three parameters, $w$, $c_1$, and $c_2$, affect the performance of the algorithm (see Table 1). $w$ denotes the inertia weight; $c_1$ and $c_2$ denote the acceleration (learning) factors. The following experiments are set up to determine the best combination of the three parameters. The results are shown in Table 4.
As the results in Table 4 show, the PSO algorithm gives different results on the same eRTSP problem for different parameter settings. The group with $w = 1$, $c_1 = 2$, and $c_2 = 2$ finds the largest number of feasible solutions when solving eRTSP, and its running time is the shortest. Therefore, in the subsequent experiments, we use this group of parameters to solve the eRTSP and compare the performance with the GA and SFLA.

The Comparison of the Results among the PSO Algorithm, GA, and SFLA in eRTSP.
To cover a wider range of heterogeneous environments, the matrix values are varied. For a periodic task $T_i$, the task frequency is the average execution speed needed to finish the task before its deadline and is defined as $c_i / p_i$. In the PTS, the variance of the task frequencies is defined as the task heterogeneity. In the HMP, for
(4) Configure the utilization as an $n \times m$ matrix, whose element is TB()/Si(). Accordingly, the size of the elements is affected by the degree of task and processor heterogeneity. The element size range is [0, 1/  ×  ].
In order to obtain a true and objective evaluation of the performance of each algorithm, we characterize the utilization matrix by task heterogeneity, processor heterogeneity, and consistency. We therefore generated eight kinds of experiments by combining these features: high or low task heterogeneity, high or low processor heterogeneity, and consistent or nonconsistent matrices. High task heterogeneity is represented as 100; low task heterogeneity is represented as 5. High processor heterogeneity is represented as 20; low processor heterogeneity is represented as 5. When a processor $P_j$ executes every task in a shorter time than a processor $P_k$, we call the matrix consistent. A consistent utility matrix is generated by sorting each vector, so that $P_0$ has the fastest processing speed of all the processors and processor $P_{(m-1)}$ is the slowest. In contrast, in a nonconsistent utility matrix, processor $P_j$ is faster than processor $P_k$ on certain tasks but slower on others; it is an unsorted, randomly generated matrix. The eight experimental category combinations are shown in Table 5.
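The difference between consistent and nonconsistent matrices can be illustrated with a small generator. It draws entries uniformly from an assumed range and sorts each task's row for the consistent case, so that processor 0 always has the smallest utilization; it does not model the task and processor heterogeneity levels, and the function name and value range are illustrative assumptions.

```python
import random

def random_utilization_matrix(n, m, consistent, rng, low=0.01, high=1.0):
    """Generate an n x m utilization matrix.

    consistent=True  : each row is sorted ascending, so processor 0 is the
                       fastest for every task and processor m - 1 the slowest.
    consistent=False : rows are left unsorted.
    """
    rows = [[rng.uniform(low, high) for _ in range(m)] for _ in range(n)]
    if consistent:
        rows = [sorted(row) for row in rows]
    return rows

rng = random.Random(0)   # fixed seed so that a test set can be regenerated
consistent_u = random_utilization_matrix(5, 4, consistent=True, rng=rng)
nonconsistent_u = random_utilization_matrix(5, 4, consistent=False, rng=rng)
```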
As Figure 1 shows, the energy consumption of the GA and SFLA is higher than that of the PSO algorithm; across the different test datasets, the energy consumption of the PSO algorithm is about 40-50% of that of the GA and SFLA.
Figure 2 and Table 3 show that the running times of the three algorithms differ in the same eRTSP environment; the running time of the GA is the longest, followed by the SFLA. In general, the PSO algorithm is the fastest in the 8 test groups. In particular, comparing the PSO algorithm and SFLA on the first problem set, we find that the running time of the PSO algorithm is 10% of the running time of the SFLA. Under consistent utility matrix conditions, the running times of SFLA and the PSO algorithm are essentially the same.
In Figure 3, we compare the number of feasible solutions found by the three algorithms. The SFLA and the PSO algorithm find fewer feasible solutions than the GA (see Table 6), so the PSO algorithm is slightly worse than the GA at finding feasible solutions. The GA fails to find all feasible solutions only in the fourth set of experiments and finds all of them under the other conditions. The PSO algorithm, however, has a stronger ability to find feasible solutions than the SFLA: the number of feasible solutions found by the PSO algorithm is 50% more than that of the SFLA. From the above results, the energy consumption and running time of the PSO algorithm on the eRTSP are relatively small compared with SFLA and GA, so it has certain advantages. The GA relies mainly on crossover and mutation; its running time for finding a feasible solution is much longer, and its energy consumption is larger. The SFLA is slower and finds fewer feasible solutions than the PSO algorithm, which can find a feasible solution in most cases. When the running times of SFLA and the PSO algorithm are similar, the PSO algorithm consumes less energy than SFLA. Therefore, considering the above tests, it can be said that the PSO algorithm outperforms GA and SFLA.

Conclusions
This paper gives a formal description of the real-time task scheduling problem in a heterogeneous environment based on energy consumption and puts forward a new heuristic algorithm based on the particle swarm optimization algorithm to solve the problem. The proposed algorithm not only finds many more feasible solutions within the specified time but also optimizes the energy consumption. According to the results of extensive experiments, the PSO algorithm performs better in reducing energy consumption and running time and in increasing the number of feasible solutions. The energy consumption is only 40-50% of that of GA and SFLA. In addition, the PSO algorithm finds a total of 19% more feasible solutions than SFLA in 7 out of 8 sets of test data and about 8% fewer feasible solutions than GA. In running time, the PSO algorithm is faster than GA and SFLA and about 10% faster than SFLA.
The current study focuses on independent, nonpreemptive, and periodic tasks, but many other factors of the real-time task scheduling problem in a heterogeneous processor environment are not taken into account. In the future, we will reduce the constraint conditions and study the problem of real-time task scheduling with priority and communication between tasks.

Table 1: The parameters of the PSO algorithm.

Table 2: The energy utility matrix of $m$ processors with $n$ tasks.

Table 5: Eight utilization matrix scales of PSO's parameter test.

Table 6: Average runtime, number of feasible solutions, and energy consumption with GA, SFLA, and PSO algorithm.