The energy required to operate high-performance computing systems has been increasing rapidly for years, and energy consumption has attracted a great deal of attention. Moreover, high energy consumption inevitably induces failures and reduces system reliability. However, there has been considerably less work on the simultaneous management of system performance, reliability, and energy consumption on heterogeneous systems. In this paper, we first build the precedence-constrained parallel application and energy consumption models. Then, we derive the relation between reliability and processor frequency and obtain approximate values of its parameters by the least squares curve fitting method. Thirdly, we establish a task execution reliability model and formulate this reliability and energy aware scheduling problem as a linear program. Lastly, we propose a heuristic Reliability-Energy Aware Scheduling (REAS) algorithm to solve this problem, which achieves a good tradeoff among system performance, reliability, and energy consumption with low complexity. Our extensive simulation study clearly demonstrates the tradeoff performance of the proposed heuristic algorithm.
1. Introduction
For a long time, energy consumption was simply ignored in the performance evaluation of large-scale parallel computing systems. However, the DCD Intelligence (DCDi) Industry Census reported that the amount of electricity consumed by global data centers ran up to 40 GW in 2013, a 7% increase over the previous year [1]. According to the latest Top 500 supercomputer ranking, the power consumption of the first-ranked supercomputer, “Tianhe-2”, is 17.808 MW, and the average power consumption of the Top 10 systems in the list is 6.2939 MW [2]. Thus, it is obvious that high energy cost is a key concern in designing and operating heterogeneous systems.
On the other hand, heterogeneous computing systems are groups of heterogeneous processors connected via a high-speed network that supports the execution of parallel applications. For example, the top supercomputer “Tianhe-2” in the Top 500 list consists of Intel Xeon® E5-2692 12C 2.200 GHz and Intel Xeon Phi 31S1P (MIC) processors [2]. For each processor, the number of transistors integrated into today’s Intel Xeon EX processor reaches nearly 2.3 billion, and its power consumption exceeds 130 W [3]. This implies the possibility of worsening single-processor reliability, eventually degrading the reliability of the whole heterogeneous system. Furthermore, modern large-scale computing systems usually have a huge number of processors, such as “Tianhe-2” with 3,120,000 cores and “Titan” with 560,640 cores [2]. One of the main problems in this situation is system reliability, which drastically decreases as the number of processor cores increases [4]. Even when a single processor’s one-hour reliability is very high, such as 0.999999, as the system size approaches 10,000 cores, the system’s MTTF (Mean Time to Failure) drops to less than 10 hours [4]. This motivates us to focus on the main problem of this paper: the simultaneous management of system performance, reliability, and energy consumption.
In recognition of this, we first build a reliability and energy aware task scheduling architecture, including precedence-constrained parallel application and energy consumption models, on heterogeneous systems. Then, we propose a single-processor failure rate model based on the DVFS technique and derive the application reliability of the systems. Finally, to provide a good solution to this problem, we propose a heuristic Reliability-Energy Aware Scheduling (REAS) algorithm, which adopts a novel scheduling objective, RE. The overall objective of this paper is to achieve a good tradeoff among performance, reliability, and energy consumption.
The rest of the paper is organized as follows: the related work is summarized in Section 2. We describe the task scheduling system model in Section 3. In Section 4, we provide a system reliability model. To solve this problem, a heuristic reliability and energy aware task scheduling algorithm is proposed in Section 5. In Section 6, we verify the performance of the proposed algorithm by comparing the results obtained from performance evaluation. Finally, we summarize the contributions and make some remarks on further research in Section 7.
2. Related Work
The high-performance parallel application running on computing systems is usually composed of intercommunicated tasks, which are scheduled to run over different processors in the systems. In most cases, the main objective of scheduling strategies is to map the multiple interacting program tasks onto processors and order their executions so that task precedence requirements are satisfied and, in the meanwhile, the minimum schedule length (makespan) can be achieved. The problem of finding the optimal schedule is NP-complete in general [5–9]. There are many scheduling algorithms that have been proposed to deal with this problem, for example, dynamic-level scheduling (DLS) algorithm [6] and heterogeneous earliest-finish-time (HEFT) algorithm [5, 8, 10, 11].
As energy consumption has become an important issue in designing large-scale computing systems in the last few years, many techniques, including dynamic voltage-frequency scaling (DVFS), dynamic powering on/off, slack reclamation, resource hibernation, and memory optimizations, have been investigated and developed to reduce energy consumption [12–14]. DVFS, a technique in which a processor runs at a less-than-maximum frequency when it is not fully utilized in order to conserve power, is perhaps the most appealing method for reducing energy consumption [14, 15]. Most of the early DVFS-enabled research focused on single processors in embedded and real-time computing systems [14, 16, 17]. Recently, there has been a significant amount of work on task scheduling for heterogeneous systems using DVFS-enabled techniques. For instance, Rountree et al. focused on energy optimization of MPI programs in HPC environments and proposed a linear programming (LP) approach that incorporates allowable time delays, communication slack, and memory pressure into its scheduling using DVFS (i.e., slack reclamation) [18]. Rizvandi et al. proposed a method to find the best processor frequencies to obtain optimal energy consumption [19]. Lee and Zomaya addressed the problem of scheduling precedence-constrained parallel applications on multiprocessor computer systems, where scheduling decisions are made using the relative superiority metric (RS), devised as a novel objective function [20]. In [21], Zong et al. proposed two energy-efficient scheduling algorithms (EAD and PEBD) for parallel tasks on homogeneous clusters based on a duplication strategy.
All of this work demonstrated that dynamically adjusting the processor’s voltage and frequency can effectively reduce system energy consumption. However, recent research has illustrated that scaling the processor’s voltage and frequency makes nanoscale semiconductor circuits more susceptible to cosmic ray radiation, electromagnetic interference, and alpha particles, which increases the unreliability of the processor [22–24]. Thus, it is worthwhile to incorporate reliability into DVFS-based energy aware scheduling. Recently, Zhu et al. focused on reducing energy consumption while preserving system reliability for periodic real-time tasks [25, 26]. They proposed a reliability model in which the processor’s reliability decreases as its voltage and frequency are scaled from maximum to minimum, and incorporated the reliability requirements into heuristic energy aware task scheduling strategies. However, their techniques are not suitable for precedence-constrained parallel applications on heterogeneous systems with DVFS-enabled processors.
Much research has dealt with reliability on heterogeneous systems. For example, Dogan and Özgüner introduced three reliability cost functions that were incorporated into the dynamic level (DL) and proposed a reliable dynamic level scheduling algorithm (RDLS) [27]; the goal was to minimize not only the execution time but also the failure probability of the application. In our previous work [8], we proposed a scheduling algorithm that considers the task’s execution reliability. Qin and Jiang investigated a dynamic and reliability-cost-driven (DRCD) scheduling algorithm for precedence-constrained tasks in heterogeneous clusters [28]. Unfortunately, those works considered neither energy consumption nor the reliability impact of scaling the processor’s voltage and frequency. In recognition of this, we focus on reliability and energy consumption on DVFS-enabled heterogeneous systems.
3. System Models
3.1. Scheduling Architecture
Various task scheduling architectures have been proposed in the literature [5, 8, 9, 14, 28, 29]. However, energy consumption and system reliability are not effectively incorporated into their scheduling. In this paper, we propose a reliability and energy aware task scheduling architecture, as depicted in Figure 1(a). It is assumed that all parallel applications, along with information provided by the user, are submitted to the system by a special user command. First, each parallel application is modeled as a task DAG by the Task DAG Model. Then, the estimated energy consumption of the tasks, which are executed on the DVFS-enabled heterogeneous processors, is computed by the Energy Consumption Estimator. At the same time, the reliability analysis computes the processors’ reliability at different frequencies to obtain the whole system reliability. Finally, the Scheduler schedules the tasks based on the above task energy consumption and system reliability.
(a) The reliability and energy aware task scheduling architecture. (b) A parallel application task graph.
3.2. Heterogeneous Systems
The target system used in this work consists of a set P={p1,p2,…,pm} of heterogeneous processors/machines [5, 8, 9, 14, 29], which are connected by high-speed interconnects such as InfiniBand and Myrinet. Each DVFS-enabled processor pk∈P can adjust its operating voltage and frequency [14]. Tasks can therefore be executed at a discrete set of frequency-voltage pairs (fk,l, Vk,l), in which fk,1<fk,2<⋯<fk,Mk and Vk,1<Vk,2<⋯<Vk,Mk, where Mk is the number of operation levels of processor pk [14, 30]. For example, the quad-core AMD Phenom II supports 4 different frequencies (0.8 GHz, 2.1 GHz, 2.5 GHz, and 3.2 GHz) and voltages ranging from 0.85 V to 1.425 V [30]. Since clock frequency transition overheads take a negligible amount of time (e.g., 10 µs–150 µs), they are not considered in our study.
The heterogeneous processors’ failures are assumed to follow a Poisson process, and each processor has a constant failure rate λ [8, 9, 29]. For example, λk denotes the failure rate of processor pk when it works at normal voltage and frequency [8, 9, 27, 29]. These failure rates can be derived from system profiling, system logs, and statistical prediction techniques [31]. For demonstration purposes, we illustrate two heterogeneous processors, one with 3 frequency levels and the other with 2, whose parameters are listed in Table 1.
The parameters of heterogeneous processors.

Processor  λk              Ps[k]  α[k]             (fk,1, Vk,1)          (fk,2, Vk,2)          (fk,3, Vk,3)
p1         1.4 × 10^{−4}   73.6   3.663 × 10^{−8}  (0.8 × 10^{9}, 0.93)  (2.1 × 10^{9}, 1.23)  (3.2 × 10^{9}, 1.43)
p2         1.62 × 10^{−4}  57.1   4.95 × 10^{−8}   (2.3 × 10^{9}, 0.85)  (3.0 × 10^{9}, 1.36)  —
3.3. Applications Model
The precedence-constrained tasks of a parallel application are usually denoted as a Directed Acyclic Graph (DAG) G=(V, R, [di,j], [wi,k,l]) [5, 8–10, 29], where V={v1,v2,…,vn} is the set of n tasks that can be scheduled to any available DVFS-enabled processor [5, 8–10, 29]; R represents the precedence relation that defines a partial order on the task set V, such that viRvj implies that task vi must finish before vj can start execution [5, 8–10, 29]. [di,j] is an n×n communication matrix that denotes the communication time between tasks vi and vj for 1≤i, j≤n. [wi,k,l] is an n×m×Mmax computation matrix in which each wi,k,l gives the estimated time to execute task vi on processor pk at frequency fk,l. Here, Mmax is the maximal operation level of the systems. The communication costs and computation costs can be estimated by building a historical table and using code profiling or statistical prediction techniques [31]. Figure 1(b) shows a parallel application DAG, Table 2 lists the tasks’ execution times on the two heterogeneous DVFS-enabled processors of Table 1, and the communication times among these tasks are listed in Table 3.
Task estimated execution matrix [wi,k,l].

Task  p1,1   p1,2   p1,3   p2,1   p2,2
v1    11.12  2.28   2.87   3.89   3
v2    36.29  13.82  9.12   12.7   9.78
v3    15.46  5.91   3.92   5.45   4.21
v4    5.33   2.01   1.4    1.94   1.49
v5    66.77  25.44  16.77  23.28  17.83
v6    13.82  5.3    3.53   4.84   3.75
v7    7.43   2.86   1.89   2.68   2.04
v8    8.48   3.19   2.09   2.91   2.31
Estimated communication matrix [di,j].

Task  v2    v3     v4    v5     v6     v7    v8
v1    6.99  15.48  6.69  —      —      —     —
v2    —     —      —     10.86  —      —     —
v3    —     —      —     1.25   12.56  —     —
v4    —     —      —     —      6.93   0.3   —
v5    —     —      —     —      —      —     0.11
v6    —     —      —     —      —      —     6.535
v7    —     —      —     —      —      —     6.2
Generally, the common objective of task scheduling is to map tasks with precedence constraints onto processors and obtain a minimum schedule length (also called makespan) [10, 11]. Before presenting the schedule length, it is necessary to define the scheduling attributes EST and EFT of a task vi. EST(vi, fk,l) denotes the earliest execution start time of task vi∈V on DVFS-enabled processor pk∈P at frequency fk,l, which is constrained by the tasks’ precedence relation and the available time of processor pk [5, 8–10, 29]. EFT(vi, fk,l) is the earliest execution finish time of task vi on processor pk at frequency fk,l, which is described as

(1) EFT(vi, fk,l) = EST(vi, fk,l) + wi,k,l.
In this paper, let Xi,kl = 1 denote that task vi is scheduled on processor pk at frequency fk,l; otherwise Xi,kl = 0. Thus, the schedule length is defined as follows:

(2) makespan = Max_{1≤i≤n, 1≤k≤m, 1≤l≤Mk} Xi,kl × EFT(vi, fk,l).
3.4. Energy Model
The major energy consumption of computing systems depends on memory, disks, CPUs, and other components. This paper only considers DVFS-enabled CPUs, which consume the largest proportion of energy in such systems [14, 19, 20, 32]. The power consumption of a DVFS-enabled microprocessor based on complementary metal-oxide semiconductor (CMOS) logic circuits mainly consists of static power and dynamic power dissipation, which can be modeled as [25, 26]

(3) P = Ps + ∅ · Pd,

where Ps is the static power, a constant comprising the power used to maintain the basic circuits and keep the clock running, together with the frequency-independent active power. ∅ denotes the processor’s mode: if the processor is in execution mode, ∅ = 1; otherwise, ∅ = 0. Pd is the most significant factor of processor power consumption and can be estimated as [14, 16, 19, 20, 32]

(4) Pd = α · V² · f,

where α represents the switched capacitance, V is the supply voltage, and f is the processor’s working frequency. Example processor parameters of this kind are listed in Table 1.
Let EN(vi, fk,l) be the energy consumption caused by task vi running on DVFS-enabled processor pk at frequency fk,l, which is determined by the task execution time and the processor power consumption:

(5) EN(vi, fk,l) = wi,k,l × Pk = wi,k,l × Ps[k] + wi,k,l × Pd(fk,l),

where Pd(fk,l) denotes the dynamic power dissipation of processor pk at frequency fk,l (see (4)). Thus, for an application G, the energy consumption EN(V) is the sum of the energy consumption of all tasks:

(6) EN(V) = Σ_{1≤i≤n, 1≤k≤m, 1≤l≤Mk} Xi,kl × EN(vi, fk,l) = Σ_{1≤i≤n, 1≤k≤m, 1≤l≤Mk} Xi,kl × wi,k,l × (Ps[k] + Pd(fk,l)).
At the same time, all processors of heterogeneous systems are powered on, whether in sleep or execution mode. That is to say, all processors of the systems consume static power all the time. Thus, the computing systems’ energy consumption EN(P) is the sum of the static power of all processors over the whole makespan and the dynamic power dissipation of the application:

(7) EN(P) = makespan × Σ_{k=1,…,m} Ps[k] + Σ_{1≤i≤n, 1≤k≤m, 1≤l≤Mk} Xi,kl × wi,k,l × Pd(fk,l).
Obviously, systems energy consumption EN(P) is greater than application energy consumption EN(V). In this paper, one of our main objectives is to minimize systems energy consumption EN(P).
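The power and energy relations in (3)-(7) can be sketched as a few small functions; the example at the bottom plugs in the p1 parameters from Table 1 (Ps = 73.6 W, α = 3.663 × 10^{−8}) and v1's execution time on p1 at its highest level from Table 2. This is a minimal illustration, not the simulator used in Section 6.

```python
# Sketch of the power/energy model in Eqs. (3)-(7).

def dynamic_power(alpha, voltage, freq):
    """Eq. (4): Pd = alpha * V^2 * f."""
    return alpha * voltage ** 2 * freq

def task_energy(w, ps, alpha, voltage, freq):
    """Eq. (5): EN(vi, fk,l) = w * (Ps + Pd(fk,l))."""
    return w * (ps + dynamic_power(alpha, voltage, freq))

def system_energy(makespan, static_powers, assigned_tasks):
    """Eq. (7): static power of every processor over the whole makespan
    plus the dynamic energy of the assigned tasks.

    assigned_tasks: list of (w, alpha, voltage, freq) per scheduled task."""
    static = makespan * sum(static_powers)
    dynamic = sum(w * dynamic_power(a, v, f) for (w, a, v, f) in assigned_tasks)
    return static + dynamic

# Example: task v1 on p1 at level 3 (3.2 GHz, 1.43 V), w = 2.87 s (Table 2).
pd = dynamic_power(3.663e-8, 1.43, 3.2e9)   # about 239.7 W
en = task_energy(2.87, 73.6, 3.663e-8, 1.43, 3.2e9)
```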
4. System Reliability Analysis and Problem Statement
In this section, we first provide the single DVFS-enabled processor failure rate model. Then, we analyze heterogeneous systems reliability. At last, we formulate the reliability and energy aware task scheduling as a linear programming problem.
4.1. Single DVFS-Enabled Processor Failure Rate
Among the various sources of unreliability in semiconductor circuits, it is predicted that the failure rate due to cosmic-ray-radiation-induced soft errors dominates all other reliability issues [24]. A transient fault occurs when a high-energy particle such as an alpha particle or a neutron strikes a sensitive region in a semiconductor device and flips the logical state of the struck node [33]. Modern DVFS-enabled processors integrate multibillion transistors on a single chip, leading to an increasing number of sensitive devices in submicron technologies that are vulnerable to soft errors, which consequently raises the Soft Error Rate (SER) [34]. These phenomena become more and more serious with the continued scaling of the processor’s voltage and frequency [23, 25].
Traditionally, the modern DVFS-enabled processor’s reliability has been modeled as a Poisson distribution with a failure rate λ when it works at normal voltage and frequency [8, 9, 27, 29, 35]. Moreover, it has been shown that DVFS has a direct and negative effect on failure rates: blindly applying DVFS to scale the supply voltage and processing frequency for energy savings may cause significant degradation in the processor’s reliability [23, 25, 26]. Therefore, for the DVFS-enabled heterogeneous processors pk∈P considered in this paper, the failure rate at a reduced frequency fk,l (and the corresponding voltage Vk,l) can be modeled as

(8) λk(fk,l) = λk · Hk(fk,l),

where λk is the failure rate corresponding to the normal processing frequency fnm (and the corresponding normal voltage Vnm). Prior research that studied the effect of voltage scaling on processor reliability has revealed that failure rates generally increase as the processing frequency (and supply voltage) is scaled away from the normal point [24, 36]. On the other hand, fault rates are exponentially related to the circuit’s critical charge (which depends on the threshold voltage). Thus, we have the following equations:

(9) Hk(fk,l) = e^{ψk·Vtk} · 10^{ξk·(fk,l − fnm)/(fmax − fmin)}, if fnm ≤ fk,l ≤ fmax,
    Hk(fk,l) = e^{ψk·Vtk} · 10^{ξk·(fnm − fk,l)/(fmax − fmin)}, if fmin ≤ fk,l ≤ fnm,

where the exponent ψk is the parameter of the threshold voltage Vtk, ξk is a constant representing the sensitivity of fault rates to frequency scaling, and fmin and fmax denote the minimum and maximum frequency, respectively.
In order to obtain precise values of the parameters ψk and ξk, we use the least squares curve fitting method [37]. Taking the natural logarithm of both sides of (9) gives

(10) ln Hk(fk,l) = ψk·Vtk + ξk·ln10·(fk,l − fnm)/(fmax − fmin), if fnm ≤ fk,l ≤ fmax,
     ln Hk(fk,l) = ψk·Vtk + ξk·ln10·(fnm − fk,l)/(fmax − fmin), if fmin ≤ fk,l ≤ fnm.

Let y = ln(Hk(fk,l)), A = ψk·Vtk, B = ξk·ln10·(1/(fmax − fmin)), and C = ξk·ln10·(fnm/(fmax − fmin)). Then, (10) becomes

(11) y = A + B·fk,l − C, if fnm ≤ fk,l ≤ fmax,
     y = A − B·fk,l + C, if fmin ≤ fk,l ≤ fnm.

Thus, we can obtain approximate values of the parameters ψk and ξk by least squares linear fitting.
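The linearized fitting in (10)-(11) is ordinary least squares on a straight line. A minimal sketch follows, using made-up values for ψk, ξk, Vtk, and the frequency range (the paper does not list its fitted constants); it recovers the parameters from synthetic samples of the upper branch of (9):

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (the linearized Eq. (11))."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return my - slope * mx, slope

# Made-up ground truth (upper branch of Eq. (9)): psi = 2.0, xi = 3.0,
# Vt = 0.3 V, f_nm = 1.0, f_min = 0.8, f_max = 3.2 (GHz).
psi, xi, vt = 2.0, 3.0, 0.3
f_nm, f_min, f_max = 1.0, 0.8, 3.2
freqs = [1.0, 1.5, 2.0, 2.5, 3.0, 3.2]
ys = [psi * vt + xi * math.log(10) * (f - f_nm) / (f_max - f_min)
      for f in freqs]

# Fit y = intercept + slope * f, then unpack via A, B, C of Eq. (11):
# slope = B = xi*ln10/(f_max - f_min); intercept = A - C.
intercept, slope = fit_line(freqs, ys)
xi_hat = slope * (f_max - f_min) / math.log(10)
psi_hat = (intercept + xi_hat * math.log(10) * f_nm / (f_max - f_min)) / vt
```

With noise-free samples the fit is exact; on measured failure-rate data the same unpacking yields the least squares approximation.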
4.2. Application Reliability Analysis
Assume that the processing of task v takes place during the time interval [A, B] on heterogeneous DVFS-enabled processor pk at frequency fk,l, where A denotes the task start time and B denotes the task finish time [5, 8, 9, 29]. Thus, the task execution reliability is given by

(12) Pv = P{X(B) − X(A) = 0} = P{X(B − A + A) − X(A) = 0} = exp(−λk(fk,l) · (B − A)).
For a task vi of application G on processor pk at frequency fk,l, its reliability P[vi, fk,l] is the product of the reliabilities of all of its immediate parent tasks and its own execution reliability, which can be defined by

(13) P[vi, fk,l] = Π_{vj∈pred(vi)} P[vj] × exp(−λk(fk,l) × wi,k,l),

where pred(vi) denotes all direct predecessors of vi and P[vj] is the reliability of task vj under its assigned processor-frequency pair:

(14) P[vj] = Σ_{1≤k≤m, 1≤l≤Mk} Xj,kl × P[vj, fk,l].
For the entry task v1 of the application, which is executed on processor pk at frequency fk,l with pred(v1) = ∅, the reliability is

(15) P[v1, fk,l] = exp(−Σ_{1≤k≤m, 1≤l≤Mk} X1,kl × λk(fk,l) × w1,k,l).
Generally, application G has one exit task vexit. The reliability of the application, P[G], is equal to that of the exit task:

(16) P[G] = P[vexit] = Π_{vj∈pred(vexit)} P[vj] × P[vexit, fk,l].
Improving the application reliability P[G] is the other objective of this paper. From the above analysis, we know that allocating tasks with shorter execution times to more reliable processors might be a good heuristic to increase the reliability.
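Under a fixed assignment, the recursion (13)-(16) collapses to a product of per-task execution reliabilities exp(−λk(fk,l) × wi,k,l). A minimal sketch of that product, with made-up failure rates and execution times:

```python
import math

def task_reliability(failure_rate, exec_time):
    """Poisson execution reliability, Eq. (12): exp(-lambda * w)."""
    return math.exp(-failure_rate * exec_time)

def application_reliability(assignments):
    """Product of per-task execution reliabilities over one schedule.

    assignments: list of (failure rate at assigned frequency, exec time)."""
    rel = 1.0
    for lam, w in assignments:
        rel *= task_reliability(lam, w)
    return rel

# Made-up schedule: three tasks with failure rates around 1e-4 failures/s.
sched = [(1.4e-4, 11.12), (1.62e-4, 9.78), (1.4e-4, 3.92)]
rel = application_reliability(sched)   # close to, but below, 1.0
```

The sketch makes the heuristic at the end of this section concrete: shrinking any λ·w term (shorter task on a more reliable processor) raises the product.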
4.3. Problem Statement
As simultaneous management of scheduling performance, system reliability, and energy consumption is the main problem of this paper, we formulate it as follows:

(17) Minimize makespan
     Minimize EN(P)
     Maximize P[G]
     s.t. Xi,kl = 1 or Xi,kl = 0,
          Σ_{1≤k≤m, 1≤l≤Mk} Xi,kl = 1, ∀vi ∈ V,
          vi R vj, ∀vi, vj ∈ V.
5. The Reliability-Energy Aware Scheduling Algorithm
This section presents a Reliability-Energy Aware Scheduling algorithm for heterogeneous systems, called REAS, which aims at achieving lower energy consumption, high reliability, and a shorter schedule length. Its scheduling decisions are made using a hybrid metric combining energy consumption, reliability, and schedule length, devised as a novel objective function. The pseudocode of the algorithm is shown in Algorithm 1. The algorithm completes in three main phases, as described in the following sections.
Algorithm 1: The pseudocode for the REAS algorithm.
Input: The task DAG of parallel applications
Output: The scheduling of task-processor pairs
(1) Calculate each task's b_level of the DAG
(2) Sort tasks into a scheduling list by non-increasing order of b_level
(3) while the scheduling list is not empty do
(4)   Remove the first task vi from the scheduling list
(5)   Set minF(vi), minE(vi) as maximum value
(6)   for each processor-frequency fk,l in the systems do
(7)     Compute the earliest finish time EFT(vi, fk,l) using (22)
(8)     if minF(vi) > EFT(vi, fk,l) then
(9)       minF(vi) = EFT(vi, fk,l)
(10)    end
(11)    Compute task energy consumption EN(vi, fk,l) using (5)
(12)    if minE(vi) > EN(vi, fk,l) then
(13)      minE(vi) = EN(vi, fk,l)
(14)    end
(15)  end
(16)  Set minRE(vi) as maximum value
(17)  for each processor-frequency fk,l in the systems do
(18)    Compute the earliest finish time EFT(vi, fk,l) using (22)
(19)    Compute task energy consumption EN(vi, fk,l) using (5)
(20)    Compute metric RE(vi, fk,l) using (24)
(21)    if minRE(vi) > RE(vi, fk,l) then
(22)      minRE(vi) = RE(vi, fk,l)
(23)    end
(24)  end
(25)  Assign task vi to the corresponding processor-frequency
(26)  Update the processor execution finish time
(27) end
(28) // Slack reclamation
(29) for each task in the scheduled task-processor pairs do
(30)   Compute task slack time Slack(vi) using (25)
(31)   for each frequency of processor k do
(32)     Compute the optimal frequency fk,l using (26)
(33)   end
(34)   Reassign task vi and update corresponding data
(35) end
(36) Compute the schedule length, application reliability P[G], and systems energy consumption EN(P)
5.1. Task Priorities Phase
This step is essential for list scheduling algorithms. A task processing list is generated by sorting the tasks in decreasing order of some predefined rank function, such as t_level, b_level, Rank, CP, or DL [5, 6, 8–10, 29]. Here, we use the average computation capacity, which is defined as

(18) w̄(vi) = (Σ_{1≤k≤m, 1≤l≤Mk} wi,k,l) / (Σ_{1≤k≤m} Mk).
In this research, we use b_level as the rank function. The b_level of task vi is the weight of the path from task vi to the exit task. We can compute this value by recursively traversing the DAG from the exit task, and it is defined as follows:

(19) b_level(vi) = w̄(vi) + Max_{vj∈succ(vi)} (di,j + b_level(vj)) + RC(vi),

where succ(vi) is the set of immediate successors of task vi. RC(vi) is the average reliability overhead of task vi and can be computed by

(20) RC(vi) = (1 − exp(−(Σ_{1≤k≤m, 1≤l≤Mk} λk(fk,l) / Σ_{1≤k≤m} Mk) × w̄(vi))) × w̄(vi).
For the exit task vexit, the b_level is equal to

(21) b_level(vexit) = w̄(vexit) + RC(vexit).
Basically, b_level(vi) is the length of the critical path from task vi to the exit task, including the average computation cost and reliability overhead of task vi. For example, considering the application DAG in Figure 1(b), heterogeneous systems parameters in Table 1, task execution time matrix in Table 2, and communication matrix in Table 3, the task b_level value which is recursively computed by (19) and (21) is shown in Table 4.
The b_level value of tasks.

Task     v1    v2    v3    v4  v5     v6    v7    v8
b_level  73.4  61.4  42.4  26  34.15  16.6  13.7  3.8
Seq      1     2     3     5   4      6     7     8
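The b_level recursion in (19) and (21) can be sketched as follows on the example DAG, using the average execution times implied by Table 2 and the communication times of Table 3. The small reliability overhead RC(vi) of (20) is omitted for brevity, so the values come out slightly below those of Table 4 while producing the same scheduling order.

```python
from functools import lru_cache

# Average execution time over all five processor-frequency pairs (Eq. (18)),
# computed from Table 2.
w_avg = {
    'v1': 4.632, 'v2': 16.342, 'v3': 6.99, 'v4': 2.434,
    'v5': 30.018, 'v6': 6.248, 'v7': 3.38, 'v8': 3.796,
}
# Edges (vi, vj) -> communication time d_{i,j} (Table 3).
comm = {
    ('v1', 'v2'): 6.99, ('v1', 'v3'): 15.48, ('v1', 'v4'): 6.69,
    ('v2', 'v5'): 10.86, ('v3', 'v5'): 1.25, ('v3', 'v6'): 12.56,
    ('v4', 'v6'): 6.93, ('v4', 'v7'): 0.3,
    ('v5', 'v8'): 0.11, ('v6', 'v8'): 6.535, ('v7', 'v8'): 6.2,
}
succ = {}
for (vi, vj) in comm:
    succ.setdefault(vi, []).append(vj)

@lru_cache(maxsize=None)
def b_level(v):
    """Eqs. (19)/(21) with the reliability overhead RC set to zero."""
    tail = max((comm[(v, c)] + b_level(c) for c in succ.get(v, [])),
               default=0.0)
    return w_avg[v] + tail

# Sorting by non-increasing b_level yields the "Seq" order of Table 4.
order = sorted(w_avg, key=b_level, reverse=True)
```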
5.2. Task Assignment Phase
In this phase, tasks are assigned to the processors with the earliest execution finish time EFT(vi), high reliability, and minimum task energy consumption EN(vi). However, for heterogeneous systems, these performance metrics conflict most of the time. Here, we introduce a novel objective, RE, which can achieve a good tradeoff among these metrics. We first redefine task vi's earliest execution finish time on processor pk at frequency fk,l as

(22) EFT(vi, fk,l) = EST(vi, fk,l) + wi,k,l + RO(vi, fk,l),

where RO(vi, fk,l) is the reliability overhead of task vi on processor pk at frequency fk,l and is computed by

(23) RO(vi, fk,l) = (1 − P[vi, fk,l]) × wi,k,l.
On the other hand, we let MinF(vi) and MinE(vi) denote the earliest execution finish time and the minimum task energy consumption over all processors of the heterogeneous systems. Thus, the novel metric RE of task vi on processor pk at frequency fk,l is

(24) RE(vi, fk,l) = θ × (EFT(vi, fk,l) − MinF(vi)) / EFT(vi, fk,l) + (1 − θ) × (EN(vi, fk,l) − MinE(vi)) / EN(vi, fk,l),

where θ is the weight of the task's earliest execution finish time. If the task execution time is more important than energy consumption, we give θ a higher value; otherwise, θ is lower. Moreover, the scheduling objective of this problem is to minimize both the schedule length and the energy consumption. Thus, in each task assignment step, we try to find the minimum RE(vi, fk,l) and assign task vi to the corresponding processor frequency.
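The assignment rule of this phase, evaluating (24) for every processor-frequency pair and picking the minimum, can be sketched as follows; the candidate EFT and EN values below are made up:

```python
def re_metric(eft, en, min_f, min_e, theta=0.5):
    """Eq. (24): weighted normalized excess in finish time and energy."""
    return (theta * (eft - min_f) / eft
            + (1 - theta) * (en - min_e) / en)

# Made-up candidates for one task: (processor-frequency, EFT, EN).
candidates = [('p1,3', 10.0, 900.0),
              ('p2,1', 12.0, 400.0),
              ('p2,2', 11.0, 500.0)]
min_f = min(eft for _, eft, _ in candidates)
min_e = min(en for _, _, en in candidates)
best = min(candidates, key=lambda c: re_metric(c[1], c[2], min_f, min_e))
# With theta = 0.5, 'p2,1' wins: a slightly later finish but far less energy.
```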
5.3. Slack Reclamation
Tasks of a parallel application may have some slack time in their execution, due primarily to communication events, for example, “multidimensional” intertask communication (or intertask data dependencies), and these processor slacks are an obvious source of energy wastage. Slack reclamation has been studied to reduce energy consumption using the slack left by completed task instances. The idea behind slack reclamation is to exploit the slack time to slow down the execution speeds of the remaining tasks [12, 20]. In this paper, we adopt this technique to reduce energy consumption after making the scheduling decision. The slack time of task vi is defined by

(25) Slack(vi) = Min_{vj∈succ(vi)} { Sch(vj, sT) − Sch(vi, fT), if vi, vj are on the same processor; Sch(vj, sT) − di,j − Sch(vi, fT), otherwise },

where Sch(vi, sT) is task vi's earliest start time in the scheduled processor-frequency pairs and Sch(vi, fT) is its earliest finish time.
If a task's slack time Slack(vi) > 0, we can scale down the execution frequency to save energy. Thus, the optimal frequency fk,l satisfies

(26) wi,k,l + RO(vi, fk,l) < Sch(vi, fT) + Slack(vi),
     EN(vi, fk,l) < EN(vi, fk,orig),

where fk,orig is the originally scheduled processor frequency. In the last step, we reassign task vi to the optimal frequency fk,l.
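Eq. (25) can be sketched as a small helper; the two-task schedule below (start/finish times, processor placement, communication delay) is made up for illustration:

```python
def slack(v, sched, succ, comm):
    """Eq. (25): how long task v's finish may slip without delaying
    any successor.  sched maps task -> (start, finish, processor)."""
    _, finish, proc = sched[v]
    margins = []
    for c in succ.get(v, []):
        c_start, _, c_proc = sched[c]
        if c_proc == proc:
            margins.append(c_start - finish)
        else:
            # Data must arrive d_{v,c} before the successor starts.
            margins.append(c_start - comm[(v, c)] - finish)
    return min(margins) if margins else 0.0

# Made-up schedule: v1 finishes at t = 5 on p1; its successor v2 starts
# at t = 12 on p2, with communication time d(1,2) = 3.
sched = {'v1': (0.0, 5.0, 'p1'), 'v2': (12.0, 20.0, 'p2')}
s = slack('v1', sched, {'v1': ['v2']}, {('v1', 'v2'): 3.0})  # 12 - 3 - 5 = 4
```

A positive result such as this one is the budget within which (26) searches for a slower, cheaper frequency.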
6. Experimental Results and Discussion
In this section, we compare the performance, energy consumption, and system reliability using our REAS algorithm with three existing scheduling algorithms: DLS [6], RDLS [27], and ECS [20]. The experiments are performed on the synthetic randomly generated precedence-constrained parallel application graphs as described below. The performance metrics chosen for the comparison are the schedule length (2) and (22), systems energy consumption EN(P) (7), and application reliability P[G] (16).
To test the performance of these algorithms, we have developed a discrete event simulation environment in C++ for heterogeneous systems with 8 DVFS-enabled processors. The simulated system includes 2 Intel® Core™ Duo, 2 Intel Xeon, 2 AMD Athlon, 1 TI DSP, and 1 Tesla GPU, and is thus mostly Intel-based. The systems are interconnected by InfiniBand, a switched fabric communications link primarily used in high-performance computing. For the InfiniBand configuration, the switch considered is a Mellanox InfiniScale™ III SDR and the NIC is a Mellanox ConnectX™ IB Dual Copper Card [21]. Other parameters of the model are set as follows. The failure rates of processors are assumed to be uniformly distributed between 1×10^{−4} and 1×10^{−5} failures/hr [8, 9, 28]; the transmission rates of links are assumed to be 1000 Mbits/sec.
6.1. Randomly Generated Application Graphs
These experiments use three common DAG characteristics to generate parallel application graphs [5, 8, 9, 29]:
DAG Size (v). It is the number of tasks in the application DAG.
Communication-Computation Ratio (CCR). It is the ratio of communication time to computation time. A small CCR value means the application is computation-intensive; a large CCR value indicates that the application is communication-intensive [5, 8–10, 29].
Out-Degree. It is out-degree of a task node.
In the experimental setting, DAGs are generated based on the above parameters with 50 and 100 tasks. Task weights are generated randomly from a uniform distribution over [1×10^{9}, 9×10^{10}] execution cycles; thus the average task weight is about 4.5×10^{10} execution cycles. We also generated edge weights from a uniform distribution based on a mean CCR. Parallel applications with different profiles can be produced by giving various CCR values [5, 8–10, 29]. In these experiments, we varied CCR in a reasonable range of 0.1 to 10.
6.2. Various Weight θ of REAS Algorithm
In the first experiments, we evaluate the effect of the weight θ on the REAS algorithm. Figure 2 shows the simulation results of scheduling 50 and 100 tasks with CCR = 1, varying the weight θ from 0 to 1 in steps of 0.2. We observe from Figure 2 that the schedule length and energy consumption decrease, while the application reliability stays at almost the same level, as the REAS weight θ increases. This is reasonable: with a high θ, REAS is driven mostly by task execution time, which makes its schedule length shorter and its energy consumption lower. However, once the weight θ exceeds 0.4, the performance of REAS is not much distinguishable. Thus, in the experiments below, we let θ = 0.5.
The experimental results of REAS algorithm with various weights θ. (a) Schedule length. (b) Energy consumption. (c) Application reliability.
6.3. Random Task Performance Results
For the set of randomly generated parallel applications, the results are shown in Figures 3 and 4, where each data point is the average of the data obtained in 1,000 experiments. In this set of experiments, we set the weight θ = 0.5 in the metric RE (see (24)) of the REAS algorithm. In other words, the REAS algorithm puts the same weight on task execution time and energy consumption; the effect of various weights θ was examined in the previous section.
The experimental results. (a) 50-task schedule length. (b) 50-task energy consumption. (c) 50-task application reliability. (d) 100-task schedule length. (e) 100-task energy consumption. (f) 100-task application reliability.
The experimental results of 100 tasks for 4 Intel Xeon and 4 AMD Athlon. (a) Schedule length. (b) Energy consumption. (c) Application reliability.
We observe from Figure 3(a) that REAS outperforms RDLS and ECS with respect to schedule length, and the schedule length increases as the CCR increases. The average schedule length of the REAS algorithm is shorter than that of RDLS and ECS by 2.6% and 1.9%, respectively. This improvement becomes more obvious as CCR increases; for CCR = 5, REAS outperforms RDLS and ECS by 7.5% and 2.6%, respectively. However, REAS is inferior to DLS in terms of schedule length. Figure 3(b) reveals that REAS saves more energy on average than RDLS by 15.3%, ECS by 3.7%, and DLS by 16%, respectively. Figure 3(c) shows that REAS outperforms RDLS, ECS, and DLS by 0.3%, 2%, and 0.7% in terms of average application reliability.
This is mainly because the REAS algorithm schedules tasks according to the novel objective RE, which yields an effective tradeoff among task execution time, energy consumption, and task execution reliability. In contrast, the DLS algorithm focuses only on optimizing task execution time, even though the actual execution time also includes task scheduling time and reliability overhead. The scheduling solution generated by DLS therefore achieves an optimal schedule length but consumes more energy and has lower reliability. The RDLS algorithm schedules tasks considering their execution reliability while ignoring task energy consumption. The ECS algorithm optimizes both schedule length and energy consumption, but its solutions incur more task execution reliability overhead. Thus, the REAS algorithm achieves a good overall tradeoff, outperforming RDLS, ECS, and DLS in energy consumption and reliability while remaining competitive in schedule length. Another interesting observation is that RDLS and DLS are better than ECS in terms of reliability. This is mainly because the tasks in the RDLS and DLS solutions always execute at the normal processor frequency, which has the highest reliability among all processor frequencies.
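The tradeoff described above can be sketched as a per-task selection step: for one ready task, evaluate every (processor, frequency) pair and pick the pair with the best weighted time/energy score, tracking reliability as a tie-breaker. This is not the published REAS pseudocode; the power model (dynamic power ∝ f³) and the frequency-dependent failure rate λ(f) = λ₀·10^(d·(fmax − f)) are standard modeling assumptions, and all parameter names are hypothetical.

```python
import math

def pick_assignment(task_cycles, processors, theta=0.5):
    """Illustrative sketch of a reliability/energy-aware selection step.

    For a single ready task, enumerate every (processor, frequency) pair,
    compute execution time, dynamic energy, and reliability, and return
    the candidate minimizing the weighted time/energy score (reliability
    breaks ties). Returns (processor name, frequency, time, energy, reliability).
    """
    candidates = []
    for p in processors:
        for f in p["freqs"]:                        # available DVFS levels
            t = task_cycles / (f * p["speed"])      # execution time at frequency f
            e = p["power_coeff"] * f ** 3 * t       # dynamic energy ~ C * f^3 * t
            # failure rate grows as frequency (and voltage) is scaled down
            lam = p["lambda0"] * 10 ** (p["d"] * (p["fmax"] - f))
            r = math.exp(-lam * t)                  # Poisson reliability
            candidates.append((p["name"], f, t, e, r))
    t_max = max(c[2] for c in candidates)
    e_max = max(c[3] for c in candidates)
    # smaller weighted score is better; higher reliability breaks ties
    return min(candidates,
               key=lambda c: (theta * c[2] / t_max + (1 - theta) * c[3] / e_max,
                              -c[4]))
```

Note how θ steers the choice: a high θ favors the fastest (processor, frequency) pair, while a low θ favors a lower frequency that saves cubic dynamic energy at the cost of a longer run, mirroring the behavior observed in Figure 2.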
The improvements in scheduling performance can also be seen in Figures 3(d), 3(e), and 3(f) for 100 tasks. These results show that REAS outperforms RDLS and ECS by 4.9% and 3.5% in terms of average schedule length. REAS also outperforms RDLS, ECS, and DLS by 8.93%, 4.53%, and 8.24% in terms of average energy consumption, and by 1.86%, 6.28%, and 2.1% in terms of average application reliability, respectively.
We also simulate a heterogeneous system with 4 Intel Xeon and 4 AMD Athlon processors; the other configurations are the same as before. Figure 4 shows the results of 100 randomly generated tasks on this heterogeneous computing platform. The results show that REAS outperforms RDLS, ECS, and DLS in terms of average schedule length and energy consumption. However, REAS is inferior to RDLS in terms of application reliability.
6.4. Application Graphs of Real-World Problem
Using real applications to test the performance of algorithms is a common practice [5, 8–10, 29]. In this section, we also simulate a real-world digital signal processing (DSP) problem; details can be found in [5, 8–10, 29]. From Figure 5, we can conclude that REAS also outperforms RDLS, ECS, and DLS.
The experimental results of real-world DSP problem. (a) Schedule length. (b) Energy consumption. (c) Application reliability.
7. Conclusions and Future Work
In the past few years, with the rapid development of heterogeneous systems, the high price of energy, system performance and reliability requirements, and various environmental issues have forced the high-performance computing sector to reconsider some of its old practices with the aim of creating more sustainable systems. In this paper, we address the simultaneous management of system performance, reliability, and energy consumption. To achieve this goal, we first built a reliability- and energy-aware task scheduling architecture, which mainly includes the heterogeneous system, the parallel application DAG model, and the energy consumption model. Then, we derived the relationship between execution reliability and processor voltage/frequency and obtained approximate values of its parameters by the least squares curve fitting method. Thirdly, we established a parallel application execution reliability model and formulated the reliability- and energy-aware scheduling problem as a linear program. Finally, to provide an effective solution to this problem, we proposed a heuristic Reliability-Energy Aware Scheduling (REAS) algorithm based on a novel scheduling objective RE, which jointly considers task execution time, energy consumption, and reliability.
The performance of the REAS algorithm is evaluated with an extensive set of simulations and compared to three of the best existing scheduling algorithms for heterogeneous systems: RDLS, ECS, and DLS. The comparison is performed on synthetic, randomly generated precedence-constrained parallel application DAGs as well as a real-world DSP application. The simulation results clearly confirm the superior performance of the REAS algorithm over the other three, particularly in energy saving.
This work is one of the first attempts at simultaneous management of system performance, reliability, and energy consumption on high-performance computing systems. Future studies in this domain are twofold. First, it will be interesting to extend our model to multidimensional computing resources, such as interconnections, memory access, and I/O activities. Second, the failures occurring on system resources are assumed in this paper to follow a Poisson process; other reliability models can also be used in further studies.
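For concreteness, the Poisson failure assumption implies an exponential reliability law: with failure rate λ, the probability that a task of length t completes without failure is exp(−λt), and for n independently failing cores the aggregate failure rate is n·λ. The sketch below is only a numerical illustration of this assumption, not the paper's full reliability model (which also accounts for frequency-dependent failure rates).

```python
import math

def task_reliability(lam, t):
    """Probability that a task running for t hours sees zero failures,
    given a Poisson failure process with rate lam (failures per hour)."""
    return math.exp(-lam * t)

def system_reliability(lam_per_core, n_cores, t):
    """With n_cores independent cores each failing at rate lam_per_core,
    the aggregate failure process is Poisson with rate n_cores * lam_per_core,
    so system reliability decays exponentially in the core count."""
    return math.exp(-n_cores * lam_per_core * t)
```

This exponential decay in the core count is exactly why, as noted in the introduction, even near-perfect single-processor reliability yields a small system MTTF at the scale of tens of thousands of cores.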
Competing Interests
The authors declare that they have no competing interests.
Acknowledgments
This research was partially funded by the National Science Foundation of China (Grant no. 61370098), Hunan Provincial Natural Science Foundation of China (Grant no. 2015JJ2078), National High-Tech R&D Program of China (2015AA015303), Key Technology Research and Development Programs of Guangdong Province (2015B010108006), and a project supported by the Science Foundation for Postdoctorate Research from the Ministry of Science and Technology of China (Grant no. 2014M552134).
References
[1] DCD Industry Census 2013: data center power, http://www.datacenterdynamics.com/focus/archive/2014/01/dcd-industry-census-2013-data-center-power.
[2] Top500 supercomputer list, November 2015, http://www.top500.org/lists/2015/11/.
[3] Rusu S., Tam S., Muljono H., Stinson J., Ayers D., Chang J., Varada R., Ratta M., Kottapalli S., and Vora S., "A 45 nm 8-core enterprise Xeon processor."
[4] Lu C.-D. (remaining bibliographic details not recoverable).
[5] Tang X., Li K., Zeng Z., and Veeravalli B., "A novel security-driven scheduling algorithm for precedence-constrained tasks in heterogeneous distributed systems."
[6] Dong F. and Akl S. G., "Scheduling algorithms for grid computing: state of the art and open problems," Technical Report 2006-504, 2006.
[7] Xu Y., Li K., Hu J., and Li K., "A genetic algorithm for task scheduling on heterogeneous computing systems using multiple priority queues."
[8] Tang X., Li K., Li R., and Veeravalli B., "Reliability-aware scheduling strategy for heterogeneous distributed computing systems."
[9] Tang X., Li K., and Liao G., "An effective reliability-driven technique of allocating tasks on heterogeneous cluster systems."
[10] Topcuoglu H., Hariri S., and Wu M.-Y., "Performance-effective and low-complexity task scheduling for heterogeneous computing."
[11] Xu Y., Li K., He L., and Truong T. K., "A DAG scheduling scheme on heterogeneous computing systems using double molecular structure-based chemical reaction optimization."
[12] Wang Y., Li K., Chen H., He L., and Li K., "Energy-aware data allocation and task scheduling on heterogeneous multiprocessor systems with time constraints."
[13] Venkatachalam V. and Franz M., "Power reduction techniques for microprocessor systems."
[14] Li K., Tang X., and Yin Q., "Energy-aware scheduling algorithm for task execution cycles with normal distribution on heterogeneous computing systems," in Proceedings of the 41st International Conference on Parallel Processing (ICPP '12), Pittsburgh, Pa, USA, September 2012, pp. 40–47, doi:10.1109/icpp.2012.25.
[15] Du Z., Sun H., He Y., He Y., Bader D. A., and Zhang H., "Energy-efficient scheduling for best-effort interactive services to achieve high response quality," in Proceedings of the 27th IEEE International Parallel and Distributed Processing Symposium (IPDPS '13), Boston, Mass, USA, May 2013, pp. 637–648, doi:10.1109/ipdps.2013.26.
[16] Han J.-J., Lin M., Zhu D., and Yang L. T., "Contention-aware energy management scheme for NoC-based multicore real-time systems."
[17] Mei J., Li K., and Li K., "Energy-aware task scheduling in heterogeneous computing environments."
[18] Rountree B., Lowenthal D. K., Funk S., Freeh V. W., de Supinski B. R., and Schulz M., "Bounding energy consumption in large-scale MPI programs," in Proceedings of the ACM/IEEE Conference on Supercomputing (SC '07), Reno, Nev, USA, November 2007, pp. 1–9, doi:10.1145/1362622.1362688.
[19] Rizvandi N. B., Taheri J., and Zomaya A. Y., "Some observations on optimal frequency selection in DVFS-based energy consumption minimization."
[20] Lee Y. C. and Zomaya A. Y., "Energy conscious scheduling for distributed computing systems under different operating conditions."
[21] Zong Z., Manzanares A., Ruan X., and Qin X., "EAD and PEBD: two energy-aware duplication scheduling algorithms for parallel tasks on homogeneous clusters."
[22] Chielle E., Lima Kastensmidt F., and Cuenca-Asensi S., "Tuning software-based fault-tolerance techniques for power optimization," in Proceedings of the 24th International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS '14), Palma de Mallorca, Spain, October 2014, pp. 1–7, doi:10.1109/patmos.2014.6951871.
[23] Ernst D., Das S., Lee S., Blaauw D., Austin T., Mudge T., Kim N. S., and Flautner K., "Razor: circuit-level correction of timing errors for low-power operation."
[24] Firouzi F., Yazdanbakhsh A., Dorosti H., and Fakhraie S. M., "Dynamic soft error hardening via joint body biasing and dynamic voltage scaling," in Proceedings of the 14th Euromicro Conference on Digital System Design: Architectures, Methods and Tools (DSD '11), Oulu, Finland, March 2011, pp. 385–392, doi:10.1109/dsd.2011.53.
[25] Zhu D., Melhem R., and Mossé D., "The effects of energy management on reliability in real-time embedded systems," in Proceedings of the IEEE/ACM International Conference on Computer-Aided Design, November 2004, pp. 35–40.
[26] Zhu D. and Aydin H., "Reliability-aware energy management for periodic real-time tasks."
[27] Dogan A. and Özgüner F., "Matching and scheduling algorithms for minimizing execution time and failure probability of applications in heterogeneous computing."
[28] Qin X. and Jiang H., "A dynamic and reliability-driven scheduling algorithm for parallel real-time jobs executing on heterogeneous clusters."
[29] Xu Y., Li K., He L., Zhang L., and Li K., "A hybrid chemical reaction optimization scheme for task scheduling on heterogeneous computing systems."
[30] Spiliopoulos V., Kaxiras S., and Keramidas G., "Green governors: a framework for continuously adaptive DVFS," in Proceedings of the International Green Computing Conference (IGCC '11), Orlando, Fla, USA, July 2011, pp. 1–8, doi:10.1109/igcc.2011.6008552.
[31] Qiu M. and Sha E. H.-M., "Cost minimization while satisfying hard/soft timing constraints for heterogeneous embedded systems."
[32] Garg S. K., Yeo C. S., Anandasivam A., and Buyya R., "Environment-conscious scheduling of HPC applications on distributed Cloud-oriented data centers."
[33] Baumann R., "The impact of technology scaling on soft error rate performance and limits to the efficacy of error correction," in Proceedings of the International Electron Devices Meeting (IEDM '02), San Francisco, Calif, USA, December 2002, pp. 329–332, doi:10.1109/IEDM.2002.1175845.
[34] Fard A. M., Ghasemi M., and Kargahi M., "Response-time minimization in soft real-time systems with temperature-affected reliability constraint," in Proceedings of the CSI Symposium on Real-Time and Embedded Systems and Technologies (RTEST '15), Tehran, Iran, October 2015, doi:10.1109/rtest.2015.7369850.
[35] Song S., Coit D. W., Feng Q., and Peng H., "Reliability analysis for multi-component systems subject to multiple dependent competing failure processes."
[36] Degalahal V., Li L., Narayanan V., Kandemir M., and Irwin M. J., "Soft errors issues in low-power caches."
[37] Lei Z., Tianqi G., Ji Z., Shijun J., Qingzhou S., and Ming H., "An adaptive moving total least squares method for curve fitting."