Task Classification Based Energy-Aware Consolidation in Clouds

We consider a cloud data center, in which the service provider supplies virtual machines (VMs) on hosts or physical machines (PMs) to its subscribers for computation in an on-demand fashion. For the cloud data center, we propose a task consolidation algorithm based on task classification (i.e., computation-intensive and data-intensive) and resource utilization (e.g., CPU and RAM). Furthermore, we design a VM consolidation algorithm to balance task execution time and energy consumption without violating a predefined service level agreement (SLA). Unlike existing research on VM consolidation or scheduling that applies either no threshold or a single threshold scheme, we focus on a double threshold (upper and lower) scheme for VM consolidation. More specifically, when a host operates with resource utilization below the lower threshold, all the VMs on the host are scheduled to be migrated to other hosts and the host is then powered down; when a host operates with resource utilization above the upper threshold, a VM is migrated so that the host does not run at 100% resource utilization. Based on experimental performance evaluations with real-world traces, we show that our task classification based energy-aware consolidation algorithm (TCEA) achieves a significant energy reduction without incurring predefined SLA violations.


Introduction
Nowadays, cloud computing has become an efficient paradigm of offering computational capabilities as a service based on a pay-as-you-go model [1], and many studies have been conducted in diverse cloud computing research areas, such as fault tolerance and quality of service (QoS) [2,3]. Meanwhile, virtualization has been touted as a revolutionary technology to transform cloud data centers (e.g., Amazon's elastic compute cloud and Google's compute engine) [4]. By taking advantage of virtualization technology, running cloud applications on virtual machines (VMs) has become an efficient solution for consolidating data centers because the utilization rate of data centers has been found to be low, typically ranging from 10 to 20 percent [5]. In other words, a single host (physical machine) can run multiple VMs simultaneously and VMs can be relocated dynamically by live migration operations, leading to high resource utilization. Another issue of data centers is high energy consumption, which results in substantial carbon dioxide emissions (about 2 percent of the global emissions). A typical data center consumes as much energy as 25,000 households do [6]. In this regard, an efficient energy consumption strategy in nonvirtualization environments (smart grids) has been carried out [7].
As virtualization technology [8,9] has become widely popular, organizations and companies have begun to build their own private cloud data centers using commodity hardware. In this regard, there is a need for more efficient and effective VM consolidation techniques to reduce energy consumption in cloud data centers. The simplest way to achieve energy reduction in cloud computing environments is to minimize the number of physical machines (PMs) by allocating more VMs to each PM. However, this solution may lead to a high degree of service level agreement (SLA) violations when each VM requires the host's limited resources. Moreover, the relationship between CPU utilization and power consumption is not linear, as shown in Figure 1. The power consumption of the CPU increases more than linearly as utilization increases. More importantly, when the CPU utilization is above 90%, the power consumption jumps up quickly due to the architectural design and the turbo boost feature. In other words, the performance to power ratio [10] exhibits sublinear growth, and therefore, simply packing many VMs onto a PM so that it runs at 100% CPU utilization is not always the best solution in terms of performance, energy consumption, and SLA violations. We use Intel i5 and i7 CPUs in our experiments, rather than the server class CPUs in Figure 1, because, for small and medium sized companies, using commodity hardware like Intel i5 or i7 to build a private cloud is more affordable and accessible [11].
In this paper, we present a new VM and task consolidation mechanism in cloud computing environments. The proposed method is based on task classification, in which we divide cloud tasks into two categories: computation-intensive and data-intensive tasks. A computation-intensive task refers to a computation-bound application program. Such applications devote most of their execution time to fulfilling computational requirements as opposed to I/O and typically require small volumes of data, while a data-intensive task refers to an I/O-bound application that needs to process large volumes of data. Such applications devote most of their processing time to I/O, movement, and manipulation of data. The basic idea of our approach is twofold. One is that when we need to migrate cloud tasks due to a migration policy, we favor a computation-intensive task for migration rather than a data-intensive task since the migration time for computation-intensive tasks is shorter than that of data-intensive tasks. In order to migrate data-intensive tasks, it is necessary to move the data for processing as well, and this data transfer generates communication overheads. Then, we prefer a target VM with no computation-intensive tasks because data-intensive tasks consume less CPU resources, thereby providing a comfortable execution environment for the computation-intensive task. The other is to use a double threshold approach (i.e., upper threshold and lower threshold) for VM migrations and optimization. When a VM's utilization is either above the upper threshold or below the lower threshold, the VM is scheduled for migration. Our double threshold approach differs from previous work in that, to the best of our knowledge, no existing algorithm uses the upper and lower thresholds simultaneously in an effective way. Through extensive measurements, we identified that there is much room for optimization by balancing performance and energy consumption.
Our work differs from traditional scheduling algorithms in the literature by designing and implementing a novel consolidation mechanism based on a task classification approach. We develop corresponding task scheduling and VM allocation algorithms for cloud tasks executed in virtualized data centers.
The major contributions of this paper are summarized as follows:
(i) We designed an energy-aware cloud data center consolidation mechanism based on task classification, while preserving performance and SLA guarantee.
(ii) We developed cloud task scheduling and VM allocation algorithms that determine when and how to migrate tasks and VMs in an energy-efficient way.
(iii) We formulated a double threshold algorithm for further optimization to improve the performance to power ratio.
(iv) We undertook a comprehensive analysis and performance evaluation based on real-world workload traces.
The rest of this paper is organized as follows. Section 2 describes our research motivation and our intuition for consolidation in virtualized clouds. In Section 3, the task classification based energy-aware consolidation scheduling mechanism and the main principles behind it are presented. The experiments and performance analysis are given in Section 4. The related work in the literature is summarized in Section 5. Finally, Section 6 concludes the paper.

Motivation and the Basic Idea
As virtualization technology has become widely used, it is easy to construct a private cloud computing environment with open-source infrastructure as a service (IaaS) solutions and commodity hardware (e.g., desktop-level CPUs and peripherals). Figure 2 shows the execution time of a matrix multiplication benchmark program and its performance to power ratio with CPU utilization for Intel i5-3570 and i7-3770 CPUs. With CPU utilization below 50%, the performance gain from the CPUs is noticeable as CPU utilization increases, as the performance to power ratio indicates. However, when CPU utilization is above 50%, the performance to power ratio grows sublinearly. This means that using high CPU utilization is not always an energy-efficient way to perform tasks. Even when we use the turbo boost feature, one of the dynamic voltage and frequency scaling (DVFS) techniques, the performance gain from a higher CPU operating frequency is small considering the performance to power ratio.
Hence, we devise another approach using a threshold of CPU utilization so that a host that manages a couple of VMs does not exceed a predefined CPU utilization threshold. When a host exceeds the threshold, our consolidation algorithm decides to migrate one of the tasks or VMs on the host to another, as depicted in Figure 3. Each task is categorized as a C-task (computation-intensive task) or a D-task (data-intensive task) and, for simplicity in this example, is assumed to use 25% of the resources of a VM. Note that the task categorization mechanism of C-tasks and D-tasks is explained in the next section. Assuming that the threshold is 75% for a VM, tasks in VM 1 and VM 8 should be migrated to underutilized VMs. For Case A, in which there are both C-tasks and D-tasks in a VM, our consolidation algorithm chooses a C-task to be migrated and preferentially selects a target VM with no C-tasks, since migrating a C-task takes much less time than migrating a D-task and migrating a D-task introduces a major I/O bottleneck in the host. For Case B, in which there are only D-tasks and no C-tasks, we only consider underutilized VMs as targets, disregarding the category of tasks running on the target VM. For task migration, there are many prevalent software and management technologies, such as openMosix, which is a Linux kernel extension that allows processes to migrate to other nodes seamlessly.
On the other hand, choosing a proper threshold value is an important factor that influences the overall performance, and there is a tradeoff between the threshold value and SLA violations. Figure 4 shows the tradeoff with various migration policies. Obviously, lowering the threshold value leads to lower energy consumption, but it causes SLA violations, meaning that a user's task request cannot be guaranteed to meet the pre-agreed metrics. In a condensed situation, where there is no host that can afford additional VMs and the ratio of PM to VM is low, it is more desirable to use a higher threshold value, whereas, in a sparse situation, where there are many free hosts available for additional VMs and the ratio of PM to VM is high, a lower threshold value is possible, but it consumes energy and wastes resources. As far as the latter case is concerned, we use a double threshold approach to reduce energy consumption further, while incurring as few SLA violations as possible. The resource types of a system are CPU, memory, storage, network, and so forth. Among them, CPU is the most dominant factor that influences energy consumption [12]. In this paper, we focus on CPU utilization for migration policies and leave integrating other types of resources into the migration policies as future work.

Task Classification Based Energy-Aware Consolidation Algorithm (TCEA)
As shown in Figure 5, we consider a typical cloud data center with a cloud portal. When a user submits a task to the cloud portal, TCEA first performs a task classification process based on configurations of the task and historical logs. The task is categorized as either computation-intensive or data-intensive. Then, with this task classification information, we assign the task to an appropriate VM and consolidate VMs in the data center in an energy-aware way. After that, TCEA periodically checks hosts against predefined threshold values so that unnecessary hosts are powered down after migrating their VMs to other hosts, while maintaining the SLA. The detailed description of our proposed algorithms is given below.
(A) Double Threshold Scheme. Our consolidation algorithms are based on the double threshold scheme. In order to reduce the energy consumption of a cloud data center, one may consider using the minimum number of hosts by utilizing CPU as much as possible for VMs. However, this approach is not an energy-efficient solution because it disregards the performance to power ratio. Thus, TCEA uses the upper threshold to keep CPUs from being driven into the high-utilization, high-power region. On the other hand, when many of the hosts are lightly loaded as a whole, it is necessary to minimize the number of active hosts by consolidating VMs, thereby avoiding superfluous energy consumption. For that purpose, we employ the lower threshold. With the lower threshold, TCEA periodically checks whether hosts and VMs require VM or task consolidation. For example, if a host operates with CPU utilization below the lower threshold, we migrate the VMs on the host to other hosts as long as there are available hosts that can accommodate the VMs without restricting them. With these in mind, it is important to choose proper values for the double threshold scheme, that is, the upper threshold and lower threshold, considering the tradeoff between performance and energy consumption. To determine the conditions of suitable threshold values, we conduct several experiments in Section 4.
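The double threshold check can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Host` record, the `partition_hosts` helper, and the default values of 0.5 and 0.9 are assumptions introduced here for illustration, and utilization is modeled as a single CPU fraction per host.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str               # hypothetical host identifier
    cpu_utilization: float  # CPU utilization as a fraction in [0.0, 1.0]


def partition_hosts(hosts, lower=0.5, upper=0.9):
    """Split hosts into underutilized, normal, and overutilized groups.

    The threshold defaults are illustrative assumptions, not prescribed values.
    """
    under, normal, over = [], [], []
    for host in hosts:
        if host.cpu_utilization > upper:
            over.append(host)      # candidate source of a VM migration
        elif host.cpu_utilization < lower:
            under.append(host)     # candidate for evacuation and power-down
        else:
            normal.append(host)    # candidate migration target
    return under, normal, over


hosts = [Host("h1", 0.95), Host("h2", 0.30), Host("h3", 0.70)]
under, normal, over = partition_hosts(hosts)
print([h.name for h in under], [h.name for h in normal], [h.name for h in over])
```

Hosts that fall between the two thresholds are the ones that later serve as migration targets.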
(B) Task Classifier. Unlike previous work, we consider a task's characteristics in consolidating a cloud data center. Towards this end, we place a task classifier module to categorize tasks into computation-intensive or data-intensive tasks. When a user submits a task, the classifier examines history log files to check whether the task has been performed before. If so, TCEA uses the previous classification information without performing the task classification process. If not, it performs the task classification process as shown in Algorithm 1.
The criteria for classifying tasks in the task classifier function are based on the communication to computation ratio [13]. By examining the execution time and task transfer time of a task, the classifier puts the task into the corresponding queue. In other words, when the computation time of a task is greater than its task transfer time, the task classifier places the task in the computation-intensive queue. Otherwise, the task is considered data-intensive. The classification information of the task is also stored in the storage for future use.
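The classification rule described above can be sketched in a few lines of Python. This is a hedged illustration under simplifying assumptions: `classify_task`, the label strings, and the in-memory `history` dictionary (standing in for the history log files) are hypothetical names introduced here, and the two times are assumed to be already measured in the same unit.

```python
def classify_task(task_id, compute_time, transfer_time, history):
    """Return the task category, reusing a stored result when one exists.

    `history` is a plain dict used as a stand-in for the history log files.
    """
    if task_id in history:
        # The task was seen before: reuse the previous classification.
        return history[task_id]
    if compute_time > transfer_time:
        label = "computation-intensive"
    else:
        label = "data-intensive"
    history[task_id] = label  # store the result for future submissions
    return label


history = {}
print(classify_task("t1", compute_time=12.0, transfer_time=3.0, history=history))
print(classify_task("t2", compute_time=2.0, transfer_time=9.0, history=history))
```

A second submission of `t1` returns the stored label without comparing the times again, which mirrors the shortcut taken for tasks that already have a history entry.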
(C) Task Assignment. The next step after performing the task classification process is to assign tasks to appropriate VMs. When assigning a task, TCEA first tries to find a host whose utilization is relatively low, as shown in Algorithm 2. Then, it checks all the VMs in the host by counting the number of computation-intensive tasks. Out of the VMs, a VM that has the least number of computation-intensive tasks can be a candidate when the task is computation-intensive. After iterating this phase, the task assignment function selects a VM for the task.
When the type of a task is data-intensive, TCEA does not care about the types of tasks for finding target VMs. The only consideration is the number of tasks running in VMs. Thus, it finds a VM that runs the minimum number of tasks in order to balance the load. For optimization, the task assignment function migrates a task to another VM. At this stage, we favor computation-intensive tasks for migration because migrating data-intensive tasks is inefficient. In other words, migrating data-intensive tasks takes more time than migrating computation-intensive tasks since it is necessary to move the data of data-intensive tasks as well. When finding an overutilized host, TCEA prefers a VM that runs the largest number of computation-intensive tasks for migration. This is based on the fact that migrating a computation-intensive task is more efficient than migrating a data-intensive task with regard to the number of migrations and utilization shifting. Once a task is chosen for migration, the next step is to choose a target VM. There are two conditions for choosing a target VM. One is CPU utilization and the other is the number of computation-intensive tasks. Among VMs whose host's CPU utilization is low, a VM that runs the least number of computation-intensive tasks will be chosen for the target VM. Then, the task is scheduled to be migrated accordingly.
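The target selection rules in this subsection can be sketched as follows. This is an illustrative simplification rather than Algorithm 2 itself: the `VM` record, `select_target_vm`, and the 0.9 cutoff used to decide that a host's utilization is "low" are assumptions made here, and ties are broken by host utilization.

```python
from dataclasses import dataclass


@dataclass
class VM:
    name: str                # hypothetical VM identifier
    host_utilization: float  # CPU utilization of the VM's host, in [0.0, 1.0]
    c_tasks: int = 0         # number of computation-intensive tasks
    d_tasks: int = 0         # number of data-intensive tasks


def select_target_vm(vms, task_type, upper=0.9):
    """Pick a target VM for a task, or None if every host is too busy."""
    candidates = [vm for vm in vms if vm.host_utilization < upper]
    if not candidates:
        return None
    if task_type == "computation-intensive":
        # Prefer the VM running the fewest computation-intensive tasks.
        return min(candidates, key=lambda vm: (vm.c_tasks, vm.host_utilization))
    # Data-intensive: only the total number of running tasks matters.
    return min(candidates,
               key=lambda vm: (vm.c_tasks + vm.d_tasks, vm.host_utilization))


vms = [
    VM("vm1", 0.40, c_tasks=2, d_tasks=0),
    VM("vm2", 0.60, c_tasks=0, d_tasks=3),
    VM("vm3", 0.95, c_tasks=0, d_tasks=0),
]
print(select_target_vm(vms, "computation-intensive").name)
print(select_target_vm(vms, "data-intensive").name)
```

In this example `vm3` is excluded because its host is already above the cutoff, even though it runs no tasks.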
(D) Consolidation of VMs. For VM consolidation, it is essential to handle and manage the VMs and hosts chosen by the double threshold scheme. Algorithm 3 shows the VM consolidation in TCEA in detail. When a host's utilization is above the upper threshold (i.e., an overutilized host), TCEA chooses a VM to be migrated considering the number of computation-intensive tasks. The more computation-intensive tasks a VM has, the more likely the VM is to be a source for migration. Once a source VM is selected, a target host selection phase is performed. Since a source VM will occupy a large portion of utilization, it is preferable to choose a target host whose utilization is relatively low. Therefore, the chosen target host may have fewer computation-intensive tasks than others. On the other hand, when managing underutilized hosts chosen by the lower threshold, all the VMs in the host will be migrated to hosts whose utilization is normal across the data center. The reason why TCEA chooses normally utilized hosts as migration targets is to exploit the performance to power ratio. Choosing a fully utilized host as a target will result in more energy consumption and consolidation management overheads. For example, when a host that is chosen as a target becomes overutilized, TCEA will perform redundant load balancing operations.
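A minimal Python sketch of this consolidation step is given below. It is not Algorithm 3: the `VM` and `Host` records, the `consolidate` function, the greedy ordering, and the modeling of host utilization as the sum of per-VM CPU shares are all assumptions made here for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    name: str
    cpu_share: float  # fraction of the host CPU used by this VM (assumed)
    c_tasks: int = 0  # number of computation-intensive tasks


@dataclass
class Host:
    name: str
    vms: list = field(default_factory=list)

    @property
    def utilization(self):
        return sum(vm.cpu_share for vm in self.vms)


def consolidate(hosts, lower=0.5, upper=0.9):
    """Return a list of migration and power-down actions (greedy sketch)."""
    plan = []

    def find_target(vm, source):
        # Only normally utilized hosts that stay within the upper threshold.
        fits = [h for h in hosts
                if h is not source
                and lower <= h.utilization <= upper
                and h.utilization + vm.cpu_share <= upper]
        return min(fits, key=lambda h: h.utilization, default=None)

    def migrate(vm, source, target):
        source.vms.remove(vm)
        target.vms.append(vm)
        plan.append(("migrate", vm.name, source.name, target.name))

    # Overutilized hosts: move the VM with the most computation-intensive tasks.
    for host in [h for h in hosts if h.utilization > upper]:
        while host.utilization > upper:
            vm = max(host.vms, key=lambda v: v.c_tasks)
            target = find_target(vm, host)
            if target is None:
                break
            migrate(vm, host, target)

    # Underutilized hosts: evacuate every VM, then power the host down.
    for host in [h for h in hosts if h.vms and h.utilization < lower]:
        for vm in list(host.vms):
            target = find_target(vm, host)
            if target is None:
                break
            migrate(vm, host, target)
        if not host.vms:
            plan.append(("power_down", host.name))
    return plan


hosts = [
    Host("h1", [VM("a", 0.30, c_tasks=3), VM("b", 0.65)]),
    Host("h2", [VM("c", 0.20, c_tasks=1)]),
    Host("h3", [VM("d", 0.50)]),
]
for action in consolidate(hosts):
    print(action)
```

In this sketch an underutilized host that cannot be fully evacuated simply stays powered on; a production scheduler would more likely check that all of its VMs can be placed before moving any of them.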
(E) Task Classification Based Energy-Aware Consolidation Algorithm (TCEA). Algorithm 4 covers our overall consolidation and scheduling scheme. Note that the procedure of lines (1)-(6) is triggered upon receipt of a set of tasks and that of lines (7)-(18) is performed periodically. The task classifier function and the task assignment function are responsible for consolidation and management of tasks in TCEA. TCEA monitors VMs and hosts in the cloud data center for status updates. With the predefined values including the upper and lower thresholds, TCEA maintains the sets of overutilized, normally utilized, and underutilized hosts. To balance performance and energy consumption, VMs in the overutilized and underutilized hosts will be migrated to the normally utilized hosts. It is worth noting that choosing the proper values of the upper threshold, lower threshold, and the number of VMs to be migrated influences the performance of TCEA. In the next section, we validate TCEA for energy efficiency and performance with these parameters.
Algorithm 1: Task classification.
(1) if the task has no historical log file
(2) if VM execution time is greater than data movement time
(3) add the task to the computation-intensive queue;
(4) else
(5) add the task to the data-intensive queue;
(6) endif
(7) else // the task has a historical log file
(8) Retrieve information from the configuration file;
(9) Classify data type using obtained information;

Performance Evaluation
In this section, we present experimental results that demonstrate the performance of TCEA in reducing energy consumption by managing VM consolidation while achieving SLA satisfaction. As input, we use real task traces (Intel Netbatch logs [14]) and artificial task logs with a fixed combination of computation-intensive tasks and data-intensive tasks. For the experiments, we assume that there are 50 hosts and 100 VMs running in the cloud data center unless specified otherwise. A host is equipped with a quad-core CPU (i7-3770) with 4 GB of RAM and gigabit Ethernet. A user can specify the type of a VM, such as the number of vCPUs, RAM, and storage capacity. Otherwise, a default VM setting with 1 GB of RAM and 1 vCPU is used.
In this experiment, we analyze the runtime of TCEA with upper thresholds varying from 100% to 60%. We conduct this experiment on the real-world datasets mentioned above. In Figure 6, the x-axis denotes the upper threshold and the y-axis represents the energy consumption, the number of VM migrations, and the number of host shutdowns. The number of VM migrations and the number of host shutdowns decrease steadily as the upper threshold decreases. With a decreased upper threshold, the available hosts tend to remain alive because VMs should reside in hosts whose utilization is below the upper threshold, and therefore, the number of VM migrations is reduced as well. For energy consumption, 90% is optimal. This indicates that (1) although hosts with an upper threshold of 100% maintain more VMs, 100% is not the best threshold due to the performance to power ratio, (2) even though the number of host shutdowns peaks with an upper threshold of 100%, the energy reduction from using the lower value of the upper threshold (90%) dominates that from the number of host shutdowns, and (3) the number of VM migrations decreases with a lower upper threshold because the probability of finding satisfactory target VMs gets lower too. For the rest of the experiments, we use an upper threshold of 90% unless specified otherwise.
For a sparse situation, where there are many free hosts available for additional VMs and the ratio of PM to VM is high, we devise an optimization algorithm to migrate VMs from underutilized hosts to others and shut down the hosts, thereby reducing energy consumption. To this end, we use a lower threshold such that VMs in a host below the lower threshold are scheduled to be migrated to other hosts, and then the host is shut down. Figure 7 shows energy consumption, the number of VM migrations, and the number of host shutdowns with varying lower thresholds (e.g., 0.8 on the x-axis means that 20% of hosts are chosen by the lower threshold). Compared with the default (no task classification is performed), TCEA consumes 14.05% less energy on average. When the lower threshold is 50%, the difference between the default and TCEA reaches a peak. With respect to energy consumption, the number of VM migrations, and the number of host shutdowns, we use a lower threshold of 50% for the rest of the experiments unless specified otherwise.
To verify the effectiveness of lower thresholds, we conduct another experiment showing energy consumption, the number of VM migrations, and the number of host shutdowns with VM ratios by increasing the number of VMs and hosts (1x means a default setting of 100 VMs and 50 hosts). Note that, in this experiment, a VM ratio of 0.9 means that 10% of hosts whose utilization is below the lower threshold are scheduled to be powered down by migrating their VMs. As shown in Figure 8, a VM ratio of around 50% suits our purpose in terms of energy consumption, the number of VM migrations, the number of host shutdowns, and SLA violations. A ratio below 0.5 leads to SLA violations; therefore we do not use a ratio lower than 0.5. To investigate the respective improvement brought by TCEA's double threshold scheme, we compare the performance of TCEA (double threshold) with the single threshold scheme and the default (no threshold and no task classification) setting. In this experiment, we use real task trace logs and artificial task logs with a fixed combination of computation-intensive tasks and data-intensive tasks. In Figure 9, "Job" indicates real task traces, Job c indicates only computation-intensive tasks, Job d indicates only data-intensive tasks, and Job cd indicates 50% computation-intensive tasks and 50% data-intensive tasks.
As shown in Figure 9, there is no difference among the results with the default setting (no threshold) in terms of energy consumption because a threshold scheme is not applicable. Nevertheless, we leave them for comparison. The double threshold scheme saves 47.6% of energy compared to the default setting. For the single threshold scheme, there is no big difference between 90% and 100%, but there are more VM migration operations with an upper threshold of 100%, which leads to overheads. Of the job categories (Job, Job c, Job d, and Job cd), Job d shows little performance impact with the single threshold because it uses relatively less CPU, and Job cd shows a performance improvement when the single threshold is above 80%. The results for the double threshold show a phenomenon similar to that of the single threshold. However, the double threshold scheme further reduces energy consumption by 14.2% compared to the single threshold scheme.
An important requirement for achieving the optimal performance of virtualized cloud environments is to find the appropriate number of VMs per PM. In such an environment, the ratio of PM to VM affects the overall performance. To validate the effect of the ratio of PM to VM, we compare the threshold schemes (default, single, and double). The double scheme achieves the largest energy reduction, followed by the single scheme and then the default scheme, as shown in Figure 10. The double threshold scheme reduces energy consumption by 11.3% and 27.2% compared with the single and default schemes, respectively. For the number of VM migrations, there are some points where the double threshold scheme exhibits more VM migrations than the single threshold scheme does, but it stabilizes when the ratio of PM to VM is 1 : 9 or more. In addition, the double threshold scheme always outperforms the other schemes with respect to the number of host shutdowns.
To measure the scalability with respect to the number of PMs and VMs, we increase the number of PMs and VMs from 1 : 2 up to 10 : 20 as shown in Figure 11. As expected, TCEA consumes 17.9% less energy on average than the default scheme and performs more host shutdowns than the default scheme. For VM consolidation, TCEA has a higher number of VM migrations. For task scalability, we compare energy consumption by increasing the task log size up to 10 times as depicted in Figure 12. Compared with the default scheme, TCEA consumes 15.8% less energy on average. Obviously, TCEA has more VM migration and host shutdown operations than the default scheme has for VM consolidation.

Related Work
We summarize the related work across three perspectives: resource allocation and scheduling in data centers and clouds, threshold-based schemes with different objectives, and energy savings in data centers. To balance energy consumption and VM utilization, the authors of [10] used a performance to power ratio. Their approach schedules VM migration dynamically and consolidates servers in clouds. They compared their proposed algorithm with three different algorithms, including the DVFS algorithm, using real trace log files. The authors of [13] proposed a criterion to divide computation-intensive tasks and data-intensive tasks using a communication to computation ratio. The rationale of this task classification is to employ resource allocation methods based on tasks or workflows to improve performance. The authors of [15] developed an energy-aware scheduling scheme to reduce the total processing time for VMs in a precedence-constrained condition, while maximizing PM utilization considering communication costs. The authors of [16] proposed a prediction algorithm for finding overutilized servers and a best-fit algorithm for hosts and VMs. The results show that the algorithms reduce the number of migration operations, the number of server reboots, and energy consumption, while achieving SLA guarantee. A separation mechanism of I/O tasks to perform computation-intensive tasks in a batch in virtualized servers to mitigate virtualization overheads is proposed in

Figure 2: Energy consumption and execution time of matrix multiplication of i5 and i7 CPUs.

Figure 4: Energy consumption and SLA violations with threshold and migration policies.

Figure 6: Performance results for upper threshold.

Figure 7: Performance results for lower threshold.

Figure 8: Performance scalability for the number of nodes with lower threshold.