Performance Optimization of Cloud Data Centers with a Dynamic Energy-Efficient Resource Management Scheme



Introduction
As a direct result of the rapid growth in the number of cloud users, some cloud providers have already built large numbers of data centers to satisfy the resource demands [1]. The consequences are massive increases in energy consumption, an excessive increase in carbon emissions, and a reduction in benefits for the cloud providers [2]. Statistical results show that the average data center can consume as much energy as 25,000 ordinary households [3]. Therefore, based on the concept of green computing, the development of a greener, more energy-efficient resource management mechanism for cloud systems is becoming more desirable [4,5]. The main contributions of this paper are summarized as follows: (i) We present a cloud architecture composed of a task-scheduling decision layer, a resource-provisioning layer, and an actual service layer. Over the multitier cloud architecture, we propose an energy-efficient resource management scheme with a synchronous sleep mechanism. (ii) We establish a queueing model composed of three subqueues to capture the proposed scheme. By using a Markov chain-based approach, we derive two performance measures: the average latency of requests and the energy-saving rate of the system.
(iii) Taking into account the trade-off between the average latency of requests and the energy-saving rate of the system, we build a system cost function and present an improved Salp Swarm Algorithm (SSA) to optimize the sleep mechanism.

Related Work
In this section, we review the related work on energy conservation research in cloud systems based on virtualization technology, sleep mode, and multitier cloud architecture. We then set forth the motivation for our research.

Virtualization Technology-Based Energy Conservation Research.
In recent years, to utilize the physical resources optimally, the study of energy conservation strategies for virtual machine (VM) configuration, migration, and consolidation has become a focus of energy conservation research in cloud systems. Auday et al. considered migration and placement of VMs to enhance the energy efficiency in cloud infrastructure. In order to minimize the additional energy consumption generated by VM migration, they proposed a distributed approach to an energy-efficient dynamic VM consolidation policy. The approach determined which VMs are migrated and where the selected VMs are placed [6]. To solve the problem of under-utilization of servers in a cloud system, Zakarya et al. used VM consolidation to reduce the number of hosts in use. They explored the impact of VM allocation on energy efficiency and proposed a dynamic VM migration approach, in which the VMs are migrated only if the migration cost can be recovered [7]. Through modeling the energy-aware allocation and consolidation, Ghribi et al. presented an optimal allocation algorithm with a consolidation algorithm relying on migration of VMs to minimize the overall energy consumption in the cloud system. The allocation algorithm was solved as a bin-packing problem aiming to minimize the energy consumption. The consolidation algorithm was based on a linear and integer formulation of VM migration to adapt the placement for released resources [8]. Aiming to save energy and minimize resource wastage, Sharma et al. proposed a multiobjective VM allocation and migration scheme, in which the allocation of VMs was carried out using a hybrid approach of a genetic algorithm and particle swarm optimization [9]. Based on virtualization technology, the above research improved the utilization rate of the physical resources in use and contributed to energy conservation.

Sleep Mode-Based Energy Conservation Research.
The sleep mode-based energy conservation strategy is implemented by switching the idle server to a low-power sleep state for the purpose of reducing idle energy consumption in the cloud system. Jin et al. proposed a clustered VM allocation strategy on the resource layer of the cloud system based on a sleep mode with a wake-up threshold. By establishing a queue with an N-policy and asynchronous vacations of partial servers, they derived the performance measures in terms of the average latency of requests and the energy-saving rate of the system [10]. By using a hybrid shuffled frog leaping algorithm, Luo et al. proposed a dynamic VM allocation scheme, which applied a live VM migration strategy and switched some free resource nodes into a sleep mode to reduce energy consumption [3]. Farahnakian et al. developed a dynamic VM consolidation method to solve the optimization problem of setting the number of active hosts based on the utilization of existing resources. The proposed method could decide when to switch a host into the working or sleep mode [11]. Sridharshini et al. proposed an energy-aware scheduling algorithm and a live migration algorithm to efficiently utilize the resources in a cloud system. These two algorithms were used to consolidate heterogeneous workloads to minimize the number of physical machines (PMs) and switch the idle PMs to the sleep mode to reduce energy consumption [12]. The studies mentioned above showed a certain degree of enhanced energy efficiency due to the introduction of a sleep mode.

Energy Conservation Research under a Multitier Cloud Architecture.
A multitier cloud architecture contains multiple separate parts such as an "application layer," a "management layer," and a "resource layer" [13]. Some works have appeared examining energy consumption management in a multitier cloud architecture.
Usman et al. proposed a cloud architecture composed of four modules: broker, cloud manager, VM manager, and resource scheduler. By using an Interior Search Algorithm (ISA), they developed an energy-efficient VM allocation technique to overcome high energy consumption and reduce under-utilized resources in a cloud system [14]. Aiming to use the computing resources productively and energy efficiently, Beloglazov presented a three-tier cloud architecture composed of a global resource manager, user applications, and resource pools. He proposed a distributed dynamic VM consolidation approach utilizing fine-grained fluctuations in the application workloads to minimize the number of active physical nodes [15].
Zhu et al. proposed a cloud framework composed of four modules: application agent, VM allocation center, global scheduling center, and resource pools. In addition, they designed a resource allocation and scheduling strategy to reduce the energy consumption on both the system level and the component level [16]. In order to promote energy efficiency in a cloud system, Ghosh et al. developed a multitier cloud architecture composed of a resource provisioning decision layer, a VM deployment layer, and an actual service layer. Furthermore, for reducing the complexity of performance analysis, they developed a multilevel interactive stochastic submodel method to derive the performance measures of the system [17]. Obviously, it is more reasonable to study the energy consumption problem by considering a multitier cloud architecture.

Motivation for Our Research.
Inspired by the work mentioned above, in this paper, we propose a dynamic energy-efficient resource management scheme in a cloud system. Considering that it is more realistic to study energy conservation under a multitier cloud architecture, we present a cloud architecture composed of a task scheduling decision layer, a resource provisioning layer, and an actual service layer. Note that switching all the idle servers to a low-power sleep state may deteriorate the response performance. To save energy as well as to maintain the cloud user's quality of experience, we configure the PMs into two pools: a hot pool and a warm pool. The PMs in the hot pool keep working continuously to provide cloud services instantly for the arriving requests. The PMs in the warm pool are turned on, but remain in a dynamic sleep mode to reduce energy consumption.
In addition, this paper also considers the provisioning process of VMs in both pools. Concretely, each PM is configured with a resource search engine (RSE) that finds an available VM for each request, and the RSE is set to sleep synchronously with all the VMs on the PM to conserve energy. To analyze the proposed scheme, we establish a hybrid queueing system composed of three stochastic submodels with synchronous multiple vacations, and we study the system performance through theoretical analysis and numerical experiments. By building a system cost function, we study the trade-off between different performance measures and present an improved SSA to optimize the sleep mechanism. The remainder of this paper is organized as follows. In Section 3, by considering a multitier cloud architecture and two PM pools, we propose an energy-efficient resource management scheme with a synchronous sleep mechanism. In Section 4, we establish a hybrid queueing system composed of three submodels. In Section 5, we analyze the steady-state probability distribution of the queueing system by establishing a three-dimensional Markov chain. In Section 6, based on the model analysis results, we evaluate the average latency of requests and the energy-saving rate of the system. In Section 7, we show the influence of the sleep mechanism on the performance measures by using numerical results. In Section 8, we present an improved intelligent algorithm to optimize the sleep mechanism. Finally, we summarize the whole paper in Section 9.

Scheme Description
Proper deployment of VMs is critical for energy conservation and the Quality of Service (QoS) guarantee in a cloud system. In order to save energy and maintain the QoS, this paper proposes a dynamic energy-efficient resource management scheme, where the PMs are grouped into two pools: a hot pool and a warm pool. In the hot pool, the PMs are running continuously and the VMs hosted on a PM are always available. This means that the requests allocated to the hot pool can be served quickly, so the QoS of the cloud system can be guaranteed. In the warm pool, a synchronous sleep mechanism is introduced for the purpose of achieving a better energy-saving effect. The service provided by the warm pool can be delayed by the sleep mechanism. We call the PMs, the RSEs, and the VMs in the hot pool the hot PMs, the hot RSEs, and the hot VMs, and we call those in the warm pool the warm PMs, the warm RSEs, and the warm VMs. Based on a multitier cloud architecture and a grouping approach for the PMs, we propose a novel resource management scheme shown in Figure 1.
In Figure 1, we assume that each PM is equipped with an RSE and that the maximum number of VMs deployed on one PM is m. We also assume that the numbers of identical PMs in the hot pool and the warm pool are n h and n w , respectively, where n h = 1, 2, . . . and n w = 1, 2, . . .. The life cycle of a request under the resource management scheme proposed in this paper is as follows:
(1) All the requests are assumed to be homogeneous and enter a first-come, first-served (FCFS) queue in the system buffer. The request at the head of the queue first receives the service of the Task Scheduling Decision Engine (TSDE). As long as the hot pool is not full, the request will be allocated by the TSDE to the hot pool. Otherwise, the request will be allocated to the warm pool.
(2) The request allocated to the hot pool randomly enters the FCFS queue in one of the hot PM buffers. The request at the head of the queue is processed by an RSE, which is used to find a VM on the selected PM for resource provisioning. If at least one idle VM exists on one of the hot PMs, the RSE provisions an available VM to the request, and the request is immediately served by the running VM. After the service is completed, the request departs the system.
(3) The request allocated to the warm pool randomly enters the FCFS queue in one of the warm PM buffers. The request at the head of the queue can have its service delayed due to the introduction of the sleep mechanism. On one of the warm PMs, once all the requests are processed, the RSE together with all the VMs enters a sleep period. Meanwhile, a sleep timer is started. When the sleep timer expires, if at least one request exists in the warm buffer, the RSE and all the VMs on the PM will wake up; otherwise they will enter the next sleep period.
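The TSDE dispatch rule in step (1) can be sketched in a few lines of Python. This is a minimal illustration only: the list-of-lists pool representation, the per-PM fullness test, and the function name `dispatch` are assumptions made for the sketch, not the paper's exact model.

```python
import random

# Hedged sketch of the TSDE routing rule: route to the hot pool while it
# is not full, otherwise to the warm pool; within a pool, pick a PM at
# random (warm PM buffers are modeled as unbounded).
def dispatch(hot_queues, warm_queues, capacity_L):
    """Return ('hot', idx) or ('warm', idx) for a newly arrived request."""
    # "hot pool not full": at least one hot PM buffer below capacity L
    open_hot = [i for i, q in enumerate(hot_queues) if len(q) < capacity_L]
    if open_hot:
        return ('hot', random.choice(open_hot))
    return ('warm', random.randrange(len(warm_queues)))
```

For example, with one full and one empty hot PM buffer the request lands on the empty hot PM; with every hot buffer full it is deferred to a warm PM.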
We then build a hybrid queueing system to mathematically derive the system performance measures and to solve the performance optimization problem with the proposed scheme.

System Model
In this section, we model the proposed scheme as three submodels in a continuous-time setting.
We then obtain the continuous-time Markov chains (CTMCs) of the hot PM and the warm PM, respectively.

TSDE Submodel.
In cloud systems, some practical requests are independent of each other, while other practical requests are correlated. The computing requests initiated by users are usually uncorrelated. Therefore, an arrival process with a Poisson distribution is considered appropriate for capturing the stochastic behavior of a cloud computing system with uncorrelated traffic [18].
In this research, we focus on user-initiated requests. Therefore, we make the following assumptions. In the request scheduling decision process, we assume that the arrival intervals of requests and the service times of requests are independent, identically distributed (i.i.d.) random variables. Request arrivals at the cloud system presented in this paper are supposed to follow a Poisson process with arrival rate λ 0 , λ 0 > 0. The service time of a request processed by the TSDE is supposed to follow an exponential distribution with service rate δ, δ > 0. Therefore, we build a single-server queue for the task-scheduling decision process. We define the service intensity ρ 0 of the TSDE to be the number of request arrivals at the TSDE during the service time of a request. ρ 0 is given as follows:

ρ 0 = λ 0 /δ. (1)

We define the latency W dec of a request in the TSDE buffer to be the time duration from the instant of a request arriving at the TSDE buffer to the instant of the request departing the TSDE buffer. The average latency E[W dec ] of requests in the TSDE buffer is obtained as follows:

E[W dec ] = ρ 0 /(δ(1 − ρ 0 )). (2)

Substituting equation (1) into equation (2), we have

E[W dec ] = λ 0 /(δ(δ − λ 0 )). (3)
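Under the M/M/1 assumptions above, the TSDE quantities can be computed directly. The sketch below reads E[W dec ] as the mean waiting time in the buffer of a stable M/M/1 queue; the rates passed in are illustrative values, not figures from the paper.

```python
# Sketch of the TSDE (M/M/1) performance quantities: service intensity
# rho0 and mean waiting time in the buffer. lam0 and delta are
# illustrative arrival and service rates.
def tsde_metrics(lam0: float, delta: float):
    """Return (rho0, E[W_dec]) for an M/M/1 queue with lam0 < delta."""
    if lam0 >= delta:
        raise ValueError("unstable: require lam0 < delta")
    rho0 = lam0 / delta                 # service intensity
    w_dec = rho0 / (delta - lam0)       # mean wait in buffer, = lam0/(delta*(delta-lam0))
    return rho0, w_dec
```

For example, `tsde_metrics(2.0, 5.0)` gives ρ 0 = 0.4 and a mean buffer wait of 0.4/3 time units.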

Hot Pool Submodel.
In this paper, we focus on a hot PM to build a queue model as a submodel of the system called the hot pool submodel and study the performance of the hot pool. Let L, L < +∞, be the capacity of the hot PM buffer. Let random variable N 1 (t) = i, i ∈ {0, 1, . . . , L}, be the number of requests in the hot PM buffer at instant t, t ≥ 0. Let random variable J 1 (t) = j, j ∈ {0, 1}, be the state of the RSE, whether it is busy with provisioning a VM (j = 1) or not (j = 0). Each hot VM processes a request by loading a software environment (SE). Let random variable S 1 (t) = k, k ∈ {0, 1, . . . , m}, be the number of hot VMs loaded with an SE at instant t. We call N 1 (t) the system level, J 1 (t) the system stage, and S 1 (t) the system phase.
{(N 1 (t), J 1 (t), S 1 (t)), t ≥ 0} constitutes a three-dimensional continuous-time stochastic process with state space Ω 1 as follows: We assume that a newly arriving request is randomly allocated to one of the hot PMs. The decomposition of a Poisson process yields multiple Poisson processes [19]. The request arrivals at each hot PM are supposed to follow a Poisson process with arrival rate λ 1 . We have

λ 1 = λ 0 q/n h ,

where q is the probability that a newly arriving request is accepted by the hot pool. We assume that the service time of a request processed by the hot RSE follows an exponential distribution with service rate β 1 , β 1 > 0. The service time of a request processed by the hot VM loaded with an SE is supposed to follow an exponential distribution with service rate μ 1 , μ 1 > 0.
We define π i,j,k as the steady-state probability of the hot PM for the system level being equal to i, the system stage being equal to j, and the system phase being equal to k, where (i, j, k) ∈ Ω 1 . We define π i as the steady-state probability distribution vector for the system level being equal to i. The steady-state probability distribution Π 1 of the CTMC {(N 1 (t), J 1 (t), S 1 (t)), t ≥ 0} is composed of π i , 0 ≤ i ≤ L, and is given as follows:

Π 1 = (π 0 , π 1 , . . . , π L ).

Warm Pool Submodel.
In order to evaluate the performance of the warm pool, we focus on a warm PM to build a queue model as another submodel of the system called the warm pool submodel. We assume that the capacity of the warm PM buffer is infinite.
Let random variable N 2 (t) = i, i ∈ {0, 1, . . .}, be the number of requests in the warm PM buffer at instant t. Unlike the hot PMs, a synchronous sleep mechanism is introduced to each warm PM. The RSE and all the VMs on one warm PM go to sleep synchronously whenever possible. Let J 2 (t) = j, j ∈ {0, 1, 2}, be the state of the warm RSE: j = 0 means the warm RSE is asleep, j = 1 means the warm RSE is idle, and j = 2 means the warm RSE is busy with provisioning a VM for a request. Just like those in the hot pool, each warm VM also needs to load an SE for processing a request. Let S 2 (t) = k, k ∈ {0, 1, . . . , m}, be the number of warm VMs loaded with an SE at instant t. We call N 2 (t) the system level, J 2 (t) the system stage, and S 2 (t) the system phase. {(N 2 (t), J 2 (t), S 2 (t)), t ≥ 0} constitutes a three-dimensional continuous-time stochastic process with state space Ω 2 as follows: The general input flow is split into two streams: one into the hot pool and the other into the warm pool. In Section 4.1, the general request arrivals are assumed to follow a Poisson process, so the request arrivals at the warm pool also follow a Poisson process. We assume that a newly arriving request is randomly allocated to one of the warm PMs. The arrival rate of the requests at each warm PM is given as follows:

λ 2 = λ 0 (1 − q)/n w ,

where λ 0 is the arrival rate of the requests at the TSDE submodel and q is the probability that a newly arriving request can be accepted by the hot pool. q is calculated as follows: We assume that the service time of a request processed by the warm RSE follows an exponential distribution with service rate β 2 , β 2 > 0. The service time of a request processed by the warm VM loaded with an SE is supposed to follow an exponential distribution with service rate μ 2 , μ 2 > 0. A sleep timer is used to control the time length of a sleep period. The time length of the sleep timer is assumed to follow an exponential distribution with sleep parameter ϕ, ϕ > 0.
We define π* i,j,k as the steady-state probability of the warm PM for the system level being equal to i, the system stage being equal to j, and the system phase being equal to k. We define π* i as the steady-state probability distribution vector of the warm PM for the system level being equal to i. The steady-state probability distribution Π 2 of the CTMC {(N 2 (t), J 2 (t), S 2 (t)), t ≥ 0} is composed of π* i , i ≥ 0, and is given as follows:

Π 2 = (π* 0 , π* 1 , . . .).

Model Analysis
In this section, we construct the transition rate matrices in the context of the CTMCs and derive the steady-state probability distributions of the hot PM and the warm PM, respectively.

Steady-State Probability Distributions of the Hot PM.
Let Q 1 be the one-step state transition rate matrix of the CTMC {(N 1 (t), J 1 (t), S 1 (t)), t ≥ 0}. Let Q u,v be the one-step state transition rate submatrix of Q 1 for the system level changing to v, v = 0, 1, . . . , L, from u, u = 0, 1, . . . , L. For convenience of expression, we denote Q u,u−1 as B u , Q u,u as A u , and Q u,u+1 as C u .
(1) For the case of u = 0, there are no requests in the hot PM buffer. If a new request arrives at the hot PM, the system state changes in the following two cases: (a) When the number of the hot VMs loaded with an SE is less than m and the hot RSE is idle, the newly arriving request accesses the hot RSE immediately. The system level and the system phase remain unchanged, but the system stage increases by one. The system state transfers to (0, 1, k) from (0, 0, k), 0 ≤ k ≤ m − 1, with λ 1 . (b) When the number of the hot VMs loaded with an SE is less than m but the hot RSE is busy, or the number of the hot VMs loaded with an SE is up to m, the newly arriving request has to wait in the hot PM buffer. The system level increases by one, but the system stage and the system phase remain unchanged. The system state transfers to (1, 1, k) from (0, 1, k), 0 ≤ k ≤ m − 1, or to (1, 0, m) from (0, 0, m), with λ 1 . If a request is completely processed by the hot RSE, one of the deployed hot VMs loads the SE and processes this request. The system level remains unchanged, the system stage decreases by one, and the system phase increases by one. The system state transfers to (0, 0, k + 1) from (0, 1, k), 0 ≤ k ≤ m − 1, with β 1 . If a request is completely processed by a hot VM and departs the system, the SE is removed. The system level and the system stage remain unchanged, but the system phase decreases by one.
The system state transfers to (0, 0, k − 1) from (0, 0, k), 1 ≤ k ≤ m, or to (0, 1, k − 1) from (0, 1, k), 1 ≤ k ≤ m − 1, with kμ 1 . In summary, A 0 is a (2m + 1) × (2m + 1) matrix given as follows: C 0 is a (2m + 1) × (m + 1) matrix given as follows: (2) For the case of 1 ≤ u < L, there is at least one request in the hot PM buffer, and the hot PM buffer is not full.
If a new request arrives at the hot PM, the newly arriving request has to wait in the hot PM buffer. The system level increases by one, but the system stage and the system phase remain unchanged. The system state transfers to (u + 1, 0, m) from (u, 0, m) or to (u + 1, 1, k) from (u, 1, k), 0 ≤ k ≤ m − 1, with λ 1 . If a request is completely processed by the hot RSE, one of the deployed hot VMs loads the SE and provides service for this request. The system state changes in the following two cases: (a) When the number of the hot VMs loaded with an SE is less than m, the hot RSE processes the first request waiting in the hot PM buffer immediately. The system level decreases by one, the system stage remains unchanged, and the system phase increases by one. The system state transfers to (u − 1, 1, k + 1) from (u, 1, k), 0 ≤ k ≤ m − 2, with β 1 . (b) When the number of the hot VMs loaded with an SE is up to m, the hot RSE becomes idle, and the requests in the hot PM buffer keep waiting. The system level remains unchanged, the system stage decreases by one, and the system phase increases by one. The system state transfers to (u, 0, m) from (u, 1, m − 1) with β 1 . If a request is completely processed by a hot VM and departs the system, the SE is removed. The system state changes in the following two cases: (a) When the hot RSE is idle, the first request waiting in the hot PM buffer accesses the hot RSE immediately. The system level and the system phase decrease by one, and the system stage increases by one. The system state transfers to (u − 1, 1, m − 1) from (u, 0, m) with mμ 1 . (b) When the hot RSE is busy, the requests in the hot PM buffer keep waiting. The system level and the system stage remain unchanged, but the system phase decreases by one. The system state transfers to (u, 1, k − 1) from (u, 1, k), 1 ≤ k ≤ m − 1, with kμ 1 . Otherwise, the system state remains fixed at (u, 0, m) with −(λ 1 + mμ 1 ), or at (u, 1, k), 0 ≤ k ≤ m − 1, with −(λ 1 + kμ 1 + β 1 ). In summary, B 1 is an (m + 1) × (2m + 1) matrix given as follows: Let B represent B u , u = 2, 3, . . . , L − 1. B is an (m + 1) × (m + 1) matrix given as follows: Let C represent C u , u = 1, 2, . . . , L − 1. C is an (m + 1) × (m + 1) diagonal matrix given as follows: (3) For the case of u = L, the hot PM buffer is full. Therefore, no new requests can join the hot PM.
If a request is completely processed by the hot RSE, one of the deployed VMs loads the SE and processes this request.
The system state changes in the following two cases: (a) When the number of the hot VMs loaded with an SE is less than m, the hot RSE processes the first request waiting in the hot PM buffer immediately. The system level decreases by one, the system stage remains unchanged, and the system phase increases by one. The system state transfers to (L − 1, 1, k + 1) from (L, 1, k), 0 ≤ k ≤ m − 2, with β 1 . (b) When the number of the hot VMs loaded with an SE is up to m, no other hot VMs can be provisioned by the hot RSE, so the hot RSE becomes idle, and all the requests in the hot PM buffer keep waiting. The system level remains unchanged, the system stage decreases by one, and the system phase increases by one. The system state transfers to (L, 0, m) from (L, 1, m − 1) with β 1 .
If a request is completely processed by a hot VM and departs the system, the SE is removed.
The system state changes in the following two cases: (a) When the hot RSE is idle, the first request waiting in the hot PM buffer is processed by the hot RSE immediately. The system level and the system phase decrease by one, and the system stage increases by one. The system state transfers to (L − 1, 1, m − 1) from (L, 0, m) with mμ 1 . (b) When the hot RSE is busy, all the requests in the hot PM buffer keep waiting. The system level and the system stage remain unchanged, but the system phase decreases by one. The system state transfers to (L, 1, k − 1) from (L, 1, k), 1 ≤ k ≤ m − 1, with kμ 1 . Otherwise, the system state remains fixed at (L, 0, m) with −mμ 1 , or at (L, 1, k), 0 ≤ k ≤ m − 1, with −(kμ 1 + β 1 ). Obviously, B L is an (m + 1) × (m + 1) matrix, and B L = B. A L is an (m + 1) × (m + 1) lower triangular matrix given as follows: At present, we have obtained all the submatrices in the one-step state transition rate matrix Q 1 . Q 1 can be written as follows: The steady-state probability distribution Π 1 of the CTMC {(N 1 (t), J 1 (t), S 1 (t)), t ≥ 0} satisfies the following equilibrium equation and normalization condition:

Π 1 Q 1 = 0, Π 1 e 1 = 1, (23)

where e 1 is an ((L + 2)m + L + 1) × 1 vector with all elements being equal to 1. By solving equation (23), we derive the steady-state probability distribution Π 1 .
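Numerically, an equilibrium equation of this kind is usually solved by replacing one balance equation with the normalization condition, which turns the singular system into a regular linear system. The sketch below applies this standard technique to a toy 3-state generator, not the paper's Q 1 .

```python
import numpy as np

# Hedged sketch: solve Pi @ Q = 0 subject to Pi @ e = 1 by overwriting
# the last balance equation with the normalization row.
def steady_state(Q: np.ndarray) -> np.ndarray:
    n = Q.shape[0]
    A = Q.T.copy()
    A[-1, :] = 1.0                  # replace one equation with sum(Pi) = 1
    b = np.zeros(n)
    b[-1] = 1.0
    return np.linalg.solve(A, b)

# Toy 3-state CTMC generator (rows sum to zero), not the paper's Q1.
Q = np.array([[-2.0,  2.0, 0.0],
              [ 1.0, -3.0, 2.0],
              [ 0.0,  1.0, -1.0]])
pi = steady_state(Q)   # -> [1/7, 2/7, 4/7]
```

The same call works unchanged for any irreducible generator, including the block matrix Q 1 once it is assembled.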

Steady-State Probability Distribution of the Warm PM.
Let Q 2 be the one-step state transition rate matrix of the CTMC {(N 2 (t), J 2 (t), S 2 (t)), t ≥ 0}. Let Q* u,v be the one-step state transition rate submatrix of Q 2 for the system level changing to v, v = 0, 1, . . ., from u, u = 0, 1, . . .. We denote Q* u,u−1 as B* u , Q* u,u as A* u , and Q* u,u+1 as C* u . (1) For the case of u = 0, there are no requests in the warm PM buffer. If a new request arrives at the warm PM, the system state changes in the following three cases: (a) When the warm RSE and the warm VMs are asleep, the newly arriving request has to wait in the warm PM buffer until the sleep timer expires. The system level increases by one, but the system stage and the system phase remain unchanged. The system state transfers to (1, 0, 0) from (0, 0, 0) with λ 2 . (b) When the warm RSE and the warm VMs are awake and the number of the warm VMs loaded with an SE is up to m, no other warm VMs can be provisioned by the warm RSE. The newly arriving request has to wait in the warm PM buffer. The system level increases by one, but the system stage and the system phase remain unchanged. The system state transfers to (1, 1, m) from (0, 1, m) with λ 2 . (c) When the warm RSE and the warm VMs are awake and the number of the warm VMs loaded with an SE is less than m, at least one VM can be provisioned. If the warm RSE is busy, the newly arriving request has to wait in the warm PM buffer. The system level increases by one, but the system stage and the system phase remain unchanged. The system state transfers to (1, 2, k) from (0, 2, k), 0 ≤ k ≤ m − 1, with λ 2 . If the warm RSE is idle, the newly arriving request accesses the warm RSE immediately. The system level and the system phase remain unchanged, but the system stage increases by one. The system state transfers to (0, 2, k) from (0, 1, k), 1 ≤ k ≤ m − 1, with λ 2 .
If a request is completely processed by the warm RSE, one of the deployed warm VMs loads the SE and processes this request. The system level remains unchanged, the system stage decreases by one, and the system phase increases by one. The system state transfers to (0, 1, k + 1) from (0, 2, k), 0 ≤ k ≤ m − 1, with β 2 . If a request is completely processed by a warm VM and departs the system, the system state changes in the following two cases: (a) When the warm RSE is idle and there is only one warm VM loaded with an SE, the used SE is removed. The warm RSE and the warm VMs enter a sleep period immediately. The system level remains unchanged, but the system stage and the system phase decrease by one. The system state transfers to (0, 0, 0) from (0, 1, 1) with μ 2 . (b) When the warm RSE is idle and there are at least two warm VMs loaded with an SE, or the warm RSE is busy and there is at least one warm VM loaded with an SE, the used SE is removed. The system level and the system stage remain unchanged, but the system phase decreases by one. The system state transfers to (0, 1, k − 1) from (0, 1, k), 2 ≤ k ≤ m, or to (0, 2, k − 1) from (0, 2, k), 1 ≤ k ≤ m − 1, with kμ 2 .
(2) For the case of u ≥ 1, there is at least one request in the warm PM buffer. When the warm RSE and the warm VMs are asleep, once the sleep timer expires, the warm RSE wakes up and processes the first request waiting in the warm PM buffer immediately. The system level decreases by one, the system stage increases by two, and the system phase remains unchanged. The system state transfers to (u − 1, 2, 0) from (u, 0, 0) with ϕ.
If a new request arrives at the warm PM, the newly arriving request has to wait in the warm buffer. The system level increases by one, but the system stage and the system phase remain unchanged. The system state transfers to (u + 1, 0, 0) from (u, 0, 0), to (u + 1, 1, m) from (u, 1, m), or to (u + 1, 2, k) from (u, 2, k), 0 ≤ k ≤ m − 1, with λ 2 . If a request is completely processed by the warm RSE, one of the deployed warm VMs loads the SE and provides service for this request.
The system state changes in the following two cases: (a) When the number of the warm VMs loaded with an SE is less than m, the warm RSE processes the first request waiting in the warm PM buffer immediately. The system level decreases by one, the system stage remains unchanged, and the system phase increases by one. The system state transfers to (u − 1, 2, k + 1) from (u, 2, k), 0 ≤ k ≤ m − 2, with β 2 . (b) When the number of the warm VMs loaded with an SE is up to m, no other warm VMs can be provisioned by the warm RSE. Therefore, the warm RSE becomes idle and none of the requests waiting in the warm PM buffer can access the warm RSE. The system level remains unchanged, the system stage decreases by one, and the system phase increases by one. The system state transfers to (u, 1, m) from (u, 2, m − 1) with β 2 .
If a request is completely processed by a warm VM and departs the system, the used SE is removed. The system state changes in the following two cases: (a) When the warm RSE is idle, the first request waiting in the warm PM buffer accesses the warm RSE immediately. The system level and the system phase decrease by one, but the system stage increases by one. The system state transfers to (u − 1, 2, m − 1) from (u, 1, m) with mμ 2 . (b) When the warm RSE is busy, none of the requests waiting in the warm PM buffer can access the warm RSE. The system level and the system stage remain unchanged, but the system phase decreases by one. The system state transfers to (u, 2, k − 1) from (u, 2, k), 1 ≤ k ≤ m − 1, with kμ 2 . Otherwise, the system state remains fixed at (u, 0, 0) with −(λ 2 + ϕ), at (u, 1, m) with −(λ 2 + mμ 2 ), or at (u, 2, k), 0 ≤ k ≤ m − 1, with −(λ 2 + kμ 2 + β 2 ).
Let C* represent C* u , u = 1, 2, . . .. C* is an (m + 2) × (m + 2) diagonal matrix given as follows: At present, we have obtained all the submatrices in the one-step state transition rate matrix Q 2 . Q 2 can be written as follows: Based on the structure of the one-step state transition rate matrix Q 2 , the three-dimensional CTMC {(N 2 (t), J 2 (t), S 2 (t)), t ≥ 0} of the warm PM can be regarded as a type of Quasi Birth-and-Death (QBD) process. Thus, we can apply the method of a matrix-geometric solution [20,21] to derive the steady-state probability distribution Π 2 of the CTMC {(N 2 (t), J 2 (t), S 2 (t)), t ≥ 0}, where Π 2 = (π* 0 , π* 1 , . . .). First, we set up a matrix quadratic equation for the rate matrix R as follows:

C* + RA* + R 2 B* = 0. (31)

Since A* must be nonsingular, from equation (31), we have

R = −(C* + R 2 B*)(A*) −1 . (32)

By deducing equation (32), we obtain

R = −(V + R 2 W), (33)

where W = B*(A*) −1 and V = C*(A*) −1 .
In order to compute the rate matrix R, we present an iteration algorithm in Table 1.
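The Table 1 iteration can be sketched as a fixed-point loop on R = −(V + R²W), starting from R = 0 and stopping when successive iterates agree to a tolerance. In the example below the blocks are 1 × 1, which reduces the QBD to an M/M/1 queue whose minimal solution is known to be R = λ/μ; these toy matrices are illustrative stand-ins, not the paper's B*, A*, and C*.

```python
import numpy as np

# Hedged sketch of the rate-matrix iteration: R_{k+1} = -(V + R_k^2 W),
# with W = B* (A*)^-1 and V = C* (A*)^-1, starting from R_0 = 0.
def rate_matrix(B_star, A_star, C_star, tol=1e-10, max_iter=10000):
    A_inv = np.linalg.inv(A_star)
    W, V = B_star @ A_inv, C_star @ A_inv
    R = np.zeros_like(A_star)
    for _ in range(max_iter):
        R_next = -(V + R @ R @ W)
        if np.max(np.abs(R_next - R)) < tol:
            return R_next
        R = R_next
    raise RuntimeError("R iteration did not converge")

# 1x1 toy QBD: lam = 1 (up), mu = 2 (down), local rate -(lam + mu).
B_star = np.array([[2.0]])
A_star = np.array([[-3.0]])
C_star = np.array([[1.0]])
R = rate_matrix(B_star, A_star, C_star)   # converges to lam/mu = 0.5
```

With the paper's block matrices substituted for the toy ones, the same loop yields the R used to form the matrix-geometric tail π* i = π* 1 R^(i−1).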
Using the rate matrix R obtained in Table 1, we further construct a square matrix as follows: The steady-state probability distribution vectors π* 0 and π* 1 satisfy the following equation: where e* 0 is a (2m + 1) × 1 vector and e* 1 is an (m + 2) × 1 vector, respectively, with all elements being equal to 1.

Performance Measures
In this section, we present two performance measures of the cloud system: the average latency of requests and the energy-saving rate of the system. The service intensity ρ2 of the warm PM is given as follows: where λ2 is the arrival rate of requests at a warm PM, μ2 is the service rate of a request on a warm VM, and β2 is the service rate of a request on the warm RSE. For the proposed scheme, the service intensity ρ of the system is given as follows: where ρ0 is the service intensity of the TSDE. The necessary and sufficient condition for the system to be stable is ρ < 1. We evaluate the average latency of requests and the energy-saving rate of the system under this stability condition.
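As a small illustration, one plausible reading of the stability check is sketched below. The specific form ρ2 = λ2/(mμ2 + β2), which treats the m warm VMs and the warm RSE as parallel servers, is an assumption on our part, since the paper's displayed equation is elided from the text.

```python
def warm_pm_intensity(lam2, m, mu2, beta2):
    """Service intensity of a warm PM under the ASSUMED form
    rho_2 = lambda_2 / (m * mu_2 + beta_2): the m warm VMs (rate mu_2
    each) and the warm RSE (rate beta_2) acting as parallel servers.
    A plausible reading, not the paper's printed equation."""
    return lam2 / (m * mu2 + beta2)

def is_stable(rho):
    # The necessary and sufficient stability condition stated in the text.
    return rho < 1.0
```

With Table 2's m = 3 and illustrative rates μ2 = 2 s⁻¹, β2 = 1 s⁻¹, an arrival rate of 4.8 s⁻¹ gives ρ2 ≈ 0.686, so the warm PM is stable.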
We define the latency of a request as the time duration from the instant a request arrives at the cloud system to the instant the request begins receiving service. In this paper, the average latency of requests in the cloud system includes the average latency of requests queueing in the TSDE buffer and the average latency of requests queueing in the hot PM buffer or the warm PM buffer.

In Section 4.1, the average latency E[Wdec] of requests queueing in the TSDE buffer has already been obtained. Next, we need to compute the average latency E[Wvm] of requests queueing in the hot PM buffer or the warm PM buffer.
Using the steady-state probability distribution Π1 of the CTMC (N1(t), J1(t), S1(t)), t ≥ 0, given in Section 5.1, the average number E[Nhot] of requests queueing in the hot PM buffer can be given as follows: For convenience of analysis, we tag one of the hot PMs. Based on Little's law, the average latency E[Whot] of requests queueing in the buffer of the tagged hot PM is obtained as follows: where λ1 is the arrival rate of requests at the tagged hot PM.
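The Little's-law step above is mechanical and can be sketched as follows; the function name is ours, but the relation L = λW itself is exactly what the text invokes.

```python
def little_wait(mean_queue_len, arrival_rate):
    """Little's law L = lambda * W, rearranged to W = L / lambda, as used
    for the tagged hot PM: E[W_hot] = E[N_hot] / lambda_1."""
    if arrival_rate <= 0:
        raise ValueError("arrival_rate must be positive")
    return mean_queue_len / arrival_rate
```

For example, an average of 3 waiting requests at an arrival rate of 1.5 s⁻¹ gives an average wait of 2 s.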
We also tag one of the warm PMs. Using the steady-state probability distribution Π2 of the CTMC (N2(t), J2(t), S2(t)), t ≥ 0, given in Section 5.2, the average number E[Nwarm] of requests queueing in the buffer of the tagged warm PM is given as follows: l1 in equation (41) is a sufficiently large number satisfying the following equation: where ε1 is a precision factor controlling the accuracy of the average number of requests in the warm PM buffer; the smaller ε1 is, the more precisely this average is computed. Accordingly, the average latency E[Wwarm] of requests queueing in the tagged warm PM buffer is obtained as follows: where λ2 is the arrival rate of requests at the tagged warm PM.
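The truncation rule around the precision factor ε1 can be sketched as follows. This is a hedged sketch: it assumes one aggregate level probability per buffer level and cuts the infinite sum at the first level whose term drops below ε1, which is one plausible reading of how l1 is chosen in equations (41) and (42).

```python
def truncated_mean_queue(level_prob, eps=1e-15, max_levels=100_000):
    """Truncated mean queue length: E[N] = sum_u u * P(level u), cut off
    at the first level l1 whose term falls below the precision factor eps.
    `level_prob(u)` is assumed to return the steady-state probability of
    level u (an aggregated view of Pi_2; the exact layout is assumed)."""
    total, u = 0.0, 0
    while u < max_levels:
        term = u * level_prob(u)
        if u > 0 and term < eps:
            break   # l1 reached: remaining terms each contribute < eps
        total += term
        u += 1
    return total, u   # the truncated mean and the truncation level l1
```

For a geometric distribution P(level u) = 0.5^(u+1) the exact mean is 1, and the truncated sum recovers it to well within the default ε.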
Combining equations (40) and (43), the average latency E[Wvm] of requests queueing in the hot PM buffer or the warm PM buffer can be obtained as follows: where q is the probability that a newly arriving request is accepted by the hot pool.
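The mixture above, plus the decision-layer delay, can be sketched as follows. The weighting E[Wvm] = q·E[Whot] + (1 − q)·E[Wwarm] follows the text directly; adding E[Wdec] on top to obtain the total E[W] is an assumed composition consistent with the summary that follows.

```python
def average_latency(e_w_dec, e_w_hot, e_w_warm, q):
    """E[W_vm] = q * E[W_hot] + (1 - q) * E[W_warm], where q is the
    probability that a newly arriving request is accepted by the hot
    pool.  The total E[W] = E[W_dec] + E[W_vm] is an assumed (but
    natural) composition of the decision-layer and pool delays."""
    if not 0.0 <= q <= 1.0:
        raise ValueError("q must be a probability")
    e_w_vm = q * e_w_hot + (1.0 - q) * e_w_warm
    return e_w_dec + e_w_vm
```

For example, with E[Wdec] = 0.1 s, E[Whot] = 0.2 s, E[Wwarm] = 0.6 s, and q = 0.75, the total average latency is 0.4 s.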
In summary, the average latency E[W] of requests queueing in the cloud system can be derived as follows: Since the TSDE and the hot PMs are always running, their energy consumption is essentially constant. The energy-saving rate of the system is therefore measured as the energy conservation per unit time in the warm pool.
When the warm PMs are in the active state, the energy is consumed normally just like in the TSDE and the hot PMs. Let w, w > 0, be the energy consumption per second for the warm pool in the active state. Let w 1 , w 1 > 0, be the energy consumption per second for the warm pool in the sleep state. When the warm PMs are in the sleep state, less energy will be consumed. It is obvious that w > w 1 .
For a sleeping warm PM, when a sleep period is about to expire, the RSE and the VMs need to monitor the PM buffer, so additional energy is consumed. Let w2, w2 > 0, be the energy consumption for each monitoring. Additional energy is also consumed when a warm PM wakes up from the sleep state. Let w3, w3 > 0, be the energy consumption for each wake-up. Therefore, the energy-saving rate S of the system is given as follows: where ϕ is the sleep parameter of the proposed dynamic sleep mechanism defined in Section 4.3, and l2 in equation (46) is a sufficiently large number satisfying the following equation: where ε2 is a precision factor controlling the accuracy of the energy-saving rate of the system; the smaller ε2 is, the more precisely the energy-saving rate is computed.
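A hedged sketch of the energy-saving accounting follows. The decomposition below (active/sleep power gap weighted by the sleep probability, minus per-event monitoring and wake-up overheads weighted by their rates) is an assumed structure consistent with the narrative; the printed equation (46), with its sum truncated at l2, is not recoverable from the text. The default values follow Table 2.

```python
def energy_saving_rate(p_sleep, monitor_rate, wake_rate,
                       w=2.0, w1=0.3, w2=0.1, w3=0.2):
    """ASSUMED form of the energy-saving rate S:
        S = (w - w1) * p_sleep - w2 * monitor_rate - w3 * wake_rate,
    where p_sleep is the steady-state probability that the warm pool is
    asleep, and monitor_rate / wake_rate are events per unit time.
    Defaults mirror Table 2 (omega = 2, omega1 = 0.3, omega2 = 0.1,
    omega3 = 0.2, in mJ).  This is a sketch, not the paper's equation."""
    return (w - w1) * p_sleep - w2 * monitor_rate - w3 * wake_rate
```

For example, sleeping half the time with one monitoring per second and 0.2 wake-ups per second yields S = 1.7 × 0.5 − 0.1 − 0.04 = 0.71 mJ/s under these assumptions.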

Numerical Results
To numerically analyze the average latency E[W] of requests and the energy-saving rate S of the system under the proposed scheme, we carry out experiments in MATLAB. All experiments are run on a PC with an Intel(R) Core(TM) i7-4790 CPU @ 3.60 GHz, 8.00 GB RAM, and a 500 GB disk. The parameters used in the experiments are listed in Table 2.

Table 2: Parameters set in the experiments.
- Maximum number m of VMs deployed on one PM: m = 3
- Arrival rate λ0 of requests at the TSDE: λ0 = 4.8 s−1
- Service rate δ of a request on the TSDE: δ = 5 s−1
- Service rate β1 of a request on the hot RSE: β1 = 1 s−1
- Energy consumption ω per second for the warm pool in the active state: ω = 2 mJ
- Energy consumption ω1 per second for the warm pool in the sleep state: ω1 = 0.3 mJ
- Additional energy consumption ω2 for each monitoring: ω2 = 0.1 mJ
- Additional energy consumption ω3 for each wake-up: ω3 = 0.2 mJ
- Precision factor ε1 of the average number of requests in the warm PM buffer: ε1 = 10−15
- Precision factor ε2 of the energy-saving rate of the system: ε2 = 10−15

Figure 2 illustrates the trend of the average latency E[W] of requests with the sleep parameter ϕ for different numbers nh of hot PMs and nw of warm PMs.

In Figures 2(a) and 2(b), we show the average latency E[W] of requests for the different service rates μ 1 of a request on a hot VM and the different capacities L of a hot PM buffer, respectively. In Figures 2(c) and 2(d), we show the average latency E[W] of requests for the different service rates μ 2 of a request on a warm VM and the different service rates β 2 of a request on a warm RSE, respectively.
From Figure 2, we notice that, as the sleep parameter ϕ increases, the average latency E[W] of requests first decreases and then levels off.
When the sleep parameter ϕ is small, a newly arriving request has to wait longer in the buffer of a sleeping warm PM. As the sleep parameter grows, the waiting time of a request in the warm PM buffer gets shorter, so the average latency E[W] of requests shows a downtrend.
This implies that the influence of the sleep mechanism on the response performance of the system is greater when the sleep parameter is smaller.
When the sleep parameter ϕ grows beyond a certain value, the length of a sleep period is close to zero, so a warm PM has little chance to go to sleep. As a result, the average latency E[W] of requests tends to a fixed value as the sleep parameter increases. This implies that the proposed sleep mechanism has little effect on the response performance of the system when the sleep parameter is large enough.
For the same sleep parameter ϕ in Figures 2(a) and 2(b), we notice that the average latency E[W] of requests goes up as the capacity L of a hot PM buffer increases. The larger a hot PM's buffer capacity is, the longer requests wait in the hot PM buffer, which raises the average latency. We also notice that, as the service rate μ1 of a request on a hot VM increases, the average latency of requests is reduced: the higher the service rate is, the less time a request occupies the hot VM, so the average latency of requests shows a downtrend.
Comparing Figure 2(a) with Figure 2(b), we find that, for the same capacity L of a hot PM buffer, the same service rate μ1 of a request on a hot VM, and the same sleep parameter ϕ, the average latency E[W] of requests becomes lower as the number nh of hot PMs increases. The more PMs are deployed in the hot pool, the earlier the requests arriving at the hot pool receive service, so the average latency of requests shows a downtrend. In addition, we find that, when the sleep parameter is smaller, the downtrend in the average latency becomes slighter as the number of hot PMs increases. This implies that the more PMs are deployed in the hot pool, the weaker the influence of the sleep mechanism on the response performance of the system becomes.
For the same sleep parameter ϕ in Figures 2(c) and 2(d), we observe that the average latency E[W] of requests rises as the service rate μ2 of a request on a warm VM increases. When the service rate of a request on a warm VM is higher, the probability of the warm RSE and the warm VMs being idle is greater. Therefore, the warm PM is more likely to be asleep, which makes requests wait longer in the warm PM buffer, so the average latency of requests gets larger. We also observe that, as the service rate β2 of a request on a warm RSE increases, the average latency of requests is reduced: the higher this service rate is, the less time a request occupies the warm RSE, which leads to a lower average latency of requests.
Comparing Figure 2(c) with Figure 2(d), we find that, for the same service rate μ2 of a request on a warm VM, the same service rate β2 of a request on a warm RSE, and the same sleep parameter ϕ, a greater number nw of warm PMs gives rise to a lower average latency E[W] of requests. The more PMs are deployed in the warm pool, the earlier the requests arriving at the warm pool receive service, so the average latency of requests is reduced. In addition, we find that the downtrend in the average latency becomes sharper as the number of warm PMs increases when the sleep parameter is smaller.
This implies that the more PMs are deployed in the warm pool, the stronger the influence of the sleep mechanism on the response performance of the system becomes.

Numerical Results for the Energy-Saving Rate of the System.
Figure 3 shows the trends of the energy-saving rate S of the system with the sleep parameter ϕ for different numbers nh of hot PMs and nw of warm PMs. In Figures 3(a) and 3(b), we show the energy-saving rate S of the system for different service rates μ1 of a request on a hot VM and different capacities L of a hot PM buffer, respectively. In Figures 3(c) and 3(d), we show the energy-saving rate S of the system for different service rates μ2 of a request on a warm VM and different service rates β2 of a request on a warm RSE, respectively.
From Figure 3, we notice that, as the sleep parameter ϕ increases, the energy-saving rate S of the system shows a downward trend. When the sleep parameter is smaller, the energy-saving rate of the system is higher: the smaller the sleep parameter is, the longer a sleep period lasts. In this case, frequent listening and waking up of the warm RSE and the warm VMs are avoided, so additional energy use is reduced.
As the sleep parameter ϕ gets larger, the energy-saving rate S of the system decreases. The larger the sleep parameter is, the shorter a sleep period lasts. In this case, the warm RSE and the warm VMs listen to the buffer and wake up from sleep frequently, which causes additional energy consumption.
For the same sleep parameter ϕ in Figures 3(a) and 3(b), we notice that the energy-saving rate S of the system goes up as the capacity L of a hot PM buffer or the service rate μ1 of a request on a hot VM increases. The larger the capacity of a hot PM buffer is, the more requests the hot PM can accept; the higher the service rate of a request on a hot VM is, the less time a request occupies the hot VM. Therefore, the processing capability of a hot PM becomes stronger. In this case, fewer requests are allocated to the warm pool, so the warm PMs are more likely to be in the sleep state. Accordingly, the energy-saving rate of the system is greater.
Comparing Figure 3(a) with Figure 3(b), we find that, for the same capacity L of a hot PM buffer, the same service rate μ1 of a request on a hot VM, and the same sleep parameter ϕ, a larger number nh of hot PMs leads to a higher energy-saving rate S of the system. The more PMs are deployed in the hot pool, the stronger the processing capability of the hot pool is. In this case, fewer requests are allocated to the warm pool, so the warm PMs are more likely to be in the sleep state, which increases the energy-saving rate of the system. In addition, we find that the more PMs are deployed in the hot pool, the closer the energy-saving rates of the system become across different capacities of a hot PM buffer and different service rates of a request on a hot VM. This implies that the capacity of the hot PM buffer and the service rate of a request on a hot VM have less influence on the energy-saving rate of the system as the number of hot PMs rises.
For the same sleep parameter ϕ in Figures 3(c) and 3(d), we observe that the energy-saving rate S of the system rises as the service rate μ2 of a request on a warm VM or the service rate β2 of a request on a warm RSE grows. The higher the service rate of a request on a warm RSE is, the less time a request occupies the warm RSE; the higher the service rate of a request on a warm VM is, the less time a request occupies the warm VM. In this case, the warm PM is more likely to become idle and enter a sleep period. Therefore, the energy-saving rate of the system shows a growth trend.
Comparing Figure 3(c) with Figure 3(d), we find that, for the same service rate μ2 of a request on a warm VM, the same service rate β2 of a request on a warm RSE, and the same sleep parameter ϕ, a greater number nw of warm PMs leads to a higher energy-saving rate S of the system. The more PMs are deployed in the warm pool, the stronger the processing capability of the warm pool is. In this case, the probability of a warm PM being idle is higher, so the warm PM is more likely to be in the sleep state, which increases the energy-saving rate of the system. In addition, we find that the more PMs are deployed in the warm pool, the closer the energy-saving rates of the system become across different service rates of a request on a warm VM and different service rates of a request on a warm RSE. This implies that, when the number of warm PMs is greater, the energy-saving rate of the system is rarely affected by these two service rates.

Performance Optimization
Based on the numerical results given in Section 7, we find that, with an increase in the sleep parameter ϕ, the average latency E[W] of requests shows a downward trend, and the energy-saving rate S of the system also decreases. This indicates that when the sleep parameter tends to infinity, the average latency of requests will be minimized, and the energy-saving rate will be close to zero; obviously, in this case, the energy-saving mechanism will not work at all. Conversely, when the sleep parameter tends to zero, the energy-saving rate will be maximized, and the average latency of requests will become too great to be accepted; in this case, the cloud system cannot provide service normally. How to set the sleep parameter optimally is therefore an important issue in any energy-efficient resource management scheme. In this paper, the criterion for optimization is to balance the different performance measures. To do this, we combine the average latency of requests and the energy-saving rate of the system into a cost function F(ϕ) as follows: where f1 and f2 are the influencing factors of the average latency E[W] of requests and the energy-saving rate S of the system, respectively, in the cost function. Note that the higher the cloud users' demand for response performance is, the larger f1 should be set; the higher the cloud provider's demand for energy efficiency is, the larger f2 should be set. Since it is difficult to express E[W] and S in closed form, we cannot easily determine the monotonicity of the cost function. To minimize the system cost F(ϕ) and optimize the sleep parameter ϕ, we introduce a swarm-based algorithm: the SSA.

Table 3: Improved Salp Swarm Algorithm proposed to obtain the optimal sleep parameter.
Step 1. Initialize the number N of salps, the maximum number of iterations Tmax, the initial inertia weight ws, the inertia weight we at the maximum iteration, the upper search boundary ub, and the lower search boundary lb.
Step 2. Initialize the position ϕi (i = 1, 2, . . . , N) of each salp using the logistic chaotic equation, starting from ϕ1 = rand, where rand is a random number uniformly distributed on (0, 1).
Step 3. Calculate the fitness Fi (i = 1, 2, . . . , N) of each salp.
Step 4. Select the best position ϕ* among all the salps as the source food and calculate the fitness F* of the source food.
Step 5. Set the initial number of iterations t = 1.
Step 6. Update the coefficient c1 and the inertia weight w(t) with a nonlinear decreasing function.
Step 7. Update the position ϕi and calculate the fitness Fi for the other salps.
Step 8. Update the source food ϕ* = argmin_i Fi and calculate the fitness F* of the source food.
Step 9. If t < Tmax, set t = t + 1 and go to Step 6.
Step 10. Output the optimal sleep parameter ϕ* and the minimum cost F*.
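A hedged sketch of the cost function follows. The paper does not print F(ϕ) in closed form in this text, so the exact combination below (latency penalized by f1, energy saving rewarded by f2) is an assumption; only the roles of f1 and f2 described above are taken from the source.

```python
def system_cost(e_w, s_rate, f1=5.0, f2=1.0):
    """ASSUMED instance of the cost function F(phi):
        F = f1 * E[W] - f2 * S,
    i.e. the average latency E[W] is penalised with weight f1 and the
    energy-saving rate S is rewarded with weight f2.  The weights default
    to the f1 = 5, f2 = 1 used in the paper's example; the functional
    form itself is a sketch, not the authors' printed equation."""
    return f1 * e_w - f2 * s_rate
```

Under this form, higher latency raises the cost and higher energy saving lowers it, matching the stated roles of f1 and f2.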
SSA is an intelligent search optimization algorithm inspired by the swarming behaviour of salps. In 2017, Mirjalili et al. first established a mathematical model of salp chains and presented the SSA to solve a range of optimization problems [22]. SSA has only one main controlling parameter, so it is simple and easy to implement. However, like other swarm-based algorithms, SSA suffers from low convergence precision and slow convergence speed on high-dimensional complex optimization problems [23]. In the classical SSA optimization process, global exploration and local exploitation are in tension; if they are out of balance, the algorithm easily falls into a local optimum, leading to convergence stagnation. Consequently, in this paper, we present an improved SSA that introduces logistic chaotic initialization and an adaptive inertia weight [24]. We call this improved algorithm LA-SSA.
In the LA-SSA, we first adopt a logistic chaotic mapping method to generate the initial salp population. This enhances the diversity of the initial individuals and improves the convergence speed of the algorithm in the early stage. Second, we introduce an adaptive inertia weight into the follower position update. The inertia weight reflects the degree to which a follower inherits the position of the preceding salp. If a follower's position is only locally optimal, the classical SSA easily falls into the local optimum, resulting in convergence stagnation. To improve the convergence precision and help the algorithm escape local optima, we introduce a nonlinearly declining inertia weight, which determines the degree of influence of the previous individual on the current one. This gives the salp individuals strong global convergence capacity while allowing relatively accurate results in the later stage. Table 3 shows the main steps of the LA-SSA.
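The steps of Table 3 can be sketched in Python for the one-dimensional sleep parameter as follows. This is a hedged sketch: the logistic map ϕi = ξϕi−1(1 − ϕi−1), the leader coefficient c1 = 2·exp(−(4t/Tmax)²) from the classical SSA, and the quadratic inertia-weight schedule are plausible readings of the garbled table, not the authors' exact formulas.

```python
import math
import random

def la_ssa(cost, lb, ub, n_salps=50, t_max=100, xi=4.0,
           w_s=0.9, w_e=0.4, seed=0):
    """Sketch of LA-SSA (logistic-chaos + adaptive-inertia SSA) for a
    scalar sleep parameter.  Details marked 'assumed' are our reading."""
    rng = random.Random(seed)

    # Step 2: logistic chaotic initialisation, phi_i = xi*phi*(1 - phi),
    # then map the chaotic values from (0, 1) into [lb, ub] (assumed).
    chaos = [rng.random()]
    for _ in range(n_salps - 1):
        chaos.append(xi * chaos[-1] * (1.0 - chaos[-1]))
    pos = [lb + c * (ub - lb) for c in chaos]

    # Steps 3-4: evaluate fitness and pick the source food.
    fit = [cost(p) for p in pos]
    best_i = min(range(n_salps), key=fit.__getitem__)
    food, food_fit = pos[best_i], fit[best_i]

    for t in range(1, t_max + 1):                       # Steps 5 and 9
        # Step 6: classical SSA coefficient and a nonlinearly
        # decreasing inertia weight from w_s down to w_e (assumed form).
        c1 = 2.0 * math.exp(-(4.0 * t / t_max) ** 2)
        w = w_e + (w_s - w_e) * (1.0 - t / t_max) ** 2

        # Step 7: leader explores around the food; followers inherit a
        # weighted share of the preceding salp's position (assumed form).
        for i in range(n_salps):
            if i == 0:
                step = c1 * ((ub - lb) * rng.random() + lb)
                pos[i] = food + step if rng.random() >= 0.5 else food - step
            else:
                pos[i] = 0.5 * (w * pos[i] + pos[i - 1])
            pos[i] = min(max(pos[i], lb), ub)           # clamp to bounds

        # Step 8: update the source food if any salp improved on it.
        fit = [cost(p) for p in pos]
        best_i = min(range(n_salps), key=fit.__getitem__)
        if fit[best_i] < food_fit:
            food, food_fit = pos[best_i], fit[best_i]

    return food, food_fit                               # Step 10
```

As a usage example, minimizing the toy cost (ϕ − 2)² over the paper's search range [0, 5] drives the returned ϕ* toward 2.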
In addition to the parameters in Table 2, we set f1 = 5, f2 = 1, N = 50, Tmax = 100, ξ = 4, ws = 0.9, we = 0.4, ub = 5, and lb = 0 as an example and run the LA-SSA to optimize the dynamic energy-efficient resource management scheme proposed in this paper. For different capacities L of a hot PM buffer and different service rates μ2 of a request on a warm VM, we report the optimal sleep parameter ϕ* and the minimum cost F* in Table 4.
From Table 4, we observe that, for the same capacity L of a hot PM buffer, the optimal sleep parameter ϕ * maintains an upward trend as the service rate μ 2 of a request on a warm VM increases. In contrast, the minimum cost F * shows a downward trend when the service rate μ 2 of a request on a warm VM goes up.

Summary
Considering large amounts of energy consumption generated by cloud data centers, we proposed a dynamic energy-efficient resource management scheme under a multitier cloud architecture. In order to improve the energy efficiency while maintaining the quality of experience for cloud users, we grouped the PMs into different resource pools and introduced a synchronous sleep mechanism to the warm pool. By establishing a Markov chain, we obtained the average latency of requests and the energy-saving rate of the system. In addition, we provided numerical results to study the influence of the sleep mechanism on the system performance. To balance different performance measures, we constructed a system cost function. Moreover, we presented an improved SSA to obtain the optimal sleep parameters and the minimum costs.
In future work, we plan to study energy conservation in cloud systems with heterogeneous cloud users and PMs. Furthermore, we plan to analyze the system models under more general stochastic processes, such as the Markovian Arrival Process (MAP) and the Markovian Service Process (MSP).

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.