Study QoS Optimization and Energy Saving Techniques in Cloud, Fog, Edge, and IoT

With the increase in service users' demands for high quality of service (QoS), more and more efficient service computing models are being proposed. The development of cloud computing, fog computing, and edge computing brings a number of challenges, e.g., QoS optimization and energy saving. We present a comprehensive survey on QoS optimization and energy saving in cloud computing, fog computing, edge computing, and IoT environments. We summarize the main challenges and analyze the corresponding solutions proposed by existing works. This survey aims to help readers gain a deeper understanding of the concepts of the different computing models and to study the techniques of QoS optimization and energy saving in these models.


Introduction
With the development of the Internet, more and more computing techniques have been developed, and an increasing amount of data needs to be processed. The growth of users' requirements has driven the development of different types of computing models, such as cloud computing, fog computing, and edge computing.
Cloud computing is an early computing model that has made great contributions to data processing. It provides convenient and quick network access to shared configurable resources, such as networks and servers. In addition, provisioning and publishing these resources do not require much administration or interaction with service providers [1]. The structure of cloud computing is shown in Figure 1.
Due to the development of the IoT and people's increasing needs, IoT systems based on cloud computing face some limitations: cloud computing cannot perform well under large-scale or heterogeneous conditions [3]. Therefore, a new computing model called fog computing has been developed on the basis of cloud computing. Compared with cloud computing, the main advantage of fog computing is that it extends cloud resources to the network edge, which facilitates the management of resources and services [4]. The structure of fog computing is shown in Figure 2.
Edge computing allows operations to be performed at the edge of a network [2]. Edge computing refers to all the computing and network resources between data sources and cloud data centers. In edge computing, the flow of computing is bidirectional: things at the edge can both consume and produce data, so they can not only request services from the cloud but also carry out computing jobs for the cloud [2]. The structure of edge computing is shown in Figure 3. The most popular embodiment of edge computing is mobile edge computing (MEC), which performs computation-intensive and delay-sensitive tasks for mobile devices. Its core idea is to pool the large amount of idle computing power and storage resources located at the edge of a network. The European Telecommunications Standards Institute was the first to define it as a computing model. MEC provides the capabilities of information technology and cloud computing at the network edge. The IoT is created by the diffusion of sensors, actuators, and other devices in a communication-driven network. The development of wireless technologies, such as wireless sensor networks and actuator nodes, has promoted the development of IoT technology. With the development of the IoT, its applications have gradually expanded to cover increasingly wide domains; throughout, its aim has been to enable computers to perceive information [6].
This paper investigates the important papers related to these computing models. For each paper, we point out the problems it aims to solve and introduce the solutions it proposes. The main contributions of this paper are as follows: (1) a comprehensive survey on the techniques of QoS optimization and energy saving in different computing models, (2) a classification of papers according to the problems solved by the reviewed works, and (3) a comparison and summary of the main features of each type of paper. The remainder of this paper is structured as follows: Section 2 studies five QoS optimization and energy saving techniques under different computing models, and Section 3 concludes this paper.

QoS Optimization and Energy Saving Techniques in Different Computing Models
In this section, we introduce the main works of QoS optimization and energy saving techniques in different computing models. We categorize these works in terms of the means they use to achieve the objective of QoS optimization and energy saving, which are (1) quality of service (QoS) guarantee or service-level agreement (SLA) assurance, (2) resource management and allocation, (3) scientific workflow execution, (4) server optimization, and (5) load balancing.

QoS Guarantee or SLA Assurance.
Improving QoS or reducing SLA violations can effectively guarantee the transmission bandwidth, reduce the transmission delay, and reduce the packet loss rate of data. Striking a balance between QoS and limited resources can achieve energy saving.

Cloud Computing.
Mazzucco et al. [7] let cloud service providers obtain the maximum benefit by reducing power consumption. In addition, they introduced and evaluated policies for dynamically deciding how many servers to switch on, which can optimize users' experience while consuming the least amount of power. He et al. [8] proposed a service-based system supporting keyword search, in which different search keywords represent different tasks. This method can help unprofessional service users build service-oriented systems. Sun et al. [9] proposed a cloud service selection method to measure and aggregate the nonlinear relationships between criteria, and designed a priority-based framework to determine the criteria relationships and weights when historical information is insufficient. Mazzucco and Dyachuk [10] were also committed to letting cloud service providers obtain the largest profits. They proposed a dynamic allocation strategy for switching servers on and off. The strategy not only enables users to receive good service but also reduces power consumption. The number of live servers determines the state of the system, but starting or shutting down a server cannot be done in an instant, so it is important to take this time into account. Given the short time required for a server switch, formula (1) [10] represents the cost of changing the number of running servers per unit time, where t represents the observation time, Δn represents the number of servers whose state changes over time, d_i represents the cost of the state change of a hardware component, e_3 represents the energy consumed in a unit of time to change the state, k represents the average time to change the state of a server, and l represents the number of components. To give users a good experience, the paper further uses a forecasting method to accurately predict users' time-varying needs.
Mazzucco et al. [7] and Mazzucco and Dyachuk [10] both explore strategies for reducing the power cost of running a data center by changing the on-off state of servers. Both strategies maximize users' experience and save energy at the same time. Their difference is that Mazzucco and Dyachuk [10] assume that it is impossible to accurately predict the changes in users' needs over time; hence, compared with [7], the strategy proposed in [10] is fault-tolerant. He et al. [11] proposed three QoS-aware service selection methods that can compose multitenant service-based systems.
These three methods achieve three degrees of multitenant maturity, which is more efficient than the traditional single-user approach. Sun et al. [12] proposed a unified semantic model to describe cloud services. The model extends the basic structure of the Unified Service Description Language and defines a transaction module to model the rating system for cloud services from various perspectives, which improves the model's ability to rank services. In addition, an annotation system is put forward to enrich the language's expressiveness. Wang et al. [13] proposed a fault-tolerant strategy based on multitenant service criticality, which provides redundancy for key component services and evaluates the criticality of each component service to determine the optimal fault-tolerant policy. Therefore, the quality of the multitenant-based service system can be guaranteed. Mustafa et al. [14] leveraged the notion of workload consolidation to improve energy efficiency by placing incoming jobs on as few servers as possible. The concept of SLA is also introduced to minimize the total number of SLA violations. Because a change in workload changes the required CPU utilization over time, an integral (formula (2) [14]) is used to represent the total energy E consumed by the operation of a server S:

E = ∫ P(u(t)) dt,

where P is the power consumed by the server as a function of CPU utilization u at time t. Bi et al. [15] first established an architecture that can administer itself in cloud data centers. The architecture is suitable for multi-tier web application services and includes a virtualization mechanism. Then, a hybrid queuing model is proposed to decide the number of virtual machines (VMs) in each tier of the application service environment. Formula (3) [15] represents the local profit that can be made by the ith virtualized application service environment.
Finally, a constrained optimization problem and a hybrid heuristic optimization algorithm are proposed; both can earn more revenue and meet the requirements of different customers:

Profit(E) = Revenue(E) − Penalty(E) − Loss(E) − Cost(E),

where Revenue(E), Penalty(E), Loss(E), and Cost(E), respectively, represent the total benefit, penalty, loss, and cost of the VMs.
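The two cost models above can be sketched in a few lines. The following is an illustrative sketch, not the authors' implementation: server energy as the integral of power over CPU utilization (cf. formula (2) [14]) and the local profit of a virtualized application service environment (cf. formula (3) [15]). The linear power model and all parameter values are assumptions.

```python
def server_energy(samples, p_idle=100.0, p_peak=250.0):
    """Approximate E = integral of P(u(t)) dt with the trapezoidal rule.

    samples: list of (t, u) pairs, t in seconds, u = CPU utilization in [0, 1].
    Assumes the common linear model P(u) = p_idle + (p_peak - p_idle) * u,
    which is an illustrative choice, not taken from [14].
    """
    power = lambda u: p_idle + (p_peak - p_idle) * u
    energy = 0.0
    for (t0, u0), (t1, u1) in zip(samples, samples[1:]):
        energy += 0.5 * (power(u0) + power(u1)) * (t1 - t0)
    return energy  # joules

def local_profit(revenue, penalty, loss, cost):
    """Profit(E) = Revenue(E) - Penalty(E) - Loss(E) - Cost(E)."""
    return revenue - penalty - loss - cost

# A server held at 50% utilization for 10 s:
e = server_energy([(0, 0.5), (10, 0.5)])
```

With the assumed power model, 10 s at 50% utilization consumes 1750 J; the profit function simply nets out the penalty, loss, and cost terms.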
Singh et al. [16] proposed a technology named STAR, which can self-manage resources in the cloud computing environment and reduce SLA violations, thereby improving the payment efficiency of cloud services. Beloglazov and Buyya [17] proposed a system to manage energy in the cloud data center. By continuously consolidating VMs and dynamically redistributing them, the system saves energy while providing a high QoS level. Guazzone et al. [18] proposed an automatic resource management system (see Figure 4) that provides guaranteed QoS levels and reduces energy consumption. The resource manager of the framework in Figure 4 combines virtualization technologies and control-theoretic technologies. Virtualization technologies deploy each application in an independent VM, and control-theoretic technologies realize the automatic management of computing performance and energy consumption. In addition, the resource manager consists of several independent components named the Application Manager, Physical Machine Manager, and Migration Manager. Unlike traditional static methods, this method adapts dynamically to changing workloads and achieves remarkable results in reducing QoS violations. Sun et al. [19] established a model to simplify cloud resource allocation decisions and realize the independent allocation of resources; the optimal resource configuration can be obtained, so QoS requirements can be well met. Siddesh and Srinivasa [20] explored the problems of dynamic resource allocation and SLA assurance. They proposed a framework to deal with heterogeneous workload types by dynamically planning computing capacity and assessing risks. The framework uses scheduling methods to reduce SLA violation risks and maximize revenue in resource allocation.
Garg et al. [21] proposed a resource allocation strategy for dynamic VM allocation. The strategy can improve resource utilization, increase providers' profits, and reduce SLA violations. Jing et al. [22] proposed a new dynamic allocation technique using a hybrid queuing model, meeting customers' different performance requirements by providing virtualized resources to each tier of virtualized application services. All these methods can reasonably configure resources in the cloud data center, improve system performance, reduce the additional costs of using resources, and meet the required QoS.
Qi et al. [23] proposed a QoS-aware VM scheduling strategy named QVMS. First, the scheduling problem is transformed into a multi-objective optimization problem, and then the optimal VM migration scheme is found using a genetic algorithm. The scheduling strategy can effectively manage resources in cyber-physical systems, thus reducing energy consumption and improving QoS levels. Qi et al. [25] proposed a service recommendation strategy that considers the time factor to improve traditional locality-sensitive hashing. The strategy emphasizes the influence of dynamic factors on QoS and the protection of user privacy. Table 1 shows a summary of the abovementioned works. Solving the problems in Table 1 can improve QoS in the cloud computing environment. Server management refers to dynamically deciding which servers to switch on or off. Workload consolidation refers to combining work to save energy.
VM management refers to the reasonable scheduling or consolidation of VMs to achieve better performance. Self-management refers to resources managing themselves, which achieves higher efficiency. Resource management refers to the correct allocation of resources to reduce waste. Service management is about making reasonable service choices [26].
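The genetic-algorithm step of QVMS [23] can be illustrated with a deliberately simplified, single-objective sketch; the real strategy is multi-objective and migration-aware. Here a chromosome maps VMs to hosts, and the fitness counts active hosts, so fewer active hosts stands in for lower energy. All parameters are invented.

```python
import random

def fitness(placement):
    """Number of active hosts used by a placement (to be minimized)."""
    return len(set(placement))

def evolve(n_vms, n_hosts, pop_size=20, gens=50, seed=0):
    """Tiny elitist GA: keep the best half, mutate one gene per survivor."""
    rng = random.Random(seed)
    pop = [[rng.randrange(n_hosts) for _ in range(n_vms)]
           for _ in range(pop_size)]
    for _ in range(gens):
        pop.sort(key=fitness)                      # best placements first
        survivors = pop[: pop_size // 2]           # selection (elitism)
        children = []
        for parent in survivors:
            child = parent[:]
            child[rng.randrange(n_vms)] = rng.randrange(n_hosts)  # mutation
            children.append(child)
        pop = survivors + children
    return min(pop, key=fitness)
```

Without capacity constraints this toy objective simply consolidates VMs onto ever fewer hosts; a faithful QVMS-style version would add capacity limits and further objectives such as migration cost.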

Fog Computing.
Gu et al. [27] used fog computing to process the large amount of data generated by medical devices and built the Fog Computing Supported Medical Cyber-Physical System (FC-MCPS). To reduce the cost of FC-MCPS, they jointly studied base station association, task assignment, and VM placement. The problem is modeled as a mixed-integer linear program (MILP), and a two-stage heuristic algorithm based on linear programming (LP) is proposed to solve it. Ni et al. [28] proposed a resource allocation approach based on fog computing, which enables users to select resources independently while taking into account the price and the time required to finish the job. Formula (4) [28] defines the credibility BCre_ij of resource R_j as seen by user i after the user interacts with the resource:

BCre_ij = ω_1 λ_resp + ω_2 c_exec + ω_3 η_reboot + ω_4 μ_rel,

where ω_k ∈ [0, 1] and Σ_{k=1}^{4} ω_k = 1, with the weights determined by the user or the actual situation, and λ_resp, c_exec, η_reboot, and μ_rel are, respectively, the response speed of the corresponding index service, the execution efficiency, the reboot speed, and the reliability.

Edge Computing.
Wei et al. [29] proposed a unified framework for sustainable edge computing to save energy, including distributed and renewable energy. The architecture combines the energy supply system with edge services, which makes full use of renewable energy and provides better QoS. Lai et al. [30] proposed an optimized allocation method for edge users. The method not only maximizes the amount of resources allocated to users but also considers users' dynamic QoS levels, which makes the edge user allocation problem more general and improves the quality of experience.

MEC.
Xu et al. [31] used blockchain to improve traditional crowdsourcing. First, they proposed a mobile crowdsourcing framework that uses blockchain technology to protect user privacy. Then, they used a dynamic-programming-based clustering algorithm to classify requesters. Finally, they generated service policies that balance profit and energy consumption.

IoT.
Rolik et al. [32] proposed a method to build an IoT infrastructure framework based on microclouds. The method can help users use resources rationally, reduce the cost of managing the infrastructure, and improve consumers' quality of life. He et al. [33] proposed a dynamic network slicing strategy. The network slices can be dynamically adjusted according to time-varying resource demands. This method can improve the utilization of the underlying resources and better meet different QoS demands. Yao and Ansari [34] proposed an algorithm to determine the number of VMs to rent and to control the power supply, so that the cost of the system is minimized and QoS is improved. Formula (5) [34] expresses the delay requirement of QoS: the total delay, composed of the wireless transmission delay and the fog processing delay, must not exceed the computation deadline of each task:

t^c_i + t^w_i ≤ D_i, ∀ i ∈ N,

where c and w, respectively, denote fog processing and wireless transmission, i denotes a location, t^c_i represents the processing delay, t^w_i represents the wireless transmission delay, D_i denotes the deadline, and N denotes the set of locations.
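The delay constraint of formula (5) [34] can be checked directly. A minimal sketch with illustrative task data:

```python
def meets_deadlines(tasks):
    """Check formula (5): t_c + t_w <= deadline at every location.

    tasks: list of (t_c, t_w, deadline) tuples, one per location, where
    t_c is the fog processing delay and t_w the wireless transmission delay.
    """
    return all(t_c + t_w <= d for t_c, t_w, d in tasks)

# Two locations, both within their deadlines:
feasible = meets_deadlines([(5, 3, 10), (2, 2, 4)])
```
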

Resource Management and Allocation.
Rational allocation of resources is an effective means to save energy.

Cloud Computing.
Wang et al. [35] introduced an allocation method based on distributed multiagent technology to allocate VMs to physical machines. The method realizes VM consolidation while considering migration costs, and a VM migration mechanism based on local negotiation is proposed to avoid unnecessary VM migration costs. Hassan et al. [37] formulated the general problem and proposed a heuristic algorithm with optimized parameters. Under this formulation, dynamic resource allocation can be made to meet the QoS requirements of applications, and the cost of dynamic resource allocation can be minimized. Wu et al. [38] proposed a scheduling algorithm based on dynamic voltage and frequency scaling (DVFS) in cloud computing. The algorithm can allocate resources for executing tasks and realize a low-power network infrastructure. Compared with other schemes, this scheme saves more energy without sacrificing the performance of executing operations.
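The DVFS idea behind [38] can be sketched under the common cubic power assumption (P ≈ k·f³), which is an illustration rather than the paper's model: running a task at the lowest frequency that still meets its deadline reduces its energy roughly quadratically.

```python
def task_energy(cycles, freq, k=1e-27):
    """Energy of executing `cycles` CPU cycles at frequency `freq` (Hz).

    With dynamic power P = k * f^3 and runtime t = cycles / f,
    E = P * t = k * cycles * f^2. The constant k is illustrative.
    """
    return k * cycles * freq ** 2

def slowest_feasible_freq(cycles, deadline, freqs):
    """Pick the lowest available frequency that still meets the deadline."""
    ok = [f for f in freqs if cycles / f <= deadline]
    return min(ok) if ok else None
```

For a 10^9-cycle task with a 1 s deadline and levels {0.5, 1, 2} GHz, the sketch picks 1 GHz; under the assumed model, running at 2 GHz instead would cost four times the energy.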
Table 1: Summary of the abovementioned works on QoS guarantee and SLA assurance in cloud computing.

Self-management:
- STAR [16]: reduces SLA violations and improves the payment efficiency of cloud services.
- A dynamic resource management system [18]: self-manages the resources of cloud infrastructures to provide appropriate QoS and fits changing workloads dynamically.

Resource management:
- A model that realizes the independent allocation of resources [19]: obtains the optimal resource configuration, meets QoS requirements, and provides economical cloud resources.
- A dynamic resource allocation strategy [20, 21]: reduces SLA violations and maximizes revenue and resource utilization in the cloud.
- A hybrid queuing model [22]: reasonably configures the resources in the cloud data center, improves system performance, reduces the additional cost of using resources, meets the required QoS, and provides virtual resources to each tier of virtualized application services.

Service management:
- A unified semantic model that describes cloud services [12]: improves the model's service-ranking ability and enriches the language's expressiveness.
- A recommendation service strategy [25]: emphasizes the influence of time factors on QoS and improves traditional locality-sensitive hashing to protect users' privacy.
- A cloud service selection method using the fuzzy measure and Choquet integral, with a priority-based framework [9]: selects services when historical information is insufficient to determine the criteria relationships and weights.
- Three QoS-aware service selection methods for multitenant service-based systems [11]: achieve three degrees of multitenant maturity, more efficient than the traditional single-user approach.
- A fault-tolerant strategy based on multitenant service criticality [13]: guarantees the quality of the multitenant-based service system.
- A service-based system supporting keyword search [8]: effectively helps system engineers unfamiliar with service-oriented architecture to build service-oriented systems.

Sarbazi-Azad and Zomaya [45] used two task consolidation heuristics to save energy: MaxUtil, which aims to utilize resources better, and Energy-Conscious Task Consolidation, which aims to reduce energy consumption. These two methods promote the concurrent execution of multiple tasks and improve energy efficiency. Hsu et al. [46] proposed a task consolidation technique to minimize energy consumption. Formula (6) [46] defines the energy consumption of VM V_i from time t_0 to t_m in the cluster, and formula (7) [46] gives the energy consumption of the whole cluster, where E_t is the energy consumption in a unit of time and n is the number of VMs in the cluster. The technique merges tasks in virtual clusters, and once a task migration happens, the energy cost model takes the network latency into account. Sarbazi-Azad and Zomaya [45] and Hsu et al. [46] both maximize the benefit of cloud resources by using task consolidation techniques: [45] uses a greedy algorithm called MaxUtil, while [46] takes into account the network latency associated with task migration; as a result, [46] achieves a 17% improvement over MaxUtil. Hsu et al. [47] proposed an energy-aware task consolidation technique. Based on the characteristics of most cloud systems, it restricts CPU usage to 70% to administer task consolidation among virtual clusters. This technique is very effective in reducing the energy consumed in cloud systems by merging tasks. Panda and Jana [48] proposed a multi-criteria task consolidation algorithm, which is more energy efficient because it considers not only the job processing time but also the utilization rate of the VMs. Wang and Su [39] proposed a resource allocation algorithm to deal with the wide range of communication between nodes in the cloud environment. The algorithm uses fuzzy pattern recognition to dynamically assign jobs to nodes according to their computing capability and storage factors, and it reduces traffic when allocating resources because it uses a dynamic hierarchy. Lin et al. [40] proposed a dynamic auction approach for resource allocation, which ensures that even when there are many users and resources, providers obtain reasonable profits and computing resources are allocated correctly. Yazir et al. [41] proposed a new method for managing resources dynamically and autonomously: first, resource management is split into jobs, each executed by autonomous nodes; second, the autonomous nodes configure resources using the method called PROMETHEE. Krishnajyothi [36] proposed a framework that implements parallel task processing to solve the problem of low efficiency when large tasks are submitted. Compared with static frameworks, this framework can dynamically allocate VMs, thus reducing costs and task processing time. Lin et al. [42] proposed a method to allocate resources dynamically using thresholds; the thresholds allow it to optimize resource reallocation, improve resource usage, and reduce cost. Xu et al. [43] proposed a data placement strategy named IDP for the data generated by IoT devices, which places data reasonably while protecting its privacy. Jo et al. [44] proposed a computation offloading framework for 5G networks. The framework transfers the computing burden to the cloud, thus reducing clients' computing load and the communication cost. Table 2 shows a summary of the abovementioned works. The problems of resource allocation and management in cloud computing can be divided into the categories in Table 2. VM management is about the reasonable configuration of VMs. Resource allocation represents the dynamic and flexible allocation of resources. Task consolidation refers to combining tasks to save energy and improve efficiency.
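The 70% CPU ceiling of [47] suggests a simple greedy packing. A simplified sketch, not the paper's algorithm (real systems also weigh migration cost and network latency):

```python
def consolidate(task_loads, cap=0.70):
    """Greedily pack tasks onto the fewest VMs under a CPU ceiling.

    task_loads: CPU demand of each task, in [0, 1].
    cap: per-VM utilization ceiling (the 70% rule of [47]).
    Returns the resulting CPU load of each active VM.
    """
    vms = []  # current CPU load of each active VM
    for load in sorted(task_loads, reverse=True):  # place largest tasks first
        for i, used in enumerate(vms):
            if used + load <= cap:  # fits under the ceiling: reuse this VM
                vms[i] += load
                break
        else:
            vms.append(load)  # no VM fits: activate a new one
    return vms
```

Fewer active VMs means more machines can be powered down, which is the energy-saving lever behind consolidation.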

Fog Computing.
Yin et al. [49] established a new task scheduling model based on containers. To ensure that jobs finish on time, a task scheduling algorithm is developed; it also optimizes the number of tasks that can run concurrently on fog nodes. The paper further proposes a redistribution mechanism to shorten task delays. These methods are very effective in reducing task delays. Aazam and Huh [50] established a framework to administer resources effectively in fog computing. Considering that there are various types of objects and devices, the connections between them may be volatile, so a method for predicting and administering resources is proposed; it accounts for the fact that any object or device can stop using resources at any time. Cuong et al. [5] studied the joint resource allocation and carbon footprint minimization problem in fog data centers. Formula (8) [5] denotes the energy consumption of the servers, and a distributed algorithm is proposed to solve the large-scale optimization problem:

P(y) = P_idle + (P_peak − P_idle) · (κy/C),

where P(y) represents the power required by the servers in a data center, y represents the video stream, κ denotes a conversion factor that converts the video stream into workload, C represents the data center's load capacity, and P_idle and P_peak, respectively, represent the idle and peak power of the servers. Jia et al. [51] studied the problem of computing resource allocation in a three-tier fog computing network. The resource allocation problem is first transformed into a bilateral matching optimization problem, and then a bi-matching approach is proposed, which improves system performance and obtains higher cost efficiency. Zhang et al. [52] proposed a joint optimization framework for fog computing to allocate the finite computing resources of fog nodes. The framework achieves the optimal allocation and effectively improves network performance. Tan et al. [53] presented a method to allocate computing and communication resources. The method transfers computing jobs to the remote cloud and nodes, simplifying edge nodes' computation and reducing their computing energy. Vasconcelos et al. [54] developed a platform to allocate resources accessible to client devices in the fog computing environment, allocating the resources of devices near the host to meet applications' needs for rapid responses to computing resource requests. Aazam et al. [55] presented a method to estimate and manage resources in fog computing, based on the fluctuation of the customer abandonment probability, the type and price of the service, and so on. Table 3 shows a summary of the abovementioned works. The problems in Table 3 are also derived from resource allocation and management problems. Task allocation represents the scheduling and redistribution of tasks. Resource allocation is still about the dynamic and flexible allocation of resources. Low latency refers to taking a short time to configure and manage resources, which can improve efficiency.
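Formula (8) [5] is a power model that is linear in the normalized workload. A minimal sketch with illustrative parameter values:

```python
def server_power(y, kappa, C, p_idle, p_peak):
    """P(y) = P_idle + (P_peak - P_idle) * (kappa * y / C).

    y: video stream volume; kappa: conversion factor from stream to
    workload; C: data-center load capacity; p_idle/p_peak: idle and
    peak server power. Values passed in below are illustrative.
    """
    load = kappa * y / C  # normalized workload in [0, 1]
    assert 0.0 <= load <= 1.0, "workload exceeds data-center capacity"
    return p_idle + (p_peak - p_idle) * load

# A data center at half capacity draws halfway between idle and peak power:
p = server_power(y=50, kappa=1.0, C=100, p_idle=100.0, p_peak=300.0)
```
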

Edge Computing.
Tung et al. [56] proposed a new market-based framework for resource allocation. The resources come from edge nodes (ENs) with limited heterogeneous capabilities and are allocated to multiple competing services at the network edge. Computing a market equilibrium solution by pricing the ENs reasonably yields the maximum utilization of edge computing resources. Xu et al. [57] proposed a strategy that jointly optimizes offloading and privacy protection. The strategy first shifts tasks to improve the resource utilization of resource-limited edge cells and then balances QoS performance against privacy protection to achieve joint optimization.
Xu et al. [58] proposed an offloading strategy for edge computing in 5G networks that uses blockchain technology. The optimal strategy is obtained using a balanced offloading method. It solves the problem of data loss under transmission delay, which is caused by the uneven resource requirements of user equipment. Xu et al. [59] proposed a computation offloading method named EACO to reduce energy consumption in smart computing models. Figure 5 shows the architecture of smart edge computing, where the shortest path is used to offload tasks. EACO uses genetic algorithms to reduce the energy consumed in operating edge computing nodes and to improve the efficiency of performing complex computing tasks. Xu et al. [60] proposed a computation offloading strategy for edge computing that protects the privacy of connected-vehicle networks. They first analyzed the privacy conflicts of tasks and then designed the communication route to select routing vehicles, achieving multi-objective optimization. Yeting et al. [61] proposed a unique resource allocation mechanism. The mechanism takes each individual task, rather than the whole service, as the basis for resource allocation. It reduces the packet loss rate and saves energy by offloading services.

Table 2: Summary of the abovementioned works on resource management and allocation in cloud computing.

VM management:
- An allocation method based on distributed multiagent technology [35]: consolidates VMs onto physical machines and reduces the overall energy cost.
- A framework that implements parallel task processing [36]: dynamically allocates VMs for large tasks.

Resource allocation:
- A general problem formulation and a heuristic algorithm with optimized parameters [37]: dynamically allocates resources according to QoS requirements and realizes energy saving by optimizing the number of servers.
- A scheduling algorithm based on dynamic voltage and frequency scaling [38]: ensures the performance of executing jobs while implementing green computing.
- A fuzzy pattern recognition technology [39]: reduces traffic when allocating resources and uses a dynamic hierarchy.
- A dynamic auction approach [40]: guarantees providers' profits and allocates resources correctly when there are large numbers of users and resources.
- PROMETHEE [41]: allows node agents to decompose and execute tasks autonomously to improve the flexibility of resource allocation.
- A method using thresholds [42]: optimizes resource reallocation, improves resource usage, reduces cost, and studies resource allocation strategies at the application level.
- The IoT-oriented data placement (IDP) method [43]: protects data privacy, allocates resources reasonably, and focuses on the placement of IoT data.
- A computation offloading framework for 5G networks [44]: reduces clients' computing load and the communication cost.

Task consolidation:
- Two task consolidation heuristics, MaxUtil and Energy-Conscious Task Consolidation [45]: promote the concurrent execution of multiple tasks and improve energy efficiency.
- A task consolidation technique aimed at energy saving [46, 47]: reduces the power consumption of cloud systems, protects data privacy, allocates resources reasonably, limits CPU usage, and merges tasks in virtual clusters.
- An algorithm with several criteria [48]: takes into account both the job processing time and the VM utilization rate, and dramatically reduces energy consumption compared with state-of-the-art works.

MEC.
Chen et al. [62] studied the multi-user computation offloading problem in a MEC environment with multi-channel wireless interference, and developed a distributed computation offloading algorithm that performs well even with a large number of users. Gao et al. [63] built a quadratic binary program for assigning tasks in the mobile cloud computing environment and presented two heuristic algorithms to obtain the optimal solution; both effectively solve the task assignment problem. Xu et al. [64] proposed an offloading method using blockchain technology. It can guard against the loss of data when offloading tasks in edge computing, and it solves the problem of disproportionate resource requests caused by the limited capacity of edge computing equipment during task transfer. Yifei et al. [65] proposed a model-free reinforcement learning framework to solve the computation offloading problem; the model can be applied to computation offloading with time-varying computing requests.
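The basic offloading trade-off studied in these works can be illustrated with a generic energy comparison; this is not the algorithm of [62] or [63], and all device parameters are invented:

```python
def should_offload(cycles, data_bits, bandwidth, e_per_cycle, tx_power):
    """Offload iff transmitting the task costs the device less energy
    than computing it locally (a deliberately simplified model).

    cycles: CPU cycles the task needs; data_bits: input size to transmit;
    bandwidth: uplink rate (bits/s); e_per_cycle: device energy per
    CPU cycle (J); tx_power: radio transmit power (W).
    """
    e_local = cycles * e_per_cycle        # energy to compute on the device
    t_tx = data_bits / bandwidth          # time spent transmitting the input
    e_offload = tx_power * t_tx           # device energy while transmitting
    return e_offload < e_local
```

Real MEC schemes additionally account for edge-server queueing, result download, deadlines, and interference among users, which is what makes the multi-user problem in [62] hard.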

2.2.5.
IoT. Barcelo et al. [66] expressed the service allocation problem [67] as a minimum-cost mixed flow problem that can be solved by linear programming. Solving this service allocation problem addresses unbalanced network load and end-to-end service delay, and it also resolves the excessive power consumption brought by the centralized cloud architecture. Angelakis et al. [68] assigned service resource demands to the heterogeneous network interfaces of devices, so that a large number of services can use more heterogeneous network interfaces.

A framework to administrate resources and a method to predict and administrate resources [50]: assists service providers in predicting the amount of available resources for different types of service customers, and handles objects or devices withdrawing from resource utilization at any time
A bi-matching approach [51]: improves system performance and obtains higher cost efficiency
A framework for joint optimization [52]: optimizes resource allocation and improves network performance
A method to allocate computing and communication resources [53]: allocates computing and communication resources, transfers computing jobs to the remote cloud and nodes, simplifies edge nodes' computing, and saves computing energy
A platform to allocate resources accessible to client devices [54]: enables rapid response to computing resource demands and allocates the resources of devices near the host
A framework to manage resources [55]: considers the fluctuation of the customer abandonment probability and the service types
Low latency. A distributed algorithm [5]: solves a wide-range optimization problem and allocates resources jointly

Li et al. [69] proposed a communication framework for 5G and studied the problem of allocating power and channels, so that the signal data in the channels remain available and the total energy efficiency is maximized.
Formula (9) [69] shows how to calculate the energy efficiency of a system:

EE = \sum_{i=1}^{M} \sum_{k=1}^{K} EE^{S}_{i,k} + \sum_{j=1}^{N} \sum_{k=1}^{K} EE^{A}_{j,k},  (9)

where EE^{S}_{i,k} and EE^{A}_{j,k}, respectively, denote the energy efficiency of sensor S_i and actuator A_j on channel C_k. The sets of sensors, actuators, and channels are, respectively, represented as S = {S_1, S_2, ..., S_M}, A = {A_1, A_2, ..., A_N}, and C = {C_1, C_2, ..., C_K}.
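Assuming the system energy efficiency in formula (9) is the sum of the per-channel energy efficiencies of all sensors and all actuators (an assumption based on the surrounding definitions, not a verified transcription of [69]), it can be computed as in this small sketch:

```python
def system_energy_efficiency(ee_sensor, ee_actuator):
    """Total energy efficiency, assuming formula (9) sums per-channel terms.
    ee_sensor[i][k]   : energy efficiency of sensor S_i on channel C_k
    ee_actuator[j][k] : energy efficiency of actuator A_j on channel C_k"""
    return (sum(sum(row) for row in ee_sensor)
            + sum(sum(row) for row in ee_actuator))

# One sensor on two channels, two actuators on two channels (toy values).
total = system_energy_efficiency([[1.0, 2.0]], [[0.5, 0.5], [1.0, 0.0]])
```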
Liu et al. [70] studied the problem of allocating resources efficiently in wireless-powered IoT. In this method, users are first grouped onto accessible channels, and then the power distribution of the users grouped on the same channel is optimized to improve network throughput. This method can allocate finite resources to a large group of users. Ejaz and Ibnkahla [71] proposed a multiband resource allocation framework for cognitive 5G IoT. In the highly dynamic environment of the IoT, the multiband method can manage resources more flexibly and reduce energy consumption further. In addition, a multilevel reconstruction approach is proposed to allocate resources reasonably for applications with different QoS needs. Colistra et al. [72] proposed a distributed, optimal protocol to allocate resources in heterogeneous IoT. Because this protocol adapts well to changes in network topology, it can distribute resources evenly among nodes. Jian et al. [73] proposed a multilevel resource allocation algorithm for IoT communication using advanced technology. The algorithm uses a hierarchical structure and achieves a fast data processing rate and very low latency in both saturated and unsaturated environments.
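A classical way to distribute a power budget across channels in allocation problems like those above is water-filling: channels with less noise receive more power. The sketch below is a generic illustration (bisection on the water level), not the specific algorithm of [70] or [71].

```python
def water_fill(noise, total_power, iters=60):
    """Water-filling sketch: allocate total_power over channels with the
    given noise levels, p_i = max(0, mu - noise_i), where the water level
    mu is found by bisection so that the allocations sum to total_power."""
    lo, hi = min(noise), max(noise) + total_power
    for _ in range(iters):
        mu = (lo + hi) / 2
        used = sum(max(0.0, mu - n) for n in noise)
        if used > total_power:
            hi = mu   # water level too high, lower it
        else:
            lo = mu   # budget not exhausted, raise it
    return [max(0.0, mu - n) for n in noise]

# The cleanest channel (lowest noise) receives the most power.
alloc = water_fill([0.1, 0.5, 1.0], 2.0)
```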
Zheng and Liu [74] proposed a new algorithm to allocate bandwidth dynamically for controlling remote computers in the IoT. This method can reduce the signal reconstruction error under the same bandwidth and make the bandwidth allocation of the IoT more reasonable. Gai and Qiu [75] used reinforcement learning mechanisms to allocate resources so as to achieve a high Quality of Experience. This method can effectively solve the resource allocation problems caused by the mismatch between service quality and complex service provisioning conditions in the IoT. Table 4 shows a summary of the above works. The problem in Table 4 represents the realization of dynamic and flexible allocation of resources; the resources here can be channels, bandwidth, and power.
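The reinforcement learning idea in [75] can be illustrated with a minimal stateless Q-learning loop that learns which resource or provider yields the best expected QoE reward. The bandit-style setting, reward values, and hyperparameters are illustrative assumptions, not the authors' formulation.

```python
import random

def q_learning_allocate(rewards, episodes=2000, alpha=0.1, eps=0.1, seed=0):
    """Tabular Q-learning sketch for a one-step allocation choice: the agent
    repeatedly picks a resource (eps-greedy), observes a noisy QoE reward,
    and updates its value estimate (toy stateless setting)."""
    rng = random.Random(seed)
    q = [0.0] * len(rewards)
    for _ in range(episodes):
        if rng.random() < eps:                       # explore
            a = rng.randrange(len(rewards))
        else:                                        # exploit best estimate
            a = max(range(len(rewards)), key=lambda i: q[i])
        r = rewards[a] + rng.gauss(0, 0.01)          # noisy QoE feedback
        q[a] += alpha * (r - q[a])                   # value update
    return q

# Resource 1 has the highest true QoE reward and should be identified.
q = q_learning_allocate([0.2, 0.9, 0.5])
best = max(range(len(q)), key=lambda i: q[i])
```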

Scientific Workflow Execution.
Executing scientific workflows efficiently, especially in heterogeneous environments, can reduce resource waste and energy costs. Efficient scientific workflow execution can be achieved by reasonably allocating resources and dynamically deploying VMs.

Cloud Computing.
Xu et al. [76] proposed a resource allocation method called EnReal to solve the problem of energy consumption, in which the dynamic deployment of VMs is adopted to execute scientific workflows. Bousselmi et al. [77] proposed a scheduling method based on energy perception for executing scientific workflows in cloud computing. First, a workflow partitioning algorithm for energy minimization is presented, which can achieve high parallelism without huge energy consumption. Then, a heuristic algorithm based on cat swarm optimization is proposed to schedule the created partitions. The algorithm can minimize the total energy consumption and the execution time of workflows. Sonia et al. [78] proposed a workflow scheduling method with several objectives based on a hybrid particle swarm optimization algorithm. In addition, a method for dynamically scaling voltage and frequency is proposed, which can make the processors work at any voltage level so as to minimize the energy consumption in the process of workflow scheduling. Both Bousselmi et al. [77] and Sonia et al. [78] use scheduling methods to execute scientific workflows and study the problem of energy consumption. The difference is that Bousselmi et al. [77] focus on computation-intensive tasks, while Sonia et al. [78] focus on workflow scheduling on heterogeneous computing systems.
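The energy rationale behind DVFS methods such as that of Sonia et al. [78] follows from the standard CMOS dynamic power model P = C * V^2 * f: lowering the supply voltage (with a matching frequency drop) cuts the dynamic energy of a fixed workload quadratically. A minimal sketch with illustrative constants, not values from [78]:

```python
def dvfs_energy(cycles, voltage, freq, capacitance=1e-9):
    """Dynamic energy of a workload under voltage/frequency scaling.
    Uses P = C * V^2 * f and t = cycles / f, so E = C * V^2 * cycles,
    which is why lowering V saves energy even though execution slows."""
    power = capacitance * voltage ** 2 * freq   # dynamic power draw (W)
    time = cycles / freq                        # execution time (s)
    return power * time                         # energy (J)

# Halving the voltage (with a matching frequency drop) quarters the
# dynamic energy of the same 10^9-cycle workload.
e_high = dvfs_energy(1e9, 1.2, 2e9)
e_low = dvfs_energy(1e9, 0.6, 1e9)
```

This quadratic dependence on voltage is what lets a scheduler trade longer execution time for substantially lower energy, subject to workflow deadlines.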
Cao [79] established a scheduling algorithm for scientific workflows with the objective of energy saving. This algorithm enables service providers to gain high profits and reduces users' overhead at the same time. Li et al. [80] proposed a scheduling algorithm based on cloud computing, which can minimize the cost of performing workflows within a specified time; in addition, the rented VMs are adjusted to further reduce cost. Khaleel and Zhu [81] proposed a scheduling algorithm that takes scientific workflows as a model to make full use of cloud resources and save energy. Shi et al. [82] designed a flexible resource allocation and job scheduling mechanism to execute scientific workflows. Because this mechanism can execute scientific workflows within prescribed budgets and deadlines, it works better than other mechanisms. Table 5 shows a summary of the abovementioned works. The problems in Table 5 are derived from the execution of scientific workflows: VM deployment refers to the rational allocation of VMs; workflow scheduling refers to reducing the scheduling energy and time, as well as scheduling workflows on heterogeneous systems; cost reduction refers to reducing the cost of workflow execution; and effective implementation is about executing scientific workflows within a specified budget and time.

Server Optimization.
Server optimization is also a good way to save energy. The goal of optimizing servers can be achieved by shutting down unnecessary servers or consolidating servers, as well as by reasonably scheduling tasks. Unlike QoS optimization, server optimization aims to optimize the number of servers in use, improve the energy efficiency of servers, and consolidate servers, whereas QoS optimization studies how to give users a better experience and meet their needs.

Cloud Computing.
Ge et al. [83] proposed a game-theoretic method and transformed the problem of minimizing energy into a congestion game in which all mobile devices are participants. The method chooses a server to which computation tasks are offloaded, which optimizes QoS levels and saves energy at the system level. Wang et al. [84] proposed a MapReduce-based multitask scheduling algorithm to achieve energy saving. It uses a two-layer model, which considers the impact of server performance changes on energy consumption as well as the limitation of network bandwidth. In addition, a local search operator is designed, based on which a two-layer genetic algorithm is proposed. The algorithm can schedule tens of thousands of tasks in the cloud and achieve large-scale optimization. Yanggratoke et al. [85] proposed a generic gossip protocol aimed at allocating resources in cloud environments. An instantiation of this protocol was developed to enable server consolidation and to adapt resource allocation to changing load patterns.

Load Balancing. Load balancing can help save energy by managing the number of servers and allocating resources. [86] introduced an operation model that balances cloud computing load and scales applications to save energy. The principle of this model is to define an operating system that runs as few servers as possible; when no tasks are being performed, the system puts servers to sleep, so energy consumption can be reduced. Justafort et al. [87] mainly studied the problem of workload distribution across cloud computing environments and proposed a method to solve the VM placement problem, so that the carbon footprint can be effectively reduced.

Panwar and Mallick [88] proposed an algorithm to dynamically manage the load and effectively distribute the total incoming requests among VMs. Through efficient and uniform utilization of resources, this algorithm can achieve a uniform distribution of load between servers. Yang et al. [89] proposed a power management mechanism to balance the load; the system monitors VMs and dynamically allocates resources. Yang et al. [90] proposed an optimization system to better allocate resources dynamically, which can balance the load of VMs running on multiple physical machines. Under this system, VMs can be migrated automatically to adjust high and low loads without interrupting services. Both Yang et al. [89] and Yang et al. [90] manage VMs to achieve load balancing: they allocate resources dynamically and migrate VMs to balance workloads across different physical machines. The difference is that Yang et al. [89] integrate a dynamic resource allocation approach with OpenNebula, while Yang et al. [90] focus on avoiding service outages during VM migration. Table 6 shows a summary of the abovementioned works. The problems in Table 6 come from the load balancing problem: server management is about controlling the number of servers running in the system; workload management is the rational allocation of workload or tasks; and VM management refers to configuring VM resources and migrating VMs to adjust loads.

Table 5 (continued): work summary of scientific workflow execution in cloud computing.
A scheduling method based on energy perception [77]: achieves high parallelism without huge energy consumption and minimizes the total energy consumption and execution time of workflows
A workflow scheduling method with several objectives and a hybrid particle swarm optimization algorithm [78]: makes the processors work at any voltage level, minimizes the energy consumption in the process of workflow scheduling, and studies the scheduling of workflows on heterogeneous systems
A scheduling algorithm based on various applications [79]: enables service providers to gain high profits and reduces user overhead at the same time
Cost reduction. A scheduling algorithm based on energy perception [80, 81]: minimizes the cost of performing workflows while meeting the time constraint
Effective implementation. A flexible resource allocation and job scheduling mechanism [82]: executes scientific workflows within prescribed budgets and deadlines

Table 4: Work summary of resource allocation and management in IoT.
Resource management. A distributed cloud network framework [66]: replaces the centralized architecture with a distributed cloud architecture, solves the defects of the centralized cloud architecture, and brings users a better experience
A MILP model [68]: assigns service resource requirements to the heterogeneous network interfaces of devices
A communication framework for 5G [69]: transforms the resource allocation problem into a power and channel allocation problem, minimizes the total energy consumption, and improves QoS levels
A low-complexity channel allocation algorithm [70]: improves network throughput and allocates finite resources to a large group of users
A multiband resource allocation framework for cognitive 5G IoT [71]: manages resources more flexibly and reduces more energy consumption than common single-band approaches
A distributed, optimal resource allocation protocol [72]: adapts well to changes in network topology and dynamically manages resources in heterogeneous IoT environments
A multilevel resource allocation algorithm for IoT communication [73]: has a fast data processing rate and very low latency in both saturated and unsaturated environments
A new algorithm to allocate bandwidth dynamically [74]: reduces the signal reconstruction error under the same bandwidth and makes IoT bandwidth allocation more reasonable
A reinforcement learning mechanism [75]: effectively solves the resource allocation problems caused by the mismatch between service quality and complex service provisioning conditions in the IoT
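The threshold-driven VM migration described for the load-balancing systems above ([89, 90]) can be sketched as follows: overloaded hosts shed their smallest VMs to the least-loaded host until they drop to the high-load threshold. The threshold value and the load model are illustrative assumptions, not the authors' mechanisms.

```python
def balance(hosts, high=80):
    """Threshold-based rebalancing sketch.  hosts maps host name -> list of
    VM loads (percent of capacity).  While some host exceeds the high-load
    threshold, migrate its smallest VM to the least-loaded host, unless
    that migration would overload the target."""
    moves = []

    def load(h):
        return sum(hosts[h])

    while True:
        src = max(hosts, key=load)
        dst = min(hosts, key=load)
        if load(src) <= high or src == dst:
            break                          # nothing overloaded, done
        vm = min(hosts[src])               # migrate the smallest VM first
        if load(dst) + vm > high:
            break                          # target would overload, stop
        hosts[src].remove(vm)
        hosts[dst].append(vm)
        moves.append((vm, src, dst))
    return moves

# h1 (120% load) sheds VMs until it drops to or below the threshold.
hosts = {"h1": [50, 40, 30], "h2": [10]}
moves = balance(hosts)
```

Real systems additionally rate-limit migrations and use hysteresis (separate high and low thresholds) to avoid oscillation; those refinements are omitted here.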

Fog Computing. Xu et al. [91] proposed a method called "DRAM" to dynamically allocate resources in fog computing environments, avoiding both excessively high and excessively low loads. The method first analyzes the load balance of different kinds of computing nodes, and then designs a fog resource allocation method that achieves load balance by allocating resources statically and migrating services dynamically. Oueis et al. [92] studied the load balancing problem in fog computing and proposed a custom fog clustering algorithm to solve it. In this problem, several users need to offload computations, and all of their demands must be handled by a local computing cluster.

IoT.
Wang et al. [93] established the architecture of an energy-saving-oriented system in the industrial IoT. In addition, in order to predict sleep intervals, they developed a sleep scheduling and wake-up protocol, which provides a better way to save energy.
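Sleep-interval prediction of the kind used in [93] can be illustrated with a simple forecaster: an exponentially weighted moving average of past idle gaps estimates the next gap, and the node sleeps for a conservative fraction of that estimate so it wakes before the next event. The EWMA and the margin are illustrative assumptions, not the authors' protocol.

```python
def predict_sleep_intervals(idle_gaps, alpha=0.5, margin=0.8):
    """For each observed idle gap, emit how long the node should have
    slept: margin * (EWMA forecast of the gap), then update the forecast.
    idle_gaps : observed idle durations between activity bursts (s)
    alpha     : EWMA smoothing weight for the newest observation
    margin    : fraction of the forecast actually slept (safety slack)"""
    estimate = idle_gaps[0]                  # seed forecast with first gap
    schedule = []
    for gap in idle_gaps[1:]:
        schedule.append(margin * estimate)   # sleep for this long
        estimate = alpha * gap + (1 - alpha) * estimate  # update forecast
    return schedule

# As the observed gaps lengthen, the scheduled sleeps lengthen too.
schedule = predict_sleep_intervals([10.0, 10.0, 20.0, 20.0])
```

The margin trades energy for responsiveness: a smaller margin wastes more awake time, while a margin near 1 risks sleeping through an event.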

Conclusion
This paper presented a comprehensive study of QoS optimization and energy saving in cloud computing, edge computing, fog computing, and IoT models. We summarized five main problems and analyzed the solutions proposed by existing works. By conducting this survey, we aim to help readers gain a deeper understanding of the concepts of different computing models and the techniques of QoS optimization and energy saving in these models. The investigated papers focus on ensuring QoS, reducing SLA violations, and managing resources. For QoS assurance and SLA violation reduction, the main solution is efficient VM management. This solution can meet customers' requirements through reasonable scheduling and consolidation of VMs. Most resource management techniques are realized by reasonable scheduling of resources, which can reduce the waste of VMs, servers, and traffic.

Disclosure
This manuscript is an extension of "A Survey of QoS Optimization and Energy Saving in Cloud, Edge, and IoT" in the 9th EAI International Conference on Cloud Computing.

Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this paper.

Table 6: Work summary of load balancing in cloud computing.

Server management. An operation model that can balance cloud computing load and scale applications [86]: saves energy by managing the number of servers running in the system

Workload management. A hybrid approach [87]: reduces the carbon footprint and allocates workload across cloud computing environments
An algorithm to dynamically manage the load [88]: manages the load, evens the load distribution between servers, and allocates tasks between VMs

VM management. A green power administration mechanism [89]: monitors and jointly allocates VM resources
An optimization system [90]: migrates VMs to adjust high and low loads without interrupting services and balances the load of VMs running on multiple physical machines