SRAF: A Service-Aware Resource Allocation Framework for VM Management in Mobile Data Networks

Service latency and resource utilization are key factors that limit the development of mobile data networks. To this end, we present a service-aware resource allocation framework, called SRAF, which allocates basic resources by managing virtual machines (VMs). In SRAF, we design two new methods for better VM management. Firstly, we propose the self-learning classification algorithm (SCA), which classifies service requests. Then, we use the classification results to schedule different types of VMs. Secondly, we design a sharing mode to jointly execute service requests, which can share the CPU and bandwidth simultaneously. In order to enhance the utilization of resources with the sharing mode, we also design two scaling algorithms, i.e., the horizontal scaling and the vertical scaling, which execute the operation of resource-level scaling and VM-level scaling, respectively. Furthermore, to enhance the stability of SRAF and avoid frequent scaling operations, we introduce a Markov decision process (MDP) to control VM migration. The experimental results reveal that SRAF greatly reduces service latency and enhances resource utilization. In addition, SRAF also performs well in terms of stability and robustness under different congestion situations.


Introduction
Virtual machine (VM) management based on service awareness is a new method that can greatly reduce service latency and enhance resource utilization. In addition, VM management has been widely used in various mobile data networks, such as the information center network (ICN), mobile vehicle network (MVN), and mobile cloud network (MCN). However, the resource pools of networks are limited. Moreover, service latency and resource utilization affect each other. Therefore, how to reduce service latency and enhance resource utilization simultaneously has been a focus of research, especially for MCN [1,2].
For this reason, service latency and resource utilization have become the main concerns of many research studies. From the perspective of service latency, Reference [3] proposes the Predictable Resource Guarantee Scheduler scheme to realize proportional sharing of CPU and I/O bandwidth, which reduces the waiting time in the Xen platform. Reference [2] uses a cloudlet selection strategy to schedule cloudlets and cut down the response time. In addition, some studies reduce latency by shortening the distance between locations. For example, Reference [1] aims to find the shortest path between the user and the nearest cloud datacenter to reduce the transmission latency. To reduce the queueing time of requests, Reference [4] uses active communication between controllers: a controller proactively pulls requests when it finishes its own, which cuts down the queue length. From the perspective of resource utilization, Reference [5] presents a smart migration mechanism to implement processor memory optimization based on VM placement. References [6,7] use the Bejo and kNN classification schemes, respectively, to classify requests for better scheduling. Reference [8] designs a Lyapunov optimization framework to improve the efficiency of mobile-edge computing. The purpose of the Lyapunov optimization framework is to minimize the resource overload by VM scheduling.
The works cited above propose many new ideas and methods to realize optimal request scheduling or optimal resource configuration, which can reduce delay or enhance resource utilization. However, due to the complexity of the cloud network and the diversity of mobile devices, requests are varied and uncontrolled. As a result, single-objective studies do not always perform well on every factor, because many factors affect each other.
Therefore, some researchers design efficient methods within an overall framework to address these problems. In [9], the authors design a resource sharing framework named "Symbiosis" to realize the sharing of CPU and bandwidth. When one request is working on the CPU, Symbiosis lets another request perform its transmission. In this way, Symbiosis can efficiently reduce the service latency of requests. In [6], the authors propose a new classification algorithm named "Bejo" to classify requests. The classification results are used to perform VM scheduling, and a fitting VM for each request can enhance the utilization of resources. Building on the strengths of [6,9], we propose a new framework called SRAF to perform request classification and resource sharing.
For measuring the performance of SRAF, we analyze MCN in detail. SRAF can be divided into two aspects. Firstly, we propose a self-learning classification algorithm (SCA) to perform the classification operation before the request is sent to the VM. The SCA is built on two weighting methods, location weighting and feature weighting [10], which improve the accuracy of request classification. Precise classification results help a request find a fitting VM, which reduces the service latency. Secondly, we design a sharing mode (Figure 1) to realize resource sharing in a VM. Furthermore, in order to improve the utilization of the resources, we also design two VM scaling algorithms, the horizontal scaling and the vertical scaling. The former realizes resource-level scaling in a VM: when the utilization of CPU or bandwidth is too high, the algorithm adds the corresponding resources to the VM to avoid overload; otherwise, it scales down. The latter performs VM-level scaling: when all the VMs are busy and the arriving tasks keep growing, new VMs are created; otherwise, VMs are released.
The contributions of this paper are as follows:
(i) We propose a new framework named "SRAF" to improve the resource utilization and reduce the service latency simultaneously.
(ii) We design the SCA, which has the self-learning capacity for updating features so as to classify the service requests. In addition, SCA can improve the accuracy of classification continuously until all the features are learned.
(iii) We introduce a Markov decision process (MDP) to control VM migration so as to reduce frequent scaling operations and enhance the stability of SRAF.
(iv) We propose a Combination Scheduling Cost Model and a sharing mode for mobile data networks. The Combination Scheduling Cost Model can systematically operate VM scheduling and scaling. Moreover, the resource utilization is improved directly via the sharing mode.
The rest of this paper is organized as follows: Section 2 gives a brief introduction to related research works. Section 3 introduces the details of each component of SRAF. Section 4 shows the overall process of SRAF, including each model and related algorithm. Section 5 presents the feasibility and performance of SRAF through experiments. Section 6 presents a brief conclusion of this paper.

Related Work
To reduce the service delay and enhance the utilization of resources, many studies propose various methods. For example, Reference [3] proposes a new prediction method to reach the objective of resource sharing. It predicts the demand of the next time period and adjusts the resources accordingly to enhance their utilization. References [2,5] use different methods to schedule cloudlets and VMs, reducing the response time and the free time of resources, respectively. Due to the diversity of mobile access devices, requests are also varied and uncontrolled. For this reason, some studies design classification algorithms that classify requests according to their demand or input data for prior disposal [6,7]. Then, researchers can use classical and effective methods to reduce the queueing time and improve the utilization of resources, such as shortest job first (SJF) [12] and priority-aware longest job first (PA-LJF) [13]. Moreover, there are also newer improved methods, such as the shortest expected-remaining service time policy (SE-RSTP) [14] and the dynamic-threshold service policy [15], which perform better in reducing delay and improving the QoS.
In addition, due to the randomness of service requests, some researchers use the Markov decision process (MDP) to quantify the overall process of cloud service. For example, Reference [16] uses the dynamic Markov decision process to model the process of VM scheduling. Then, the value iteration algorithm is used to find the optimal VM control policy for reducing energy expenditure. Reference [17] also uses the MDP to quantify the overall VM control. It uses the Bellman optimality equation to find a globally optimal threshold so as to cut down erratic operation of VMs. To enhance the accuracy of task scheduling by MDP, Reference [18] designs a semi-Markov decision process to select some computation-intensive tasks for offloading so as to reduce the computations in mobile devices.
As the above analysis shows, a single method or policy can realize only one or two objectives. Therefore, some researchers choose to design an overall framework to handle multiple objectives. For example, References [8,9] design different frameworks, the Lyapunov optimization framework and Symbiosis, to control the scheduling of requests and VMs, respectively. Furthermore, some researchers add classification methods to enhance the performance of different frameworks for different objectives. For example, Reference [19] uses the reweighting method to label different factors by machine learning. Reference [11] uses the multi-instance learning (MIL) method to quantify different data for precise classification with the probabilistic graphical framework. Reference [10] uses the method of local feature selection to classify the data directly. In addition, many studies also use the sharing method to improve the utilization of resources in different fields. For example, Reference [20] proposes a feasible and truthful incentive mechanism (TIM) to realize resource sharing with a trade-off between users and service providers. Reference [21] uses the sharing mode to satisfy the resource demand of the remote radio network and central virtual base station so as to maximize the downlink of networks.

Architecture Design
In this section, we design a green cloud resource allocation framework called SRAF. The objective is to reduce the delay of scheduling service requests and improve the resource utilization simultaneously. Our framework contains three layers: the User Layer, the Request Manager Layer, and the Resource Provider Layer.
In Figure 2, we show the overall response process of service requests. The User Layer has many users with various mobile terminals, which send service requests to the mobile cloud. The Request Manager Layer is the most important layer; it receives the service requests from the User Layer. Its main duty is to optimize the management of service requests and VMs. Then, the results are sent to the Resource Provider Layer for VM configuration. The Resource Provider Layer provides basic resources for the service. The Request Manager Layer includes four components: (1) The History Loads stores the requests and their categories, which helps the Classification Manager update its feature mapping library. The category information comes from the Combination Scheduling Manager when requests are serviced. (2) The Classification Manager analyzes the information from the requests and classifies the requests into three types, i.e., file-focus tasks, video-focus tasks, and normal tasks, depending on their demand for bandwidth and CPU resources (details are given in Section 4.2). The Resource Provider Layer has many resource pools, such as CPU, bandwidth, and memory. This layer provides basic resources for VMs so as to handle these service requests.

Model Design and Algorithm Analysis
4.1. System Model. Our goal is to reduce the service delay and enhance the resource utilization by the proposed system architecture, which can choose a suitable VM in the physical machine for the service requests. In other words, we will make a fitting combination of request, VM, and physical host, described as follows.
Definition 1. For a set of hosts $host_k \in \{host_1, host_2, \ldots, host_k, \ldots, host_Q\}$ and the VM set $VM_j \in \{VM_1, VM_2, \ldots, VM_j, \ldots, VM_N\}$, the connection between them can be defined as a matrix $U_{Q \times N}$, i.e.,
$$U_{Q \times N} = \begin{bmatrix} u_{11} & \cdots & u_{1N} \\ \vdots & \ddots & \vdots \\ u_{Q1} & \cdots & u_{QN} \end{bmatrix},$$
where if $VM_j$ is not created or is released on $host_k$ at the beginning, then we set $u_{kj} = -1$. If $VM_j$ is located on $host_k$, then $u_{kj} \in (0, 1)$. At the same time, we use $u^{cpu}_{kj} \in (0, 1)$ to denote the utilization of CPU on $VM_j$ and $u^{bw}_{kj} \in (0, 1)$ to denote the utilization of bandwidth on $VM_j$.
Definition 2. For a set of tasks $task_i \in \{task_1, task_2, \ldots, task_i, \ldots, task_M\}$ and a virtual machine set $VM_j \in \{VM_1, VM_2, \ldots, VM_j, \ldots, VM_N\}$, the distribution of the tasks is defined as a matrix $A_{M \times N}$, i.e.,
$$A_{M \times N} = \begin{bmatrix} a_{11} & \cdots & a_{1N} \\ \vdots & \ddots & \vdots \\ a_{M1} & \cdots & a_{MN} \end{bmatrix},$$
where $a_{ij} \in \{0, 1\}$; if $a_{ij} = 1$, then $task_i$ is distributed on $VM_j$. If $a_{ij} = 0$, there is no connection between $task_i$ and $VM_j$.
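As an illustration of Definitions 1 and 2, the following minimal Python sketch builds the two matrices and looks up where a task runs. The function names (`build_placement`, `build_assignment`, `locate`) and the zero-based indices are hypothetical choices made for this example, and the sketch assumes that each task is placed on at most one VM and each VM on at most one host.

```python
def build_placement(num_hosts, num_vms, placements):
    """Build U (Q x N): u_kj = -1 if VM_j is not on host_k, else a value in (0, 1).

    placements maps a VM index j to (host index k, utilization u_kj).
    """
    U = [[-1.0] * num_vms for _ in range(num_hosts)]
    for j, (k, u) in placements.items():
        if not 0.0 < u < 1.0:
            raise ValueError("utilization must lie in (0, 1)")
        U[k][j] = u
    return U


def build_assignment(num_tasks, num_vms, assignment):
    """Build A (M x N): a_ij = 1 if task_i is distributed on VM_j, else 0."""
    A = [[0] * num_vms for _ in range(num_tasks)]
    for i, j in assignment.items():
        A[i][j] = 1
    return A


def locate(task, A, U):
    """Return (VM index, host index) for a task, or None if it is unplaced."""
    for j, a_ij in enumerate(A[task]):
        if a_ij == 1:
            for k, row in enumerate(U):
                if row[j] != -1.0:
                    return j, k
    return None


# Two hosts, three VMs: VM 0 on host 0, VM 2 on host 1, VM 1 not created.
U = build_placement(2, 3, {0: (0, 0.4), 2: (1, 0.7)})
# Task 0 runs on VM 2, task 1 runs on VM 0.
A = build_assignment(2, 3, {0: 2, 1: 0})
print(locate(0, A, U))  # VM 2 on host 1
print(locate(1, A, U))  # VM 0 on host 0
```

Combining one row of $A$ with one column of $U$ in this way gives the task-to-VM-to-host location that the framework searches for.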

4.2. Classification Model.
Considering the complexity of mobile terminals, we make the classification from the perspective of service in the mobile cloud network. Therefore, all of the requests can be divided into three kinds of tasks, i.e., the video-focus tasks, file-focus tasks, and normal tasks.
The classification model has three components: (1) the feature mapping library is used to store the relationship between the features and their corresponding service classes so as to make the classification; (2) the classifier traction is used to classify the tasks according to their input data and the mapping in the feature mapping library; and (3) the supervisor updater is used to supervise and update the mapping in the feature mapping library based on the information from the History Loads.
In this section, we use the self-learning classification algorithm (SCA) to perform classification. SCA is based on machine learning [22,23], which lets us constantly extend and update the feature mapping library so as to improve the accuracy of the classification. The SCA also improves on the traditional classification algorithm by multiple weighting and uses a semisupervised method to update the features in the feature mapping library according to the feedback from the History Loads [24]. SCA learns new relationships between features and service requests so as to enhance the accuracy of classification results. Therefore, SCA uses the combination of location weighting, feature weighting, and self-learning methods to determine the final class of each request.
In the process of SCA, we use $G_V$ and $G_F$ to denote the mapping sets of videos and files, respectively. We first append some typical features to the feature mapping library for the mapping set $G = \{G_V, G_F\}$, such as $G_V = \{\langle tv, video \rangle, \langle dvd, video \rangle, \ldots, \langle avi, video \rangle\}$ and $G_F = \{\langle doc, file \rangle, \langle wps, file \rangle, \ldots, \langle ppt, file \rangle\}$. Then, SCA uses the input data of requests to find the mapping in $G$. In the process of mapping, we use location weighting and feature weighting to ensure that the request is classified precisely. Feature weighting means that different features carry different weights. For example, the feature "video" has a larger weight than "avi" for indicating video tasks. Location weighting means that we weight features by their location in the URL. For example, if a request URL is divided into $n$ segments, then these $n$ segments form a one-dimensional array $L = [l_1, l_2, \ldots, l_n]$. We use $\alpha_i$ to denote the location weight of the $i$-th feature, which is computed from $n_{loc}$, the location of the $i$-th feature in the order. The more forward the location is in the order, the more important the feature is in the description [25]. We use $\beta$ as the final weight to determine the task line to which the request should be scheduled. For a request with some features in $G_V$, the total weighting of video features in the request URL is denoted $\beta_v$. Similarly, the total weighting of file features in the request URL is denoted $\beta_f$. Finally, the attribution of the service request is determined by comparing the two. If $\beta_v$ and $\beta_f$ are both equal to zero, the request is transmitted to the normal tasks line. Otherwise, if $\beta_v \geq \beta_f$, the request is transmitted to the video-focus tasks line, and if $\beta_v < \beta_f$, it is transmitted to the file-focus tasks line. The process of SCA is shown in Algorithm 1.
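The decision rule above can be sketched in a few lines of Python. This is an illustrative sketch and not Algorithm 1: the feature weights, the location weight $\alpha_i = 1/n_{loc}$, the product form of each term, and the URL splitting rule are all assumptions made for the example, since the text only fixes the comparison of $\beta_v$ and $\beta_f$. The self-learning update of the feature mapping library is omitted.

```python
import re

# Hypothetical feature mapping library G = {G_V, G_F}.
# The numeric feature weights are illustrative assumptions.
G_V = {"video": 1.0, "tv": 0.6, "dvd": 0.6, "avi": 0.5}
G_F = {"file": 1.0, "doc": 0.5, "wps": 0.5, "ppt": 0.5}


def segments(url):
    """Split a request URL into lowercase segments l_1..l_n (assumed rule)."""
    return [s.lower() for s in re.split(r"[/.?=&_\-:]+", url) if s]


def location_weight(n_loc):
    """Assumed location weight: earlier segments weigh more (n_loc is 1-based)."""
    return 1.0 / n_loc


def total_weight(segs, mapping):
    """Sum of (location weight x feature weight) over segments found in mapping."""
    return sum(
        location_weight(pos) * mapping[seg]
        for pos, seg in enumerate(segs, start=1)
        if seg in mapping
    )


def classify(url):
    """Return the task line chosen by comparing beta_v and beta_f."""
    segs = segments(url)
    beta_v = total_weight(segs, G_V)
    beta_f = total_weight(segs, G_F)
    if beta_v == 0 and beta_f == 0:
        return "normal"
    return "video-focus" if beta_v >= beta_f else "file-focus"


print(classify("http://example.org/video/clip.avi"))    # video-focus
print(classify("http://example.org/docs/report.doc"))   # file-focus
print(classify("http://example.org/index.html"))        # normal
```

Checking the all-zero case first keeps requests without any known feature out of the video-focus line, which the plain test $\beta_v \geq \beta_f$ would otherwise select.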

4.3. VM Migration.
According to Definition 1 and Definition 2, the overall process of SRAF is to find an appropriate location in $VM_j$ and $host_k$ for $task_i$. If we define a location function as $B(t) := [a_{ij}(t), u_{kj}(t)]$, then $task_i$ is scheduled on $VM_j$ and $VM_j$ is located in $host_k$ at time $t$. In addition, once $task_i$ is classified by the classifier traction, it is scheduled to $VM_j$ and cannot be transferred. Therefore, when the resource of $host_k$ cannot satisfy the demand of $VM_j$, $VM_j$ will be migrated to another host at timeslot $t$ based on practical conditions from the Monitor Manager. Let $D(t) = \{d_{j,kk'}(t)\}$ represent the set of actions, where $d_{j,kk'}(t)$ means that $VM_j$ can migrate from $host_k$ to $host_{k'}$ at timeslot $t$. Correspondingly, each $d_{j,kk'}(t)$ has a migration probability $p_{j,kk'}(t)$, and all the probabilities make up the probability set $P(t) = \{p_{j,kk'}(t)\}$. The cost function of VM migration, $f_j(t)$, represents the additional expenditure of VM migration. It depends on $C_{k,k'}$, the migration expenditure of $VM_j$ from $host_k$ to $host_{k'}$, which is influenced by the migration distance and the subsequent operation expenditure. Hence, let $E = \{B(t), D(t), P(t), f_j(t)\}$ be a basic MDP to represent VM migration, because the arrival of service requests is based on the Poisson process [32,33]. So, if the capacity of $VM_j$ is stationary, overload and VM migration will form a loop over a long time. Therefore, we can get a stationary policy $\pi$ to control the overall process of VM migration. Now, we use the Bellman optimality equation and the method of dynamic programming to obtain the optimal control policy $\pi$ [26,27]. We introduce the state value function (9), in which $R(d_{j,kk'}(t))$ is the penalty incurred when $VM_j$ performs the action $d_{j,kk'}(t)$; $T$ is the discount factor that determines the importance of history data; $b(t)$ is the location of $VM_j$ at timeslot $t$ in $B(t)$; and $\pi(kk')$ means that the VM migrates from $host_k$ to $host_{k'}$ by policy $\pi$. We use (9) to select the optimal state at the next timeslot so as to maximize the reward. Then, we use the action value function (10) to determine the action that can reach the optimal state at the next timeslot. Finally, we use the value iteration algorithm to obtain the control policy $\pi$ [27-29].
In Algorithm 2, we aim to get and update the control policy $\pi$, which can control VM migration based on history data in the History Loads. The control policy $\pi$ can improve the resource utilization and load balancing in the system. The overall process of Algorithm 2 is to update the state value function by finding the optimal reward path. In other words, we need to traverse all $b(t)$ and choose an optimal location to migrate the VM, which can maximize the reward of all the VMs. Then, we find an optimal sequence of states of $B(t)$ over time. Finally, the algorithm uses the backstepping approach to get the control policy $\pi$ from the action value function and the optimal sequence of states in $B(t)$.
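For readers unfamiliar with value iteration, a generic textbook version is sketched below in Python. It is not Algorithm 2: the toy two-host model, the penalty and migration-cost numbers, the deterministic transitions, and all function names are assumptions made for illustration. The discount factor is called `gamma` here and plays the role of $T$ in the text; rewards are negative costs, so maximizing the reward minimizes the expenditure.

```python
def value_iteration(states, actions, transition, reward,
                    gamma=0.9, tol=1e-9, max_iter=10_000):
    """Generic value iteration; returns (state values, greedy policy).

    transition(s, a) returns {next_state: probability}.
    reward(s, a) returns the immediate reward of taking a in s.
    """
    V = {s: 0.0 for s in states}

    def q(s, a):
        # Action value: immediate reward plus discounted expected future value.
        return reward(s, a) + gamma * sum(
            p * V[s2] for s2, p in transition(s, a).items())

    for _ in range(max_iter):
        delta = 0.0
        for s in states:
            best = max(q(s, a) for a in actions(s))
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break
    # Read the policy off the converged values.
    policy = {s: max(actions(s), key=lambda a: q(s, a)) for s in states}
    return V, policy


# Toy model (assumed): the state is the host holding the VM, and the action
# is the host to use next. Host h1 is overloaded (penalty 5 per timeslot),
# host h2 is not (penalty 1); a migration costs 3.
PENALTY = {"h1": 5.0, "h2": 1.0}
MIGRATION_COST = 3.0
hosts = ["h1", "h2"]

V, policy = value_iteration(
    states=hosts,
    actions=lambda s: hosts,
    transition=lambda s, a: {a: 1.0},
    reward=lambda s, a: -(MIGRATION_COST if a != s else 0.0) - PENALTY[a],
)
print(policy)  # the VM leaves h1 and then stays on h2
```

In this toy model, paying the migration cost once is cheaper than accumulating the overload penalty, so the greedy policy migrates away from the overloaded host and then stays put.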

4.4. Combination Scheduling Cost Model.
We first design two types of VMs, the high type and the normal type, which have different capacities for conducting tasks. As described in Section 4.2, the tasks are divided into three kinds. We use the sharing mode (Figure 1) and Symbiosis in [9] to execute the scheduling according to the tasks' demand for resources.
From Figure 2, we propose a sharing mode for the video-focus tasks and file-focus tasks to jointly share the resources of the high-type VMs. With the sharing mode, one VM can simultaneously execute two tasks, one video-focus task and one file-focus task. Moreover, the two tasks share the resources of their owner VM, such as the bandwidth and CPU resources. If there are fewer video-focus tasks than file-focus tasks, the VM can hold two file-focus tasks. If there are more video-focus tasks than file-focus tasks, the VM executes one video-focus task and waits for a file-focus task. In contrast, the normal tasks are processed on normal-type VMs by Symbiosis [9], based on the idea of space sharing in CloudSim [30].
In Figure 3, we use an example to show the process of the sharing mode. We design three tasks and give each task a transmission length for the bandwidth and an execution length for the CPU. For clarity, we set the execution efficiency of bandwidth to 2 per interval and that of CPU to 3 per interval. With the sharing mode, one VM can hold two tasks, and these tasks share resources in a 1:1 proportion. The process is as follows. At time 0, v-task1 and f-task1 are allocated to the same VM. Firstly, v-task1 and f-task1 begin to transfer their transmission length (T-length) by sharing the bandwidth. At time 2, the T-length of f-task1 is finished, and f-task1 begins to execute its execution length (E-length) on the CPU alone. At the same time, v-task1 occupies the bandwidth by itself to transfer its remaining T-length. At time 3, v-task1 finishes its T-length and begins to execute its E-length, sharing the CPU with f-task1. At time 5, f-task1 finishes its E-length and leaves the VM. Then, f-task2 begins its transmission, so f-task2 uses the bandwidth by itself and v-task1 uses the CPU by itself. At time 6, f-task2 finishes its T-length and begins to share the CPU with v-task1. At time 12, f-task2 is finished and leaves the VM, and v-task1 uses the CPU by itself. Finally, v-task1 finishes its work and leaves the VM at time 14. At this point, the overall process is finished.
In the following, we present the overall algorithm process of SRAF (Algorithms 3-5). When the system scheduling algorithm (SSA) starts, we create basic VMs (line 6) for the first scheduling. The Monitor constantly monitors $u^{bw}_{kj}(s)$ and $u^{cpu}_{kj}(s)$ of every VM at the beginning of the $s$-th interval $\Delta t$. Then, the algorithm chooses an operation by the control policy $\pi$, either VM migration or VM scaling, to minimize the cost (lines 7-11). When $task_i$ arrives, SSA classifies $task_i$ into the corresponding task line (lines 1-4 and 13). When the task lines have tasks, we schedule them according to $u^{bw}_{kj}(s)$ and $u^{cpu}_{kj}(s)$ so that we can make full use of the resources (lines 14-17).
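The Figure 3 walkthrough of the sharing mode can be replayed with a small fluid simulation in Python. This is a sketch under stated assumptions: tasks in the same phase split a resource equally, a waiting task is admitted as soon as a slot frees up, and the task lengths used below (T-length 4 and E-length 21 for v-task1, 2 and 6 for f-task1, 2 and 9 for f-task2) are back-computed from the timeline and the stated rates rather than read from the figure. The function name `simulate_sharing` is hypothetical.

```python
EPS = 1e-9


def simulate_sharing(tasks, bw_rate=2.0, cpu_rate=3.0, slots=2):
    """Replay the sharing mode on one VM and return each task's finish time.

    tasks: list of (name, t_length, e_length) in arrival order; names are unique.
    """
    pending = list(tasks)
    active = []
    now = 0.0
    finish = {}
    while pending or active:
        # Admit waiting tasks while the VM has a free slot.
        while pending and len(active) < slots:
            name, t_len, e_len = pending.pop(0)
            active.append({"name": name, "t": float(t_len), "e": float(e_len)})
        # Tasks still transmitting share the bandwidth; the rest share the CPU.
        sending = [a for a in active if a["t"] > EPS]
        running = [a for a in active if a["t"] <= EPS]
        bw_share = bw_rate / len(sending) if sending else 0.0
        cpu_share = cpu_rate / len(running) if running else 0.0
        # Advance to the next moment at which some task completes a phase.
        dt = min([a["t"] / bw_share for a in sending] +
                 [a["e"] / cpu_share for a in running])
        now += dt
        for a in sending:
            a["t"] -= bw_share * dt
        for a in running:
            a["e"] -= cpu_share * dt
        # Tasks whose execution is complete leave the VM.
        for a in running:
            if a["e"] <= EPS:
                finish[a["name"]] = now
        active = [a for a in active if a["name"] not in finish]
    return finish


timeline = simulate_sharing([
    ("v-task1", 4, 21),   # assumed lengths, see the lead-in
    ("f-task1", 2, 6),
    ("f-task2", 2, 9),
])
print(timeline)  # f-task1 ends at 5, f-task2 at 12, v-task1 at 14
```

With these lengths the simulation reproduces the departure times of the walkthrough (5, 12, and 14), which suggests that the equal-split reading of the 1:1 proportion is consistent with the example.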
In the horizontal-scaling algorithm, we set the CPU maximum utilization threshold and the bandwidth maximum utilization threshold as $up_{cpu}$ and $up_{bw}$, respectively. Then, we compare $up_{cpu}$ and $up_{bw}$ with $u^{cpu}_{kj}(s)$ and $u^{bw}_{kj}(s)$, respectively, at the beginning of each interval $\Delta t$. Based on the comparison results, the algorithm chooses which resource-level scaling operation to perform on every active VM.
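A minimal Python sketch of this threshold comparison follows. It is not Algorithm 4: the lower threshold, the step size, the cap expressed as a multiple of the original configuration, and the dictionary layout of a VM are all assumptions introduced for the example; only the upper thresholds $up_{cpu}$ and $up_{bw}$ come from the text.

```python
def scale_resource(allocated, used, base, upper,
                   lower=0.3, step=0.25, max_factor=2.0):
    """Return the new allocation of one resource for one VM.

    upper      : maximum utilization threshold (up_cpu or up_bw in the text)
    lower      : assumed threshold below which the VM scales down
    step       : assumed scaling step, as a fraction of the base configuration
    max_factor : assumed cap on the allocation, as a multiple of the base
    """
    utilization = used / allocated
    if utilization > upper:
        # Overload risk: add resources, but never beyond the cap.
        return min(allocated + step * base, max_factor * base)
    if utilization < lower:
        # Underused: give resources back, but never below the base.
        return max(allocated - step * base, base)
    return allocated


def horizontal_scaling(vm, up_cpu=0.8, up_bw=0.8):
    """Apply resource-level scaling to the CPU and bandwidth of one VM."""
    return {
        "cpu": scale_resource(vm["cpu"], vm["cpu_used"], vm["cpu_base"], up_cpu),
        "bw": scale_resource(vm["bw"], vm["bw_used"], vm["bw_base"], up_bw),
    }


vm = {"cpu": 4.0, "cpu_used": 3.6, "cpu_base": 4.0,
      "bw": 100.0, "bw_used": 20.0, "bw_base": 100.0}
print(horizontal_scaling(vm))  # CPU grows to 5.0, bandwidth stays at its base
```

Scaling each resource independently lets a CPU-bound VM grow its CPU share without holding extra bandwidth, which matches the resource-level granularity described above.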
In Algorithm 5, the arrival ratio and finished ratio represent the quantity of arriving and finished requests at the beginning of the interval $\Delta t$, respectively. The main duty of Algorithm 5 is to control VMs in the overall framework. According to the resource utilization and the quantity of requests, the algorithm executes the VM-level scaling.

Performance Evaluation
To evaluate the proposed framework, we build SRAF in CloudSim, which is a discrete event simulator [30]. In CloudSim, we can run repeatable and controllable experiments. CloudSim supports various environments for the study of resource allocation and scheduling. We implement all the models and algorithms in CloudSim for comprehensive evaluation and analysis.
In the following, we show the overall calculation process of the execution cost for measuring performance. In the process of scheduling, we use (11) and (12) to quantify the cost of each VM at the $s$-th interval $\Delta t$. The practical resource cost of all the VMs on $host_k$ in their working period and the total cost of all the resources in VMs are given by (13) and (14), respectively. Therefore, we get the total resource utilization by (15), where $q \in \{cpu, bw\}$, so that bandwidth utilization and CPU utilization are each obtained from (15). From the analysis above, we get the final optimization cost (16) based on the control policy $\pi$ and the scaling operation, where $f^{\pi}_j(s)$ is the migration cost of $VM_j$ by the control policy $\pi$ at the $s$-th timeslot.

System Configuration.
We simulate two physical nodes, and each node has enough resources. The VM configuration is shown in Table 1 [9,31]. Because CPU is expensive, we use the quantity of CPU to limit the number of VMs. The workload dataset in this paper is from the Laboratory for Web Algorithmics (LAW) (the dataset is named "eu-2015.urls.gz"; see http://law.di.unimi.it/webdata/eu-2015/ for more information). In addition, we use the Poisson process to simulate the arrival process of service requests [32,33]. To test the performance of the frameworks, we try to stabilize the arrival rate. In this paper, we set $\lambda = 8$. In the following, we also vary $\lambda$ from 1 to 10 to test the robustness of SRAF.
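As a side note on the arrival model, a Poisson arrival process can be generated by summing exponentially distributed interarrival times. The Python sketch below illustrates this with the standard library; it is not the CloudSim implementation used in the experiments, and the function name, the horizon, the seed, and the reading of $\lambda$ as arrivals per unit time are assumptions.

```python
import random


def poisson_arrivals(lam, horizon, seed=None):
    """Return sorted arrival times of a Poisson process with rate lam on (0, horizon]."""
    rng = random.Random(seed)
    now = 0.0
    times = []
    while True:
        # Interarrival times of a Poisson process are exponential with mean 1/lam.
        now += rng.expovariate(lam)
        if now > horizon:
            return times
        times.append(now)


arrivals = poisson_arrivals(lam=8, horizon=1000, seed=1)
print(len(arrivals) / 1000)  # empirical rate, close to 8
```

Fixing the seed makes a run repeatable, which is convenient when the same request trace has to be replayed against several frameworks.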

Performance Analysis.
In this section, we compare our framework (SRAF) with the framework (Symbiosis) which is proposed in [9].We also add the Bejo algorithm [6] into the Symbiosis.In addition, we add the deadline factor into the experiments for clearly showing the difference between SRAF and Symbiosis.
In Figure 4, we compare SRAF and Symbiosis at different deadlines. All the tasks are serviced in each framework, and each task ends in one of three statuses. Success means that the task is finished smoothly in its owner VM. Failed means that the task is discarded when its service time exceeds the deadline. Scale means that the task needs extra resources from the resource pool to complete. In Figure 5, the results represent the utilization of bandwidth and CPU in Symbiosis and SRAF at different deadlines. In Figure 4, with the growing deadline, many failed and scale tasks become success tasks and wait for processing. The bandwidth and CPU then have more idle time. As a result, the resource utilizations decrease, as shown in Figure 5.
In Figure 4, when the deadline is 20 ns and 30 ns, there are still some failed tasks in SRAF, while Symbiosis has none. When the deadline is 40 ns and 50 ns, all tasks in SRAF are success tasks, but there are still some scale tasks in Symbiosis.
Therefore, SRAF has a higher absorption rate of requests and higher efficiency than Symbiosis. There are two reasons for this. On the one hand, the more training the SCA has, the better its classification results are.
Better training of SCA yields a better combination of file-focus tasks and video-focus tasks, which reduces the waiting time under the sharing mode. Immediately following the operation of SCA, SRAF makes full use of resources. As a result, SRAF has more time for working, and many failed and scale tasks become success tasks. On the other hand, the sharing mode may prolong the execution time of long tasks (video tasks). However, this problem can be solved by resetting the resource sharing proportion. For example, in Figure 5, the resource utilizations of Symbiosis decrease with the growing deadline.
The resource utilizations of SRAF are approximately 97%. This phenomenon means that some long tasks hold CPU resources as the deadline grows. As a result, the bandwidth resources are unoccupied. Finally, the utilization of bandwidth decreases and that of CPU increases. However, SRAF behaves differently. Because of SCA, the long tasks (video tasks) are executed together with short tasks (file tasks) under the sharing mode. In other words, one VM can hold two tasks. When the short task has finished and the long task is still working, a new short task arrives for transmission. As a result, the CPU and the bandwidth are each occupied by one task. Hence, the utilization of CPU and bandwidth decreases, but only within a small range, and the resource utilizations remain higher than those of Symbiosis. The detailed process of the above example is shown in Figure 2. Taking a holistic look at Figure 4, SRAF also has a lower dropping rate and scaling rate over all the tasks. Therefore, facing the same tasks, SRAF has a higher resource utilization and task processing rate than Symbiosis.
In order to measure the performance of our control policy $\pi$ in Section 4.3, we add the operation of VM migration to Symbiosis for comparison, which is named "Symbiosis + vm-mi" (SVM). Additional execution time represents the total execution time of VM migration and scaling operations. Additional cost means the total cost of all the VM migration and scaling operations. To improve accuracy, we run ten experiments each for SRAF, Symbiosis, and SVM at each deadline. Let us first compare Symbiosis and SVM in Figures 6 and 7, because SVM uses a control policy to decide on VM migration. When a VM becomes overloaded (its resource utilization exceeds the thresholds), the overloaded VM first tries to migrate according to the control policy. Otherwise, SVM performs the scaling operation (shown in Algorithms 4 and 5). In contrast, Symbiosis can only perform the scaling operation for these overloaded VMs. In theory, VM migration costs less than the scaling operation because scaling easily causes frequent fluctuation. Hence, taking an overall look at Figures 6 and 7, SVM has a shorter execution time and lower cost than Symbiosis because the control policy can make a long-term prediction to operate VM migration so as to avoid VM overload and resource shortage. In other words, SVM uses VM migration to cut down frequent scaling operations and thus reduce the additional cost.
From the overall perspective of Figure 6, the total execution time of Symbiosis is much longer than that of SRAF. The total execution time of SRAF is almost at the level of $0.6 \times 10^5$ ns when the deadline is 50 ns. Correspondingly, the additional execution time is approximately zero. In contrast, the total execution time of Symbiosis is almost four times higher than that of SRAF when the deadline is 50 ns. The total cost of Symbiosis in Figure 7 is also approximately five times higher than that of SRAF. The analysis above means that the service latency of SRAF is much shorter than that of Symbiosis. For example, in Figure 5, the decrease of resource utilizations on Symbiosis is sharper than that on SRAF with the growing deadline. Therefore, when more failed and scale tasks become success tasks, the total execution time of Symbiosis continually increases, while the total execution time of SRAF changes only within a small range. That is to say, facing more tasks, SRAF is more stable and performs better than Symbiosis in reducing service latency and cost.
For testing the robustness of SRAF, we set $\lambda$ from 1 to 10 to simulate different situations of congestion. During the experiments, we provide enough resources. In order to avoid the additional cost caused by frequently changing the resources of VMs, we restrict resource scaling so that it can add at most twice the resources of the original configuration of a VM. We run ten experiments for each framework at each $\lambda$.
Then, according to these experiments, we get the mean bandwidth utilization, mean CPU utilization, and total cost of each framework, which are shown in Figures 8-10, respectively. In Figure 8, the bandwidth utilization of SRAF, Symbiosis, and SVM increases with the growing value of $\lambda$. There are two reasons for this increase. Firstly, because the arrival rate of requests is based on the Poisson distribution, the congestion becomes smoother with the growing value of $\lambda$. So, facing the more stable arrival rate, the frameworks have more time for working. As a result, all the bandwidth utilizations show an increasing trend with the growing value of $\lambda$. Secondly, if the transmission time of a request exceeds its deadline, the request becomes a failed task. In addition, we do not calculate the time and cost of failed tasks. In other words, failed tasks are a waste of bandwidth, and this is the main factor affecting the bandwidth utilization. Therefore, with the growing value of $\lambda$, the more stable arrival rate produces fewer failed tasks. As a result, the bandwidth utilization of the frameworks increases. Taking a detailed look at Figure 8, we observe that the bandwidth utilization of SRAF is higher than that of Symbiosis and SVM, which follow a similar trend. This is because SRAF can service two tasks simultaneously with the sharing mode (shown in Figure 2). The sharing mode makes fuller use of bandwidth and CPU resources than Symbiosis and SVM. Therefore, SRAF can effectively reduce the latency and enhance the utilization of free resources. SVM and Symbiosis follow the same trend because they use the same method of bandwidth transmission and do not have the sharing mode. According to the analysis above, for different arrival rates of requests, SRAF performs better on bandwidth utilization than Symbiosis and SVM, especially with a stable arrival rate.
Figure 9 presents the CPU utilization of SRAF, Symbiosis, and SVM at different λ. Looking at Figures 8 and 9 together, the CPU utilization of the different frameworks also increases as λ grows, as the bandwidth utilization does. In addition, the increase in CPU utilization between adjacent values of λ is slightly higher than that in bandwidth utilization. This is because sharp congestion causes more idle time on the CPU resource than on the bandwidth resource: the queueing time exceeds the deadline and causes many failed tasks. When λ grows from 1 to 4, the CPU utilization of SVM is higher than that of Symbiosis. This means that VM migration in SVM cuts down the number of scaling tasks and failed tasks, because VM migration can avoid overloading VMs and reduce the idle time caused by failed tasks. In addition, as λ grows, the arrival rate of requests becomes stable. The stable arrival rate reduces the frequent fluctuation of scaling operations, so VM migration operations are also reduced. When λ is larger than 4, the CPU utilization of SVM and Symbiosis is approximately the same. The CPU utilization of SRAF is stable and higher than that of SVM and Symbiosis because of the sharing mode and VM migration. Figure 10 presents the total cost of SRAF, Symbiosis, and SVM at different λ. In contrast to Figures 8 and 9, where all three frameworks follow the same trend, the total cost of SVM and Symbiosis increases as λ grows.
The total cost of SRAF decreases. There are two reasons for this. Firstly, because of its higher bandwidth and CPU utilization, SRAF with the sharing mode saves more waiting time and service time for all tasks, especially for short tasks queued behind long tasks. In contrast, the one-by-one service method of Symbiosis is the main factor limiting its CPU and bandwidth utilization. Therefore, facing the same tasks, Symbiosis wastes more resources and takes a longer service time than SRAF. Secondly, VM migration can reduce frequent scaling operations. For example, the total cost of SVM is lower than that of Symbiosis at each λ in Figure 10. This is because the control policy can make a trade-off between the VM migration cost and the scaling operation cost. The control policy can make predictions from the history data and select the action with the minimal cost, which reduces the additional cost. Therefore, taking Figures 8-10 together, SRAF performs better on stability and robustness.
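The trade-off between migration cost and scaling cost can be illustrated with a toy Markov decision process solved by value iteration. This is only a sketch of the general technique: the two states, the two actions, the transition probabilities, and the costs below are invented for illustration and are not the MDP defined in this paper.

```python
def value_iteration(states, actions, P, C, gamma=0.9, tol=1e-8):
    """Minimize expected discounted cost; return (values, greedy policy)."""
    V = {s: 0.0 for s in states}
    while True:
        # Q[s][a] = immediate cost + discounted expected future cost
        Q = {
            s: {
                a: C[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a].items())
                for a in actions
            }
            for s in states
        }
        new_V = {s: min(Q[s].values()) for s in states}
        delta = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if delta < tol:
            policy = {s: min(Q[s], key=Q[s].get) for s in states}
            return V, policy


# Hypothetical model: a VM is either "normal" or "overloaded".
states = ["normal", "overloaded"]
actions = ["scale", "migrate"]
# Hypothetical transition probabilities P[state][action][next_state].
P = {
    "normal": {
        "scale": {"normal": 0.8, "overloaded": 0.2},
        "migrate": {"normal": 0.95, "overloaded": 0.05},
    },
    "overloaded": {
        "scale": {"normal": 0.3, "overloaded": 0.7},
        "migrate": {"normal": 0.9, "overloaded": 0.1},
    },
}
# Hypothetical per-step costs C[state][action].
C = {
    "normal": {"scale": 1.0, "migrate": 3.0},
    "overloaded": {"scale": 4.0, "migrate": 5.0},
}

values, policy = value_iteration(states, actions, P, C)
```

Under these made-up numbers the resulting policy scales a normal VM and migrates an overloaded one: migration costs more per step, but it is chosen when it sufficiently lowers the chance of remaining overloaded and paying repeated scaling costs.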

Conclusions
In this paper, we have designed, modeled, and evaluated SRAF, which aims to reduce the latency of service requests in mobile data networks and enhance the utilization of bandwidth and CPU resources. In SRAF, we have proposed the SCA to classify tasks. We also designed a sharing mode that combines the processing of two tasks. The sharing mode greatly reduces the waiting time during service. In addition, we designed an MDP to control VM migration, and we combine VM migration with scaling to enhance resource utilization. Finally, we conducted a range of experiments showing that SRAF performs well on resource utilization, stability, and robustness.

(3) The Combination Scheduling Manager uses the classification results from the Classification Manager to perform combination scheduling of the requests; then, it performs resource allocation and pushes the real information of the requests to the History Loads to update its features (details are given in Section 4.4) during the operation.
(4) The Monitor Manager monitors the CPU and bandwidth utilization of each active VM to support real-time service information.

Figure 1: The sharing mode of the high-type VMs.

Input: request information
Output: line to which request belongs
(1) Initialize G_V, G_F;
(2) Divide the URL into L;
(3) Calculate β_v by circularly comparing the features in L, G_V according to (3) and (4);
(4) Calculate β_v by circularly comparing the features in L, G_F following (3) and (5);
(5) Calculate β by (6);
(6) Output the attribution of the request.
ALGORITHM 1: The description of SCA.
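The control flow of Algorithm 1 can be sketched in Python as follows. This is a rough illustration under stated assumptions: G_V and G_F are treated as two reference feature sets, the URL is split into lowercase tokens, and the match ratio is a stand-in for equations (3)-(6), which are not reproduced here. The function names, the class labels "V" and "F", and the example feature sets are all hypothetical.

```python
import re
from urllib.parse import urlsplit


def url_features(url):
    """Step (2): divide the URL into a set of lowercase tokens (the set L)."""
    parts = urlsplit(url)
    raw = parts.netloc + parts.path + "?" + parts.query
    tokens = re.split(r"[/.\-_?=&]+", raw)
    return {t.lower() for t in tokens if t}


def match_ratio(features, group):
    """Stand-in score: fraction of URL features found in a reference group."""
    if not features:
        return 0.0
    return len(features & group) / len(features)


def classify(url, g_v, g_f):
    """Steps (3)-(6): score the URL against both groups and pick the larger."""
    features = url_features(url)
    score_v = match_ratio(features, g_v)  # comparison against G_V
    score_f = match_ratio(features, g_f)  # comparison against G_F
    return "V" if score_v >= score_f else "F"


# Hypothetical reference feature sets, step (1).
G_V = {"video", "mp4", "stream"}
G_F = {"download", "zip", "file"}
```

For instance, `classify("http://example.com/video/clip.mp4", G_V, G_F)` matches two tokens in `G_V` and none in `G_F`, so the request is attributed to the first group. A self-learning variant would also update the reference sets after each request, which this sketch leaves out.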

Figure 5: The utilization of bandwidth and CPU in SRAF and Symbiosis frameworks at each deadline.

Figure 4: The quantity ratios of three statuses in all tasks.

Figure 6: The total execution time (a) and additional execution time (b) of SRAF, Symbiosis, and Symbiosis + vm-mi frameworks at each deadline.