Cost-Efficient Allocation of Additional Resources for the Service Placement Problem in Next-Generation Internet

One of the major challenges in the next-generation Internet is allocating services to nodes in the network. This problem, known as the service placement problem, can be solved by the layered graph approach. However, because of resource bottlenecks, requests are rejected from the beginning in a resource-constrained network. In this paper we propose two iterative algorithms that allocate additional resources efficiently in order to improve the ratio of accepted service placement requests. To this end, we (1) introduce a new concept of sensitivity for each service node to locate the bottleneck node, (2) state the problem of allocating additional resources, and (3) use sensitivity to propose a simple iterative algorithm and a utilization-based iterative algorithm for efficient resource allocation. The performance of the two algorithms is evaluated by simulation experiments under a variety of parameter settings. The results show that the proposed algorithms increase the request acceptance ratio significantly by allocating additional resources to the bottleneck node and its links. The utilization-based iterative algorithm also decreases the long-term cost by making more efficient use of the additional resources.


Introduction
Huge achievements have been made in Internet technology over the last four decades in supporting a wide array of distributed applications and in providing fundamental end-to-end communication connectivity. However, as the Internet's scale and scope of use have grown, some inherent deficiencies of its architecture have gradually been exposed. Innovations are needed in several aspects of the current Internet architecture, namely, mobility support, reliability and availability, and quality-of-service guarantees [1]. To realize these innovations, many promising network architectures have been designed for the next-generation Internet.
An important approach adopted by some next-generation Internet architectures is to move data processing functions from the end-systems to the routers inside the core of the network. In the context of such networks, a data processing function is considered a service (an in-network service [2]); transcoding, encryption, flow control, and multicast are examples of services [2]. This approach provides greater flexibility through its ability to compose services along the data path to satisfy different communication requirements from end-system applications [3,4]. In such network architectures, a major challenge is to determine on which nodes along the data path the services are placed and to determine the shortest path between these nodes. This problem is known as the service placement problem, which is proven to be NP-hard when constraints on network resources are considered [5].
In the service placement problem, to establish an end-to-end connection, the sequence of services that represents the application functionality and the required network resources must be specified in advance in the end user's request. The data from one end system to another must be routed through nodes where the services specified in the sequence are available and where the network resources are sufficient. The service placement problem is therefore quite different from the traditional routing problem, in which the data always follow the shortest path. Furthermore, in service-centric network architectures (e.g., the Network Service Architecture (NSA)), a service controller performs the mapping algorithm that determines where to place which services [2,3].
The layered graph algorithm, with its low computational complexity, is an efficient solution to the service placement problem when resources are unlimited. Yet it shows limitations in finding valid end-to-end connections because of resource bottlenecks in a resource-constrained network [5]. In this paper, we focus on how to locate the bottleneck based on the layered graph algorithm and how to increase the resource capacity of the bottleneck to improve the performance of the resource-constrained network. To this end, we first introduce a new concept, sensitivity, to represent the impact that a single service node has on the performance of the entire network in terms of the average ratio of accepted service placement requests. To compute the sensitivity of a service node, we remove this node from the service network while keeping the network running and measure the decrease in the average request acceptance ratio. According to the sensitivity values, we locate the bottleneck node, which is the service node with the greatest sensitivity. We then propose two sensitivity-based iterative algorithms, called SI-AAR and UI-AAR, for allocating additional resources to the bottleneck node and its adjacent links to increase the average request acceptance ratio. Both algorithms first compute the sensitivity of each node in the network. SI-AAR then provides a simple way to iteratively increase both the CPU capacity and the bandwidth capacity at the same rate. UI-AAR iteratively supplements additional resources selectively, based on resource utilization ratios. The results of our experiments show that our sensitivity-based iterative algorithms increase the request acceptance ratio significantly by allocating additional resources to the bottleneck node and links. UI-AAR outperforms SI-AAR in terms of the long-term average cost and the amount of allocated additional resources by making more efficient use of the additional resources.
The rest of the paper is organized as follows. In Section 2, we briefly review related work. We state the service placement problem in Section 3. In Section 4, we discuss the concept of sensitivity and state the problem of allocating additional resources. Section 5 describes the method of computing the sensitivity and the two sensitivity-based iterative algorithms, SI-AAR and UI-AAR. Section 6 presents experimental results under a variety of parameter settings. In Section 7, we conclude the paper and outline future research directions.

Related Work
Some network architectures have been designed to support in-network services and provide network functional composition. The Service Integration Control and Optimization (SILO) project considers building blocks of fine-grained functionality as services and combines services to accomplish highly configurable, complex communication tasks [4,6]. The NSA implements an abstraction for communication between end-systems by providing packet processing services inside the network [2,3]. Recent routing architectures, such as programmable routers [7,8] and virtualized router platforms [9], have provided technical support for implementing the above network architectures.
Related service placement problems have been discussed extensively in recent work. In programmable networks, (1) Choi et al. [5] presented a layered graph algorithm for the service placement problem. In this algorithm, a multiple-layer graph is constructed and Dijkstra's shortest path algorithm is applied to the layered graph to find the shortest path. They also proposed a capacity tracking approach to prevent the overuse of resources when capacity constraints are considered, but they did not consider the impact that the bottleneck has on the performance of the entire network in terms of request blocking rate. (2) Huang et al. [10,11] proposed a distributed solution to the service placement problem in resource-unconstrained networks. By introducing a service matrix, this distributed algorithm can determine optimal or near-optimal routes for connection requests; its advantage is that it is well suited to large-scale networks.
In service overlay networks, (1) Raman and Katz [12] also used the layered graph algorithm and focused on load balancing without considering capacity constraints. By introducing a least-inverse-available-capacity (LIAC) metric to reassign link costs in the layered graph, they ensured that links with lower load are preferred over links with higher load. (2) Liang and Nahrstedt [13] presented a greedy heuristic algorithm for the service composition problem that finds low-cost composition solutions. (3) Qian et al. [14] also used a heuristic algorithm to establish a composite service delivery path with the lowest cost; in addition, they considered changes in data size along the data path when choosing services hop-by-hop.
In cloud environments, Tran et al. [15] used a recurrence method to solve the service composition problem, discussing three exact algorithms for three different topologies: path, star, and tree.
To the best of our knowledge, this paper is the first to study the problem of locating and exploiting the bottleneck in service-centric network architectures. Here, we introduce several related research efforts in other network architectures. In the Internet, Hu et al. [16] presented a novel lightweight, single-end active probing tool (Pathneck) that allows end users to efficiently and accurately locate bottlenecks; based on the location information, they used multihoming and overlay routing to avoid the bottlenecks. In virtualized networks, (1) Butt et al. [17] presented a topology-aware measurement method to detect bottleneck nodes and links in the substrate network and then proposed a set of algorithms for reoptimizing and remapping initially rejected virtual network requests. (2) Based on an analysis of resource utilization, Fajjari et al. [18] found that bottlenecked substrate links are the main cause of request rejections; accordingly, they proposed a reactive, iterative algorithm that remaps rejected requests by migrating nodes and their bottlenecked attached links. However, none of these works consider adding resources to the bottleneck node and links to improve the performance of the entire network.

Service Placement Problem
In a network architecture where data processing functions can be implemented inside the network, we term the physical network a service network, a node in this network a service node, and a link in this network a service link. We state the service placement problem formally as follows.
3.1. Service Network. We model the service network as a weighted undirected graph G = (V, E), where V is the set of service nodes and E is the set of service links. The numbers of service nodes and service links are denoted by |V| and |E|, respectively. For node resources, each service node v_i ∈ V is associated with a CPU capacity c(v_i), an available service set S(v_i) = {s_j | service s_j is available on service node v_i}, and a processing time t_{s_j}(v_i) of each service s_j on v_i. For link resources, each service link (u, v) ∈ E between service nodes u and v is associated with a bandwidth capacity b(u, v), denoting the total amount of bandwidth, and a link delay d(u, v).
An example of the service network topology is shown in Figure 1.

3.2. Request. An end user's request for an end-to-end customized composite service can be represented as a six-tuple r = (v_s, v_d, t_a, t_d, b, RS). Here v_s is the source node, v_d is the destination node, t_a is the arrival time, t_d is the duration time, b is the required bandwidth capacity, and RS is the required service set, which is composed of a service number sn and an ordered list of services sl. Each service sl_j ∈ sl, where j is the index of the service in the ordered list, is associated with a CPU capacity requirement c(sl_j). For example, a request might be r = (u, v, 90, 1050, 200 Mb/s, {4, (s_3 → s_1 → s_4 → s_2)}) with c(sl_1) = 5, c(sl_2) = 10, c(sl_3) = 10, c(sl_4) = 5.
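As an illustration, the request tuple above can be modeled as a small data structure. This is a sketch only; the `Request` class and its field names are our own, not notation from the paper.

```python
from dataclasses import dataclass, field

@dataclass
class Request:
    src: str            # source node v_s
    dst: str            # destination node v_d
    arrival: float      # arrival time t_a
    duration: float     # duration time t_d
    bandwidth: float    # required bandwidth b (Mb/s)
    services: list      # ordered service list sl
    cpu_req: dict = field(default_factory=dict)  # CPU demand per service

# The example request from the text: r = (u, v, 90, 1050, 200 Mb/s, ...)
r = Request("u", "v", 90, 1050, 200.0,
            ["s3", "s1", "s4", "s2"],
            {"s3": 5, "s1": 10, "s4": 10, "s2": 5})
```

A request object like this carries everything the placement algorithm needs: the endpoints, the lifetime, the bandwidth to reserve on every hop, and the per-service CPU demands.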

3.3. Service Path. Given the end user's request r and the service network G, an end-to-end service path is a path from the source service node v_s to the destination service node v_d along which all required services in the service list sl are processed in sequence.

3.3.1. Service-to-Node Mapping. Each service in the ordered service list sl is processed by a service node on the end-to-end service path through a mapping M_S : sl → V from services to service nodes such that, for all sl_j ∈ sl: (1) if M_S(sl_j) = M_S(sl_j') with j ≠ j', then sl_j need not equal sl_j', which means multiple services can be performed on a single service node; (2) if M_S(sl_j) = ∅, the service sl_j is not performed on any service node.

3.3.2. Service Path. Given the service-to-node mapping, the end-to-end service path is denoted by

P² = (v_{m_1}, v_{m_2}, . . . , v_{m_n}),

where v_{m_1} = v_s is the source node, v_{m_n} = v_d is the destination node, v_{m_2}, v_{m_3}, . . . , v_{m_{n−1}} are intermediate service nodes, (v_{m_i}, v_{m_{i+1}}) is a service link, and the number of hops, hops(P²), equals n − 1. M_{v_{m_i}} is the set of services mapped onto service node v_{m_i}, defined as M_{v_{m_i}} = {sl_j | M_S(sl_j) = v_{m_i}}; if M_{v_{m_i}} = ∅, no service is performed on v_{m_i}.
To guarantee the validity of the service path, several requirements have to be met.
(1) All service nodes must have sufficient CPU capacity for the services mapped onto them: for all v_{m_i} ∈ P², Σ_{sl_j → v_{m_i}} c(sl_j) ≤ c(v_{m_i}), where sl_j → v_{m_i} ∈ M_{v_{m_i}} indicates that service sl_j is performed on service node v_{m_i}. (2) All service links along the path must have sufficient available bandwidth for the required bandwidth b.

3.4. Objective. The delay of an end-to-end service path, D(P²), is defined as the sum of the service processing times on service nodes and the communication delays on service links along the path:

D(P²) = Σ_{i=1}^{n} Σ_{sl_j ∈ M_{v_{m_i}}} t_{sl_j}(v_{m_i}) + Σ_{i=1}^{n−1} d(v_{m_i}, v_{m_{i+1}}),

where v_{m_1} = v_s, v_{m_n} = v_d, and n is the number of service nodes in P². The objective of the service placement problem in this paper is to find a least-delay service path among all valid paths P².
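The delay objective can be sketched directly from its definition: sum the processing times of the services mapped to each node, then add the link delays along the path. All names and numbers below are illustrative, not from the paper.

```python
def path_delay(nodes, proc_time, link_delay, mapping):
    """nodes: ordered nodes of the path; mapping: node -> list of services."""
    total = 0
    for n in nodes:
        for s in mapping.get(n, []):
            total += proc_time[(s, n)]   # processing time of service s on n
    for a, b in zip(nodes, nodes[1:]):
        total += link_delay[(a, b)]      # communication delay of each hop
    return total

nodes = ["A", "B", "C"]
proc = {("s1", "A"): 2, ("s2", "B"): 3}
delays = {("A", "B"): 1, ("B", "C"): 4}
d = path_delay(nodes, proc, delays, {"A": ["s1"], "B": ["s2"]})  # 2+3+1+4 = 10
```

Note that a node carrying no service (here "C") contributes only its incident link delays, matching the definition of D(P²).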
Owing to the finite nature of network resources, capacity constraints are crucial when solving the service placement problem. When an end-to-end connection request arrives, the service network must decide whether to accept the request according to its specification. If the request is accepted, the service network operator places services on service nodes and allocates CPU capacity on the corresponding nodes and link bandwidth on service links to establish the least-delay end-to-end service path. Once the end user leaves, the service path is destroyed and the allocated resources are released.
In this paper, we make the following assumptions: (1) the requirements for resources and services specified in an end user's connection request do not change over the duration of the connection; (2) an end-to-end service path, established according to an end user's connection request, is fixed during the lifetime of the connection.

Problem of Allocating Additional Resources into Service Network Based on Sensitivity
The layered graph with capacity tracking algorithm is an efficient approach to the service placement problem. However, the layered graph algorithm does not perform well when the capacity of network resources is limited, mainly because of the NP-hard nature of the problem and the presence of resource bottlenecks. As a result, a valid service path cannot always be found even when one exists, and end-to-end connection requests are blocked from the beginning [5].
To address the resource bottleneck problem, we propose two iterative algorithms for the efficient allocation of additional resources in order to improve performance in terms of the average request acceptance ratio, denoted by AR. To this end, we (1) introduce a new concept of sensitivity for each service node to locate the bottleneck node, (2) state the problem of allocating additional resources into the service network based on sensitivity, and (3) use sensitivity to propose a simple iterative algorithm and a utilization-based iterative algorithm for the efficient allocation of additional resources.

4.1. Definition of Sensitivity.
In the service network, a service node can perform complicated data processing functions.
In addition, each service node has different resources, for example, CPU capacity, processing power, available services, storage, and memory. The sensitivity of a service node represents the impact that this node has on the performance of the entire network (e.g., on the average request acceptance ratio). Once the most sensitive service node (the bottleneck node) is located, the owner of the service network (e.g., an Infrastructure Provider (InP)) has an opportunity to improve the performance of the entire network simply by supplementing additional resource capacities at a single node.
To calculate the sensitivity of a service node, we remove or shut down one service node v_i at a time from the service network, keep the network running, and measure the average request acceptance ratio without v_i, denoted AR(v_i). If AR(v_i) drops significantly, the service node v_i plays a vital role in the service network and has high sensitivity.
In the resource-constrained network, the average request acceptance ratio is a significant performance metric that determines how many end users' requests are accepted. The set of sensitivities of all service nodes in the service network G is a vector Sen = (Sen_0, Sen_1, . . . , Sen_i, . . . , Sen_{|V|−1}), where the element Sen_i, representing the sensitivity of service node v_i, is defined as

Sen_i = (AR − AR(v_i)) / AR.
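The sensitivity definition can be sketched in a few lines: each node's sensitivity is the relative drop in acceptance ratio when that node is removed. The AR values below are made-up numbers for illustration, not measurements from the paper.

```python
def sensitivities(ar_full, ar_without):
    """ar_full: baseline acceptance ratio AR; ar_without: {node: AR(v_i)}."""
    return {v: (ar_full - ar_v) / ar_full for v, ar_v in ar_without.items()}

def most_sensitive(sen):
    return max(sen, key=sen.get)   # the bottleneck node

sen = sensitivities(0.80, {"v1": 0.78, "v2": 0.60, "v3": 0.75})
bottleneck = most_sensitive(sen)   # removing v2 hurts AR most
```

Here removing v2 drops AR from 0.80 to 0.60, a sensitivity of 0.25, so v2 is identified as the bottleneck node.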
After calculating every service node's sensitivity, we identify the service node with the greatest sensitivity and term it the most sensitive node. We term the adjacent service links of the most sensitive node sensitive links. We then focus on increasing the CPU capacity of the most sensitive node or the bandwidth capacity of the sensitive links to improve the average request acceptance ratio of the entire network.

4.2. Problem Statement.
The problem of allocating additional resources based on sensitivity is stated as follows. We first define the total amount of additional resources added to the service network as

Φ(Res) = α ⋅ c(v_b) + Σ_{l ∈ L_b} β_l ⋅ b(l),

where v_b is the most sensitive node, L_b is the set of sensitive links, α is the multiple by which the CPU capacity of v_b is increased, and β_l is the multiple by which the bandwidth capacity of sensitive link l is increased. Once the most sensitive node is located, the main objective is to devise algorithms that allocate additional resources efficiently to improve the performance of the entire network.
Similar to previous work [15,19,20], the revenue (i.e., economic profit) of accepting an end user's request r at time t can be defined as the resources that r requires multiplied by their prices:

R(r, t) = p_cpu ⋅ Σ_{sl_j ∈ sl} c(sl_j) + p_bw ⋅ b,

where p_cpu is the CPU capacity usage price per required resource unit (e.g., $/instance⋅hour) and p_bw is the bandwidth usage price per required resource unit (e.g., $/Gb⋅hour). R(r, t) represents the total price that the end user pays to the InP. The cost of building a service path for an end user's request at time t can be defined as the total amount of resource capacity that the InP allocates to the service path P² multiplied by its unit costs:

C(r, t) = c_cpu ⋅ Σ_{sl_j ∈ sl} c(sl_j) + c_bw ⋅ b ⋅ hops(P²),

where c_cpu is the CPU capacity usage cost per used resource unit (e.g., $/instance⋅hour) and c_bw is the link bandwidth usage cost per used resource unit (e.g., $/Gb⋅hour). The cost of serving an end user's request thus depends mainly on the number of hops of the chosen service path.
Accordingly, the cost per time unit caused by adding additional resources to the service network is defined as the total amount of additional resources multiplied by their unit costs:

C(Φ(Res)) = c_cpu ⋅ α ⋅ c(v_b) + c_bw ⋅ Σ_{l ∈ L_b} β_l ⋅ b(l).

After a service path is established, the resources allocated to it remain occupied for the whole lifetime of the corresponding request. Thus the total revenue and cost of serving an end user's request are determined by its lifetime t_d and are given by R(r, t) ⋅ t_d and C(r, t) ⋅ t_d, respectively.
In general, the additional resources are allocated permanently to the service network. Hence, the total cost they incur is determined by the running time T of the service network and is given by C(Φ(Res)) ⋅ T.
From the InP's point of view, an effective and efficient algorithm for allocating additional resources should minimize the amount of additional resources while maximizing the average request acceptance ratio and the long-run average revenue of the InP. The long-term average revenue of the InP, denoted R̄, is defined as

R̄ = lim_{T→∞} ( Σ_{r ∈ R_accepted} R(r, t) ⋅ t_d ) / T.

The average request acceptance ratio (AR) of the service network is defined as

AR = lim_{T→∞} |R_accepted| / |R|,

where |R_accepted| is the number of requests successfully accepted by the service network and |R| is the total number of requests. When the sensitivity-based iterative algorithms are used to allocate additional resources, the long-term average cost of the InP, which must take the cost of the additional resources into account and is denoted C̄, is defined as

C̄ = lim_{T→∞} ( Σ_{r ∈ R_accepted} C(r, t) ⋅ t_d + C(Φ(Res)) ⋅ T ) / T.

We measure the efficiency of allocating additional resources by the ratio of long-term average revenue to cost (the R/C ratio), defined as R/C = R̄ / C̄. Our objective is to minimize the amount of additional resources Φ(Res) allocated to the service network while accepting the largest possible number of end users' requests. We also want to increase the long-term average revenue of the InP and decrease its long-term average cost. When the average request acceptance ratios of the proposed algorithms are nearly the same, we prefer the one that supplements the least amount of resources Φ(Res) and offers the highest long-term R/C ratio.
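The revenue and cost definitions can be illustrated numerically. This sketch charges requested CPU and bandwidth for revenue, and allocated CPU plus bandwidth on every hop for cost; the prices, costs, and the two sample requests are made-up numbers, not values from the paper.

```python
def revenue(cpu_reqs, bw, p_cpu=1.0, p_bw=1.0):
    # Revenue: requested CPU and bandwidth at their prices.
    return p_cpu * sum(cpu_reqs) + p_bw * bw

def cost(cpu_reqs, bw, hops, c_cpu=1.0, c_bw=1.0):
    # Cost: allocated CPU plus bandwidth reserved on every hop of the path.
    return c_cpu * sum(cpu_reqs) + c_bw * bw * hops

# (cpu requirements, bandwidth, hops, lifetime) for two accepted requests:
reqs = [((5, 10), 20, 3, 1000), ((10,), 15, 2, 500)]
R = sum(revenue(c, b) * life for c, b, h, life in reqs)   # total revenue
C = sum(cost(c, b, h) * life for c, b, h, life in reqs)   # total cost
rc_ratio = R / C
```

The cost exceeds the revenue here because bandwidth is paid for once by the user but reserved on every hop by the InP, which is why the chosen path's hop count matters.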
The problem of allocating additional resources stated above is a multiobjective optimization problem with conflicting objectives, a class of combinatorial optimization problems known to be NP-hard [21]. In practice, we can only balance the above objectives by designing effective and efficient algorithms. For example, we cannot supplement additional resources without limit, even though the average request acceptance ratio (AR) and the long-term average revenue increase sharply at the beginning. As Φ(Res) grows, (1) the corresponding cost C(Φ(Res)), which is proportional to Φ(Res), increases; (2) the increases in AR and revenue eventually level off; and (3) from some point on, the growth in cost significantly exceeds the growth in revenue. Consequently, the R/C ratio eventually reaches a value (e.g., R/C < 0.5) that is unacceptable for an InP. To achieve a better balance, we devise two iterative algorithms for allocating additional resources based on the computation of sensitivity, denoted SI-AAR and UI-AAR, and discuss them in detail in Section 5.

4.3. Measurement of Resources.
To allocate additional resources efficiently, some resource metrics need to be defined and calculated in advance.

Resources on Service Nodes and Links. The available capacity of a service node v_i ∈ V, denoted a(v_i), is defined as its unallocated CPU capacity:

a(v_i) = c(v_i) − Σ_{sl_j → v_i} c(sl_j).

The capacity utilization ratio of a service node, denoted U_n(v_i), is defined as the total amount of CPU capacity allocated to the services performed on the node divided by its CPU capacity c(v_i):

U_n(v_i) = ( Σ_{sl_j → v_i} c(sl_j) ) / c(v_i).

The average utilization ratio of all service nodes is the sum of the utilization ratios of all service nodes divided by the number of service nodes:

Ū_n = ( Σ_{v_i ∈ V} U_n(v_i) ) / |V|.

The capacity utilization ratio of a service link l, denoted U_l(l), is defined as the total amount of link bandwidth allocated to established service paths on l divided by its bandwidth capacity b(l), and the average utilization ratio of all service links is

Ū_l = ( Σ_{l ∈ E} U_l(l) ) / |E|.
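The utilization metrics reduce to simple ratios and averages. The allocation/capacity pairs below are illustrative numbers only.

```python
def util(allocated, capacity):
    # Utilization ratio of one node (CPU) or one link (bandwidth).
    return allocated / capacity

def avg(xs):
    # Network-wide average utilization.
    return sum(xs) / len(xs)

node_u = [util(a, c) for a, c in [(50, 100), (80, 100), (20, 100)]]
link_u = [util(a, c) for a, c in [(10, 100), (30, 100)]]
delta_u = avg(node_u) - avg(link_u)   # positive: CPU is the scarcer resource
```

The sign and size of `delta_u` is exactly the quantity UI-AAR later inspects to decide which resource to supplement.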

Sensitivity-Based Iterative Algorithms for Allocating Additional Resources
5.1. The Sensitivity Computing Method. The main task of this algorithm (Algorithm 1) is to set up the layered graph and run the capacity tracking algorithm, known as layered graph with capacity tracking (LG-CT), to record the performance metrics of the service network, for example, the average request acceptance ratio (AR), the long-term average revenue, the long-term average cost, the average node utilization (Ū_n), and the average link utilization (Ū_l). We then remove one service node v_i at a time from the service network G and run LG-CT again to compute the corresponding Sen_i ∈ Sen. The most sensitive node v_b corresponds to the greatest element of the vector Sen. Having located the most sensitive node, we design two iterative algorithms for allocating additional resources, SI-AAR and UI-AAR, both taking the service network as input. In these two algorithms we only consider supplementing additional resources at the most sensitive node and sensitive links, in order to avoid increasing the long-run average cost of the InP and lowering its R/C ratio. An iterative method is used for allocating the additional resources, since the exact values of α and β cannot be predicted directly. On the one hand, inadequate additional resources provide no significant improvement in performance; on the other hand, excessive additional resources result in an overuse of resources. Iteration provides an effective way to find proper values of α and β by gradually increasing the resource capacity. Details of the two algorithms are given below.
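The layered-graph routine that Algorithm 1 repeatedly invokes can be sketched as follows: one copy of the network per service in the sequence plus one, an inter-layer edge at every node that offers the next required service, and Dijkstra from the source in layer 0 to the destination in the last layer. This is a minimal sketch of the idea from [5] under our own naming; the real LG-CT also tracks CPU and bandwidth capacities, which we omit here.

```python
import heapq

def layered_shortest_path(adj, services_on, seq, src, dst):
    """adj: {u: [(v, delay)]}; services_on: {node: set of services};
    seq: ordered service list. Returns the least path delay or None."""
    k = len(seq)
    dist = {(src, 0): 0}
    pq = [(0, (src, 0))]
    while pq:
        d, (u, layer) = heapq.heappop(pq)
        if d > dist.get((u, layer), float("inf")):
            continue                          # stale heap entry
        if (u, layer) == (dst, k):
            return d                          # all services performed in order
        nxt = [((v, layer), d + w) for v, w in adj.get(u, [])]
        if layer < k and seq[layer] in services_on.get(u, set()):
            nxt.append(((u, layer + 1), d))   # perform the next service at u
        for state, nd in nxt:
            if nd < dist.get(state, float("inf")):
                dist[state] = nd
                heapq.heappush(pq, (nd, state))
    return None                               # no valid service path exists

adj = {"A": [("B", 1)], "B": [("A", 1), ("C", 2)], "C": [("B", 2)]}
best = layered_shortest_path(adj, {"B": {"s1"}}, ["s1"], "A", "C")  # 1 + 2 = 3
```

In the example the request must traverse A → B, have s1 performed at B (the layer transition, which costs no delay), and then continue B → C, giving a total delay of 3.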

5.2. Simple Iterative Algorithm for Allocating Additional Resources (SI-AAR). SI-AAR (Algorithm 2) describes a simple way to iteratively increase both CPU capacity and bandwidth capacity simultaneously. The algorithm begins by locating the most sensitive node v_b in the service network G. SI-AAR then supplements α ⋅ c(v_b) CPU capacity at v_b and β_l ⋅ b(l) bandwidth capacity on each sensitive link l. Next, it reruns LG-CT and calculates the R/C ratio and the increment in AR, denoted ΔAR. In each iteration, SI-AAR compares ΔAR with a small positive fraction ε and R/C with a threshold parameter θ, which is also a positive fraction. The algorithm terminates under two conditions: (1) if ΔAR < ε, AR has converged; (2) if R/C < θ, further allocation becomes unacceptable from the standpoint of economic profit. Otherwise, SI-AAR enters the next iteration.
To adjust the increment of additional resources added to the service network in each iteration, we introduce two increment parameters, Δ_cpu and Δ_bw. Both are integers greater than zero and indicate the increments in the amount of CPU capacity and bandwidth capacity, respectively, per iteration. In realistic network environments, the capacities of network resources (e.g., CPU, storage, bandwidth) are supplemented in numbers of instances, that is, in multiples of a base capacity. For example, we can increase CPU capacity by adding one or more CPU instances, storage capacity by adding one or more hard disks, and bandwidth capacity by adding one or more communication links. It is therefore reasonable to increase a resource capacity by one or more multiples in each iteration. I is a constant vector whose elements are all 1 and whose size equals that of β; β = n_bw ⋅ I indicates that the bandwidth capacities of all sensitive links are increased by the same multiple.
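The SI-AAR loop can be sketched as follows, under our assumptions: grow the CPU and bandwidth multiples by fixed increments, re-evaluate the network, and stop when the AR gain drops below ε or R/C drops below θ. Here `fake_eval` merely stands in for rerunning LG-CT and returns made-up (AR, R/C) values.

```python
def si_aar(evaluate, d_cpu=1, d_bw=1, eps=0.01, theta=0.6, max_iter=50):
    alpha = beta = 0
    prev_ar, _ = evaluate(alpha, beta)
    for _ in range(max_iter):
        alpha, beta = alpha + d_cpu, beta + d_bw
        ar, rc = evaluate(alpha, beta)
        if ar - prev_ar < eps or rc < theta:
            break            # AR converged, or R/C no longer acceptable
        prev_ar = ar
    return alpha, beta, prev_ar

def fake_eval(alpha, beta):
    # Stub for LG-CT: AR saturates at 0.80; R/C degrades as resources grow.
    return min(0.60 + 0.05 * alpha, 0.80), 1.0 - 0.02 * alpha

a, b, final_ar = si_aar(fake_eval)
```

With this stub the loop stops one step after AR saturates, returning the last accepted multiples; with a stricter θ it would stop earlier on the R/C condition instead.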

5.3. Utilization-Based Iterative Algorithm for Allocating Additional Resources (UI-AAR).
The average utilization ratios of service nodes and service links reflect how much of the network resource capacity is used and which resource is insufficient. The value of the average utilization ratio is determined by several factors, for example, the arrival rate of requests, the required resources, and the available capacities. We observe the recorded average node utilization ratio Ū_n and average link utilization ratio Ū_l and compute the difference between them, ΔU = Ū_n − Ū_l. We find that a much higher increase in the average request acceptance ratio can be achieved efficiently by adding capacity only to the resource with the higher average utilization ratio, while reducing the cost of the additional resources. For example, if ΔU = 30%, CPU capacity is the scarce resource under much higher stress, and we can improve the average request acceptance ratio significantly by supplementing only the CPU capacity of the most sensitive node; in this case, the average request acceptance ratio increases only slightly if we add only bandwidth capacity to the sensitive links.
UI-AAR (Algorithm 3) introduces a mechanism to supplement additional resources at the most sensitive node and sensitive links separately or simultaneously, as far as the increment and threshold parameters allow. To allocate additional resources into the service network, UI-AAR has three choices: adding only CPU capacity at the most sensitive node (Algorithm 4), adding only bandwidth capacity on the sensitive links (Algorithm 5), or adding both (AACBC). AACBC combines the method of Algorithm 4 with that of Algorithm 5: (1) initialize n_cpu = 0, α = 0, β = (0, 0, . . . , 0); (2) repeat: (3) set n_cpu = n_cpu + Δ_cpu and α = n_cpu; (4) add α ⋅ c(v_b) CPU capacity at v_b; (5) compute β using Algorithm 5, except that, in its first line, β is not reset to (0, 0, . . . , 0). Like Algorithm 4, AACBC gradually increases the CPU capacity of the most sensitive node by α times; for each value of α, it uses Algorithm 5 to compute β and supplements the corresponding resources Φ(Res) into the service network. While the continuation condition holds, AACBC adds Δ_cpu to α and repeats the above process until AR converges.
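The utilization-based selection rule can be sketched as a comparison of the two average utilizations against a gap threshold; the threshold name `lam` and the return labels are our own assumptions, not the paper's notation.

```python
def choose_resource(avg_node_util, avg_link_util, lam=0.10):
    """Decide which resource UI-AAR should supplement."""
    delta = avg_node_util - avg_link_util
    if delta > lam:
        return "cpu"          # nodes are under clearly higher stress
    if delta < -lam:
        return "bandwidth"    # links are under clearly higher stress
    return "both"             # utilizations are close: grow both (AACBC)
```

For the ΔU = 30% example in the text, `choose_resource(0.9, 0.6)` selects "cpu", so only the most sensitive node's CPU capacity would be grown.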

Performance Evaluation
In this section, we first describe the evaluation environment and then present our main experimental results evaluating the efficiency of the sensitivity-based iterative algorithms in terms of average request acceptance ratio, allocated additional resources, long-term average revenue, long-term average cost, and R/C ratio. Our evaluation focuses primarily on the performance comparison between the two proposed algorithms and on the advantages of sensitivity-based bottleneck locating.

6.1. Evaluation Environment.
We have implemented a discrete event simulator to evaluate our proposed algorithms. Three different settings are chosen for the following experiments; the differences among them are introduced in Section 6.1.3 and shown in Table 2.

6.1.1. Service Network Topology.
In this paper, we do not assume any specialized network topology. We use the GT-ITM tool [22] to randomly generate service network topologies. Each service network has a 20-node, 27-link topology with a choice of 4 different services, a scale that corresponds to a small ISP. Specifically, the number of services available on a single service node is an integer uniformly distributed between 1 and 4. The bandwidth capacities of service links and the CPU capacities of service nodes are real numbers uniformly distributed between 50 and 100. The communication delay of each service link is an integer proportional to its Euclidean distance, normalized between 1 and 10. Each service's processing time is an integer that depends on its type and on the CPU power of its service node, normalized between 1 and 10.

6.1.2. Request.
We assume that end users' requests arrive in a Poisson process with an average rate of 4 requests per 100 time units. Each request has an exponentially distributed duration with an average of 1000 time units. We run each instance of the simulation for about 50,000 time units, which corresponds to about 2,000 requests. The required bandwidth for service links and the required CPU capacity for each service are configured according to the settings shown in Table 2. The number of services in each request is fixed at 4; the ordered service list consists of the 4 different services s_1, s_2, s_3, and s_4 in random order.
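The workload just described can be sketched with a seeded generator: exponential inter-arrival gaps give the Poisson arrival process, and exponential lifetimes give the request durations. The seed and helper name are our own choices.

```python
import random

def generate_requests(horizon=50_000, rate=4 / 100, mean_life=1000, seed=7):
    """Yield (arrival_time, duration) pairs for one simulation run."""
    rng = random.Random(seed)
    t, reqs = 0.0, []
    while True:
        t += rng.expovariate(rate)           # exponential inter-arrival gap
        if t > horizon:
            return reqs
        reqs.append((t, rng.expovariate(1 / mean_life)))  # request lifetime

reqs = generate_requests()
# With rate 4/100 over 50,000 time units we expect roughly 2,000 requests.
```

This matches the stated scale of the experiments: roughly 2,000 requests per 50,000-time-unit run.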

6.1.3. Differences in Three Settings.
To observe the performance of our algorithms under different resource utilization, three experimental settings are configured for our simulation. They differ only in the required CPU capacity for each service and the required bandwidth for service links. Setting I represents the scenario in which the service network is under more pressure from requested CPU capacity than from requested bandwidth. Conversely, in Setting II the resource stress that the service network encounters stems mainly from the requested bandwidth. In Setting III, as the resource requirements increase, the available capacities of both service nodes and service links become insufficient to accept more requests. Details are shown in Table 2.

Compared Algorithms and Parameter Settings.
In our simulator, we implement our two sensitivity-based iterative algorithms for allocating additional resources: (1) SI-AAR and (2) UI-AAR. We also use two specific cases of SI-AAR, denoted SI-AAR-Least and SI-AAR-RUB (referenced upper bound, RUB), to provide a lower bound and a reference upper bound, respectively, on the amount of additional resources. In SI-AAR-Least, the CPU multiplier is set to 1 and the bandwidth multiplier vector to (1, 1, . . ., 1), representing the least amount of resources that SI-AAR can add to the service network. In SI-AAR-RUB, the CPU multiplier is set to 9 and the bandwidth multiplier vector to (9, 9, . . ., 9); that is, the capacities of the most sensitive node and its sensitive links are increased to 10 times their original values, which is not practical but provides a reference upper bound on the performance of realistic algorithms. Based on extensive simulations, we tune the parameters of the two proposed algorithms: the initial multipliers and the increments Δcpu and Δbw are set to 1, and the tolerance thresholds to 1%, 0.5%, 0.6, and 10%, respectively, to achieve the greatest increase in average request acceptance ratio while keeping the amount of additional resources and the number of iterations low.

Performance Metrics.
In our experiments, we use several performance metrics introduced in previous sections for evaluation: the average request acceptance ratio (AR), the amount of allocated additional resources, the long-term average revenue, the long-term average cost, and the long-term revenue-to-cost (R/C) ratio. We measure AR together with the average request acceptance ratio obtained when a given node is removed, in order to compute the sensitivities and locate the most sensitive node in the original service network (i.e., the service network without any additional resources). To evaluate the effectiveness and efficiency of our proposed algorithms, we also measure the average request acceptance ratio, average revenue, average cost, and R/C ratio for end users' requests over time in the service network after the corresponding additional resources have been added. In the meantime, we record the values of the two multiplier parameters to compute the amount of additional resources eventually allocated into the service network by each evaluated algorithm.
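The long-term averages above can be sketched as a simple aggregation over the per-request simulation log. This is an illustrative sketch: the tuple layout and the assumption that revenue and cost accrue only for accepted requests are ours, not taken from the paper.

```python
def long_term_metrics(events):
    """events: list of (accepted, revenue, cost) tuples, one per request.
    Returns the evaluation metrics: acceptance ratio (AR), average
    revenue, average cost, and the R/C ratio."""
    if not events:
        return 0.0, 0.0, 0.0, 0.0
    accepted = sum(1 for a, _, _ in events if a)
    revenue = sum(r for a, r, _ in events if a)
    cost = sum(c for a, _, c in events if a)
    ar = accepted / len(events)
    rc = revenue / cost if cost > 0 else 0.0
    return ar, revenue / len(events), cost / len(events), rc
```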

Evaluation Results
We present the evaluation results to show the effectiveness and quantify the efficiency of the proposed algorithms under three different scenarios (depicted in Figures 2, 3, and 4).
We first plot AR and the acceptance ratio measured with each node removed against the index of service nodes to show the computation of sensitivities (depicted in Figures 2(a), 3(a), and 4(a)). We use a bar chart to compare the amounts of allocated additional resources (depicted in Figures 2(b), 3(b), and 4(b)). Next, we plot the average request acceptance ratio, average revenue, average cost, and R/C ratio against time to show how each algorithm performs in the long run (depicted in Figures 2(c)-2(f), 3(c)-3(f), and 4(c)-4(f)). We summarize our key observations for the three settings (Table 2) as follows.
6.4.1. Sensitivity Computing. We locate the most sensitive node through the computation of sensitivities and choose the strategy of allocating additional resources for UI-AAR according to the results of the resource utilization computation (shown in Table 3).
(3) Setting III: the configurations of required CPU capacity and bandwidth (Table 2) show that both CPU capacity and bandwidth capacity are under heavy pressure. That is why the average request acceptance ratio is only 64% and the CPU utilization is close to the bandwidth utilization (shown in Figure 4(a) and Table 3). Given that node 7 is the most sensitive node and the utilization difference Δ = 3% falls within the threshold interval, the CPU capacity of node 7 and the bandwidth of node 7's sensitive links are chosen to be supplemented together in UI-AAR.
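The sensitivity computation described above can be sketched as follows. Sensitivity is taken here as the drop in average request acceptance ratio when a node is excluded, which is one plausible reading of AR versus the per-node AR-without measurements; the paper's exact formula may differ.

```python
def most_sensitive_node(ar_baseline, ar_without):
    """ar_baseline: AR of the full service network.
    ar_without: {node_id: AR measured with that node removed}.
    The node whose removal causes the largest drop in AR is
    taken to be the bottleneck (most sensitive) node."""
    sensitivity = {n: ar_baseline - ar for n, ar in ar_without.items()}
    best = max(sensitivity, key=sensitivity.get)
    return best, sensitivity
```

For example, with a baseline AR of 64% and node 7's removal dropping AR to 50%, node 7 is identified as the bottleneck.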

6.4.2. Allocating Additional Resources into the Most Sensitive Nodes and Sensitive Links Leads to a Higher Average Request Acceptance Ratio. In Figures 2(a), 3(a), and 4(a), we also plot the average request acceptance ratio against time to show how the original LG-CT algorithm, denoted by Original, performs without any additional resources being added to the service network. It is important to note that we only present the effectiveness of the proposed algorithms here. We discuss their efficiency, that is, whether and how they make efficient use of allocated additional resources, in Section 6.4.3.
From Figures 2(a), 3(a), and 4(a), it is evident that the proposed algorithms, UI-AAR and SI-AAR, and the two specific cases, SI-AAR-Least and SI-AAR-RUB, achieve a higher average request acceptance ratio than Original under all three scenarios. These graphs show that sensitivity-based bottleneck locating plays a vital role in the allocation of additional resources: supplementing additional resources only at the bottleneck node (the most sensitive node) and links (sensitive links) is an effective way to increase the average request acceptance ratio.
In the three settings, after time unit 20,000, the average request acceptance ratios of SI-AAR and UI-AAR are nearly the same. In addition, the approximate increments in average request acceptance ratio (e.g., 13.4% and 13.6% in Setting I, 15.6% and 15.4% in Setting II, and 16% and 15.8% in Setting III at time unit 50,000) show that both SI-AAR and UI-AAR perform well. In Setting I, SI-AAR-RUB attains a much higher acceptance ratio than SI-AAR and UI-AAR, since it supplements ten times the amount of additional resources. However, in Setting II and Setting III, the average request acceptance ratio of SI-AAR-RUB is only 0.4% and 2.2% higher than that of SI-AAR, respectively, and only marginally higher (e.g., 0.6%) than that of UI-AAR. On the contrary, SI-AAR-Least produces the smallest increase in acceptance ratio among the four algorithms. The reasons for these two situations are analyzed in Section 6.4.3.

6.4.3. UI-AAR Can Allocate the Additional Resources More Efficiently. It is worth noting that the evaluated algorithms that lead to a higher acceptance ratio also produce higher long-term average revenue (depicted in Figures 2(c), 2(d), 3(c), 3(d), 4(c), and 4(d)) and higher long-term average cost (excluding the cost produced by additional resources).
This is because (1) according to the definition of the long-term average revenue, the more requests are accepted, the more revenue is obtained, and (2) all algorithms use the same LG-CT algorithm for solving the service placement problem (i.e., the same approaches to service placement and resource allocation). However, the amount of additional resources, and hence the additional cost, allocated by the compared algorithms differs, which implies that the amount of additional resources and the R/C ratio are two vital metrics for quantifying the efficiency of the evaluated algorithms.
Given that, the evaluated algorithms that lead to a higher acceptance ratio also incur a higher additional cost and thus a higher long-term average cost (depicted in Figures 2(d), 3(d), and 4(d)).
Based on the above analysis, UI-AAR performs considerably well in terms of acceptance ratio, average revenue, and cost while using fewer additional resources. Consequently, UI-AAR produces the highest R/C ratio among UI-AAR, SI-AAR, and SI-AAR-RUB (depicted in Figures 2(f), 3(f), and 4(f)). The reasons why UI-AAR can make more efficient use of additional resources are outlined as follows.
(1) UI-AAR selectively supplements additional resources based on the resource utilization ratio. For example, in Setting I, UI-AAR selects only CPU capacity to supplement in each iteration (see Figure 2(f), where the CPU multiplier is 5 and the bandwidth multiplier vector is (0, 0, 0, 0)); the total amount of additional resources is 4 times smaller than that of SI-AAR and even less than that of SI-AAR-Least.
(2) UI-AAR can find a better combination of the two multiplier parameters. For each value of the CPU multiplier, UI-AAR computes the optimal value of each element of the bandwidth multiplier vector. For example, in Setting III, for each value of the CPU multiplier, UI-AAR supplements only one sensitive link's bandwidth capacity per iteration, and thus the bandwidth capacities eventually added to the sensitive links differ even though the CPU multiplier is the same as in SI-AAR (see Figure 4(f), where the CPU multiplier is 4 and the bandwidth multiplier vector is (2, 4, 3, 3)). The average request acceptance ratios of UI-AAR and SI-AAR are nearly the same, but the total amount of additional resources of UI-AAR is 1.5 times smaller than that of SI-AAR. Adding more additional resources with no improvement in acceptance ratio implies that part of the additional resources added by SI-AAR makes no contribution to the acceptance ratio. Likewise, SI-AAR-RUB illustrates the overuse of resources.
Although the R/C ratio and the amount of additional resources of SI-AAR-Least are close to those of UI-AAR (depicted in Figures 2(b), 2(f), 3(b), 3(f), 4(b), and 4(f)), UI-AAR significantly outperforms SI-AAR-Least in terms of the average request acceptance ratio and the long-term average revenue, for the following reasons: (1) the additional resources added by SI-AAR-Least are insufficient to accept more requests, and (2) like SI-AAR, SI-AAR-Least does not allocate the additional resources efficiently; for example, compared with UI-AAR, SI-AAR-Least accepts fewer requests while using more additional resources in Setting I.
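The selective supplementing in point (1) can be sketched as a utilization-based decision rule. This is our reading of UI-AAR's selection step, not the paper's pseudocode: the threshold value and the return labels are illustrative, with 10% chosen to match the threshold reported in the parameter settings.

```python
def choose_supplement(u_cpu, u_bw, tau=0.10):
    """Utilization-based resource selection in the spirit of UI-AAR:
    supplement only the dominant resource; when the utilization
    difference lies within (-tau, tau), supplement both."""
    delta = u_cpu - u_bw
    if delta >= tau:
        return "cpu"   # CPU is the bottleneck resource (as in Setting I)
    if delta <= -tau:
        return "bw"    # bandwidth is the bottleneck resource (as in Setting II)
    return "both"      # both are stressed (as in Setting III, where delta = 3%)
```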

Conclusion and Future Work
The placement of services is an important problem in any network architecture that supports the implementation of data processing functions inside the network. In this paper, we modeled and stated this problem. To address the resource bottleneck problem in the LG-CT algorithm, we introduced a novel concept, sensitivity. We used sensitivity to locate the most sensitive node and then allocated additional resources to the most sensitive node and its sensitive links to improve the performance of the entire network. After discussing and formulating the problem of allocating additional resources, we proposed two sensitivity-based iterative algorithms for efficient resource allocation. The first, SI-AAR, provides a simple way to increase both the CPU capacity and the bandwidth capacity by the same multiple. The second, UI-AAR, supplements additional resources selectively based on the resource utilization ratio. Our experimental results showed the effectiveness and efficiency of the two proposed algorithms under three specific settings. The increase in average request acceptance ratio was significant when we supplemented additional resources at the most sensitive node and its sensitive links. The utilization-based iterative algorithm (UI-AAR) makes more efficient use of additional resources and thus outperforms SI-AAR in terms of the amount of allocated additional resources, the long-term average cost, and the long-term R/C ratio.
In future work, we will consider supplementing resources at multiple bottleneck nodes, focus on medium- and large-scale network topologies, and investigate how sensitivity can be further applied in the service placement problem to improve performance in terms of optimizing capacity allocation and balancing load.

Figure 2: Comparisons between UI-AAR and SI-AAR in Setting I.

Figure 3: Comparisons between UI-AAR and SI-AAR in Setting II.

Figure 4: Comparisons between UI-AAR and SI-AAR in Setting III.
From Figures 2(b), 3(b), and 4(b), as the lower bound and the reference upper bound, SI-AAR-Least and SI-AAR-RUB use the least and the most additional resources under SI-AAR, respectively. The additional resources used by UI-AAR are close to those used by SI-AAR-Least in Setting I and Setting II and about twice as large in Setting III. SI-AAR allocates more additional resources than UI-AAR, namely, 4, 3, and 1.5 times more in Setting I, Setting II, and Setting III, respectively. The reason is that UI-AAR adds only CPU capacity or only bandwidth capacity in Setting I and Setting II, whereas both SI-AAR and UI-AAR add CPU capacity and bandwidth capacity together in Setting III.

Table 1: Service processing time on service nodes.
The most sensitive node is represented together with a vector of its adjacent sensitive links, and a companion vector gives the bandwidth of each sensitive link. An integer multiplier indicates how many times the CPU capacity of the most sensitive node will be increased, and a vector of integer multipliers indicates, for each sensitive link, how many times its bandwidth capacity will be increased.
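These multiplier definitions can be made concrete with a small sketch. The names m and k are illustrative, and "increased by m times" is read here as scaling a capacity to (1 + m) times its original value, consistent with SI-AAR-RUB, where a multiplier of 9 yields a tenfold capacity.

```python
def apply_multipliers(node_cpu, sensitive_link_bw, m, k):
    """Scale the most sensitive node's CPU capacity by (1 + m) and the
    bandwidth of the i-th sensitive link by (1 + k[i])."""
    assert len(sensitive_link_bw) == len(k)
    new_cpu = node_cpu * (1 + m)
    new_bw = [bw * (1 + ki) for bw, ki in zip(sensitive_link_bw, k)]
    return new_cpu, new_bw
```

For example, the SI-AAR-RUB case with m = 9 and k = (9, 9) turns capacities (100, [50, 60]) into (1000, [500, 600]).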
4.3.2. Resources on Service Link. Similarly, the available capacity of a service link is defined as the total amount of bandwidth available on that link, that is, its bandwidth capacity minus the bandwidth already allocated to accepted requests.

Table 2: Differences in three settings.

Table 3: Resource utilization in three settings.