Weighted Cache Location Problem with Identical Servers

This paper extends the well-known p-CLP with one server to p-CLP with m ≥ 2 identical servers, denoted by (p,m)-CLP. We propose the closest server orienting protocol (CSOP), under which every client connects to its closest server via a shortest route on the given network. We abbreviate (p,m)-CLP under CSOP to (p,m)-CSOP CLP and show that (p,m)-CSOP CLP on a general network is equivalent to that on a forest and further to multiple CLPs on trees. The case of m = 2 is the focus of this paper. We first devise an improved O(ph² + n)-time parallel exact algorithm for p-CLP on a tree and then present a parallel exact algorithm with at most O((4/9)p²n²) time in the worst case for (p,2)-CSOP CLP on a general network. Furthermore, we extend the idea of the parallel algorithm to the cases of m > 2 to obtain a worst-case O((4/9)(n − m)²(m + p)ᵖ/(p − 1)!)-time exact algorithm. At the end of the paper, we first give an example to illustrate our algorithms and then conduct a series of numerical experiments to compare the running times of our algorithms.


Introduction
Caching has become an important tool for improving network performance, reducing delays to every client and alleviating the overload on the server [1–4]. Initially, a large number of studies considered how to optimize cache performance [5–7], cache hierarchies [5], and cooperation among multiple web servers [8, 9]. Subsequently, how to locate caches or proxies optimally in networks to alleviate the server load became more popular [2, 10–13]. The most popular practice in the past was to place caches on the edges of networks, acting as the network browser and proxy or as part of cache hierarchies [1, 3–5]. Later, Danzig et al. [2] discovered that placing caches on the nodes of networks instead of on the edges greatly reduces overall network congestion. In this paper, we only discuss how to place caches on the nodes of networks.
The focus of placing caches in networks is how to enhance the effect and efficiency of caching as much as possible. This problem can be modeled as the $p$-cache location problem (abbreviated to $p$-CLP or CLP) or the $p$-proxy problem. Both of their initial models can essentially be reduced to the $p$-median problem [14, 15]. Throughout this paper, we let $n$ denote the number of network nodes, let  denote the number of network edges, let $p$ denote the number of caches or proxies, and let $h$ denote the height of a tree. Later, Abrams et al. [7] observed that almost all current cache products contain a transparent operation mode, called a transparent en-route cache (TERC). When TERCs are used in networks with one server, all clients connect to the server and caches are placed on the routes from clients to the server. Heddaya and Mirdad [10] suggested making use of TERCs to balance load because TERCs are easy to manage. Further, Krishnan et al. [11] proposed the cache location problem involving TERCs, studied the problem in several special networks, and presented polynomial-time exact algorithms. In the rest of this paper, all CLPs involve TERCs.
The known algorithms for the $p$-proxy problem also apply to $p$-CLP. For a linear network, Li et al. studied the $p$-proxy problem and presented an $O(pn^2)$-time exact algorithm [12]. Later, Woeginger used the Monge property to obtain an improved algorithm with $O(pn)$ time complexity [16]. For a tree network, Li et al. devised an $O(p^2n^3)$-time exact dynamic programming algorithm [13] and Chen et al. presented an improved $O(pnh)$-time algorithm [17]. For a general tree of

Problem Description
Let $G = (V, E, w, c)$ represent a communication network or computer network, where $V$ is the node set and $E$ is the edge set. Every node represents a processing or switching element and every edge represents a communication link [22]. Every node $v \in V$ has a weight $w(v) > 0$ representing the demand amount of $v$, and every edge $e \in E$ has a weight $c(e) > 0$ representing the cost per demand. For any pair of nodes $u$ and $v$, we let $[u, v]$ denote the edge of $G$ between $u$ and $v$ and let $\pi(u, v)$ denote the shortest path in $G$ connecting $u$ and $v$. Let $c[u, v]$ denote the cost of edge $[u, v]$, and let $d(u, v)$ denote the cost of $\pi(u, v)$, which is equal to the sum of the costs of all the edges of $\pi(u, v)$. So, $d(u, v) = \sum_{e \in \pi(u, v)} c(e)$.
Let $s_1$ and $s_2$ be two identical servers, which are allocated to a pair of nodes of $G$ in advance. Let $\Gamma$ denote the set of $p$ cache locations. Suppose that CSOP works and each cache is a duplicate of the server. Given any set $\Gamma \subset V \setminus \{s_1, s_2\}$, the cost that node $v$ pays per unit of demand depends on the locations of $\Gamma \cup \{s_1, s_2\}$ and is denoted by $d(v, \Gamma \cup \{s_1, s_2\})$. Thus, the cost that $v$ pays for its overall demand is equal to $w(v) d(v, \Gamma \cup \{s_1, s_2\})$. Let $f(\Gamma)$ denote the total cost of all the nodes paying for their overall demand, that is, $f(\Gamma) = \sum_{v \in V} w(v) d(v, \Gamma \cup \{s_1, s_2\})$.
The $p$-cache location problem with two identical servers under CSOP (i.e., $(p, 2)$-CSOP CLP) aims to find $p$ cache locations in $G$ to minimize the total cost of all the nodes paying for their overall demand. In other words, $(p, 2)$-CSOP CLP reduces to finding an optimal set $\Gamma^*$ from $V \setminus \{s_1, s_2\}$ that minimizes the value of $f(\Gamma)$, that is, $f(\Gamma^*) = \min_{\Gamma \subset V \setminus \{s_1, s_2\}} f(\Gamma)$.
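The objective described above can be evaluated directly for a given cache set. The following Python sketch is only an illustration under stated assumptions: the function names, the adjacency-list input format, and the use of a multi-source Dijkstra search are ours and not part of the paper, the network is assumed connected, and ties between equally close servers are broken arbitrarily. Each client follows a shortest route to its closest server and is served by the first cache it meets on that route, as a TERC would.

```python
import heapq

def closest_server_routes(adj, servers):
    """Multi-source Dijkstra (illustrative helper, not from the paper).

    adj maps node -> list of (neighbour, edge_cost).
    Returns (dist, next_hop): distance from each node to its closest
    server and the next node on a shortest route towards that server.
    """
    dist = {v: float("inf") for v in adj}
    next_hop = {v: None for v in adj}
    heap = []
    for s in servers:
        dist[s] = 0.0
        heapq.heappush(heap, (0.0, s))
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, cost in adj[u]:
            if d + cost < dist[v]:
                dist[v] = d + cost
                next_hop[v] = u  # v forwards its requests to u
                heapq.heappush(heap, (dist[v], v))
    return dist, next_hop

def total_cost(adj, weight, servers, caches):
    """Sum over all nodes of demand times the route cost to the first
    cache (or the server) met on the route to the closest server."""
    dist, next_hop = closest_server_routes(adj, servers)
    stop = set(servers) | set(caches)
    total = 0.0
    for v in adj:
        u = v
        while u not in stop:      # walk towards the closest server
            u = next_hop[u]
        # u lies on a shortest route, so the walked cost is a difference
        total += weight[v] * (dist[v] - dist[u])
    return total
```

Because the stopping node always lies on the shortest route to the server, the cost a client pays is the difference of two server distances, so no per-client path search is needed.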

A Parallel Exact Algorithm for (𝑝,2)-CSOP CLP
In the scenario of $(p, 2)$-CSOP CLP, every client knows the location of its closest server and connects to it via a shortest route. If its service request encounters a cache on the route, it gets the information from the first such cache. Otherwise, it gets the information from the server. Therefore, $(p, 2)$-CSOP CLP can be viewed as the combination of two CLPs when $s_1$ and $s_2$ are predesignated to two locations of the network: one is the CLP with $s_1$ as the server and the other is the CLP with $s_2$ as the server.
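A minimal sketch of how the two single-server problems might be combined is given below. It assumes that the optimal cost of each single-server CLP is already known for every number of caches; the function name and the list-based input are hypothetical and chosen only for illustration.

```python
def combine_two_servers(cost1, cost2, p):
    """Split p caches between the two single-server problems.

    cost1[k] (resp. cost2[k]) is assumed to hold the optimal CLP cost
    when k caches are placed in the part served by the first (resp.
    second) server. Returns (best_total, (p1, p2)), or None if no
    split of p caches is feasible.
    """
    best = None
    for p1 in range(p + 1):
        p2 = p - p1
        if p1 >= len(cost1) or p2 >= len(cost2):
            continue  # more caches than that part can hold
        total = cost1[p1] + cost2[p2]
        if best is None or total < best[0]:
            best = (total, (p1, p2))
    return best
```

Only $p + 1$ splits exist for two servers, so this combination step is cheap compared with solving the two CLPs themselves.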

Preprocessing.
In this subsection, we give a new method of transforming an arbitrary rooted tree into a binary tree. Let $T$ be a rooted tree. For any nonleaf node $v$ of $T$, the subgraph of $T$ composed of the edges between $v$ and all its children is called the star of $T$ with center $v$, denoted by $S(v)$. Let $\delta(v)$ denote the number of children of $v$. We process $v$ in the following way: We apply this procedure recursively to every node of $T = (V, E, w, c)$ top-down to obtain a binary tree $b(T) = (V_b, E_b, w_b, c_b)$. This idea can be described as algorithm BINY. Our method improves the one proposed by Chen et al. [18]. Moreover, we analyze the performance of BINY below, whereas they provided no analysis of their algorithm [18].
Theorem 3. For each node $v$ of $T$ with $\delta(v) = k \ge 3$, the subtree of $b(T)$ derived from transforming $S(v)$ by BINY has a height of $O(\log k)$ and has $5k/3 - 2$ dummy nodes added in the worst case.
Proof. The essence of BINY processing $S(v)$ is to bisect all the children of $v$ recursively. At the final step, two dummy nodes are added if four nodes are bisected into two groups of two nodes, and three dummy nodes are added if three nodes are bisected into one group of two nodes and one group of one node. So, BINY adds the most dummy nodes in the worst case of $k = 3 \cdot 2^q$. In this case, the subtree of $b(T)$ derived from transforming $S(v)$ by BINY has a height of $q + 2$ and the number of dummy nodes added is $2 + 2^2 + \cdots + 2^q + 3 \cdot 2^q = 5 \cdot 2^q - 2 = 5k/3 - 2$. In fact, we observe that the subtree of $b(T)$ derived from a star with $k$ nodes satisfying $2^{q+1} + 1 < k \le 2^{q+2}$ has a height of $q + 2$. So, $\log k \le q + 2 < \log k + 1$. Therefore, the subtree of $b(T)$ derived from $S(v)$ has a height of $O(\log k)$.
The number of all the dummy nodes added by BINY in the worst case is Step 0. Use DFS based algorithm in [24]

Theorem 5. CLP in a general tree 𝑇 is equivalent to CLP in 𝑏(𝑇).
Proof. This theorem is equivalent to the proposition that no cache is located at a dummy node in any optimal solution to CLP in $b(T)$. Suppose that a cache is located at a dummy node $z$ in an optimal solution $\Gamma$, where $z$ is added by transforming $S(v)$ by BINY. The cost of the new solution $\Gamma'$ obtained by replacing $z$ with $v$ in $\Gamma$ is less than the cost of $\Gamma$. This is a contradiction.
The binary tree obtained by applying Tamir's algorithm [15] to a general tree $T$ with $n$ nodes has a height of at most $n - 2$, while the one obtained by applying BINY to $T$ has a height of at most $2(n - 1)/3$. In terms of the height of the binary tree, BINY is superior to Tamir's algorithm. This helps to reduce the running time of algorithm SUB (Algorithm 1) shown in Section 3.3.
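The recursive bisection idea behind this preprocessing can be sketched as follows. This is a simplified variant written for illustration, not the BINY algorithm itself: it attaches a single leftover child directly instead of padding with extra dummy nodes, so its dummy-node counts can be smaller than the worst-case count in Theorem 3. All names are hypothetical. In a full implementation, dummy nodes would carry zero weight, the edge from a dummy node to its parent would have zero cost, and the edge from a group's dummy node down to a real child would keep the original edge cost.

```python
import itertools

def binarize(children, root):
    """Turn a rooted tree into a binary tree by recursive bisection.

    children maps node -> list of children (leaves may be omitted).
    Returns (bchildren, dummies); dummy nodes are ("dummy", i) tuples.
    """
    counter = itertools.count(1)
    bchildren = {}
    dummies = []

    def attach(parent, group):
        # Hang the nodes in `group` below `parent` with fan-out <= 2.
        if len(group) <= 2:
            bchildren[parent] = list(group)
            return
        mid = (len(group) + 1) // 2
        kids = []
        for half in (group[:mid], group[mid:]):
            if len(half) == 1:
                kids.append(half[0])       # single node: no dummy needed
            else:
                d = ("dummy", next(counter))
                dummies.append(d)
                attach(d, half)            # bisect this half again
                kids.append(d)
        bchildren[parent] = kids

    stack = [root]
    while stack:                           # process every star top-down
        v = stack.pop()
        kids = children.get(v, [])
        attach(v, kids)
        stack.extend(kids)
    return bchildren, dummies

def height(bchildren, root):
    """Number of edges on the longest root-to-leaf path."""
    kids = bchildren.get(root, [])
    return 0 if not kids else 1 + max(height(bchildren, c) for c in kids)
```

Halving each group keeps the depth added below a star center logarithmic in its number of children, which is the property the height bound in Theorem 3 relies on.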

Algorithm for CLP on Trees
By Theorem 5, we only need to discuss CLP on binary trees. Let $T$ be a binary tree and $s$ the server (root) node, and let $V(\cdot)$ denote the node set of $T$ or its subtree. For any node $v$ of $T$, we let $T_v$ denote the subtree of $T$ rooted at $v$. Let $v_l$ and $v_r$ be the left child and right child of $v$, respectively, and let $T_{v_l}$ (resp. $T_{v_r}$) denote the subtree of $T_v$ rooted at $v_l$ (resp. $v_r$). We use $h$ to denote the height of $T$ and $1, 2, \ldots, h$ to label the levels of $T$ bottom-up, and we use $L(i)$ to denote the $i$th level of $T$.
Proof. For any node $v$ of $T$, we need to consider whether a cache is located at $v$ or not when discussing the subproblem of $p$-CLP on $T_v$. (2) If no cache is located at $v$, then the cost of $v$ paying for its overall demand is $w(v) \cdot d(v, u)$, where $u$ is the closest cache to $v$ on $\pi(v, s)$.
We can first use the depth-first search (DFS) based algorithm in [24] to traverse $T$, by which we can record the parent node of every node $v$ (and thus record $\pi(v, s)$ step by step) and compute the cost of the path connecting any node and its ancestor. Based on Theorem 6 and the above discussion, we devise a bottom-up dynamic programming algorithm, which can be described as a parallel algorithm SUB by using the techniques in [25].
Theorem 7. Given any binary tree $T$ with $n$ nodes and a height of $h$, SUB runs in $O(ph^2 + n)$ time for computing $p$-CLP on $T$.

Generalization
In this section, we discuss the $p$-cache location problem with $m$ identical servers under CSOP (abbreviated to $(p, m)$-CSOP CLP) on an undirected graph $G = (V, E, w, c)$. Let $S = \{s_1, \ldots, s_m\}$ be a collection of $m$ identical servers. Given any set $\Omega \subset V \setminus S$, we let $d(v, \Omega \cup S)$ denote the cost that node $v$ pays per unit of demand and let $f(\Omega)$ denote the total cost of all the nodes paying for their overall demand; that is, $f(\Omega) = \sum_{v \in V} w(v) d(v, \Omega \cup S)$. The aim of $(p, m)$-CSOP CLP is to find $p$ cache locations in $G$ to minimize the total cost of all the nodes paying for their overall demand. In essence, $(p, m)$-CSOP CLP aims to find an optimal set $\Omega^*$ that minimizes the value of $f(\Omega)$; that is, $f(\Omega^*) = \min_{\Omega \subset V \setminus S} f(\Omega)$.

An Illustrative Example.
In this subsection, we first give an example to illustrate our algorithm GLOB for computing $(p, 2)$-CSOP CLP. Since $(p, 2)$-CSOP CLP on a general network can be reduced to that on a corresponding tree network, we select a tree network as our example for ease of illustration, shown in Figure 1. The example tree $T$ has 25 client nodes labelled $v_1, v_2, \ldots, v_{25}$ and two server nodes labelled $s_1$ and $s_2$. The number $w_i$, $1 \le i \le 25$, on client node $v_i$ represents the demand amount of $v_i$, and the number on every edge represents the cost that a node pays per unit of demand. For instance, the demand amount of $v_5$ is 0.76, and the total cost of $v_5$ paying for its overall demand is $1.55 \times 0.76 = 1.1780$.
First, we make preparations. The unique path $\Pi[s_1, s_2]$ is identified, and the trees $T(s_1)$ and $T(s_2)$ are shown in the left subfigure of Figure 2 and the left subfigure of Figure 3, respectively. The heights of $T(s_1)$ and $T(s_2)$ are both three. We apply BINY to transform $T(s_1)$ into a binary tree $b(T(s_1))$ shown in the right subfigure of Figure 2, where the eight dummy nodes added by BINY are labelled $z_1, z_2, \ldots, z_8$. Similarly, we obtain $b(T(s_2))$ shown in the right subfigure of Figure 3, where the nine dummy nodes are labelled $z_1, z_2, \ldots, z_9$. All the dummy nodes have a weight of zero and all the edges between a dummy node and its parent node have a weight of zero. The heights of $b(T(s_1))$ and $b(T(s_2))$ are both five.

Comparison of Running Times.
In this subsection, we conduct a large number of numerical experiments to compare the running times of our algorithms GLOB and EXTD. Since $(p, m)$-CSOP CLP with $m \ge 2$ on a general graph can be reduced to multiple CLPs on binary trees, we select a series of complete binary trees as examples for ease of comparison. All the binary trees are generated randomly and have almost the same number of nodes. We build a centralized parallel computer system (i.e., a star network with one central computer and five parallel computers) by connecting six identical PCs, each equipped with 2 GB RAM and an Intel Core i5 CPU and running the Windows 7 operating system. Our numerical experiments were carried out on this computer system.
For $(p, 2)$-CSOP CLP, we consider different inputs of $n$ and $p$: $n = 100, 102, 104, \ldots, 200$ and $2 \le p \le 20$. All the binary trees we select have an odd number of nodes. Given $n = 200$ and $p = 20$, there are 100 different combinations of $s_1$ and
demand provided by the server. Hence, the cost of node $v$ paying for its overall demand is equal to $\alpha\% \cdot d(v, \Omega(s_i) \cup \{s_i\}) + (1 - \alpha\%) \cdot d(v, s_i)$. Let $f(\Omega)$ denote the total cost of all the nodes paying for their overall demand. This version of $(p, m)$-CSOP CLP aims to find an optimal set $\Omega^*$ such that $f(\Omega)$ is minimized. The problem remains a topic for future research.

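Theorem 6.
Let $C[v][u][k]$ denote the minimum cost of the subproblem of $p$-CLP on $T_v$ when the closest cache to $v$ on $\pi(v, s)$ is located at $u$ and $k$ caches are placed in $T_v$. Similar to the idea of solving the $p$-proxy problem in [18], we propose our way of computing $C[v][u][k]$ and give a new proof, shown in Theorem 6. For each node $v$ of $T$ other than leaves, each node $u$ on $\pi(v, s)$, and each $0 \le k \le \min\{p, |V(T_v)|\}$, one has
$$C[v][u][k] = \min\Big\{ \min_{0 \le k_l \le k-1,\ k_r = k-1-k_l} \big\{ C[v_l][v][k_l] + C[v_r][v][k_r] \big\},\ \min_{0 \le k_l \le k,\ k_r = k-k_l} \big\{ C[v_l][u][k_l] + C[v_r][u][k_r] + w(v) \cdot d(v, u) \big\} \Big\}, \quad (6)$$
where the first term is omitted when $k = 0$.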

Lemma 9.
The number of $\vec{p}$-CASs in $(p, m)$-CSOP CLP is $\binom{m+p-1}{p}$.
Proof. The problem of allocating $p$ caches to $m$ distinct subtrees can be reduced to the model of putting $p$ identical balls into $m$ distinct boxes. We first draw $m + p - 1$ dots one by one in a line and then select $p$ dots to place balls and the other $m - 1$ dots to place baffles. The line is partitioned into $m$ sections (boxes) by these $m - 1$ baffles together with two immaterial baffles at the two ends of the line. There are $\binom{m+p-1}{m-1}$ ways in all to partition the line into $m$ boxes. Every way of partitioning the line produces a $\vec{p}$-CAS. Therefore, $(p, m)$-CSOP CLP contains $\binom{m+p-1}{m-1} = \binom{m+p-1}{p}$ $\vec{p}$-CASs.
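The balls-and-boxes count in the proof can be checked on small instances by brute-force enumeration. The sketch below is only an illustration; the function names are ours.

```python
from math import comb

def allocations(p, m):
    """Yield every way to split p identical caches among m distinct
    subtrees, as a tuple of m non-negative integers summing to p."""
    if m == 1:
        yield (p,)
        return
    for first in range(p + 1):
        for rest in allocations(p - first, m - 1):
            yield (first,) + rest

def count_allocations(p, m):
    """Closed-form count from the balls-and-boxes argument."""
    return comb(m + p - 1, p)
```

For instance, splitting 3 caches between 2 subtrees yields the four allocations (0, 3), (1, 2), (2, 1), and (3, 0), which agrees with the closed form.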
Once $s_1$ and $s_2$ are fixed at two predesignated locations of $G$, some nodes of $G$ are closer to $s_1$ and the other nodes are closer to $s_2$. Let $V(s_1)$ be the set of nodes that are closer to $s_1$ and $V(s_2)$ be the set of nodes that are closer to $s_2$. Thus, Lemma 1 follows immediately.
Specifically, we let $\Pi[s_1, s_2]$ denote the unique path between $s_1$ and $s_2$ when $G$ is a tree. Let $P(s_1)$ (resp. $P(s_2)$) be the set of nodes on $\Pi[s_1, s_2]$ which are closer to $s_1$ (resp. $s_2$) and, for any node $u$ on $\Pi[s_1, s_2]$, let $V_u(s_1, s_2)$ be the subset of nodes which reach $s_1$ and $s_2$ via $u$. We observe that every node in $V_u(s_1, s_2)$ with $u \in P(s_1)$ belongs to $V(s_1)$ and every node in $V_u(s_1, s_2)$ with $u \in P(s_2)$ belongs to $V(s_2)$, and vice versa. So, Lemma 2 follows. By Lemmas 1 and 2, we can compute $V(s_1)$ and $V(s_2)$ by applying the depth-first search (DFS) procedure to the tree, which takes only linear time.
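A linear-time partition of a tree's nodes by closest server can be sketched with two depth-first traversals, one from each server. This is an illustrative variant rather than the paper's exact procedure; the function names are ours, and breaking ties in favour of the first server is an assumption.

```python
def tree_distances(adj, source):
    """Iterative DFS on a tree: cost of the unique path from `source`
    to every node. adj maps node -> list of (neighbour, edge_cost)."""
    dist = {source: 0.0}
    stack = [source]
    while stack:
        u = stack.pop()
        for v, cost in adj[u]:
            if v not in dist:          # unvisited neighbour
                dist[v] = dist[u] + cost
                stack.append(v)
    return dist

def partition_by_closest_server(adj, s1, s2):
    """Return (V1, V2): nodes closer to s1 and nodes closer to s2.
    Ties are assigned to s1 (an assumption made for this sketch)."""
    d1 = tree_distances(adj, s1)
    d2 = tree_distances(adj, s2)
    part1 = {v for v in adj if d1[v] <= d2[v]}
    part2 = set(adj) - part1
    return part1, part2
```

Each traversal visits every edge a constant number of times, so the whole partition takes time linear in the size of the tree.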
to traverse $T$, and record the parent node of every node $v$ of $T$;
Step 1. for $i$ from 1 up to $h$ do
    Use $|V(L(i))|$ processors to work simultaneously, and all processors do the same work as follows for all $v \in V(L(i))$:
    if $v$ is a leaf node then
        Use the DFS based algorithm in [24] to compute $d(v, u)$ and initialize $C[v][u][1] \leftarrow 0$, $C[v][u][0] \leftarrow w(v) \cdot d(v, u)$ for any $u \in \pi(v, s)$;
Step 2. for $i$ from 1 up to $h$ do
    Use $|V(L(i))|$ processors to work simultaneously, and all processors do the same work as follows for all $v \in V(L(i))$:
    if $v$ is a nonleaf node then
        for each $k$ from 0 up to $p$ do
            if $0 \le k \le \min\{p, |V(T_v)|\}$ then
                Use the DFS based algorithm in [24] to compute $d(v, u)$ and then compute $C[v][u][k]$ by (6) for any $u \in \pi(v, s)$;

(1) If a cache is located at $v$, then $v$ pays nothing for its overall demand. So, the cost of the subproblem of $p$-CLP on $T_v$ is equal to the sum of that on $T_{v_l}$ and that on $T_{v_r}$. When the closest cache to $v$ on $\pi(v, s)$ is located at $v$, we observe that $v$ is the closest cache on $\pi(v, s)$ to $v_l$ and $v_r$. The number $k$ of caches placed in $T_v$ is at most $p$ and at most $|V(T_v)|$. The number $k_l$ of caches in $T_{v_l}$ plus the number $k_r$ of caches in $T_{v_r}$ is equal to $k - 1$. The key work is to find the optimal $(k_l, k_r)$-CAS with $k_l + k_r = k - 1$, from which the resulting cost of the subproblem of $p$-CLP on $T_v$ is equal to $\min_{0 \le k_l \le k-1,\ k_r = k-1-k_l} \{ C[v_l][v][k_l] + C[v_r][v][k_r] \}$.
(2) If no cache is located at $v$, then the cost of the subproblem of $p$-CLP on $T_v$ is equal to the sum of that on $T_{v_l}$ and that on $T_{v_r}$ and the cost of $v$. When the closest cache to $v$ on $\pi(v, s)$ is located at $u$, we see that $u$ is also the closest cache on $\pi(v, s)$ to $v_l$ and $v_r$. As discussed above, $k \le \min\{p, |V(T_v)|\}$. The number $k_l$ of caches in $T_{v_l}$ plus the number $k_r$ of caches in $T_{v_r}$ is equal to $k$. The key work is to find the optimal $(k_l, k_r)$-CAS with $k_l + k_r = k$, from which the resulting cost of the subproblem of $p$-CLP on $T_v$ is equal to $\min_{0 \le k_l \le k,\ k_r = k-k_l} \{ C[v_l][u][k_l] + C[v_r][u][k_r] + w(v) \cdot d(v, u) \}$.
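The two cases above translate into a short memoized recursion. The following Python sketch is a sequential illustration only, not the parallel algorithm SUB: the function names, the dictionary-based tree input, and the use of memoization are our assumptions. It places exactly $p$ caches and returns infinity when that is impossible.

```python
import math
from functools import lru_cache

def solve_clp(children, weight, edge_cost, server, p):
    """Minimum total cost of placing exactly p caches in a rooted tree.

    children maps node -> list of children, weight maps node -> demand,
    edge_cost maps (parent, child) -> cost; `server` is the root.
    """
    # Cost of the path from the server down to each node, so that the
    # cost between a node v and an ancestor u is down[v] - down[u].
    down = {server: 0.0}
    order = [server]
    for v in order:                       # breadth-first traversal
        for c in children.get(v, []):
            down[c] = down[v] + edge_cost[(v, c)]
            order.append(c)

    @lru_cache(maxsize=None)
    def split(kids, u, k):
        """Best cost of placing k caches in the subtrees of `kids`
        when the nearest cache (or server) above them is at u."""
        if not kids:
            return 0.0 if k == 0 else math.inf
        first, rest = kids[0], kids[1:]
        return min(cost(first, u, j) + split(rest, u, k - j)
                   for j in range(k + 1))

    @lru_cache(maxsize=None)
    def cost(v, u, k):
        """Subproblem on the subtree of v with k caches, nearest
        upstream cache (or server) at ancestor u."""
        kids = tuple(children.get(v, ()))
        # Case (2): no cache at v, so v pays for the route up to u.
        no_cache = split(kids, u, k) + weight[v] * (down[v] - down[u])
        # Case (1): cache at v, so v pays nothing and serves its subtree.
        with_cache = split(kids, v, k - 1) if k >= 1 else math.inf
        return min(no_cache, with_cache)

    return split(tuple(children.get(server, ())), server, p)
```

For a leaf, the recursion gives cost zero with one cache and the weighted route cost with none, which matches the initialization in Step 1.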
CLP on $G$ and runs in at most $O((4/9)p^2n^2)$ time in the worst case.