Transcoding Based Video Caching Systems: Model and Algorithm



Introduction
The explosive demand for online video from mobile users puts heavy bandwidth pressure on cellular networks. A common way to relieve this pressure is to deploy cache servers close to end users to help video diffusion [1,2]. Caching algorithms can be categorized into two types, online algorithms and offline algorithms, which typically operate at small and large time scales, respectively. Typical online algorithms include Least Recently Used (LRU) [3] and Least Frequently Used (LFU) [4], both of which make caching decisions based on dynamically arriving user requests. In contrast, offline caching algorithms decide for every video whether to cache it, based on historical data of user fetching and without considering users' real-time requests. In this paper, we focus on the design of offline caching algorithms. Unless otherwise specified, the term caching algorithm in this paper means offline caching algorithm.
Traditional offline caching algorithms (e.g., [5,6]) assume that all cached video files are independent and treat them separately. They compute each file's popularity by counting the number of times the file has been accessed in the past and cache the most popular files. However, different users may request different versions (with different resolutions) of a video, and the contents of different versions of a video are not independent but related. Thus, caching files in such cases may introduce a lot of redundancy, since one version of a video can be used to produce some other versions of the video by using certain video encoding techniques such as scalable video coding (SVC) [7] and transcoding [8].
SVC has been used to improve caching performance [9]. SVC encodes a video into different layers. With SVC, one version of a video contains one or more video layers, and different quality versions of a video share certain low video layers. Based on this observation, popularity can be computed from a per-video-layer perspective instead of a per-file perspective. With this understanding, [9] proposed a caching algorithm for video services layered by SVC, in which the caching decision is made at a per-layer level instead of a per-video level. However, SVC-based caching has the following disadvantages. First, it requires SVC-encoded video files stored on cache servers and SVC-decoding capability at mobile terminals; however, due to the high decoding complexity and excessive overhead [7], SVC is not widely deployed in online VoD. This largely limits the application of SVC-based caching algorithms in practice. Second, the number of quality versions supported by an SVC-encoded video exactly equals the number of layers in the video. This largely restricts the granularity of services that an SVC-encoded service platform can provide to users, who may desire various quality versions of a video.
Transcoding has also been used to improve the caching performance of online VoD systems. Unlike SVC encoding/decoding, transcoding is a mature technology for converting a video from a high quality to any lower quality [8]. The conversion can easily be done at cache servers without involving mobile clients. Compared with SVC-based caching, transcoding based caching has two advantages. First, video quality transformation can be conducted at a much finer granularity to meet diverse user requirements. Second, transcoding is compatible with existing video formats, without the need to upgrade software at mobile terminals (client side) or at the video source (source side). In [10], Shen et al. proposed a transcoding based online caching algorithm, with which a caching system transcodes the cached videos according to real-time user requests and performs cache replacement with the LRU algorithm. However, this algorithm is just a simple combination of LRU and transcoding, which largely restricts its caching performance. Moreover, the use of LRU in [10] cannot ensure that each video has at most one quality version in a cache server, and thus much redundancy remains among cached files.
Recently, fog computing has pushed computing power to the edge of the network to reduce the distance between service providers and users. In this paper, we take advantage of fog computing and deploy caches at the network edge. Specifically, we focus on transcoding enabled caching. The objective is to enable cache servers to keep the most valuable video versions so as to minimize the video delivery delay for video requests from all users, subject to constraints on cache sizes and on the limited bandwidth of links between different base stations (for cooperative caching). We first formulate the transcoding based caching problem as an integer linear programming problem. We then propose a Transcoding based Caching Algorithm (TCA), which iteratively finds the placement leading to the maximal delay gain for video fetching among all possible choices by considering the request arrival rates of videos, the file sizes of different video versions, and the delivery delays between different cache servers. To reduce content redundancy, TCA restricts each cache server to keep at most one quality version of a video. We derive the computational complexity of TCA. Simulation results demonstrate that TCA can reduce the video transmission delay by up to 40% compared with the traditional offline greedy algorithm in [5].
The rest of the paper is organized as follows. Section 2 gives a brief review of related work. Section 3 describes the caching system model and formulates the optimal transcoding based independent caching and cooperative caching problems, respectively. In Section 4, we propose the TCA caching algorithm. In Section 5, we perform simulations to evaluate the performance of TCA. We conclude this paper in Section 6.

Related Work
Traditional offline caching algorithms usually treat individual video files separately and tend to keep the most popular video files in cache [5,6]. Reference [5] proposed a cooperative caching algorithm for a distributed cache system. Starting with an all-zero caching vector, this algorithm iteratively updates the caching vector by placing a file on a cache server such that the placement leads to the maximum performance improvement, which is computed based on the popularity of each video. Reference [6] proposed a caching algorithm which makes use of both user interests and video popularities in order to achieve a good balance between cache efficiency and user preference.
However, with traditional caching algorithms, several files kept at a cache server may belong to the same video at different qualities, which leads to a lot of redundancy, since one version of a video can be used to produce some other versions of the video by using certain video encoding/decoding techniques, such as SVC [7] and transcoding [8].
SVC encodes a video into different layers, and different quality versions of a video share certain low layers. Thus, for SVC-encoded videos, the popularity calculation changes from a per-video perspective to a per-layer perspective. Following this direction, an SVC-based caching algorithm was proposed in [9] to improve caching performance. However, due to the high decoding complexity and the additional overhead [7,11], SVC is not widely deployed in online VoD, in particular for resource-limited mobile terminals. Moreover, the number of quality versions into which an SVC-based video can be transformed is quite limited: it equals the number of its encoded layers. This largely restricts the service granularity that an SVC-based encoding system can provide.
Transcoding is a mature technology which can convert a high-quality video to a low-quality video in real time [12]. By applying transcoding at cache servers, video quality transformation can be conducted at a much finer granularity to meet diverse user requirements, and it is compatible with existing video formats without the need to upgrade software at mobile clients or video sources.
In this respect, [10] designed an online caching system which uses LRU for cache replacement and uses transcoding to convert cached video files. Reference [13] designed an augmented radio access network model to evaluate the performance of such transcoding based online caching systems and showed that the caching performance can be further improved. Reference [14] studied a multiple-parameter optimization model subject to cache storage and transcoding capacity constraints and proposed a cooperative LRU-based online video caching algorithm (online JCCP), which uses collaboration among the cache servers to further improve cache performance. However, in these studies, LRU is simply combined with transcoding, and the caching or replacing of a file is still based on LRU itself, which cannot ensure that each video has at most one quality version cached in an individual cache server. For example, when a high version needs to be cached, removing the low version of the same video from the cache (if any) would be a better choice. However, the use of LRU in [10,13,14] fails to incorporate such an operation. Thus, redundancy remains in caches.
Transcoding also incurs extra cost. Reference [15] pointed out this problem and proposed an online partial transcoding caching scheme for cloud computing networks, with the aim of minimizing the total extra cost caused by transcoding and storage. Reference [16] in 2018 proposed a cloud-based architecture that allocates transcoding tasks among virtual machines in a cache system to decrease the cost of the streaming service provider, and pointed out that transcoding based video caching is worth further study. Moreover, due to the significant improvement in the capability of mobile edge computing [17,18] and the appearance of new transcoding technologies [19,20], it is now feasible to conduct full transcoding in caching systems at the mobile edge.
In this paper, we focus on transcoding based offline caching algorithms. Our algorithm differs from the above work in the following ways. First, it restricts each cache server to keep at most one quality version of a video in order to remove unnecessary redundancy. Second, it decides which quality version of a video to cache based on all the requests for the different versions of that video.

System Model and Problem Formulation
In this section, we first give an overview of a transcoding based video caching system for providing streaming services at the mobile edge. Then, we illustrate the key idea of transcoding based caching. Finally, we formulate the optimal transcoding based caching problem.

System Model.
We first introduce the caching system under study. As Figure 1 shows, there are $N$ cache servers in the system, which are geographically distributed at the edge of the cellular network to provide online VoD services to mobile users covered by the base stations. We assume each cache server is attached to a base station via a short link with unlimited bandwidth, and for simplicity we assign the same sequence number to a cache server and its attached base station. Each cache server can transcode a cached video from a high-quality version to a low-quality version in real time. These cache servers are connected via the backhaul network to a remote video source server, which has all the quality versions of all the videos that users may request. In this paper, each user is assumed to be covered by one base station and can fetch video data from its local cache server, the video source server, or another cache server (in the case of cooperative caching).
We next introduce a typical application scenario, shown in Figure 1, to illustrate the operation of video fetching in such a system. As the figure shows, a user sends a request for a video quality version labeled (1, 2), i.e., quality version 2 of video 1, to its local cache server 1, which has only video quality version (1, 4). Cache server 1 can satisfy the request by transcoding (1, 4) to (1, 2), since the locally cached version is higher than the requested version. We assume the transmission rate between a mobile user and its local base station is high enough that the delivery delay for fetching a video from the local cache server is negligible, and we further assume transcoding can be performed in real time on a cache server, so that we do not consider the transcoding delay.
Furthermore, assume the $N = 3$ cache servers in the system work cooperatively. A second user in Figure 1, who requests (1, 1) but cannot get it directly from its local cache server 2, can fetch the video from server 1, which transcodes (1, 4) to (1, 1) and sends (1, 1) to this user via base station 2. Along the path from server 1 to this user, we assume the link between base station 1 and base station 2 allocates a certain bandwidth to each session for such video transmission and incurs a certain delivery delay. Finally, if a third user's request for (1, 5) cannot be satisfied by any cache server, the user needs to fetch it from the remote source server. Along the path from the remote video source server to this user, we assume the backhaul link allocates a certain bandwidth to each session of such video transmission and incurs a certain delivery delay.
Our objective here is to find a caching vector that places video files on different cache servers while minimizing the average delivery delay of video transmissions for all users. Such optimization is computationally challenging. For example, suppose there are two videos, each with five quality levels, and each of the three cache servers in Figure 1 has a limited space equal to the size of the largest (highest-quality) video version. Then each cache server needs to select among $6 \times 6$ version choices for the two videos, where each factor of 6 corresponds to the 5 quality versions of one video plus a zero quality meaning that the video is not stored. Moreover, if the three cache servers cooperate, the number of video version choices for them is $6^{2 \times 3}$. More generally, in a real cache system with $V$ videos, each having $Q$ quality levels, and $N$ cache servers, the number of video version choices for these cache servers is $(Q + 1)^{V \times N}$, a huge number, and thus such an optimization problem is usually too complex to be solved in polynomial time.
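To give a feel for how quickly this search space grows, the following minimal sketch counts the candidate caching vectors. The function name and argument names are illustrative choices, not taken from the paper, and the count ignores the cache capacity constraint, so it is an upper bound on the number of feasible placements.

```python
def num_placements(num_videos: int, num_qualities: int, num_servers: int) -> int:
    """Count candidate caching vectors: every (server, video) pair holds one
    of the quality versions or nothing (quality 0)."""
    return (num_qualities + 1) ** (num_videos * num_servers)

# Toy example from the text: 2 videos with 5 qualities each.
print(num_placements(2, 5, 1))  # one server: 6 * 6 = 36
print(num_placements(2, 5, 3))  # three cooperating servers: 6 ** 6 = 46656

# For a catalogue of 1000 videos, 5 qualities, and 3 servers, only the number
# of decimal digits of the count is printed.
print(len(str(num_placements(1000, 5, 3))))
```

Even the three-server toy case already has tens of thousands of candidates, which is why exhaustive search is not an option at realistic catalogue sizes.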

Wireless Communications and Mobile Computing
In the above system model, we assume transcoding can be performed in real time on a cache server without delay. In fact, the transcoding delay depends on the capacity of the transcoding server as well as the load of the transcoding processes. We leave this problem for future work.
Next, we define the variables and parameters used and formulate the problem in the following subsections; we then propose a heuristic algorithm to solve this problem in the next section.

Symbols and Parameters.
Before formulating the problem under study, we list the symbols and parameters used hereafter in Table 1.
We denote the set of cache servers by $\mathcal{N} = \{n \mid n = 1, 2, \ldots, N\}$, where $N$ is the total number of cache servers. We assume a cache server $n \in \mathcal{N}$ has a space of $C_n$ bytes for caching video files.
Assume there are in total $V$ videos in the system, each having $Q$ different versions that may be requested, and we use $\mathcal{V} = \{1, 2, \ldots, V\}$ and $\mathcal{Q} = \{1, 2, \ldots, Q\}$ to denote the set of all videos and the set of all video versions, respectively. Each quality version of a video, denoted by $(v, q)$, $v \in \mathcal{V}$, $q \in \mathcal{Q}$, has a given size of $s_{vq}$ bytes, and we assume $s_{v1} < s_{v2} < \cdots < s_{vQ}$, $\forall v$. The number of requests for each video version $(v, q)$ from the users under base station $n$ is denoted by $r_{nvq}$, which is assumed to be known in advance as in [5,9].
Delivering a file incurs a delivery delay. Specifically, delivering a video with a size of $s_{vq}$ from the remote video source server to a user in the cell of base station $n$ causes a delay of $s_{vq} d_n$, where $d_n$ is the delay for delivering one unit of video data from the source server to the user via base station $n$, due to the limited bandwidth assigned on the backhaul link for the transmission. Furthermore, when the cache servers work cooperatively, delivering a video with a size of $s_{vq}$ from a server $n'$ to a user associated with base station $n$ causes a delay of $s_{vq} d_{n'n}$, where $d_{n'n}$ is the delay for delivering one unit of video data from cache server $n'$ to the user via base station $n$, due to the limited bandwidth assigned on the link between base station $n'$ and base station $n$ for the transmission. We denote a caching vector by a set of integers $X = \{x_{nv} \mid n \in \mathcal{N}, v \in \mathcal{V}, x_{nv} \in \mathcal{Q} \cup \{0\}\}$, where each element $x_{nv}$ represents the quality version of video $v$ cached at server $n$; in particular, $x_{nv} = 0$ means server $n$ does not have any version of video $v$. We use $X_n = \{x_{nv} \mid v \in \mathcal{V}\}$ to represent the caching vector for server $n$.

Independent Caching with Transcoding.
We first consider the situation in which cache servers operate independently.

In this situation, a user can fetch a video file only from its local cache server (with high priority) or from the remote video source in the cloud (with low priority). Our goal is to find an optimal caching vector $X_n$ for each individual cache server $n$ ($n \in \mathcal{N}$) that minimizes the average delivery delay of all video fetches by users covered by base station $n$. Since the average delivery delay equals the total delivery delay of all video fetches divided by the total number of requests, and the total number of requests is assumed to be constant, minimizing the average delivery delay is equivalent to minimizing the total delivery delay (denoted by $D_n$). The total delivery delay $D_n$ can be computed as follows:

$$D_n = \sum_{v \in \mathcal{V}} \sum_{q \in \mathcal{Q}} \mathbf{1}_{x_{nv} < q} \, r_{nvq} \, s_{vq} \, d_n, \tag{1}$$

where $\mathbf{1}_{\text{condition}}$ is an indicator function which equals 1 when the condition is true and 0 when it is false. We explain (1) as follows. First, consider a given video quality version $(v, q)$ requested by users under base station $n$.
If the cached version of video $v$ at server $n$ can satisfy these requests, i.e., $x_{nv} \geq q$, we have $\mathbf{1}_{x_{nv} < q} = 0$, and these users can fetch $(v, q)$ from the local cache server directly without delay; otherwise, we have $\mathbf{1}_{x_{nv} < q} = 1$, and these users have to fetch it from the remote video source server with a delivery delay of $r_{nvq} s_{vq} d_n$, where $r_{nvq}$ is the total number of requests for file $(v, q)$ and $s_{vq} d_n$ is the video delivery delay for each such request. Considering all the versions of video $v$ that may be requested, we have the following total delivery delay (denoted by $D_{nv}$) for fetching all these versions:

$$D_{nv} = \sum_{q \in \mathcal{Q}} \mathbf{1}_{x_{nv} < q} \, r_{nvq} \, s_{vq} \, d_n. \tag{2}$$

Considering all the videos and all their quality versions, we obtain formula (1). Finally, we formulate the optimal transcoding based independent caching problem at a cache server $n$ as follows:

$$\begin{aligned} \min_{X_n} \quad & D_n \\ \text{s.t.} \quad & \sum_{v \in \mathcal{V}} \sum_{q \in \mathcal{Q}} \mathbf{1}_{x_{nv} = q} \, s_{vq} \leq C_n, \\ & x_{nv} \in \mathcal{Q} \cup \{0\}, \quad \forall v \in \mathcal{V}. \end{aligned} \tag{3}$$

Formulation (3) has two constraints. First, the total size of the files cached at server $n$ must not exceed the cache space $C_n$. Second, the value of $x_{nv}$ must be an integer, which makes this an integer programming problem. As mentioned in Section 3.1, the number of candidate solutions of (3) is $(Q + 1)^V$ for each cache server; such a problem is too complex to be solved in polynomial time for large $V$.
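The total delay of formula (1) and the capacity constraint of formulation (3), as described in the text, can be sketched in a few lines. This is only an illustrative sketch: the function names, the dictionary-based data layout, and the toy numbers are assumptions made here, not part of the original formulation.

```python
from typing import Dict, Tuple

Version = Tuple[int, int]  # (video v, quality q)

def independent_delay(cached: Dict[int, int],
                      requests: Dict[Version, float],
                      sizes: Dict[Version, float],
                      unit_delay: float) -> float:
    """Total delay at one server: a request for (v, q) is a miss only if the
    cached quality of video v is lower than q (0 means nothing is cached)."""
    total = 0.0
    for (v, q), count in requests.items():
        if cached.get(v, 0) < q:  # indicator of a miss
            total += count * sizes[(v, q)] * unit_delay
    return total

def fits(cached: Dict[int, int], sizes: Dict[Version, float], capacity: float) -> bool:
    """Capacity constraint: the cached versions must fit in the cache."""
    return sum(sizes[(v, q)] for v, q in cached.items() if q > 0) <= capacity

# Toy data (hypothetical): one video with three versions.
sizes = {(1, 1): 1.0, (1, 2): 2.0, (1, 3): 4.0}
requests = {(1, 1): 10, (1, 2): 5, (1, 3): 2}

print(independent_delay({}, requests, sizes, 0.5))      # nothing cached -> 14.0
print(independent_delay({1: 2}, requests, sizes, 0.5))  # version 2 cached -> 4.0
print(fits({1: 2}, sizes, capacity=2.0))                # -> True
```

Caching version 2 serves the requests for versions 1 and 2 locally (by transcoding), so only the requests for version 3 still pay the backhaul delay.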

Cooperative Caching with Transcoding.
We then consider the situation where the cache servers cooperate with each other to provide high-quality online VoD services. In this situation, among all the servers, including the local cache servers and the remote video source server, a user fetches a video file from the server that can satisfy the request with the minimal delivery delay. Our goal is to find a caching vector $X$ for the local caching system that minimizes the average delivery delay of all users. Again, this objective is equivalent to minimizing the total delivery delay (denoted by $D$) of all video fetching requests.
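The total delivery delay $D$ can be computed as follows:

$$D = \sum_{n \in \mathcal{N}} \sum_{v \in \mathcal{V}} \sum_{q \in \mathcal{Q}} r_{nvq} \, s_{vq} \left[ \prod_{n' \in \mathcal{N}} \mathbf{1}_{x_{n'v} < q} \, d_n + \Big( 1 - \prod_{n' \in \mathcal{N}} \mathbf{1}_{x_{n'v} < q} \Big) \min_{n' \in \mathcal{N}_{vq}} d_{n'n} \right], \tag{4}$$

where $\mathcal{N}_{vq} = \{n' \mid n' \in \mathcal{N}, x_{n'v} \geq q\}$ is the set of cache servers which can satisfy the requests for a given video version $(v, q)$.
We explain (4) as follows. Let us first compute the delivery delay for a given combination of $n$, $v$, and $q$, i.e., for transmitting video $(v, q)$ to satisfy all user requests under base station $n$. On the one hand, if none of the cache servers can satisfy the user requests, we have $\mathbf{1}_{x_{n'v} < q} = 1$, $\forall n' \in \mathcal{N}$, and thus $\prod_{n' \in \mathcal{N}} \mathbf{1}_{x_{n'v} < q} = 1$. The users then have to fetch $(v, q)$ from the remote video source server, which leads to a delivery delay of $r_{nvq} s_{vq} d_n$. On the other hand, if at least one cache server $n'$ can satisfy the user requests, we have $\mathbf{1}_{x_{n'v} < q} = 0$ for each such server $n'$ and thus $\big(1 - \prod_{n' \in \mathcal{N}} \mathbf{1}_{x_{n'v} < q}\big) = 1$. Then, among all such cache servers $\mathcal{N}_{vq}$, we choose the cache server with the minimal delay to base station $n$ to satisfy the requests for video $(v, q)$, and the total delivery delay is $r_{nvq} s_{vq} \min_{n' \in \mathcal{N}_{vq}} d_{n'n}$. Here, we assume the delay between a pair of cache servers is smaller than that between the remote source server and a local cache server. Considering all the versions of video $v$ that may be requested, we have the following total delivery delay (denoted by $D_v$) for fetching all these versions:

$$D_v = \sum_{n \in \mathcal{N}} \sum_{q \in \mathcal{Q}} r_{nvq} \, s_{vq} \left[ \prod_{n' \in \mathcal{N}} \mathbf{1}_{x_{n'v} < q} \, d_n + \Big( 1 - \prod_{n' \in \mathcal{N}} \mathbf{1}_{x_{n'v} < q} \Big) \min_{n' \in \mathcal{N}_{vq}} d_{n'n} \right]. \tag{5}$$

Considering all the videos and all their quality versions, we obtain formula (4). Finally, we formulate the optimal transcoding based cooperative caching problem as follows:

$$\begin{aligned} \min_{X} \quad & D \\ \text{s.t.} \quad & \sum_{v \in \mathcal{V}} \sum_{q \in \mathcal{Q}} \mathbf{1}_{x_{nv} = q} \, s_{vq} \leq C_n, \quad \forall n \in \mathcal{N}, \\ & x_{nv} \in \mathcal{Q} \cup \{0\}, \quad \forall n \in \mathcal{N}, \ v \in \mathcal{V}. \end{aligned} \tag{6}$$

The two constraints in (6) are similar to those in (3). As mentioned in Section 3.1, the number of candidate solutions of (6) is $(Q + 1)^{V \times N}$; such a problem is usually too complex to be solved in polynomial time.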
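The server-selection rule described above (serve from the capable cache server with the smallest delay, and fall back to the remote source only if no cache server holds a sufficient version) can be sketched as follows. The function name, the data layout, and the numbers are illustrative assumptions; in particular, the sketch assumes a unit delay of zero from a server to its own base station, in line with the negligible local delivery delay assumed in the system model.

```python
from typing import Dict, Tuple

Version = Tuple[int, int]  # (video v, quality q)

def cooperative_delay(cached: Dict[int, Dict[int, int]],
                      requests: Dict[int, Dict[Version, float]],
                      sizes: Dict[Version, float],
                      source_delay: Dict[int, float],
                      coop_delay: Dict[Tuple[int, int], float]) -> float:
    """Total delay when every request is served by the cheapest capable server."""
    total = 0.0
    for n, per_station in requests.items():
        for (v, q), count in per_station.items():
            # Servers whose cached version of v can be transcoded down to q.
            able = [m for m in cached if cached[m].get(v, 0) >= q]
            if able:
                unit = min(coop_delay[(m, n)] for m in able)
            else:
                unit = source_delay[n]  # fall back to the remote source
            total += count * sizes[(v, q)] * unit
    return total

# Scenario in the spirit of Figure 1 (all delay and size values are made up).
servers = [1, 2, 3]
cached = {1: {1: 4}, 2: {}, 3: {}}  # only server 1 holds (1, 4)
requests = {1: {(1, 2): 1}, 2: {(1, 1): 1}, 3: {(1, 5): 1}}
sizes = {(1, 1): 1.0, (1, 2): 2.0, (1, 5): 5.0}
source_delay = {n: 4.0 for n in servers}
coop_delay = {(m, n): (0.0 if m == n else 1.0) for m in servers for n in servers}

# Local hit (0) + remote cache hit (1.0) + source fetch (20.0)
print(cooperative_delay(cached, requests, sizes, source_delay, coop_delay))  # 21.0
```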

Transcoding Based Cache Algorithm
In this section, we design a transcoding based caching algorithm (TCA).

Algorithm Overview.
TCA generates a caching vector $X$ in a greedy manner: it iteratively assigns to an available cache server the video quality version that leads to the maximal performance gain among all choices, until no more files can be placed.
TCA works as follows. First, it initializes an all-zero caching vector $X$ with $x_{nv} = 0$, $\forall n, v$. Then, it iteratively updates $X$ by placing a video version on a cache server, one video version at a time. Given a set of video versions to be considered for caching at a set of cache servers, there typically exist multiple placement choices in each iteration. For each possible placement $(n, v, q)$ (i.e., placing video $(v, q)$ on server $n$), we compute the delivery delay for video $v$ after the placement (called the after-placement delay) by using (2) or (5), depending on whether independent caching or cooperative caching is used, under the current $X$. We also have the delivery delay for video $v$ before the placement (called the base delay). The reduction in delay (i.e., the base delay minus the after-placement delay) is called the delay gain associated with the placement. Among all possible placement choices, TCA chooses the one leading to the maximal delay gain and updates the caching vector accordingly. This process is repeated until no more files can be cached at any of the cache servers or no more video versions need to be considered for caching.

Algorithm Design.
The procedure of TCA is shown in Algorithm 1, which is described partially based on the code framework in [5].
First, a series of variables are initialized in lines 1-4, including (i) $P = \{p_{nvq} \mid n \in \mathcal{N}, v \in \mathcal{V}, q \in \mathcal{Q}\}$, initialized as a set containing all possible placement choices, where each element $p_{nvq} \in P$ represents a possible placement, i.e., it is still applicable to place $(v, q)$ at server $n$, (ii) the possible placements $P_n \subseteq P$ for each cache server $n$, (iii) an all-zero caching vector $X$, and (iv) an all-zero caching vector $X_n$ for each cache server $n$.
Then, it iteratively updates $X$ (see line 5 to line 24 in Algorithm 1). In each iteration, line 6 finds the best placement $p_{n^* v^* q^*}$ (i.e., placing version $q^*$ of video $v^*$ at server $n^*$) among all possible placements in $P$, where the function $G_X(p)$ computes the delay gain of a possible placement $p$ based on the current caching vector $X$. Following that, if the maximal delay gain equals zero, TCA breaks the loop; otherwise, it continues with the following operations. Line 10 adjusts the cache size $C_{n^*}$ by subtracting the space to be occupied by the new video version $(v^*, q^*)$ and reclaiming the space occupied by the previously cached video version $(v^*, x_{n^* v^*})$ (if any). Lines 11 and 12 update $x_{n^* v^*} = q^*$ in $X$ and $X_{n^*}$, respectively, according to the new choice $p_{n^* v^* q^*}$; lines 13 and 14 remove the placements $p_{nvq}$, $\forall n = n^*$, $v = v^*$, $q \leq q^*$, from $P$ and $P_{n^*}$, since all the user requests for fetching $(v^*, q)$, $q \leq q^*$, from server $n^*$ can be satisfied by this new placement, and thus server $n^*$ no longer needs to cache any lower version of video $v^*$. Next, lines 15-20 remove from $P$ and $P_n$ those placements that can no longer be realized due to the limited remaining cache space. Following that, if $P$ becomes empty (see line 21), TCA breaks the loop. Finally, TCA outputs the final caching vector $X$.
The computational complexity of TCA can be derived as follows. The initialization in lines 1-4 takes $O(NVQ)$ time. In the "for" loop between line 5 and line 24, line 6 has the highest per-iteration complexity of $O(NVQ)$, and the loop executes at most $O(NVQ)$ iterations. Thus, the computational complexity of TCA is $O(N^2 V^2 Q^2)$.
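The greedy procedure described in this section can be illustrated with the following simplified sketch. It is not a transcription of Algorithm 1: for brevity it recomputes the total delay for every candidate placement instead of maintaining the placement sets and per-video delays, so it is slower than the procedure analyzed above, and all names, the data layout, and the toy numbers are assumptions made for illustration. As in the sketch conventions used here, a server's unit delay to its own base station is taken to be zero.

```python
from itertools import product

def total_delay(x, req, size, d_src, d_coop):
    """Total delivery delay of caching vector x, where x[(n, v)] is the cached
    quality of video v at server n (0 = none) and req[(n, v, q)] is a count."""
    servers = sorted(d_src)
    total = 0.0
    for (n, v, q), r in req.items():
        able = [m for m in servers if x.get((m, v), 0) >= q]
        unit = min(d_coop[(m, n)] for m in able) if able else d_src[n]
        total += r * size[(v, q)] * unit
    return total

def tca(servers, videos, qualities, cap, req, size, d_src, d_coop):
    """Greedy placement: repeatedly apply the feasible placement with the
    largest positive delay gain; keep at most one version per (server, video)."""
    x = {(n, v): 0 for n, v in product(servers, videos)}
    free = dict(cap)  # remaining space per server
    while True:
        base = total_delay(x, req, size, d_src, d_coop)
        best_gain, best = 0.0, None
        for n, v, q in product(servers, videos, qualities):
            old = x[(n, v)]
            if q <= old:  # a lower or equal version adds nothing
                continue
            reclaimed = size[(v, old)] if old else 0.0
            if size[(v, q)] > free[n] + reclaimed:
                continue  # does not fit even after dropping the old version
            x[(n, v)] = q  # try the placement, then undo it
            gain = base - total_delay(x, req, size, d_src, d_coop)
            x[(n, v)] = old
            if gain > best_gain:
                best_gain, best = gain, (n, v, q)
        if best is None:  # no feasible placement improves the delay
            return x
        n, v, q = best
        old = x[(n, v)]
        free[n] += (size[(v, old)] if old else 0.0) - size[(v, q)]
        x[(n, v)] = q

# Toy instance (made-up numbers): two servers, one video, two qualities.
servers, videos, qualities = [1, 2], [1], [1, 2]
size = {(1, 1): 1.0, (1, 2): 2.0}
cap = {1: 2.0, 2: 1.0}
req = {(n, 1, q): 5 for n in servers for q in qualities}
d_src = {1: 4.0, 2: 4.0}
d_coop = {(m, n): (0.0 if m == n else 1.0) for m in servers for n in servers}

result = tca(servers, videos, qualities, cap, req, size, d_src, d_coop)
print(result)  # {(1, 1): 2, (2, 1): 1}
```

In this instance the first iteration places the high version on server 1 (the largest gain), and the second iteration uses the remaining space on server 2 for the low version, which removes the inter-server delay for low-quality requests under base station 2.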

Performance Evaluation
In this section, we perform simulations to evaluate the performance of TCA. We compare three instances of two caching algorithms: two instances of TCA, which adopt independent caching (TCA-I) and cooperative caching (TCA-C), respectively, and the traditional greedy algorithm (Greedy) for cooperative caching [5].

Parameter Settings.
We set $N = 3$ cache servers for simulations, each having the same cache size. The default rate for transmitting videos between each of the servers and the remote source server is 2 Mbps, and the default rate among the cache servers for video delivery in cooperative caching mode (i.e., the cooperative transmit rate) is set to 5 Mbps. The default cache size at a cache server is 400 GBytes. We assume there are 1000 videos for caching, all of which have a one-hour playback duration. According to the YouTube recommended resolutions and bitrates, each video supports 5 quality levels of video resolution, namely 240p, 360p, 480p, 720p, and 1080p, which correspond to video bitrates of 400, 750, 1000, 2500, and 4500 kbps, respectively [21]. Thus, the total size of the 5000 video versions is close to 4 TB. Following typical settings used for empirical studies of VoD systems [9,22], the popularity of the 1000 videos is assumed to follow a Zipf distribution; i.e., the $k$th most popular video is requested at a rate proportional to $k^{-\alpha}$, where $\alpha$ is a shape parameter set to 0.8 unless otherwise specified. Furthermore, different versions of a video are assumed to be equally requested by users.
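The catalogue size and the popularity model above can be checked with a short calculation. The sketch below assumes decimal units (1 kbps = 1000 bit/s, 1 GB = 10^9 bytes) and normalizes the Zipf weights to probabilities; the constant and function names are illustrative.

```python
BITRATES_KBPS = {"240p": 400, "360p": 750, "480p": 1000, "720p": 2500, "1080p": 4500}
DURATION_S = 3600   # one-hour videos
NUM_VIDEOS = 1000
ALPHA = 0.8         # default Zipf shape parameter

def version_size_gb(bitrate_kbps: float, duration_s: int = DURATION_S) -> float:
    """Size of one video version in GB (decimal units assumed)."""
    return bitrate_kbps * 1000 * duration_s / 8 / 1e9

def zipf_popularity(num_videos: int, alpha: float) -> list:
    """Request probability of the k-th most popular video, proportional to k**(-alpha)."""
    weights = [k ** -alpha for k in range(1, num_videos + 1)]
    norm = sum(weights)
    return [w / norm for w in weights]

sizes_gb = {name: version_size_gb(rate) for name, rate in BITRATES_KBPS.items()}
total_tb = NUM_VIDEOS * sum(sizes_gb.values()) / 1000
popularity = zipf_popularity(NUM_VIDEOS, ALPHA)

print(sizes_gb)                # per-version sizes, e.g. about 2.0 GB for 1080p
print(round(total_tb, 4))      # total catalogue size in TB
print(sum(popularity[:100]))   # share of requests going to the top 100 videos
```

Under these unit assumptions the five versions of one video add up to roughly 4.1 GB, which matches the stated total of about 4 TB for 1000 videos, so a default 400 GB cache holds about a tenth of the catalogue.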

Simulation Results.
Impact of cooperative transmit rate on delivery delay. For this test, we conducted a series of simulations with the cooperative transmit rate varying from 2 to 11 Mbps at 1 Mbps granularity. The results are shown in Figure 2, where the x-axis is the cooperative transmit rate and the y-axis is the average video transmission delay. The three curves correspond to TCA-C, Greedy, and TCA-I, respectively. The former two algorithms adopt cooperative caching, and their delivery delay decreases as the cooperative transmit rate increases. The third algorithm uses independent local caching and thus has a constant performance that does not depend on the cooperative transmit rate. The results in this figure show that TCA-C has the best performance, since it makes full use of both transcoding and cooperative caching at different cache servers. The gains in terms of delivery delay are up to 40% and 38% compared with TCA-I and Greedy, respectively.
Impact of cache size on delivery delay. In this test, we conduct a series of simulations with the cache size varying from 50 GB to 800 GB at 100 GB granularity. Figure 3 shows the results. All three curves show a similar decreasing trend in average delivery delay as the cache space increases. Moreover, for each given cache size, TCA-C significantly outperforms TCA-I and Greedy, decreasing the average delivery delay by up to 50% and 53%, respectively. The results indicate that TCA-C can efficiently utilize the available cache space so as to improve the user experience of video fetching.
Impact of video popularity distribution on delivery delay. In this test, we conduct a series of simulations with the Zipf shape parameter $\alpha$ varying from 0.4 to 1.2 at 0.1 granularity. Figure 4 shows the results. Not surprisingly, for all the simulated algorithms the delivery delay decreases as the shape parameter $\alpha$ increases. This is because a larger $\alpha$ means that popularity is concentrated on fewer videos, so caching them can satisfy a large proportion of all users' requests. Figure 4 shows that TCA-C always has the best performance.

Conclusion
In this paper, we studied transcoding based caching for improving the performance of a distributed video caching system. We formulated the optimal transcoding based video caching problem under two different modes (i.e., independent caching and cooperative caching) as integer linear programming problems. We then proposed a transcoding based caching algorithm, TCA, which iteratively places on a cache server the video version that leads to the maximal delay gain among all possible choices. Simulation results demonstrate that TCA (when working in cooperative caching mode) can significantly reduce the delivery delay compared with the traditional greedy algorithm.

Figure 1: Example illustrating the application scenario with three cellular cells. In this figure, a pair of numbers in the form $(v, q)$ represents version $q$ of video $v$.

Figure 2: Impact of cooperative transmit rate.

Table 1: Symbol definition.
Symbol: Definition
$(v, q)$: Quality version $q$ of video $v$.
$r_{nvq}$: Number of requests for file $(v, q)$ from the users under base station $n$.
$d_n$: Unit data delivery delay from the remote source server to a user under base station $n$.
$d_{n'n}$: Unit data delivery delay from server $n'$ to a user under base station $n$.
$D_n$: Delivery delay of server $n$ for serving all its local requests in independent caching.
$D_{nv}$: Delivery delay of server $n$ for serving its local requests for all versions of video $v$ in independent caching.
$D_v$: Delivery delay for serving the requests from all users for all versions of video $v$ in cooperative caching.
$X$: Caching vector, where each element $x_{nv}$ equals the quality version of video $v$ to be cached at server $n$, while $x_{nv} = 0$ means no file related to video $v$ is cached at server $n$.