Window-Based Popularity Caching for IPTV On-Demand Services

ISRN Communications and Networking

In recent years, many telecommunication companies have regarded the IP network as a new delivery platform for TV services because its two-way, high-speed communication is well suited to both on-demand services and linear TV programs. In such an IPTV system, however, VOD (video-on-demand) requests tend to concentrate in short periods and user preferences fluctuate dynamically. Moreover, the VOD content is updated frequently by IPTV providers. An accurate popularity prediction method and an effective cache system are therefore vital because they directly affect IPTV performance. This paper proposes a new window-based popularity mechanism that automatically responds to fluctuations in user interest and promptly adjusts the popularity of each VOD. We applied our method to a commercial IPTV system, and the results show that the mechanism offers a significant improvement.


Introduction
In the past few years, there has been an undeniable global trend among telecommunication companies toward developing IPTV (Internet protocol television) networks, because they believe that IPTV is the next generation of the TV industry. Compared to traditional broadcast services (or linear programming TV), IPTV possesses unique competitive advantages. For example, it is better suited to personalized services since it transmits user requests to the control center immediately and identifies users directly. In addition, all data on IPTV are encoded as a series of IP packets and conveyed to the audience through the residential broadband access network. This characteristic removes the traditional restrictions on watching TV: any Internet-connectable digital device equipped with a multimedia player could be a "TV." For telecommunication companies, these characteristics create more opportunities for developing new services in the next-generation TV industry.
In Taiwan, the MOD (media-on-demand) service developed by Chunghwa Telecom Company (CHT) is one of the most representative IPTV applications. An important breakthrough is that MOD returns control to the audience, who can subscribe to different paid channels according to their viewing habits. MOD provides services over a managed IP (Internet protocol) network, and subscribers are constrained geographically. Through the IP network, the provider transmits television channels, movies, and other multimedia content to the MOD set-top box (STB), and the audience watches them on the TV. Within this network, service providers must guarantee the quality of service when they transmit a video stream to the end user, whether the stream is encoded in standard or high definition. Therefore, when an end user requests a service, a unicast connection is established from the central video server to the requesting user. If the number of user requests increases dramatically, it might generate overwhelming traffic in the network. To prevent this situation, MOD adopts a proxy-based architecture with a redirection mechanism as its network structure. As a result, all client requests are first directed to the nearest proxy server in each local area (we use the terms "local servers" and "cache" interchangeably). If the video is not available at the local server, the request is redirected to the central servers (which we also call "global servers"). However, server storage capacity is limited and local servers are expensive, so which VOD is worth storing in the cache should be decided carefully. Placing popular content in the cache is the most economical solution because it maximizes the number of demands that can be served from the cache and reduces the traffic from the central video server to requesting users.
In reality, which content is popular cannot be known until user requests are generated. Moreover, the popularity of VOD content may evolve over time according to diverse audience preferences. Furthermore, the status of a real online system changes dynamically, and the cache decision should be adjusted promptly according to the context. All of these requirements pose new challenges for designing an IPTV caching algorithm.
In this paper, we propose the window-based popularity mechanism (WBPM), which aims to react automatically to rapid changes in user demand and to make cache decisions effectively. In addition, we ensure that the cache decision is accurate so that the local servers store only necessary content. Moreover, WBPM incurs no additional maintenance cost and is simple to implement on an IPTV system. We build a window-based popularity caching system (WBPCS) on the basis of WBPM. WBPCS replenishes popular VOD promptly, and we expect that most demands can be served by local servers.
The rest of the paper is organized as follows. Section 2 introduces related studies on VOD delivery systems, and Section 3 clarifies the concept of window-based popularity. Our proposed popularity computation models and content allocation methods are presented in Section 4. In Section 5, we evaluate our mechanism on MOD and measure its effectiveness. Finally, we summarize our experimental results and offer conclusions in Section 6.

Related Studies in the VOD Delivery System
2.1. IPTV Architecture. In an IPTV network, VOD and other video services consume enormous bandwidth when many users request VOD services simultaneously. Thus, deploying the VOD system effectively is crucial for IPTV services. According to Thouin and Coates [1], two topologies are commonly applied to VOD deployments: centralized and proxy-based. A centralized deployment has a central server to which all user requests are directed. A proxy-based architecture consists of a central server and several proxy servers. The former contains all VOD files, while the latter store only a subset of popular videos. All client requests are first directed to the nearest proxy server. If the video is not available at the proxy server, the request is redirected to the central server. In addition, Chen et al. [2] suggest that P2P (peer-to-peer) may be another helpful choice for providing IPTV services because of its well-known file-transferring capabilities. In general, most IPTV network architectures managed by telecommunication companies still adopt proxy-based deployment, since it performs well in several respects, such as minimizing end-user latency, reducing bandwidth consumption, and decreasing the load on the central server; however, all of these benefits rely on a well-designed caching strategy.
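The redirection rule of the proxy-based deployment can be sketched as follows. This is a minimal illustration, not the MOD implementation: the `ProxyServer` and `CentralServer` classes and the `serve` function are hypothetical names introduced here, and video availability is reduced to set membership.

```python
from dataclasses import dataclass, field


@dataclass
class ProxyServer:
    """Hypothetical local (proxy) server holding a subset of popular videos."""
    cached: set[str] = field(default_factory=set)


@dataclass
class CentralServer:
    """Hypothetical central server holding every VOD file."""
    catalog: set[str] = field(default_factory=set)


def serve(video_id: str, proxy: ProxyServer, central: CentralServer) -> str:
    """Return which tier serves the request: 'local' or 'global'."""
    if video_id in proxy.cached:
        return "local"   # hit at the nearest proxy server
    if video_id in central.catalog:
        return "global"  # miss: the request is redirected to the central server
    raise KeyError(f"unknown video: {video_id}")
```

A request for a cached title is answered locally, while any other title in the catalog falls through to the central server, which is the traffic a caching strategy tries to minimize.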

Caching Strategy.
Caching strategies are effective mechanisms for shifting the massive load from the global server to local servers. They take advantage of storage capacity to absorb traffic by placing replicas of the most popular videos at local servers instead of storing them only in a central location. When designing a caching strategy, popular content prefetching and content allocation are two vital criteria.

Content Prefetching.
Content prefetching relates to selecting popular items and storing them in the cache before large numbers of requests arise. Popularity is defined as the likelihood that content will be fetched in the near future. Previous studies modeled the popularity distribution on the assumption of Zipf's law or a Zipf-like law. Indeed, a Zipf-like law describes the Web object access distribution well. For example, Breslau et al. [3] examined six traces from academic, corporate, and ISP environments, and this evidence supported that the Web access pattern follows a Zipf-like distribution. To ensure that the measured popularity is reliable, Shi et al. [4] suggested the most appropriate time interval over which to collect Web access information. In an IPTV system, the characteristics of VOD differ somewhat from those of Web objects, for example in average file size, object life cycle, and user behavior. Qiu et al. [5] gathered real data and performed an in-depth analysis of several aspects of IPTV user behavior, including the durations of on, off, and channel sessions, the time-varying rates of switching-on, switching-off, and channel-switching events, and channel popularity. In our work, we study the VOD request pattern and form the popularity distribution from real data collected from MOD. Our data indicate that VOD content in an IPTV system also follows Zipf's law when viewed statically and that the 80/20 rule is reasonable when deciding the caching quantity.
In reality, the content popularity distribution is hard to know in advance, and previous works have asserted that the popularity distribution evolves and is volatile [6,7]. De Vleeschauwer and Laevens [7] proposed a generic model that captures the popularity evolution trend and applied the exponential smoothing (ES) method to track the momentary popularity. Verhoeyen et al. [6] proposed the Alcatel-Lucent 5910 Video Server caching architecture (A-Lu 5910), which also uses the ES method to track the popularity of each VOD. A-Lu 5910 aims to allocate content to the cache dynamically and to adjust popularity automatically. For a specific VOD $i$, ES takes a weighted average of $P_{i,\mathrm{new}}$, the popularity of $i$ in the newest period, and $P_{i,\mathrm{old}}$, the past popularity of $i$ measured from the launch of item $i$ until the end of the last period. The weight $r$ ($0 < r < 1$) is a parameter that determines how much history is taken into account when tracking popularity distributions. However, A-Lu 5910 is not well suited to handling explosive rises and falls in popularity.
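As a rough illustration of ES tracking, the sketch below applies one smoothing step per period. It assumes that the weight r multiplies the historical term (the old popularity) and 1 - r multiplies the newest measurement, which is only one reading of the description above; the function names `es_update` and `track` are hypothetical.

```python
def es_update(p_old: float, p_new: float, r: float) -> float:
    """One exponential-smoothing step.

    Assumption: r weights the history term, so a larger r retains more
    history. The exact placement of r is not shown in the text.
    """
    if not 0.0 < r < 1.0:
        raise ValueError("r must satisfy 0 < r < 1")
    return r * p_old + (1.0 - r) * p_new


def track(observations: list[float], r: float) -> list[float]:
    """Smooth a sequence of per-period popularity measurements."""
    if not observations:
        return []
    smoothed = [observations[0]]  # the first period has no history yet
    for p_new in observations[1:]:
        smoothed.append(es_update(smoothed[-1], p_new, r))
    return smoothed
```

Under this assumption with r = 0.8, an item whose measured popularity jumps from 0 to 1 is tracked at only about 0.2 after one period, which illustrates why a smoothed estimate can lag behind explosive changes.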
In the MOD environment, demands for VOD usually amass in a short period, and user interests change dynamically because the VOD content is updated frequently under the management of CHT. Tracking the popularity of every VOD is not practical for an online IPTV system since maintaining the surplus popularity information incurs additional operating cost. Hence, an essential issue faced by MOD is how to adjust the popularity degree promptly and modify the caching decision flexibly.

Content Allocation.
Content allocation covers problems such as data placement, data replacement, and equipment allocation. The data placement problem on multiple disks was proved by Dowdy and Foster [8] to be NP-complete. Heuristic algorithms such as greedy [9] and sort partition (SP) [10] have been shown to be workable solutions. Data replacement algorithms [11] have been developed to improve the hit ratio in settings ranging from virtual memory paging to Web caching. In an IPTV network, the storage space of local servers is limited, so data replacement is needed. Moreover, most content allocation optimization solutions for VOD environments assume that the content popularity distribution is predetermined and aim to minimize other costs. For example, Thouin et al. [12] addressed the equipment allocation problem by determining the number of VOD servers deployed at each network end site. Nimkar et al. [13] proposed a greedy algorithm with an INLP model for balancing proxy-server load that considers server bandwidth and video file size. For IPTV architectures, Sofman et al. [14] proposed a U-turn caching architecture in which each proxy server can obtain video not only from upper-level servers but also from neighboring servers at the same level. Under this U-turn caching structure, Borst et al. [15] provided a lightweight cooperative cache management algorithm aimed at maximizing the traffic volume served from the cache and reducing the bandwidth cost as much as possible. Sofman and Krogfoss [16]

An Overview of Window-Based Popularity Caching System
3.1. Concept of Window-Based Popularity. Caching is regarded as an effective method for alleviating global server load and network cost. One reason is that caching the most popular objects ensures that most requests can be served locally; the other is that the traffic volume from global servers can be reduced. Content popularity is defined as the probability that the content will be requested in the future. In practice, the content popularity distribution is not known a priori, so a prediction technique is needed. Most prediction methods measure content popularity by aggregating user requests for a given content over time. In a real online system, accurate and fast-responding prediction is vital because audience requests fluctuate rapidly. Moreover, since most VOD files are large and the numbers of user requests and media files are enormous, it is necessary to design a reliable and flexible popularity prediction that imposes no additional load.
To address these issues, we propose a window-based popularity mechanism that is not only accurate and robust but also easy to implement on a real system. We suppose that content popularity is more closely related to recent audience interests than to long-term user preferences. Moreover, most VOD programs have a short life cycle. Hence, we define a window as a time period from time $T_a$ to time $T_b$, where $T_a < T_b$, and the popularity of a VOD is gathered during the window. As the window moves on, we obtain the newest content popularity, derived from the most recent time period.
For the rest of the paper, video request sequence in a certain local area denotes the sequence of VOD accessed by local area users.
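A window over a request log can be sketched as below. The `Request` record and the `window_counts` function are hypothetical names, and time is assumed to be an integer index; the sketch only shows how moving the window changes which requests are counted.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Request(NamedTuple):
    """Hypothetical request record: time index, local area, video id."""
    t: int
    area: str
    video: str


def window_counts(requests: Iterable[Request], t_a: int, t_b: int) -> Counter:
    """Count requests per (area, video) whose time index lies in [t_a, t_b]."""
    if not t_a < t_b:
        raise ValueError("a window requires t_a < t_b")
    return Counter((r.area, r.video) for r in requests if t_a <= r.t <= t_b)
```

Calling `window_counts` again with a later pair of bounds discards requests that have left the window, so no per-item history needs to be carried from one window to the next.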

Proposed Framework of Window-Based Popularity Caching System.
Figure 1 shows the framework of the window-based popularity caching system (WBPCS), which is built on the window-based popularity mechanism (WBPM). WBPCS is an automatic process for distributing video files to central servers in the global area (global servers) and proxy servers in local areas (local servers). It is mainly designed for predicting the popularity of VOD content and caching popular VOD in local areas. WBPCS contains several processing modules: program manager (PM), user information collector (UIC), WBPM, and content cleaner (CC). At the beginning, the daily program schedules are fed into the PM module, which is responsible for maintaining the life cycle of programs. When a new program is launched, its media file is uploaded to the global servers. After the file is allocated to the global servers, set-top boxes in users' homes can access the video content from them. Considering the network cost and server load, it is uneconomical to satisfy all VOD requests from the global servers. Hence, WBPM prefetches popular content and caches it at local servers. First, UIC gathers the audience's watching records from all local servers, and a data analyzer periodically writes the retrieved information back to the database. The information WBPM needs is which media has been accessed by which set-top box from which area during a specific period. If a VOD is always obtained from global servers, no replica exists in the local servers; moreover, if its request frequency is high, the local area should hold a duplicate file in order to reduce the transmission load. This information is fed into WBPM, and popular content is determined by our window-based popularity algorithm. We propose two window-based popularity algorithms, which consider global popularity and local popularity, respectively. After deriving the content popularity, WBPM filters out unpopular items and makes the cache decision to allocate popular content to the areas that need it. Finally, CC periodically evicts unneeded content so that the space is recycled.
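One WBPCS cycle for a single local server might be planned as in the following sketch. The function name `plan_cache_update` is hypothetical, and the eviction rule (drop whatever is no longer in the popular set) is an assumption, since the text only states that CC evicts unneeded content periodically.

```python
def plan_cache_update(cached: set[str], popular: set[str]) -> tuple[set[str], set[str]]:
    """Return (to_fetch, to_evict) for one local server.

    Assumption: content that is no longer in the popular set is treated as
    unneeded; the eviction rule itself is not specified in the text.
    """
    to_fetch = popular - cached  # replicas WBPM would push to this area
    to_evict = cached - popular  # space the content cleaner would recycle
    return to_fetch, to_evict
```

Content that is both cached and still popular appears in neither set, so it stays in place without any transfer.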

Window-Based Popularity Mechanism
In the following, we illustrate several content request distributions in MOD and propose two window-based popularity mechanisms.

Distribution of MOD Content Requests.
Over the past 10 years, MOD has developed into a mature IPTV system with over 800,000 subscribers in Taiwan. More than 10,000 VOD programs and various services have been provided. The VOD request distribution and user behavior information serve as supporting evidence for developing the cache mechanism.
We retrieve about eight thousand VODs as a sample data set and illustrate the relationship between request frequency and ranking on a log-log scale in Figure 2. Rank 1 denotes the most frequently accessed media file. From the plot, the curve fits a straight line with minor bias. The straight line suggests that the access frequency of the video ranked $i$ is proportional to $1/i^{\alpha}$, where $\alpha = 0.765$. Hence, we find that the distribution of VOD data in MOD also conforms to Zipf's law. In Figure 3, we present the cumulative probability of access for the top $n\%$ of VOD data, and the result shows that the top 20% most popular VOD content satisfies 80% of demands. This indicates that a small portion of popular content can supply most demands. Furthermore, placing more VOD at the proxy server, such as more than 40%, does not raise the hit ratio significantly compared with allocating only 30%. Figure 4 describes users' VOD watching behavior over one week; the amplitude oscillates sharply during the weekend but smoothly on weekdays. Figure 5 shows the life cycle and access frequency of several VODs. It can be observed that most content has a short life cycle and that user requests may increase rapidly around a specific time. Hence, considering users' recent request patterns is helpful when deciding content popularity. In the following, we propose two methods for calculating content popularity as the basis for WBPCS.
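The cumulative coverage plotted in Figure 3 can be computed from per-item request counts along the following lines. The function name `coverage_of_top` is hypothetical, and any counts passed to it here are toy values, not the MOD data set.

```python
def coverage_of_top(counts: list[int], top_fraction: float) -> float:
    """Share of all requests that go to the top `top_fraction` of items."""
    if not counts or not 0.0 < top_fraction <= 1.0:
        raise ValueError("need non-empty counts and 0 < top_fraction <= 1")
    ranked = sorted(counts, reverse=True)          # rank 1 = most requested
    total = sum(ranked)
    if total == 0:
        raise ValueError("no requests recorded")
    k = max(1, int(len(ranked) * top_fraction))    # number of items kept
    return sum(ranked[:k]) / total
```

For a toy catalog of ten items in which the two most requested titles receive 80 of 100 requests, `coverage_of_top(counts, 0.2)` returns 0.8, the 80/20 pattern described above.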

Global Window-Based Popularity (GWBP).
In this section, we present the global window-based popularity computation model and describe the content placement mechanism. Each local area has a request sequence during a time window $T_{a,b}$; as the window moves forward, the popularity of video $V_j$ changes according to the newest audience demand pattern. We collect the popularity of video $V_j$ from each local area and determine which popular VOD should be duplicated and allocated to the proxy server.
Suppose that the request $r^{A_i}_{V_j,t_a}$ for video $V_j$ at $t_a$ must be redirected to the global area $A_G$. We assume that $M$ videos are provided by the servers at $A_G$ during $T_{a,b}$, and we collect the number of requests for each video file $V_j$ from $R^{A_i}_{T_{a,b}}$, where $i$ ranges from 1 to $N$ and $r^{A_i}_{V_j,t_a} = 1$. We define the popularity of $V_j$ at $A_i$ as

$$Gp^{A_i}_{V_j,T_{a,b}} = \frac{\sum_{y=a}^{b} r^{A_i}_{V_j,t_y}}{\sum_{i'=1}^{N} \sum_{j'=1}^{M} \sum_{y=a}^{b} r^{A_{i'}}_{V_{j'},t_y}},$$

that is, the number of requests for $V_j$ from $A_i$ during the window divided by the sum of all request counts in the global area. The $M$ popularity degrees form a set $GP^{A_G}_{T_{a,b}}$, expressed as $\{Gp^{A_i}_{V_j,T_{a,b}}, \ldots, Gp^{A_N}_{V_M,T_{a,b}}\}$ in descending order. Since our data, retrieved from the MOD system with real audience behavior records, conform to Zipf's model, it is reasonable to place a small portion of $GP^{A_G}_{T_{a,b}}$ in the local areas so that most future demands will be satisfied. We create a popular content set $GPCS^{A_G}_{T_{a,b}}$, which contains the top $K$ percent of the elements in $GP^{A_G}_{T_{a,b}}$, and allocate each item with popularity degree $Gp^{A_i}_{V_j,T_{a,b}}$ in $GPCS^{A_G}_{T_{a,b}}$ to the area $A_i$. As the window moves forward, the popularity of each item at each local area is adjusted dynamically with the fluctuation of user demands collected over a short period. Figure 6 shows an example of deriving GWBP for each VOD from area $A_i$, where $RT_{a,b}$ represents the sum of request times $\sum_{y=a}^{b} r^{A_i}_{V_j,t_y}$. Suppose that we collect content requests at the global servers from time $a$ to time $b$ and retrieve the top 20% most popular items from those gathered during $T_{a,b}$. In Figure 6, $V_3$ is allocated to area $A_3$ with the highest popularity degree, 0.32. As the window moves on, we collect popular items again in time window $T_{c,d}$, and $V_5$ is then placed at $A_3$.
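The GWBP computation and the top-K selection can be sketched as follows, assuming that the popularity of a video in an area is its redirected request count in the window divided by the total redirected request count over all areas. The function name `gwbp` and the input layout (one `(area, video)` pair per request served by the global servers within the window) are hypothetical.

```python
from collections import Counter


def gwbp(requests: list[tuple[str, str]], k_percent: float) -> dict[str, list[tuple[str, float]]]:
    """Map each area to the (video, popularity) pairs allocated to it.

    `requests` holds one (area, video) pair per request that fell inside the
    window and had to be served by the global servers (assumed layout).
    """
    if not requests:
        return {}
    counts = Counter(requests)
    total = len(requests)  # all request counts in the global area
    # Rank (area, video) pairs by popularity, highest first; ties by name.
    ranked = sorted(
        ((n / total, area, video) for (area, video), n in counts.items()),
        key=lambda item: (-item[0], item[1], item[2]),
    )
    # Keep the top K percent, but at least one item (a choice made here).
    keep = max(1, int(len(ranked) * k_percent / 100))
    allocation: dict[str, list[tuple[str, float]]] = {}
    for popularity, area, video in ranked[:keep]:
        allocation.setdefault(area, []).append((video, popularity))
    return allocation
```

Because every degree shares the same global denominator, the ranking is equivalent to ranking raw request counts across all areas, so an area with little traffic cannot push an item into the popular set by ratio alone.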

Local Window-Based Popularity (LWBP).
In this section, we compute the popularity of each VOD item in each local area individually. Considering that audience populations may differ across local areas, we suggest another view, which makes the popular content allocation decision according to popularity measured within the local area. Suppose that local area $A_i$ has a video request sequence $R^{A_i}_{T_{a,b}}$, where $1 \le i \le N$, and we select videos $V_j$ with $r^{A_i}_{V_j,t_y} = 1$, where $a \le y \le b$. The popularity of $V_j$ at $A_i$ is

$$Lp^{A_i}_{V_j,T_{a,b}} = \frac{\sum_{y=a}^{b} r^{A_i}_{V_j,t_y}}{|R^{A_i}_{T_{a,b}}|}.$$

Figure 7 shows an example of deriving LWBP for each VOD item from area $A_i$, where $RT_{a,b}$ represents the sum of requests $\sum_{y=a}^{b} r^{A_i}_{V_j,t_y}$ and $LTR_{a,b} = |R^{A_i}_{T_{a,b}}|$ is the total number of requests in local area $A_i$ during $T_{a,b}$. Suppose that we collect the popularity of every VOD at each local server from time $a$ to time $b$ and retrieve the top 20% most popular items among those that must be obtained from the global servers. In Figure 7, $V_4$ is allocated to area $A_1$ with the highest popularity degree, 0.35. As the window moves on, we collect popular items again in time window $T_{c,d}$, and $V_3$ is selected as popular data with popularity degree 0.12 and allocated to area $A_3$.
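A corresponding sketch for LWBP follows, assuming that the numerator counts only the requests that had to be served by the global servers and the denominator is the total number of requests in the same local area during the window. The function name `lwbp` and the `(area, video, from_global)` input layout are hypothetical.

```python
from collections import Counter


def lwbp(requests: list[tuple[str, str, bool]], k_percent: float) -> dict[str, list[tuple[str, float]]]:
    """Map each area to the (video, popularity) pairs allocated to it.

    Each request is (area, video, from_global); `from_global` is True when
    the request had to be served by the global servers (assumed layout).
    """
    area_totals = Counter(area for area, _, _ in requests)  # requests per area
    redirected = Counter((a, v) for a, v, from_global in requests if from_global)
    if not redirected:
        return {}
    # Popularity is normalized by the requesting area's own request total.
    ranked = sorted(
        ((n / area_totals[area], area, video) for (area, video), n in redirected.items()),
        key=lambda item: (-item[0], item[1], item[2]),
    )
    keep = max(1, int(len(ranked) * k_percent / 100))  # top K percent, at least one
    allocation: dict[str, list[tuple[str, float]]] = {}
    for popularity, area, video in ranked[:keep]:
        allocation.setdefault(area, []).append((video, popularity))
    return allocation
```

With a toy input in which one area issues only two requests, a single redirected request there already yields a degree of 0.5 and outranks a video requested three times out of ten elsewhere, which shows how a small local audience can inflate a local popularity degree.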

Experiments and Evaluations
In this section, we conducted experiments to evaluate the cache quality for our proposed methods and we compared them with A-Lu 5910 [6].We describe the experiment set-up in Section 5.1, and we demonstrate the experimental results in Sections 5.2 and 5.3.

Experiment Setup and Evaluation Metric.
In our experiment, we collected the data set from MOD, an IPTV system provided by CHT in Taiwan. The data set contains millions of user requests and thousands of VOD items over time. We extracted request logs from several days as sample data and rebuilt the content allocation environment. We are concerned with the hit ratio in the local area because, if the hit ratio is high and stable, we can infer that most of the audience can download VOD content from local servers. This also implies that the network cost between the audience and the global servers is reduced. We define the hit ratios in the local area and the global area as the local hit ratio and the global hit ratio, respectively, as given in (4).

5.1.1. Methods Compared in the Experiment. In WBPM, the popularity degree is obtained by aggregating user requests from the global area or from each local area individually. We compare the performance of the two window-based popularity methods with that of A-Lu 5910, proposed by Verhoeyen et al. [6]. The cache decision is made according to these popularities. The popularity methods are defined as follows.
GWBP. Calculate content popularity with the global window-based popularity mechanism.

LWBP. Calculate content popularity with the local window-based popularity mechanism.
A-Lu 5910. Calculate content popularity with the exponential smoothing popularity mechanism [6], but rank popularity locally.

Experimental Results.
In the experiments, we compared GWBP, LWBP, and A-Lu 5910 from different aspects in Sections 5.2.1-5.2.3. First, we analyzed the difference between GWBP and LWBP. Then, for A-Lu 5910, we examined the weight coefficient and the history request inspect period that bring its performance to the most satisfactory level. Finally, we evaluated GWBP, LWBP, and A-Lu 5910 together.

Comparison of the GWBP and the LWBP.
In this section, we evaluate the results of GWBP and LWBP. The only difference between GWBP and LWBP is the popularity computation method: in the former, the request count is divided by the sum of all request counts in the global area, while in the latter it is divided by the total number of requests aggregated in the local area. Figure 8 shows that GWBP performs better than LWBP most of the time, especially around the 49th hour. The gap arises because the IPTV providers refreshed a large amount of VOD content on that day, and GWBP adjusts much faster.
We infer that a VOD that is popular in a local area is not necessarily popular overall; it may simply be that only a small audience was watching programs in that area during that period. Hence, LWBP may produce misleading popularity estimates, especially in the VOD content update interval. Apart from the gap, LWBP performs only slightly worse than GWBP.

Examination of the Best Coefficient and History Requests Inspect Period of A-Lu 5910.
In this experiment, we evaluated the performance of A-Lu 5910 under different values of the coefficient γ, which determines how much the past popularity value influences the newest popularity measurement. Figure 9 implies that the choice of γ makes no significant difference, and this conclusion agrees with the experiment of Verhoeyen et al. [6]. Next, we sought a proper history request inspect period for A-Lu 5910; the outcome shows that the shorter the period over which we collected data for A-Lu 5910, the more precise its popularity prediction. The three gaps in Figure 10 emphasize the effect. In the subsequent experiment, we compare the performance of GWBP, LWBP, and A-Lu 5910, setting A-Lu 5910 with parameter γ = 0.2 and an inspect period of 6 hours.

Comparison with All Methods.
Considering the best inspect period of A-Lu 5910, we applied the same time length as the window size for GWBP and LWBP, and we allocated equal quantities of VOD to the local area in each time unit for our methods and A-Lu 5910. In Figure 11, GWBP keeps the highest hit ratio in most situations, especially around the gap, and there is only a minor difference between LWBP and A-Lu 5910. Hence, we can draw two conclusions. The first is that popularity measured from the global view is much more reliable than that gathered from the local view. The second is that, even though the results of LWBP and A-Lu 5910 show no significant difference, LWBP often performs better than A-Lu 5910 when the hit ratio of A-Lu 5910 drops to its low points. This implies that the recovery ability of the window-based popularity mechanism is superior to that of A-Lu 5910 and that our method can respond to changing demands flexibly and promptly.

Comparison with the Preadopt-to-Postadopt WBPCS on MOD System
5.3.1. MOD System before Adopting WBPCS. In Figure 12, the local hit ratio before applying WBPCS was unsteady and varied rapidly. The MOD system attained only a 60 percent hit ratio. Meanwhile, the hit ratio of the global servers was close to 40 percent, which meant that the network load between the audience and the global servers was heavy.

MOD System after Adopting WBPCS.
After WBPCS was applied to the MOD system, the hit ratio in local servers increased to 75 percent and the hit ratio in global servers decreased to 25 percent. The gap between them is larger than in the MOD system without WBPCS. The data presented in Figure 13 were collected when MOD had just adopted WBPCS. As time went by, this situation remained steady and the improvement became more significant. In Figure 14, we collected the data after MOD had used WBPCS for a long time. The hit ratio in the local area rises to nearly 80 percent, and the difference between it and the global hit ratio widens to about 60 percentage points. This encouraging result demonstrates that WBPCS makes a clear contribution.

Conclusions and Future Work
A new challenge in IPTV systems is how to design an effective and adjustable cache system, since on-demand objects in IPTV differ from on-demand objects on the Web in some respects. In the IPTV environment, demands for VOD usually amass in a short period and user interests change dynamically. In addition, the VOD content is updated frequently by IPTV providers. These characteristics make cache design a new issue.
In this research, we propose a window-based popularity mechanism to derive the popularity degree of VOD in the IPTV environment. Considering the challenges in IPTV systems, our method is designed to react promptly to changes in user preferences. Compared to A-Lu 5910, proposed by Verhoeyen et al. [6], WBPM is simpler and more robust because it does not need to maintain and update the popularity degree of every VOD in each iterative period. Furthermore, the experimental results demonstrate that global popularity is more reliable than local popularity, since a VOD may be regarded as popular in a local area only because the total number of users there is small. Our proposed methods calculate the popularity of VOD during a specific time window and respond to context automatically and promptly. From the experimental results, we find that the popularity prediction accuracy of WBPM is superior to that of the conventional popularity tracking method and that WBPM made a significant improvement on a practical system. In future work, we intend to further analyze the cache problem for more diverse VOD with distinct business policies. Moreover, our current study does not consider the actual size of VOD files and its relationship to server capacity; content allocation optimization will also be our next topic of interest.

Figure 1: Proposed framework of window-based popularity caching system.

Figure 2: Frequency of VOD content accessing versus content ranking.

Figure 3: Cumulative distribution of requests to VOD.

Figure 5: A few typical traces of the daily popularity of VOD contents on MOD.

Figure 7: Illustration of local window-based popularity gathered at global area.

Figure 11: Comparison of all methods.

During $T_{a,b}$, the popularity of each content item from each area forms a set $LP^{A_G}_{T_{a,b}}$, expressed as $\{Lp^{A_i}_{V_j,T_{a,b}}, \ldots, Lp^{A_N}_{V_M,T_{a,b}}\}$ in descending order. A popular content set $LPCS^{A_G}_{T_{a,b}}$ is then created, which contains the top $K$ percent of the elements in $LP^{A_G}_{T_{a,b}}$. Each item in $LPCS^{A_G}_{T_{a,b}}$ with popularity degree $Lp^{A_i}_{V_j,T_{a,b}}$ is allocated to area $A_i$.

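$$\text{local hit ratio} = \frac{\text{Number of Requests to Local Servers}}{\text{Number of Total Requests}} \times 100\%, \qquad \text{global hit ratio} = \frac{\text{Number of Requests to Global Servers}}{\text{Number of Total Requests}} \times 100\%. \quad (4)$$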
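
The two ratios above can be computed as in the sketch below. It assumes that every request is served by exactly one tier, so the total is the sum of the local and global counts; the function name `hit_ratios` is hypothetical.

```python
def hit_ratios(local_requests: int, global_requests: int) -> tuple[float, float]:
    """Return (local hit ratio, global hit ratio) in percent.

    Assumption: each request is served either locally or globally, so the
    total number of requests is the sum of the two counts.
    """
    total = local_requests + global_requests
    if total == 0:
        raise ValueError("no requests recorded")
    return 100.0 * local_requests / total, 100.0 * global_requests / total
```

Under this assumption the two ratios always sum to 100%, so a rise in the local hit ratio is mirrored by an equal fall in the global one.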

Figure 14: Hit ratio of MOD adopting WBPCS for a long time.

Definition 1. A video request sequence during a time window $T_{a,b}$, denoted $R^{A_i}_{T_{a,b}}$, is a sequence of video requests that are issued from local area $A_i$ and ordered by arrival time from $t_a$ to $t_b$. $R^{A_i}_{T_{a,b}}$ is expressed as $\langle r^{A_i}_{V_j,t_a}, r^{A_i}_{V_k,t_{a+1}}, \ldots, r^{A_i}_{V_l,t_b} \rangle$ with $t_a < t_{a+1} < \cdots < t_b$, where $r^{A_i}_{V_j,t_a}$ denotes the request for VOD object $V_j$ from area $A_i$ at time $t_a$. If $r^{A_i}_{V_j,t_a} = 1$, $V_j$ must be supplied from the central server in global area $A_G$. $t_a$ is the starting time index of the time window, and $t_b$ is the last time index of the time window.