SDP-Based Quality Adaptation and Performance Prediction in Adaptive Streaming of VBR Videos



Introduction
Nowadays, video services are increasingly popular on the Internet. According to a recent study and forecast [1], global Internet video traffic will be 80% of the entire consumer Internet traffic in 2019. Besides, the HTTP protocol has become a cost-effective solution for video streaming thanks to the abundance of Web platforms and broadband connections [2,3]. Furthermore, for interoperability of HTTP streaming in the industry, ISO/IEC MPEG has developed "Dynamic Adaptive Streaming over HTTP" (DASH) [4] as the first standard for video streaming over HTTP.
DASH requires a video to be available in multiple bitrates and split into small segments, each containing a few seconds of playtime. Based on the current network conditions and terminal capacity, the client can adaptively decide a suitable data rate so that stalling is avoided and the available bandwidth is utilized as fully as possible. If the video is encoded in only one bitrate, either the bitrate is lower than the available bandwidth, which gives smooth playback but leaves unused resources that could have provided better video quality, or the bitrate is higher than the available bandwidth, which leads to video stalling. Thus, DASH enables service providers to improve resource utilization and quality of experience (QoE).
So far, existing studies have proposed simple heuristics for adapting video at the client. These heuristics can be divided into two types: buffer-based methods and throughput-based methods. The purpose of buffer-based methods is to keep the buffer stable within a certain range to ensure continuous video playback. However, when the bandwidth is drastically reduced, buffer-based methods may cause sudden changes of bitrate [5][6][7][8]. Meanwhile, throughput-based methods adaptively decide the version based on the estimated throughput. These methods are generally able to react quickly to throughput variations; the streaming quality, however, may be unstable [9].

Advances in Multimedia
Recently, several Markov decision-based methods have been proposed to optimize decision making for the streaming client under time-varying network conditions. However, these existing methods mostly focus on constant bitrate (CBR) videos. The authors in [10] are the first to propose an adaptation algorithm in which stochastic dynamic programming (SDP) is employed to find optimal decision policies when streaming VBR videos. The segment requests are ruled by the policies, which map a control parameter to every possible state of the system; however, the method is limited to videos with weak bitrate fluctuations. To the best of the authors' knowledge, in the context of adaptive streaming, there have not been any methods that could (1) support variable bitrate (VBR) videos with strong bitrate fluctuations and (2) predict the streaming performance with different streaming settings in order to select the optimal one.
In this paper, we tackle these challenges by proposing an adaptation method using stochastic dynamic programming. Firstly, we discretize a system including data throughput, buffer level, and bitrate of a VBR video to form the system states. Secondly, we define a cost function that takes into account parameters that affect the subjective perceptual quality of users. In the cost function, weights are assigned to the difference between data throughput and the bitrate of the next segment, the deviation of the buffer from its optimal value, and the quality switch of the video. Finally, we construct an infinite horizon problem (IHP) and solve it to find the optimal policies for all system states. The role of a policy is to map the control parameter (i.e., the version of the video) to every possible state of the system. This paper is an extended work of our preliminary study in [11]. The extension in this work is multifold. First, we predicted the CDF of the requested versions in a streaming session, so the maximum version could be decided for the streaming session. Second, we predicted the CDF of the buffer levels to know the variance of the buffer level under the fluctuations of the network. Finally, we also evaluated the proposed method in the online context, where the bandwidth statistics are updated periodically. Besides, we compared the performance prediction results with the measured ones in both offline and online contexts.
A policy is optimal when it minimizes an average cost. Based on the obtained policies and the constructed system model, we develop mathematical models that can predict the streaming performance for a new streaming session, including average video version, average version switch per segment, average buffer, and average underflow probability. Experiments are conducted to verify the mathematical models by comparing the predicted performance obtained from the models with the measured performance. The proposed method is evaluated in two contexts: (1) an offline context using statistics of a history bandwidth trace and (2) an online context using bandwidth statistics of previous video segments.
The paper is organized as follows. Section 2 briefly reviews related work. Section 3 describes the system and the modeling of the system in detail. Section 4 presents the formulation of the IHP, which is solved by SDP. The performance prediction is given in Section 5. Section 6 presents the experimental results and discussions. Finally, Section 7 concludes our work.

Related Work
In recent years, many heuristic adaptation methods for adaptive streaming have been developed (e.g., [5][6][7][8][9], [14]). An extensive evaluation of typical adaptation methods has been carried out in [15]. Though these methods prove to be effective in their specific settings, they cannot quantitatively predict the streaming performance under different system settings. Furthermore, most of them focus on CBR videos.
For streaming VBR video, several adaptation methods have been proposed based on the sensibility of the receiver's buffer [6,16]. Dubin et al. [16] propose an adaptation logic that supports its bandwidth estimation decisions based on the client buffer redundancy. This method considers the fluctuations of the mobile network without emphasizing the bitrate fluctuation characteristics of VBR videos, which leads to a lack of smoothness. In [6], a partial-linear trend prediction model is developed to estimate the trend of client buffer level variation. The client keeps the current version for the next segment when the estimated buffer level shows no significant change. The drawback of this method is the sudden version switch when the actual buffer level drops dramatically. In [14], the authors propose an adaptive logic for VBR videos based on bitrate estimation. The method demonstrates an effective adaptation behavior as it keeps the buffer at a stable and high level; however, it is still qualitative and has no mechanism to balance the streaming performance.
Several recent studies have proposed mathematical model based adaptation methods for streaming video [17][18][19][20][21]. A mathematical model is proposed in [17] to calculate the underflow probability of VBR streaming over a VBR channel based on the initial delay and the maximum buffer. However, this work only considered VBR videos with one version and a constant play-out curve, without developing any adaptive logic. Meanwhile, Kang et al. [18] present a no-reference, content-based QoE estimation model for video streaming service over wireless networks using neural networks. Nonetheless, neural networks are computationally complex and require large training data and long training time. Besides, Xiang et al. [19] propose a rate adaptation method using the Markov Decision Process to obtain an optimal streaming strategy for VBR videos. Nevertheless, their proposal is not able to adapt to real bandwidth changes and has no performance prediction. The prediction of streaming performance was first proposed for streaming CBR videos in [20] and VBR videos in [21] by Liu et al. In their studies, a video session is divided into subsessions. With a given target rebuffering probability, the video bitrate (for CBR videos) or the average video bitrate (for VBR videos) is predicted for each streaming subsession. The results show that the average actual rebuffering probability achieved by these methods is reasonably close to the target. However, they have not assessed video quality or video quality switches, so the QoE can be affected.
Recently, SDP has been recognized as an effective technique for solving optimization problems in video streaming [10,22,23]. For instance, the authors in [22] apply SDP to find the optimal policy for choosing sending rates when streaming on-demand scalable VBR videos over a wireless network. Nevertheless, they have not considered the effect of channel-state aware adaptation. Meanwhile, García et al. [10] construct an infinite horizon problem and apply SDP to solve it specifically for HTTP adaptive streaming. They propose a channel model in which transitions are only possible to adjacent states, with equal probabilities, which is only suitable for stable bandwidth with little fluctuation. In addition, by observing the histograms of segment sizes encoded with VBR, the authors assume that the segment size (which is proportional to the segment bitrate) can be modeled through a discrete Gaussian distribution. Then, the probability distribution of segment sizes is used to calculate transition probabilities. However, the probability distribution of segment sizes is not taken into account in the cost function. Instead, the average bitrate representing the bitrate of each version is used. This is only reasonable when the deviations of segment sizes (i.e., segment bitrates) are small. In another study, Xing et al. [23] use SDP to find the optimal solution for Advanced Video Coding content when streaming through several wireless connections at the same time. They offer a cost function in terms of QoE, but the computational complexity of their model is high because each state contains eight system variables.
The SDP-based method proposed for HTTP streaming in this paper differs from the previous studies in several points. First, our method considers an actual time-varying bandwidth of mobile networks. Second, the drastic bitrate fluctuations of actual VBR videos are effectively supported. Third, we develop mathematical models to predict the performance of a streaming subsession, which helps to select the maximum allowed version parameter in advance.

System Modeling
3.1. System Overview. Figure 1 shows the functional diagram of a general DASH system consisting of a server and a client. The server holds the media files with different quality versions. Each version is further divided into small segments. The client has the information about the characteristics and locations of media segments and can request any of them during a streaming session. For the next segment, the client decides which video version to request based on the current status of the client buffer and the data throughput, to provide the best streaming experience possible. In this paper, the segment selection policy for the client aims at maintaining the client buffer at a reasonable level while balancing between average version (i.e., average quality) and average version switch (i.e., quality variations). In order to apply SDP, our system is modeled as a discrete-time stochastic one. Specifically, the timeline is divided into time stages. At each stage $k$, the system is represented by a state variable $s_k$. When the next segment is completely downloaded at stage $k + 1$, the system moves to the state $s_{k+1}$. As the system transits to the next state, a certain cost occurs. Besides, the channel, the buffer, and the media are discretized as explained in the following subsection.

3.2. Channel, Buffer, and Media Model.
In our work, we discretize the bandwidth range into $W$ levels. The bandwidth trace before and after discretization is shown in Figure 2 with $W = 10$. With this level of quantization, the quantized bandwidth covers the original bandwidth well. We then create $W$ different bandwidth states $BW_i$ ($1 \le i \le W$) from these $W$ bandwidth levels. The value of each bandwidth state is the average of the maximum value and minimum value of the corresponding bandwidth level. To represent the transition from one bandwidth state to another on a bandwidth state space, we use the Markov chain model, which has been used widely in previous studies [10,24,25].
Figure 3 presents a general Markov-chain model which consists of three bandwidth states. Each state is represented by one data throughput value. There is a transition probability when the bandwidth moves from one state to another after each time step. Thus, by simply extracting the statistics from the bandwidth history, the transition probabilities between all bandwidth states are generated.
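The channel model above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: equal-width bandwidth levels, transition probabilities estimated by counting consecutive samples, and the function names `discretize` and `transition_matrix` are hypothetical rather than taken from the paper.

```python
def discretize(trace, num_levels):
    """Map each throughput sample to a level index 0..num_levels-1 and
    return (level indices, representative value of each level)."""
    lo, hi = min(trace), max(trace)
    width = (hi - lo) / num_levels  # ASSUMPTION: equal-width levels
    if width == 0:
        raise ValueError("trace has no variation; cannot form levels")
    # The largest sample sits on the upper edge, so clamp it into the last level.
    levels = [min(int((x - lo) / width), num_levels - 1) for x in trace]
    # State value: average of the minimum and maximum value of the level.
    states = [lo + (i + 0.5) * width for i in range(num_levels)]
    return levels, states


def transition_matrix(levels, num_levels):
    """Estimate P[i][j] = Pr(next level is j | current level is i) by counting."""
    counts = [[0] * num_levels for _ in range(num_levels)]
    for cur, nxt in zip(levels, levels[1:]):
        counts[cur][nxt] += 1
    matrix = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # ASSUMPTION: a level with no observed departure keeps its state.
            matrix.append([1.0 if j == i else 0.0 for j in range(num_levels)])
        else:
            matrix.append([c / total for c in row])
    return matrix
```

For a history trace in kbps, `discretize(trace, 10)` would correspond to the ten-level setting used in the experiments, and `transition_matrix` would then give the statistics that drive the bandwidth part of the system state.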
Similar to the bandwidth trace, we divide the buffer into $B$ levels from 0 to $B_{\max}$, with $B_{\max}$ being the buffer size. In addition, we denote the video version by $v$ and represent the version with the lowest quality as $v = 1$ and the highest quality as $v = V_{\max}$, assuming that the video has $V_{\max}$ different versions. The bitrates of different versions of a VBR video are shown in Figure 4.
Because the segment bitrate of a VBR video version fluctuates very strongly, we divide the bitrate of each version into $I$ intervals (from interval 1 to interval $I$), each of which is represented by its average bitrate value. For example, if a version that has the highest bitrate of 5000 kbps is divided into 10 bitrate intervals, interval 1 will range from 0 to 500 kbps and its representative bitrate will be 250 kbps.
We assume that all versions at a bitrate interval represent a separate video flow. If the current segment bitrate belongs to one interval, the next segment bitrate will also belong to that interval regardless of segment version. With this assumption, we generate $I$ different policy sets for the $I$ bitrate intervals. When a segment is completely downloaded, the client measures its bitrate to find the bitrate interval it belongs to. Then, the client determines the policy set corresponding to that bitrate interval.
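The interval lookup that the client performs after each download might look like the following sketch. The function name and the boundary convention (a bitrate exactly on an edge goes to the higher interval) are assumptions made for illustration.

```python
def bitrate_interval(segment_kbps, max_kbps, num_intervals):
    """Return (interval index starting at 1, representative bitrate in kbps)."""
    width = max_kbps / num_intervals
    # Clamp so that a segment at exactly max_kbps lands in the last interval.
    index = min(int(segment_kbps / width), num_intervals - 1) + 1
    # Representative bitrate: midpoint of the interval.
    representative = (index - 0.5) * width
    return index, representative
```

With the numbers from the text, `bitrate_interval(300, 5000, 10)` gives interval 1 with a representative bitrate of 250 kbps, and the returned index selects which of the policy sets to consult.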

Problem Formulation and Solution
4.1. System State. With the system being discretized above, we observe the system state variable $s_k(b_k, bw_k, v_k)$ when a video segment is completely downloaded at stage $k$. Here, $b_k$ is the buffer level representing the number of segments available in the buffer, $bw_k$ is the bandwidth whose value belongs to $\{BW_i\}$, and $v_k$ is the version index of the downloaded segment. The case where $b_k = 1$ corresponds to the buffer underflow event. In each state $s_k$, the system may choose any action $a$. For our system, an action is basically a decision about the version for the next segment. As there are $V_{\max}$ versions to choose from, we have $V_{\max}$ possible actions in total.
The system then randomly moves into a new state $s_{k+1}$ at the next time step, resulting in a corresponding cost $C(s_k, s_{k+1}, a)$. With each bitrate interval $j$ ($1 \le j \le I$), we have a system state set $\mathbf{s}_j$ and a policy set $\pi_j$. Let $N$ be the number of states in $\mathbf{s}_j$, and we have

$$N = B \times W \times V_{\max}. \tag{1}$$

4.2. Transition Probabilities. Since the system is stochastic, which means the system outcome of each action $a$ is not deterministic, the state transition probability between every two states that depends on action $a$ must be constructed. We denote the probability that state $s_k$ will lead to state $s_{k+1}$ given action $a$ as follows:

$$P_a(s_k, s_{k+1}) = \Pr(s_{k+1} \mid s_k, a). \tag{2}$$

Due to the independence among $(b_k, bw_k, v_k)$, we have

$$P_a(s_k, s_{k+1}) = \Pr(b_{k+1} \mid b_k, bw_{k+1}, a)\,\Pr(bw_{k+1} \mid bw_k)\,\Pr(v_{k+1} \mid a). \tag{3}$$

In the right hand side of (3), the first term can be calculated as follows:

$$\Pr(b_{k+1} \mid b_k, bw_{k+1}, a) = \begin{cases} 1, & b_{k+1} = b_e, \\ 0, & \text{otherwise}, \end{cases} \tag{4}$$

where $b_e$ is the next buffer level estimated based on the current system status and action $a$. We calculate $b_e$ from the current buffer level $b_k$, the next bandwidth $bw_{k+1}$, and $R_a$, the bitrate of the target version.
When the throughput significantly drops, meaning a very low value of $bw_{k+1}$, $b_e$ could be lower than zero. However, at the beginning of a new stage, there is one segment being downloaded, so at least one segment is always in the buffer. Therefore, the estimate is modified so that $b_e$ is never lower than one segment. The second term is easily obtained from the bandwidth model. The third term can be simply calculated by

$$\Pr(v_{k+1} \mid a) = \begin{cases} 1, & v_{k+1} = a, \\ 0, & \text{otherwise}. \end{cases} \tag{7}$$

Thus, expression (3) can be simplified as follows:

$$P_a(s_k, s_{k+1}) = \begin{cases} \Pr(bw_{k+1} \mid bw_k), & b_{k+1} = b_e \text{ and } v_{k+1} = a, \\ 0, & \text{otherwise}. \end{cases} \tag{8}$$

Algorithm 1: Finding the policy set for one interval. The iteration repeats until $\pi_{n+1} = \pi_n$ and $h_{n+1}(s) = h_n(s)$, $\forall s \in \mathbf{s}$.
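As a rough illustration of how such a transition probability could be evaluated, the sketch below treats a state as a tuple (buffer level, bandwidth state index, version). The buffer update inside `next_buffer_level` is an assumption and not necessarily the paper's formula: it supposes that downloading a segment takes bitrate/throughput segment durations, that the buffer drains by that amount and then gains one segment, and that the result is rounded down and kept between 1 and the maximum level. All function and parameter names are hypothetical.

```python
import math


def next_buffer_level(b_k, bitrate_kbps, bw_next_kbps, b_max):
    """Estimated buffer level, in segments, once the next segment has arrived."""
    # ASSUMPTION (not from the paper): the download lasts bitrate / throughput
    # segment durations; the buffer drains by that much, then gains one segment,
    # and the result is rounded down to a whole level.
    estimate = math.floor(b_k + 1 - bitrate_kbps / bw_next_kbps)
    # At least one segment is always counted; the cap at b_max is also assumed.
    return max(1, min(b_max, estimate))


def transition_probability(state, next_state, action,
                           bw_matrix, bw_value, bitrate_of, b_max):
    """Pr(next_state | state, action) for states (buffer, bw index, version)."""
    b_k, bw_k, _ = state
    b_next, bw_next, v_next = next_state
    if v_next != action:              # the next version is fixed by the action
        return 0.0
    b_e = next_buffer_level(b_k, bitrate_of[action], bw_value[bw_next], b_max)
    if b_next != b_e:                 # the next buffer level is fixed by b_e
        return 0.0
    return bw_matrix[bw_k][bw_next]   # only the bandwidth move is random
```

Because the buffer level and the version of the next state are fully determined once the next bandwidth state is known, each row of the resulting transition matrix has at most as many nonzero entries as there are bandwidth states, which keeps the model sparse.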

4.3. Cost Function.
In this section, a cost function is defined to punish the situations that may cause a decrease in users' QoE. We focus on three objective parameters that strongly affect the subjective perception of the users: quality level, video stalling, and quality switch. First, the cost function should favor a selected bitrate that is close to the current bandwidth, so it punishes the difference $\Delta_R$ between the current bandwidth and the bitrate of the next segment selected by action $a$. Second, to prevent video stalling, the buffer should never underflow. We define an optimal buffer level $b_{\mathrm{opt}}$ that is a desired value the client should try to keep during a streaming session. When the buffer level is close to $b_{\mathrm{opt}}$, buffer underflow is avoided. Therefore, the cost function penalizes the deviation $\Delta_B$ of the current buffer level from the optimal buffer level. Third, in order to reduce the quality switches, the cost function should contain the difference $\Delta_V$ between the selected quality and the last one, which punishes a QoE reduction caused by high quality variations. Let $\alpha$, $\beta$, $\gamma$ be the trade-off parameters of the three objectives, namely, quality level, video stalling, and quality switch, respectively. The cost incurred when the system changes from state $s_k$ to state $s_{k+1}$ given action $a$ can be calculated by

$$C(s_k, s_{k+1}, a) = \alpha \Delta_R + \beta \Delta_B + \gamma \Delta_V. \tag{12}$$
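A weighted cost of this shape can be sketched as follows. The three penalty terms are taken here as plain absolute differences, which is an assumption for illustration (the exact definitions may differ), and the default weights are the empirical values reported in the experiment setup. The function and parameter names are hypothetical.

```python
def transition_cost(bw_kbps, bitrate_kbps, buffer_level, buffer_opt,
                    last_version, action,
                    alpha=0.003, beta=4.0, gamma=20.0):
    """Weighted sum of bitrate mismatch, buffer deviation, and version switch."""
    # ASSUMPTION: each penalty is an absolute difference.
    bitrate_gap = abs(bw_kbps - bitrate_kbps)      # bandwidth vs. next segment bitrate
    buffer_gap = abs(buffer_level - buffer_opt)    # buffer vs. its optimal level
    version_gap = abs(action - last_version)       # selected vs. previous version
    return alpha * bitrate_gap + beta * buffer_gap + gamma * version_gap
```

For example, with a 3000 kbps bandwidth, a 2000 kbps target segment, a buffer one segment below its optimum, and a one-step version change, the three terms contribute about 3, 4, and 20, which shows how the small bitrate weight keeps the kbps-scale term comparable to the other two.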

4.4. Optimization Solution.
As the system is discrete and the number of states is large, we can formulate an infinite horizon problem. For every state $s_k$, the most appropriate action $a$, called the policy for state $s_k$, has to be decided so that the mean cost per state is minimum. As mentioned above, our system has $I$ system state sets corresponding to the $I$ bitrate intervals, so we have to find $I$ corresponding policy sets. For simplicity, we only present the optimization solution for a general bitrate interval with the system state set $\mathbf{s}$ and a corresponding policy set $\pi$. Optimal solutions for all bitrate intervals are found similarly. Mathematically, we have to minimize $\bar{C}_K$, which is the average cost per state obtained after downloading $K$ video segments. $\bar{C}_K$ is calculated as follows:

$$\bar{C}_K = \frac{1}{K} \sum_{k=1}^{K} C_k(s_k, s_{k+1}, a), \tag{13}$$

with $C_k(s_k, s_{k+1}, a)$ being the cost incurred after downloading the $k$th segment. Here, $K$ is the number of state transitions and is also the number of video segments in the session. Based on the standard policy iteration algorithm (PIA) [26], we solve the IHP by using the algorithm presented in Algorithm 1.
Applying the PIA of SDP for the $I$ bitrate intervals, we generate $I$ policy sets which serve as a look-up table mapping each state to an optimal action. Thus, the client is able to decide an appropriate version for the next segment based on the current system condition.
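To make the idea of an average-cost look-up table concrete, the sketch below solves a small average-cost problem with relative value iteration. Note that this is a swapped-in technique, a simpler relative of the policy iteration algorithm used in the paper, and it assumes a well-behaved (unichain, aperiodic) model. The array layout `P[a][s][t]` and `C[a][s][t]` and the function name are hypothetical.

```python
def relative_value_iteration(P, C, sweeps=1000, tol=1e-9):
    """Average-cost solver. P[a][s][t] is the probability of moving from
    state s to state t under action a; C[a][s][t] is the cost of that move.
    Returns (policy as a list of action indices, average cost per stage)."""
    n_actions, n_states = len(P), len(P[0])
    h = [0.0] * n_states              # relative values, anchored at state 0
    gain = 0.0
    for _ in range(sweeps):
        # Expected cost-to-go of every (state, action) pair.
        q = [[sum(P[a][s][t] * (C[a][s][t] + h[t]) for t in range(n_states))
              for a in range(n_actions)] for s in range(n_states)]
        best = [min(row) for row in q]
        gain = best[0]                # estimate of the average cost per stage
        new_h = [x - gain for x in best]
        converged = max(abs(x - y) for x, y in zip(new_h, h)) < tol
        h = new_h
        if converged:
            break
    # The policy is the look-up table: cheapest action for each state.
    policy = [min(range(n_actions), key=lambda a: q[s][a]) for s in range(n_states)]
    return policy, gain
```

In the setting of this paper, one such table would be computed per bitrate interval, and the returned action index would stand for the version to request next.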
Let $O_{\mathrm{prob}}$, $O_{\mathrm{cost}}$ be the computational complexity of the calculation of the transition probability and the cost from one state to the remaining states, respectively. Let $O_{\mathrm{PIA}}$ be the computational complexity of Algorithm 1. The complexity of our model is $O = O_{\mathrm{prob}} + O_{\mathrm{cost}} + O_{\mathrm{PIA}}$. For each interval, each action, and each state, we consider the cost and the transition probability to the $N - 1$ remaining states. So the complexity of the calculation of the transition probability and the cost is described as follows:

$$O_{\mathrm{prob}} + O_{\mathrm{cost}} = O\big(I \cdot V_{\max} \cdot N \cdot (N - 1)\big). \tag{14}$$

The complexity $O_{\mathrm{PIA}}$ of the policy iteration algorithm follows from [27], and the complexity of our model is the sum of these three terms.

Performance Prediction
After Section 4, we obtain $I$ policy sets corresponding to the $I$ intervals of a video bitrate. In this section, we use the Markov chain model to predict the streaming performance for a session. Similar to Section 4.4, this section only presents the performance prediction for a general bitrate interval with a system state set $\mathbf{s}$ and a corresponding policy set $\pi$. Performance for all bitrate intervals is predicted similarly.
After carrying out the calculation for all $I$ bitrate intervals, we take the average values as the final results.
The key is to determine the average state probability $\bar{\mathbf{p}} = [\bar{p}_1, \bar{p}_2, \ldots, \bar{p}_N]$ with $\bar{p}_i$ ($1 \le i \le N$) being the average probability that the system is at the $i$th state throughout the streaming session. The probability $\bar{\mathbf{p}}$ is the average value of the state probabilities after downloading $k$ segments, $\mathbf{p}_k = [p_{k,1}, p_{k,2}, \ldots, p_{k,N}]$ ($1 \le k \le K$). Here, $p_{k,i}$ ($1 \le i \le N$) is the probability that the system is at the $i$th state after downloading $k$ segments. From the Markov chain theory, the state probability after downloading $k + 1$ segments can be computed as the product of the state probability after downloading $k$ segments and a transition matrix, $\mathbf{p}_{k+1} = \mathbf{p}_k \mathbf{P}_{TS}$, assuming that the initial probability $\mathbf{p}_0$ is known. The transition matrix $\mathbf{P}_{TS}$ is an $N \times N$ matrix which represents the transition probability from state $s_k$ to state $s_{k+1}$. $\mathbf{P}_{TS}$ is defined as follows:

$$\mathbf{P}_{TS}(i, j) = \Pr\big(s_{k+1} = j \mid s_k = i,\ a = \pi(i)\big), \quad 1 \le i, j \le N, \tag{17}$$

where $\pi(i)$ is the action that the policy assigns to the $i$th state. The average state probability $\bar{\mathbf{p}}$ is calculated as follows:

$$\bar{\mathbf{p}} = \frac{1}{K} \sum_{k=1}^{K} \mathbf{p}_k. \tag{18}$$
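The propagation and averaging of state probabilities can be sketched as below. The function name is hypothetical, and the choice to average over the probabilities after segments 1 through the last one (excluding the initial distribution) is an assumption.

```python
def average_state_probability(p0, P_ts, num_segments):
    """Propagate p_{k+1} = p_k * P_ts for num_segments steps and return the
    mean of the resulting state probability vectors."""
    n = len(p0)
    p = list(p0)
    total = [0.0] * n
    for _ in range(num_segments):
        # Row vector times matrix: probability of being in state j next.
        p = [sum(p[i] * P_ts[i][j] for i in range(n)) for j in range(n)]
        total = [acc + x for acc, x in zip(total, p)]
    # ASSUMPTION: the average runs over segments 1..num_segments, without p0.
    return [x / num_segments for x in total]
```

For a two-state chain that alternates deterministically, starting in the first state and running four segments, the average probability is 0.5 for each state, as expected.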
Currently, most of the adaptation algorithms developed for HTTP streaming are qualitative in the sense that the performance metrics can only be obtained after the streaming session. In this study, the predicted streaming performance can be calculated based on the average state probability and the information inside every state. Specifically, we mainly focus on the following aspects: bitrate prediction, quality switch prediction, and buffer prediction.

5.1. Quality Prediction.
The video quality that the users perceive is represented by the selected version. The higher the selected version, the better the video quality perceived by the users. Furthermore, setting the maximum version for the streaming session also affects the perceptual quality of the users. Obviously, a very low value of the maximum version may result in poor perceptual quality, while a very high value may increase the chance of buffer underflow. In this section, we predict the quality performance of the streaming session based on the average version $\bar{V}$ that is calculated using (19) and the cumulative distribution function (CDF) of the versions $f(v)$ ($v \in [1; V_{\max}]$) that is shown in (20). Based on the predicted probability of the versions throughout a streaming session, the maximum version can be decided for the session:

$$\bar{V} = \sum_{i=1}^{N} \bar{p}_i\, v_i, \tag{19}$$

$$f(v) = \sum_{i:\, v_i \le v} \bar{p}_i, \tag{20}$$

where $\bar{p}_i$ is the average probability of the $i$th state and $v_i$ is the version in that state.
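The two quality metrics reduce to a weighted mean and a running sum over the average state probabilities. In the sketch below, `versions[i]` is assumed to hold the version index stored in the i-th state; the function names are hypothetical.

```python
def average_version(p_avg, versions):
    """Expected version: sum of state probability times the state's version."""
    return sum(p * v for p, v in zip(p_avg, versions))


def version_cdf(p_avg, versions, v_max):
    """Return [F(1), ..., F(v_max)] with F(v) = Pr(version <= v)."""
    cdf, running = [], 0.0
    for v in range(1, v_max + 1):
        running += sum(p for p, state_v in zip(p_avg, versions) if state_v == v)
        cdf.append(running)
    return cdf
```

A provider could run this once per candidate maximum version and keep the largest one whose predicted distribution meets the target.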

5.2. Quality Switch Prediction.
Quality switch is an important factor affecting the perception of the users. The users often expect smooth playback with a minimum number of quality switches and a small switch amplitude from one segment to the next. We can predict the average version switch per segment $\bar{V}_{sw}$ from the average state probability and the policy, since the policy fixes the version requested after each state.
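One way to compute such a switch metric is sketched below, under the assumption that a switch is measured as the absolute difference between the version stored in a state and the version that the policy requests next. The function name and the argument layout are hypothetical.

```python
def average_version_switch(p_avg, versions, policy):
    """Expected absolute version change per segment.

    versions[i]: version of the segment just downloaded in state i.
    policy[i]:   version the policy requests next from state i.
    """
    # ASSUMPTION: the switch amplitude is the absolute version difference.
    return sum(p * abs(nxt - cur)
               for p, cur, nxt in zip(p_avg, versions, policy))
```

If only the number of switches matters rather than their amplitude, the absolute difference can be replaced by an indicator that is 1 when the two versions differ.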

5.3. Buffer Prediction.
Video stalling is one of the important objective parameters that affect the subjective perception of the users. Stalling occurs when the play-out buffer underruns.
To prevent this event, the buffer must be maintained within a safe range. In this section, we evaluate the buffer performance through the average buffer level $\bar{B}$, the CDF of the buffer level $f(b)$ ($b \in [1; B_{\max}]$), and the buffer underflow probability $\Pr_{\mathrm{und}}$ (i.e., when the system stays at buffer level 1), which are described as follows:

$$\bar{B} = \sum_{i=1}^{N} \bar{p}_i\, b_i, \qquad f(b) = \sum_{i:\, b_i \le b} \bar{p}_i, \qquad \Pr{}_{\mathrm{und}} = \sum_{i:\, b_i = 1} \bar{p}_i,$$

where $\bar{p}_i$ is the average probability of the $i$th state and $b_i$ is the buffer level in that state. $\bar{B}$ represents the safety of the buffer. If $\bar{B}$ is small, the buffer is often at low levels, which may cause playback interruption when the current bandwidth drops dramatically. $f(b)$ reflects the variance of the buffer level under the fluctuations of the network. $\Pr_{\mathrm{und}}$ shows the probability that the playback would be interrupted in the streaming session.
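The three buffer metrics can be computed in one pass over the average state probabilities, as in this sketch. `buffers[i]` is assumed to hold the buffer level of the i-th state, and the function name is hypothetical.

```python
def buffer_metrics(p_avg, buffers, b_max):
    """Return (average buffer level, CDF over levels 1..b_max, underflow probability)."""
    average = sum(p * b for p, b in zip(p_avg, buffers))
    cdf, running = [], 0.0
    for level in range(1, b_max + 1):
        running += sum(p for p, b in zip(p_avg, buffers) if b == level)
        cdf.append(running)
    # Buffer level 1 is the underflow event in this model.
    underflow = sum(p for p, b in zip(p_avg, buffers) if b == 1)
    return average, cdf, underflow
```

The first entry of the CDF equals the underflow probability, which gives a quick consistency check on the prediction.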

Experimental Results
In order to evaluate the proposed system model and the accuracy of the performance prediction, in this section we perform a number of experiments in both offline and online contexts and compare the predicted performance with the measured one. We also compare our proposed method with two existing ones, namely, the SDP method presented in [10], which obtains the best performance among the SDP methods, and the bitrate estimation based method presented in [14], which is the best among the qualitative methods.

6.1. Experiment Setup.
For the simulation, our test-bed consists of a client running Java 8.0, which implements the adaptation, and a server running Apache2, which holds the media segments. The client runs on a Windows 7 computer with an Intel i5 1.7 GHz CPU and 4 GB memory, and the server runs on Ubuntu 12.04 LTS (with default TCP CUBIC) with 1 GB RAM. The channel bandwidth is simulated using DummyNet [28]. We use the Tokyo Olympic video from [6].
For the video, $V_{\max} = 9$, and, for the bandwidth, $W = 10$.
Since we measure the buffer size and compute the buffer cost in units of segment duration, we implement our adaptation method with one setting of the segment duration. In our experiments, we select a segment duration of 2 seconds, which is similar to those of [7,8,10]. The impacts of segment durations on adaptation performance have been considered in some recent studies [29,30]. Further evaluation of different segment durations with a fixed buffer size (in seconds) is reserved for our future work. The maximum buffer level is set to 5 segments (i.e., 10 seconds) and the optimal buffer level is set to 4 segments (i.e., 8 seconds).
In our method, a streaming provider can adjust the balance between the requirements for high quality level, preventing video stalling, and reducing quality switches by changing the trade-off parameters $\alpha$, $\beta$, and $\gamma$. Since selecting an optimal combination of trade-off parameters of the cost function involves solving a hard optimization problem, it will be investigated in our future work. In this paper, we select the trade-off parameters of the cost function qualitatively as follows. Initially, we fix $\beta$ (the buffer weight) and select the other two parameters. Since we want to prioritize the requirement of smooth quality switches, $\gamma$ (the quality switch weight) is selected so that the contribution of the quality switch cost $\Delta_V$ is higher than that of the buffer cost $\Delta_B$. As for $\alpha$ (the bitrate weight), because the bitrate cost $\Delta_R$ can be up to thousands, $\alpha$ should be small to reduce the contribution of the bitrate cost to the overall cost $C$. Based on our experience, good empirical values for parameters $\alpha$, $\beta$, and $\gamma$ are 0.003, 4, and 20, respectively.

6.2. Experimental Results.
In the first part of the experiment, we evaluate the accuracy of the performance prediction in the offline context using a given bandwidth trace obtained from a mobile network [12]. In this context, the number of video segments $K$ is 300.
Figures 5 and 6 show the predicted performance using the formulas presented in Section 5 and the measured performance obtained from the experiments when the maximum allowed version is set to 7, 8, and 9. These figures point out that the prediction results are close to the measured ones. We can see from Figure 5 that when the maximum version increases, the average version as well as the average version switch also increases. Figure 6 shows that, in both prediction and measurement cases, when the maximum version is 7, the underflow probability is almost zero, and it increases very slowly when the maximum allowed version increases. This analysis implies that setting the maximum version to 8 is reasonable in terms of balancing between average video quality and quality switch; meanwhile, setting the maximum version to 7 ensures a very stable streaming experience. Table 1 shows more detailed statistics of the experimental results in the three cases of maximum version. It can be concluded from the table that there is no significant difference between the predicted performance and the measured performance.
In the second part of the experiment, we use two history bandwidth traces recorded from two previous streaming sessions of the client. The CDFs of both bandwidths are shown in Figure 10.
Bandwidth trace 1 is used for calculating the statistical models and bandwidth trace 2 is used in the simulation for measuring performance parameters.
In the third part of the experiment, we consider the online context in which the prediction of the future bandwidth is based on the statistical parameters of all previous segments. Specifically, we divide an entire session into chunks, each of which has $M$ video segments. We treat each chunk individually as a mini streaming session. We assume that initially we have enough statistical data to predict the performance of the first chunk. The prediction for the subsequent chunks is done based on the previous chunks. At the beginning of each chunk, the bandwidth statistics are updated, leading to a recomputation of the policy set and the performance. In the experiment, we set $M$ to 100 video segments. It can be observed from our experiments that it took only several seconds to (re)compute the whole model. Therefore, relative to the 200 seconds of video in each chunk, the computational overhead is about 3%, which is appropriate for the online context. Figures 14, 15, and 16 show the adaptation, bitrate, and version switch behavior of the proposed method in the three cases of maximum version. The predicted performance and the measured performance in the online context in the three cases are presented in detail in Tables 3, 4, and 5. It is very clear that, in the online context, the predicted performance is also very close to the measured performance.
Next, we compare our proposed method with the SDP method [10] and the bitrate estimation based method [14] using the same simulation settings as in the first part of our experiment. The experimental results obtained by simulating the SDP method [10] and the bitrate estimation based method [14] are shown in Figures 17 and 18, respectively. We can see that both methods provide very fluctuating version switches. The detailed statistics of these adaptation methods with the maximum version set to 9 and our proposed method with the

Conclusion
In this paper, we have proposed an adaptation method for HTTP streaming based on stochastic dynamic programming. The system model targets real bandwidth traces and VBR videos with strong bitrate fluctuations. Furthermore, we have developed a model to predict the system performance with the aim of choosing the best setting based on the performance requirements. The experimental results have shown that our method can effectively adapt VBR videos and perform accurate performance prediction, which is useful in planning the adaptation policy.

Figure 5: Predicted performance and measured performance in terms of average version and average version switch per segment in offline context using a given bandwidth trace.

Figure 6: Predicted performance and measured performance in terms of average buffer and underflow probability in offline context using a given bandwidth trace.

Figures 7, 8, and 9 show the bitrate and version switch behavior of the proposed method in the three cases of maximum version. It can be seen very clearly from these figures that when the maximum version is reduced, the number of version switches (or quality changes) decreases.

Figure 8: Experimental results of proposed method in offline context using a given bandwidth trace with $V_{\max} = 8$.

Figures 11, 12, and 13 show the bitrate and version switch behavior of the proposed method in the three cases of maximum version. The detailed results are listed in Table 2. Based on these figures and table, we affirm once again that the mathematical performance prediction model agrees well with the measurement.

Figure 13: Experimental results of proposed method in offline context using a history bandwidth trace with $V_{\max} = 7$.

Table 1: Comparison of predicted performance and measured performance in offline context using a given bandwidth trace.

Table 2: Comparison of predicted performance and measured performance in offline context using a history bandwidth trace for statistics.