Video Broadcasting Using Queue Proportional Scheduling

Queue Proportional Scheduling (QPS) has been shown to be throughput optimal for Gaussian Broadcast Channels. This paper examines the use of QPS for Video Broadcasting. First, the behavior of QPS is examined as the scheduling frequency is reduced and a method is proposed that uses statistics on the arrival rates to improve its performance. The reduction of the scheduling frequency simplifies the scheduler and decreases the required operations. Then, the packet delay variation is modeled using a Markov Chain approach leading to a method for approximating the packet delay distribution. Based on the resulting distribution, it is discussed how the video encoding rate can be chosen in order to reduce the expected distortion of streams transmitted through Broadcast Channels.


INTRODUCTION
Wireless systems have been experiencing constant growth and increased popularity during the past decade. Cellular telephones are now part of most people's everyday life. Fueled by their success and the increased appetite of customers for new and improved services, next-generation cellular systems are targeting broadband applications such as data transfers and video streaming. The aim is to provide mobile users with high rates and seamless roaming [1][2][3]. When the users are stationary, even higher rates can be offered [4].
One of the services that is expected to gain popularity over the following years is video broadcasting. Digital Video Broadcasting systems will eventually replace analog transmission. In addition to DVB services, video-on-demand download services will be offered to mobile phone or computer network users. Therefore, a base station that is serving a cell will have to broadcast different video streams to the mobile users of the cell. Cellular system downlinks are typical examples of Broadcast Channels (BCs) [5], where a single transmitter sends data to more than one receiver. A well-known information theoretic result is that the attainable rate vectors in a Gaussian BC form a capacity region that can be achieved using superposition coding at the transmitter and successive interference cancellation (SIC) at each receiver. The performance of practical systems often deviates from the optimal bound. For example, TDMA systems use time division, which is sub-optimal in general, whereas CDMA systems use superposition but do not use SIC at the receiver, where all other users are treated as noise. In this paper it is assumed that the optimal BC performance is achieved by the transceiver architecture. If this is not the case, the loss in performance can be taken into account using a nonzero gap value. It is also assumed that both the transmitter and the receiver have perfect Channel State Information (CSI). This is done because the focus of this study is not on how to achieve the Gaussian BC capacity, but on how to manage the available resources in a BC in order to deliver video to the mobile users.
The capacity region of the Broadcast Channel is the union of the rate vectors that can be achieved assuming that the traffic is regular, that is, that during each time period the number of bits transmitted to a user at the physical layer is the same as the number of bits that are sent for transmission by the link layer. However, in practice, the physical layer of a communications system may be receiving data in bursts. For example, if the wireless link is the last hop of a TCP link, the packets may be arriving at irregular intervals at the transmitter due to delays along the data pipe, different routings, packet losses, and so forth. In this case the system may become unstable and the lengths of the queues of some users may not be bounded even if the average rate of each user lies inside the BC capacity region. A significant amount of research effort has been devoted to the problem of achieving the capacity region when the incoming traffic is random. Luckily, it turns out that the set of all arrival rate vectors for which it is possible to keep each queue length finite, referred to as the network capacity region, is the same as the BC capacity region. The scheduling policies that achieve the network capacity region are called throughput optimal. Several throughput optimal policies have been proposed for the BC [6][7][8]. Their common characteristic is that they rely not only on CSI, but also use Queue State Information (QSI). Therefore, they are cross-layer approaches. Among the throughput optimal cross-layer approaches, Queue Proportional Scheduling (QPS) [7][8][9] has been shown to have very desirable delay properties. Although its delay optimality for Gaussian Broadcast Channels has not been proved to date, it results in the smallest average packet delay among the known throughput-optimal algorithms, thus making it a good candidate for video transmission where large delays may lead to packet losses, and, consequently, distortion.
As will be explained in more detail in Section 2, QPS allocates resources in the BC based on the channel state as well as the queue lengths. In this paper, a simplified version of QPS is proposed that uses Queue State Information less frequently in order to reduce the computational burden. This way the scheduler becomes simpler, since it does not require access to the queue during each scheduling period. It is shown that, under some conditions on the average arrival rate, the modified algorithm is throughput optimal. However, as is expected from the fact that less information is used, it exhibits performance degradation compared to QPS with continuous use of QSI. This is verified using simulation. Then, the packet delay is modeled using Markov Chains. More specifically, a Markov Chain model is fitted to simulation data and is then used to approximate the probability distribution of the delay of the packets. It is shown that, although the service rate depends on the queue size as well as on the states of the other queues, the approximation is satisfactory. Using information on the expected delay and the corresponding distortion it is possible to choose the video encoder rate in a system employing QPS in order to control the quality of video that is delivered to the users of a BC.
This paper is organized as follows. Section 2 examines the degradation of the performance of QPS as the frequency of using QSI for scheduling decreases and proposes a modification that reduces the performance gap. It is also shown that the modified scheme is throughput optimal under a condition on the average arrival rates. In Section 3 the packet delay is modeled using a Markov Chain model leading to a method that approximates the delay distribution. Section 4 discusses how the distribution of the packet delays can be used to predict the video distortion corresponding to a given encoder rate, leading to a discussion on choosing the encoder rate for video streams that are sent to users of a Broadcast Channel. Finally, Section 5 contains concluding remarks.

QPS WITH LESS FREQUENT USE OF QUEUE STATE INFORMATION IN GAUSSIAN BROADCAST CHANNELS
Figure 1 depicts the system model that is used in this article. Packets arrive randomly to each queue and are scheduled for transmission. The scheduler allocates the resources of the BC using information on the channel taps $h_i$ (CSI) and the queue states $Q_i(t)$ (QSI). In this article, the scheduler uses Queue State Information only periodically. Moreover, the channel taps are assumed to be constant. The output signal $X(t)$ is broadcast to the channel, and the signal at each receiver $i$ is equal to
$$Y_i(t) = h_i X(t) + n_i(t). \qquad (1)$$
This paper assumes a Gaussian BC, that is, the $n_i(t)$ are i.i.d. zero-mean Gaussian random variables with double-sided power spectral density equal to $N_0/2$. The capacity region of a Gaussian BC in bits/s, assuming, without loss of generality, that $|h_i|^2 \le |h_j|^2$ for $i < j$, and that the available bandwidth is equal to $2W$, is given in [5] as a set of rate vectors parameterized by $P$ and the $\alpha_i$, where $P$ is the average power of $X(t)$ and the $\alpha_i \ge 0$ trace the whole simplex, that is, $\sum_i \alpha_i = 1$. The capacity region is achieved by superposition coding at the transmitter and by successive interference cancellation at each receiver. When the traffic is regular, the transmitter can accommodate any rate vector $R$ that is inside the capacity region (2). In the following, $R$ is the number of bits transmitted during the scheduling interval (which is assumed to be equal to 1 for simplicity), so it is expressed in bits and not in bits/s.
In this paper it is assumed that traffic arrives irregularly and in packets. The number of packets $A_i(t)$ arriving at queue $i$ during a time period is a Poisson process with rate $\lambda_i$, and arrivals at each queue are independent. The packet lengths $M_i$ in bits are assumed to be i.i.d. exponentially distributed with $E[M_i] = \mu_i$ and independent of $A_i(t)$. Therefore, the arrival rate of bits is $\lambda_i \mu_i$. Infinite-capacity queues are considered. At the end of each interval, the scheduler decides on the rate $R_i$ of each queue based on the channel gains $h$ and the vector of bits (or packets) $Q(t)$. Other than that, no knowledge of the statistics of the arrival process is required as long as the arrival rate lies inside the capacity region $C_{BC}$. In this article, $Q(t)$ may not be used by the scheduler during some scheduling periods, as is explained in more detail in this section. The resulting queue state after transmission is
$$Q_i(t+1) = \left(Q_i(t) - R_i(t)\right)^+ + Z_i(t), \qquad (3)$$
where $a^+ = \max\{a, 0\}$, and $Z_i(t)$ is the number of bits arriving at queue $i$ during one scheduling period $T_s$. Queue Proportional Scheduling (QPS) calculates the rate vector $R(t)$ as the intersection of the ray $xQ(t)$ and the boundary of $C_{BC}$. In [8] the bit-based QPS is considered, and it is shown that $R(t)$ is the solution of a Geometric Program and is therefore globally optimal. Note that, since $R(t) \le Q(t)$ when QPS is used, (3) can be rewritten as
$$Q_i(t+1) = Q_i(t) - R_i(t) + Z_i(t).$$
In [8] the bandwidth $W$, the scheduling period $T_s$ and the average packet length for all queues are set to 1. In this article the scheduling period remains equal to 1, but Queue State Information is only used once every $L$ scheduling periods. Naturally, if, during scheduling period $t$, the information on $Q(t - L)$ is used, the scheduling will not be done based on the current needs of the user corresponding to each queue. It is expected (and verified by simulation) that this will lead to larger fluctuations of the service rates, and, consequently, larger average queue sizes and packet delays. However, if the scheduler knows the average arrival rate of each queue, it can approximate the queue size $Q(t)$ by an estimate based on that rate. From this point on, $\lambda$ is the bit arrival rate, that is, the product of the packet arrival rate and the average packet size. This is done for simplicity and for compatibility with the notation used in some of the references. Although not as accurate as the actual $Q(t)$, this approximation will, on the average, be better than $Q(t - L)$. As $L$ grows, that is, as use of QSI becomes less frequent, $Q(t)$ will be close to $\lambda$, assuming that $\lambda \in C_{BC}$. Based on the above observations, the following heuristic modification of QPS is proposed in order to reduce the use of QSI for scheduling.
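The traffic model and the queue recursion described above can be illustrated with a short sketch. It assumes the queue update takes the form $Q_i(t+1) = (Q_i(t) - R_i(t))^+ + Z_i(t)$ with a scheduling period of 1; the function names `poisson`, `arriving_bits` and `update_queue` are hypothetical and not taken from the paper.

```python
import math
import random


def poisson(rng, lam):
    """Draw a Poisson(lam) sample with Knuth's method (fine for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, 1.0
    while True:
        product *= rng.random()
        if product <= threshold:
            return count
        count += 1


def arriving_bits(rng, lam, mu):
    """Bits Z_i(t) arriving in one scheduling period.

    A Poisson(lam) number of packets arrives, each with an
    exponentially distributed length of mean mu bits.
    """
    packets = poisson(rng, lam)
    return sum(rng.expovariate(1.0 / mu) for _ in range(packets))


def update_queue(queue, rates, arrivals):
    """Assumed queue recursion: serve up to R_i(t) bits, then add arrivals."""
    return [max(q - r, 0.0) + z for q, r, z in zip(queue, rates, arrivals)]


if __name__ == "__main__":
    rng = random.Random(0)
    z = [arriving_bits(rng, 4.0, 1.0), arriving_bits(rng, 2.0, 1.0)]
    print(update_queue([3.0, 1.0], [1.0, 2.0], z))
```

The average of `arriving_bits` over many periods approaches $\lambda_i \mu_i$, which is the bit arrival rate used in the rest of the section.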
Let the Queue State Information be used once every $L$ times. Also, assume that the modified QPS starts operating at time $t = 0$. Then, for $(t \bmod L) = 0$, the rate vector $R(t)$ is the QPS solution computed from $Q(t)$, and otherwise it is the solution of the same maximization subject to $\lambda x \in C_{BC}$ (7), where $\vec{x}$ denotes the vector with elements $x_i$. Therefore, the optimization is similar to QPS, with the difference that, for times $nL + l$, $l = 1, 2, \ldots, L - 1$, the average arrival rate $\lambda$ is used instead of the actual QSI $Q(t)$. If the packet arrival rate is constant or changes relatively slowly, that is, if $\lambda$ can be estimated accurately and does not need to be updated often, $R(t)$ can be precalculated and stored. Therefore, the computational complexity is reduced roughly by a factor of $L$. However, a practical system will need to update an estimate of $\lambda$, so the reduction in complexity will be less pronounced. Similar to QPS, the rate $R(t)$ does not exceed $Q(t)$. This is easily implemented by stopping transmission in a given queue if it becomes empty before the end of a transmission period.
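A minimal two-user sketch of this rate selection is given below. It is not the Geometric Program of [8]: for two users the intersection of a ray with the capacity boundary can be found by bisection on the power split. The sketch assumes rates normalized to $\log_2(1 + \mathrm{SNR})$ bits per scheduling period and that user 1 is the stronger user, as in the two-user scenario evaluated later in this section; all function names are hypothetical.

```python
import math


def boundary_point(alpha, snr1, snr2):
    """Boundary rates of a two-user Gaussian BC for power split alpha.

    Assumption: user 1 is the stronger user and removes user 2's
    signal by SIC; user 2 treats user 1's signal as noise.
    """
    r1 = math.log2(1.0 + alpha * snr1)
    r2 = math.log2(1.0 + (1.0 - alpha) * snr2 / (1.0 + alpha * snr2))
    return r1, r2


def ray_intersection(direction, snr1, snr2, iters=60):
    """Point where the ray x * direction meets the capacity boundary."""
    d1, d2 = direction
    if d1 <= 0.0 and d2 <= 0.0:
        return 0.0, 0.0
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        r1, r2 = boundary_point(mid, snr1, snr2)
        # r1 / r2 grows with alpha; compare it with d1 / d2 by cross-multiplying.
        if r1 * d2 < r2 * d1:
            lo = mid
        else:
            hi = mid
    return boundary_point(0.5 * (lo + hi), snr1, snr2)


def modified_qps_rate(t, period, queue, lam, snr1, snr2):
    """Use Q(t) when t is a multiple of the period, lambda otherwise."""
    direction = queue if t % period == 0 else lam
    rates = ray_intersection(direction, snr1, snr2)
    # Transmission stops once a queue is empty, so never serve more than Q(t).
    return tuple(min(r, q) for r, q in zip(rates, queue))


if __name__ == "__main__":
    snr1, snr2 = 10 ** 1.9, 10 ** 1.3  # 19 dB and 13 dB
    print(ray_intersection((2.0, 1.0), snr1, snr2))
```

With SNRs of 19 dB and 13 dB and the direction $\lambda_1 = 2\lambda_2$, the intersection evaluates to approximately [4.10, 2.05], which agrees with the value of $\lambda_{max}$ reported for the two-user scenario.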
In the following two theorems the throughput optimality of the modified QPS algorithm is established under the condition that the arrival rate $\lambda$ is constant and satisfies a constraint on its distance from the boundary of the capacity region $C_{BC}$. The proof is constructed using the same approach as in [9]. First, it is shown that $E[Q(t)]$ becomes proportional to $\lambda$ as $t \to \infty$.
Theorem 1. Assume that the modified QPS policy is used in a Gaussian BC, and that $\lambda$ is such that $\alpha\lambda$ is at the boundary of $C_{BC}$. Then, if $\alpha < L/(L - 1)$, $E[Q(t) \mid q_0] \to w(t)\lambda$ as $t \to \infty$, where $q_0$ is any initial state of the queue and $w(t)$ is a function of time.
Proof. Given in the appendix.
Note that, as $L$ increases, the average rate $\lambda$ should be closer to the boundary of $C_{BC}$ for throughput optimality to be guaranteed by the theorem.
Having proved the convergence of $E[Q(nL) \mid q_0]$ to the direction of $\lambda$, throughput optimality is shown along the lines of [9].
Theorem 2. In a Gaussian BC, the modified QPS policy is throughput optimal, as long as the conditions of Theorem 1 hold.
Proof. Given in the appendix.
For the evaluation of the performance of the modified QPS algorithm, a two-user scenario is chosen, similar to the one in [9]. The SNR of user 1 is equal to 19 dB, whereas the SNR of user 2 is 13 dB. Moreover, $\lambda_1 = 2\lambda_2$. The capacity region of the Gaussian BC for this scenario is shown in Figure 2, together with the line $\lambda_1 = 2\lambda_2$. During the periods where QSI is not used, the modified QPS chooses a rate vector along the segment formed by the intersection of the line and the capacity region. Therefore, the maximum average bit rate that can be achieved is equal to $\lambda_{max} = [\,4.1 \;\; 2.05\,]$ bits/s. Figure 3 compares the average packet delay of Queue 1 for different scheduling methods. $\lambda$ is varied from $[\,3.7 \;\; 1.85\,]$ to $[\,4 \;\; 2\,]$. Due to the nature of QPS, the delays of Queue 2 have similar values and behave similarly. The dotted lines in Figure 3 depict the degradation of the performance of QPS as the QSI is used less frequently. It is assumed that $Q(nL)$ is used to compute $R(nL + l)$, $l = 0, 1, \ldots, L - 1$. The dashed lines correspond to the performance of the modified QPS. The modified QPS obtains an estimate of $\lambda$ by averaging the arrivals during each scheduling period. The performance is evaluated after a sufficient number of iterations of the simulation in order to allow the queues to reach a steady state. Moreover, it is assumed that $\lambda$ does not change during the simulation. A total of $10^5$ scheduling periods proved to be satisfactory for simulation. The queue is allowed to converge during the first $10^4$ scheduling periods before delay samples are taken.
As can be seen from the figure, as the scheduling based on QSI becomes less frequent, the average packet delay increases for each queue. The modified QPS bridges the gap in performance, especially as $L$ grows. For relatively small values of $L$, use of the modified QPS reduces the average delay by a factor of 2 to 3 compared to the case where $Q(t - L)$ is used for all $L$ subsequent schedulings. For very infrequent use of QSI ($L = 100$) the improvement is much more pronounced. Note that, for the case of $L = 100$, throughput optimality is not guaranteed by the proofs in this paper, since the guarantee only holds for $\lambda_1 > (99/100) \cdot 4.1 \approx 4.06$. However, it appears that the modified QPS does not diverge even for average rates less than 4.06.
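The threshold quoted above follows directly from the condition of Theorem 1. As a worked check, and assuming that $\lambda$ lies on the line $\lambda_1 = 2\lambda_2$ so that the boundary point is $\lambda_{max} = [\,4.1 \;\; 2.05\,]$, the scaling factor is $\alpha = 4.1/\lambda_1$. The values $L = 2$ and $L = 10$ below are illustrative choices only; $L = 100$ is the case discussed in the text.

```latex
\begin{align*}
% alpha * lambda lies on the boundary, so alpha = 4.1 / lambda_1
\alpha = \frac{4.1}{\lambda_1} < \frac{L}{L-1}
  \quad &\Longleftrightarrow \quad
  \lambda_1 > \frac{L-1}{L}\cdot 4.1 \\
L = 2:   \quad & \lambda_1 > 2.05 \\
L = 10:  \quad & \lambda_1 > 3.69 \\
L = 100: \quad & \lambda_1 > 4.059 \approx 4.06
\end{align*}
```

Under this reading, the simulated range of $\lambda_1$ from 3.7 to 4 satisfies the condition for the illustrative values $L = 2$ and $L = 10$, but not for $L = 100$.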

APPROXIMATION OF PACKET DELAY USING MARKOV CHAINS
For video that is transmitted in packet form, what is important is not only the average delay of the packets, but also their delay distribution. More specifically, if a given packet does not arrive within a specific delay window, the video decoder may need to decode without using the packet, since either the user cannot tolerate a large delay, or the storage capacity of the receiver buffer will be exceeded. Missing packets result in video quality degradation. Therefore, for the problem of broadcasting examined in this paper, it is useful to be able to obtain the distribution of the packet delay in order to make predictions about the quality of the video stream. A Markov Model is developed in this section whose state denotes the delay of the first (head) packet of the queue in terms of scheduling periods. It is assumed that the only possible transitions are to neighboring states. Again, this is an approximation that is found to work well for QPS.
Clearly, because of queuing, the delays of neighboring packets of a video stream are correlated. During the periods when the queue lengths, and, consequently, the delays become large, it is possible that more than one packet will be delayed. Hence, a model assuming that the delays of neighboring packets of the encoded video stream are independently distributed is not exact unless a sufficient interleaving depth is present. However, in this article it will be assumed that the delays are independent. First, this will provide a lower bound on the video quality that one can expect. Moreover, in order to obtain an accurate estimate of the video quality, one would need to take into account the particular encoder and decoder that are used, the intra frame ratio, and so forth. One could consider a priority queue where different priorities are given to packets to make sure that enough packets are available for the decoding of a given Group of Blocks (GOB) or frame. Such a scheme will not be accurately described by the Markov Chain presented below, but the approximation may be satisfactory. Another particularity of the system in this article is that the service rate does not depend only on the state of the queue that is being considered, but also on the states of the other queues, since all of them are taken into account for scheduling. However, and despite all the above, simulation results show that the Markov model, albeit simplified, provides a good approximation to the distribution of the delay for a system using a QPS-like scheduler and can therefore be used for the prediction of the video quality.
Figure 4 presents the distribution of the packet delay for the scenario of the previous section, with arrival rate $\lambda = [\,4 \;\; 2\,]$ and average packet size $\mu = 1$ for both queues. Again, the SNRs are 19 and 13 dB, respectively. Scheduling uses QSI during all periods. The probability distribution has a peak at 1 and not at 0 because the scheduler operates only at the end of a period, so packets that have arrived during an interval may have to wait until the end of that interval in order to be able to leave the queue. This skews the peak of the distribution that would otherwise be at 0. In terms of the Markov Chain, the service rate is not constant and depends on the queue state.
From Markov queue theory, and assuming that the service rate depends on the delay of the first packet of the queue, the stationary probabilities $p_i$ of neighboring delay states are related through state-dependent ratios $\rho_i$. The values of $\rho_i$ are obtained using the $p_i$'s that result from the simulation, and are plotted in Figure 5 for both queues. Two million scheduling periods are used for the simulation. Note that the $\rho_i$ converge as $i$ increases. This is expected since, as the queue grows, the scheduled packet will need to wait for a time longer than a scheduling period in order to leave the system. Hence, in this case, the fact that the scheduling happens at specific instants does not influence the service rate. The oscillations as the delay increases are due to the inaccuracy of the $p_i$'s caused by the smaller number of samples for the less probable states.
Based on Figure 5, the $\rho_i$ are approximated by a constant value $\rho_K$ for the states $i \ge K$. The delay distribution probabilities $P_i$ are then calculated from this approximation. The $P_i$'s are approximated using $K = 6$ and $\rho_K = 0.86$ for both queues. The resulting approximation of the packet delay distribution is shown in Figure 4. As can be seen, although the queues for the scenario considered in this paper are not independent, the approximation is good, and can be used to predict the delay of the packets scheduled by QPS provided that the arriving traffic rates are known. In Figure 6 more detail is shown for the states corresponding to higher delays. The approximation is slightly pessimistic but still very close to the actual values. In the following section, the distribution of the packet delay is used in order to decide on the encoding rate of video streams that are broadcast to different users and are scheduled using QPS.
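The tail approximation can be sketched as follows. The sketch assumes the ratio convention $\rho_i = p_{i+1}/p_i$, keeps the simulated probabilities for states $0, \ldots, K$, extends them geometrically with $\rho_K$, and renormalizes over a finite number of states. The exact expressions of the paper are not reproduced here, and the head probabilities in the example are made-up placeholders rather than simulation results.

```python
def estimate_ratios(p):
    """Ratios rho_i = p[i + 1] / p[i] (assumed convention), for positive p[i]."""
    return [p[i + 1] / p[i] for i in range(len(p) - 1)]


def tail_approximation(p_head, rho_k, n_states):
    """Approximate delay distribution P_0 .. P_{n_states - 1}.

    p_head holds the empirical probabilities of states 0..K. States
    beyond K follow a geometric tail with ratio rho_k. The result is
    renormalized, and the tail mass beyond n_states is neglected.
    """
    probs = list(p_head)
    for _ in range(len(p_head), n_states):
        probs.append(probs[-1] * rho_k)
    total = sum(probs)
    return [x / total for x in probs]


if __name__ == "__main__":
    # Placeholder head probabilities for states 0..6 (K = 6), not from the paper.
    head = [0.10, 0.30, 0.20, 0.15, 0.10, 0.08, 0.07]
    approx = tail_approximation(head, rho_k=0.86, n_states=200)
    print(estimate_ratios(approx)[5:8])
```

Because the tail is geometric, the probability of exceeding a large delay can be read off without collecting many simulation samples in the rarely visited states.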

CHOOSING THE ENCODER RATE FOR VIDEO STREAMS SCHEDULED BY QPS
Video streams transmitted in packets are subject to distortion. Distortion results from several sources such as encoder compression, corrupted data and lost or delayed packets. In [10] the authors develop models and derive expressions for the overall distortion of a video stream, $D_d = D_e + D_v$. The first term $D_e$ is the distortion caused by signal compression at the encoder. It depends on the INTRA frame rate $\beta$ and the rate $R_e$ at the output of the encoder. $R_e$ may need to be lowered in order to allow for a more redundant channel code, and, therefore, better protection against noise in the channel. $D_v$ is the distortion occurring at the decoder and is related to the lost or corrupted packets that cannot be used by the decoder to reconstruct the transmitted video stream.
In [10], the channel capacity was assumed to be fixed, and $R_e$ was varied in order to leave more (or less) room for the channel code. The stronger the channel code is, the smaller the probability of erroneous packets will be, leading to reduced decoder distortion. Therefore, the choice of $R_e$ (and the associated channel code rate) leads to a tradeoff between $D_e$ and $D_v$. By choosing $R_e$, the channel code and the INTRA rate $\beta$ appropriately, the smallest value of the overall distortion $D_d$ can be found.
In the scenario examined in this paper, a new tradeoff is created between $D_e$ and $D_v$. In a BC where many users compete for the resources, a larger channel rate also means larger average (and maximum) delays. Hence, allowing the video encoder to send at a faster rate also increases the probability that a packet will not arrive early enough for the decoder to be able to use it. The number of packets with delays that exceed a given threshold adds to the number of packets that are corrupted in the channel and, therefore, the overall number of unusable packets increases. Consequently, this leads to larger distortion.
As explained in [10], the exact value of the distortion depends on many factors such as the particular stream that is being transmitted, the video encoder and the spatial filters of the decoder, all of which are outside the scope of this paper. Therefore, in this article, it is briefly suggested how the effect of the channel delay can be added to the calculation of $D_v$. Then, the system optimization can proceed along the lines of [10]. From [10], $D_v$ is expressed in terms of the following quantities: $\beta$ is the INTRA rate, $\gamma$ is the leakage parameter that is determined by the loop filter of the decoder, $T = 1/\beta$ is the INTRA update interval, $\sigma^2_{u0}$ describes the sensitivity of the video decoder to an increase in the error rate, and $P_L$ is the residual packet error rate. When QPS is used, the proportion $(1 - P_L)$ of packets that are neither corrupted in the channel nor lost for other reasons is subject to delays at the queue of the scheduler. The probability of the delay of a packet exceeding a given threshold $T_{del}$ can be found by forming $P_{late} = \sum_{d=D}^{\infty} p_d = 1 - \sum_{d=0}^{D-1} p_d$, where $D$ is the first value of the delay that exceeds $T_{del}$. Alternatively, the approximate probabilities $p_d$ that were derived using the Markov Chain model can be used. Hence, the new packet loss rate $P_L'$ that should be used for the calculation of $D_v$ is equal to $P_L' = P_L + (1 - P_L)P_{late}$.
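The two expressions above translate directly into a few lines of code. The following sketch assumes that the delay distribution is available as a list `p` indexed by delay in scheduling periods (from simulation or from the Markov Chain approximation); the function names and the numbers in the example are hypothetical.

```python
import math


def late_probability(p, t_del):
    """P_late = 1 - sum_{d=0}^{D-1} p[d], with D the first delay above t_del."""
    first_late = math.floor(t_del) + 1
    return max(0.0, 1.0 - sum(p[:first_late]))


def effective_loss_rate(p_loss, p_late):
    """Combined loss rate: P_L + (1 - P_L) * P_late."""
    return p_loss + (1.0 - p_loss) * p_late


if __name__ == "__main__":
    # Placeholder delay distribution, not taken from the paper's figures.
    p = [0.5, 0.25, 0.125, 0.125]
    p_late = late_probability(p, t_del=1)
    print(p_late, effective_loss_rate(0.1, p_late))
```

The combined rate counts a packet as unusable if it is lost in the channel or, having survived the channel, arrives later than the threshold.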

CONCLUSION
In this paper, Queue Proportional Scheduling was considered with video transmission in mind. First, it was shown that, if an increase in the average packet delay can be tolerated, the use of Queue State Information can become less frequent, therefore simplifying the scheduler. A modified QPS scheduler was proposed that performs better than the approach of simply using outdated QSI for scheduling, without considerably increasing the implementation complexity. Moreover, it was proved that, under certain conditions, the modified QPS is throughput optimal. It was also shown using simulation that, in systems using QPS, the distribution of the packet delay can be approximated satisfactorily by a Markov Chain model. This model makes it easier to obtain an estimate for the tail of the probability distribution, and, consequently, to calculate the video distortion caused by packets whose delay exceeds the buffer size of the decoder. It was also discussed how the effect of late packet arrivals can be included in the calculation of the distortion.

Proof of Theorem 1
Let $Q(t) = q_t = [\,q_{t,1} \;\; q_{t,2} \;\; \cdots \;\; q_{t,K}\,]^T$, where $K$ is the number of users of the BC. Assume, without loss of generality, that $q_{t,1} \ne 0$ and $\lambda_1 \ne 0$. Then, $q_t$ can be written as $q_t = w(t)[\lambda_1, \lambda_2 + \Delta\lambda_2, \ldots, \lambda_K + \Delta\lambda_K]^T$, where $w(t) = q_{t,1}/\lambda_1$ and the $\Delta\lambda_i$ are such that $w(t)(\lambda_i + \Delta\lambda_i) = q_{t,i}$ for $i = 2, \ldots, K$. Therefore, When the queue state is used for scheduling, $R(t) = r(t)(q_t/w(t))$, where $r(t)$ is such that $R(t)$ is inside the capacity region and does not exceed $q_t$. Hence, $r(t) \le w(t)$. As is shown in [9], where γ(t Consider now the case of the modified QPS algorithm, and assume that the queue state information gets used once every $L = 2$ transmission periods, that is, for $t, t + 2, \ldots, t + 2n$, information on $Q(t)$ is used, whereas during the other periods scheduling is based on $\lambda$ and $\gamma(t) = 0$. Assuming that $2 - \alpha(t)$ can never become negative, that is, that the average arrival rate is large enough so that it is more than halfway between zero and the boundary of the capacity region, $\gamma(t) < 1$ in all other cases. Therefore, similar to the QPS proof of [9], it can be deduced that the slope of $Q(t)$ converges to the slope of $\lambda$ in the sense that For the general case where Queue State Information is used every $L$ scheduling periods, where This can be guaranteed if, for each $l$, $\alpha_l(t) < L/(L - 1)$, which means that if the boundary of $C_{BC}$ is at $\alpha\lambda$, then $\alpha < L/(L - 1)$.

Proof of Theorem 2
It will be shown that, for any $\lambda \in C_{BC}$ such that $\alpha\lambda$ is at the boundary of $C_{BC}$ and $1 \le \alpha \le L/(L - 1)$, the queue lengths of all users can be kept finite. The following Lyapunov function is chosen: Then, assuming that $t$ is a scheduling period when the Queue State Information is used, L(Q(t + 1)) ]. Again, since the queue lengths cannot become negative, l=0 R i (t +l)+ L−1 l=0 Z i (t +l)}. Assume that $Q(0) = q_0$, where $\max\{q_{i,0}\}$ is sufficiently small. Then, the expected drift of the Lyapunov function conditioned on $q_t$ is equal to (A.7) It will be shown that, as $\|q_t\|_\infty = \max\{q_{i,t}\} \to \infty$, the Lyapunov drift (A.7) becomes strictly negative. $\|q_t\|_\infty \to \infty$ also implies that $t \to \infty$, since a queue cannot grow to infinity during a finite time interval. As was shown in Theorem 1, if a condition on $\lambda$ holds, $Q(nL)$ converges to $w(t)\lambda$ as $t \to \infty$. Hence, at time $t \to \infty$, QPS will use the value of $Q(t)$ and the rate $R(t)$ will be equal to $r(t)Q(t) \to r(t)w(t)\lambda = W(t)\lambda$. Regarding the rates $R(t + l)$, $1 \le l \le L - 1$, these are, by definition of the modified QPS, equal to $\alpha(t + l)\lambda$. Therefore, for $t \to \infty$, Since $\lambda$ is in the interior of $C_{BC}$, $W(t)$ and the $\alpha(t + l)$ will be strictly larger than 1 because the modified QPS algorithm chooses the longest vector along the direction of $\lambda$ that belongs to the BC capacity region. When $\|q_t\|_\infty \to \infty$ this vector reaches the boundary of the capacity region, and is, therefore, longer than $\lambda$. Hence, the Lyapunov drift is strictly negative for any $\lambda$ satisfying the condition that $\alpha\lambda \in C_{BC}$, $1 < \alpha \le L/(L - 1)$.

Figure 2: $C_{BC}$ for the 2-user scenario.

Figure 3: Comparison of performance of QPS, QPS with reduced use of QSI and modified QPS.

Figure 4: Distribution of the packet delay.