ARQ-Aware Scheduling and Link Adaptation for Video Transmission over Mobile Broadband Networks.

This paper studies the effect of ARQ retransmissions on packet error rate, delay, and jitter at the application layer for a real-time video transmission at 1.03 Mbps over a mobile broadband network. The effect of time-correlated channel errors for various Mobile Station (MS) velocities is evaluated. In the context of mobile WiMAX, the role of the ARQ Retry Timeout parameter and the maximum number of ARQ retransmissions is taken into account. ARQ-aware and channel-aware scheduling is assumed in order to allocate adequate resources according to the level of packet error rate and the number of ARQ retransmissions required. A novel metric, namely, goodput per frame, is proposed as a measure of transmission efficiency. Results show that to attain quasi error free transmission and low jitter (for real-time video QoS), only QPSK 1/2 can be used at mean channel SNR values between 12 dB and 16 dB, while 16QAM 1/2 can be used below 20 dB at walking speeds. However, these modes are shown to result in low transmission efficiency, attaining, for example, a total goodput of 3 Mbps at an SNR of 14 dB, for a block lifetime of 90 ms. It is shown that ARQ retransmissions are more effective at higher MS speeds.


Introduction
Mobile WiMAX (IEEE 802.16e) [1] and 3GPP LTE (Long-Term Evolution) [2] represent mobile broadband standards that offer high user data rates and support for bandwidth-hungry video applications. Both standards use very similar PHY and MAC layer techniques, especially for downlink (DL) transmission. In order to provide strong QoS, cross-layer adaptive strategies must be implemented in the wireless network [3,4]. Video applications demand a low Packet Error Rate (PER), which may be achieved via the use of MAC layer Automatic Repeat ReQuest (ARQ) and the choice of suitable Modulation and Coding Schemes (MCS). However, ARQ consumes additional bandwidth and causes increased end-to-end latency and jitter. ARQ is controlled in the MAC layer by the block lifetime and ARQ Retry Timeout parameters, which define how many retransmissions may occur and how frequently. Link adaptation is used in mobile broadband networks to improve the PER by matching the QAM constellation and forward error correction coding rate to the time-varying channel quality. The impact of specific ARQ parameters and mechanisms has been extensively studied in the literature, for example, [5-9].
In [8], the authors analyze delay and throughput using probabilistic PHY layer error modelling. In [9], packet errors were modelled as an uncorrelated process in time. Often packet errors are modelled using statistical channel models, such as Markov chains, for example, [3,10,11], based on statistical measurements that have limited scalability and adaptability to a variety of fading, shadowing, or mobility circumstances. However, this type of modelling fails to represent the bursty nature of errors in a fading channel and the impact it has on ARQ retransmission performance.
To deliver video QoS, the mobile WiMAX and LTE standards specify a number of scheduling mechanisms, such as Unsolicited Grant Service (UGS), real-time Polling Service (rtPS), and Best Effort (BE). As scheduling of resources is not specified in the standards, but instead left open for vendor implementation, this is an area of considerable research interest. In [12], a survey of several scheduling algorithms showed that, due to the nature of the wireless medium and user mobility, the scheduler should take into account the PER and the Carrier-to-Interference-plus-Noise Ratio (CINR) reported by the channel quality indicator (CQI) per connection. These schedulers are denoted as "channel aware." A channel-aware scheduler must take into consideration the MCS mode selected through link adaptation. The scheduling of resources must also take into account ARQ retransmissions, as discussed in [6,8,12,13]. The authors of [8,13] propose an ARQ-aware scheduler, where ARQ retransmissions have priority over new data. For applications that are very sensitive to jitter and delay, such as video, the QoS guarantees a maximum delay and error rate for a given bitrate. If ARQ is enabled on these connections, the BS scheduler must allocate sufficient resources in each frame to accommodate new data and ARQ retransmissions. The resources required per connection also vary according to the MCS mode selected by the Link Adaptation (LA) process.
Journal of Computer Networks and Communications
Many recent publications have studied video streaming over WiMAX, for example, [10,11,14], but very few investigate unicast video with ARQ retransmission [4,9]. In [4], the authors proposed cross-layer parameter optimization to achieve the required QoS, using queuing theory to minimize the required bandwidth while assuming stop-and-wait ARQ retransmission. None of the ARQ mechanisms specified in the 802.16e standard were considered in [4]. In [3,14], the issue of "bandwidth hungry" video applications was highlighted. Nevertheless, the only video transmissions considered in recent publications are based on low resolution video (CIF, QVGA) with bitrates up to 400 kbps [8,11,14].
This work focuses on the transmission of high resolution real-time video, at a bitrate of 1.03 Mbps, over the downlink (DL) of a mobile broadband connection. The simulated transmission of a flow of UDP packets corresponds to the flow of video packets. Simulations are performed for a UDP unicast DL transmission, with Selective ACK (S-ACK) ARQ enabled. Moreover, multicast transmission without ARQ enabled is also included in the analysis. The transmission efficiency of ARQ enabled mobile WiMAX networks is computed by proposing a novel efficiency metric, the goodput per frame, which takes into account the amount of radio resources required per DL subframe and the PER attained. Channel-aware and ARQ-aware scheduling at the MAC layer is assumed. Importantly, block errors are time-correlated, based on the use of the accurate time-correlated 3GPP SCM fading channel model [15]. Results are based on the WiMAX Forum recommendations [16,17] for the ARQ Retry Timeout parameter and the maximum number of ARQ retransmissions. The study shows for the first time how PER and delay/jitter are affected by scheduling sufficient (or insufficient) channel resources per frame, to cater for ARQ retransmissions, according to the MCS mode selected. The work identifies which MCS modes are suitable to deliver QoS for real-time video, by maintaining quasi-zero PER and low jitter at the application layer. Our previous work [18] focused on received video quality (based on PSNR), for a 7.63 Mbps HD video sequence, when no limitations were applied to the ARQ Retry Timeout parameter and the maximum number of ARQ retransmissions, as assumed in [8,9] (since the frequency of ACK is not specified in the IEEE 802.16e standard [1]). The effect of ARQ-aware scheduling was not investigated in our previous work.
In this paper, the effect of MS velocity on ARQ retransmissions is explored. This is made possible by the use of an accurate time-correlated fading channel model. MS velocities of 1 and 10 km/h are considered.
Mobile WiMAX [1], together with 3GPP LTE [2], is key technology for next-generation broadband wireless access (BWA) networks [19]. Both technologies have very similar DL PHY layers and strong similarities in their MAC layers. For both technologies, radio resource management techniques, such as scheduling and resource allocation, are pivotal in research work on QoS support for multimedia services [19].
In the next section, key aspects of the mobile WiMAX PHY and MAC layers are described along with the time-varying channel model. In Section 3, the MAC-PHY simulator is described, detailing the assumptions made. Sections 4, 5, and 6 present an analysis of the simulation results. Conclusions are presented in Section 7.

Overview of Mobile WiMAX and the SCM Channel Model
Medium Access Control (MAC) Layer. The 802.16e MAC layer [20] includes a number of adjustable features, such as adaptive MCS, ARQ, packet fragmentation and aggregation, variable size MAC Protocol Data Units (PDU), and application-specific service flows and PDU scheduling based on QoS. Packets from the higher layers arrive in the convergence sublayer (CS) of the MAC as MAC Service Data Units (SDUs). Based on their QoS requirements, MAC SDUs are classified into service flows. There is the option for SDU fragmentation into PDUs, and this feature is assumed here. SDUs are partitioned into ARQ blocks of fixed size when ARQ is enabled. The MAC PDU is the data unit exchanged between the BS and MS MAC layers. Once a PDU has been constructed, it is placed in the appropriate service flow queue and managed by the scheduler, which determines the PHY resource allocation (i.e., bandwidth and OFDMA symbol allocation) on a frame-by-frame basis. Each transmitted PDU is either received correctly or in error, depending on the channel response at the time of transmission. The time-varying PHY layer PER is accurately calculated based on the channel model for each ARQ block in the transmitted PDU. The standard specifies a number of ARQ feedback mechanisms, such as Cumulative ACK, Cumulative and Selective ACK, and Selective ACK (S-ACK) [1]. Here S-ACK feedback is used. An S-ACK feedback message is generated for each transmission burst and any PDUs containing errors are placed in the retransmission queue [6]. No block rearrangement is enabled. The retransmission of PDUs in error continues until they are received correctly or their ARQ block lifetimes expire. The number of retransmissions is determined by the block lifetime and the ARQ Retry Timeout, as shown in (1). The ARQ Retry Timeout represents the minimum number of OFDMA frames a transmitter will wait to retransmit an unacknowledged block [1]. This retry period begins from the frame when the ARQ block was last transmitted. If the block lifetime expires before it is received correctly then the block is discarded.
[Table 1 (fragment): OFDMA symbol duration 102.9 µs; number of OFDMA symbols (5 ms frame): 47.]
Physical Layer (PHY). The mobile WiMAX standard has adopted Scalable-OFDMA (S-OFDMA) [1]. Table 1 shows the relevant parameters for the S-OFDMA PHY. Simulations for this paper were performed for the 10 MHz channel bandwidth profile (highlighted in italics in Table 1). The payload data is modulated using the full range of link speeds (MCS modes) as defined in the standard [1] and shown in Table 2. Assuming a PUSC DL [1], the modulation symbols allocated to a sequence of slots in each DL OFDMA frame are assigned to a number of logical subchannels. An OFDMA slot is the minimum possible data allocation unit. For PUSC DL, it is defined as one subchannel by two OFDMA symbols. For the 10 MHz channel, an OFDMA symbol consists of 30 subchannels for PUSC DL, each containing 24 data subcarriers [21]. Hence, a slot contains 48 data subcarriers. Based on this, the slot payload capacity $P_{sl}$ for each MCS mode is computed for PUSC DL. It is shown in Table 3, where m represents the MCS modulation order and r the coding rate.
The channel resources (in terms of slots) required for data transmission over a mobile WiMAX network are evaluated based on the slot payload capacity for each MCS mode.
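The slot arithmetic described above can be illustrated with a minimal sketch. It assumes that m denotes the number of bits per modulation symbol (2 for QPSK, 4 for 16QAM, 6 for 64QAM), so that a slot carries 48 · m · r payload bits; Table 3 is not reproduced here, and MAC overhead and ARQ retransmissions are ignored. The function names are illustrative only.

```python
from fractions import Fraction
from math import ceil

DATA_SUBCARRIERS_PER_SLOT = 48          # PUSC DL slot: 1 subchannel x 2 OFDMA symbols
FRAME_DURATION_S = Fraction(5, 1000)    # 5 ms OFDMA frame

def slot_payload_bits(bits_per_symbol, coding_rate):
    """Payload bits per slot, assuming P_sl = 48 * m * r (m in bits per symbol)."""
    return int(DATA_SUBCARRIERS_PER_SLOT * bits_per_symbol * coding_rate)

def slots_per_frame(bitrate_bps, payload_bits):
    """Slots needed per DL frame for a CBR flow (no MAC overhead, no ARQ)."""
    bits_per_frame = Fraction(bitrate_bps) * FRAME_DURATION_S
    return ceil(bits_per_frame / payload_bits)

# MCS modes named in the text: (bits per symbol, coding rate)
MCS = {
    "QPSK 1/2": (2, Fraction(1, 2)),
    "QPSK 3/4": (2, Fraction(3, 4)),
    "16QAM 1/2": (4, Fraction(1, 2)),
    "16QAM 3/4": (4, Fraction(3, 4)),
    "64QAM 1/2": (6, Fraction(1, 2)),
    "64QAM 2/3": (6, Fraction(2, 3)),
}

for name, (m, r) in MCS.items():
    payload = slot_payload_bits(m, r)
    print(name, payload, slots_per_frame(1_030_000, payload))
```

Under these assumptions, the 1.03 Mbps flow needs 108 slots per frame with QPSK 1/2, which is roughly a third of a 330-slot DL subframe.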

PHY Layer Abstraction.
To simplify the interface between the link and system level simulators, whilst still modelling dynamic system behaviour, a technique known as Effective SINR Mapping (ESM) is used. This method, which can also be used to model the LTE PHY layer, compresses the SINR (per subcarrier) vector into a single effective SINR (ESINR). The technique is described in detail in [22]. The PHY abstraction model is described and validated in [23]. This PHY abstraction model allows the instantaneous Block Error Rate (BLER) to be computed for each channel realization, based on the instantaneous fading channel and the length of the ARQ block. Although many commercial network simulators exist, such as OpNet and QualNet, these tend to provide simplified physical layer support. For example, QualNet uses bit error rate look up tables that average the effects of time-varying fast fading. Video analysis requires the use of time-varying instantaneous BLER in a fading channel (not averaged BLER), since the bursty nature of the errors has a detrimental effect on video quality, as shown in [24].
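As an illustration of how a per-subcarrier SINR vector can be compressed into one effective value, the sketch below uses the exponential variant of ESM (EESM). This is an assumption made for illustration: the text does not state which ESM mapping is used in [22,23], and the calibration parameter `beta` (normally fitted per MCS mode) is a hypothetical value here.

```python
import math

def eesm_db(sinr_db, beta):
    """Exponential effective SINR mapping (EESM), one common ESM variant.

    sinr_db: per-subcarrier SINR values in dB.
    beta: MCS-dependent calibration factor (hypothetical value in this sketch).
    """
    linear = [10 ** (s / 10) for s in sinr_db]
    mean_exp = sum(math.exp(-x / beta) for x in linear) / len(linear)
    effective_linear = -beta * math.log(mean_exp)
    return 10 * math.log10(effective_linear)

# A flat channel maps to itself; a faded channel is dominated by weak subcarriers.
print(eesm_db([10.0, 10.0, 10.0], beta=2.0))
print(eesm_db([0.0, 20.0], beta=2.0))
```

The effective SINR would then index an AWGN BLER curve for the selected MCS mode and block length; that lookup is not shown.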
Wideband Channel Model. The channel model follows the ETSI 3GPP spatial channel model (SCM), as described in [15]. A time varying "urban micro" tapped delay line (TDL) was generated for each channel snapshot. The TDL consists of 6 time-correlated fading taps with nonuniform delays. The carrier frequency is 2.3 GHz and the FFT size is 1024. Each radio channel is made up of a number of channel samples (sampled every 2.5 ms) corresponding to a duration of 85 seconds.
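To give a feel for how slowly this channel varies at the MS speeds studied in this paper (1 and 10 km/h), the sketch below computes the maximum Doppler shift at 2.3 GHz and a rule-of-thumb coherence time. The 0.423 / f_d approximation is a textbook rule of thumb and is not taken from this paper; the function names are illustrative.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0
CARRIER_HZ = 2.3e9  # carrier frequency used in the simulations

def max_doppler_hz(speed_kmh, carrier_hz=CARRIER_HZ):
    """Maximum Doppler shift f_d = v * f_c / c."""
    return (speed_kmh / 3.6) * carrier_hz / SPEED_OF_LIGHT_M_S

def coherence_time_s(speed_kmh):
    """Rule-of-thumb coherence time, T_c ~ 0.423 / f_d (an approximation)."""
    return 0.423 / max_doppler_hz(speed_kmh)

for v in (1.0, 10.0):
    print(v, "km/h:", round(max_doppler_hz(v), 2), "Hz,",
          round(coherence_time_s(v) * 1e3, 1), "ms")
```

With this approximation the coherence time is on the order of 200 ms at 1 km/h and 20 ms at 10 km/h, so successive retransmissions spaced a few frames apart are far more likely to fall in the same fade at walking speed.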

WiMAX MAC-PHY Simulator
Unicast and multicast transmission of high resolution video is simulated over the mobile WiMAX system. This work is based on a MAC-PHY simulator developed according to the standard [1] and presented in [18]. The mobile WiMAX PHY layer simulator is described in [25]. As discussed in Section 2, the PHY layer PER is generated from the ESM PHY layer abstraction method developed in [23].

Simulator Assumptions.
The following key assumptions were made for the design of the mobile WiMAX MAC-PHY simulator. MAC SDU fragmentation (not packing) is assumed, according to the 802.16e standard [1]. The MAC PDU size is fixed for all MCS modes. It is small (less than 200 Bytes) to improve the error rate, according to [5]. The simulated ARQ mechanism is Selective ACK (S-ACK) [1]. It is assumed that no errors occur in the ARQ feedback messages. When errors occur during PDU transmission, no block rearrangement is performed within the PDUs. PDUs that are not acknowledged are placed in a separate queue within the user flow, known as the retransmission queue [6,26]. The scheduler gives priority to PDUs from the retransmissions queue. SDUs are delivered in order at the receiver, since the ARQ-DELIVER IN ORDER MAC parameter is enabled. Video is assumed to be sent at a constant bit rate (CBR), with fixed size packets; therefore, the use of UGS scheduling is assumed. The scheduler allocates a fixed amount of resources per MAC frame for each DL burst, according to the operation of UGS scheduling. As the standard [1] does not specify how the scheduling of resources is performed, it is assumed that a "channel-aware scheduler" is used that allocates resources according to the MCS mode selected with an overallocation for ARQ retransmissions [13]. If additional resources (i.e., overallocation) are not provided, when retransmissions occur they will take up resources from the new arriving data, since retransmissions have priority. This would result in a queuing delay that is unacceptable for real-time video applications. It is assumed that adequate additional resources for ARQ are available per frame, as required to cater for the expected number of retransmissions depending on the BLER. The data payload that each slot can carry for each MCS mode is given in Table 3. 
The number of PDUs that can fit within the allocated resource is calculated according to the MCS mode and the size of retransmission PDU queue. For each DL burst the retransmission PDUs are included in the allocated resources, taking priority over new PDUs.
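The priority rule described above can be sketched as follows. This is a simplified illustration, not the simulator's implementation: it assumes fixed-size PDUs so that the allocation can be expressed as a PDU count, and the function and queue names are hypothetical.

```python
from collections import deque

def build_dl_burst(retx_queue, new_queue, capacity_pdus):
    """Fill one DL burst, serving the retransmission queue before new PDUs."""
    burst = []
    for queue in (retx_queue, new_queue):  # retransmissions have priority
        while queue and len(burst) < capacity_pdus:
            burst.append(queue.popleft())
    return burst

retx = deque(["r1", "r2"])
new = deque(["n1", "n2", "n3"])
print(build_dl_burst(retx, new, capacity_pdus=4))  # new PDU "n3" waits for the next frame
```

If the allocation is not enlarged to cover retransmissions, new PDUs such as `n3` accumulate in the queue, which is the queuing delay the overallocation is meant to avoid.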

Simulator Functionality.
The 802.16e MAC-PHY simulator provides an error modelling tool that predicts the loss patterns for a sequence of RTP/UDP packets and thus the losses in the sequence of video packets at the receiver. The PHY abstraction model allows the instantaneous BLER to be computed for each channel realization, based on the instantaneous fading channel and the length of the ARQ block. Thus, the computed BLER is time-correlated. A flow of fixed size RTP/UDP packets arrives at the MAC layer and is passed into the simulator. It is assumed that UDP packets arrive at a constant rate. Each UDP packet corresponds one-to-one to a MAC SDU. More details on the MAC-PHY simulator are given in [18].
At the receiver, SDUs are reassembled from the appropriate ARQ blocks. Since the MAC parameter ARQ DELIVER IN ORDER is enabled, SDUs are delivered in sequential order to the transport layer, as UDP packets. This means that an SDU cannot be delivered to the higher layers unless all the SDUs preceding it in the flow have been received correctly, or have been discarded. If an ARQ block is finally discarded, despite retransmissions, the IEEE 802.16e standard mandates that the SDU to which it belongs cannot be delivered to the higher layers [1]. A block diagram of the simulator is shown in Figure 1.
The MAC-PHY simulator provides an accurate way to determine the MAC BLER and SDU error rate (SER) for the SDUs that are discarded, taking into account the MAC layer parameters, data encapsulation, and the ARQ process. Importantly, the BLER for contiguous blocks is not independent due to the time-correlated nature of the fading channel, and this is captured by modelling the instantaneous PHY PER $P_e(t)$. The simulator computes the following as a function of mean channel SNR, MCS mode, ARQ block lifetime, and MS velocity, taking into account MAC layer parameters such as packet size, the ARQ Retry timer, and the ARQ feedback time: (i) block error rate (BLER), (ii) SDU error rate (SER), equivalent to UDP PER (one-to-one mapping of SDUs to UDP packets), (iii) the time pattern of the ARQ block and SDU losses, (iv) end-to-end delay and jitter for blocks and SDUs, (v) channel capacity consumed during each DL subframe, measured in physical slots, (vi) transmission throughput and goodput.
The simulator records the transmission times for each ARQ block and SDU for a flow of N SDUs. The number of ARQ retransmissions $k_{re}$ is estimated from the total ARQ block transmission time $T_{ARQ}$, computed as in (1), where $T_{OFDMA}$ is the duration of an OFDMA frame (i.e., 5 ms) and ARQRetry is the ARQ Retry Timeout parameter (number of frames). The number of ARQ retransmissions $k_{re}$ is estimated when a bound is imposed on $T_{ARQ}$ by the block lifetime $l_{bl}$, as in (2) and (3). In general, if insufficient resources are allocated for all retransmissions in the frame where the PDU is to be resent, according to the ARQ Retry Timeout timer, an additional queuing time, $T_q$, will be added in (1). This represents the block queuing time in the retransmission queue. In this case, the number of retransmissions that will take place will be less than $\max(k_{re})$, as $T_{ARQ}$ is limited by (2). Hence, $k_{re}$ is not deterministic for all blocks.
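The relation between block lifetime and retransmission count can be sketched under one plausible reading of (1) to (3), namely $T_{ARQ} = k_{re} \cdot ARQRetry \cdot T_{OFDMA} + T_q$ with $T_{ARQ} \le l_{bl}$, which gives $\max(k_{re}) = \lfloor l_{bl} / (ARQRetry \cdot T_{OFDMA}) \rfloor$. This form is an assumption; it is chosen because it reproduces the mapping quoted in the paper (block lifetimes of 30, 65, and 90 ms for 1, 3, and 4 retransmissions with a retry timeout of 4 frames), but the published equations may differ in detail.

```python
T_OFDMA_MS = 5  # OFDMA frame duration in ms

def arq_transmission_time_ms(k_re, arq_retry_frames, queuing_ms=0):
    """Assumed form of (1): time spent on k_re retransmissions plus queuing."""
    return k_re * arq_retry_frames * T_OFDMA_MS + queuing_ms

def max_retransmissions(block_lifetime_ms, arq_retry_frames):
    """Assumed form of (3): largest k_re with T_ARQ <= block lifetime, T_q = 0."""
    return block_lifetime_ms // (arq_retry_frames * T_OFDMA_MS)

for lifetime in (30, 65, 90):
    print(lifetime, "ms ->", max_retransmissions(lifetime, arq_retry_frames=4))
```

Any queuing time $T_q$ eats into the same lifetime budget, which is why fewer than $\max(k_{re})$ retransmissions may fit when resources are short.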
The simulator calculates the end-to-end SDU latency and jitter for a flow of N SDUs, indexed $j \in \{1, 2, \ldots, N\}$. The end-to-end latency for SDU j, $D_j$, is calculated as the time difference between the arrival of MAC SDU j at the MAC layer transmitter, $T_{arr,j}$, and the delivery of SDU j to the transport layer at the receiver, $T_{rec,j}$, as shown in (4):
$$D_j = T_{rec,j} - T_{arr,j}. \tag{4}$$
The end-to-end latency of an SDU j, $D_j$, consists of the transmission time $D_{tx}$ and the delivery time $D_{del}$, that is, $D_j = D_{tx} + D_{del}$.

[Figure 1: Block diagram of the MAC-PHY simulator (labels include PDUs, $P_e(t)$, and SDU re-assembly).]
The transmission time $D_{tx}$ includes the retransmission time for the PDUs containing blocks of SDU j and the waiting time in the retransmission buffer, if any of the ARQ blocks contained in the SDU were retransmitted. An SDU is delivered to the receiver (at the transport layer) when all the ARQ blocks it consists of have been correctly received, after retransmission. Also, SDU packets are delivered in order. This is because the receiver must first receive correctly and reassemble all the ARQ blocks of SDU j and then deliver the SDUs following j. This means that if the ARQ blocks in SDU j have undergone retransmission, the SDUs j + 1, j + 2, . . . which follow SDU j will be delayed as well, even if no errors and retransmissions occurred for them. Therefore, retransmissions can cause a build-up of delay, not only for the SDU which suffered the transmission errors, but also for SDUs following it. If the channel is poor and errors occur frequently, the delay build-up can be significant. Figure 2 depicts the delay build-up for a number of SDUs, when some of the ARQ blocks of SDU S3 are lost and later retransmitted. Although SDUs S4, S5, and S6, succeeding S3, are received correctly, they are not delivered to the higher layers until all the ARQ blocks from SDU S3 have been received correctly (after ARQ retransmission). So SDUs S4 to S6 are delayed.
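The delay build-up caused by in-order delivery can be reproduced with a short sketch. The timings below are hypothetical and only mimic the S3 example: each SDU has a time at which all of its ARQ blocks have been received, and it can be delivered no earlier than every SDU before it. Discarded SDUs are not modelled here.

```python
def in_order_delivery_times(completion_times_ms):
    """Delivery time of each SDU when SDUs must be handed up in sequence."""
    delivered, latest = [], float("-inf")
    for t in completion_times_ms:
        latest = max(latest, t)  # an SDU waits for all of its predecessors
        delivered.append(latest)
    return delivered

# Hypothetical flow S1..S6: S3 needs retransmissions and completes at 55 ms.
arrival = [0, 5, 10, 15, 20, 25]
complete = [5, 10, 55, 20, 25, 30]
delivery = in_order_delivery_times(complete)
latency = [d - a for d, a in zip(delivery, arrival)]
print(delivery)  # S4..S6 are held until S3 is complete
print(latency)
```

The latency of S4 to S6 is inflated even though none of their blocks were in error, which is the effect shown in Figure 2.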
The variation in the end-to-end delay (latency) of the SDUs, for a flow of SDUs, is referred to as jitter. Another term commonly used is Packet Delay Variation (PDV), defined in ITU-T Recommendation Y.1540. Jitter is calculated as the variance of the SDU delay as follows:
$$\mathrm{Var}(D_{SDU}) = \frac{1}{N} \sum_{i=1}^{N} (D_i - \mu_D)^2,$$
where $D_{SDU}$ is the discrete function of the SDU latency, $D_i$ is the end-to-end delay of SDU i, $\mu_D$ is the mean SDU delay, and N is the total number of SDUs transmitted.
In accordance with 802.16e recommendations [16,17], here it is assumed that retransmissions cannot occur in the next frame but are sent at the earliest on the 4th DL subframe after transmission. Furthermore, the maximum number of retransmissions is limited to 4. These values result from processing time at the receiver and transmission delays in the radio network. The MAC parameter ARQ Retry Timeout is set to 4 and performance is simulated for 1-4 retransmissions, corresponding to 30-90 ms block lifetimes, according to (3).

Analysis of BLER and UDP PER
In Figures 3 and 4, the BLER and SDU error rate are compared for an ARQ lifetime of 65 ms (i.e., up to 3 retransmissions may have occurred). The simulated MS speed is 1 km/h. It is observed that the BLER seen at the MAC layer after the retransmissions maps to a much higher SER and UDP PER at the higher layers. For example, at SNR = 16 dB 16QAM 3/4 results in 0.063 BLER, which corresponds to 0.089 SER. This is because, according to the standard [1], even if just one ARQ block in an SDU is discarded it will result in the whole SDU being discarded. The SDU error rate is accentuated more for higher BLER, for example, at 14 dB 64QAM 2/3 gives 0.53 BLER and 0.65 SER. This shows that in order to achieve high video quality and quasi-zero PER at the video receiver (i.e., SER), the ARQ retransmissions must achieve quasi-zero BLER. However, with the number of retransmissions limited to 3, only lower modes can deliver error free data in a slowly time-varying channel. From Figure 3, it is obvious that only the QPSK modes and 16QAM 1/2 can deliver SER ≤ $10^{-2}$ for SNR ≤ 18 dB, when the MS speed is 1 km/h and the block lifetime is 65 ms.
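The way a residual BLER is amplified into SER can be illustrated with a simple reference calculation. The sketch assumes independent block errors and a hypothetical number of ARQ blocks per SDU; neither assumption holds in the simulator, where block errors are time-correlated, so the values reported in the paper need not match this formula.

```python
def ser_independent_blocks(bler, blocks_per_sdu):
    """SER if an SDU is lost whenever any of its blocks is lost (independent errors)."""
    return 1.0 - (1.0 - bler) ** blocks_per_sdu

# blocks_per_sdu = 4 is a hypothetical value chosen for illustration only.
for bler in (0.01, 0.039, 0.063):
    print(bler, round(ser_independent_blocks(bler, blocks_per_sdu=4), 3))
```

With bursty errors, lost blocks tend to cluster inside the same SDUs, so the SER for a given BLER is typically lower than this independent-error reference.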
In Figure 5, SER versus channel SNR is shown for an MS speed of 1 km/h, when up to 4 retransmissions are allowed, for a block lifetime of 90 ms. It is shown that the SER is lower with a longer block lifetime. QPSK 1/2 delivers error free data for SNR ≥ 12 dB and 16QAM 1/2 attains an SER < 0.02 for SNR ≥ 14 dB.
Our previous work in [18] focused on the BLER attained (rather than SER) when ARQ retransmissions occurred in the next DL subframe, without limitation on the maximum number of retransmissions (since the ARQ retransmission frequency is not specified in the 802.16e standard [1]). In [18], it was shown that the BLER achieved was below $10^{-2}$ at a mean channel SNR of 8 dB, for a block lifetime of 100 ms, with MCS modes 16QAM 1/2 or lower. All MCS modes up to 64QAM 1/2 attained quasi error free transmission when the mean channel SNR was 12 dB and block lifetimes were greater than 70 ms. This was possible because, in that scenario, the maximum number of permitted retransmissions was 7 for a block lifetime of 70 ms; this resulted in more favorable ARQ performance. In this work, it is shown that imposing practical limitations on the ARQ Retry Timeout parameter and the maximum number of retransmissions (as recommended in [16,17]) results in a residual BLER. This residual BLER is further accentuated as UDP PER at the higher layers.
In Figure 6, the SER versus SNR across all MCS modes for an MS speed of 10 km/h is compared with the SER versus SNR for an MS speed of 1 km/h, as shown in Figure 4. The maximum number of retransmissions in both cases is determined by a 65 ms block lifetime. It is clear that the SER attained for the 1 km/h channel is higher than the SER for the 10 km/h channel. Hence, more ARQ retransmissions are required at the slower speed to achieve a quasi-zero level of SER. The effect of channel coherence time on ARQ retransmissions was also studied in [24] for 802.11 a/g networks, where ARQ retransmission was implemented according to a stop-and-wait mechanism and was governed by the CSMA/CA access protocol.
Next, the BLER and SER attained when multicasting are studied. In Figures 7 and 8 ARQ is not enabled, as is the case for multicasting. The MS speed is 1 km/h. It can be seen that the SER in Figure 8 is much higher than the BLER for the same channel SNR and MCS mode. For example, at 14 dB QPSK 3/4 delivers BLER = 0.039, while the SER is 0.15. The lack of ARQ error correction limits the video broadcast performance over mobile WiMAX, as explained in [11]. Without additional error correction, real-time video multicasting could not be offered for an SNR range below 16 dB (as SER > $10^{-2}$ [28]), and even then only with the lowest throughput mode QPSK 1/2, which consumes considerable channel capacity.

ARQ-Aware Scheduling and Latency/Jitter
Another very important aspect of video transmission with ARQ is the latency and jitter that occur. As discussed in [29][30][31], for video applications the playback buffer that masks network jitter can take values in the order of 250 ms, while latency is acceptable up to 100-150 ms, depending on the specific video characteristics and applications. Assuming ARQ-aware scheduling, here the latency and jitter associated with ARQ retransmissions are studied over an 802.16e network.
For the 2000 transmitted SDUs, the simulator calculates the total latency for each SDU. Figure 9 shows the PDF of the end-to-end SDU delay computed during the transmission of 2000 SDUs at a channel SNR = 12 dB, MCS mode 1 (QPSK 3/4), and MS speed 1 km/h, for block lifetimes of 30 ms (up to 1 retransmission), 65 ms (up to 3 retransmissions), and 90 ms (up to 4 retransmissions).  The simulator computes the latency during transmission and the jitter as the variance of the SDU latency across the 2000 SDUs, for each SNR, MCS mode, and block lifetime. Figure 10 shows the maximum latency attained versus block lifetime, when the mean channel SNR is 12 dB and the MS speed is 1 km/h. It is observed that at SNR = 12 dB (when the channel is poor) the maximum latency is fixed to approximately 82 ms for block lifetimes of 30-90 ms when QPSK 1/2 is used. This MCS mode delivers SER = 0.003 (see Figure 4). If QPSK 3/4 is used, the maximum delay increases for each block lifetime, reaching over 300 ms for a block lifetime of 90 ms. This occurs because a very large number of ARQ blocks are in error and many retransmissions occur. This MCS mode attains SER = 0.05 when the block lifetime is 65 ms, despite the retransmissions that take place (see Figure 4). This mode would not be selected for transmission at SNR = 12 dB by the link adaptation algorithm because the BLER is very high for the amount of retransmissions allowed. The amount of resources allocated by the scheduler in this case does not cater for the very large number of retransmissions that occur, resulting in a buildup of queuing delay and also a large increase in jitter. The amount of "overallocation" of resources the ARQ-aware scheduler predicts is related to the level of PER attained by the selected MCS mode and the ARQ block lifetime. In Table 5, the overallocation γ required for different MCS modes is given for mean channel SNR values of 10 dB and 22 dB, when the block lifetime is 65 ms. 
If S is the number of slots required per DL frame, for a given bitrate and MCS mode, the scheduler needs to allocate (1 + γ) · S slots per DL frame. The overallocation γ is calculated by dividing the number of slots required per DL frame for the desired number of ARQ retransmissions (according to the MCS mode), by S. From this table, it is obvious that, for example, if mode 1 was selected at SNR = 10 dB, the scheduler would need to allocate γ = 0.42 more resources than that required if ARQ was not enabled. A smaller allocation than this results in queuing delays. It is also obvious that the overallocation required for the higher modes at low SNR values is unacceptable (i.e., three times the amount of resources required for the video bitrate). Figures 11 and 12 show jitter versus block lifetime for SNR = 22 dB at an MS speed of 1 km/h and for SNR = 12 dB at an MS speed of 10 km/h, correspondingly. Jitter is studied for the MCS modes that attain SER ≤ 0.02 and that can deliver quality video. In Figure 11, where the mean channel SNR = 22 dB, the jitter is below 100 ms when the block lifetime is 65 ms, for MCS modes 0 to 4. All of these modes deliver quasi error free data. Figure 12 shows the jitter when the MS speed is 10 km/h at SNR = 12 dB. The jitter is approximately 100 ms for mode 1 when the block lifetime is 65 ms. Both QPSK 3/4 and 16QAM 1/2 attain an SER of approximately 0.02, but for 16QAM 1/2 more retransmissions occur and therefore higher jitter ensues.
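The overallocation rule can be made concrete with a short sketch. The base allocation of 72 slots is a hypothetical figure (it corresponds to a 1.03 Mbps flow on QPSK 3/4 if MAC overhead is ignored), while γ = 0.42 is the value quoted above for mode 1 at SNR = 10 dB. The function names are illustrative.

```python
from math import ceil

def overallocation(retx_slots_per_frame, base_slots_per_frame):
    """gamma: slots needed for ARQ retransmissions relative to the base allocation S."""
    return retx_slots_per_frame / base_slots_per_frame

def slots_to_allocate(base_slots_per_frame, gamma):
    """Slots per DL frame for new data plus retransmissions: (1 + gamma) * S."""
    # Rounding first guards against float noise such as 110.00000000000001.
    return ceil(round((1 + gamma) * base_slots_per_frame, 9))

print(slots_to_allocate(72, 0.42))  # hypothetical S = 72 slots
```

Allocating fewer slots than this leaves retransmissions competing with new data, which is the source of the queuing delay described above.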
The study of the simulation results on SER, latency and jitter, when the recommendations from [17,32] regarding ARQ parameters are applied, leads to the conclusion that when the channel SNR is poor (SNR ≤ 14 dB) only the lower QPSK modes can deliver quality video with acceptable jitter. For a UDP unicast video transmission, in order to attain quasi error free SER, and at the same time limit jitter to below 100 ms, QPSK modes should be used with up to 3 retransmissions. When the channel has a long coherence time more retransmissions are required in order to deliver the ARQ blocks error free, and only QPSK 1/2 can be used. However, even then the SER attained (0.03) is not quasi error free.

Transmission Efficiency
Having discussed the mobile WiMAX performance in terms of SER and jitter when ARQ is enabled, the channel resources required during unicast video transmission are now studied. The simulator estimates the total number of physical slots required for the transmission of 2000 SDUs, at each mean channel SNR and MCS mode, including all ARQ retransmissions. Then the channel capacity required is calculated as a percentage of the total number of DL slots available, for the duration of the transmission. This work focuses on the goodput attained. A novel transmission efficiency metric is proposed, namely the goodput per frame, which takes into account the goodput achieved for the amount of channel resources required per DL subframe. Figure 13 shows the channel capacity required, as a percentage of the total physical DL slots available per DL subframe, versus the mean channel SNR, for all MCS modes when up to 3 retransmissions are allowed. It is shown that for QPSK 1/2 there is a very small differentiation in the channel resources required for SNR ≥ 14 dB, as resource requirements drop from 38.5% of slots at SNR = 8 dB to 33%. This is because very few retransmissions occur for higher channel SNR values with QPSK 1/2, therefore the resources required are constant, corresponding to the new data arrivals. In other cases, the differentiation of resources required across SNR values is much greater. For example, for 64QAM 1/2 at SNR = 8 dB approximately 55% of the total slots are required, whereas for SNR = 22 dB only 11.5% of the total slots are required. This is because more ARQ retransmissions are required at low SNR values. It is also obvious that the higher throughput modes require less resources than the lower modes, even at low SNR values when retransmissions occur. For example, 16QAM 3/4 requires 29% of the total resources at SNR = 12 dB, while QPSK 1/2 requires 34% of the resources. This is because the lower modes pack less data bits per slot, as shown in Table 3.
It is clear that greater bandwidth efficiency can be achieved when higher MCS modes are used. However, as shown by the SDU loss rate, only the lower MCS modes can support the QoS required for real-time video transmission at lower SNR values.
The goodput delivered per OFDMA frame for one flow of data, $G_{flow}(s, l, m)$, is computed as the number of correctly received bits, CorrectBits, divided by the transmission duration, $F_T$, in number of OFDMA frames, required for the transmission of N UDP packets. It is calculated for each mean channel SNR $s$, MCS mode $m$, and block lifetime $l$, as

$$G_{flow}(s, l, m) = \frac{\mathrm{CorrectBits}(s, l, m)}{F_T(s, l, m) \cdot T_{OFDMA}},$$

where $T_{OFDMA}$ is the duration of an OFDMA frame (i.e., 5 ms). The average channel capacity, in slots, required for the data flow per OFDMA frame, $\theta_{flow,fr}$, is calculated as the total number of required physical slots, $C_T$, divided by the transmission duration, $F_T$, for each mean channel SNR $s$, MCS mode $m$, and block lifetime $l$:

$$\theta_{flow,fr}(s, l, m) = \frac{C_T(s, l, m)}{F_T(s, l, m)}.$$

Therefore, a capacity of $\theta_{flow,fr}$ slots per DL frame delivers a goodput of $G_{flow}$ per frame, for each channel SNR, MCS mode, and block lifetime. The goodput-per-frame efficiency metric measures the goodput $G_{flow}$ delivered per frame, taking into account the average capacity required per frame (in slots), $\theta_{flow,fr}$. Hence the goodput per frame, $g_{fr}$, is defined as

$$g_{fr}(s, l, m) = \frac{G_{flow}(s, l, m)}{\theta_{flow,fr}(s, l, m)}.$$
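The per-flow quantities defined above can be sketched in a few lines of Python. This is a minimal illustration, not the simulator used in this work: it assumes the flow goodput is expressed in bit/s (CorrectBits divided by the transmission time $F_T \cdot T_{OFDMA}$), and the `FlowStats` container, the function names, and the sample counters are all hypothetical.

```python
from dataclasses import dataclass

T_OFDMA_S = 0.005  # OFDMA frame duration (5 ms)


@dataclass
class FlowStats:
    """Counters for one (SNR, block lifetime, MCS) simulation run."""
    correct_bits: int  # CorrectBits: bits received correctly
    frames: int        # F_T: transmission duration in OFDMA frames
    slots: int         # C_T: physical slots used, including retransmissions


def flow_goodput_bps(stats, t_ofdma=T_OFDMA_S):
    """G_flow: correct bits divided by the transmission time (assumed bit/s)."""
    return stats.correct_bits / (stats.frames * t_ofdma)


def capacity_per_frame(stats):
    """theta_flow,fr: average number of slots used per OFDMA frame."""
    return stats.slots / stats.frames


def goodput_per_frame(stats):
    """g_fr: flow goodput delivered per slot of per-frame capacity."""
    return flow_goodput_bps(stats) / capacity_per_frame(stats)


if __name__ == "__main__":
    # Hypothetical counters, chosen only to exercise the functions.
    run = FlowStats(correct_bits=5_150_000, frames=1000, slots=115_000)
    print(flow_goodput_bps(run), capacity_per_frame(run), goodput_per_frame(run))
```

Keeping the three raw counters together makes it easy to evaluate the metric for every combination of SNR, MCS mode, and block lifetime from the same simulation output.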
If the total capacity of the DL frame, $S_{fr}$, is used, the system can support a total goodput per frame $Gdp_{fr}$, calculated using (10). The total capacity of a DL frame, $S_{fr}$, in slots, is a system-dependent parameter; for the mobile WiMAX system simulated, assuming a PUSC-enabled DL, $S_{fr} = 330$ slots (Table 1). The system can therefore support a total goodput per frame given by

$$Gdp_{fr}(s, l, m) = \frac{G_{flow}(s, l, m) \cdot S_{fr}}{\theta_{flow,fr}(s, l, m)}. \tag{10}$$

Figure 14 shows the total goodput per frame versus channel SNR, estimated for all MCS modes when the block lifetime is 90 ms and the MS speed is 1 km/h. It can be seen that QPSK 1/2 offers the highest transmission efficiency for SNR ≤ 10 dB and 16QAM 1/2 for SNR values in the range 10 dB to 16 dB.
In order to select the most efficient MCS mode that also delivers zero PER, or a PER below a maximum acceptable value (such that video QoS can be guaranteed), a constraint should be applied based on the PER attained at the application layer for each channel SNR. Loss of packets can seriously degrade the quality of received video [28,33]. In the following, the UDP PER is constrained to at most 1%. Figure 15 shows the total goodput per frame versus channel SNR, for all MCS modes with UDP PER ≤ $10^{-2}$, when the block lifetime is 90 ms and the MS speed is 1 km/h. If the PER attained for a particular MCS mode and SNR value is higher than $10^{-2}$, the total goodput is set to zero. It is observed that below an SNR value of 12 dB no MCS mode can achieve quasi error free transmission. Therefore, quality unicast video cannot be offered for channel SNR below 12 dB. For SNR values in the range 12 dB to 16 dB only QPSK 1/2 can achieve the desired PER. For SNR values in the range 16 dB to 20 dB the most efficient MCS mode that offers quasi error free transmission is 16QAM 1/2, whereas for SNR = 22 dB mode 64QAM 1/2 offers the highest transmission efficiency for a PER below $10^{-2}$.
Figure 15 also shows the total goodput attained for multicast transmission (no ARQ enabled), subject to the same PER ≤ $10^{-2}$ constraint. It can be seen that the multicast video service cannot be offered quasi error free for SNR values below about 16 dB. For SNR values in the range 16 dB to 18 dB only QPSK 1/2 can achieve the desired PER. For SNR values in the range 20 dB to 22 dB mode 16QAM 1/2 offers the highest transmission efficiency while keeping the PER below $10^{-2}$.
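The PER-constrained selection described above can be sketched as follows. This is a hedged illustration rather than the authors' implementation: the total goodput follows (10), a mode whose PER exceeds the limit is given zero goodput as in Figure 15, and the function names and the sample measurements are hypothetical.

```python
S_FR = 330        # total DL subframe capacity in slots (Table 1)
PER_LIMIT = 1e-2  # maximum acceptable UDP PER


def total_goodput_bps(g_flow_bps, theta_slots, s_fr=S_FR):
    """Total goodput per frame as in (10): G_flow * S_fr / theta_flow,fr."""
    return g_flow_bps * s_fr / theta_slots


def constrained_goodput_bps(g_flow_bps, theta_slots, per, per_limit=PER_LIMIT):
    """Total goodput, set to zero when the PER constraint is violated."""
    if per > per_limit:
        return 0.0
    return total_goodput_bps(g_flow_bps, theta_slots)


def best_mcs(measurements, per_limit=PER_LIMIT):
    """Pick the MCS mode with the highest constrained total goodput.

    measurements maps an MCS name to (G_flow in bit/s, theta in slots, PER).
    Returns (None, 0.0) when no mode satisfies the PER constraint.
    """
    best_mode, best_goodput = None, 0.0
    for mode, (g_flow_bps, theta_slots, per) in measurements.items():
        goodput = constrained_goodput_bps(g_flow_bps, theta_slots, per, per_limit)
        if goodput > best_goodput:
            best_mode, best_goodput = mode, goodput
    return best_mode, best_goodput


if __name__ == "__main__":
    # Hypothetical measurements at one SNR point, for illustration only.
    at_one_snr = {
        "QPSK 1/2": (1.03e6, 115.0, 0.0),
        "16QAM 1/2": (1.00e6, 60.0, 0.05),  # more efficient, but PER too high
    }
    print(best_mcs(at_one_snr))
```

In this example 16QAM 1/2 would deliver more goodput per slot, but it is excluded because its PER exceeds the limit, which mirrors how the constraint removes the higher modes at low SNR.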
This comparison shows that using ARQ with a block lifetime of 90 ms offers a significant gain of approximately 3 Mbps (i.e., double the goodput) for

Conclusions
From the simulation results presented in this work, it has been possible to study the performance of different MCS modes and block lifetimes for various channel SNR values in a mobile broadband network. The performance of ARQ retransmissions was shown to depend on the MS velocity. It was shown that at walking speeds (e.g., 1 km/h) ARQ retransmissions are less efficient. Only mode 0 succeeds in delivering data with PER ≤ $10^{-2}$ for 12 dB ≤ SNR < 16 dB, and only modes 1 and 2 for SNR ≤ 20 dB, with up to 4 retransmissions. The SER attained is lower for the same MCS mode, block lifetime, and channel SNR when the channel coherence time is short. It was demonstrated that a channel-aware and ARQ-aware scheduler should be used in order to predict and provide sufficient resources per frame for delay-sensitive services, such as real-time video. The novel efficiency metric, goodput per frame, has enabled a performance comparison in terms of PER achieved and radio resources required when S-ACK ARQ is enabled, for various MCS modes. This efficiency metric showed the total goodput that each MCS mode could support for each channel SNR for a given block lifetime. The goodput per frame metric was found to be a valuable tool for radio resource management in broadband wireless networks.
Insight has been gained into the importance of the ARQ Retry Timeout MAC parameter and how it affects system performance, not only in terms of SDU delay but also in terms of PER. When practical considerations are taken into account regarding the frequency and number of possible retransmissions based on the WiMAX Forum recommendations, only modes 0, 1, and 2 can deliver quasi error free data. Jitter in these cases is maintained within acceptable limits for video QoS. However, the lower modes, 0 and 2, that attain quasi error free transmission require 35% and 18% of the total channel resources, respectively, to deliver video at 1.03 Mbps. The total goodput attained is 3 Mbps for SNR ≤ 16 dB (with MCS mode 0) and 6 Mbps for channel SNR values in the range 16 dB to 20 dB (with MCS mode 2). A quality unicast video service cannot be offered for channel SNR values below 12 dB.
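As a rough consistency check on these figures, assume (for illustration only) that the flow goodput equals the 1.03 Mbps video rate when transmission is quasi error free. The total goodput in (10) then reduces to the video rate divided by the fraction of DL slots used, $\theta_{flow,fr}/S_{fr}$:

$$Gdp_{fr} = \frac{G_{flow} \cdot S_{fr}}{\theta_{flow,fr}} \approx \frac{1.03\ \text{Mbps}}{0.35} \approx 2.9\ \text{Mbps (mode 0)}, \qquad \frac{1.03\ \text{Mbps}}{0.18} \approx 5.7\ \text{Mbps (mode 2)},$$

which is in line with the approximately 3 Mbps and 6 Mbps quoted above.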
Multicasting real-time video without ARQ retransmissions, while observing the QoS constraint, is ineffective below about 16 dB SNR. Additional error correction mechanisms are necessary in order to support high quality multicast video, such as the Application Layer Forward Error Correction (FEC) mechanism based on Raptor codes, endorsed by the 3GPP MBMS [34].