Capacity Evaluation for IEEE 802.16e Mobile WiMAX

Abstract: We present a simple analytical method for capacity evaluation of IEEE 802.16e Mobile WiMAX™ networks. Various overheads that impact the capacity are explained, and methods to reduce these overheads are presented. The advantage of a simple model is that the effect of each decision and the sensitivity to various parameters can be seen easily. We illustrate the model by estimating the capacity for three sample applications: Mobile TV, VoIP, and data. The analysis process helps explain various features of IEEE 802.16e Mobile WiMAX. It is shown that proper use of overhead-reducing mechanisms and proper scheduling can make an order of magnitude difference in performance. This capacity evaluation method can also be used for validation of simulation models.


I. INTRODUCTION
IEEE 802.16e Mobile WiMAX is the standard [1] for broadband (high-speed) wireless access (BWA) in a metropolitan area. Many carriers all over the world have been deploying Mobile WiMAX infrastructure and equipment. For interoperability testing, several WiMAX profiles have been developed by the WiMAX Forum.
The key concern of these providers is how many users they can support for various types of applications in a given environment, or what value should be used for various parameters. Answering this often requires detailed simulations and can be time consuming. In addition, studying the sensitivity of the results to various input values requires multiple runs of the simulation, further increasing the cost and complexity of the analysis. Therefore, in this paper we present a simple analytical method of estimating the number of users on a Mobile WiMAX system. This model has been developed for and used extensively in the WiMAX Forum [2].
There are four goals of this paper. First, we want to present a simple way to compute the number of users supported for various applications. The input parameters can be easily changed, allowing service providers and users to see the effect of a parameter change and to study the sensitivity to various parameters. Second, we explain all the factors that affect the performance. In particular, there are several overheads. Unless steps are taken to avoid these, the performance results can be very misleading. Note that the standard specifies these overhead reduction methods; however, they are not often modeled. Third, proper scheduling can make an order of magnitude difference in the capacity since it can change the number of bursts and the associated overheads significantly. Fourth, the method can also be used to validate simulation models that can handle more sophisticated configurations.
This work was sponsored in part by a grant from the Application Working Group of WiMAX Forum. "WiMAX," "Mobile WiMAX," "Fixed WiMAX," "WiMAX Forum," "WiMAX Certified," "WiMAX Forum Certified," the WiMAX Forum logo and the WiMAX Forum Certified logo are trademarks of the WiMAX Forum.
C. So-In, R. Jain, and A. Tamimi are with the Department of Computer Science and Engineering, Washington University in St. Louis, St. Louis, MO 63130. E-mail: cs5, jain, and aa7@cse.wustl.edu
This paper is organized as follows. In Section II, we present an overview of the Mobile WiMAX physical layer (PHY). Understanding this is important for performance modeling. In Section III, Mobile WiMAX system and configuration parameters are discussed. The key input to any capacity planning and evaluation exercise is the workload. We present three sample workloads consisting of Mobile TV, VoIP, and data applications in Section IV. Our analysis is general and can be used for any other application workload. Section V explains both upper and lower layer overheads and ways to reduce those overheads. The number of users supported for the three workloads is finally presented in Section VI. It is shown that with proper scheduling, capacity can be improved significantly. Both error-free perfect channel and imperfect channel results are also presented. Finally, the conclusions are drawn in Section VII.

II. OVERVIEW OF MOBILE WIMAX PHY
One of the key developments of the last decade in the field of wireless broadband is the practical adoption and cost-effective implementation of Orthogonal Frequency Division Multiple Access (OFDMA). Today, almost all upcoming broadband access technologies, including Mobile WiMAX and its competitors, use OFDMA. For performance modeling of Mobile WiMAX, it is important to understand OFDMA. Therefore, we provide a very brief explanation that helps us introduce the terms that are used later in our analysis. For further details, we refer the reader to one of several good books and surveys on Mobile WiMAX [3], [4], [5], [6], [7].
Unlike WiFi and many cellular technologies, which use fixed-width channels, Mobile WiMAX allows almost any available spectrum width to be used. Allowed channel bandwidths vary from 1.25 MHz to 28 MHz. The channel is divided into many equally spaced subcarriers. For example, a 10 MHz channel is divided into 1024 subcarriers, some of which are used for data transmission while others are reserved for monitoring the quality of the channel (pilot subcarriers), for providing a safety zone between the channels (guard subcarriers), or for use as a reference frequency (DC subcarrier). In OFDMA, each mobile station (MS) is allocated only a subset of the subcarriers. The available subcarriers are grouped into a few subchannels, and the MS is allocated one or more subchannels for a specified number of symbols. The mapping process from a logical subchannel to multiple physical subcarriers is called a permutation. Basically, there are two types of permutations: distributed and adjacent. The distributed subcarrier permutation is suitable for mobile users, while the adjacent permutation is for fixed (stationary) users. Of these, Partially Used Subchannelization (PUSC) is the most commonly used in a mobile wireless environment [3]. Others include Fully Used Subchannelization (FUSC) and band Adaptive Modulation and Coding (band-AMC). In PUSC, subcarriers forming a subchannel are selected randomly from all available subcarriers. Thus, the subcarriers forming a subchannel may not be adjacent in frequency.
Users are allocated a variable number of slots in the downlink and uplink. The exact definition of a slot depends upon the subchannelization method and on the direction of transmission (DL or UL). Figs. 2 and 3 show slot formation for PUSC. In the uplink (Fig. 2), a slot consists of 6 tiles, where each tile consists of 4 subcarriers over 3 symbol times. Of the 12 subcarrier-symbol combinations in a tile, 4 are used for pilot and 8 are used for data. The slot, therefore, consists of 24 subcarriers over 3 symbol times. The 24 subcarriers form a UL subchannel. Slot formation in the downlink is shown in Fig. 3. In the downlink, a slot consists of 2 clusters, where each cluster consists of 14 subcarriers over 2 symbol times. Thus, a slot consists of 28 subcarriers over two symbol times. The group of 28 subcarriers is called a subchannel, resulting in 30 DL subchannels from 1024 subcarriers at 10 MHz.
The Mobile WiMAX DL subframe, as shown in Fig. 1, starts with one symbol-column of preamble. Other than the preamble, all other transmissions use slots as discussed above. The first field in the DL subframe after the preamble is a 24-bit Frame Control Header (FCH). For high reliability, the FCH is transmitted with the most robust MCS (QPSK 1/2) and is repeated 4 times. The next field is the DL-MAP, which specifies the burst profile of all user bursts in the DL subframe. The DL-MAP has a fixed part, which is always transmitted, and a variable part, which depends upon the number of bursts in the DL subframe. This is followed by the UL-MAP, which specifies the burst profile for all bursts in the UL subframe. It also consists of a fixed part and a variable part. Both DL-MAP and UL-MAP are transmitted using QPSK 1/2 MCS.

III. MOBILE WIMAX CONFIGURATION PARAMETERS AND CHARACTERISTICS
The key parameters of Mobile WiMAX PHY are summarized in Table I. In the configuration considered here, 47 symbols per frame are available for the DL and UL subframes. If $n$ of these are used for DL, then $47 - n$ are available for uplink. Since DL slots occupy 2 symbols and UL slots occupy 3 symbols, it is best to divide these 47 symbols such that $47 - n$ is a multiple of 3 and $n$ is of the form $2k + 1$ (the extra symbol carries the preamble). For a DL:UL ratio of 2:1, these considerations result in a DL subframe of 29 symbols and a UL subframe of 18 symbols. In this case, the DL subframe consists of a total of 14 × 30 = 420 slots, and the UL subframe consists of 6 × 35 = 210 slots.
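The symbol split described above can be checked with a short script. This is only a sketch under the stated constraints (DL length of the form 2k + 1, UL length a multiple of 3, 30 DL and 35 UL subchannels at 10 MHz); the function names `split_symbols` and `slots` are illustrative and do not come from the standard.

```python
def split_symbols(total_symbols=47, dl_ul_ratio=2.0):
    """Return (dl_symbols, ul_symbols) closest to the target DL:UL ratio.

    DL must be odd (one preamble symbol plus 2-symbol slots) and
    UL must be a multiple of 3 (3-symbol slots).
    """
    candidates = [
        (n, total_symbols - n)
        for n in range(1, total_symbols)
        if n % 2 == 1 and (total_symbols - n) % 3 == 0
    ]
    # Choose the admissible split whose DL/UL ratio is nearest the target.
    return min(candidates, key=lambda c: abs(c[0] / c[1] - dl_ul_ratio))


def slots(dl_symbols, ul_symbols, dl_subchannels=30, ul_subchannels=35):
    """Slots per subframe: DL slots span 2 symbols, UL slots span 3."""
    dl_slots = (dl_symbols - 1) // 2 * dl_subchannels  # drop the preamble symbol
    ul_slots = ul_symbols // 3 * ul_subchannels
    return dl_slots, ul_slots


dl, ul = split_symbols()
print(dl, ul, slots(dl, ul))
```

For a 2:1 target this selects 29 DL and 18 UL symbols, which yields 420 DL slots and 210 UL slots.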
Table II lists the number of data, pilot, and guard subcarriers for various channel widths. PUSC subchannelization is assumed, which is the most common subchannelization [3].
Table III lists the number of bytes per slot for various MCS values. For each MCS, the number of bytes per slot is equal to (bits per symbol) × (coding rate) × (48 data subcarrier-symbols per slot) / (8 bits per byte). Note that for UL, the maximum MCS level is QAM-16 2/3 [2].
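As a quick illustration of the bytes-per-slot rule, the following sketch evaluates it for a few modulation and coding combinations. The dictionary of bits per symbol reflects the usual values for QPSK, 16-QAM, and 64-QAM; the function name is hypothetical, and the output is not a substitute for Table III.

```python
from fractions import Fraction

BITS_PER_SYMBOL = {"QPSK": 2, "16-QAM": 4, "64-QAM": 6}
DATA_SUBCARRIER_SYMBOLS_PER_SLOT = 48  # data subcarrier-symbols in one PUSC slot


def bytes_per_slot(modulation, coding_rate):
    """Bytes carried by one slot: bits/symbol x coding rate x 48 / 8."""
    bits = (BITS_PER_SYMBOL[modulation]
            * Fraction(coding_rate)
            * DATA_SUBCARRIER_SYMBOLS_PER_SLOT)
    return bits / 8


for mod, rate in [("QPSK", "1/2"), ("16-QAM", "1/2"), ("64-QAM", "3/4")]:
    print(mod, rate, float(bytes_per_slot(mod, rate)))
```

QPSK 1/2 gives 6 bytes per slot, the value used in the QPSK 1/2 examples later in the paper.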
This analysis method can be used for any allowed channel width, any frame duration, or any subchannelization. We assume a 10 MHz Mobile WiMAX TDD system with 5 ms frame duration, PUSC subchannelization mode, and a DL:UL ratio of 2:1. These are the default values recommended by the WiMAX Forum system evaluation methodology and are also common values used in practice. The number of DL and UL slots for this configuration can be computed as shown in Table IV.

IV. TRAFFIC MODELS AND WORKLOAD CHARACTERISTICS
The key input to any capacity planning exercise is the workload. In particular, all statements about the number of subscribers supported assume a certain workload for the subscriber. The main problem is that the workload varies widely with types of users, types of applications, and time of the day. One advantage of the simple analytical approach presented in this paper is that the workload can be easily changed and the effect of various parameters can be seen almost instantaneously. With simulation models, every change would require several hours of simulation reruns. In this section, we present three sample workloads consisting of Mobile TV, VoIP, and data applications. We use these workloads to demonstrate various steps in capacity estimation.
The VoIP workload is symmetric in that the DL data rate is equal to the UL data rate. It consists of very small packets that are generated periodically. The packet size and the period depend upon the vocoder used. G723.1 Annex A is used in our analysis and results in a data rate of 5.3 kbps, that is, a 20-byte voice packet every 30 ms. Note that other vocoder parameters can also be used; they are listed in Table V.
The Mobile TV workload depends upon the quality and size of the display. In our analysis, a sample measurement on a small-screen Mobile TV device produced an average packet size of 984 bytes every 30 ms, resulting in an average data rate of 350.4 kbps [11], [12]. Note that the Mobile TV workload is highly asymmetric, with almost all of the traffic going downlink. Table VI also shows other types of Mobile TV workload.
For the data workload, we selected the Hypertext Transfer Protocol (HTTP) workload recommended by the 3rd Generation Partnership Project (3GPP) [13]. The parameters of the HTTP workload are summarized in Table VII.
The characteristics of the three workloads are summarized in Table VIII. In this table, we also include higher level headers, that is, IP, UDP, and TCP, with a header compression mechanism. A detailed explanation of PHS (Payload Header Suppression) and ROHC (Robust Header Compression) is presented in the next section. Given ROHC, the data rate with higher level headers ($R_{\text{withHeader}}$) is calculated by:
$$R_{\text{withHeader}} = R \times \frac{MSDU + Header}{MSDU} \quad (1)$$
Here, $MSDU$ is the MAC SDU size, $Header$ is the size of the compressed header, and $R$ is the application data rate. Given $R_{\text{withHeader}}$, the number of bytes per frame per user can be derived from $R_{\text{withHeader}} \times$ frame duration. For example, for Mobile TV, with 983.5 bytes of MAC SDU size and 350 kbps of application data rate, with ROHC type 1, the MAC SDU size with header is 983.5 + 1 bytes; as a result, the data rate with header is 350.4 kbps, which results in 219 bytes per 5 ms frame.
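The header adjustment and the per-frame byte count can be reproduced with a few lines of code. The sketch below assumes that the rate with headers scales the application rate by (SDU size + header size) / SDU size and that a frame lasts 5 ms; the function names are illustrative only.

```python
def rate_with_header(app_rate_kbps, sdu_bytes, header_bytes):
    """Scale the application rate by (SDU + header) / SDU."""
    return app_rate_kbps * (sdu_bytes + header_bytes) / sdu_bytes


def bytes_per_frame(rate_kbps, frame_ms=5.0):
    """kbps x ms gives bits; divide by 8 to get bytes per frame."""
    return rate_kbps * frame_ms / 8


# Mobile TV: 350 kbps, 983.5-byte SDUs, 1-byte ROHC header
tv_rate = rate_with_header(350, 983.5, 1)
print(round(tv_rate, 1), round(bytes_per_frame(tv_rate)))

# VoIP: one 20-byte packet every 30 ms, 1-byte ROHC header
voip_rate = rate_with_header(20 * 8 / 30, 20, 1)
print(round(voip_rate, 2), round(bytes_per_frame(voip_rate), 2))
```

With these inputs Mobile TV comes to about 350.4 kbps and roughly 219 bytes per frame, and VoIP to 3.5 bytes per frame.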

V. OVERHEAD ANALYSIS
In this section, we consider both upper and lower layer overheads in detail.

A. Upper Layer Overhead
Table VIII, which lists the characteristics of our Mobile TV, VoIP, and data workloads, includes the type of transport layer used: either Real Time Transport Protocol (RTP) or TCP. This affects the upper layer protocol overhead. Both RTP over UDP over IP (12 + 8 + 20 bytes) and TCP over IP (20 + 20 bytes) result in a per-packet header overhead of 40 bytes. This is significant and can severely reduce the capacity of any wireless system.
There are two ways to reduce upper layer overheads and to improve the number of supported users. These are Payload Header Suppression (PHS) and Robust Header Compression (ROHC). PHS is a Mobile WiMAX feature. It allows the sender to not send fixed portions of the headers and can reduce the 40-byte header overhead down to 3 bytes. ROHC, specified by the Internet Engineering Task Force (IETF), is another higher layer compression scheme. It can reduce the higher layer overhead to 1 to 3 bytes. In our analysis, we used ROHC-RTP packet type 0 with R-0 mode. In this mode, all RTP sequence number functions are known to the decompressor. This results in a net higher layer overhead of just 1 byte [5], [14], [15].
For small packet size workloads, such as VoIP, header suppression and compression can have a significant impact on the capacity. We have seen several published studies that use uncompressed headers, resulting in significantly reduced performance, which would not be the case in practice.
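To make the impact on small packets concrete, the sketch below compares the share of each transmitted packet taken up by higher layer headers for a 20-byte VoIP payload, using the three header sizes quoted above (40 bytes uncompressed, 3 bytes with PHS, 1 byte with ROHC). The helper name is illustrative.

```python
def header_overhead_fraction(payload_bytes, header_bytes):
    """Fraction of the transmitted packet occupied by the header."""
    return header_bytes / (payload_bytes + header_bytes)


VOIP_PAYLOAD = 20  # bytes per voice packet (G723.1 Annex A example)

for name, header in [("uncompressed", 40), ("PHS", 3), ("ROHC", 1)]:
    share = header_overhead_fraction(VOIP_PAYLOAD, header)
    print(f"{name:>12}: {share:.1%}")
# uncompressed: 66.7%
#          PHS: 13.0%
#         ROHC: 4.8%
```

Two thirds of every uncompressed VoIP packet is header, which is why uncompressed-header studies understate capacity so strongly.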
PHS or ROHC can significantly improve the capacity and should be used in any capacity planning or estimation.
Note that one option with VoIP traffic is silence suppression, which, if implemented, can increase the VoIP capacity by the inverse of the fraction of time the user is active (not silent). In this analysis, the number of supported users with silence suppression is therefore taken to be twice that without it.

B. Lower Layer Overhead
In this section, we analyze the overheads at the MAC and PHY layers. Basically, there is a 6-byte MAC header and, optionally, several 2-byte subheaders. The PHY overhead can be divided into DL overhead and UL overhead. Each of these three overheads is discussed next.
1) MAC Overhead: At the MAC layer, the smallest unit is the MAC protocol data unit (MPDU). As shown in Fig. 4, each PDU has at least 6 bytes of MAC header and a variable length payload consisting of a number of optional subheaders, data, and an optional 4-byte Cyclic Redundancy Check (CRC). The optional subheaders include fragmentation, packing, mesh, and general subheaders. Each of these is 2 bytes long.
In addition to generic MAC PDUs, there are bandwidth request PDUs. These are 6 bytes in length. Bandwidth requests can also be piggybacked on data PDUs as a 2-byte subheader. Note that in this analysis, we do not consider the effect of polling and/or other bandwidth request mechanisms.
Consider fragmentation and packing subheaders. As shown in Table IX, the user bytes per frame in the downlink are 219, 3.5, and 9.1 bytes for Mobile TV, VoIP, and Web, respectively. In each frame, a 2-byte fragmentation subheader is needed for all types of traffic. Packing is not used for the simple scheduler used here.
However, in the enhanced scheduler, given a range of deadlines, packing multiple SDUs is possible. Table IX also shows an example in which the deadline is taken into consideration. In this analysis, the deadlines of Mobile TV, VoIP, and Web traffic are set to 10, 60, and 250 ms, respectively. As a result, 437.9, 42.0, and 454.9 bytes are allocated per user. This configuration results in one 2-byte fragmentation subheader for Mobile TV and Web traffic, but two 2-byte packing subheaders with no fragmentation for VoIP. Table IX also shows the detailed explanation of fragmentation and packing overheads in the downlink. Note that the calculation for the uplink is very similar.
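The per-allocation sizes quoted above follow from multiplying the bytes per frame by the number of 5 ms frames in the deadline. The sketch below shows that arithmetic and the resulting MAC PDU size; the function names are illustrative, and the inputs are the rounded per-frame values from the text, so the Web result differs from 454.9 in the last digit.

```python
FRAME_MS = 5
MAC_HEADER_BYTES = 6
SUBHEADER_BYTES = 2


def allocation_bytes(bytes_per_frame, deadline_ms, frame_ms=FRAME_MS):
    """Payload accumulated when a user is served once per deadline."""
    frames = max(1, deadline_ms // frame_ms)
    return bytes_per_frame * frames


def mpdu_bytes(payload_bytes, n_subheaders):
    """Payload plus the 6-byte MAC header and 2-byte subheaders (no CRC)."""
    return payload_bytes + MAC_HEADER_BYTES + n_subheaders * SUBHEADER_BYTES


print(allocation_bytes(218.95, 10))   # Mobile TV, about 437.9 bytes
print(allocation_bytes(3.5, 60))      # VoIP, 42.0 bytes
print(allocation_bytes(9.1, 250))     # Web, about 455 bytes
print(mpdu_bytes(42, 2))              # VoIP PDU with two packing subheaders: 52
```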
2) Downlink Overhead: In the DL subframe, the overhead consists of the preamble, FCH, DL-MAP, and UL-MAP. The MAP entries can result in a significant amount of overhead since they are repeated 4 times. The WiMAX Forum recommends using compressed MAP [3], which reduces the DL-MAP entry overhead to 11 bytes including 4 bytes for CRC [1]. The fixed UL-MAP is 6 bytes long with an optional 4-byte CRC. With a repetition code of 4 and QPSK, the fixed DL-MAP and UL-MAP together take up 16 slots.
The variable part of the DL-MAP consists of one entry per burst and requires 60 bits per entry. Similarly, the variable part of the UL-MAP consists of one entry per burst and requires 52 bits per entry. These are all repeated 4 times and use only the QPSK MCS. It should be pointed out that the repetition consists of repeating slots (and not bytes). Thus, the DL and UL MAP entries together also take up 16 slots per burst.
Equations (2) to (5) show the details of the UL and DL MAP overhead computation.
Here, $r$ is the repetition factor and $S_i$ is the slot size (bytes) given the $i$th modulation and coding scheme. Note that QPSK 1/2 is used for the computation of the UL and DL MAPs.
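The slot counts quoted for the MAPs can be reproduced by rounding each MAP part up to whole slots and then repeating the slots. The sketch below assumes QPSK 1/2 (6 bytes per slot) and a repetition factor of 4; it is one way to arrive at the 16-slot figures and is not a transcription of equations (2) to (5). The name `map_slots` is hypothetical.

```python
import math


def map_slots(map_bytes, slot_bytes=6, repetition=4):
    """Round the MAP part up to whole slots, then repeat the slots."""
    return math.ceil(map_bytes / slot_bytes) * repetition


fixed_dl = map_slots(11)        # compressed DL-MAP, 11 bytes including CRC
fixed_ul = map_slots(6 + 4)     # fixed UL-MAP plus optional 4-byte CRC
per_burst_dl = map_slots(60 / 8)  # one 60-bit DL-MAP entry
per_burst_ul = map_slots(52 / 8)  # one 52-bit UL-MAP entry

print(fixed_dl + fixed_ul)          # fixed parts together
print(per_burst_dl + per_burst_ul)  # one DL entry plus one UL entry
```

Because repetition applies to slots rather than bytes, even a 7.5-byte entry costs 8 slots when it is rounded up on its own.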
3) Uplink Overhead: The UL subframe also has fixed and variable parts (see Fig. 1). Ranging and contention are in the fixed portion. Their size is defined by the network administrator. These regions are allocated not in units of slots but in units of transmission opportunities. For example, in CDMA initial ranging, one opportunity is 6 subchannels and 2 symbol times.
The other fixed portion is Channel Quality Indication (CQI) and ACKnowledgements (ACK). These regions are also defined by the network administrator. Obviously, the more fixed portions are allocated, the fewer slots are available for the user workloads. In our analysis, we allocated three OFDMA symbol columns for all fixed regions.
Each UL burst begins with a UL preamble. Typically, one OFDMA symbol is used for a short preamble and two for a long preamble. In this analysis, we do not consider one short symbol (a fraction of one slot); however, users can add an appropriate size of this symbol to the analysis.

VI. PITFALLS
Many Mobile WiMAX analyses ignore the overheads described in Section V, namely, UL-MAP, DL-MAP, and MAC overheads. In this section, we show that these overheads have a significant impact on the number of users supported. Since some of these overheads depend upon the number of users, the scheduler needs to be aware of this additional need while admitting and scheduling the users [4], [16]. We present two case studies. The first one assumes an error-free channel, while the second extends the results to a case in which different users have different error rates due to channel conditions.

A. Case Study 1: Error-Free Channel
Given the user workload characteristics and the overheads discussed so far, it is straightforward to compute the system capacity for any given workload. Using the slot capacity indicated in Table III for various MCSs, we can compute the number of users supported.
One way to compute the number of users is simply to divide the channel capacity by the bytes required by the user payload and overhead [4]. This is shown in Table X. The table assumes QPSK 1/2 MCS for all users. This can be repeated for other MCSs. The final results are shown in Fig. 5. The number of users supported varies from 2 to 82 depending upon the workload and the MCS.
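The number of users depends upon the available capacity, which depends on the MAP overhead, which in turn is determined by the number of users. To avoid this recursion, we use equations (6) to (8), which give a very good approximation for the number of supported users using a ceiling function:
$$\#DL_{slots} \ge \left\lceil \frac{DL\_MAP + \#DL_{users} \times DIE}{S_i} \right\rceil \times r + \left\lceil \frac{UL\_MAP + \#UL_{users} \times UIE}{S_i} \right\rceil \times r + \#DL_{users} \times \left\lceil \frac{D}{S_i} \right\rceil \quad (6)$$
$$\#UL_{slots} \ge \#UL_{users} \times \left\lceil \frac{D}{S_i} \right\rceil \quad (7)$$
$$D = B + MAC_{header} + Subheaders \quad (8)$$
The number of supported users is the largest value that satisfies these conditions. Here, $D$ is the data size (per frame) including overheads, $B$ is the bytes per frame, and $MAC_{header}$ is 6 bytes. $Subheaders$ are the fragmentation and packing subheaders, 2 bytes each if present. $DIE$ and $UIE$ are the sizes of the downlink and uplink MAP information elements (IEs). Note that $DL\_MAP$ and $UL\_MAP$ are the fixed MAP parts, including CRC, and are also in terms of bytes. Again, $r$ is the repetition factor and $S_i$ is the slot size (bytes) given the $i$th modulation and coding scheme. $\#DL_{slots}$ is the total number of DL slots without the preamble, and $\#UL_{slots}$ is the total number of UL slots without ranging, ACK, and CQICH.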
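For example, consider VoIP with QPSK 1/2 (slot size = 6 bytes) and a repetition factor of four. Equation (6) results in 35 users in the downlink. The derivation is as follows. Each user needs 3.5 + 6 + 2 = 11.5 bytes per frame, that is, $\lceil 11.5/6 \rceil = 2$ slots. With the 11-byte fixed DL-MAP, the 10-byte fixed UL-MAP, and MAP entries of 60 and 52 bits (7.5 and 6.5 bytes), 35 users require
$$\left\lceil \frac{11 + 35 \times 7.5}{6} \right\rceil \times 4 + \left\lceil \frac{10 + 35 \times 6.5}{6} \right\rceil \times 4 + 35 \times 2 = 184 + 160 + 70 = 414$$
slots, which fits in the 420 DL slots, whereas 36 users would require 424 slots. For the uplink, from equations (7) and (8), the 175 slots that remain after the three symbol columns of fixed regions support $\lfloor 175/2 \rfloor = 87$ users.
Finally, after calculating the number of supported users for both DL and UL, the total number of supported users is the minimum of those two numbers. In this example, the total number of supported users is 35 (the minimum of 35 and 87). In this case, the downlink is the bottleneck, mostly due to the large overhead. Together with silence suppression, the absolute number of supported users can be up to 2 × 35 = 70 users. Fig. 5 shows the number of supported users for various MCSs.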
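The VoIP example can be checked numerically. The sketch below is one reading of the downlink and uplink slot budgets that reproduces the 35 and 87 users quoted above: it assumes that MAP bytes (fixed part plus one entry per user) are summed before being rounded up to slots and repeated, that the number of UL bursts equals the number of DL bursts (symmetric VoIP), that the fixed DL-MAP and UL-MAP are 11 and 10 bytes, and that 175 UL slots remain after the fixed UL regions. All function names are hypothetical.

```python
import math


def dl_slots_needed(users, data_bytes, slot_bytes=6, repetition=4,
                    dl_map_fixed=11, ul_map_fixed=10,
                    dl_ie_bytes=60 / 8, ul_ie_bytes=52 / 8):
    """DL slots consumed by both MAPs plus one data burst per user."""
    dl_map = math.ceil((dl_map_fixed + users * dl_ie_bytes) / slot_bytes) * repetition
    ul_map = math.ceil((ul_map_fixed + users * ul_ie_bytes) / slot_bytes) * repetition
    data = users * math.ceil(data_bytes / slot_bytes)
    return dl_map + ul_map + data


def max_dl_users(total_slots, data_bytes):
    """Largest user count whose MAPs and bursts fit in the DL subframe."""
    users = 0
    while dl_slots_needed(users + 1, data_bytes) <= total_slots:
        users += 1
    return users


def max_ul_users(total_slots, data_bytes, slot_bytes=6):
    """UL carries only data bursts, so capacity is slots // slots-per-user."""
    return total_slots // math.ceil(data_bytes / slot_bytes)


voip_pdu = 3.5 + 6 + 2            # bytes/frame + MAC header + fragmentation subheader
dl_users = max_dl_users(420, voip_pdu)
ul_users = max_ul_users(175, voip_pdu)
print(dl_users, ul_users, min(dl_users, ul_users))
```

The search loop sidesteps the recursion between user count and MAP size by simply testing successive user counts, which is practical because the slot demand never decreases as users are added.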
The main problem with the analysis presented above is that it assumes that every user is scheduled in every frame. Since there is a significant per-burst overhead, this type of allocation will result in too much overhead and too little capacity. Also, since every packet (SDU) is fragmented, a 2-byte fragmentation subheader is added to each MAC PDU.
What we discussed above is a common pitfall. The analysis assumes a dumb scheduler. A smarter scheduler will try to aggregate payloads for each user and thus minimize the number of bursts. We call this the enhanced scheduler. It works as follows. Given $n$ users with any particular workload, we divide the users into $k$ groups of $n/k$ users each. The first group is scheduled in the first frame, the second group is scheduled in the second frame, and so on. The cycle is repeated every $k$ frames. Of course, $k$ should be selected to match the delay requirements of the workload.
For example, a VoIP packet is generated every 30 ms, but assuming 60 ms is an acceptable delay, we can schedule a VoIP user in every 12th Mobile WiMAX frame (recall that each Mobile WiMAX frame is 5 ms) and send two VoIP packets in one frame. The previous scheduler, by comparison, would send 1/6th of a VoIP packet in every frame, thereby aggravating the problem of small payloads. Two 2-byte packing subheaders have to be added in the MAC payload along with the two SDUs.
Table XI shows the capacity analysis for the three workloads with QPSK 1/2 MCS and the enhanced scheduler. The results for other MCSs can be similarly computed. These results are plotted in Fig. 6. Note that the number of users supported has gone up significantly. Compared to Fig. 5, there is a capacity improvement by a factor of 1 to 20 depending upon the workload and the MCS.
Proper scheduling can change the capacity by an order of magnitude. Making less frequent but bigger allocations can reduce the overhead significantly.
The number of supported users for this scheduler is derived from the same equations that were used with the simple scheduler. However, the enhanced scheduler allocates as large a size as possible given the deadlines. For example, for Mobile TV with a 10 ms deadline, instead of 219 bytes, the scheduler allocates 437.9 bytes within a single frame; for VoIP with a 60 ms deadline, instead of 3.5 bytes per frame, it allocates 42 bytes, which results in 2 packing subheaders instead of 1 fragmentation subheader.
In Table XI, the number of supported users for VoIP is 228. This number is based on the fact that 42 bytes are allocated for each user every 60 ms, so each user occupies only one frame out of every 60/5 = 12. With the configuration in Table XI, the number of supported users is
$$\left\lfloor \frac{175}{9} \right\rfloor \times \frac{60}{5} = 19 \times 12 = 228 \text{ users.}$$
With silence suppression, the absolute number of supported users is 2 × 228 = 456. Note that the number of DL users is computed using equations (6), (7), and (8), and then equation (9) can be applied. The calculations for Mobile TV and data are similar to that for VoIP.
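The figure of 228 VoIP users can be reproduced with a short calculation. The sketch below takes 175 as the slots available per frame and 9 as the slots per user, which is what a 42-byte allocation plus a 6-byte MAC header and two 2-byte packing subheaders (52 bytes) needs at 6 bytes per slot; this interpretation of the two numbers, and the function name, are assumptions.

```python
import math


def enhanced_users(slots_per_frame, pdu_bytes, deadline_ms,
                   slot_bytes=6, frame_ms=5):
    """Users per frame times the number of frames in one deadline period."""
    slots_per_user = math.ceil(pdu_bytes / slot_bytes)
    users_per_frame = slots_per_frame // slots_per_user
    return users_per_frame * (deadline_ms // frame_ms)


voip_pdu = 42 + 6 + 2 * 2   # payload + MAC header + two packing subheaders
users = enhanced_users(175, voip_pdu, 60)
print(users, 2 * users)     # without and with silence suppression
```

Note that the per-frame user count is rounded down (19 users), since a partially served user does not count.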
The per-user overheads impact the downlink capacity more than the uplink capacity. The downlink subframe has DL-MAP and UL-MAP entries for all DL and UL bursts, and these entries can take up a significant part of the capacity, so minimizing the number of bursts increases the capacity.
Note that there is a limit to the aggregation of payloads and the minimization of bursts. First, the delay requirements for the payload should be met, and so a burst may have to be scheduled even if the payload size is small. In these cases, multi-user bursts, in which the payload for multiple users is aggregated in one DL burst with the same MCS, can help reduce the number of bursts. This is allowed by the IEEE 802.16e standards and applies only to the downlink bursts.
The second consideration is that the payload cannot be aggregated beyond the frame size. For example, with QPSK 1/2, a Mobile TV application will generate enough load to fill the entire DL subframe every 10 ms or every 2 frames. This is much smaller than the required delay of 30 ms between the frames.

B. Case Study 2: Imperfect Channel
In Section A, we saw that aggregation has more impact on performance with higher MCSs (which allow higher capacity and hence more aggregation). However, it is not always possible to use these higher MCSs. The MCS is limited by the quality of the channel. As a result, we present a capacity analysis assuming a mix of channels with varying quality, resulting in different levels of MCS for different users.
Table XII lists the channel parameters used in a simulation by Leiba et al. [17]. They showed that under these conditions, the number of users in a cell that were able to achieve any particular MCS was as listed in Table XIII. Two cases are listed: single-antenna systems and two-antenna systems.
Average bytes per slot in each direction can be calculated by summing the product (percentage of users with an MCS × number of bytes per slot for that MCS). For 1-antenna systems, this gives 10.19 bytes for the downlink and 8.86 bytes for the uplink. For 2-antenna systems, we get 12.59 bytes for the downlink and 11.73 bytes for the uplink.
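The weighted average can be sketched as follows. The MCS shares used here are purely hypothetical placeholders, not the values of Table XIII, so the printed result is not expected to match the 10.19 or 12.59 bytes quoted above; only the method is illustrated.

```python
def average_bytes_per_slot(mcs_share, bytes_per_slot):
    """Share-weighted mean of bytes per slot over the MCS mix."""
    total_share = sum(mcs_share.values())
    weighted = sum(share * bytes_per_slot[mcs]
                   for mcs, share in mcs_share.items())
    return weighted / total_share


# Bytes per slot from the 48-data-symbol rule (bits x rate x 48 / 8).
BYTES_PER_SLOT = {"QPSK 1/2": 6, "16-QAM 1/2": 12, "64-QAM 3/4": 27}

# Hypothetical user distribution, for illustration only.
example_share = {"QPSK 1/2": 0.5, "16-QAM 1/2": 0.3, "64-QAM 3/4": 0.2}

print(round(average_bytes_per_slot(example_share, BYTES_PER_SLOT), 2))
```

The resulting average can then stand in for the slot size in the capacity calculations, which is how a mixed-channel population is reduced to a single effective slot capacity.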
Table XIV shows the number of users supported for both simple and enhanced schedulers. The results show that the enhanced scheduler still increases the number of users by an order of magnitude, especially for VoIP and data users.

VII. CONCLUSIONS
In this paper, we explained how to compute the capacity of a Mobile WiMAX system and account for various overheads. We illustrated the methodology using three sample workloads consisting of Mobile TV, VoIP, and data users. An analysis such as the one presented in this paper can be easily programmed in a simple program or a spreadsheet, and the effect of various parameters can be analyzed instantaneously. This can be used to study the sensitivity to various parameters so that parameters that have a significant impact can be analyzed in detail by simulation. This analysis can also be used to validate simulations.
However, the analysis makes a few simplifying assumptions: it does not model the effect of the bandwidth request mechanism or two-dimensional downlink mapping, and it mixes slot-based and byte-based calculations imprecisely. Moreover, we do not consider (H)ARQ [18].
In addition, the number of supported users is calculated with the assumption that there is only one traffic type. Finally, the fixed UL-MAP is always present in the DL subframe even when there is no UL traffic, as with Mobile TV [4].
We showed that proper accounting of overheads is important in capacity estimation. A number of methods are available to reduce these overheads, and these should be used in all deployments. In particular, robust header compression or payload header suppression and compressed MAPs are examples of methods for reducing the overhead.
Proper scheduling of user payloads can change the capacity by an order of magnitude. The users should be scheduled so that their numbers of bursts are minimized while still meeting their delay constraints. This reduces the overhead significantly, particularly for small packet traffic such as VoIP.
We also showed that our analysis can be used for loss-free channels as well as for noisy channels with loss.

Fig. 5. Number of users supported in a lossless channel (Simple scheduler)

Fig. 6. Number of users supported in a lossless channel (Enhanced scheduler)

TABLE X. EXAMPLE OF CAPACITY EVALUATION USING A SIMPLE SCHEDULER

TABLE XI. EXAMPLE OF CAPACITY EVALUATION USING AN ENHANCED SCHEDULER