ANALYSIS OF A MULTI-SERVER QUEUEING MODEL OF ABR

In this paper we present a queueing model for the performance analysis of Available Bit Rate (ABR) traffic in Asynchronous Transfer Mode (ATM) networks. We consider a multi-channel service station with two types of customers, denoted by high priority and low priority customers. In principle, high priority customers have preemptive priority over low priority customers, except on a fixed number of channels that are reserved for low priority traffic. The arrivals occur according to two independent Poisson processes, and service times are assumed to be exponentially distributed. Each high priority customer requires a single server, whereas low priority customers are served in processor sharing fashion. We derive the joint distribution of the numbers of customers (of both types) in the system in steady state. Numerical results illustrate the effect of high priority traffic on the service performance of low priority traffic.


Introduction
It is our pleasure to contribute this paper to the special issue in honor of Ryszard Syski. During a time span of more than forty years, Professor Syski has made many lasting contributions to Applied Probability in general and Teletraffic in particular.
The second author fondly remembers the very pleasant cooperation with Ryszard Syski in editing the book Queueing Theory and its Applications: Liber Amicorum for J. W. Cohen (North-Holland Publ. Cy., Amsterdam, 1988).
The diverse characteristics and service requirements of the different traffic types that are carried by ATM (Asynchronous Transfer Mode) networks have led to the definition of different service categories that should be offered to users of such a network. We briefly discuss those differences, distinguishing three large categories: Constant Bit Rate (CBR) traffic, Variable Bit Rate (VBR) traffic and Available Bit Rate (ABR) traffic.
The CBR service class guarantees a fixed pre-determined transmission capacity to its users. Therefore, this service is useful for traffic that requires both very small (or no) delays and very small (or no) losses. At the burst-level (where we distinguish different bursts of traffic coming from the same connection, but not the separate ATM-cells that form a burst), it is reasonable to assume that all CBR traffic requires a fixed amount of capacity over time. In all further considerations, we will leave out the CBR traffic and use the term "capacity" to indicate the total capacity minus the capacity reserved for CBR traffic.
For VBR traffic, we make a subdivision into real-time and non real-time connections. For both these subclasses, the users must specify many characterizing parameters such as minimum cell rate, mean cell rate, peak cell rate and maximum burst size. The difference lies in the requirements. The main issue for real-time connections such as voice and possibly video is the delay of the transmission; the loss of small amounts of information during the transmission is less important for these connections. This traffic lends itself very well to multiplexing. On the other hand, non real-time VBR traffic requires small losses and the delays are less important. To ensure that losses are small, large buffers are used to store non real-time VBR traffic when the communication network is heavily loaded.
The last category, ABR traffic, was introduced to cope with specific problems that arise when transmitting data. For this traffic, losses lead to retransmission of data (because of the extreme sensitivity to losses), which introduces a lot of overhead in implementations. Since transmission delays are of less importance for data traffic, the setting of non real-time VBR seems to be the appropriate one to carry data traffic. However, data traffic is very bursty and the required parameters for VBR connections are difficult to specify by the users. For ABR connections, no parameters need to be specified. Only a small amount of capacity is reserved for the transmission of ABR traffic. Additionally, the capacity that is not currently being required by VBR (and CBR) traffic is used for ABR traffic. When the total capacity currently available to ABR is too small, ABR traffic is stored in very large buffers, ensuring a small loss probability, until the available capacity increases again. The advantage here is that ABR traffic gets all the capacity that is left over. For the server, this means a higher utilization of the network's resources. As pointed out above, the main service guarantee for ABR traffic is a very small loss fraction or, in principle, no loss at all. No guarantee can be given on transmission delays.
A special issue of ABR is that the available capacity should be shared fairly among all ABR users. In queueing models, it seems reasonable to incorporate this with the queue discipline of processor sharing. In this discipline, all "customers" receive an equal share of the service capacity.
In addition to the large storage buffers, some feedback control mechanism can be used to keep the loss of information small. The buffers can store incoming data that cannot be transmitted immediately, due to a temporarily overloaded system. Feedback control can be used to slow down data sources when the buffers are heavily loaded and an overflow may occur. We refer to the ATM Forum [1,2] for more detailed specifications of ABR. Since the conceptual introduction of ABR, many papers on the subject have been published. Most studies so far emphasize the modeling and (feedback) control aspects, see for instance Iliadis [12] and Ritter [20]. In [21], Ritter investigates the problem of dimensioning the buffer for ABR traffic in order to avoid large losses. In [22] and [23], Ritter considers the case with feedback control, under the assumption that the source of ABR traffic is saturated (i.e., it sends continuously at the allowed rate).
A drawback in most studies is the assumption of a fixed available capacity for transmission of ABR traffic. As was pointed out above, one of the essential features of ABR is that it makes use of the capacity that is left over by VBR traffic. Therefore, there is a need for a detailed performance analysis of ABR in the presence of other traffic. In the present paper, our goal is to devise and analyze a model that captures the influence of real-time VBR traffic on ABR traffic. We compare the performance of the ABR traffic in our model under variable available capacity, with the performance in an equivalent model with fixed available capacity.
Our model is basically a multi-server queue with two types of customers: (i) high priority customers (real-time VBR traffic); and (ii) low priority customers (ABR traffic). We assume that the high priority customers have a waiting room with a finite, typically small capacity (thus modeling the real-time requirement) and each accepted customer is served by a single server. Low priority customers have an infinite waiting room (buffer) and equally share the remaining capacity according to the processor sharing principle; this models the large storage buffers for ABR traffic and the fair sharing of the available capacity between ABR users. The servers at the service station are divided into two groups: (i) there are $N$ servers that are dedicated principally to the high priority customers (we call these the normal servers); and (ii) $N_L$ servers are purely reserved for the low priority traffic (we call these the L-servers). On the normal servers, the high priority customers have preemptive priority over the low priority customers.
We point out that this is a call-level model: A customer represents a request of an ABR source to transmit data, and the service requirement of the customer is identified with the amount of data to be transmitted. In our analysis, we assume that arrivals occur according to two independent Poisson processes. This assumption is justified in the case where many sources are connected to the communication network.
Although we present the model in the context of (future) ABR traffic, it can just as easily be seen in the context of existing situations, where real-time VBR has priority over non real-time VBR. In this case, the processor sharing discipline for the low priority traffic should be replaced by the First Come First Served (FCFS) discipline. Also, the processor sharing among ABR sources is interesting in the light of per VC (Virtual Connection) queueing, where sources do not queue behind one another, but each source gets a separate access to the server (parallel to one another). The feature of processor sharing can further be generalized to weighted fair queueing (generalized processor sharing), where the total capacity is divided between the active sources according to some weighting factors.
Related two-dimensional Markov models have been studied in a number of papers. The case where both types of customers have an infinite waiting space, and within each customer type the service discipline is FCFS, was solved first by Mitrani and King [15], and later by Gail, Hantler, and Taylor [8]. The non-preemptive variant of that model was studied by Gail, Hantler, and Taylor [7]. Falin, Khalil, and Stanford [5] treated the preemptive case with processor sharing among the low priority customers. A discrete-time variant modeled as an M/G/1-type Markov Chain is considered in Gail, Hantler, Konheim, and Taylor [6]. A more extensive treatment of the spectral analysis of M/G/1-type Markov Chains is given in Gail, Hantler, and Taylor [9]. A model related to the one presented in this paper, addressing the case with finite buffer capacity, is treated in Nunez-Queija [18]. In [3], Blaabjerg et al. consider a model similar to ours and give various performance measures in terms of the steady-state distribution, rather than analyzing this distribution in greater detail. Our main goal is to give a detailed analysis of the steady-state distribution itself.
In our analysis we are inspired by Gail, Hantler, and Taylor [8], but we make use of methods from other approaches. Instead of transforming the involved distributions into generating functions, the present work focuses directly on the distribution itself. It does so relying mainly on the matrix-geometric approach of Neuts [17] and the spectral expansion approach (see for instance Mitrani and Chakka [14] and Mitrani and Mitra [16]).
The paper is organized as follows. We give a description of the model in Section 2. In Section 3, we mention some relevant results of the theory of matrix-geometric solutions for the steady-state analysis of GI/M/1-type Markov Chains developed by Neuts [17]. We use this in Section 4 as a starting point of our analysis. In Section 5, we give a complete characterization of the joint distribution of the numbers of customers of both types in the system at steady state. Numerical results are presented in Section 6 to illustrate the effect of high priority traffic on the service performance of low priority traffic.

The Model
Consider a service station consisting of $N + N_L$ identical servers ($N$ and $N_L$ both are positive integers) that are divided into two groups: (i) a number of $N$ servers, which we call the normal servers; and (ii) the remaining $N_L$ servers, henceforth called L-servers. Two types of customers (high and low priority customers) require service from the station. At the station, there is a waiting room for $K$ high priority customers ($K$ being a nonnegative integer) and a room of infinite capacity for low priority customers.
High priority customers arrive at the station according to a Poisson process with rate $\lambda_H$. If the $N$ normal servers are all occupied by other high priority customers, then a newly arrived high priority customer takes his place in the finite waiting room. If there are already $K$ other high priority customers in the waiting room, then the new customer is rejected and leaves the system without receiving service. If there are fewer than $N$ other high priority customers currently being served, then a new high priority customer is immediately taken into service by one server. Also, if the service of a high priority customer is completed and the waiting room is not empty, then one of the waiting high priority customers immediately enters service. Service times of the high priority customers are assumed to be exponentially distributed with mean $1/\mu_H$ and independent of everything else.
Low priority customers arrive according to a Poisson process with rate $\lambda_L$, independent of high priority customers. Their service requirement is assumed to be exponentially distributed with mean $1/\mu_L$, independent of everything else. Furthermore, they are served according to the processor sharing discipline by the L-servers, and the normal servers that are not occupied by a high priority customer. Thus, if there are $i$ high priority and $j \geq 1$ low priority customers present, then each of the low priority customers receives service at rate $\frac{1}{j}\,[N_L + \max(N - i, 0)]$ (the servers work at unit rate).
We will further use the notation $\rho_H := \lambda_H/\mu_H$ and $\rho_L := \lambda_L/\mu_L$. We are interested in the steady-state behavior of the numbers of both types of customers in the system. Let $X_H(t)$ ($X_L(t)$) be the number of high priority (low priority) customers present in the system at time $t$. Then the process $(X_H(t), X_L(t))$ is an irreducible and aperiodic Markovian process. Moreover, we note that the high priority customers are not influenced by the low priority customers, and therefore follow an M/M/N/(N + K) queue; the steady-state distribution of $X_H$ is referred to as (1).
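As a numerical illustration of the high priority marginal, the following minimal Python sketch evaluates the steady-state probabilities $P\{X_H = i\}$, assuming the standard birth-death form of the M/M/N/(N + K) queue (arrival rate $\lambda_H$, departure rate $\min(i, N)\mu_H$ in state $i$). The function name, argument names and parameter values are hypothetical and serve only as an example.

```python
from math import factorial

def high_priority_distribution(lam_h, mu_h, n_servers, k_wait):
    """Return [P{X_H = i} for i = 0..N+K] for an M/M/N/(N+K) queue.

    Assumption: standard birth-death form with arrival rate lam_h and
    departure rate min(i, N) * mu_h in state i.
    """
    rho_h = lam_h / mu_h
    weights = []
    for i in range(n_servers + k_wait + 1):
        if i <= n_servers:
            # All i customers are in service.
            weights.append(rho_h ** i / factorial(i))
        else:
            # N customers in service, i - N customers waiting.
            weights.append(
                rho_h ** i / (factorial(n_servers) * n_servers ** (i - n_servers))
            )
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical parameters, chosen only for illustration.
p = high_priority_distribution(lam_h=2.0, mu_h=1.0, n_servers=3, k_wait=1)
mean_high = sum(i * p_i for i, p_i in enumerate(p))  # E[X_H]
```

The unnormalized weights are built first and normalized at the end, which avoids writing out the normalizing constant explicitly.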

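The process $(X_H(t), X_L(t))$ is ergodic if and only if the following (intuitive) condition holds:
$$\rho_L < N_L + N - \mathrm{E}[\min(X_H, N)], \tag{2}$$
where $X_H$ denotes the number of high priority customers in steady state. We come back to this at the end of this section.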
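The ergodicity condition can be checked numerically. The sketch below assumes that the condition takes the mean-drift form "$\rho_L$ is smaller than the mean number of servers available to low priority customers", with $N_L + \max(N - i, 0)$ servers available when $i$ high priority customers are present; this form, and all names used, are assumptions made for illustration.

```python
from math import factorial

def mean_low_priority_capacity(lam_h, mu_h, n, k, n_l):
    """Mean number of servers usable by low priority traffic in steady state."""
    rho_h = lam_h / mu_h
    # Unnormalized M/M/N/(N+K) weights for the high priority customers.
    weights = [
        rho_h ** i / factorial(i) if i <= n
        else rho_h ** i / (factorial(n) * n ** (i - n))
        for i in range(n + k + 1)
    ]
    total = sum(weights)
    # In state i: the N_L reserved servers plus the normal servers
    # not held by high priority customers (assumed capacity rule).
    return sum(w / total * (n_l + max(n - i, 0)) for i, w in enumerate(weights))

def is_ergodic(lam_l, mu_l, lam_h, mu_h, n, k, n_l):
    """Assumed stability test: rho_L below the mean available capacity."""
    return lam_l / mu_l < mean_low_priority_capacity(lam_h, mu_h, n, k, n_l)

# Hypothetical example: one normal server, no waiting room, no L-servers.
capacity = mean_low_priority_capacity(1.0, 1.0, n=1, k=0, n_l=0)
stable = is_ergodic(0.4, 1.0, 1.0, 1.0, n=1, k=0, n_l=0)
```

In this small example the single server is free half of the time, so a low priority load of 0.4 passes the test while a load of 0.6 would not.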
Note that $\pi_j$ is associated with the states in which $j$ low priority customers are present.
This partition enables us to write the equilibrium vector as $\pi = (\pi_0, \pi_1, \pi_2, \ldots)$. The corresponding infinitesimal generator is given by:
The matrices $T^{(+)}$, $T^{(-)}$, $T^{(0)}$ and $Q_{00}$ are of dimension $N + K + 1$. $T^{(+)}$, with $Q_H$ being the infinitesimal generator of the M/M/N/(N + K) queue of the high priority traffic:
then all eigenvalues of $R$ should lie inside the complex unit disc; and (ii) $Q_{00} + RT^{(-)}$ should have a positive vector in its left null space. The same theorem also gives us that the first statement is equivalent to (2). As for the second statement, it can be seen that if the first statement holds, then $Q_{00} + RT^{(-)}$ is a generator, and considering $Q_{00}$ we see that it is an irreducible generator. Therefore the second requirement is immediately satisfied. We will not go further into the details of this, but refer the interested reader to Neuts [17]. In the sequel, we assume that (2) holds.

Preliminaries
From the final results in Section 2, it is clear that the unique probability vector $\pi = (\pi_0, \pi_1, \pi_2, \ldots)$ satisfying $\pi Q = 0$ has the matrix-geometric form:
$$\pi_j = \pi_0 R^j, \qquad j = 0, 1, 2, \ldots, \tag{7}$$
where the matrix $R$ is the minimal nonnegative solution to (6). Equation (7) can also be argued using basic results on irreducible Markov chains. In our analysis we shall use a different, but highly related representation based on the spectral expansion approach, see for example Mitrani and Chakka [14] and Mitrani and Mitra [16]. The essence of this approach is that, when $R$ has $N + K + 1$ distinct eigenvalues $r_k$ with corresponding left eigenvectors $v_k$, we can rewrite Relation (7) to the "spectral expansion" form:
$$\pi_j = \sum_{k=0}^{N+K} \alpha_k (r_k)^j v_k, \qquad j = 0, 1, 2, \ldots, \tag{8}$$
for suitable coefficients $\alpha_k$.
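The matrix-geometric representation can be explored numerically. The following NumPy sketch is an illustration under stated assumptions, not a transcription of the generator used in the paper: it assumes the blocks $T^{(+)} = \lambda_L I$, $T^{(-)} = \mu_L\,\mathrm{diag}(N_L + \max(N - i, 0))$, $T^{(0)} = Q_H - T^{(+)} - T^{(-)}$ and $Q_{00} = Q_H - T^{(+)}$, and it assumes that $R$ solves $T^{(+)} + R\,T^{(0)} + R^2 T^{(-)} = 0$. The matrix $R$ is obtained by a plain fixed-point iteration started at zero; all function names and parameter values are hypothetical.

```python
import numpy as np

def build_blocks(lam_h, mu_h, lam_l, mu_l, n, k, n_l):
    """Assumed QBD blocks; rows/columns index the number of high priority customers."""
    m = n + k + 1
    q_h = np.zeros((m, m))
    for i in range(m):
        if i + 1 < m:
            q_h[i, i + 1] = lam_h             # high priority arrival
        if i > 0:
            q_h[i, i - 1] = min(i, n) * mu_h  # high priority departure
        q_h[i, i] = -q_h[i].sum()
    t_plus = lam_l * np.eye(m)                # low priority arrival
    t_minus = mu_l * np.diag([float(n_l + max(n - i, 0)) for i in range(m)])
    t_zero = q_h - t_plus - t_minus
    q_00 = q_h - t_plus
    return q_00, t_plus, t_zero, t_minus

def solve_r(t_plus, t_zero, t_minus, tol=1e-12, max_iter=200000):
    """Fixed-point iteration for T+ + R T0 + R^2 T- = 0, started at R = 0."""
    r = np.zeros_like(t_plus)
    t_zero_inv = np.linalg.inv(t_zero)
    for _ in range(max_iter):
        r_next = -(t_plus + r @ r @ t_minus) @ t_zero_inv
        if np.max(np.abs(r_next - r)) < tol:
            return r_next
        r = r_next
    raise RuntimeError("iteration for R did not converge")

def boundary_vector(q_00, r, t_minus):
    """Solve pi_0 (Q00 + R T-) = 0 with normalization sum_j pi_0 R^j e = 1."""
    m = r.shape[0]
    a = q_00 + r @ t_minus
    # Replace one (redundant) balance equation by the normalization column.
    a[:, -1] = np.linalg.solve(np.eye(m) - r, np.ones(m))
    rhs = np.zeros(m)
    rhs[-1] = 1.0
    return np.linalg.solve(a.T, rhs)

# Hypothetical parameters, for illustration only.
q_00, t_plus, t_zero, t_minus = build_blocks(1.0, 1.0, 1.0, 1.0, n=1, k=0, n_l=1)
r = solve_r(t_plus, t_zero, t_minus)
pi_0 = boundary_vector(q_00, r, t_minus)
pi_3 = pi_0 @ np.linalg.matrix_power(r, 3)   # states with j = 3 low priority customers
eigenvalues = np.sort(np.linalg.eigvals(r).real)
```

A useful sanity check is that summing the vectors $\pi_0 R^j$ over $j$, that is $\pi_0 (I - R)^{-1}$, reproduces the marginal distribution of the high priority customers.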

Spectral Analysis
In this section we investigate the eigenvalues of $R$. In the ergodic case, all these eigenvalues lie inside the complex unit disc (see Neuts [17]). We shall show that there are $N + K + 1$ of them, and that they are all real and positive.

The starting point of the analysis is (11). We investigate the zeros of $\det[T(z)]$, showing that there are $2(N + K + 1)$ zeros: $N + K + 1$ zeros in $(0, 1)$, one at $z = 1$, and $N + K$ in $(1, \infty)$. The zeros in $(0, 1)$ are then identified with the eigenvalues of $R$.
Proof: Note that $T(z)$ is a tri-diagonal matrix with off-diagonal elements: 2, ..., $N + K$. We denote the $i$th diagonal element $T(z)_{ii}$ by $t_i(z)$: Here, $i = 0, 1, \ldots, N + K - 1$. We now observe that the matrix $T(z)$ is similar to a real symmetric matrix (i.e., (19)). First, since $S(z)$ is symmetric, all its eigenvalues are real, and it is non-defective (the geometric multiplicity of each eigenvalue is equal to the algebraic multiplicity). Second, since $S(z)$ is tri-diagonal, with non-zero elements directly above and below the diagonal, each eigenvalue has a unique eigenvector (up to multiplication by a scalar), i.e., the geometric multiplicity of each eigenvalue is 1. Combining these two facts, we are done.
Since the eigenvalues of $T(z)$ and $S(z)$ coincide, the same holds for $T(z)$. □
The fact that the eigenvalues of $T(z)$ are real for real $z$ simplifies the analysis considerably. In the sequel, we only consider the eigenvalues as real functions of the real variable $z$. Therefore, for real $z$, denote the eigenvalues of $T(z)$ by:
Since the $\tau_k(z)$ are continuous functions of $z$, together with (15) this immediately gives us that all the $\tau_k(z)$ for $k = 0, 1, \ldots, N + K - 1$ cross the horizontal axis (at least once) somewhere in $(0, 1)$.
If $z$ increases to infinity, the matrix $T(z)$ becomes strictly diagonally dominant with positive diagonal elements (the diagonal elements are convex quadratic functions in $z$ and the off-diagonal elements are linear in $z$), and so for $z$ large enough, all the eigenvalues of $T(z)$ are positive. Therefore, all the $\tau_k(z)$ for $k = 0, 1, \ldots, N + K - 1$ must cross the horizontal axis again somewhere in $(1, \infty)$. □
Theorem 4.3: Under the ergodicity condition in (2), $\tau_{N+K}(z) = 0$ for some $z \in (0, 1)$.

Proof:
Because of the continuity of $\tau_{N+K}(z)$ and the fact that $\tau_{N+K}(0) = \lambda_L > 0$, it is sufficient to show that $\tau_{N+K}(1-) < 0$. First we write:
where $g(z)$ is the determinant of the matrix obtained by replacing the last column of $T(z)$ by the sum of all columns and then dividing that column by $1 - z$:
We want to evaluate $g(1)$. Therefore, we manipulate the above matrix evaluated in $z = 1$. First divide the last column by $\mu_L$, and all the other columns by $\mu_H$. Then add to each column (except for the first and the last one) all columns to the left of it. We now have:
The last equality follows by expanding the determinant in its last column. Rearranging some terms, we rewrite this to:
Under the ergodicity condition in (2):
On the other hand, $\det[T(1-)] = \prod_{k=0}^{N+K} \tau_k(1-)$, and we know that $\tau_k(1-) < 0$ (because of continuity and $\tau_k(1) < 0$) for $k = 0, 1, \ldots, N + K - 1$. Thus, we have proved that $\tau_{N+K}(1-) < 0$, and hence that $\tau_{N+K}(z)$ has a zero in $(0, 1)$. □
Theorem 4.4: $\det[T(z)]$ has $N + K + 1$ roots in $(0, 1)$, one at $z = 1$, and $N + K$ in $(1, \infty)$. The roots inside $(0, 1)$ are precisely the eigenvalues of $R$.
Proof: By Theorems 4.2 and 4.3, we have found $2(N + K + 1)$ roots of $\det[T(z)]$ with the required positions. Since the degree of $\det[T(z)]$ is $2(N + K + 1)$ these are all the roots, proving the first assertion. Using (11), we see that the roots of the characteristic polynomial of $R$ appear with at least the same multiplicity in $\det[T(z)]$. Since all the roots of $\det[T(z)]$ have multiplicity one, the second assertion follows. □
$[0, \infty)$, it follows that (at least) $N - 1$ of them must cross the horizontal axis somewhere in $w \in (0, 1)$.
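The root count of Theorem 4.4 can be checked numerically on a small instance. The sketch below assumes that $T(z) = T^{(+)} + zT^{(0)} + z^2T^{(-)}$ with $T^{(+)} = \lambda_L I$, $T^{(-)} = \mu_L\,\mathrm{diag}(N_L + \max(N - i, 0))$ and $T^{(0)} = Q_H - T^{(+)} - T^{(-)}$; this block structure is an assumption made for illustration. The roots of $\det[T(z)]$ are computed as eigenvalues of a companion linearization of the quadratic matrix polynomial, which requires $N_L \geq 1$ so that $T^{(-)}$ is invertible. Function names and parameter values are hypothetical.

```python
import numpy as np

def det_roots(lam_h, mu_h, lam_l, mu_l, n, k, n_l):
    """Roots of det[T+ + z T0 + z^2 T-] via a companion linearization.

    Assumes n_l >= 1, so that T- is invertible.
    """
    m = n + k + 1
    q_h = np.zeros((m, m))
    for i in range(m):
        if i + 1 < m:
            q_h[i, i + 1] = lam_h
        if i > 0:
            q_h[i, i - 1] = min(i, n) * mu_h
        q_h[i, i] = -q_h[i].sum()
    t_plus = lam_l * np.eye(m)
    t_minus = mu_l * np.diag([float(n_l + max(n - i, 0)) for i in range(m)])
    t_zero = q_h - t_plus - t_minus
    t_minus_inv = np.linalg.inv(t_minus)
    # With y = z x, T(z) x = 0 becomes an ordinary eigenproblem for (x, y).
    companion = np.block([
        [np.zeros((m, m)), np.eye(m)],
        [-t_minus_inv @ t_plus, -t_minus_inv @ t_zero],
    ])
    return np.sort_complex(np.linalg.eigvals(companion))

def classify(roots, tol=1e-6):
    """Count roots in (0, 1), at z = 1, and in (1, infinity)."""
    real = roots.real
    inside = int(np.sum((real > 0) & (real < 1 - tol)))
    at_one = int(np.sum(np.abs(real - 1) <= tol))
    outside = int(np.sum(real > 1 + tol))
    return inside, at_one, outside

# Hypothetical parameters: N = 1, K = 0, N_L = 1, all rates equal to 1.
roots = det_roots(1.0, 1.0, 1.0, 1.0, n=1, k=0, n_l=1)
counts = classify(roots)
```

For this instance $\det[T(z)] = (z - 1)(2z^3 - 8z^2 + 6z - 1)$, so `counts` should come out as two roots in $(0, 1)$, one at $z = 1$ and one in $(1, \infty)$, in line with the theorem for $N + K + 1 = 2$.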

The Equilibrium Distribution
In Section 4, we have shown that $R$ has $N + K + 1$ different eigenvalues in the interval $(0, 1)$; therefore, the equilibrium distribution can be written as in (8). We order the eigenvalues of $R$ as $0 < r_0 < r_1 < \cdots < r_{N+K} < 1$ (note that $r_k$ is the root of
The equilibrium distribution is fully determined as soon as we have $\pi_0$, which must satisfy:
$$\pi_0 [Q_{00} + RT^{(-)}] = 0. \tag{19}$$
We already mentioned at the end of Section 2 that $Q_{00} + RT^{(-)}$ is an irreducible generator, and therefore (19) has a positive solution, which is unique up to multiplication by a scalar. Obviously, if we let $e$ be the $(N + K + 1)$-dimensional vector with all elements equal to 1, it must be that:
$\alpha [VQ_{00} + \Lambda VT^{(-)}] = 0$. This determines $\alpha = (\alpha_0, \alpha_1, \ldots, \alpha_{N+K})$.
An alternative way of finding the coefficients $\alpha_k$ in the present model is by using (1). Denoting by $p$ the vector $(p_0, p_1, \ldots, p_{N+K})$, with $p_i = P\{X_H = i\}$ (which are known quantities, see (1))

Numerical Results
In this section, we present some numerical results to illustrate the influence of varying server availability on the performance of low priority traffic. For normalization purposes we choose $\mu_L = 1$, and in all cases we take $N = 17$ (in accordance with data supplied by KPN Research for The Netherlands). Further, we choose the extreme cases where there is no waiting room for high priority customers ($K = 0$), and no reserved capacity for low priority customers ($N_L = 0$). Before discussing the numerical experiments, we first make some intuitive remarks about the cases when $\lambda_H$ and $\mu_H$ are very large (or very small) compared to $\lambda_L$ and $\mu_L$. These intuitions can be proved formally. First, we fix $\lambda_L$, $\mu_L$, and $\rho_H$ and let $\mu_H$ (or equivalently $\lambda_H$) go to infinity. Note that with fixed $\rho_H$, the mean number of servers available to the low priority customers, $N - \mathrm{E}[X_H]$, is also fixed. As $\mu_H \to \infty$, low priority customers are (with respect to the service times of high priority customers) in the system so long that the mean number of available servers during the sojourn time of a low priority customer will be close to the mean number of available servers in steady state (i.e., close to $N - \mathrm{E}[X_H]$). Therefore, it is to be expected that the low priority traffic in the limit (as $\mu_H \to \infty$) experiences the system as if it were an M/M/1 processor-sharing queue with server capacity $c = N - \mathrm{E}[X_H]$. For the queue length distribution, this model coincides with that of the regular M/M/1 queue with traffic load $\rho_L/c$.
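The limiting regime $\mu_H \to \infty$ described above is easy to evaluate. The sketch below assumes $K = 0$ and $N_L = 0$ as in this section, so that the high priority customers form an Erlang loss system with $\mathrm{E}[X_H] = \rho_H(1 - B)$, where $B$ is the Erlang blocking probability, and it uses the standard M/M/1 mean queue length $a/(1 - a)$ at load $a = \rho_L/c$. Function names and the example values are hypothetical.

```python
def erlang_b(n, rho):
    """Erlang blocking probability for n servers and offered load rho,
    computed with the usual numerically stable recursion."""
    b = 1.0
    for s in range(1, n + 1):
        b = rho * b / (s + rho * b)
    return b

def limiting_mean_low_priority(rho_l, rho_h, n):
    """Mean number of low priority customers in the limit mu_H -> infinity.

    Assumes K = 0 and N_L = 0: capacity c = N - E[X_H], load a = rho_L / c.
    """
    mean_high = rho_h * (1.0 - erlang_b(n, rho_h))  # carried high priority load
    capacity = n - mean_high
    load = rho_l / capacity
    if load >= 1.0:
        raise ValueError("low priority load exceeds the mean available capacity")
    return load / (1.0 - load)

# Hypothetical small example: one server, rho_H = 1, rho_L = 0.25.
limit_mean = limiting_mean_low_priority(rho_l=0.25, rho_h=1.0, n=1)
```

In the small example the server is available to low priority traffic half of the time, the load is 0.5, and the limiting mean number of low priority customers equals 1.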
On the other hand, if we let $\mu_H \to 0$ (again for fixed $\lambda_L$, $\mu_L$, and $\rho_H$), the opposite happens: the number of servers available to low priority customers changes very slowly compared to their sojourn times. An arriving low priority customer finding no available server (there are $N$ high priority customers present) must wait until one becomes available before receiving any service. The mean of this waiting time is $1/(N\mu_H)$ and tends to infinity as $\mu_H \to 0$. Since the probability of finding all servers occupied is positive (and completely determined by $\rho_H$), the expected sojourn time of the low priority customers also goes to infinity, and by Little's law, so does $\mathrm{E}[X_L]$.
In our experiments, we are interested in the behavior of the mean and variance of the number of low priority customers in the system, at some fixed system load $\rho := \rho_L + \mathrm{E}[X_H]$. Therefore, for different values of $\mu_H$ and with $\mu_L$ normalized to 1, we vary $\lambda_L$ and $\lambda_H$, keeping $\rho$ constant.
In both Figures 1 and 2, the top curve belongs to the case PH = 5, the second to $\mu_H = 1$, the third to $\mu_H = 5$, and the bottom curve to $\mu_H = \infty$. We see that $\mathrm{E}[X_L]$ and $\mathrm{var}[X_L]$ are particularly sensitive to $\mu_H$ when $\rho_L$ and $\mathrm{E}[X_H]$ are of the same order.
In Figures 3 and 4, the same procedure is repeated for a system load of $\rho$ = /oN. Note that from (20)