Modeling and Optimization of M/G/1-Type Queueing Networks: An Efficient Sensitivity Analysis Approach

A mathematical model for M/G/1-type queueing networks with multiple user applications and limited resources is established. The goal is to develop a dynamic distributed algorithm for this model, which supports all data traffic as efficiently as possible and makes optimally fair decisions about how to minimize the network performance cost. An online policy gradient optimization algorithm based on a single sample path is provided to avoid suffering from a "curse of dimensionality". The asymptotic convergence properties of this algorithm are proved. Numerical examples provide valuable insights for bridging mathematical theory with engineering practice.


Introduction
In the past decades, great efforts have been devoted to modeling and optimizing network-based communication systems in the face of increasing transmission demands and sophisticated performance criteria. However, technical challenges abound in designing such systems because of limited network resources and stochastic network characteristics. Queueing theory is well established as one of the primary tools for traffic engineering problems over both wired and wireless packet networks [1-3]. Based on the models of queueing theory, the factors affecting the performance of network systems include the arrival rates (or the interarrival time distributions), the service rates (or the interservice time distributions), and the queue discipline. In this paper, we concentrate on how to optimally and efficiently allocate the service rates to all concurrent user queues in each network element according to the arrival rates, so that the lowest possible performance cost is achieved. The rest of the paper is organized as follows. Section 2 presents the M/G/1-type queueing system model. Section 3 proposes the cost-benefit resource allocation optimization algorithm. To evaluate the performance of the proposed algorithm, numerical examples are provided in Section 4. Finally, the paper concludes with a short discussion in Section 5.

System Model
In this section, we model the queueing system according to the dynamic transmission procedure of each network user. Using the M/G/1 formulation of the user queues, we first derive the steady-state Markov transition probability matrix. We then define the performance cost measure, on which the objective function of the optimization algorithm is based. Before going into details, we summarize the notation in Table 1.
Consider a network modeled by a topology graph $G = (V, E)$, where $V$ denotes the set of network elements (nodes) and $E$ represents the set of links. For each link $\iota \in E$, there exists a work-conserving server with time-invariant service capacity $C_\iota$, which serves packets and transfers them from the source element to the end element of $\iota$. More precisely, each element $v \in V$ keeps a separate queue for every user traffic flow going through it, as illustrated in Figure 1. For simplicity of exposition, let $N_t$ be the number of user queues served at any given time $t \ge 0$. Without loss of generality, the capacity of each user queue $i \in \{1, \ldots, N_t\}$ is upper bounded by a constant $K$; that is, each queue follows the M/G/1/K queueing model with limited backlog capacity $K$.

Table 1: Notation.

| Symbol | Description |
| --- | --- |
| $Y_i$ | The embedded Markov chain of the semi-Markov process |
| | The semi-Markov kernel |
| $P^{\mu_i}$ | The transition probability matrix |
| $A^{\mu_i}$ | The infinitesimal generator |
| $\pi^{\mu_i}$ | The steady-state probability vector |
| $f^{\mu_i}$ ($f_i$) | The performance cost function |
| $\eta_f^{\mu_i}$ | The performance cost measure |
| $g^{\mu_i}$ | The performance potential vector |
| $D_f^{\mu_i}$ | The realization matrix |
| $\Gamma_i(t)$ | The feasible region of the service rates allocated to the $i$th user queue |
| $\Omega_t$ | The policy space for all user queues |
| $\nu_t$ | A feasible resource allocation policy |
| $\varepsilon$ | The stopping criterion of the policy algorithm |
| $l$ | The iteration index of the policy algorithm |
| $sp$ | The span seminorm |

Denote the service time allocated to user queue $i$ by a general distribution $G_i(s, t)$ for any given time $t \ge 0$. Since users arrive at and depart from each network element randomly, the service rates $\mu_i(t)$ should be allocated dynamically for all $t \ge 0$ so that the performance cost at the element is minimized.
Then, we have

$$\int_0^\infty s \, dG_i(s, t) = \frac{1}{\mu_i(t)},$$

where $\mu_i(t)$ represents the mean service rate of user queue $i$ at time $t$, satisfying $\rho_i = \lambda_i / \mu_i(t) < 1$, and $\lambda_i$ is the long-term average rate of the $i$th user's arrivals, namely, the intensity of the Poisson arrival process. For ease of presentation, hereafter, $\mu_i(t)$ and $\mu_i$ will be used interchangeably. Let $\Gamma_i(t) \subset \mathbb{R}$ be the feasible region of the service rates allocated to user queue $i$. To be more precise, we introduce some definitions to model the $i$th user queue accurately.
Definition 2.1 (semi-Markov queueing process). A semi-Markov process $X_i = \{X_i(t), t \ge 0\}$ characterizes the $i$th user queue's behavior on the state space $\Phi = \{0, 1, \ldots, K\}$, where $X_i(t)$ represents the number of packets in the queue after the latest packet left the network element at time $t \ge 0$.
Remark 2.2. The semi-Markov kernel of $X_i$ can be further represented as

where

(2.3)
Let $Y_i = \{Y_i^m; m = 0, 1, 2, \ldots\}$ be the embedded Markov chain of $X_i$, where $Y_i^m$ is interpreted as the number of packets in the $i$th user queue when the $m$th packet has been served. It is essential to note that $Y_i$ is positive recurrent, irreducible, and aperiodic under the condition $\rho_i < 1$, since $X_i$ is. Moreover, $Y_i$ has both the same steady-state probability vector $\pi^{\mu_i} = (\pi^{\mu_i}(0), \ldots, \pi^{\mu_i}(K))$, $\pi^{\mu_i}(k) > 0$, $k \in \Phi$, and the same steady-state performance cost measure (discussed later) as $X_i$. According to [3], $X_i$ has standard transition probabilities $p_{kj}(t)$, $k, j \in \Phi$, and for any $j \ne k$, $p_{kj}(t)/t$ converges uniformly to a constant as $t \to 0$. In this case, the transition probability matrix $P^{\mu_i}$ of $Y_i$ can be derived as follows:

(2.5)
Remark 2.3. Note that the symbol $a_k^i$ represents the probability that $k$ packets arrive during the time interval in which a packet of the $i$th user queue is being served. The balance equation of each concurrent user queue $i$ can be further expressed as

$$\pi^{\mu_i} P^{\mu_i} = \pi^{\mu_i}, \qquad \pi^{\mu_i} e = 1,$$

where $e = (1, 1, \ldots, 1)^T$ is a $(K+1)$-dimensional column vector whose components are all 1, and the superscript "T" denotes transpose.
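As a small illustration of the balance equation, the sketch below solves $\pi P = \pi$ together with $\pi e = 1$ (the relations recalled in the proof of Theorem 3.5) for a finite chain. The 3-state matrix `P` and the helper name `stationary_distribution` are hypothetical and are not taken from the paper; the paper's own matrix is the one in (2.5).

```python
import numpy as np

def stationary_distribution(P):
    """Solve pi P = pi with pi e = 1 for a finite irreducible chain."""
    n = P.shape[0]
    # Stack the balance equations (P^T - I) pi^T = 0 with the
    # normalization row e^T pi^T = 1 and solve in the least-squares sense.
    lhs = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    rhs = np.zeros(n + 1)
    rhs[-1] = 1.0
    pi, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    return pi

# Hypothetical 3-state transition matrix (an assumption for illustration).
P = np.array([[0.5, 0.5, 0.0],
              [0.3, 0.4, 0.3],
              [0.0, 0.6, 0.4]])
pi = stationary_distribution(P)
print(pi)  # approximately [6/21, 10/21, 5/21]
```

The least-squares form is used only because the stacked system has one redundant row; any consistent linear solver would serve equally well.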
In principle, the performance cost measure is based on the definition of the performance cost function. Note that performance cost is a commonly used term whose meaning changes with the network environment. Since each network element has limited backlog space, the more backlog is occupied, the higher the cost. Accordingly, increasing the service rates reduces the backlog-related cost. However, in many practical networks, especially in wireless environments, the transmission cost cannot be neglected. In general, the transmission power is considered a convex increasing function of the service rates. Thus, the design of the performance cost function should trade off the backlog-related cost against the service rate-related cost. Conceptually, we associate with each user queue a performance cost function defined as follows.
Definition 2.4 (performance cost function). Consider a general performance cost function $f_i : \Phi \times \Gamma_i \to \mathbb{R}$ associated with the $i$th user queue, which is the sum of the backlog-related cost $\varphi_1(k)$ and the service rate-related cost $\varphi_2(\mu_i)$, that is,

$$f_i(k, \mu_i) = \varphi_1(k) + \varphi_2(\mu_i), \qquad k \in \Phi,\ \mu_i \in \Gamma_i.$$

Suppose that the performance cost function $f_i$ is differentiable with respect to the service rates $\mu_i$ on $\Gamma_i$. For ease of notation, hereafter, the terms $f_i$ and $f^{\mu_i}$ are used interchangeably throughout the paper. We now define the performance cost measure, which serves as the objective function for each user queue. Motivated by [3, 14], the definition is as follows.
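Definition 2.5 (performance cost measure). The performance cost measure $\eta_f^{\mu_i}$ with respect to the service rates $\mu_i$ for each user queue $i$ is denoted as

$$\eta_f^{\mu_i} = E_{\pi^{\mu_i}}\left[f^{\mu_i}\right] = \pi^{\mu_i} f^{\mu_i} = \sum_{k \in \Phi} \pi^{\mu_i}(k) f^{\mu_i}(k), \tag{2.8}$$

where $f^{\mu_i} = (f^{\mu_i}(0), \ldots, f^{\mu_i}(K))^T$ is a $(K+1)$-dimensional column vector, and $E_{\pi^{\mu_i}}$ denotes the expectation with respect to the steady-state probability $\pi^{\mu_i}$ of the semi-Markov process $X_i$ in Definition 2.1.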
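To make the two definitions concrete, the following sketch evaluates a cost vector and the cost measure read as the steady-state expectation $\eta = \pi f$ (the form differentiated in the proof of Theorem 3.5). The choices $\varphi_1(k) = c_1 k$ and $\varphi_2(\mu) = c_2 \mu^2$, the steady-state vector, and the function names are assumptions made only for illustration; the paper leaves $\varphi_1$ and $\varphi_2$ general.

```python
import numpy as np

def cost_vector(K, mu, c1=1.0, c2=1.0):
    """f(k, mu) = phi1(k) + phi2(mu), with the ASSUMED forms
    phi1(k) = c1 * k (backlog cost) and phi2(mu) = c2 * mu**2 (rate cost)."""
    k = np.arange(K + 1)
    return c1 * k + c2 * mu ** 2

def cost_measure(pi, f):
    """eta = pi f: expectation of the cost under the steady-state vector."""
    return float(pi @ f)

pi = np.array([0.5, 0.3, 0.2])   # hypothetical steady-state vector, K = 2
f = cost_vector(K=2, mu=2.0)     # [4, 5, 6]
eta = cost_measure(pi, f)        # 0.5*4 + 0.3*5 + 0.2*6 = 4.7
print(eta)
```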
Since the state space of $X_i$ is finite, we should note that the performance cost measure is finite for each nonnegative bounded performance cost function.

Remark 2.6. Each user queue $i$ has been modeled as a semi-Markov queueing process $X_i$, based on which the transition probability matrix $P^{\mu_i}$ is obtained, and $A^{\mu_i} = P^{\mu_i} - I$ is taken as an infinitesimal generator of $X_i$ under the service rates $\mu_i$, where $a_{kk}$ and $a_{kj}$ for $k \ne j$ are elements of $A^{\mu_i} = [a_{kj}]_{(K+1) \times (K+1)}$ and satisfy $a_{kk} < 0$ and $a_{kj} \ge 0$ for $k \ne j$, $k, j \in \Phi$. Thus, the infinitesimal generator is differentiable with respect to $\mu_i$. Note that the elements of $A^{\mu_i}$ represent the transition rates of the packet number in each user queue. More importantly, the cost-benefit optimization algorithm can be further developed for all M/G/1-type user queues in Section 3 by changing the allocated service rates $\mu_i$ so that the corresponding infinitesimal generator of each user queue is modified.

Resource Allocation Algorithm
In this section, we take a fresh look at the problem of resource allocation from the perspective of system performance cost and explore a cost-benefit gradient algorithm that minimizes the performance cost for all concurrent M/G/1-type queues, subject to the bandwidth constraint.

Problem Formulation and Optimality Criterion
Since we focus on a stochastic dynamic queueing system, the estimation of its statistical properties is essential. Such estimation needs to be not only accurate but, more importantly, efficient, given the delay sensitivity of real-time network applications. The main tenet of perturbation analysis (PA) is that a great deal of information is contained in the sample paths of a dynamic system, beyond the usual statistics collected, such as the means and variances of various variables [15]. Thus, in essence, we can use PA to estimate the performance gradient with respect to the service rates, and further minimize the user queue's performance cost measure, based on a single sample path.
In particular, several PA approaches have been introduced for solving network problems (see, e.g., [16, 17]). However, a general approach that supports a wide range of stochastic optimization problems had yet to be proposed. A new approach was proposed in [18] to analyze a number of Markov systems based on a single sample path. Moreover, optimization formulations for Markov [14, 19, 20], semi-Markov [21], and partially observable Markov [22] systems were proposed, and in [3], the theory was successfully extended to evaluate M/G/1 queueing systems. The structure of the PA-based queueing system is shown in Figure 2. For each feasible resource allocation policy, a set of service rates is allocated to the user queues. Each change of one user queue's service rate generates a perturbation on the queue's sample path, which affects the system performance cost. As illustrated in Figure 3, in an M/G/1-type user queue $\{Y_i^m; m = 0, 1, 2, \ldots\}$, such a perturbation can be regarded as a "jump" among its states $k \in \Phi$ and affects the performance cost $\eta_f^{\mu_i}$. Thus, we need to measure the effect of every state on the performance cost $\eta_f^{\mu_i}$ before discussing the performance cost optimization. We briefly introduce a concept called performance potential that is useful in this paper [18].
Definition 3.1 (performance potential). Denote by $g^{\mu_i}(k)$ the $i$th user queue's performance potential of state $k \in \Phi$ under service rates $\mu_i$ with respect to the performance cost function $f_i$. It measures the effect of state $k$ on the performance cost $\eta_f^{\mu_i}$.

The performance potential vector of the M/G/1/K user queue $\{Y_i^m\}$ with respect to the performance cost function $f^{\mu_i}$ is denoted as $g^{\mu_i} = (g^{\mu_i}(0), \ldots, g^{\mu_i}(K))^T$. In essence, the performance potential vector is the solution of the Poisson equation, which has been studied extensively in the literature [23]:

$$-A^{\mu_i} g^{\mu_i} + \eta_f^{\mu_i} e = f^{\mu_i}. \tag{3.2}$$

Furthermore, if $\{Y_i^m\}$ is strongly ergodic, the performance potential vectors can be calculated from $(A^{\mu_i})^{\#}$, the group inverse (for details, see [18]) of the infinitesimal generator $A^{\mu_i}$ of $\{Y_i^m\}$ under service rates $\mu_i$. Now, we need to assign the feasible service rates region to each active user queue. It is well known from queueing theory that an M/G/1-type user queue is stable when $\rho_i < 1$. Therefore, the service rates policy space in Figure 2 is defined as follows.
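Definition 3.3 (feasible policy space). The policy space for all user queues at epoch $t \ge 0$ can be denoted as a compact set $\Omega_t$, whose elements are the resource allocation policies $\nu_t = (\mu_1(t), \ldots, \mu_{N_t}(t))$ with $\mu_i(t) \in \Gamma_i(t)$ for each user queue $i$.

Next, we analyze the optimality criterion for the performance cost optimization.

(3.5)

Proof. Note that $\Gamma_i(t)$ is a compact set, and $f^{\mu_i(t)} + A^{\mu_i(t)} g^{\mu_i(t)}$ is component-wise continuous on $\Gamma_i(t)$. Thus, there must exist at least one cost-optimal service rate in $\Gamma_i(t)$.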
According to [20], the service rates $\mu_i^*(t)$ for the $i$th user queue are cost optimal if and only if a vector inequality holds, where the inequality is understood component-wise in $\mathbb{R}^{K+1}$, and $K+1$ is the number of states of each user queue. Note that better cost-benefit service rates can be searched for by comparison with the current service rates. Besides, from (3.2), we have $e \, \eta_f^{\mu_i(t)} = f^{\mu_i(t)} + A^{\mu_i(t)} g^{\mu_i(t)}$. Thus, we can conclude that if $\mu_i^*(t)$ are the cost-optimal service rates, the corresponding optimality equation holds, and vice versa.

Gradient-Based Policy Optimization
The purpose now is to develop an efficient and practical policy algorithm that minimizes the performance cost of all user queues based on the corresponding sample paths. In essence, the objective function for each user queue in Definition 2.5 represents the time-average performance measure of the M/G/1 queue. Developing a global optimization algorithm for such a performance measure would therefore greatly increase the complexity. More importantly, a global algorithm is to some extent impractical given both the delay sensitivity of user applications and the dynamic changes of the network element. For this reason, a fast gradient-based optimization for the stochastic system is considered. A gradient optimization algorithm is known to find only a local minimum of the objective function; however, it can be fairly efficient, especially when the interval $\Gamma_i(t)$ for each user queue $i$ is not very large. To begin with, the performance gradient formula for each user queue is derived as follows.
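Theorem 3.5 (performance gradient). For any given resource allocation policy $\nu_t = (\mu_1(t), \ldots, \mu_{N_t}(t)) \in \Omega_t$ at each event time $t \ge 0$, the gradient of the performance cost $\eta_f^{\mu_i(t)}$ generated by the $i$th user queue is obtained by

$$\nabla \eta_f^{\mu_i(t)} = \pi^{\mu_i(t)} \left( \nabla P^{\mu_i(t)} g^{\mu_i(t)} + \nabla f^{\mu_i(t)} \right). \tag{3.8}$$

Proof. By taking the gradient of the performance measure (2.8), we have

$$\nabla \eta_f^{\mu_i(t)} = \nabla \pi^{\mu_i(t)} f^{\mu_i(t)} + \pi^{\mu_i(t)} \nabla f^{\mu_i(t)}. \tag{3.9}$$

From [14, 20], there must be a particular solution to (3.2) such that $\eta_f^{\mu_i(t)} = \pi^{\mu_i(t)} g^{\mu_i(t)}$; thus, we obtain

$$\left( -A^{\mu_i(t)} + e \pi^{\mu_i(t)} \right) g^{\mu_i(t)} = f^{\mu_i(t)}. \tag{3.10}$$

With $P^{\mu_i(t)} = A^{\mu_i(t)} + I$, we further have

$$\left( I - P^{\mu_i(t)} + e \pi^{\mu_i(t)} \right) g^{\mu_i(t)} = f^{\mu_i(t)}. \tag{3.11}$$

Left-multiplying both sides of (3.11) by $\nabla \pi^{\mu_i(t)}$, it follows that

$$\nabla \pi^{\mu_i(t)} f^{\mu_i(t)} = \nabla \pi^{\mu_i(t)} \left( I - P^{\mu_i(t)} + e \pi^{\mu_i(t)} \right) g^{\mu_i(t)}. \tag{3.12}$$

Recall that $\pi^{\mu_i(t)} e = 1$, $P^{\mu_i(t)} e = e$, and $\pi^{\mu_i(t)} P^{\mu_i(t)} = \pi^{\mu_i(t)}$; then we have

$$\nabla \pi^{\mu_i(t)} P^{\mu_i(t)} + \pi^{\mu_i(t)} \nabla P^{\mu_i(t)} = \nabla \pi^{\mu_i(t)}, \qquad \nabla \pi^{\mu_i(t)} e = 0. \tag{3.13}$$

Hence, using (3.12), it suffices to show that

$$\nabla \pi^{\mu_i(t)} f^{\mu_i(t)} = \pi^{\mu_i(t)} \nabla P^{\mu_i(t)} g^{\mu_i(t)}. \tag{3.14}$$

Combining (3.9) with (3.14), the result then follows.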
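The chain of identities in the proof, namely $\nabla\eta = \nabla\pi f + \pi \nabla f$ and $\nabla\pi f = \pi \nabla P g$ with $g$ solving $(I - P + e\pi) g = f$, can be checked numerically. The sketch below does so on a toy birth-death chain; the chain, the cost $f(k, \mu) = k + 0.1\mu^2$, and all names are assumptions for illustration and do not reproduce the paper's matrix (2.5).

```python
import numpy as np

LAM, K = 1.0, 4  # hypothetical arrival rate and backlog capacity

def transition_matrix(mu):
    """Toy chain on {0..K}: up w.p. lam/(lam+mu), down w.p. mu/(lam+mu)."""
    p = LAM / (LAM + mu)
    P = np.zeros((K + 1, K + 1))
    for k in range(K + 1):
        P[k, min(k + 1, K)] += p        # arrival (blocked at K)
        P[k, max(k - 1, 0)] += 1.0 - p  # departure (no-op at 0)
    return P

def d_transition_matrix(mu):
    """Entrywise derivative of transition_matrix with respect to mu."""
    dp = -LAM / (LAM + mu) ** 2
    dP = np.zeros((K + 1, K + 1))
    for k in range(K + 1):
        dP[k, min(k + 1, K)] += dp
        dP[k, max(k - 1, 0)] -= dp
    return dP

def cost(mu):
    return np.arange(K + 1) + 0.1 * mu ** 2   # assumed f(k, mu)

def stationary(P):
    n = P.shape[0]
    lhs = np.vstack([P.T - np.eye(n), np.ones((1, n))])
    rhs = np.r_[np.zeros(n), 1.0]
    return np.linalg.lstsq(lhs, rhs, rcond=None)[0]

def eta(mu):
    return stationary(transition_matrix(mu)) @ cost(mu)

def gradient(mu):
    P = transition_matrix(mu)
    pi = stationary(P)
    e = np.ones(K + 1)
    # Potentials from (I - P + e pi) g = f, as in the proof.
    g = np.linalg.solve(np.eye(K + 1) - P + np.outer(e, pi), cost(mu))
    df = np.full(K + 1, 0.2 * mu)             # derivative of the assumed cost
    return pi @ (d_transition_matrix(mu) @ g + df)

mu0 = 2.0
finite_diff = (eta(mu0 + 1e-5) - eta(mu0 - 1e-5)) / 2e-5
print(gradient(mu0), finite_diff)  # the two values should agree closely
```

The finite-difference value requires two extra steady-state solves per evaluation, whereas the potential-based formula reuses quantities from a single operating point, which is what makes a single-sample-path estimate possible.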
We now describe the process flow of the policy gradient algorithm, shown in Figure 4. The procedure is given in Algorithm 1. Note that in Algorithm 1, $sp(h) = \max_k h(k) - \min_k h(k)$ is defined as the span seminorm on $\mathbb{R}^n$. The algorithm is constructed as follows. It begins by choosing an arbitrary feasible policy for all user queues at the given time $t$. Then, with the current service rates, the corresponding performance gradient is calculated by analyzing the sample path of each user queue. A line search along the gradient yields a suitable step size, so that a better policy can be obtained for each user queue. By iterating until the stopping criterion is met, the optimal cost-benefit resource allocation policy is finally achieved. Without loss of generality, suppose that every gradient iteration of each user queue improves the performance cost, that is,

$$\eta_f^{\mu_i^{l+1}(t)} \le \eta_f^{\mu_i^{l}(t)},$$

where $l \in \mathbb{Z}^+$ represents the iteration index in Algorithm 1. The convergence property of the policy gradient algorithm is evaluated as follows.
Theorem 3.6 (convergence property). Suppose that $\nu_t^l = (\mu_1^l(t), \ldots, \mu_{N_t}^l(t)) \in \Omega_t$, $l \in \mathbb{Z}^+$, is a performance-improving resource allocation policy sequence at each given time $t \ge 0$. Then, for all $i \in \{1, \ldots, N_t\}$, one has:

(a)

(3.17)

(b) There must exist an optimal cost-benefit resource allocation policy, denoted as

(3.18)

Algorithm 1 (steps for each user queue $i$): Determine the gradient. Do a line search along the gradient and choose the right step size $\gamma_i^l$. Update the service rates $\mu_i^{l+1}(t)$.
Proof. For part (a): Since $\nu_t^l = (\mu_1^l(t), \ldots, \mu_{N_t}^l(t)) \in \Omega_t$, $l \in \mathbb{Z}^+$, is a performance-improving resource allocation policy sequence, we can conclude that, for each user queue $i \in \{1, \ldots, N_t\}$, $\{\eta_f^{\mu_i^l(t)}\}$ is a monotonically decreasing and bounded performance cost sequence, with lower bound $\eta_f^{\mu_i^*(t)}$.

By the continuity of $\eta_f^{\mu_i(t)}$, we obtain limiting service rates $\bar{\mu}_i(t) \in \Gamma_i(t)$ for each user queue $i$, satisfying the corresponding limit relation for all $k \in \Phi$.

Mathematical Problems in Engineering
Recall (3.21). By taking the limit as $l \to \infty$, part (a) follows.

For part (b): Note that for all $\varepsilon > 0$, there exists $l_0 \in \mathbb{Z}^+$ such that a span seminorm bound in terms of $\varepsilon$ holds. To show the second relation of the theorem, consider $l > l_0$; it follows that

(3.23)

Since in Algorithm 1, $\mu_i^{l+1}(t)$ is the best choice for $\mu_i^l(t)$ along the gradient by the line search, we therefore have

(3.25)

Thus, $\mu_i^{l+1}(t)$ can be regarded as $\varepsilon$-optimal service rates, and the limiting relation follows from the arbitrariness of $\varepsilon$.

Finally, by the uniqueness of limits, we can conclude that the limiting performance cost equals the optimal one, which is equivalent to saying that $\bar{\mu}_i(t)$ are optimal cost-benefit service rates. This leads immediately to the results that $\lim_{l \to \infty} \mu_i^l(t) = \mu_i^*(t)$ and $\lim_{l \to \infty} \eta_f^{\mu_i^l(t)} = \eta_f^{\mu_i^*(t)}$.

Remark 3.7. By proving the convergence property of the policy gradient algorithm, the cost-benefit resource allocation optimization approach for the M/G/1/K user queues has been established. In a broad sense, the service time allocated to each user queue $i$ has been considered as a general distribution $G_i$. However, the mathematical expression of the performance gradient is not unique and depends on the application scenario.

Performance Gradient Analysis of Application Scenarios
To make the analysis more tractable, we now present two application scenarios for which the performance gradient can be explicitly derived.

Deterministic inter-service time.
This is the simplest practical case, in which the inter-service times for each user queue $i$ are deterministic, that is, M/D/1/K. Without loss of generality, we denote each inter-service time as a constant $1/\mu_i$. Consider now that $\rho_i = \lambda_i / \mu_i < 1$. Let $P^{\mu_i}$ be the transition matrix of the embedded Markov chain. In this scenario, for all $k \in \Phi$, the element $a_k^i$ of the transition matrix is given by (3.26), implying that the elements $\pi^{\mu_i}(k)$, $k \in \Phi$, of the steady-state probability vector satisfy

(3.27)

According to (3.26), hence, we have

(3.28)

Corollary 3.8. Considering the $i$th user queue in the steady state, from Theorem 3.5, the performance gradient with respect to the service rates $\mu_i$ for the case of M/D/1/K is given by (3.29), where $\pi^{\mu_i} = (\pi^{\mu_i}(0), \ldots, \pi^{\mu_i}(K))$, $g^{\mu_i} = (g^{\mu_i}(0), \ldots, g^{\mu_i}(K))$, and $\pi^{\mu_i}(-1) = 0$.

Exponential inter-service time.
Having considered the straightforward scenario of deterministic service times, we now investigate the M/M/1/K case, in which the inter-service times are independent and exponentially distributed, that is, memoryless. For ease of notation, we show results using the same parameters as in the first scenario. Similarly, in this case, we derive the elements $a_k^i$, for all $k \in \Phi$, of the transition matrix as in (3.30), and it suffices to show that

(3.31)

Corollary 3.9. Considering the $i$th user queue in the steady state, from Theorem 3.5, the performance gradient with respect to the service rates $\mu_i$ for the case of M/M/1/K is

(3.33)
Remark 3.10. Note that the essential feature behind the cost-benefit resource allocation is the performance gradient with respect to the service rates. Based on this gradient, the policy gradient algorithm for each user queue is executed until the performance cost of the network system is minimized.
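For the two scenarios above, the probabilities $a_k^i$ of Remark 2.3 follow from standard results for Poisson arrivals: a Poisson count with mean $\rho_i = \lambda_i/\mu_i$ when the service time is the constant $1/\mu_i$, and a geometric count when the service time is exponential with rate $\mu_i$. The sketch below evaluates both under those textbook formulas; it is offered as an illustration and not as a transcription of (3.26) or (3.30), and the function names are hypothetical.

```python
import math

def arrivals_md1(k, lam, mu):
    """P(k Poisson(lam) arrivals during a deterministic service time 1/mu)."""
    rho = lam / mu
    return math.exp(-rho) * rho ** k / math.factorial(k)

def arrivals_mm1(k, lam, mu):
    """P(k Poisson(lam) arrivals during an exponential(mu) service time)."""
    return (mu / (lam + mu)) * (lam / (lam + mu)) ** k

lam, mu = 1.0, 2.0   # hypothetical rates with rho = 0.5 < 1
md1 = [arrivals_md1(k, lam, mu) for k in range(60)]
mm1 = [arrivals_mm1(k, lam, mu) for k in range(200)]

# Both distributions sum to one and have mean rho = lam / mu.
print(sum(md1), sum(k * a for k, a in enumerate(md1)))
print(sum(mm1), sum(k * a for k, a in enumerate(mm1)))
```

In both cases the mean number of arrivals per service equals $\rho_i$, which is why the stability condition $\rho_i < 1$ takes the same form in the two scenarios.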

Performance Evaluation
In this section, we investigate the performance of the M/G/1-type queueing system. Before proceeding, we first present the simulation model, namely, the sample path-based simulation scheme.

Simulation Model
To evaluate the performance of Algorithm 1 for each user queue $i$, it is imperative to calculate both the steady-state probability vector $\pi^{\mu_i^l(t)}$ and the performance potential $g^{\mu_i^l(t)}$ for any feasible service rates $\mu_i^l(t)$ in every iteration $l = 0, 1, 2, \ldots$.

According to the Borel property [24], the steady-state probability of state $k$ in the embedded Markov chain $Y_i^m$ has an unbiased estimate given by the fraction of the observed transitions in which the chain is in state $k$, where $\zeta$ denotes the number of state transitions of the queue and is set to 10000 in this simulation. Once the steady-state probability vector has been estimated, the estimate of the performance cost $\eta_f^{\mu_i}$ for the user queue can be derived as $\hat{\eta}_f^{\mu_i} = \hat{\pi}^{\mu_i} f^{\mu_i}$. However, solving the Poisson equation (3.2) to obtain $g^{\mu_i^l(t)}$ incurs significant computational complexity, which is impractical for online cost-benefit optimization. Therefore, a sample path-based estimate is essential. Note that the performance potential can be estimated from the realization matrix $D_f^{\mu_i}$, whose elements are indexed by $k, j \in \Phi$. Thus, by Theorem 3.5, the gradient with respect to the service rates $\mu_i^l(t)$ allocated in the $l$th iteration for user queue $i$ can be efficiently estimated from these sample-path quantities.
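The sample-path estimates of the steady-state vector and the performance cost can be sketched as follows: simulate the embedded chain for $\zeta$ transitions, count the visits to each state, and form $\hat{\eta} = \hat{\pi} f$. The 3-state matrix, the cost vector, the seed, and the function name are assumptions for illustration only; estimating the potentials through the realization matrix is not reproduced here.

```python
import numpy as np

def estimate_from_path(P, f, steps=10_000, seed=0):
    """Estimate pi and eta = pi f from one simulated sample path.

    steps plays the role of the transition count (10000 in the text);
    P and f are supplied by the caller.
    """
    rng = np.random.default_rng(seed)
    n = P.shape[0]
    counts = np.zeros(n)
    state = 0
    for _ in range(steps):
        counts[state] += 1                    # visit indicator for this step
        state = rng.choice(n, p=P[state])     # draw the next state
    pi_hat = counts / steps                   # relative visit frequencies
    return pi_hat, float(pi_hat @ f)

# Hypothetical chain and cost vector (assumptions, not from the paper).
P = np.array([[0.5, 0.5, 0.0],
              [0.3, 0.4, 0.3],
              [0.0, 0.6, 0.4]])
f = np.array([0.0, 1.0, 2.0])
pi_hat, eta_hat = estimate_from_path(P, f)
print(pi_hat, eta_hat)  # exact values are [6/21, 10/21, 5/21] and 20/21
```

Increasing `steps` tightens both estimates, at the cost of a longer observation window before each gradient update.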

Numerical Results
The following numerical examples illustrate the analytical results derived in the previous sections. Consider a given time $t_0 > 0$ at which there are four active user queues in the network element. We limit our experimental tests to the simulation parameter values listed in Table 2. Note that the values of $\lambda_i(t_0)$, $\mu_i^{\min}(t_0)$, and $\mu_i^{\max}(t_0)$ can be interpreted as numbers of packets transmitted per unit time.

In addition, the backlog capacity of every user queue is set to 100 packets. The performance cost function of the sharing user queues, for $i = 1, 2, 3, 4$ and $k \in \Phi$, involves two constants $c_1, c_2 > 0$. In this experiment, they are set to 80 and 1, respectively. Moreover, the stopping criterion in Algorithm 1 is set to 0.001, and the link service capacity $C_\iota$ is set to 200 packets per unit time. With the initial resource allocation policy chosen as the minimum service rates for all user queues, the simulation results are given in Table 3. Finally, the iteration processes of the four user queues are shown in Figure 5.
Based on Figure 5, we can conclude the following.

(i) Given the stopping criterion 0.001 and each feasible region $\Gamma_i(t_0)$, $i \in \{1, \ldots, N_{t_0}\}$, all iterations for the user queues converge within 15 steps.

(ii) During the iterative process of the algorithm, the corresponding performance cost is reduced by 56.3%, 55.7%, 52.3%, and 44.5% for the four user queues, respectively.

(iii) The optimal service rates for each user queue (e.g., $\mu_1^*(t_0)$) may not be achieved on the boundary (i.e., $\mu_1^{\min}(t_0)$ or $\mu_1^{\max}(t_0)$) of the feasible region shown in Table 2.
Remark 4.1. Note that the convergence rate of the proposed algorithm is closely related to the estimation of the performance gradients: the algorithm converges faster when the estimates are more accurate. More precisely, according to Theorem 3.5, the performance gradient can be calculated from the steady-state probability vector $\pi^{\mu_i^l(t)}$ and the performance potential $g^{\mu_i^l(t)}$ in every iteration $l \in \mathbb{Z}^+$. Thus, we can increase the number of state transitions of the queue, thereby improving the estimation accuracy of the steady-state probability vector and the performance potential, or equivalently, of the performance gradient. In addition, the optimal service rates lie where they do because there is a trade-off between the backlog-related and the service rate-related performance costs, which can be adjusted through the definition of the performance cost function.

Conclusions
In this paper, performance optimization problems of communication networks with stochastic characteristics are studied. To describe the complex dynamic behavior of the system, all user queues in each network element are represented by multiple concurrent M/G/1-type Markov processes, from which the system model is built. Furthermore, an efficient algorithm is developed for optimizing the system performance cost using a sensitivity analysis approach. During every iteration, the proposed algorithm estimates the derivative of the performance measure and the performance potential by analyzing a single sample path of each user queue, which accounts for its computational efficiency. The asymptotic convergence analysis, combined with the numerical examples, paves the way for designing cost-aware computer communication systems.

Figure 2: Structure of user queueing system with PA.
Figure 4: Process flow of the policy gradient algorithm. Begin; choose an arbitrary policy, namely the initial service rates for each user queue; choose a stopping criterion for all queues; per user, calculate the current performance gradient and choose the right step size; update to a better resource allocation policy; if the stopping criterion is met (yes), the optimal cost-benefit policy is achieved and the algorithm ends; otherwise (no), repeat the gradient step.

Algorithm 1: Policy gradient algorithm.
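The loop described for Algorithm 1 (start from a feasible policy, compute the gradient, line-search for a step size, update, stop when a criterion is met) can be sketched for a single user queue as below. This is a minimal sketch under stated assumptions: the objective `eta` is a hypothetical stand-in, $4/(\mu - 1) + \mu^2$, with minimizer $\mu = 2$; the stopping test uses the cost improvement rather than the span seminorm; and the link capacity constraint across queues is not modeled.

```python
def policy_gradient(eta, grad, mu_min, mu_max, eps=1e-3, max_iter=1000):
    """Projected gradient descent with backtracking line search on [mu_min, mu_max]."""
    mu = mu_min                          # initial policy: minimum service rate
    for _ in range(max_iter):
        d = grad(mu)
        step = 1.0
        while True:
            # Project the trial point back onto the feasible interval.
            cand = min(max(mu - step * d, mu_min), mu_max)
            # Armijo-type sufficient decrease test; give up on tiny steps.
            if eta(cand) <= eta(mu) - 1e-4 * d * (mu - cand) or step < 1e-12:
                break
            step *= 0.5
        if abs(eta(mu) - eta(cand)) < eps:   # assumed stopping criterion
            return cand
        mu = cand
    return mu

# Hypothetical objective: backlog-like term plus a quadratic rate cost.
def eta(mu):
    return 4.0 / (mu - 1.0) + mu ** 2

def grad(mu):
    return -4.0 / (mu - 1.0) ** 2 + 2.0 * mu

mu_star = policy_gradient(eta, grad, mu_min=1.2, mu_max=5.0)
print(mu_star)  # close to the interior minimizer mu = 2
```

In the setting of the paper, `eta` and `grad` would be replaced by the sample-path estimates of the performance cost and of its gradient for each user queue.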

Figure 5: The iteration processes of the four user queues.
The network element is stable if and only if all individual user queues are stable. Suppose that the lower bounds of the service rates of each user queue at epoch $t \ge 0$ are set to $\mu_i^{\min}(t)$. Thus, at epoch $t$, the service rates of user queue $i$ are upper bounded by a value $\mu_i^{\max}(t)$.

Theorem 3.4 (optimality criterion). A cost-benefit resource allocation policy $\nu_t^* = (\mu_1^*(t), \ldots, \mu_{N_t}^*(t)) \in \Omega_t$ is optimal for each given initial policy if and only if, for each user queue $i$, condition (3.5) holds.

Algorithm 1, input and output. Input: An arbitrary initial policy $\nu_t = (\mu_1(t), \ldots, \mu_{N_t}(t)) \in \Omega_t$ at given $t \ge 0$. Output: The optimal policy for all user queues, $\nu_t^* = (\mu_1^*(t), \ldots, \mu_{N_t}^*(t)) \in \Omega_t$.

Table 3: Optimal resource allocation policy.