Throughput Maximization of Queueing Networks with Simultaneous Minimization of Service Rates and Buffers

1 Departamento de Estatı́stica, Universidade Federal de Minas Gerais, 31270-901 Belo Horizonte, MG, Brazil 2 School of Computer Science, University of Nottingham, Jubilee Campus, Wollaton Road, Nottingham NG8 1BB, UK 3 School of Computer Science & Software Engineering, The University of Western Australia, 35 Stirling Highway, Crawley, WA 6009, Australia 4 Departamento de Matemática, Universidade Federal de Ouro Preto, 35400-000 Ouro Preto, MG, Brazil 5 Departamento de Ciências Exatas, Universidade Estadual de Montes Claros, 39401-089 Montes Claros, MG, Brazil


Introduction
In this study, the maximization of throughput Θ the number of jobs, parts, clients, etc., served per unit of time in an acyclic, general-service time queueing network for an example, see Figure 1 was evaluated.To obtain the maximum Θ, the minimum number of buffers K {K 1 , K 2 , . . ., K n } and service rates µ {μ 1 , μ 2 , . . ., μ n } that must be allocated to a queueing network in a given topology and external arrival rate Λ {Λ 1 , Λ 2 , . . ., Λ n } was determined.Potential users of general-service time, finite-queueing network-based optimization models include computer scientists and industrial engineers.Indeed, these models may help to understand and improve various real-life systems, including manufacturing 1-5 , production 6-8 , and health 9-11 systems, urban or pedestrian traffic 12, 13 , computer and communication systems 14-17 , web-based applications with tiered configurations 18 , and quality-of-service QoS requirements measured in terms of response times, throughput, server availability, and security 19 .
This study focused on single-server queueing networks with exponentially distributed interarrival times and generally distributed service times, configured in an arbitrary acyclic, series-parallel topology.An example of the type of network under consideration is shown in Figure 1.In particular, buffer allocation, server allocation, and throughput tradeoff were optimized in networks of M/G/1/K queues.Thus, in Kendall 21 notation, we focused on Markovian arrivals, generally distributed service times, a single server, and the total capacity of K items, including the item of service.
Indeed, there is a critical tradeoff between the overall number of buffers and service rates and the resulting throughput.Buffers and service capacities can be very expensive; thus, the total number of buffers and the overall service capacity should be reduced as much as possible.On the other hand, the highest possible network throughput is also desired.Unfortunately, throughput is directly affected by the number of buffers allocated, where an increase in buffers generally leads to a higher throughput.Likewise, the service capacity also directly affects the throughput.
Figure 2 shows the throughput, Θ, for a single M/G/1/K queue with s 2 1.5 squared coefficient of variation of the service time and Λ 5 users per time unit external arrival rate , as a function of several values for buffer size, K, and service rate, μ see 2.1 and 2.2 , as well as the respective contour plot.Similar throughput behavior is also observed in a network of queues.The surface of the plot shown in Figure 2 is smooth, and convexity is apparent, which is similar to the results of simple queueing networks 22, 23 .However, the top of the surface plot near the maximum throughput is flat, which creates difficulties for traditional optimization methods.For instance, Smith and Cruz 20 used the Powell method and multiple starts to avoid premature convergence to a local optimum and to derive a successful optimization algorithm.
From a modeling point of view, throughput maximization can be defined by a mixedinteger mathematical programming formula, where the total buffer and server costs are minimized, and the throughput, subject to integer buffer allocation and nonnegative service  rates, is maximized.By defining a queueing network as a digraph of G N, A , where N is a finite set of nodes, and A is a finite set of arcs, the mixed-integer mathematical programming formula was obtained 24 minimize F K, µ 1.1 subject to where the decision variables K i and μ i indicate the total capacity of the service and the service rate for the ith M/G/1/K queue, respectively.The objective functions, F K, µ ≡ f 1 K , f 2 µ , −f 3 K, µ , are the total buffer allocation, f 1 K ∀i∈N K i , the overall service allocation, f 2 µ ∀i∈N μ i , and the overall throughput, f 3 K, µ Θ K, µ .Throughput is often modeled as a constraint that must be greater than a target minimum, rather than as an objective that must be maximized 20, 25 .However, to successfully solve the problem, throughput constraints must be relaxed.Thus, parameters such as the threshold throughput Θ τ must be determined beforehand.However, establishing an appropriate threshold is not a trivial task.Moreover, it is possible that a small decrease in throughput can result in a significant reduction in buffer allocation and costs .The tradeoff between throughput and the number of buffers is not apparent in a single-objective formula.Indeed, the weights ω of a single-objective function have a significant effect on both the objectives and parameters, including errors ε on performance measure estimates and threshold throughput Θ τ .Thus, weight determination is difficult, and the results of single objective optimization techniques can be arbitrary.
In this study, an optimization approach was developed that determines the entire set of Pareto-optimal solutions.Thus, our method produces a set of efficient solutions for more than one objective in the objective function 26 .With the proposed approach, the decision maker is able to evaluate the effect of solution replacement.Moreover, the multiobjective approach also allows the user to increase one objective e.g., throughput while simultaneously reducing another objective e.g., buffer and service rate allocation .A multiobjective evolutionary algorithm MOEA was used in combination with a generalized expansion method GEM , which is a well-known method for obtaining accurate approximations of queueing network performance 27-29 .MOEAs are particularly suitable for multiobjective problems and have been shown to perform well in similar multiobjective problems of networks e.g., see Carrano et al. 30 and references therein .
In this paper, a MOEA, specifically developed to multiobjective optimization, is presented see Section 2 .Additionally, the GEM, a performance evaluation tool used to approximate throughput, is also presented.In Section 3, the results of a comprehensive set of computational experiments are presented to show the efficiency of the approach.Finally, the article is concluded in Section 4, where final remarks and suggestions for future research are discussed.

Proposed Algorithms
The exposition of proposed algorithms was conducted in two parts.First, the performance evaluation algorithm was presented, which allowed the overall performance of the system to be estimated in terms of overall throughput, Θ K, µ , for a given configuration of the buffer and service allocation.Then the proposed optimization algorithm was developed, which was applied to obtain the optimal buffer and service allocation.

Single Queues
To maximize the throughput, Θ K, µ must be estimated.In a single M/G/1/K queue, the estimation of Θ K, µ can be achieved with a computationally efficient and accurate closedform approximate expression of the blocking probability, p K .The method proposed by Smith 31 , which is based on a two-moment approximation from Kimura 32 , was employed where ρ < 1 is the system utilization, which is the ratio between the total arrival rate and the service rate, ρ λ/μ.s 2 is the squared coefficient of variation of the service time, T s ; thus, The results indicated that the approximation of p K was accurate for a wide range of values 20, 25, 33 .
In order to obtain the throughput in a finite M/G/1/K single queue, we need to adjust the arrival rate.In fact a fraction p K of the arrivals cannot join the system because they have come when there is no waiting space left.Thus the actual rate of arrivals to join the system must be adjusted accordingly.Since Poisson arrivals see time averages the PASTA property , it follows that the effective arrival rate seen by the servers is λ eff λ 1 − p K 34 .Thus, the throughput may be given by θ λ eff λ 1 − p K . 2.2

Network of Queues
For a network of queues, the estimation of throughput is considerably more complicated.The generalized expansion method GEM is an algorithm that has been successfully used to estimate the performance of arbitrarily configured, finite queueing, and acyclic networks 29 .The GEM is a combination of node-by-node decomposition and repeated trials, where each queue is analyzed separately, and corrections are made to account for interrelated effects between network queues.The GEM uses type I blocking, where the upstream node becomes blocked if the service for an individual customer is complete and the queue at the downstream node is full.This is often referred to as "blocking after service," which is prevalent in most production, manufacturing, and transportation systems.
With the help of Figure 3, we now describe the GEM.Firstly, we remark that the exponential distribution is a good approximation for the interdeparture times of entities leaving an M/G/1/K node.In fact, it is possible to show the quasireversibility of a broader class of finite queues, which are the state-dependent M/G/C/C queues 35 .When those entities that are lost are included, the output stream is Poisson.This assumption is supported by several empirical results 7, 8, 13, 20, 25, 36 .The following three stages are involved in the GEM: network reconfiguration, parameter estimation, and feedback elimination.

Network Reconfiguration
This stage involves reconfiguring the network.An auxiliary vertex h j is created, which is modeled as an M/G/∞ queue with service rate μ h and precedes each finite queue j.When an entity leaves i, vertex j may be blocked with probability p K j or unblocked with probability 1 − p K j .Under blocking, the entities are rerouted to vertex h j for a delay while node j is busy.After this delay, the entity may be blocked again with probability p K j , for a second delay period.Vertex h j accumulates the time an entity has to wait before entering vertex j and the effective arrival rate to vertex j.

Parameter Estimation
In this stage, the parameters p K , p K , and μ h are estimated for clarity, we will omit the subscript for node j .
1 p K is obtained by means of a two-moment approximation recently developed by Smith 31 p K equation 2.1 .

2.3
2 p K is obtained with the following approximation from diffusion techniques given by Labetoulle and Pujolle 37 : where r 1 and r 2 are the roots to the polynomial with λ λ j − λ h 1 − p K , λ h is the actual arrival rate to the artificial holding node, and λ j is the actual arrival rate to the finite node j, given by 3 μ h is calculated as follows using renewal theory: where σ 2 j is the service time variance.

Feedback Elimination
The repeated visits to the holding nodes due to the feedback create strong dependence in the arrival process.Therefore, the repeated immediate feedback is eliminated.This is accomplished by giving the customer enough service time during the first passage through the holding node.The adapted service rate for the holding node μ h then becomes Summary 1.The goal of GEM is to provide an approximation scheme to update the service rates of upstream nodes that take into account all blocking after service caused by downstream nodes For each finite node j in the network succeeding node i, we have simultaneous nonlinear equations in variables p K , p K , and μ h , along with auxiliary variables such as λ and λ i .Solving these equations simultaneously, we can compute all the performance measures of the network Equation 2.10 through 2.13 is related to the arrivals and feedback in the holding node.Equation 2.14 through 2.16 is used to solve 2.13 with z used as a dummy parameter for simplicity.Lastly, 2.1 gives the blocking probability for the M/G/1/K queue.Thus, we essentially have five equations to solve 2.10 -2.13 and 2.1 .

Optimization Algorithm
For the network under consideration, MOEAs appeared to be a suitable choice for the multiobjective maximization of throughput.MOEAs are optimization algorithms that perform an approximate global search based on information obtained from the evaluation of several points in the search space 38, 39 .The population of points that converge to an optimal value are obtained through the application of genetic operators such as mutation, crossover, selection, and elitism.Each one of these operators characterizes an instance of a MOEA and can be implemented in several different ways.Additionally, MOEA convergence is guaranteed by assigning a value of fitness to each population member and preserving diversity.In fact, recent successful applications of GAs for single-objective applications were reported by Lin 40   x j x j 1 , x j 2 , . . ., x j n if x i is superior to x j in one objective f k x i < f k x j , for minimization and is not inferior in any other objective f x i / >f x j , for minimization .
For instance, Figure 4 displays the points for a two-dimension minimization problem.In the figure, point V is dominated by point I, but not by points II, III, and IV.Point VI is dominated by points I, II, and III, but not by point IV.The best front includes points I through IV and is an approximation for the Pareto set, which is the set of points that are superior to other points.To perform elitism, an algorithm commonly referred to as the fast nondominated sorting algorithm was employed details may be found in Deb et al. 44 .This algorithm separates the individuals in the population into several layers or fronts F i , such that the solutions in F 1 are nondominated, and every solution in a given front F i , i > 1, is dominated by at least one solution in F i−1 , and not by any solution in F j , j ≥ i.This can be achieved in O n log n time 44 .
Selection is performed by sequentially selecting points from each nondominated front F 1 , F 2 , . . .until the number of required individuals for the next iteration is obtained.Criteria must be applied if, after the addition of a group of individuals from F i , the maximum number x 1,(2,t) x 1,(1,t) x 2,(1,t) x 2,(2,t) x 1,(1,t+1) x 2,(1,t+1) x 1,(2,t+1) x 2,( x n, (1,t) x n, (2,t) x n, (1,t+1) x n,(2,t+1) of individuals is exceeded.The algorithm computes a measure of diversity the crowding distance, as defined by Deb et al. 44 to ensure the highest possible diversity.Thus, only the points with the largest crowding distance are kept for future iterations, as shown in Figure 5.
Crossover and mutation are somewhat independent of the multiobjective nature of the problem but are highly dependent on the application.For the problem at hand, a crossover mechanism known as "uniform" was selected 45 .Uniform crossover is popular in multivariable encodings due to its efficiency in identifying, inheriting, and protecting common genes, as well as recombining noncommon genes 46, 47 .In this mechanism, crossovers were performed for each variable with a probability rateCro that is in accordance with the crossover operator.The crossover operator used in the algorithm was the simulated binary crossover operator SBX 48, 49 , as illustrated in Figure 6.SBX is quite convenient for real-coded GAs because it is able to simulate binary crossover operators but avoids reencoding the variables.The children x i, •,t 1 were calculated from the parents x i, •,t according to the following equation:

2.18
Mathematical Problems in Engineering where β is a random variable obtained from the following probability distribution function:

2.19
The function was designed to create a child solution that possesses a similar search power to a single-point crossover of binary-coded GAs 48 .By adjusting η, several different weights β can be generated to produce children that are similar to their parents i.e., large η or not small η .Several distributions are shown in Figure 7.
For each individual gene the decision variables K i and μ i , the mutation scheme occurs with a specific probability rateMut .As suggested by Deb and Agrawal 48 , Gaussian perturbations were added to the decision variables, K i ε i and μ i ε N i , for all i ∈ N, with ε i ∼ N 0, 1 , i ∈ {1, 2, . . ., 2N}.
After crossover and mutation, constraints 1.2 may no longer apply.To guarantee feasibility, the values of integer variables were rounded accordingly and were readjusted by applying reflection operators where 1 is the lower limit of buffer allocation, μ lowlim i is the lower limit of service allocation to ensure that ρ < 1 is applicable , K i and μ i are the resulting values after crossover and mutation, and K rfl i and μ rfl i are the results after reflection.The proposed scheme generates feasible solutions without avoiding or favoring any particular solution.
Recently, the stopping criterion of multiobjective optimization evolutionary algorithms has been analyzed in detail see, e.g., Rudenko and Schoenauer 50 and Martí et al. 51 .Evidently, the maximum number of generations numGen plays an important role in the quality of the solutions.However, increasing the number of generation may not be ideal because computational time is wasted when many iterations do not lead to a significant improvement.Thus, Rudenko and Schoenauer 50 suggested that a superior stopping criterion is obtained when a fixed number of iterations are performed without improvement.To demonstrate the complexity of the issue, Rudenko and Schoenauer 50 conducted a comprehensive set of computational experiments.Their results revealed that an obvious stopping criterion, such as the entire population possessing a rank of 1, did not indicate that evolution should be terminated.The authors proposed a local stopping criterion that computes a measure of the stability of nondominated solutions after each iteration.Another global stopping criterion was recently proposed by Martí et al. 51 .This sophisticated method views population evolution as a dynamic system, where the state of the system can be estimated by a Kalman filter.For the sake of simplicity, the criterion of Rudenko and Schoenauer 50 was employed in this study.This criterion is based on the stabilization of the maximal crowding distance, d l , measured over L generations, and is calculated by the following standard deviation:

2.21
As shown in 2.21 , d L is the average of d l over L generations.Moreover, 2.21 indicates that the MOEA stops when σ L < δ lim .Rudenko and Schoenauer 50 suggested that σ L does not depend on the actual values of the objective function because crowding distances are normalized.Furthermore, they also suggested that L and δ lim should be set to 40 and 0.02, respectively, which leads to a stopping criteria that is σ 40 ≤ 0.02.According to empirical evidence, these values are compatible with the network under consideration.

Computational Results and Discussion
To use previous implementations of GEM based on the International Mathematics and Statistics Library IMSL , the optimization algorithm was implemented in Fortran 31, 33 .The code is available from the corresponding author upon request and must be used for educational and research purposes only.The computational experiments were conducted to discover the suboptimal set of parameters that guarantee rapid convergence.Moreover, the analysis of a large and complex network of finite queues was also achieved with the proposed algorithm.

Setting the Parameters
Unfortunately, to ensure rapid convergence with a minimal amount of computational effort, the suboptimal set of parameters must be determined by trial and error, as indicated by previous studies on GAs.Thus, networks containing 3, 5, and 10 queues were used to set the parameters of MOEA see Figure 8 .For the sake of conciseness, only the results obtained from 3 and 10 nodes are presented Figures 9-12 .Different topologies of acyclic similarly sized networks were also tested, and the results not presented were similar to those obtained from series topologies.In this study, each factor was analyzed independently; specifically, each factor was varied one at a time while the other parameters were held constant.Montgomery 52 reminds us that the major disadvantage of an independent analysis is that it fails to account for interactions between variables.However, recent experiments reported by Cruz et al. 53 indicated that potential interactions were insignificant; thus, interactions between factors were neglected in this study.

Mathematical Problems in Engineering
Figure 9 presents the convergence speed in terms of σ L as a function of the number of generations.It is possible to conclude that pure mutation could be used to determine the optimal solution sometimes pure mutation solves the problem, see Mathieu et al. 54 .However, the SBX operator was also utilized because it removed irregularities from the convergence profile.The combination of pure mutation and SBX provided satisfactory results, regardless of the number of queues in the network.
The results in Figure 10 revealed that the population size popSize had a significant effect on algorithm convergence.However, the population size cannot be arbitrarily increased because the required computational effort would become prohibitive.Moreover, the performance of the algorithm was not affected by an increase in the number of network nodes.
Figure 11 displays the convergence rate as a function of the parameter rateMut.The results revealed that an increase in the mutation rate accelerated convergence; however, once a specific rate was attained, a further increase did not lead to improved convergence.Under the experimental conditions, mutation rates between 1 and 2% provided superior results, regardless of the number of nodes.Figure 12 displays the convergence as a function of η, which controls the dispersion of β q in the SBX operator, 2.19 .A further improvement in the convergence speed was not observed for values of η greater than 8.
Finally, Figures 13 and 14 illustrate the population evolution, from the starting point to the final generation.They show the population spreading over to cover an increasing of objective space.In summary all of the attempted problems could be successfully solved by employing the following combination: a combined use of mutation and SBX, a population size of 400 individuals, a mutation rate of 2%, and a dispersion parameter η of 8.Moreover, the convergence seemed to be fairly independent of the topology results not shown, for split topologies, merge topologies, and so on , the external arrival Λ , the squared coefficient of variation of the service time s 2 , and the number of nodes of the network.Additionally, to ensure that the computation is complete within a finite amount of time, the maximum number of generations numGen must be predefined.In this study, numGen was set to 4,000 generations.

Analysis of a Large Complex Network
The complex network presented in Figure 1 was extracted from the literature 20 and analyzed with the proposed method.Three different squared coefficients of variation were  analyzed s 2 {0.5, 1.0, 1.5} at an arrival rate Λ 1 of 5.0.First, the convergence speed of the genetic algorithm was confirmed to be robust for this type of problem.The experimental setup was identical to the previous analysis.However, the results indicated that convergence was stable at 2,000 iterations.Moreover, as shown in Figure 15, the convergence was largely independent of the squared coefficient of variation.
The corresponding profiles are shown in Figure 16, including the contour plot and final surface after convergence.For comparison, an exact contour plot of a single-node queue is presented in Figure 2 b , and the resemblance between the two graphs was encouraging.However, the behavior of a given network cannot be predicted without the use of an algorithmic approach such as the one proposed here.A detailed analysis of the results in Figure 16 revealed that many different pairs of buffers and service rates can be selected for a given throughput.Additionally, it is possible to evaluate the results when the buffer size or service rate is so high that it does not have an effect on the throughput i.e., when the respective contour lines are parallel to the axes .Moreover, with the proposed algorithm, important insights related to the target throughput are obtained.For example, the results in Figure 16 d suggest that it is easier to increase the throughput from 2.6 to 3.1 20% improvement than 4.1 to 4.5 10% improvement .Contour lines that are far apart indicate that further improvements can be achieved only by dramatically increasing the buffer size and service rate.

Conclusions and Final Remarks
In this study, a multiobjective approach was developed to maximize the throughput of single server, general queueing networks.By combining the generalized expansion method GEM with a multiobjective evolutionary algorithm MOEA , insightful Pareto curves were obtained.These curves display the tradeoff between throughput, total buffer allocation, and overall service allocation.
Future investigations should be conducted to determine the applicability of this methodology for the determination of other optimal conditions in finite queueing networks.For instance, this method could be applied to optimize throughput in finite general, multiserver queueing networks or queueing networks with loops.Thus, the method could be used to model systems that lead to a reverse stream of products.Moreover, future research should be conducted to evaluate the algorithms in real-life situations.

Figure 1 :
Figure 1: A complex network adapted from Smith and Cruz 20 .

Figure 5 :
Figure 5: Illustration of the crowding distance.

Figure 9 :
Figure 9: Effect of crossover and mutation.

c
Final surface for s 2 1.

d 5 Figure 16 :
Figure 16: Final results for the 16-node network from Figure 1.

Algorithm 1 :
and Calvete et al. 41 , whereas Carrano et al. 30 employed GAs for multiple-objective applications.Additionally, the efficiency of GAs is well established for multiobjective problems 42, 43 .Many references are provided by the aforementioned authors.The instance of MOEA used in this study was based upon the elitist nondominated sorting genetic NSGA-II algorithm of Deb et al. 44 , which is shown in Algorithm 1 .In the application of GAs for multiobjective optimization, the selection operator and elitism operator must be specifically structured to correctly identify optimal conditions as shown shortly.Elitism is based on the concept of dominance.Point x i x i 1 , x i 2 , . . ., x i n dominates point algorithm read graph, arrival, service rates, G N, A , Λ i ∀ i ∈ N P 1 ← GenerateInitialPopulation popSize for i 1untilnumGendo / * generate offspring by crossover and mutation * / Q i ← MakeNewPop P i / * combine parent and offspring * / R i ← P i ∪ Q i / * find nondominated fronts F F 1 , F 2 , . . .* / F ← FastNonDominatedSort R i / * find new population by * / / * the crowding-distance-assignment * / P i 1 ← GenerateNewPopulation R i end for P numGen 1 ← ExtractParetoSet P numGen writeP numGen 1 end algorithm Elitist multiobjective genetic algorithm NSGA-II .