Finite queueing modeling and optimization: a selected review

This review provides an overview of queueing modeling issues and of the related performance evaluation and optimization approaches, framed in a joint manufacturing and product engineering setting. Such systems are represented as queueing networks. The performance of the queueing networks is evaluated using an advanced queueing network analyzer: the generalized expansion method. Different modeling approaches are then described and optimized with respect to the key parameters of the network (e.g., buffer and server sizes, service rates, and so on).


Introduction
Queueing theory is the mathematical study of waiting lines and it enables the mathematical analysis of several related processes, including arrivals at the queue, waiting in the queue, and being served by the server. The theory enables the derivation and calculation of several performance measures which can be used to evaluate the performance of the queueing system under study. More specifically, the focus in this paper is on finite buffer queueing networks which are characterized by the blocking effect, which eventually degrades the performance, commonly measured via, for example, the throughput of the network.
Queueing modeling and optimization of large-scale manufacturing systems and complex production lines have been and continue to be the focus of numerous studies for decades (e.g., see Smith [1][2][3]). Queueing networks are commonly used to model such complex systems (see Suri [4]). The main reason to use queueing models is their ability to accurately capture the resource allocation problems we are interested in (i.e., the buffer, server, and buffer-server combination problems), under approximately Poisson arrivals and general service rates. Of course, there are other methodologies tailored for different settings, for instance, simulation methods (see Law and Kelton [5]) and advanced methods that explore the spectral characteristics of the matrices associated with Markovian models (e.g., see the works of Yeralan and Tan [6] and Fadiloglu and Yeralan [7], among others).
This review aims at providing an overview of modeling, performance evaluation, and optimization approaches from a queueing theory point of view. Additionally, the algorithms selected were implemented and tested in some basic queueing network topologies, namely, series, merges, and splits. The numerical results provide new insight into this important class of manufacturing network design problems.
The paper is structured as follows. In Section 2, we present the performance evaluation method considered for the queueing networks analyzed. Also in Section 2, we elaborate on the different optimization models that exist and discuss some of the optimization tools that are used to optimize these models. Section 3 gives computational results for some selected optimization models for a complex network. Finally, Section 4 concludes the paper and gives some pointers for future research in the area.

Journal of Applied Mathematics
In closed queueing networks, customers never leave or enter the system but a fixed number of customers circulate within the network (see Whitt [9] and Smith [10]). Mixed queueing networks are systems that are open with respect to some customers and are closed with respect to other customers (see Balsamo et al. [11]). Research in the area of queueing networks is very active as we shall see and resulted in a vast amount of journal papers, books, and reports. For a more general discussion on queueing networks, the reader is referred to Walrand [12] among others. In this paper, we will focus on finite queueing networks.
The assumption is that the capacity of the buffer space between two consecutive connected service stations is finite. As a consequence, each node in the network may be affected by events at other nodes, leading to the phenomena of blocking and starvation. In the literature, two general blocking mechanisms are presented: blocking after service (BAS) and blocking before service (BBS). BAS occurs when, after service, a customer finds that the buffer in front of her/him is full and as a consequence s/he cannot continue her/his way through the network. BBS implies that a server can start processing the next customer only if there is space available in the downstream buffer; if not, the customer has to wait until a space becomes available. Most production lines operate under BAS. Moreover, in the literature, it is the most commonly made assumption regarding buffer behavior (see Dallery and Gershwin [13]).

Network Performance Evaluation.
Performance evaluation tools for queues include product form methods, numerical methods, approximate methods, and simulation. Let us discuss each of them in a bit more detail; more in-depth information can be found in the references mentioned below.
First, the product form methods decompose the system into single pairs or triplets of nodes instead of analyzing the entire system at once. Details may be found in Perros [14]. Each decomposed node can then be treated as an independent service provider, for which all results and insights of single-node queueing models can be used (see, e.g., Gross et al. [15]). Jackson [16,17] first showed that the joint distribution of the entire network is the product of the marginal distributions at each of the nodes, under some strict conditions (e.g., exponential arrivals and services, and no blocking). A decomposition technique yields exact results for queueing networks with product form solutions. For networks without a product form solution, the technique gives a good approximation (see Balsamo et al. [11]).
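To make the product form idea concrete, the following Python sketch (with illustrative rates and routing, not taken from any cited work) solves the traffic equations of a small open Jackson network by fixed-point iteration and evaluates the joint distribution as a product of M/M/1 marginals:

```python
def effective_rates(ext, P, iters=200):
    """Solve the traffic equations lam[j] = ext[j] + sum_i lam[i]*P[i][j]
    by fixed-point iteration (converges for open, substochastic routing)."""
    n = len(ext)
    lam = ext[:]
    for _ in range(iters):
        lam = [ext[j] + sum(lam[i] * P[i][j] for i in range(n))
               for j in range(n)]
    return lam

def joint_probability(states, lam, mu):
    """Jackson product form: P(n_1, ..., n_k) = prod_j (1 - rho_j)*rho_j**n_j,
    valid for stable M/M/1 nodes with infinite buffers and no blocking."""
    p = 1.0
    for n, l, m in zip(states, lam, mu):
        rho = l / m
        p *= (1.0 - rho) * rho**n
    return p

# A two-node tandem line: all external arrivals enter node 0, then go to node 1.
ext = [2.0, 0.0]
P = [[0.0, 1.0],
     [0.0, 0.0]]
lam = effective_rates(ext, P)                        # both nodes see rate 2.0
p_empty = joint_probability((0, 0), lam, [4.0, 5.0]) # P(system empty)
```

The point of the sketch is that once the effective rates are known, each node is analyzed in isolation, which is exactly the independence that decomposition methods try to carry over to finite-buffer networks.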
The numerical methods are also useful, as in theory these methods can be used to solve every Markovian model. The problem, however, with numerical solutions is that the state space of queueing networks grows exponentially with the number of nodes, the number of customers, and the number of buffers. As a consequence, numerical methods consume extensive computer time to get to the solution. Numerical methods are applied to smaller networks though (e.g., see Balsamo et al. [11]).
Among the approximate methods, the decomposition methods are very popular. These methods are approximate because the subnetworks are only a part of the whole line and, as such, do not have exactly the same behavior (see Dallery and Gershwin [13]). However, if obtaining an exact solution is too expensive in terms of computational effort, approximate methods are justified. The main challenge with approximate methods is to be as close as possible to the exact values. The accuracy of an approximate method can be tested with numerical solutions (for smaller networks) or by using simulation. The main idea of the decomposition methods is to try to generalize the ideas of independence and product form solutions from the Jackson networks to more general systems. Reiser and Kobayashi [18] and Kuehn [19] were the first to develop this approach. After them, several researchers came up with a similar approach (e.g., Buzacott and Shanthikumar [20] and Alves et al. [21]).
Finally, simulation is another way to obtain all relevant performance measures for a queueing network (see Law and Kelton [5]). Successful results on simulation based methods were reported by Cruz et al. [22,23], Pereira et al. [24], Dorda and Teichmann [25], and Cardoso et al. [26], among many others.

The Generalized Expansion Method.
In this paper, the Generalized Expansion Method (GEM) is used as the prime performance evaluation tool. Consequently, this paper provides a selected review based on the GEM and does not explicitly consider other methodologies to obtain the performance measures. Note that the models described fit any performance evaluation tool.
In general, we evaluate the performance of the network via its throughput Θ. This throughput (and all other measures, e.g., blocking probability, sojourn time, work-in-process, etc.) can be obtained via a queueing network representation. This queueing network representation then needs to be "solved" to obtain the performance of the given network. Notice that we will focus here on M/G/c/K queueing models, which in Kendall notation means a queueing system with Markovian (Poisson) arrivals, generally distributed service times, c servers in parallel, and a total capacity of K users in the queue (including those under service).
As it will be detailed soon, the GEM transforms the queueing network into an equivalent Jackson network, which can be decomposed so that each node can be solved independently of each other (similar to a product form solution approach). The GEM is an effective and robust approximation technique to measure the performance of open finite queueing networks. The effectiveness of GEM as a performance evaluation tool has been presented in many papers, including Kerbache and Smith [27][28][29], Jain and Smith [30], Smith [31], and Andriansyah et al. [32].
The GEM uses BAS, which is prevalent in most systems including production and manufacturing (see Dallery and Gershwin [13]), transportation (see van Woensel and Vandaele [33,34]), and other similar systems. Developed by Kerbache and Smith [27], the GEM has become an appealing approximation technique for performance evaluation of queueing networks due to its accuracy and relative simplicity. Moreover, exact solutions to performance measurement are restricted only to very simple networks and simulation requires a considerable amount of computational effort.
The GEM is basically a combination of two approximation methods, namely, repeated trials and node-by-node decomposition. In order to evaluate the performance of a queueing network, the method first divides the network into single nodes with revised service and arrival parameters. Blocked customers are routed to an artificial "holding node" and are repeatedly sent back to this node until they can be serviced. The addition of the holding nodes expands the network and transforms it into an equivalent Jackson network in which each node can be solved independently.
In the remaining part of this section, we will present a high-level overview of the method. For more detailed information and applications of the GEM, the reader is referred to, for example, Kerbache and Smith [28]. The GEM described below assumes that one wants to solve M/G/c/K queueing networks. Note that the methodology is generic, such that M/M/1/K, M/M/c/K, M/G/1/K, GI/G/1/K, and GI/G/c/K queueing networks could also be analyzed. Only the relevant equations (e.g., the blocking probabilities) need to be adapted for these other cases.
There are three main steps in the GEM, namely, network reconfiguration, parameter estimation, and feedback elimination. The notation for the GEM, presented in the Basic Network Notation section, will be used throughout the paper. The steps are described as follows.
Stage I: Network Reconfiguration. For each finite node in the queueing network, an artificial node is created to register the blocked jobs. By introducing such artificial nodes, we also create new routing probabilities in the network. The result of network reconfiguration can be seen from Figure 1.
There are two possible states of the finite node, namely, the saturated and the unsaturated state. Arriving jobs try to access the finite node j. With probability (1 − p_K), a job finds the finite node unsaturated, enters the queue, and is eventually serviced. However, if the finite node is saturated (with probability p_K), the job is directed to the artificial holding node h_j, where it is delayed. The delay at the artificial node is modeled as an M/G/∞ queue, representing a delay without queueing. Afterwards, the blocked job tries to reenter the finite queue, with success probability (1 − p'_K). With probability p'_K, the blocked job still finds the finite node saturated and is thus directed again to the artificial holding node h_j. This process repeats until the blocked job is able to enter the finite node.
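The repeated-trials mechanism can be illustrated with a small Monte Carlo sketch (the probability values below are purely illustrative); in expectation, a job visits the holding node p_K/(1 − p'_K) times:

```python
import random

def simulate_retries(pK, pK_prime, trials=100_000, seed=1):
    """Monte Carlo estimate of the mean number of visits to the artificial
    holding node per arriving job, under the repeated-trials mechanism."""
    random.seed(seed)
    total = 0
    for _ in range(trials):
        visits = 0
        blocked = random.random() < pK            # first attempt at the node
        while blocked:
            visits += 1                           # delayed at the holding node
            blocked = random.random() < pK_prime  # retry after the delay
        total += visits
    return total / trials

# With pK = 0.3 and pK' = 0.5, the expected number of visits is 0.3/0.5 = 0.6.
estimate = simulate_retries(0.3, 0.5)
```

This geometric retry structure is what allows the holding node to be folded back into the network parameters in the next two stages.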
Stage II: Parameter Estimation. At this stage, the values of the parameters p_K, p'_K, and μ_h are determined. Notice that the node index j is omitted for the sake of simplicity.
p_K: in order to determine the blocking probability p_K, exact analytical formulas should be used whenever possible (see Kerbache and Smith [29]). For cases in which an exact formula is unavailable, the approximations for p_K in the M/G/c/K setting provided by Smith [31] can be used. These approximations are based on a closed-form expression derived from the finite capacity exponential queue (M/M/c/K) using Kimura's [35] two-moment approximation. Smith [31] presents the resulting formula for the M/G/2/K queue as an example; the expressions of p_K for M/G/c/K queues, with c = 3, 4, . . ., are not shown here for brevity but are also available in Smith [31].
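As an illustration of the exponential building block behind these two-moment approximations, the following sketch computes the exact blocking probability of the finite capacity exponential queue M/M/c/K from its birth-death steady-state probabilities (this is the M/M/c/K base case, not Smith's [31] M/G/c/K formula itself):

```python
from math import factorial

def mmck_blocking(lam, mu, c, K):
    """Exact blocking probability p_K of an M/M/c/K queue.
    lam: arrival rate, mu: per-server service rate,
    c: number of servers, K: total capacity (queue + in service)."""
    a = lam / mu                          # offered load
    # Unnormalized birth-death state probabilities pi_n, n = 0..K.
    pi = []
    for n in range(K + 1):
        if n <= c:
            pi.append(a**n / factorial(n))
        else:
            pi.append(a**c / factorial(c) * (a / c)**(n - c))
    return pi[K] / sum(pi)

# Single server, zero buffer (c = 1, K = 1): p_K = rho/(1 + rho).
p_loss = mmck_blocking(5.0, 10.0, 1, 1)   # 1/3 for rho = 0.5
```

The two-moment approach then corrects such exponential results using the squared coefficient of variation of the general service time.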
p'_K: since no exact method is available to calculate p'_K, an approximation from Labetoulle and Pujolle [36], based on diffusion techniques, is used. In this approximation, z₁ and z₂ are the roots of a polynomial in the arrival and service rates (the full expressions are given in [36]), with λ_j = λ − λ_h(1 − p'_K), in which λ and λ_h are the actual arrival rates to the finite node and to the artificial holding node, respectively. Furthermore, it can be argued that the delay distribution at the holding node h_j is actually nothing but the remaining service time of the finite node j. Based on renewal theory, one can formulate the rate μ_h of the remaining service time as μ_h = 2μ_j/(1 + σ_j²μ_j²), in which σ_j² is the service time variance of the finite node j.
At this point, one should notice that if the service time of the finite node is exponentially distributed with rate μ_j, then the memoryless property of the exponential distribution holds, such that μ_h = μ_j.

Stage III: Feedback Elimination. As a result of the feedback loop at the holding node, a strong dependency on the arrival process is created. In order to eliminate this dependency, the service rate at the holding node must be adjusted to μ'_h = (1 − p'_K)μ_h. As a consequence, the service rate at the node i preceding the finite node is affected as well. One can see that the mean service time at node i is 1/μ_i when the finite node is unsaturated and 1/μ_i + 1/μ_h when the finite node is saturated.
Thus, on average, the mean service time of the node i preceding the finite node becomes 1/μ̃_i = 1/μ_i + p_K/μ'_h, in which μ'_h = (1 − p'_K)μ_h is the holding-node service rate after feedback elimination. The above equations apply to all finite nodes in the queueing network.
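As a numerical sketch of Stages II and III, the following helpers implement the renewal-theory residual rate and the averaged upstream service rate as commonly presented in the GEM literature (the probability values plugged in below are purely illustrative):

```python
def residual_service_rate(mu, var):
    """Rate of the remaining (residual) service time of the finite node,
    mu_h = 2*mu/(1 + var*mu**2); reduces to mu for exponential service."""
    return 2.0 * mu / (1.0 + var * mu**2)

def effective_upstream_rate(mu_i, mu_h, pK, pK_prime):
    """Mean service rate of the node feeding the finite node after the
    GEM feedback elimination: 1/mu_tilde = 1/mu_i + pK/mu_h_adjusted."""
    mu_h_adj = (1.0 - pK_prime) * mu_h    # holding-node rate, feedback removed
    return 1.0 / (1.0 / mu_i + pK / mu_h_adj)

# Exponential service with rate 4 has variance 1/16, so mu_h = 4 again.
mu_h = residual_service_rate(4.0, 1.0 / 16.0)
# With no blocking (pK = 0), the upstream node keeps its nominal rate.
unblocked = effective_upstream_rate(10.0, mu_h, 0.0, 0.0)
```

The second function shows how blocking slows down the upstream node: any positive p_K strictly reduces its effective service rate.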
Summary. To sum up, all performance measures of the network can be obtained by simultaneously solving the set of equations derived in Stages I-III, that is, the expressions for the blocking probabilities p_K and p'_K, the holding-node rate μ_h, and the effective arrival and service rates.
Note that the expression given for p_K applies only to the M/G/2/K setting. Other expressions for p_K in M/G/c/K queues, with c = 3 to c = 10, have been developed by Smith [31] and can be used within the same set of equations.

Optimization Models.
In this section, we review some of the optimization models found in the literature. Given a directed graph G(V, E) representing the queueing network, characterized by Poisson arrivals, in which V is the set of nodes (queues), with nonnegative buffers, multiple servers, and a general service time distribution, and E is the set of directed arcs (pairs of queues) interconnecting the nodes, we can optimize (i) the number of buffers, (ii) the number of servers used at each vertex v ∈ V, (iii) the characteristics of the service distribution (e.g., the service rates and/or the service variability), (iv) the routing probabilities associated with the arcs in E, or (v) any combination of these possible decision variables.
In general, we can write the generic optimization model as minimizing the total allocation Z(x) = Σ_{v∈V} x_v (i.e., over all vertices v ∈ V), subject to the constraint that the network provides a minimum throughput Θ_min.
A number of specific models can be specified based on the above generic model.
(i) When we set x ≡ B, the buffer allocation problem (BAP) appears. One extra constraint needs to be added to reflect the integrality condition, B_v ∈ N, for all v ∈ V. The objective function is then Z_BAP = min Σ_{v∈V} B_v. This model formulation is presented and discussed in detail in Smith [37], Smith and Cruz [8], and Smith et al. [38], in which series, merge, and split topologies were examined using the GEM to estimate the performance of these queueing networks and an iterative search methodology based on Powell's [39] algorithm to find the optimal buffer allocation within the network. The papers of Gershwin and Schor [40] and Shi and Gershwin [41] also deal with buffer allocation.
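The flavor of the BAP can be conveyed by a toy sketch that exhaustively splits a buffer budget among independent M/M/1/K nodes; this is a deliberate simplification, since a real network requires the GEM (or similar) to capture blocking between nodes, and all rates below are illustrative:

```python
from itertools import product

def mm1k_throughput(lam, mu, K):
    """Throughput lam*(1 - p_K) of an M/M/1/K queue."""
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        pK = 1.0 / (K + 1)
    else:
        pK = (1.0 - rho) * rho**K / (1.0 - rho**(K + 1))
    return lam * (1.0 - pK)

def best_buffer_allocation(lams, mus, total_buffers):
    """Exhaustive BAP sketch: split total_buffers among independent
    M/M/1/K nodes (K_v = 1 + B_v) to maximize total throughput."""
    n = len(lams)
    best = (None, -1.0)
    for alloc in product(range(total_buffers + 1), repeat=n):
        if sum(alloc) != total_buffers:
            continue
        thr = sum(mm1k_throughput(l, m, 1 + b)
                  for l, m, b in zip(lams, mus, alloc))
        if thr > best[1]:
            best = (alloc, thr)
    return best
```

Here the balanced allocation wins because the two nodes in the test are identical; asymmetric rates shift buffers toward the more congested node, as observed in the cited studies.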
(ii) The server allocation problem (CAP) appears if we set x ≡ c. Again, an extra integrality constraint is needed, c_v ∈ N, for all v ∈ V. The objective function is then Z_CAP = min Σ_{v∈V} c_v. The CAP was considered by Smith et al. [42], in which a methodology was developed built upon two-moment approximations to the service time distribution, embedded in the GEM for computing the performance measures in complex finite queueing networks, and Powell's [39] algorithm for optimally allocating servers to the network topology.
(iii) Combining the server and buffer allocation problems by setting x ≡ (B, c) results in the joint buffer and server allocation problem (BCAP). In this case, the integrality constraints are B_v, c_v ∈ N, for all v ∈ V. Next to these integrality constraints, one more constraint is needed: it is necessary to ensure that the traffic intensity satisfies ρ ≡ λ/(cμ) < 1, to guarantee a finite optimal solution. Note that buffers can be equal to zero, hence yielding a zero-buffer system (more on bufferless systems can be found in Andriansyah et al. [32], in which such networks were evaluated in terms of throughput using the GEM, compared to simulations, and a multiobjective optimization approach was adopted to derive Pareto efficient curves showing the trade-off between the total number of servers used and the throughput). Secondly, note that the objective function needs to be adapted slightly to take into account the two objectives (i.e., buffers and servers).
We consider two options to rewrite the objective function depending on how to deal with the multiobjective issue.
(a) First, the objective function can be written as a weighted sum of the two objectives, that is, Z_BCAP1 = min Σ_{v∈V} [α c_v + (1 − α) B_v]. We assign a cost of α to servers and (1 − α) to buffers. We can then modify the value of α, with 0 < α < 1, to reflect the relative cost of servers versus buffers. As α is decreased, the cost of servers becomes relatively lower than that of buffers; that is, buffers are then more expensive than servers. Alternatively, when the value of α is increased, the servers become more costly relative to the buffers. In this way, we can evaluate whether different pricing of servers and buffers results in a significantly different buffer and server allocation. It is worthwhile to mention that if α = 0, the above problem reduces to the pure BAP, and if α = 1, the pure CAP is obtained. The BCAP1 problem was treated in detail by van Woensel et al. [43], in which the joint optimization of the number of buffers and servers was first solved by means of Powell's [39] method, a classical nonlinear derivative-free optimization algorithm, while a two-moment approximation and the GEM computed the performance measure of interest (the throughput). The proposed methodology was capable of handling the trade-off between the number of servers and buffers, yielding better throughput than previously published studies. Also, the importance of the squared coefficient of variation of the service time was stressed, since it strongly influences the approximate optimal allocation.

(b) Secondly, the objective function can be formulated in a multicriteria way, that is, Z_BCAP2 = min [Z₁(c), Z₂(B)], in which each of the two objectives is considered explicitly, with Z₁(c) ≡ Σ_{v∈V} c_v and Z₂(B) ≡ Σ_{v∈V} B_v. Consequently, one obtains an approximation of the Pareto set of solutions for the two objectives. As such, this perspective is more general than the BCAP1 formulation. For further details on multiobjective optimization in general, see Chankong and Haimes [44]. The BCAP2 formulation was treated by Cruz et al. [45,46], which developed a multiobjective genetic algorithm to satisfy these conflicting objectives and to produce an approximation of the complete set of all best solutions, known as the Pareto optimal or noninferior set.
(iv) A slightly different formulation is the optimal routing problem (OROP). Here, the routing probabilities p_{v,u} are determined such that the throughput Θ(p) is maximized, in which p is the routing probability matrix. Of course, the routing probabilities on the arcs leaving each vertex v ∈ V and reaching its successors u, such that (v, u) ∈ E, should sum up to one, that is, Σ_{u∈S(v)} p_{v,u} = 1, in which S(v) is the set of succeeding vertices of vertex v; that is, S(v) ≡ {u | (v, u) ∈ E}.
The throughput will thus be affected by the effective routing of jobs through the network, the variability of the service distribution, the number of servers, and the number of buffers. Among the papers that dealt with the OROP, we could mention Gosavi and Smith [47] and Daskalaki and Smith [48]. Additionally, Cruz and van Woensel [49] solved the OROP by using the GEM as the performance evaluation tool for the finite queueing network and optimizing by means of a heuristic based on Powell's [39] algorithm.
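A minimal OROP sketch routes a Poisson stream between two parallel finite queues and searches the split probability on a grid; independent M/M/1/K nodes stand in for the full network evaluation, and the rates are illustrative:

```python
def mm1k_throughput(lam, mu, K):
    """Throughput lam*(1 - p_K) of an M/M/1/K queue."""
    if lam == 0.0:
        return 0.0
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        pK = 1.0 / (K + 1)
    else:
        pK = (1.0 - rho) * rho**K / (1.0 - rho**(K + 1))
    return lam * (1.0 - pK)

def best_split(Lam, mu1, mu2, K1, K2, grid=1001):
    """Grid search over the routing probability p sent to queue 1;
    (1 - p) goes to queue 2."""
    best_p, best_thr = 0.0, -1.0
    for i in range(grid):
        p = i / (grid - 1)
        thr = (mm1k_throughput(p * Lam, mu1, K1)
               + mm1k_throughput((1.0 - p) * Lam, mu2, K2))
        if thr > best_thr:
            best_p, best_thr = p, thr
    return best_p, best_thr
```

For two identical queues the search recovers the balanced split p = 0.5, which already beats sending the whole stream to one queue; asymmetric rates or capacities shift the optimal split accordingly.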
(v) A last variation considered is the profit maximization model. The models are thus expanded with financial indicators in order to maximize the profit generated. This profit is a function of the quantity one can sell in the market (i.e., the throughput) and of the costs to realize this throughput, which could be the buffer and/or server allocation. The decision variable is thus the investment in buffers and servers, (B, c). Assume a unit cost for buffers and servers and a unit gain per unit of throughput. The objective is then to maximize the revenue from the realized throughput, minus the allocation costs, minus a penalty term proportional to the unmet demand [Θ_min − Θ(B, c)], which is either positive or zero (i.e., Θ(B, c) ≤ Θ_min). Penalty costs are charged when the system throughput does not meet the market demand (i.e., Θ(B, c) < Θ_min); they can include, for example, the cost of outsourcing production to another factory. Figure 2 displays the behavior of this optimization function for a network of three M/G/c/K queues in tandem, with μ = 10 and s² = 1.5 for all nodes, an external arrival rate Λ₁ = 5, and Θ_min = 5. It shows the achieved profit at the optimal buffer allocation, for a unit throughput gain and different cost settings. When the operational expense (the ratio of the unit allocation cost to the unit gain) increases, it is more attractive to underachieve market demand (i.e., Θ(B, c) ≪ Θ_min), and the optimal throughput decreases. When the penalty cost (relative to the unit gain) increases, it becomes less attractive to underachieve market demand, and the optimal throughput increases.
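The shape of this profit function can be sketched for a single M/M/1/K node (illustrative numbers; the paper's Figure 2 is based on a three-node tandem evaluated with the GEM):

```python
def mm1k_throughput(lam, mu, K):
    """Throughput lam*(1 - p_K) of an M/M/1/K queue."""
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        pK = 1.0 / (K + 1)
    else:
        pK = (1.0 - rho) * rho**K / (1.0 - rho**(K + 1))
    return lam * (1.0 - pK)

def profit(B, lam, mu, gain, buffer_cost, penalty, theta_min):
    """Revenue from the realized throughput, minus buffer cost, minus a
    penalty proportional to the unmet demand [theta_min - Theta]^+."""
    thr = mm1k_throughput(lam, mu, 1 + B)   # K = one server slot + B buffers
    shortfall = max(0.0, theta_min - thr)
    return gain * thr - buffer_cost * B - penalty * shortfall

def best_buffer(lam, mu, gain, buffer_cost, penalty, theta_min, B_max=30):
    """Sweep B = 0..B_max and return the most profitable buffer size."""
    return max(range(B_max + 1),
               key=lambda B: profit(B, lam, mu, gain,
                                    buffer_cost, penalty, theta_min))
```

Cheap buffers push the optimum toward meeting the demand; expensive buffers make underachieving demand optimal, mirroring the trends described above.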
It is worthwhile to state that the models described above are difficult nonlinear integer programming problems. Considering the BCAP model, it can be shown that, for a network with n nodes, the solution space grows exponentially in the number of nodes, although not (exponentially) in the capacity of each node. The complexity of the BCAP model is thus exponential in n.

Optimization Methodologies.
While the GEM computes the performance measures for the queueing network, many of the models discussed above still need to be optimized over the decision variables defined in x. Note that, of course, many optimization methods exist. An exhaustive discussion is beyond the scope of this paper; the interested reader is referred to Aarts and Lenstra [50] and the references therein. We describe two methodologies which have proven to be successful for the models described above, namely, Powell's [39] algorithm and a genetic algorithm approach. Of course, small problems can always be solved by enumeration.
Powell's [39] algorithm can be described as an unconstrained optimization procedure that does not require the calculation of first derivatives of the function. Numerical examples from Himmelblau [51] have shown that the method is capable of minimizing a function with up to twenty variables. Powell's method locates the minimum of a nonlinear function f(x) by successive unidimensional searches from an initial starting point x(0) along a set of conjugate directions, which are generated within the procedure itself. Powell's method is based on the idea that if a minimum of a nonlinear function f(x) is found along conjugate directions in a stage of the search, and an appropriate step is made in each direction, then the overall step from the beginning to the k-th step is conjugate to all of the subdirections of the search.

Genetic algorithms (GA) are optimization algorithms that perform an approximate global search relying on the information obtained from the evaluation of several points in the search space, obtaining a population of these points that converges to the optimum through the application of the genetic operators mutation, crossover, selection, and elitism. Each of these operators may be implemented in several different ways, each characterizing a specific instance of GA. Additionally, convergence of a GA is guaranteed by assigning fitness to each population member and preserving diversity at the same time. For instance, recent successful applications of GA were reported by Lin [52] and Calvete et al. [53], for single-objective applications, and by Carrano et al. [54] and Cruz et al. [45,46], for multiple-objective applications. A wealth of references is given by these authors. For an application of GA to manufacturing problems, see Andriansyah et al. [32].
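As a stand-in for such derivative-free searches over integer allocations, a simple greedy sketch adds one buffer at a time wherever it raises throughput most, until a target throughput is met; independent M/M/1/K nodes again replace a full GEM evaluation, and the rates are illustrative:

```python
def mm1k_throughput(lam, mu, K):
    """Throughput lam*(1 - p_K) of an M/M/1/K queue."""
    rho = lam / mu
    if abs(rho - 1.0) < 1e-12:
        pK = 1.0 / (K + 1)
    else:
        pK = (1.0 - rho) * rho**K / (1.0 - rho**(K + 1))
    return lam * (1.0 - pK)

def greedy_bap(lams, mus, theta_min, max_buffers=100):
    """Greedy heuristic sketch: repeatedly add one buffer to the node
    where it raises total throughput most, until theta_min is reached."""
    n = len(lams)
    B = [0] * n
    def total(Bv):
        return sum(mm1k_throughput(l, m, 1 + b)
                   for l, m, b in zip(lams, mus, Bv))
    while total(B) < theta_min and sum(B) < max_buffers:
        gains = []
        for i in range(n):
            B[i] += 1
            gains.append(total(B))
            B[i] -= 1
        B[max(range(n), key=lambda k: gains[k])] += 1
    return B, total(B)
```

Unlike Powell's method or a GA, this sketch offers no global-optimality guarantee; it only illustrates the kind of search-plus-evaluation loop that the real methods iterate over.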

Results and Insights
In this section, we will focus on one example network and describe the results for some of the different optimization models discussed above. We consider a combination of the three basic topologies (series, split, and merge), as shown in Figure 3. This network consists of 16 nodes, with the processing rate of the servers in each node given in the figure. The network is adopted from Smith and Cruz [8]. We use exactly the same values for Λ, μ, s², and the routing probabilities of the splitting node (p#1 and p#2). Note that routing probability p#1 refers to the upper branch of the node, while p#2 refers to the lower branch. Refer to Figure 3 for the position of each node in the network.

The Buffer Allocation Problem (BAP).
Table 1 shows the results from Smith and Cruz [8] for this network structure with Λ = 5 and the routing probabilities equal to 0.5 (Table 29 in their paper). The optimization method used was Powell's algorithm with multiple restarts to avoid local optima. Note that Smith and Cruz [8] considered a single-server setting and therefore the number of servers in all nodes is set to 1 while optimizing the buffer allocation. Based on Table 1, we see that the first node (the most congested) receives more buffers to cope with the relatively high arrival rate.

The Server Allocation Problem (CAP).
Let us now fix the number of buffers beforehand and then optimize the number of servers used. More specifically, we set all buffers equal to 1 and look at the resulting server allocation. The results are given in Table 2, also obtained with Powell's algorithm with multiple restarts to avoid local optima. Interestingly, we observe the same behavior as for the buffer allocation; that is, the first node receives more resources than the remaining nodes. On the other hand, the number of servers added is relatively low compared to the number of buffers added (e.g., 5 versus 8, for s² = 0.5). This is because a server also acts as a buffer, but a server adds more value, measured in throughput, since servers actually provide service.

The Joint Buffer-Server Allocation Problem (BCAP).
Before going to the results for the example network, we analyze the difference between buffers and servers. We saw that the BAP and the CAP give different results in terms of the number of servers versus the number of buffers used. Let us assume that we have a single zero-buffer node with one server (i.e., c = 1, B = 0, and K = 1), submitted to an external arrival rate Λ = 5.0, a service rate μ = 10, and a squared coefficient of variation of the service time distribution s² ∈ {0.5, 1.0, 1.5}. Figure 4 gives the percentage increase in throughput of adding either a server (one to nine extra servers compared to the base case) or a buffer (one to eleven extra buffers compared to the base case) to the zero-buffer base situation.
It is clear that, in this case, the first added buffer or first added server gives the largest contribution to the throughput value, which is limited by the external arrival rate Λ. Note that the addition of the first extra server gives an increase in throughput of about 58% to 78%, depending upon the squared coefficient of variation of the service time distribution s², while the first added buffer only gives a 36% to 39% increase. It is important to mention that, in order to achieve the same increase in throughput using only buffers, we need four to six extra buffer spaces, depending on s², rather than only one extra server.
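This comparison can be reproduced in the exponential case (s² = 1) with the exact M/M/c/K model; the percentages differ from the two-moment M/G figures quoted above, but the ordering (a server helps more than a buffer) is preserved:

```python
from math import factorial

def mmck_throughput(lam, mu, c, K):
    """Exact throughput lam*(1 - p_K) of an M/M/c/K queue."""
    a = lam / mu
    pi = [a**n / factorial(n) if n <= c
          else a**c / factorial(c) * (a / c)**(n - c)
          for n in range(K + 1)]
    pK = pi[K] / sum(pi)
    return lam * (1.0 - pK)

base        = mmck_throughput(5.0, 10.0, 1, 1)  # c = 1, B = 0 (K = 1)
plus_server = mmck_throughput(5.0, 10.0, 2, 2)  # one extra server
plus_buffer = mmck_throughput(5.0, 10.0, 1, 2)  # one extra buffer

server_gain = plus_server / base - 1.0          # about 38.5% in the M/M case
buffer_gain = plus_buffer / base - 1.0          # about 28.6% in the M/M case
```

The extra server both holds a job and serves it, which is exactly why it dominates the extra buffer space in this experiment.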
The results for the joint buffer-server allocation problem are presented in Table 3, in which the server/buffer price ratio gives an indication of the relative cost of servers compared to buffers; the results were obtained with Powell's algorithm. A price ratio of 8 : 1, for example, means that servers are 8 times more expensive than buffers. The results in Table 3 show a higher throughput than for the pure BAP (Table 1) in every setting. As expected, we found that the optimal server allocation in the BCAP differs from the server settings in the pure BAP. This, however, depends strongly upon the price ratio of buffers versus servers. We found that a single-server configuration is not optimal for this particular queueing network structure, except when buffers become relatively too expensive. For those cases, we found that single servers are indeed optimal (see the rows where the price ratio is 8 : 1). We also observe that (near) zero-buffer configurations are identified where appropriate, that is, where servers are relatively cheap compared to buffers. Varying the squared coefficient of variation of the service time distribution s² does result in some changes in the optimal server and buffer allocation, which shows the importance of models dealing with general service times. The results show that the number of buffers tends to be large under high variability, which could be expected, since an increase in s² means an increase in variability. The extra buffers are there to handle this high variability.

Final Remarks and Insights.
The above numerical results for the buffer allocation problem, the server allocation problem, and the joint buffer-server allocation problem show that significant gains can be achieved in manufacturing systems. Specifically, setting the buffers and servers appropriately greatly affects the throughput of these manufacturing systems. This is important, as these systems need to be utilized as highly as possible, given the high investments involved. Our models and optimizations show that the optimal configurations are not always straightforward, and thus advanced models and solution methods are needed. We have followed a queueing network approach with finite buffers, as this resembles reality most closely. This modeling approach is of course much harder than, for example, infinite-buffer queueing networks. We see, based on the various experiments, that our solution methodology is powerful and suitable for the different types of models handled in this paper. This offers managers and manufacturing systems designers a powerful tool to work with.
We saw that the BAP and the CAP obviously give different results. We also note that while the addition of the first extra server gives a certain increase in throughput, the addition of the first buffer space generally gives a lower increase. In other words, in order to achieve the same increase in throughput using only buffers, we need several extra buffer spaces rather than only a few extra servers.

Conclusions
This review provided an overview of the different modeling issues and of the performance evaluation and optimization approaches for manufacturing systems from a queueing theory point of view. We discussed the merits of the Generalized Expansion Method as a performance evaluation tool for finite queueing networks; this methodology has proved in the literature to be a valuable approach. Secondly, different optimization models were discussed, namely, the buffer allocation problem, the server allocation problem, the joint buffer and server allocation problem, and some other models. The different optimization models were shown to be hard nonlinear integer programming problems, which can be approximately solved with a Powell-based heuristic. The paper ended with an overview of some results for the different models considered on a complex queueing network.

Future Research Suggestions.
In this paper, we considered the throughput as the main performance measure. Instead of the throughput, it would be interesting to evaluate the behavior of the models based on cycle time, work-in-process (WIP), or other performance measures. In a number of industrial improvement projects carried out, we observed that the critical issue in using the above models is data availability. More specifically, processing rates, arrival rates, uncertainty in the service process, and so on need to be extracted from the available databases. An interesting approach to obtaining the relevant data is the effective process time (EPT) point of view (see Hopp and Spearman [55]). The advantage of the EPT approach is that various types of disturbances on the shop floor are aggregated into EPT distributions, which enables effective modeling. However, it is important to note that disturbances which are aggregated into the EPT distribution cannot be analyzed separately afterwards. Hence, shop-floor realities or disturbances which are to be modeled explicitly, and thus excluded from the EPT aggregation, must be defined beforehand.
Topics for future research on the queueing part include the analysis and optimization of networks with cycles, for example, to model many important industrial systems that have loops, such as systems with captive pallets and fixtures or reverse streams of products due to rework, or even the extension to GI/G/c/K queueing networks with generally distributed and independent arrivals.

Basic Network Notation

Λ: External Poisson arrival rate to the network
λ_j: Poisson arrival rate to node j
λ̃_j: Effective arrival rate to node j
μ_j: Exponential mean service rate at finite node j
μ̃_j: Effective service rate at finite node j due to blocking
p_K: Blocking probability of a finite queue of size K
p'_K: Feedback blocking probability in the expansion method
h_j: Artificial holding node (queue) preceding node j, created in the GEM
c_j: Number of servers at node j
K_j: Total capacity at node j, including the items in service
B_j ≡ K_j − c_j: Buffer capacity at node j, excluding the items in service
V: Set of nodes (queues) in the queueing network
E: Set of arcs (pairs of nodes) in the queueing network
ρ_j ≡ λ_j/(c_j μ_j): Traffic intensity at node j
θ_j: Mean throughput rate at node j
s_j²: Squared coefficient of variation of the service time distribution at node j.