Traffic intensity estimation in finite Markovian queueing systems



Introduction
Queueing models are idealizations of many real systems. However, they enable the accurate determination of performance measures only after a preliminary step has been fulfilled: the statistical estimation of their parameters [1,2]. It is impossible to discuss inference in queues without mentioning the pioneering work of Clarke in the 1950s, who described maximum likelihood estimators for the arrival rates and service times in simple queues, and the work of Schruben and Kulkarni in the 1980s, who considered the problem of bias in queue estimation. It is also important to mention the Bayesian papers on the subject, including that of Muddapur in the 1970s, who published one of the first results extending Clarke's methodology, the series of papers from Armero and Bayarri [3,4], Armero and Conesa [5][6][7][8], Choudhury and Borthakur [9], and, more recently, Chowdhury and Mukherjee [10,11], Cruz et al. [12], and Quinino and Cruz [13]. These are only a few examples of the papers in this important research area.
The purpose of this paper is to evaluate the behavior of estimators of the traffic intensity ρ, defined as the ratio between the arrival rate λ and the service rate μ, specifically for single-server finite Markovian queues, which model various real systems [14]. The type of queue studied here is typical of many service-oriented settings in which a finite queue forms in front of a server. Think of a gas station, where cars can queue up in a limited amount of space and the traffic intensity should not be too high, since otherwise cars might not join the queue. Another example is a supermarket, where customers line up for checkout. A good understanding of the traffic intensity is crucial in these situations. Good estimates are needed to properly design the system (e.g., to size the buffer in front of the gas pumps) or to properly manage it (e.g., to set the number of waiting spaces at the checkout counters).

```r
MLERoMM1K <- function(K, samp) {
  loglike.f <- function(rho, K, n, sumxi) {
    n * log(1 - rho) + sumxi * log(rho) - n * log(1 - rho^(K + 1))
  }
  Eps.MLE <- 1e-06
  n <- length(samp)
  sumxi <- sum(samp)
  res <- optimize(loglike.f, c(Eps.MLE, 1 - Eps.MLE), K, n, sumxi,
                  maximum = TRUE, tol = Eps.MLE)
  return(res$maximum)
}
```

Listing 1: MLE for ρ.
In summary, previous results obtained for infinite Markovian queues are extended here for finite Markovian queues. To reach this goal this paper combines some techniques and approaches from the work of Almeida and Cruz [15] (i.e., Bayesian inference and Monte Carlo simulation for evaluation of estimators under finite samples) with other classical tools (e.g., Gibbs sampling and bootstrapping).
The remainder of the paper is organized as follows. Section 2 details the queue equations and estimators for . The computational results are presented and discussed in Section 3, followed by Section 4, which concludes the text with some final remarks and topics for future research in the area.

Material and Methods
A system with Poisson arrivals, exponentially distributed service times, a single server, and limited waiting space is an M/M/1/K queue; in Kendall notation, K represents the maximum number of customers simultaneously allowed in the queueing system. The probability of i users in the system, for i = 0, 1, ..., K, is given by [14]

p_i = (1 − ρ) ρ^i / (1 − ρ^(K+1)),  (1)

where ρ = λ/μ is the traffic intensity (for ρ ≠ 1; when ρ = 1, the distribution is uniform over {0, 1, ..., K}). Estimating the traffic intensity is important because it is a key design parameter in production network design, routing of products, and so on.
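As a quick numerical illustration of (1), the sketch below tabulates the stationary distribution. The paper's listings are in R; this Python stand-in (the function name mm1k_pmf is ours) mirrors the same formula:

```python
import numpy as np

def mm1k_pmf(rho, K):
    """Stationary probabilities p_i of the number of customers in an
    M/M/1/K queue, equation (1); the rho == 1 limit is uniform."""
    if rho == 1.0:
        return np.full(K + 1, 1.0 / (K + 1))
    i = np.arange(K + 1)
    return (1.0 - rho) * rho**i / (1.0 - rho**(K + 1))

p = mm1k_pmf(0.8, 5)
print(p.round(4))  # probabilities over 0..5 customers; they sum to 1
```

Note that (1) remains a proper distribution even for ρ > 1, which is possible in a finite queue: the mass then concentrates near K instead of near 0.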

Maximum Likelihood Estimator.
Maximum likelihood estimation for the truncated geometric model has been known since Thomasson and Kapadia [16]. Consider the stationary probability distribution given by (1). Next, consider a random sample of size n, x = (x_1, x_2, ..., x_n)^T, where x_i is the number of customers an outside observer finds in the system, with at most K customers allowed in the queueing system at once. Therefore, the likelihood function is

L(ρ | x) = (1 − ρ)^n ρ^S / (1 − ρ^(K+1))^n,  (2)

where S = Σ_{i=1}^{n} x_i. Note that the likelihood is a function of the traffic intensity ρ and the sample x, although only its size, n, and its sum, S, which is a sufficient statistic for ρ, are necessary. Needless to say, for the implementation of the maximum likelihood estimator (MLE), any bounded optimization algorithm could be used. However, for the sake of simplicity, the implementation used in this study was encoded in R [17] and can be seen in Listing 1. For convenience, the logarithm of the likelihood function was considered because it turns products into sums. Maximizing the log-likelihood is done numerically through an R internal function. Tests (not shown) conducted with the original likelihood function indicated that the results did not change significantly in terms of accuracy or computational effort.
A nice feature of MLEs is their invariance to transformations [18]: if ρ̂ is the MLE of ρ and θ = h(ρ) is a function of ρ, then θ̂ = h(ρ̂) is the MLE of θ. Thus, the expected number of customers in the system, L, and the average queue length, Lq, have, respectively, the following MLEs [14]:

L̂ = ρ̂/(1 − ρ̂) − (K + 1) ρ̂^(K+1) / (1 − ρ̂^(K+1)),  (3)

L̂q = L̂ − (1 − p̂_0), with p̂_0 = (1 − ρ̂)/(1 − ρ̂^(K+1)),  (4)

in which ρ̂ is the MLE for ρ.
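The closed forms (3) and (4) can be sanity-checked against a direct expectation over the pmf (1). The Python sketch below (names are ours, not the paper's) does exactly that:

```python
import numpy as np

def mm1k_L_Lq(rho, K):
    """Mean number in system L, (3), and mean queue length Lq, (4),
    for an M/M/1/K queue with rho != 1."""
    L = rho / (1 - rho) - (K + 1) * rho**(K + 1) / (1 - rho**(K + 1))
    p0 = (1 - rho) / (1 - rho**(K + 1))   # P(empty system)
    Lq = L - (1 - p0)                     # subtract P(server busy)
    return L, Lq

# cross-check against expectations computed directly from the pmf (1)
rho, K = 0.7, 10
i = np.arange(K + 1)
p = (1 - rho) * rho**i / (1 - rho**(K + 1))
L, Lq = mm1k_L_Lq(rho, K)
print(abs(L - (i * p).sum()) < 1e-10)                       # prints: True
print(abs(Lq - (np.maximum(i - 1, 0) * p).sum()) < 1e-10)   # prints: True
```

The check works because L = E[X] and Lq = E[max(X − 1, 0)] under the stationary distribution, and the invariance property then carries these identities over to the MLEs.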

Bayesian Inference.
One of the alternatives for making inferences is the Bayesian method. One of its main differences from the classical method is that it allows a priori information about the unknown parameters to be incorporated into the model. Unlike the classical method, the Bayesian method treats these parameters as random variables, i.e., associates them with probability distributions.
Therefore, the knowledge that the manager has regarding a given unknown parameter can be taken into account. Two different prior distributions are described next; other alternatives are possible, such as those proposed in the study by Armero and Bayarri [3]. Lingappaiah [19] also considered a Bayesian approach to this distribution.

Beta Prior.

The inference process for estimating ρ starts with (2) and assumes an a priori beta distribution, that is, ρ ∼ Beta(a, b), which has been used successfully for inference in other Markovian queues [12,13] and results in the following a posteriori distribution:

π(ρ | x) ∝ ρ^(S+a−1) (1 − ρ)^(n+b−1) / (1 − ρ^(K+1))^n, 0 < ρ < 1.  (5)

Because the a posteriori distribution (5) is not a known distribution, it is necessary to use an approximation method to generate samples from it. The implementation of the Bayesian estimator for the beta prior is shown in Listing 2. Note that this is a bounded one-dimensional problem, and numerical integration could be used as well to find the a posteriori distribution, followed by another numerical integration to compute the estimator. An open research question is whether one method could be consistently superior to the other, depending on the sample size used in Gibbs sampling and the precision of the two numerical integrations.
A sample of size 5,000 is extracted from the a posteriori distribution by using the function arms, which is available in R's HI package [20]. The a posteriori distribution was represented by the logarithm of its probability density function (without the normalization constant). Because a quadratic loss function is considered, the Bayes point estimator is simply the average of the sample. Examples of a priori beta distributions are shown in Figure 1.
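The paper samples the posterior with R's HI::arms (adaptive rejection Metropolis sampling). A language-agnostic way to see the same computation is a simple random-walk Metropolis sampler targeting the density (5). The Python sketch below is our stand-in, not the paper's implementation; all names (log_post, bayes_beta_mm1k, the step size) are ours:

```python
import numpy as np

def log_post(rho, K, n, S, a=1.0, b=1.0):
    """Log of the beta-prior posterior (5), up to an additive constant."""
    return ((S + a - 1) * np.log(rho) + (n + b - 1) * np.log(1 - rho)
            - n * np.log(1 - rho**(K + 1)))

def bayes_beta_mm1k(samp, K, a=1.0, b=1.0, draws=5000, step=0.05, seed=1):
    """Posterior mean of rho under quadratic loss, via random-walk
    Metropolis (a stand-in for HI::arms used in Listing 2)."""
    rng = np.random.default_rng(seed)
    n, S = len(samp), float(np.sum(samp))
    rho, chain = 0.5, []
    for _ in range(draws):
        prop = rho + step * rng.normal()
        if 0 < prop < 1 and np.log(rng.uniform()) < \
                log_post(prop, K, n, S, a, b) - log_post(rho, K, n, S, a, b):
            rho = prop
        chain.append(rho)
    return float(np.mean(chain[draws // 5:]))  # drop 20% burn-in
```

As in the paper, the point estimate is the posterior sample mean because a quadratic loss is assumed; a different loss (e.g., absolute) would call for a different summary (e.g., the median).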

Jeffreys Prior.

One might argue that it would be more natural to use a noninformative Jeffreys prior distribution, which is defined in terms of the Fisher information I(ρ). Thus, the following prior distribution for ρ can be obtained:

π(ρ) ∝ √I(ρ).

Thus, from (1), I(ρ) can be found as follows. The logarithm of p_x(ρ), from (1), is given by

log p_x(ρ) = log(1 − ρ) + x log ρ − log(1 − ρ^(K+1)).

The first and second derivatives are given, respectively, by

d log p_x(ρ)/dρ = −1/(1 − ρ) + x/ρ + (K + 1)ρ^K/(1 − ρ^(K+1)),

d² log p_x(ρ)/dρ² = −1/(1 − ρ)² − x/ρ² + K(K + 1)ρ^(K−1)/(1 − ρ^(K+1)) + (K + 1)²ρ^(2K)/(1 − ρ^(K+1))².

It follows that

I(ρ) = −E[d² log p_X(ρ)/dρ²] = 1/(1 − ρ)² + E[X]/ρ² − K(K + 1)ρ^(K−1)/(1 − ρ^(K+1)) − (K + 1)²ρ^(2K)/(1 − ρ^(K+1))².

The expectation E[X] is given by [14]

E[X] = ρ/(1 − ρ) − (K + 1)ρ^(K+1)/(1 − ρ^(K+1)).

Then the Jeffreys prior follows by substituting I(ρ) into the expression above:

π(ρ) ∝ √I(ρ).  (13)

Thus, combining the likelihood, (2), and the Jeffreys prior, given by (13), it is possible to find the following posterior probability distribution for ρ:

π(ρ | x) ∝ √I(ρ) (1 − ρ)^n ρ^S / (1 − ρ^(K+1))^n,  (14)

in which 0 < ρ < 1.
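The closed form of the Fisher information can be verified numerically, since I(ρ) must equal the expected squared score under (1). The Python sketch below performs this check (the function name fisher_info is ours, and the closed form is our reconstruction of the expression in the text):

```python
import numpy as np

def fisher_info(rho, K):
    """Fisher information I(rho) of one observation from the M/M/1/K
    stationary distribution (1), rho != 1."""
    EX = rho / (1 - rho) - (K + 1) * rho**(K + 1) / (1 - rho**(K + 1))
    return (1 / (1 - rho)**2 + EX / rho**2
            - K * (K + 1) * rho**(K - 1) / (1 - rho**(K + 1))
            - (K + 1)**2 * rho**(2 * K) / (1 - rho**(K + 1))**2)

# numerical check: I(rho) = E[score^2] under the pmf (1)
rho, K = 0.6, 8
i = np.arange(K + 1)
p = (1 - rho) * rho**i / (1 - rho**(K + 1))
score = -1 / (1 - rho) + i / rho + (K + 1) * rho**K / (1 - rho**(K + 1))
assert abs((p * score**2).sum() - fisher_info(rho, K)) < 1e-9
```

The identity E[score²] = −E[d² log p/dρ²] holds under the usual regularity conditions, so the assertion passing for several (ρ, K) pairs is strong evidence that the derivatives above are consistent with the pmf (1).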
Note that, unlike the earlier case, (5), the posterior distribution given by (14) is fixed: it is not possible to vary the Jeffreys prior, which assumes a specific form. The implementation of the Bayesian estimator for the Jeffreys prior is shown in Listing 3. Because the logarithm of the posterior distribution is used, the variable Irho, as defined in Listing 3, must satisfy 0 < Irho < ∞.

Bootstrap Correction.
Among the bias correction methods commonly applied to estimators, the bootstrap is widely used [21]. In its nonparametric version, the method consists of drawing several (usually B = 1,000) resamples x* (with replacement) from the original sample. The parameter of interest is reestimated from each resample by some (usually biased) method, yielding estimates Θ̂*, whose average Θ̄* is then calculated, and the bias is estimated as

B̂ = Θ̄* − Θ̂,  (15)

where Θ̂ is the estimate obtained from the original sample. The following corrected version of the estimator is then obtained:

Θ̂_c = Θ̂ − B̂ = 2Θ̂ − Θ̄*.  (16)

The procedure is illustrated in Figure 2. The R code [17] for the bootstrap corrected estimator is shown in Listing 4. The parameter K is the maximum number of users simultaneously in the system (in service and waiting) in the M/M/1/K queue. Note that 1,000 bootstrap replicates are used and that the correction is applied to the MLE (see Listing 1).

Besides bias correction, the bootstrap has long been used with good results in confidence interval construction and hypothesis testing [22]. As an example, an empirical bootstrap confidence interval is used in this work as a simple interval estimator for ρ. If the distribution of δ = Θ̂ − Θ were known, then the critical values δ_(α/2) and δ_(1−α/2) could be found, where δ_γ is its 100γ% percentile, and then

P(δ_(α/2) ≤ Θ̂ − Θ ≤ δ_(1−α/2)) = 1 − α,  (17)

which gives a (1 − α)100% confidence interval of

(Θ̂ − δ_(1−α/2); Θ̂ − δ_(α/2)).  (18)

The bootstrap makes it possible to estimate the distribution of δ by the distribution of δ* = Θ̂* − Θ̂, where Θ̂* is the estimate obtained from an empirical bootstrap sample, as explained earlier.
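The bias correction (16) and the empirical interval (18) can be sketched end to end. The Python below is our illustration, not the paper's Listing 4: it substitutes a golden-section search for R's optimize(), and all names are ours:

```python
import numpy as np

def mle_rho(samp, K, eps=1e-6):
    """MLE of rho via golden-section search of the log-likelihood (2)
    (mirrors Listing 1, which uses R's optimize())."""
    n, S = len(samp), float(np.sum(samp))
    def ll(r):
        return n * np.log(1 - r) + S * np.log(r) - n * np.log(1 - r**(K + 1))
    lo, hi, g = eps, 1 - eps, (np.sqrt(5) - 1) / 2
    for _ in range(100):                       # shrink bracket around the maximum
        c, d = hi - g * (hi - lo), lo + g * (hi - lo)
        if ll(c) < ll(d):
            lo = c
        else:
            hi = d
    return (lo + hi) / 2

def bootstrap_rho(samp, K, B=1000, alpha=0.05, seed=0):
    """Bias-corrected estimate (16) and empirical bootstrap CI (18)."""
    rng = np.random.default_rng(seed)
    samp = np.asarray(samp)
    theta = mle_rho(samp, K)
    stars = np.array([mle_rho(rng.choice(samp, samp.size, replace=True), K)
                      for _ in range(B)])
    corrected = 2 * theta - stars.mean()       # theta - (mean(stars) - theta)
    d = stars - theta                          # bootstrap distribution of delta
    ci = (theta - np.quantile(d, 1 - alpha / 2), theta - np.quantile(d, alpha / 2))
    return corrected, ci
```

Consistent with the paper's findings, the corrected point estimate tends to have lower bias than the raw MLE, at the cost of extra variability.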

Simulation of M/M/1/K Queues.
The number of users present in an M/M/1/K queue follows the distribution given by (1). Several methods to efficiently generate random variates from a discrete distribution are available in the literature, including the function sample from R [17]. The method used here is the discrete analog of the inverse transformation method, in which it is necessary to generate U ∼ Unif(0, 1), i.e., from a uniform distribution between 0 and 1, and to know the probabilities of interest, P{X = x_j} = p_j, ∀j. To simulate a discrete random variable X with this probability function, it suffices to set X = x_j whenever Σ_{i<j} p_i ≤ U < Σ_{i≤j} p_i, because then P{X = x_j} = p_j, and X follows the required probability distribution. For M/M/1/K queues, from (1), the cumulative probabilities are

Σ_{i=0}^{j} p_i = (1 − ρ^(j+1))/(1 − ρ^(K+1)).

Therefore,

X = ⌈ log(1 − U(1 − ρ^(K+1))) / log ρ ⌉ − 1,

where ⌈t⌉ is the ceiling function; that is, its value is the least integer not inferior to t.
Listing 5 presents the R implementation [17] used in this study with a sample call.
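Under the closed-form inverse transform above, the generator is a one-liner per draw. This Python sketch (the function name rmm1k is ours) parallels the paper's R implementation in Listing 5:

```python
import numpy as np

def rmm1k(size, rho, K, rng=None):
    """Draw `size` observations of the number in system for an M/M/1/K
    queue via the closed-form inverse transform
    X = ceil(log(1 - U(1 - rho^(K+1))) / log(rho)) - 1."""
    rng = rng or np.random.default_rng()
    if rho == 1.0:                            # uniform limit of (1)
        return rng.integers(0, K + 1, size=size)
    u = rng.uniform(size=size)
    x = np.ceil(np.log1p(-u * (1 - rho**(K + 1))) / np.log(rho)) - 1
    return np.clip(x, 0, K).astype(int)       # guards the u == 0 edge case
```

The same expression works for ρ > 1, since both the numerator and denominator of the cumulative probability flip sign together; only the ρ = 1 case needs the separate uniform branch.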

Computational Results
To analyze the performance of the estimators, 10,000 Monte Carlo replications were made using the R code [17] shown in Listing 6. Note that fESt(K, samp, ...) can be any of the implemented estimation functions for ρ (MLE, Bayesian, and bootstrap corrected MLE) and that the state of the random seed .GlobalEnv$.Random.seed was stored immediately before the call to fESt(K, samp, ...), which can be stochastic, and reloaded immediately after it, to ensure that the samples generated for the estimates were the same for all estimators. Among the estimators compared is the Bayesian method via the Jeffreys prior distribution, with an average of 1,000 samples from the a posteriori distribution, (14), obtained by the function arms [23] from R's HI package [20] via the code from Listing 3.
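A compressed, self-contained Python analogue of the Monte Carlo loop of Listing 6 is sketched below, with a grid-search MLE standing in for fESt (everything here, including the function name mc_eval and the grid resolution, is our sketch):

```python
import numpy as np

def mc_eval(rho, K, n, reps=2000, seed=3):
    """Monte Carlo estimates of the bias and MSE of a grid-search MLE
    of rho, a light stand-in for the paper's Listing 6 loop."""
    rng = np.random.default_rng(seed)
    grid = np.linspace(1e-3, 1 - 1e-3, 999)   # candidate rho values
    i = np.arange(K + 1)
    p = (1 - rho) * rho**i / (1 - rho**(K + 1))
    errs = np.empty(reps)
    for r in range(reps):
        samp = rng.choice(i, size=n, p=p)     # sample from the pmf (1)
        S = samp.sum()
        ll = (n * np.log(1 - grid) + S * np.log(grid)
              - n * np.log(1 - grid**(K + 1)))
        errs[r] = grid[np.argmax(ll)] - rho   # estimation error this rep
    return errs.mean(), (errs**2).mean()      # (bias, MSE)
```

For example, mc_eval(0.5, 10, 50) returns the estimated bias and MSE of the MLE for ρ = 0.5, K = 10, and samples of size n = 50; sweeping ρ, K, and n reproduces the kind of comparison reported in the tables and figures below.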
The results are shown in Tables 1, 2, and 3. For queues with capacity K = 5 (Figure 3), an approximately constant average estimation error is observed for the MLE and the bootstrap corrected MLE, except when ρ ≈ 1.0 (Figure 3(a)). The Bayesian estimators (beta prior and Jeffreys prior) did not show equivalent performance; their estimates tended to overestimate the true value (positive error) when ρ < 0.5 and to underestimate it otherwise. Regarding the sample size, n, all of the estimators showed a monotonic decrease in error (Figure 3(c)). On the MSE side, both Bayesian estimators generally presented themselves as the best alternative because they showed the lowest values most of the time. Although the bootstrap corrected MLE presented the lowest bias, it achieves this performance at the cost of high variability, as reflected by its MSEs, which are the highest.
When the queue capacity was increased slightly to K = 20 (Figure 4), very similar behavior was noted. Finally, for queues with K = 80 (Figure 5), the observed behavior could be considered, for practical purposes, equal to that of an infinite Markovian queue in terms of the average estimation error and the MSE, as previously observed for M/M/1 queues [15]. This finding is evidence of the correctness of our implementations and of the quality of the computational results presented.
Additionally, computational experiments were performed for the estimators of L, (3), and of Lq, (4), and for their bootstrap corrected versions, with 1,000 resamplings and K = 20. Table 4 and Figure 6 show the results obtained for ρ ∈ {0.5, 1, 2, 4, 8, 16}, for sample sizes n ∈ {10, 20, 50, 100}, with averages over 10,000 Monte Carlo replications. Similarly, for Lq, the results are presented in Table 5 and Figure 7. In summary, at the extra cost of the bootstrap method and without inflation of the MSEs, the researcher may achieve, with samples of size n = 10, estimates for L and Lq with the same average error as the MLE gives for samples of size n = 100, a reduction that is relevant in practical terms because it may reduce the time and cost of obtaining the estimates. Note that the bootstrap method always provides smaller errors and MSEs than the MLE for all estimates of L and Lq, even when ρ > 1. Also note the jump in the errors and MSEs when ρ transitions from below 1 to above 1.
Finally, to illustrate the ease of use of the bootstrap in the interval estimation of the traffic intensity ρ, computational experiments were performed. The length and coverage of empirical bootstrap intervals computed from (18) and from a normal approximation (i.e., Θ̂ ± z_(α/2) σ̂, where z_γ is the 100γ% percentile of the standard normal distribution and the standard deviation σ̂ of Θ̂ was also estimated by bootstrapping) were evaluated for ρ ∈ {0.10, 0.20, 0.50, 0.90}, for sample sizes n ∈ {10, 20, 50, 100, 200}, averages of 10,000 Monte Carlo replications, and K = 20. The satisfactory performance of the bootstrap was demonstrated, as presented in Table 6, with the coverages approaching the nominal confidence of 95% (that is, 1 − α = 0.95) as the sample sizes increase.

Numerical Example.
To better illustrate an application of the method, a numerical application is provided, based on the data in Table 7, collected in a large supermarket network in a region of interest [12]. The goal is to evaluate the traffic intensity, which, for managerial reasons, should not exceed 87%; if it does, users may leave. The data comprise 200 random observations of the number of customers in the system, taken at random times sufficiently spaced and previously defined by the person responsible for collecting the data to avoid correlation. The observed values (V) and frequencies (F) are presented in Table 7. For instance, of the 200 observations, at 8 times no customers were found in the system; at 21 times, only 1 customer was found; and so on. The estimates are shown in Table 8; they were calculated using the MLE (Listing 1) and the Bayesian estimators (Listings 2 and 3), based on an a priori Beta(1.0, 1.0) with 5,000 samples and on the Jeffreys prior with 1,000 samples from the a posteriori distribution, (14). According to the results presented in the previous section, the Bayesian estimates should be the most reliable. Notably, the system utilization seemed to be below the target (ρ̂ = 0.8405396 < 0.87). Note that the analysis is based only on counts of the number of users in the system; it is not necessary to estimate the arrival and service rates separately to determine ρ.

Conclusions and Final Observations
The problem of traffic intensity estimation in finite Markovian queues (M/M/1/K queues) proves quite challenging. In fact, no estimator is absolutely superior to the others over the whole parametric space. Although the estimates from the MLE and the bootstrap corrected MLE exhibit less bias, the Bayesian estimates (beta and Jeffreys priors) present the lowest MSE in general. Perhaps because of the skewness of the a posteriori distribution, the Bayesian estimators do not present low bias. In general, for sample size n = 50 and queues with K ≥ 20, the average estimation error is less than 0.005; this value was exceeded only by the Bayesian estimator for queues with total capacity K = 5.
Regarding the behavior of the average estimation error and the average MSE as functions of the traffic intensity ρ, the largest errors are observed when the sample size is small (n ≤ 20) and the traffic intensity is ρ ≈ 1.0, unlike the case of M/M/1 queues, which exhibit higher biases when ρ ≈ 0.5. Perhaps because of the truncation of the number of users at the maximum queue length, K, systems with high traffic intensities require more computational effort and are the most difficult to estimate.
Finally, it is important to note that, for queues with capacity K = 80, the system's behavior is similar to that of an infinite Markovian queue (an M/M/1 queue), as expected; that is, the average estimation error is larger and the MSE highest when ρ ≈ 0.5.
Future work in this area includes testing other Bayesian point estimators (e.g., the median, because of the asymmetry of the a posteriori distribution), developing interval estimators and hypothesis testing methods, and even kernel-based methods [24].

Data Availability
The data used to support the findings of this study are included within the article.

Disclosure
The Brazilian government funding agencies mentioned had no role in the study.