mBm-Based Scalings of Traffic Propagated in Internet

Scaling phenomena of Internet traffic attract the interest of researchers ranging from computer scientists to statisticians. There are two types of scaling: small-time scaling and large-time scaling. Tools that describe them separately are desired in computer communications, for example, in the performance analysis of network systems. Conventional tools, such as the standard fractional Brownian motion (fBm), its increment process, or the standard multifractional Brownian motion (mBm) indexed by the local Hölder function $H(t)$, may not be enough for this purpose. In this paper, we propose to describe the local scaling of traffic by using $D(t)$ on a point-by-point basis and to measure the large-time scaling of traffic by using $E[H(t)]$ on an interval-by-interval basis, where $E$ denotes the expectation operator. Since $E[H(t)]$ is a constant within an observation interval while $D(t)$ is random in general, they are uncorrelated with each other. Thus, the proposed method can be used to characterize the small-time and the large-time scaling phenomena of traffic separately, providing a new tool to investigate the scaling phenomena of traffic.


Introduction
Consider an application that sends a series of packets from a source to a destination through the Internet. Suppose a traffic series passes through $I$ servers, from the first server with the service curve $S_1(t)$ to the $I$th server with the service curve $S_I(t)$, to reach the destination. Then, the communication from the first server to the $I$th one can be expressed by Figure 1 (Li and Zhao [1], Li [2]), where $A_j^1(t)$ is the arrival traffic accumulated within the time interval $[0, t]$ and $D_j^I(t)$ is the departure traffic within $[0, t]$. Let $a_j^i(t)$ be the instantaneous arrival traffic, that is, the bytes of a packet at time $t$ from connection $j$ at the input port of server $i$ with the service curve $S_i(t)$. Then, the accumulated function regarding $a_j^i(t)$ in the time interval $[0, t]$ is given by

$$A_j^i(t) = \int_0^t a_j^i(t)\,dt. \tag{1.1}$$

We now consider the aggregated traffic $x(t)$. By aggregated traffic, we mean the following:

$$x(t) = \sum_{j=1}^{N} a_j^i(t), \tag{1.2}$$

where $N$ is the positive number representing all connections at the input port of server $i$. In this research, the traffic time series $x(t)$ is in the sense of (1.2). The accumulated traffic within the interval $[t_0, t]$ is given by

$$A(t) = \int_{t_0}^{t} x(t)\,dt. \tag{1.3}$$

In the field of traffic modeling, there are two categories of traffic models. One is deterministic modeling, more precisely, bounded modeling, and the other is stochastic modeling, see Li and Borgnat [3] and Michiel and Laevens [4]. Scaling plays a role in all types of traffic models, see, for example, Willinger et al. [5], Feldmann et al. [6], Jiang [7], and Papagiannaki et al. [8]. There are two types of scaling phenomena in traffic. One is small-time scaling and the other is large-time scaling, see, for example, Paxson and Floyd [9]. This paper aims at investigating the two types of scaling phenomena of traffic for both the bounded modeling, say, of $A(t)$, and the stochastic modeling of $x(t)$.
Note that a commonly used model of $x(t)$ in the wide-sense stationary case is the self-similar process, that is, fractional Gaussian noise (fGn), see, for example, Stalling

(i) We claim that the small-time scaling phenomenon is independent of the large-time one, and vice versa, based on the model of Cruz.

(ii) We propose using mBm to analyze the scaling phenomena of traffic as follows: the small-time scaling phenomenon is described by $D(t)$ on a point-by-point basis, and the large-time scaling phenomenon is characterized by $E[H(t)]$ on an interval-by-interval basis.
The rest of the paper is organized as follows. We give the preliminaries regarding conventional time series in Section 2, aiming at pointing out why scaling is a topic in traffic of the fractal type. We describe the reason why the small-time scaling phenomenon of traffic is independent of its large-time one in Section 3. In Section 4, we introduce a two-parameter model of mBm for the scaling analysis of traffic based on the local Hölder function $H(t)$. Finally, Section 5 concludes the paper.

Preliminaries
Traffic time series on old telephony networks is in the class of the Poisson processes, such as the Poisson one and its compound ones, see Erlang [32]

The early fractal model used for traffic modelling is the self-similar process with long-range dependence (LRD), that is, fGn with LRD. For this reason, we address the preliminaries in this section in the aspects of conventional time series, the stationary self-similar process, that is, fGn, and LRD processes.

Conventional Time Series
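Let $\{x_l(t)\}$ ($l = 1, 2, \ldots$) be a second-order stationary random process, where $x_l(t) \in \mathbb{R}$ is the $l$th sample function of the process and $\mathbb{R}$ is the set of real numbers. We use $x_l(t)$ to represent the process when no confusion arises. Its mean in the wide sense can be expressed by

$$E[x_l(t)] = \mathrm{const}. \tag{2.1}$$

Its autocorrelation function (ACF) can be written as

$$E[x_l(t)\,x_l(t+\tau)] = R_x^s(\tau). \tag{2.2}$$

In (2.1) and (2.2), the superscript $s$ implies that the mean and the ACF are computed by using the spatial average. The mean and the ACF of a process expressed by time average are written as

$$\mu_x^t = \lim_{T \to \infty} \frac{1}{T} \int_0^T x_l(t)\,dt, \tag{2.3}$$

$$R_x^t(\tau) = \lim_{T \to \infty} \frac{1}{T} \int_0^T x_l(t)\,x_l(t+\tau)\,dt, \tag{2.4}$$

where the superscript $t$ indicates that the mean and the ACF are computed by time average. The process $x_l(t)$ is said to be ergodic if (2.5) holds:

$$E[x_l(t)] = \mu_x^t, \qquad R_x^s(\tau) = R_x^t(\tau). \tag{2.5}$$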
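In what follows, we simply use $x(t)$ to represent a random function in general. Denote by $p(x)$ the probability density function (PDF) of $x(t)$. Then, the probability is given by

$$P(x_1) = \mathrm{Prob}[x(t) < x_1] = \int_{-\infty}^{x_1} p(\xi)\,d\xi. \tag{2.6}$$

The mean and the ACF of $x(t)$ based on the PDF are given by (2.7) and (2.8), respectively:

$$\mu_x = \int_{-\infty}^{\infty} x\,p(x)\,dx, \tag{2.7}$$

$$r_x(\tau) = \int_{-\infty}^{\infty} x(t)\,x(t+\tau)\,p(x)\,dx. \tag{2.8}$$

Let $V_x$ be the variance of $x$. Then,

$$V_x = E[(x - \mu_x)^2] = \int_{-\infty}^{\infty} (x - \mu_x)^2\,p(x)\,dx. \tag{2.9}$$

If $x(t) \in \mathbb{R}$, then it has the following properties.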
Note 1. The PDF $p(x)$ is light tailed. By light tailed, we mean that the integrals in (2.7) and (2.8) are convergent in the domain of ordinary functions.

Note 2. There exist $\mu_x$ and $V_x$ for $x(t)$ if the PDF of $x(t)$ is light tailed.
The Poisson distribution is an instance of a light-tailed distribution. It expresses the probability of a number of events occurring in a fixed period of time if these events occur with a known average rate and independently of the time since the last event. In communication networks, one is often interested in random variables $N$ that count, among other things, the number of discrete occurrences (sometimes called "arrivals") that take place during a time interval of a given length. Denote the expected number of occurrences in this interval by a positive real number $\lambda$. Then, the probability that there are exactly $n$ occurrences ($n = 0, 1, 2, \ldots$) is given by the Poisson distribution

$$P(N = n) = \frac{\lambda^n e^{-\lambda}}{n!}. \tag{2.10}$$

Note 3. The ACF of $x(t)$ with a light-tailed PDF decays fast. By "decays fast," we mean that $R(\tau)$ is integrable in the continuous case and summable in the discrete case in the domain of ordinary functions.
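Denote by $S_x(\omega)$ the power spectrum density (PSD) of $x(t)$. Then,

$$S_x(\omega) = \int_{-\infty}^{\infty} r_x(\tau)\,e^{-j\omega\tau}\,d\tau. \tag{2.11}$$

Thus, we have Note 4 below, which is a consequence of Note 3.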
Note 4. $S_x(\omega)$ exists in the domain of ordinary functions.

The results in Notes 1-4 are usually assumptions for conventional time series, as can be seen from Fuller [44], Box et al. [45], Mitra and Kaiser [46], and Bendat and Piersol [47]. We will explain below that none of Notes 1-4 may remain valid for LRD traffic.
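To make the time-average mean and ACF of a conventional series concrete, the following is a minimal sketch in Python, not part of the original paper. It assumes a discrete series and a biased estimator (division by the series length); the function names and the i.i.d. Gaussian test data are hypothetical choices for illustration only.

```python
import random


def time_average_mean(x):
    """Time-average estimate of the mean of one sample function."""
    return sum(x) / len(x)


def time_average_acf(x, max_lag):
    """Biased time-average estimate of R(k) = <x(n) x(n+k)>, k = 0..max_lag."""
    n = len(x)
    return [
        sum(x[i] * x[i + k] for i in range(n - k)) / n
        for k in range(max_lag + 1)
    ]


# Hypothetical data: i.i.d. Gaussian samples, a light-tailed series
# whose ACF is negligible at every nonzero lag.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(5000)]

mean = time_average_mean(x)
acf = time_average_acf(x, 5)
print("mean:", mean)
print("ACF at lags 0..5:", acf)
```

For such a series the estimated ACF is close to the variance at lag 0 and close to zero elsewhere, so it is summable, which is the situation described in Note 3.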

Scaling Measures for Conventional Gaussian Time Series
A Gaussian process is completely determined by its second-order properties, more precisely, its mean and ACF, see Papoulis [48] and Doob [49]. Note that the mean of $x(t)$ is a measure of the global property of $x(t)$. On the other hand, the variance of $x(t)$ measures its local property. These two points can be easily inferred from (2.3) and (2.9). For a Gaussian process with mean zero, one has

$$V_x = r_x(0). \tag{2.12}$$

Therefore, the mean and the variance (or ACF) are two essential numeric characteristics of a Gaussian process. In fact, if $x(t)$ is Gaussian, then

$$p(x) = \frac{1}{\sqrt{2\pi V_x}} \exp\left[-\frac{(x - \mu_x)^2}{2V_x}\right]. \tag{2.13}$$

However, $V_x$ or the mean of a traffic time series $x(t)$ may not exist in general due to LRD, see Li [22], which is a particular feature of a time series with LRD (Beran [50, 51]). A simple explanation is that $V_x \to \infty$ in (2.13). In the case of $V_x \to \infty$, $\mu_x$ in (2.13) is indeterminate. Therefore, the variance and the mean are no longer suitable for measuring the local property and the global one of LRD traffic.

Correlation Time of Conventional Time Series
The correlation time $t_c$ is defined by Nigam [52, page 74], see (2.14). It is a measure relating to the scaling of a random function $x(t)$. It implies that the correlation can be neglected if $t_c \le t$, where $t$ is the time scale of interest [52]. As traffic is LRD, neither the numerator nor the denominator on the right side of (2.14) exists. Therefore, the correlation time, which is a useful measure for conventional time series, is inappropriate for LRD traffic.

Brief of LRD Time Series
One says that $f(t)$ is asymptotically equivalent to $g(t)$ under the limit $t \to c$ if $f(t)$ and $g(t)$ are such that $\lim_{t \to c} f(t)/g(t) = 1$ (Murray [53]), that is,

$$f(t) \sim g(t) \quad (t \to c), \tag{2.15}$$

where $c$ can be infinity. It has the property expressed by (2.16).

In this sense, $f(t)$ is called a slowly varying function if $\lim_{u \to \infty} f(ut)/f(u) = 1$ for all $t$. A random function $x(t)$ is said to be LRD if its ACF $r(\tau)$ is nonintegrable, while it is called short-range dependent (SRD) if $r(\tau)$ is integrable. This implies that $x(t)$ is LRD if

$$r(\tau) \sim c\,\tau^{-\beta} \quad (\tau \to \infty), \quad 0 < \beta < 1, \tag{2.17}$$

where $c > 0$ can be either a constant or a slowly varying function. It is SRD if

$$\int_0^{\infty} r(\tau)\,d\tau < \infty. \tag{2.18}$$

Theoretically, any series whose ACF is nonintegrable is LRD. In the field of telecommunications, however, the term LRD traffic usually corresponds to a hyperbolically decaying ACF. Its asymptotic expression for $\tau \to \infty$ is often indexed by the Hurst parameter $H$:

$$r(\tau) \sim c\,\tau^{2H-2} \quad (\tau \to \infty), \quad 0.5 < H < 1. \tag{2.19}$$
Note 5. The tail of the PDF of LRD traffic is heavy according to Taqqu's theorem, see Abry et al. [54].

According to the Fourier transform in the domain of generalized functions (Kanwal [55], Gelfand and Vilenkin [56]), one immediately obtains the Fourier transform of the right side of (2.17), given by

$$F(|\tau|^{-\beta}) = 2\sin\left(\frac{\pi\beta}{2}\right)\Gamma(1-\beta)\,|\omega|^{\beta-1}, \tag{2.20}$$

where $F$ stands for the operator of the Fourier transform. Therefore, for LRD traffic, we have

$$S_x(\omega) \propto \frac{1}{|\omega|^{1-\beta}} \quad (\omega \to 0). \tag{2.21}$$

Note 6. LRD traffic is in the class of $1/f$ noise (Li [57]).

In summary, from the point of view of the assumption of a Gaussian distribution, we say that the tail of the PDF of LRD traffic may be so heavy that its $\mu_x$ and $V_x$ do not exist. Owing to this meaning of heavy tails, the ACF of traffic decays so slowly, in a hyperbolic manner, that it is nonintegrable. Consequently, the random variables that represent a traffic time series can no longer be considered independent; hence the term LRD or long memory. On the other hand, the PSD of LRD traffic obeys a power law, see (2.21); hence the term $1/f$ noise.
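The difference between a summable (SRD) and a nonsummable (LRD) ACF can be illustrated numerically. The sketch below, which is not from the original paper, compares partial sums of a hyperbolically decaying ACF $k^{2H-2}$ with those of an exponentially decaying one; the value $H = 0.8$, the decay rate 0.5, and all function names are assumptions chosen only for illustration.

```python
import math


def partial_sum(acf, n_terms):
    """Sum of acf(k) over lags k = 1..n_terms."""
    return sum(acf(k) for k in range(1, n_terms + 1))


def hyperbolic_acf(k, H=0.8):
    """Hyperbolically decaying ACF k^(2H-2); nonsummable for 0.5 < H < 1."""
    return k ** (2 * H - 2)


def exponential_acf(k, rate=0.5):
    """Exponentially decaying ACF exp(-rate * k); summable."""
    return math.exp(-rate * k)


for n in (100, 1000, 10000):
    print(n, partial_sum(hyperbolic_acf, n), partial_sum(exponential_acf, n))
```

The exponential partial sums settle at the geometric-series limit $1/(e^{0.5} - 1)$, while the hyperbolic ones keep growing with the number of lags, which is the numerical signature of long memory.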

Brief of Self-Similar Time Series
A random function $x(t)$ is said to be self-similar if it satisfies the definition of self-similarity given by

$$x(at) \equiv a^{H} x(t), \quad a > 0, \tag{2.22}$$

where $\equiv$ denotes equality in the sense of probability distribution.
Note 7. The concept of LRD differs from that of self-similarity (Li [23]).

Note 8. The self-similarity described by (2.22) is in the global sense.

The commonly used self-similar model of traffic is fGn in the stationary case and fBm in the nonstationary case. We briefly review them in the next subsection.

fGn and fBm for Traffic with LRD
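fGn is the only stationary increment process with self-similarity (Samorodnitsky and Taqqu [58]). We discuss it in this subsection to exhibit the limitation of fGn in describing the two types of scaling phenomena of traffic. Let $B(t)$ be Brownian motion (Bm). Let $B_H(t)$ be the fBm of the Weyl integral type with the Hurst parameter $H \in (0, 1)$. Let $\Gamma(\cdot)$ be the Gamma function. Then,

$$B_H(t) - B_H(0) = \frac{1}{\Gamma(H + 1/2)} \left\{ \int_{-\infty}^{0} \left[ (t-u)^{H-0.5} - (-u)^{H-0.5} \right] dB(u) + \int_{0}^{t} (t-u)^{H-0.5}\,dB(u) \right\}. \tag{2.23}$$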
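The function $B_H(t)$ has the following properties.

(i) $B_H(0) = 0$.

(ii) The increments $B_H(t + t_0) - B_H(t_0)$ are Gaussian.

(iii) $\mathrm{Var}[B_H(t + t_0) - B_H(t_0)] = V_H\,t^{2H}$.

Thus, the ACF of $B_H(t)$, denoted by $r_{B_H,W}(t, s)$, is given by

$$r_{B_H,W}(t, s) = \frac{V_H}{2} \left[ |t|^{2H} + |s|^{2H} - |t - s|^{2H} \right], \tag{2.24}$$

where

$$V_H = \Gamma(1 - 2H)\,\frac{\cos \pi H}{\pi H}. \tag{2.25}$$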
Denote by $S_{B_H,W}(t, \omega)$ the PSD of $B_H(t)$. Then (Flandrin [59])

$$S_{B_H,W}(t, \omega) = \frac{1}{|\omega|^{2H+1}} \left( 1 - 2^{1-2H} \cos 2\omega t \right). \tag{2.26}$$

From the above, we see that both the ACF and the PSD of $B_H(t)$ are time varying. Therefore, $B_H(t)$ is nonstationary. Note that $B_H(t)$ is self-similar because it satisfies the definition of self-similarity. In fact,

$$B_H(at) \equiv a^{H} B_H(t), \quad a > 0, \tag{2.27}$$

where $\equiv$ denotes equality in the sense of probability distribution. From (2.26), one sees that the PSD of $B_H(t)$ is divergent at $\omega = 0$, exhibiting a case of $1/f$ noise, see Csabai [60] for the early work on $1/f$ noise in traffic theory. The relationship between the fractal dimension of fBm, denoted by $D_{\mathrm{fBm}}$, and its Hurst parameter, denoted by $H_{\mathrm{fBm}}$, is given by

$$D_{\mathrm{fBm}} = 2 - H_{\mathrm{fBm}}. \tag{2.28}$$
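Note that the increment series, $B_H(t + s) - B_H(t)$, is fGn. Thus, the ACF of the discrete fGn (dfGn) is given by

$$r(k) = \frac{V_H}{2} \left[ (k+1)^{2H} - 2k^{2H} + (k-1)^{2H} \right], \quad k = 1, 2, \ldots.$$

Since the ACF is an even function, we have

$$r(k) = \frac{V_H}{2} \left[ (|k|+1)^{2H} - 2|k|^{2H} + \big| |k| - 1 \big|^{2H} \right],$$

where $k \in \mathbb{Z}$. Denote by $C_H(\tau; \varepsilon)$ the ACF of fGn in the continuous case. Then,

$$C_H(\tau; \varepsilon) = \frac{V_H\,\varepsilon^{2H-2}}{2} \left[ \left( \frac{|\tau|}{\varepsilon} + 1 \right)^{2H} + \left| \frac{|\tau|}{\varepsilon} - 1 \right|^{2H} - 2 \left| \frac{\tau}{\varepsilon} \right|^{2H} \right],$$

where $\varepsilon > 0$ is used for smoothing fBm so that the smoothed fBm is differentiable. The PSD of dfGn was derived quite early by Sinaĭ [61]. It is given by

$$S_{\mathrm{dfGn}}(\omega) = 2 C_f (1 - \cos \omega) \sum_{n=-\infty}^{\infty} |2\pi n + \omega|^{-2H-1},$$

where $C_f = V_H (2\pi)^{-1} \sin(\pi H)\,\Gamma(2H + 1)$ and $\omega \in [-\pi, \pi]$. The PSD of fGn is (see Li and Lim [62])

$$S_{\mathrm{fGn}}(\omega) = V_H \sin(H\pi)\,\Gamma(2H + 1)\,|\omega|^{1-2H},$$

which exhibits that fGn belongs to the class of $1/f$ noises. Note that $0.5\left[(\tau + 1)^{2H} - 2\tau^{2H} + (\tau - 1)^{2H}\right]$ can be approximated by $H(2H - 1)\tau^{2H-2}$. In fact, the former is the finite second-order difference of $0.5\,\tau^{2H}$. Approximating it with the second-order differential of $0.5\,\tau^{2H}$ yields

$$0.5\left[(\tau + 1)^{2H} - 2\tau^{2H} + (\tau - 1)^{2H}\right] \approx H(2H - 1)\,\tau^{2H-2}. \tag{2.34}$$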
From the above, one immediately sees that fGn contains three subclasses of time series. In the case of $H \in (0.5, 1)$, the ACF is nonsummable and the corresponding series is LRD. For $H \in (0, 0.5)$, the ACF is summable and fGn in this case is SRD. fGn reduces to white noise when $H = 0.5$.
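The three subclasses can be checked numerically with a short sketch that is not part of the original paper. It evaluates the second-difference form of the dfGn ACF, $0.5\,V_H\left[(|k|+1)^{2H} - 2|k|^{2H} + ||k|-1|^{2H}\right]$, and its large-lag approximation $V_H H(2H-1)|k|^{2H-2}$; setting $V_H = 1$ and the function names are assumptions made for illustration.

```python
def fgn_acf(k, H, V_H=1.0):
    """ACF of discrete fGn at integer lag k (second difference of 0.5*|k|^(2H))."""
    k = abs(k)
    return 0.5 * V_H * (
        (k + 1) ** (2 * H) - 2 * k ** (2 * H) + abs(k - 1) ** (2 * H)
    )


def fgn_acf_approx(k, H, V_H=1.0):
    """Large-lag approximation V_H * H * (2H - 1) * |k|^(2H - 2), for k != 0."""
    return V_H * H * (2 * H - 1) * abs(k) ** (2 * H - 2)


for H in (0.3, 0.5, 0.8):
    print("H =", H, [round(fgn_acf(k, H), 4) for k in range(0, 5)])

print("exact :", fgn_acf(100, 0.8))
print("approx:", fgn_acf_approx(100, 0.8))
```

With $H = 0.5$ every nonzero lag gives zero (white noise), with $H = 0.3$ the lag-one correlation is negative, and with $H = 0.8$ the correlations stay positive and decay slowly.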
Among LRD processes, fGn has its advantage in traffic modeling. For example, it can easily represent two types of traffic series, namely, self-similar processes and processes with LRD. Note that LRD is a global property of traffic. In principle, however, self-similarity is a local property of traffic, which is measured by the fractal dimension $D$. For fGn, one gets (Li et al. [63])

$$D_{\mathrm{fGn}} = 2 - H_{\mathrm{fGn}}.$$

Hence, for fGn-type traffic, the local properties of traffic happen to be reflected in the global ones, as noticed in mathematics by Mandelbrot [64].
The above discussions exhibit that the standard fGn as well as fBm has its limitation in traffic modeling because it uses a single parameter H to characterize two different phenomena, that is, small-time scaling and large-time one.The former is a local property and the latter is a global one.

Large-Time Scaling of Traffic Is Independent of Its Small-Time One
Traffic $x(t)$ is nonnegative, that is,

$$x(t) \ge 0. \tag{3.1}$$

The above holds because $x(t)$ is arrival traffic. In addition,

$$x_{\min} \le x(t) \le x_{\max}, \tag{3.2}$$

where $x_{\min}$ and $x_{\max}$ are constants restricted by the IEEE standard without technical reasons except the need to limit delays. For instance, the Ethernet protocol forces all packets of $x(t)$ to have $x_{\min} = 64$ bytes and $x_{\max} = 1518$ bytes without considering the Ethernet preamble and header (Stalling [10]). Due to the functionality of TCP, traffic exhibits "burstiness", see

The above exhibits, taking into account (3.5) and (3.6) together, that the accumulated traffic within $[t_0, t]$ is bounded by

$$\int_{t_0}^{t} x(t)\,dt \le \sigma + \rho\,(t - t_0). \tag{3.7}$$

Equation (3.7) implies that traffic has scaling phenomena of two kinds. One is small-time scaling and the other is large-time scaling.
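A bounded model of the Cruz type can be checked against a packet trace. The following minimal sketch, not taken from the original paper, assumes the usual two-parameter envelope in which the traffic accumulated over any window of length $t - t_0$ does not exceed $\sigma + \rho\,(t - t_0)$, with a burst term $\sigma$ and a rate term $\rho$. The discrete-time treatment, the function names, and the example byte counts are hypothetical.

```python
def satisfies_sigma_rho(x, sigma, rho):
    """True if every window of x obeys sum(x[s:t]) <= sigma + rho * (t - s)."""
    prefix = [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)
    n = len(x)
    return all(
        prefix[t] - prefix[s] <= sigma + rho * (t - s)
        for s in range(n)
        for t in range(s + 1, n + 1)
    )


def min_burst(x, rho):
    """Smallest sigma for which the (sigma, rho) envelope holds for x."""
    best = 0.0
    running_min = 0.0  # smallest value of (prefix[s] - rho * s) seen so far
    acc = 0.0
    for i, v in enumerate(x, start=1):
        acc += v
        g = acc - rho * i
        best = max(best, g - running_min)
        running_min = min(running_min, g)
    return best


# Hypothetical per-slot byte counts built from Ethernet-like packet sizes.
x = [64, 1518, 1518, 64, 64, 1518, 64, 64]
rho = 700.0
sigma = min_burst(x, rho)
print("smallest burst bound:", sigma)
print("envelope holds:", satisfies_sigma_rho(x, sigma, rho))
```

In this sketch $\rho$ constrains the long-run average rate over large windows, while $\sigma$ absorbs the short bursts, which mirrors the separation between large-time and small-time behavior discussed in this section.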
Note 2. From Note 1, we see that the small-time scaling of traffic is independent of the large-time one.

We now further explain the point in Note 2 from the point of view of fractal time series. Denote the autocorrelation function (ACF) of traffic by

$$r_x(\tau) = E[x(t)\,x(t + \tau)], \tag{3.8}$$

where $\tau$ is the lag. Then, for small lags, more precisely, for $\tau \to 0$, if $r_x(\tau)$ is sufficiently smooth on $(0, \infty)$, $r_x(\tau)$ satisfies

$$r_x(0) - r_x(\tau) \sim c\,|\tau|^{\alpha}, \tag{3.9}$$

where $c$ is a constant and $\alpha$ is the fractal index of $x(t)$. The fractal dimension of $x(t)$, denoted by $D$, is given by

$$D = 2 - \frac{\alpha}{2}. \tag{3.10}$$

The parameter $H$ is utilized to characterize the global property, more precisely, the LRD, of traffic from the viewpoint of fractals.

Note 3. Generally, $D$ is independent of $H$.

Note 4. Owing to Note 3, we infer that the small-time scaling of traffic is independent of the large-time one in general.

The above discussions exhibit that it may be more flexible to characterize the two types of scaling phenomena of traffic by using two independent parameters. One is for large-time scaling and the other is for small-time scaling.

Applying mBm to the Scaling Analysis of Traffic
From the previous discussions, we suggest that it is natural to use two independent measures to describe two types of scaling phenomena that are independent of each other. Conventionally, fBm as well as its increment process, that is, fGn, is indexed by a single parameter $H$, or alternatively by $D = 2 - H$. Thus, they are limited in characterizing the two scaling phenomena independently. This limitation was empirically noticed by Paxson and Floyd [9]. Lately, it was noticed by Ayache et al. [18]. Although Ayache et al. [18] discussed their method to measure the LRD of a random function, those works may not be enough for traffic because the small-time scaling is independent of the large-time one, as we explained previously. As a matter of fact, it is quite awkward to use $H(t)$ to describe the two scaling phenomena of traffic because $H(t)$ is linearly related to the fractal dimension $D(t)$ through $D(t) = 2 - H(t)$ [91, 92]. To overcome the difficulty of capturing the large-time scaling phenomenon of traffic in the global sense, we introduce the measure $E[H(t)]$. Based on this, we propose using $D(t)$ to represent the small-time scaling of traffic on a point-by-point basis and $E[H(t)]$ to characterize the large-time scaling of traffic in the global sense. The key point is that $D(t)$ and $E[H(t)]$ are uncorrelated with each other.
In the rest of this section, we briefly review mBm in Section 4.1. Then, in Section 4.2, we demonstrate the applications of $D(t)$ and $E[H(t)]$ to real-traffic traces.

mBm of H t Type
Note that (2.27) implies that the local irregularity of a random function $X(t)$ is globally the same. That, nevertheless, may not match the real case of traffic. As a matter of fact, if $D$ of a traffic function $x(t)$ is a constant, $\sigma$ of $x(t)$ in (3.3) is a constant too. This is a unifractal case, which obviously contradicts real traffic because $\sigma$ is time dependent, see (3.5).

One simple way to investigate the multifractality of traffic is to use mBm, which replaces the constant $H$ with a time-dependent function $H(t)$, where $t > 0$ and $H : [0, \infty) \to (a, b) \subset (0, 1)$. The function $H(t)$ is also called the local Hölder exponent.

Scaling Analysis of Traffic Using mBm
The function $D(t)$ in (4.14) characterizes the local irregularity of traffic on a point-by-point basis, that is, the small-time scaling of traffic.

Note that $H(t)$ may be used to describe the LRD of traffic on a point-by-point basis, see Peltier and Levy-Vehel [91]. From the viewpoint of applications, it is desirable to represent the LRD, which is a global property of traffic at large time scales, on an interval-by-interval basis. As a matter of fact, from a practical view of Internet traffic, one is interested in an LRD measure, say, $H$, to investigate how traffic at time $t$ is correlated with that at a lag $\tau$ apart from $t$. Thus, the LRD at time $t$ on a point-by-point basis, that is, $H(t)$, may be difficult to use in practice. In addition, since the local irregularity of traffic is independent of its LRD while $H(t)$ is linearly related to $D(t)$, see (4.13), $H(t)$ may be unsatisfactory for characterizing the LRD property of traffic. Therefore, we propose the following expression to describe the LRD of traffic:

$$H_m = E[H(t)], \tag{4.15}$$

where the subscript $m$ implies the mean.

We show two demonstrations with real-traffic traces named DEC-PKT-1.TCP and DEC-PKT-2.TCP, which were recorded at Digital Equipment Corporation (DEC) in March 1995. Figure 2 plots the first 1025 data points of DEC-PKT-1.TCP, denoted by $x(i)$, which is the size of the $i$th packet ($i = 0, 1, \ldots$). Figure 3 shows its $D(i)$ for the first 8193 data points. The value of $H_m = E[H(i)]$ for DEC-PKT-1.TCP equals 0.756 in the range $i = 0, \ldots, 8192$. Figures 4 and 5 are the corresponding plots for DEC-PKT-2.TCP, where $H_m = 0.754$. The plots in Figures 3 and 5 exhibit that traffic has high local irregularity, as discussed by Li and Lim [19] on an interval-by-interval basis.
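The relation between the point-by-point measure and the interval measure can be sketched in a few lines of Python. This example is not from the original paper: the $H(i)$ values below are hypothetical placeholders rather than estimates from the DEC traces, the estimation of $H(i)$ itself is outside the sketch, and the function names are assumptions. It uses the linear relation $D(i) = 2 - H(i)$ and the interval mean $H_m$ described above.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Population covariance of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    return sum((u - ma) * (v - mb) for u, v in zip(a, b)) / len(a)


def pearson(a, b):
    """Pearson correlation; requires both sequences to have nonzero variance."""
    return covariance(a, b) / math.sqrt(covariance(a, a) * covariance(b, b))


# Hypothetical local Hoelder exponents H(i) over one observation interval.
h = [0.70, 0.80, 0.75, 0.85, 0.65]

d = [2.0 - v for v in h]      # point-by-point fractal dimension D(i)
h_m = mean(h)                 # interval measure H_m = E[H(i)]

print("H_m =", h_m)
print("corr(H, D) =", pearson(h, d))
print("cov(H_m, D) =", covariance([h_m] * len(d), d))
```

Because $H_m$ is constant over the interval, its covariance with $D(i)$ vanishes, whereas $H(i)$ and $D(i)$ are perfectly (negatively) correlated. The sketch reports a covariance for $H_m$ because the normalized correlation coefficient is not defined for a constant sequence.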

Conclusions
The key idea in this paper is to describe the small-time scaling and the large-time scaling of traffic separately. Following this idea, we have explained the limitation of the standard mBm in this regard, namely, that $D(t)$ of mBm is linearly related to its $H(t)$. To relax this restriction, we suggest using $D(t)$ to describe the local irregularity of traffic on a point-by-point basis for the small-time scaling phenomenon and propose using $E[H(t)]$, instead of $H(t)$, to represent the LRD of traffic for the large-time scaling phenomenon on an interval-by-interval basis, providing a promising candidate for studying the scaling phenomena of traffic. The present results, in methodology, may be applied to random data in related issues, for example, those in [96-111], for the scaling analysis.

Figure 1: Packets passing through a series of servers from source to destination.
[10], McDysan [11], Pitts and Schormans [12], Leland et al. [13], Beran et al. [14], Tsybakov and Georganas [15], Willinger and Paxson [16], and Adas [17]. However, fGn is limited in the analysis of the two scaling phenomena, namely, small-time scaling and large-time scaling, since it is indexed by a single parameter called the Hurst parameter $H$, see Paxson and Floyd [9], Tsybakov and Georganas [15], Ayache et al. [18], Li and Lim [19, 20], and Li [21-24]. Therefore, two-parameter models of traffic are needed. In this paper, we address two types of traffic models. One is the multifractional Brownian motion (mBm), see Li et al. [25]. The other is the 2-parameter bounded model introduced by Cruz, see [26, 27], Li and Zhao [28], Raha et al. [29], Jiang and Liu [30], and Boudec and Thiran [31]. The contributions of this paper are in two aspects.
and Brockmeyer et al. [33]. It has been successfully used in the design of the infrastructure of old telephony networks for years, see, for example, Bojkovic et al. [34], Le Gall [35], Lin et al. [36], Manfield and Downs [37], and Reiser [38]. It was such a success on old telephony networks that it has almost been taken as an axiom for modelling traffic in communication systems, see Gibson [39], Cooper [40], and Akimaru and Kawashima [41]. Due to unsatisfactory performance of the Internet, such as traffic congestion, people began to doubt the models of the Poisson type. Accordingly, they began measuring and analyzing traffic at different sites in the Internet during different periods of time for the purpose of reevaluating general patterns of traffic, see [9, 13, 14], Paxson [42, 43], and the Traffic Archive at http://www.sigcomm.org/ITA/. Experimental processing of real-traffic traces exhibited that traffic is in the class of fractal time series.
Tobagi et al. [65] or intermittency and non-Poisson (Jain and Routhier [66], Jiang and Dovrolis [67], and Papagiannaki [8]). The burstiness has considerable effects on system performance, see, for example, Nain [68], Draief and Mairesse [69], Németh et al. [70], Li and Zhao [71], Jiang et al. [72], Wang et al. [73], and Starobinski and Sidi [74].

even in the field of Lebesgue's integrals, see Bartle and Sherbert [75] and Trench [76]. However, it makes sense when it is considered in the domain of generalized functions. A simple way to explain (3.3) is

The above exhibits that traffic has high local irregularity or high burstiness, as observed by Feldmann et al. [6], Papagiannaki et al. [8], Paxson and Floyd [9], Jiang and Dovrolis [67], Willinger et al. [77], and Estan and Varghese [78]. Such a local irregularity considerably affects the policies or performance of telecommunication systems, such as queuing (see, e.g., Nain [68] and Draief and Mairesse [69]), end-to-end delay (see, e.g., Németh et al. [70], Li and Zhao [71], Jiang et al. [72], Wang et al. [73], and Starobinski and Sidi [74]), resource allocation (see, e.g., Gravey et al. [79]), anomaly detection (Tian and Li [80]), and admission control (Knightly and Shroff [81], Raha et al. [82], and Jia et al. [83]), to name just a few. Note that the bound of the average rate expressed above describes a global property of traffic. It implies that the bound of the average rate of traffic is robust as $\rho$ is a constant. This is in agreement with the experimental observations stated by Feldmann et al. [6], Willinger et al. [22], and Paxson and Floyd [9].
The integral expressed in (3.3) does not make sense if $\lim_{t \to t_0} \int_{t_0}^{t} x(t)\,dt \neq 0$ for the continuous $x(t)$

(3.10), see Adler [84], Hall and Roy [85], Chan et al. [86], Kent and Wood [87], Gneiting and Schlather [88], Lim and Li [89], and Li et al. [90]. The parameter $D$ is used to describe the local irregularity of traffic. It is in terms of the small-time scaling of traffic, see Li [21-24] and Li and Lim [19, 20]. From (2.19), we have

from the point of view of the multifractional Brownian motion (mBm). In this research, we are interested in the work on mBm by Peltier and Levy-Vehel [91, 92] as well as Benassi et al. [93], which generalizes the standard fBm by replacing the constant $H$ with the Hölder function $H(t)$. Li et al. [94] applied $H(t)$ to describe the multifractality of traffic. Although [91-93, 95] explained the local self-similarity characterized by using $H(t)$ and Ayache et al.
$N(H(t_1), H(t_2)) \left[ |t_1|^{H(t_1) + H(t_2)} + |t_2|^{H(t_1) + H(t_2)} - |t_1 - t_2|^{H(t_1) + H(t_2)} \right]$, which implies that the increment processes of mBm are locally stationary. It follows that the local Hausdorff dimension of the graphs of mBm is given by

$$\dim\{X(t),\ t \in [a, b]\} = 2 - \min\{H(t),\ t \in [a, b]\},$$

Figure 5: $D(i)$ of DEC-PKT-2.TCP for $i = 0, \ldots, 8192$. $H_m = 0.754$.
Note 2. $E[H(t)]$ is uncorrelated with $D(t)$. Denote by corr the correlation operator. Then, considering that $H_m$ is a constant, we have

$$\mathrm{corr}\{E[H(t)], D(t)\} = 0. \tag{4.16}$$

Note 3. According to (4.13), we have

$$|\mathrm{corr}\{H(t), D(t)\}| = 1. \tag{4.17}$$

Equation (4.17) exhibits that $H(t)$ is completely correlated with $D(t)$.