Suboptimal RED Feedback Control for Buffered TCP Flow Dynamics in Computer Network

We present an improved dynamic system that models the behavior of TCP flows and the active queue management (AQM) system. The system is modeled by a set of stochastic differential equations driven by a doubly stochastic point process whose intensities are the controls. The proposed feedback laws monitor the status of the buffers and the multiplexer of the router and signal incipient congestion by sending warning signals to the sources. Simulation results show that the optimal feedback control law from the class of linear as well as quadratic polynomials can improve system performance significantly in terms of maximizing link utilization and minimizing congestion, packet losses, and global synchronization. The optimization process is based on a random recursive search technique known as RRS.


Introduction
Since random early detection (RED) was presented by Sally Floyd and Van Jacobson in 1993 as a congestion avoidance mechanism in packet-switched networks [1], several variants of RED have been designed to deal with the setting of the various parameters that characterize it [2]. The basic principle of RED-family algorithms consists of two steps: (1) the active queue management (AQM) mechanism detects incipient congestion by computing the average queue size, and (2) it notifies connections of congestion either by dropping packets or by setting a bit in packet headers. In [3], we proposed a modified version of RED governed by doubly stochastic Poisson-driven stochastic differential equations. The dynamic model developed there contains a suboptimal feedback control law which monitors the status of the queue of the router (multiplexer) and adjusts the dropping intensity by sending congestion warning signals to the sources. The major advantage of the algorithm is that the critical parameters of the original RED-AQM system are adjusted automatically by the suboptimal feedback law. This provides control actions based on current network conditions, replacing the constant configurations used in the original RED. The experimental results presented in [3] demonstrate that the proposed model and algorithm can improve system performance significantly. However, it was observed that the feedback control law introduces global synchronization into the system. This is because all the connections share the same control policy and hence the same mean rate of warning signals, which prompts similar actions at the same time and leads to synchronization.
In this paper, we propose an improved model aimed at solving the problem of global synchronization that results from many of the connections reducing their window sizes at the same time. We do this without compromising any of the advantages of the system proposed in [3]. The modified system has the following configuration. Access of each TCP connection to the router is controlled by a buffer dedicated to that connection. The single feedback control law used in previous models is replaced by a system of individual feedback control laws, one for each connection. This leads to decentralized rather than centralized control. For the analysis of the active queue management system, we still use doubly stochastic Poisson-driven stochastic differential equations. These model the dynamic behavior of the TCP flows and the router queues, where the drop rate (intensity) for each connection is a function of both the average queue size of the multiplexer and the average queue size of the buffer dedicated to that connection. The objective function, which incorporates packet losses, global synchronization, link utilization, and congestion, is minimized by an iterative choice of the parameters of the feedback control laws proposed here. Through simulation results, we show that the modified system is not only capable of effectively controlling congestion and synchronization but is also more robust than the earlier versions.
The rest of the paper is organized as follows. In Section 2, the system model is presented. Feedback control laws are given in Section 3. The objective functional and the state-space formulation are presented in Section 4. Numerical results are presented in Section 5. The paper ends with a conclusion in Section 6.

System model
For the construction of mathematical models for the system we need indicator functions, defined as follows. Let $S$ denote any logical or mathematical statement and define the indicator function $I(S)$ by $I(S) = 1$ if $S$ is true and $I(S) = 0$ otherwise. Each source $i$ feeds a dedicated buffer, which is connected to the multiplexer by a link of bandwidth $C_i$. The packets reaching the router from all the individual sources are multiplexed for onward transmission through an outgoing link of capacity $C_n$. In the absence of drop priorities, the relationship between $C_i$ and $C_n$ may be given by expression (2.2), where $\beta > 0$ is a design parameter discussed at length later. For the specific case of the differentiated services (DiffServ) model, where different connections have different priorities, network designers simply provide wider bandwidths between the buffers and the multiplexer for connections with lower drop priority and narrower bandwidths for those with higher drop priority.
The dynamic model of the TCP flow control system can be described in terms of the window sizes of the sources and the queue sizes at the buffers and the multiplexer. The window size is governed by equation (2.3), where $w_i(t)$ denotes the window size of TCP connection $i$ at time $t \ge 0$ and the process $N_i^{\lambda_i}(t)$ represents the number of packets dropped from connection $i$ over the time interval $[0, t]$. This is a point process with intensity $\lambda_i$. The process $q_i(t)$ is the queue size at buffer $i$ and $q_n(t)$ is the queue size at the multiplexer, both measured at time $t \ge 0$. The expression $R_i(q_i(t), q_n(t))$, which depends on both $q_i(t)$ and $q_n(t)$, is the round-trip time between the source of TCP connection $i$ and its destination. This is generally given by the following expression: $R_i(q_i(t), q_n(t)) \equiv a_i + q_i(t) \ldots$ (2.4), where $a_i$ denotes the propagation delay between the source and the destination, considered here as a random variable uniformly distributed over a range $D \subset (0, \infty)$. The set $D$ depends on the network size and topology and can be estimated easily, whereas it is impossible to incorporate the propagation delay packet by packet and still present a reasonably correct analysis of the traffic process. For our numerical experiments, we have chosen $\{a_i\}$ to be independent random variables uniformly distributed on $D = [0.01, 0.2]$. The lower and upper limits of $D$ are determined from the topology of the network. The remaining part of the expression models the queuing delay for source $i$. In summary, the first term on the right-hand side of expression (2.3) gives the window opening rate and the second the closing rate.
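As an illustration of the opening and closing rates described above, the following Python sketch advances one window by a single time slot. It is a minimal sketch under stated assumptions, not the exact form of (2.3): it assumes the standard additive-increase, multiplicative-decrease behavior of TCP (growth at rate $1/R_i$, halving on a drop) and approximates the point process by at most one drop per slot, occurring with probability $\lambda_i \Delta t$. The function names `sample_propagation_delay` and `window_step` are hypothetical.

```python
import random


def sample_propagation_delay(rng, low=0.01, high=0.2):
    """Draw a propagation delay a_i uniformly from D = [low, high]."""
    return rng.uniform(low, high)


def window_step(w, rtt, lam, dt, rng):
    """Advance the window w by one slot of length dt (assumed AIMD model).

    Opening: the window grows by dt / rtt during the slot.
    Closing: with probability min(1, lam * dt) a drop occurs in the slot
    and the window is halved (assumed multiplicative decrease).
    Returns the new window and a flag telling whether a drop occurred.
    """
    dropped = rng.random() < min(1.0, lam * dt)
    w = w + dt / rtt
    if dropped:
        w = w / 2.0
    return w, dropped


if __name__ == "__main__":
    rng = random.Random(0)
    a_i = sample_propagation_delay(rng)
    w = 1.0
    for _ in range(1000):
        # Queuing delay is ignored here, so the round-trip time equals a_i.
        w, _ = window_step(w, a_i, lam=2.0, dt=0.001, rng=rng)
    print(f"propagation delay {a_i:.3f}, final window {w:.2f}")
```

With `lam = 0` the window only opens; a larger intensity produces more frequent halvings, which is the mechanism the feedback law exploits.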
The dynamics of the buffer and multiplexer queue loads are described in terms of the following quantities: $Q_i$ is the size of buffer $i$ and $Q_n$ is the size of the multiplexer (main buffer). The processes $u_i(t)$ and $v_i(t)$ represent the instantaneous rates of incoming and outgoing packets at time $t$; the outgoing rate is given by $v_i(t) \equiv C_i(t)\, I(q_i(t) > 0)$, $i = 1, 2, \ldots, n$. (2.8)
Time delay. The models presented above do not account for the two-way propagation delay between the router and the sources. In reality there is always a communication delay. Assuming an average delay of $\delta$ units of time, (2.3) and (2.7) can be modified to the delayed forms (2.9) and (2.10). The first equation accounts for the delay from the router to the sources and the second for the delay from the sources to the router.
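The outgoing rate $v_i(t) \equiv C_i I(q_i(t) > 0)$ can be illustrated with a small discrete-time sketch of a single buffer. This is only a sketch under assumptions: the queue is updated by a simple Euler step, it is clipped to the interval $[0, Q_i]$, and any excess over $Q_i$ is counted as overflow loss. The function names `service_rate` and `queue_step` are hypothetical.

```python
def service_rate(capacity, q):
    """v = C * I(q > 0): serve at full capacity only if the queue is non-empty."""
    return capacity if q > 0 else 0.0


def queue_step(q, arrival_rate, capacity, size, dt):
    """Advance the queue length q by one slot of length dt (assumed Euler step).

    Returns the new queue length, the service rate used during the slot,
    and the amount of traffic lost to overflow (assumed clipping at `size`).
    """
    v = service_rate(capacity, q)
    raw = q + (arrival_rate - v) * dt
    lost = max(0.0, raw - size)
    q_next = min(max(raw, 0.0), size)
    return q_next, v, lost


if __name__ == "__main__":
    q, total_lost = 0.0, 0.0
    for _ in range(500):
        q, _, lost = queue_step(q, arrival_rate=120.0, capacity=100.0,
                                size=5.0, dt=0.01)
        total_lost += lost
    print(f"queue {q:.2f}, overflow loss {total_lost:.2f}")
```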

Feedback control laws
The process $N_i^{\lambda_i}(t)$, $t \ge 0$, appearing in (2.3) is a counting process with intensity process $\lambda_i(t)$, $t \ge 0$. In the context of internet traffic, open-loop control is impractical and so out of the question. The intensity process is used as a control variable, and its choice should depend on the current status of the traffic, which means feedback control. Thus, in general, one may choose $\lambda_i$ as an appropriate nonnegative function of the current state $\xi = \{w_1, w_2, \ldots, w_{n-1}, q_1, q_2, \ldots, q_n\}$ as shown below: $\lambda_i(t) \equiv f_i(w_1(t), w_2(t), \ldots, w_{n-1}(t), q_1(t), q_2(t), \ldots, q_n(t))$, $i = 1, 2, \ldots, n-1$, $t \ge 0$.
N. U. Ahmed and X. H. Ouyang
Mathematical Problems in Engineering
In practice it is prohibitively costly to monitor the entire state $\xi$ of the network. The only information that can be easily monitored by the AQM system in the router is the multiplexer queue $q_n$ and possibly the buffer queues $\{q_i\}$. It is also not practical for the router to monitor each individual window size. Thus the feedback control for each individual source may be required to depend only on the state of its own buffer and the state of the multiplexer. This leads to a partially observed feedback control law of the form $\lambda_i(t) \equiv f_i(q_i(t), q_n(t))$, $i = 1, 2, \ldots, n-1$. In a high-speed traffic environment, the state of the queue changes very quickly, and direct use of this fast-changing signal would result in high-frequency fluctuation of the control. To avoid this, one uses a smoothed version of each queue size given by the exponentially weighted moving average of [1], and the control laws are then taken as functions of the averaged queue sizes rather than the instantaneous ones. Thus, short-term increases in the queue sizes, resulting from bursty traffic or from transient congestion, do not result in significant fluctuation of the average queue size. Since both $q_i(t)$ and $q_n(t)$ are random processes, $N_i^{\lambda_i}(t)$ is a doubly stochastic counting process. The feedback law $f_i$ can be chosen by the system designer as a suitable nonnegative and nondecreasing function defined on $R_+^2 \equiv \{(x, y) \in R^2 : x, y \ge 0\}$. One possible choice of the feedback law $f_i$ is given by $f_i(x, y) \equiv g_i(x, y)\, I(g_i(x, y) > 0)$ (3.5), where, in general, each $g_i$ is a bivariate polynomial of degree $m$ with real-valued coefficients. For numerical experiments, we consider a simpler version of this polynomial, (3.7). The variable $\lambda_i$ denotes the mean rate of congestion warnings sent out to source $i$. The function $f_i$, defining the packet dropping scheme for stream $i$, is eventually determined by an optimization process. This is done by choosing the coefficients of the above polynomials so as to minimize the cost functional introduced in the following section.
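The smoothing and the indicator-modulated law can be sketched in a few lines of Python. The sketch assumes the RED-style averaging update of [1], avg ← (1 − κ) avg + κ q, with κ playing the role of the weighting factor, and it uses the linear case $g_i(x, y) = a_0 + a_1 x + a_2 y$ that appears later in the numerical experiments. The function names and the coefficient values in the usage example are hypothetical.

```python
def ewma(avg, sample, kappa):
    """One step of an exponentially weighted moving average (assumed RED form)."""
    return (1.0 - kappa) * avg + kappa * sample


def linear_intensity(coeffs, q_i_avg, q_n_avg):
    """lambda_i = g * I(g > 0) with g = a0 + a1 * q_i_avg + a2 * q_n_avg."""
    a0, a1, a2 = coeffs
    g = a0 + a1 * q_i_avg + a2 * q_n_avg
    return g if g > 0 else 0.0


if __name__ == "__main__":
    # A short burst at the multiplexer barely moves the averaged queue size.
    avg_n = 0.0
    for q_n in [0, 0, 40, 40, 0, 0]:
        avg_n = ewma(avg_n, q_n, kappa=0.002)
    print(f"averaged multiplexer queue after burst: {avg_n:.4f}")
    # Hypothetical coefficients: the law switches on only above a threshold.
    print(linear_intensity((-1.0, 0.5, 0.25), q_i_avg=4.0, q_n_avg=8.0))
```

The indicator keeps the intensity nonnegative, so a connection whose buffer and the multiplexer are both lightly loaded receives no warning signals at all.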

Objective functional and state space formulation
In the following, we use the notation $E\{z\}$ to denote the expected value of any random variable $z$. For evaluation of network performance over the running period $I \equiv [0, \tau]$, an objective functional is chosen as in expression (4.1), a weighted combination of the quantities $J_T$, $J_C$, $J_L$, and $J_S$ described below. The last component of expression (4.1) is given by (4.4), where $\gamma \in (0, 1)$ is also a design parameter discussed later. Here $f$ denotes a feedback control law which has to be chosen to optimize the performance integral given by expression (4.1). The overall performance of the system is measured in terms of throughput, congestion, packet losses, and synchronization. The quantities $J_C$, $J_L$, $J_S$ are, respectively, the expected costs of congestion, packet losses, and synchronization per unit time, while $J_T$ is a measure of expected throughput. The quantity $J_T$ gives the utilization of the link bandwidth $C_n$; $J_C$ is a measure of congestion when the average queue size at the multiplexer falls into the congestion zone given by $Q_\alpha \equiv [\alpha Q, Q]$ with $\alpha \in (0, 1)$. The factor $J_L$ is a measure of packet losses at the multiplexer and the buffers due to overflow suffered during the period $[0, \tau]$. The quantity $J_S$ is a measure of global synchronization. The value of $\gamma$ in (4.4) can be chosen by network designers in accordance with the number of connections. The real numbers $\alpha_i$, $i = 1, 2, 3, 4$, are the relative weights given to each of the costs. These can be chosen by network designers to reflect different concerns and scenarios. Once all of them are chosen and fixed, one can then attempt to choose the control (or dropping) strategies $f$ (4.2) to minimize the cost functional (4.1). The optimal strategy then gives the best weighted trade-off between maximizing expected throughput and minimizing expected congestion, packet losses, and global synchronization.
For further analysis, it is convenient to write the state-space model for the system. Denoting the state of the system by $\xi \equiv (w_1, w_2, \ldots, w_{n-1}, q_1, q_2, \ldots, q_n)$, one can write the system in the state-space form $d\xi = F(\xi)\,dt + G(\xi)\,dN$ (4.6), where $F(\xi)$ is a vector field of dimension $2n - 1$, $N$ is the vector of the $n - 1$ counting processes $N_i^{\lambda_i}$, and $G(\xi)$ is a $(2n - 1) \times (n - 1)$ matrix with entries given by (4.9). Since $\lambda_i$ is determined by the choice of $f_i$, we may rewrite the system (4.6) in terms of $f$, where the vector function $f$ is the packet dropping scheme to be determined for optimum performance. Clearly, the cost functional (4.1) can be written compactly as (4.11) in terms of the integrand of expression (4.1). One of the objectives of the network provider is to improve the system performance by using control strategies (4.2) that minimize this cost functional. The principal objective is to determine the optimal feedback law modulated by the indicator function, as shown in expression (3.5). The feedback law $f$ of expression (4.2), based on (2.10) and (3.2), depends on a matrix of coefficients of dimension $(n - 1) \times (2m + 1)$ or, equivalently, a set of vectors (4.12). Rearranging these coefficients in any suitable order, one may consider the coefficient vector $a \in R^{(n-1)(2m+1)}$ as the choice variable in the optimization process. So the objective is to choose the coefficient vector $a^*$ that minimizes the cost functional (4.1) or, equivalently, (4.11). This is accomplished by the optimization process through the iterative choice of $a$. In our simulation, the random recursive search (RRS) algorithm is used as the optimization tool.

Numerical Results
We present detailed numerical results in order to demonstrate that the modified version of RED mechanism as proposed here is more practical and robust than the earlier version given in [3].To obtain the optimal feedback control law f (3.7), numerical simulations are performed using MATLAB.For numerical simulation only, we consider a simple scenario in which the system is comprised of three TCP flows without drop priority.
Basic system parameters used. The basic system parameters used for numerical simulation are the buffer sizes $\{Q_1, Q_2, Q_3, Q_4\}$, the weighting factor $\kappa_q$, the outgoing link capacity $C_4$, the size of the time slot $\Delta t$, and three other factors $\{\gamma, \delta, \beta\}$ representing, respectively, synchronization as defined in (4.4), the average time delay appearing in (2.9) and (2.10), and the capacity allocation factor of expression (2.2) (see Table 5.1).
As a matter of fact, $\beta$ balances the traffic between the buffers and the multiplexer. Specifically, if $\beta$ is set too low, sources are assigned less bandwidth, which limits their flows and causes packet losses at the user buffers. This may significantly reduce utilization of the system. For $\beta \le 1$, no queue build-up occurs at the multiplexer. In contrast, if $\beta$ is too large, more packet losses and congestion may occur at the multiplexer, depending on the volume of incoming traffic. As a compromise, we set $\beta$ equal to 2 in our experiments. Later we vary $\beta$ to determine the robustness of the optimal control law.

Linear feedback.
For the linear state feedback control law, we have chosen the following general form: $\lambda_i = f_i(q_i, q_n) = (a^i_0 + a^i_1 q_i + a^i_2 q_n)\, I(a^i_0 + a^i_1 q_i + a^i_2 q_n > 0)$, $i = 1, 2, 3$. (5.1)
The optimization process is started by randomly selecting a coefficient vector $a \in R^9$ which, for convenience, is written in matrix form, where the $i$th row represents the coefficients of the feedback law of source $i$. Using the RRS algorithm given in [4], we minimize the cost functional (4.1) by iteratively choosing an appropriate $a$. If a choice decreases the value of the cost functional, it is accepted and the corresponding cost is retained. The process is terminated once the stopping criterion is satisfied. The specific steps of sample selection in RRS are as follows. First, 44 random samples (points in $R^n$) are taken from the parameter space and the corresponding costs are computed. The point at which the cost functional (4.1) is minimum is taken as the center of the promising region, for example a hypercube, which is explored further. In this step, 11 random samples are chosen from the hypercube. If a better point is found among these 11 samples, the center of the cube is moved to this point, keeping the size of the cube unchanged. If a better point is not found among the 11 samples, the size of the hypercube is reduced by half while keeping the center unchanged. This process of contraction and translation is repeated until the size of the region is reduced below a threshold. The search process is then restarted until a point satisfying a certain stopping criterion is obtained.
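The sampling steps described above (44 exploration samples, 11 samples per hypercube, translation on improvement, halving otherwise, and restarts) can be sketched as follows. This is an illustrative Python sketch, not the implementation of [4]: the initial hypercube size, the shrink threshold, the fixed number of restarts, the evaluation budget, and the name `rrs_minimize` are all assumptions, and a simple quadratic stands in for the cost functional (4.1).

```python
import random


def rrs_minimize(cost, lower, upper, rng, n_explore=44, n_exploit=11,
                 init_frac=0.5, min_frac=1e-3, n_restarts=3, max_evals=20000):
    """Sketch of a recursive random search over the box [lower, upper]."""
    dim = len(lower)
    evals = 0
    best_x, best_c = None, float("inf")

    def clip(value, k):
        return min(upper[k], max(lower[k], value))

    for _ in range(n_restarts):
        # Exploration: sample the whole space and keep the best point as center.
        center, center_c = None, float("inf")
        for _ in range(n_explore):
            x = [rng.uniform(lower[k], upper[k]) for k in range(dim)]
            c = cost(x)
            evals += 1
            if c < center_c:
                center, center_c = x, c

        # Exploitation: translate or contract a hypercube around the center.
        frac = init_frac  # half-width as a fraction of each range (assumed)
        while frac > min_frac and evals < max_evals:
            improved = False
            for _ in range(n_exploit):
                x = [clip(center[k]
                          + rng.uniform(-frac, frac) * (upper[k] - lower[k]), k)
                     for k in range(dim)]
                c = cost(x)
                evals += 1
                if c < center_c:
                    # Better point found: move the cube, keep its size.
                    center, center_c = x, c
                    improved = True
                    break
            if not improved:
                # No better point among the samples: halve the cube.
                frac *= 0.5

        if center_c < best_c:
            best_x, best_c = center, center_c
    return best_x, best_c


if __name__ == "__main__":
    target = [1.0, -2.0, 0.5]

    def demo_cost(x):
        return sum((xi - ti) ** 2 for xi, ti in zip(x, target))

    x_best, c_best = rrs_minimize(demo_cost, [-5.0] * 3, [5.0] * 3,
                                  random.Random(1))
    print(x_best, c_best)
```

In the setting of the paper, `cost` would be a simulation-based estimate of (4.1) for a given coefficient vector, which is far more expensive to evaluate than this toy function; the evaluation budget is included for that reason.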
Based on this optimization process, an optimum vector $a^*$ is obtained for the linear feedback control law, which keeps the value of the cost functional (4.1) to a minimum. To prevent the RRS from getting trapped in a local optimum, we tried very large perturbations of the parameter $a$ away from the starting point (and from the local optimum $a^*$), using two different starting vectors as given in (5.4). Clearly these are very far from the initial choice of $a$. Running the RRS algorithm once again with these vectors as the starting points, we arrive at the optimal vectors (5.5), which are very close to $a^*$. This suggests that the RRS algorithm is quite effective and is able to escape local minima.

Nonlinear feedback.
The general form of the nonlinear state feedback control law is taken from the class of polynomials (5.6). Our objective is to find the best one. To reduce computational time we have concentrated on second-order polynomials, $m = 2$. Again, we use the RRS algorithm to determine the best feedback law from the class of polynomials of degree 2. We start with two randomly chosen starting vectors $b \in R^{3 \times 5} = R^{15}$ and obtain the corresponding optimum vectors (5.8). It is interesting to note that the last two columns of these matrices are very small. In other words, the nonlinearity is negligible, and the effectiveness of the linear control law is close to that of the quadratic law. However, we note a small difference between the two quadratic laws determined by the two optimum vectors $b^*$.

5.2. System performance corresponding to the optimal feedback control.
It is important for an AQM algorithm to have adaptive and robust control performance for dynamic TCP flows. We examine the robustness of the optimal feedback control laws obtained above with respect to variations in the network parameters and discuss how tunable parameters should be set to optimize the system. In this set of experiments, the parameter setup is the same as in the above experiments, except for the value of $\beta$. For $\beta = 3$, the optimization process is started from (5.9), and the corresponding optimum parameter vector $a^*$ is found.
Costs for different values of $\beta$ (column headings and some entries are not recoverable):
J(Linear): …21, −195.95, −195.44, −195.22, −195.12, −195.11, −195.15
J(Quadratic): −175.82, −185.20, −195.94, −195.42, −195.19, −195.18, −195.12, −195.…
Figure 5.4 shows variations of the queues at the buffers and the multiplexer for $\beta = 1, 1.2, 2, 10$ under the linear feedback control law. It was observed that if $\beta > 2$, queue build-up at the buffers is substantially reduced over the simulation period. Note that Figure 5.3(a) also shows that, for identical network conditions, the cost corresponding to the linear feedback policy is almost the same as that of the quadratic policy, both reaching the minimum at $\beta = 2$. The proposed feedback control algorithm is thus robust in control performance under changes in the network environment. This indicates robust stability and good performance over a wider range of network parameter uncertainties.

Impact of time delay on performance.
In general, the control performance of an AQM system is affected by several network parameters such as the buffer sizes ($Q_i$, $i = 1, 2, 3, \ldots, n - 1$), the link capacity ($C_n$), the traffic balancer ($\beta$), and the propagation delay of warning signals. The buffer sizes, the link capacity, and the traffic balancer are design parameters and hence fixed once the AQM is installed in a router. In contrast, the propagation delay is a dynamic factor because it changes over time. In fact, the propagation delay is a function of several factors such as the physical distances between sources, destinations, and the router, as well as traffic conditions. If one wants to be exact, the problem becomes mathematically very difficult and intractable. Therefore, for simplicity, one may view $\delta$ as the mean of a random variable having a uniform distribution (considering the worst case, maximum entropy) over a compact interval $\Delta \subset [0, \infty)$. For our experiments we have only considered a fixed set of time delays measured in units of basic time slots (used for computing the solutions of the system equations). Given $\gamma = 1/3$ and $\beta = 2$, we use $\delta \in \{1\tau, 3\tau, 5\tau\}$. The results shown (see Table 5.3) correspond to the optimum linear and quadratic control laws with $a^*$ and $b^*$ as given in Sections 5.1.1-5.1.2. It is clear from these results that time delay in the controls has a significant negative impact on system performance. An increase of the time delay not only degrades the control performance of an AQM system but may also lead to instability. Thus, the effect of the propagation delay of warning signals should be taken into account in designing a robust AQM system.
… significant reduction of synchronization. This corresponds to the optimal feedback control laws $f_1, f_2, f_3$ with coefficients $a = a^*$ (linear) and $b = b^*$ (quadratic). This is expected, as the intensities $\lambda_1, \lambda_2, \lambda_3$, which are functions of the average queue size at the multiplexer and … feedback control as seen in Figure 5.3. It is important to mention that the computation of optimum nonlinear feedback control laws takes much longer than that of linear ones and does not provide any noticeable improvement. Therefore, we conclude that in our particular setting it is not necessary to consider nonlinear feedback control laws. The linear optimal feedback control laws are fairly satisfactory. Consequently, (3.7) can be replaced by the simpler linear law $g_i(x, y) = a^i_0 + a^i_1 x + a^i_2 y$, $i = 1, 2, \ldots, n - 1$.

Conclusion
The variant of the RED model presented in this paper formally defines the interactions between TCP connections and the AQM system in a computer network. The dynamic model, which contains a feedback control law, is governed by a system of stochastic differential equations driven by doubly stochastic point processes whose intensities are the controls for each of the TCP connections. The frequency of warning signals, represented by the intensities, is taken as a function of available information such as the router and buffer loads. The optimal (linear) feedback control law based on such information is obtained by the random recursive search (RRS) technique. Under the proposed feedback control law, the controller observes the behavior of all the queues (at the buffers and the multiplexer) and controls every individual intensity by sending congestion signals (warnings) to the sources for adjustment of their transmission rates. The simulation results demonstrate that the proposed model and methodology can improve system performance significantly by maximizing link utilization and minimizing congestion, packet losses, and global synchronization.
The optimum vectors obtained for the quadratic feedback control law are given by (5.8).

Figure 5.2 shows the comparison of the expected costs per unit time of individual factors such as throughput, congestion, packet losses, and synchronization for different values of $\beta$. As shown in Figure 5.2(b), RED (almost) fully utilizes the bandwidth of the outgoing link as $\beta$ approaches 4. This is because the queue at the multiplexer is seldom empty: the incoming TCP flows always keep the queue full. This in turn leads to more congestion and more packet losses at the multiplexer, as shown in Figures 5.2(c) and 5.2(d). Consequently, the AQM system generates warning signals more frequently, forcing the sources to cut their window sizes more often, which may result in more global synchronization, as shown in Figure 5.2(e). Despite the variation of the individual costs in Figures 5.2(b)-5.2(e), the total system cost shown in Figure 5.2(a), and hence the overall performance, is robust with respect to variation of the parameter $\beta$ beyond 2.

Table 5.3. Variation of costs with time delay.