Experimental Evaluation of Optimal Rate Delay and Power Allocation Algorithm in Wireless Control Networks

The network utility maximization (NUM) framework, widely used to achieve optimal resource allocation in wireless networks, has led to both centralized and distributed algorithms. We compare the convergence performance of a centralized realization of the NUM framework with that of a distributed realization by implementing both on a hardware test-bed. Experimental results show superior convergence performance for the centralized implementation, which is attributed to the dominance of communication delay over processing delay. The convergence results for the distributed case also show a tradeoff between processing time and the associated communication overhead, which yields an optimal termination criterion for the convergence of the different subproblems.


Introduction
Since the seminal work by Kelly et al. [1] was published, the basic network utility maximization (NUM) problem has been extensively studied for rate allocation and congestion control in wired [2] as well as wireless networks [3][4][5].
Recently, the NUM framework has been extended to achieve optimal resource utilization in wireless control networks (WCNs) while meeting the stringent end-to-end delay demands of distributed control systems [5]. Using primal-dual decomposition, distributed algorithms are proposed in [1][2][3][4][5] to solve the resource optimization problem based on the NUM framework. This paper compares the convergence performance of a distributed algorithm with that of a centralized algorithm by implementing both on a hardware test-bed. We also develop a performance evaluation model and validate its convergence predictions against the experimental results. The model can be used for performance evaluation of both distributed and centralized realizations of different NUM frameworks.
To compare the convergence performance of distributed and centralized implementations, we have selected the following NUM problem of [5]: $R_{\min}(s) \le r_s\ \forall s$, $0 \le P_l \le P_{\max}\ \forall l$. (1) In (1), $d_l^{(q)}$ and $d_l^{(t)}$ are the queuing and transmission delays, respectively, at link $l \in L$ ($L$ being the set of network links and $|L|$ the total number of links in the network), $d_s$ represents the end-to-end delay for transmission session $s$, $\mu$ is the packet length, $c_l(P)$, given by $\log(\mathrm{SINR}_l(P))$ for high link SINR, is the capacity of link $l$, $r_s$ is the end-to-end session rate for transmission session $s \in S$ with an associated shortest path consisting of a subset of links $L(s) \subseteq L$, $D_{\max}(s)$ is the end-to-end delay threshold, and $R_{\min}(s)$ is the minimum rate threshold. In the expression for $c_l(P)$, $\mathrm{SINR}_l(P) = \gamma_{ll} P_l / (n_l + \sum_{m \ne l} \gamma_{lm} P_m)$ is the signal-to-interference-and-noise ratio, $\gamma_{lm}$ is the channel gain from the transmitter of link $m$ to the receiver of link $l$, $P_l \in P$, $P_l \le P_{\max}\ \forall l$, is the transmitter power for link $l$, $P_{\max}$ is the maximum transmit power level, and $n_l$ is the additive noise. We also define the transmission cycle to be the set $\Theta$ of transmission schedules, where each transmission schedule $\theta(i) \in \Theta$ has an associated subset of simultaneously transmitting links $L(\theta(i)) \subseteq L$.
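As a minimal sketch of the SINR and high-SINR capacity expressions defined above, the following computes $\mathrm{SINR}_l(P)$ and $c_l(P) = \log(\mathrm{SINR}_l(P))$ for a small example. The function names and the two-link gain, noise, and power values are illustrative assumptions, not measurements from the test-bed.

```python
import math


def sinr(gain, noise, power, l):
    """SINR at the receiver of link l.

    gain[l][m] is the channel gain from the transmitter of link m
    to the receiver of link l; noise[l] is the additive noise n_l.
    """
    interference = sum(
        gain[l][m] * power[m] for m in range(len(power)) if m != l
    )
    return gain[l][l] * power[l] / (noise[l] + interference)


def capacity(gain, noise, power, l):
    """High-SINR capacity approximation c_l(P) = log(SINR_l(P))."""
    return math.log(sinr(gain, noise, power, l))


# Hypothetical two-link example (values chosen only for illustration).
example_gain = [[1.0, 0.1], [0.2, 1.0]]
example_noise = [0.1, 0.1]
example_power = [1.0, 2.0]

for link in range(2):
    print(link, sinr(example_gain, example_noise, example_power, link))
```

Raising the power of one link increases its own SINR while lowering that of every link it interferes with, which is the coupling the power control subproblem has to resolve.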
The test-bed used for realization is equipped with a wireless transceiver [6] using binary frequency-shift keying (FSK) at 1.5 Mbps, operating in the 902-928 MHz ISM band and providing tunable transmitter power from 0 to 21 dBm with 10-bit resolution. Manchester encoding, used to reduce the bit error rate due to saturation of the data slicer, and the fixed modulation used by the RF transceiver do not allow us to realize $c_l(P)$ in the form $\log(\mathrm{SINR}_l(P))$. Instead, power control is used to maximize the link packet success rate, $\rho_l$. With these hardware limitations, we compare the convergence performance of the distributed and centralized implementations using Texas Instruments' TMS320C6713 DSP platform.

Rate Delay and Power Allocation Algorithm
The dual decomposition of the problem in (1) leads to the rate, delay, and power allocation subproblems (2)-(4), of which the power allocation subproblem is $\underset{P_l \le P_{\max}}{\text{maximize}} \sum_l (\lambda_l + \psi_l)\, c_l(P)$. (4)

Rate Allocation Subproblem.
The rate-allocation subproblem in (2) is further primal decomposable and can be solved individually for each session $s$ at the source nodes. The choice of the "log" utility function and the resulting structure of the rate allocation problem make a closed-form solution possible using the KKT conditions. We nevertheless opt for an iterative solution, since a different utility function may not admit a closed-form solution. Since the objective function of the rate allocation subproblem is differentiable, the projected gradient method [7] is used, leading to the following rate update for each $r_s$: $r_s(k+1) = [r_s(k) + \beta_r \nabla f_1(r_s(k))]^+$. (5) In (5), $\beta_r$ is the fixed step size and $[x]^+$ is defined as $\max\{R_{\min}(s), x\}$. Evaluating the gradient $\nabla f_1(r_s)$ in (5), we obtain $r_s(k+1) = [r_s(k) + \beta_r (1/r_s(k) - \sum_{l \in L(s)} \lambda_l)]^+$. We would like to point out that, compared to the gradient method, an algorithm based on Newton's method or an interior point method would further improve the convergence performance.
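The projected gradient rate update can be sketched as follows for a single session with the link prices held fixed. The function names, step size, starting point, and numerical values are illustrative assumptions. For the "log" utility, the iterate should approach the closed-form KKT solution $\max\{R_{\min}(s), 1/\sum_{l \in L(s)} \lambda_l\}$.

```python
def rate_update(r, lam_sum, r_min, beta):
    """One projected gradient step for a single session rate r_s.

    lam_sum is the sum of the link prices lambda_l over the path L(s);
    the projection [x]^+ = max(R_min(s), x) enforces the minimum rate.
    """
    grad = 1.0 / r - lam_sum
    return max(r_min, r + beta * grad)


def solve_rate(lam_sum, r_min, beta=0.05, r0=1.0, iters=500):
    """Iterate the update with fixed prices (an assumption for illustration)."""
    r = r0
    for _ in range(iters):
        r = rate_update(r, lam_sum, r_min, beta)
    return r


# With a path price of 2.0 the unconstrained optimum is 1 / 2.0 = 0.5.
print(solve_rate(lam_sum=2.0, r_min=0.1))
# With R_min = 0.8 the minimum-rate constraint is active instead.
print(solve_rate(lam_sum=2.0, r_min=0.8))
```

In the full algorithm the prices $\lambda_l$ change between iterations, so only one such step is taken per price update unless the subproblem is iterated to a tolerance.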

Delay Allocation Subproblem.
Using the gradient projection method for a convex objective with linear inequality constraints [7], we obtain the updates (6) for the queuing and transmission delay-allocation subproblems. In (6), $\beta_q$ and $\beta_t$ are the fixed step sizes and $\Pi$ is the projection matrix obtained from $\Pi = I - D(D^T D)^{-1} D^T$, where $I$ is the identity matrix, $D$ is a $2|L| \times m$ matrix with $m < 2|L|$, and $m$ is the number of active constraints. The columns of $D$ are the gradients of the active end-to-end delay constraints and are obtained from $D^T [d^{(q)}\ d^{(t)}]^T = D_{\max}$. Since the end-to-end delay constraints are linear, the matrices $D$ and $\Pi$ do not change from one iteration to the next and need not be recomputed at each iteration.
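A minimal sketch of the projection step, assuming the standard gradient-projection form $\Pi = I - D(D^T D)^{-1} D^T$ with linearly independent columns of $D$. Instead of forming the matrix inverse, the code orthonormalizes the columns of $D$ and removes their components from the gradient, which gives the same product $\Pi g$. The function names, the variable ordering $[d_1^{(q)}, d_2^{(q)}, d_1^{(t)}, d_2^{(t)}]$, and the two-session example are assumptions made for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def project_gradient(columns, g, tol=1e-12):
    """Return Pi * g, where Pi projects onto the orthogonal complement
    of the span of the given columns of D (active constraint gradients)."""
    basis = []
    for col in columns:
        v = list(col)
        # Remove components along the basis vectors found so far.
        for q in basis:
            c = dot(q, v)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = dot(v, v) ** 0.5
        if norm > tol:  # skip numerically dependent columns
            basis.append([vi / norm for vi in v])
    p = list(g)
    for q in basis:
        c = dot(q, p)
        p = [pi - c * qi for pi, qi in zip(p, q)]
    return p


# Hypothetical example: two links, session 1 uses link 1, session 2 uses link 2.
# Each column holds the gradient of one active end-to-end delay constraint.
active_columns = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
gradient = [1.0, 2.0, 3.0, 4.0]
print(project_gradient(active_columns, gradient))
```

A step along the projected gradient leaves each active end-to-end delay sum unchanged, so the iterate stays on the active delay constraints.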

Power Control Subproblem.
The power allocation subproblem in (4) is not convex but can be transformed into a convex problem by a "log" transformation; it is then solved efficiently by the power update (7) with fixed step size $\beta_P$ [3,4]. In the implementation of the power updates, we assume that the channel gains $\gamma_{jl}\ \forall j, l$ do not change considerably over durations comparable to the convergence time of the algorithm.
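A minimal sketch of one possible fixed-step power update, assuming the high-SINR capacity $c_l(P) = \log(\mathrm{SINR}_l(P))$ and per-link weights $w_l = \lambda_l + \psi_l$. It is plain projected gradient ascent on $\sum_l w_l \log(\mathrm{SINR}_l(P))$ and may differ in detail from the update of [3,4]. The function names, the lower bound `p_min` (introduced only to keep powers positive), and all numerical values are assumptions.

```python
import math


def interference_plus_noise(gain, noise, power, j):
    """Denominator of SINR_j: n_j plus interference from all other links."""
    return noise[j] + sum(
        gain[j][m] * power[m] for m in range(len(power)) if m != j
    )


def weighted_log_sinr(gain, noise, power, weight):
    """Objective sum_l w_l * log(SINR_l(P))."""
    total = 0.0
    for l in range(len(power)):
        s = gain[l][l] * power[l] / interference_plus_noise(gain, noise, power, l)
        total += weight[l] * math.log(s)
    return total


def power_gradient(gain, noise, power, weight):
    """Partial derivatives of the objective with respect to each P_l."""
    n = len(power)
    denom = [interference_plus_noise(gain, noise, power, j) for j in range(n)]
    grad = []
    for l in range(n):
        own = weight[l] / power[l]  # benefit to link l itself
        caused = sum(               # harm to every other link j
            weight[j] * gain[j][l] / denom[j] for j in range(n) if j != l
        )
        grad.append(own - caused)
    return grad


def power_update(gain, noise, power, weight, beta, p_min, p_max):
    """One fixed-step gradient step, clipped to [p_min, p_max]."""
    grad = power_gradient(gain, noise, power, weight)
    return [min(p_max, max(p_min, p + beta * g)) for p, g in zip(power, grad)]
```

The second term of the gradient depends on the weights and interference levels of the other links, so in a distributed realization these quantities would have to be exchanged between nodes.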

Dual Problem.
For the dual problem of minimizing $g(\Lambda, \Psi)$ s.t. $\lambda_l, \psi_l \ge 0\ \forall l$ [6], we observe that $(c_l(P) - \mu/d_l^{(t)})$ and $(c_l(P) - \sum_{s: l \in L(s)} r_s - 0.5\mu/d_l^{(q)})$ are subgradients of $g(\Lambda, \Psi)$ with respect to $\psi_l$ and $\lambda_l$, respectively. As a result, solving the dual problem is equivalent to updating the dual variables for each link $l$ by the projected subgradient steps (8). In (8), $[x]^+$ is defined as $\max\{0, x\}$, and $\beta_\psi$ and $\beta_\lambda$ are the fixed step sizes. The communication overhead for the different realizations involves exchanging the updated primal ($r_s$, $P_l$, $d_l^{(q)}$, and $d_l^{(t)}$) and dual ($\psi_l$ and $\lambda_l$) variables.
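A minimal sketch of the per-link dual update, assuming the standard projected subgradient form for a minimization: each dual variable moves against its subgradient and is then clipped at zero. The subgradient expressions are those stated above; the function name, argument names, and numerical values are illustrative assumptions.

```python
def dual_update(lam, psi, capacity, link_load, d_q, d_t, mu, beta_lam, beta_psi):
    """One projected subgradient step for the dual variables of a link l.

    capacity  : c_l(P)
    link_load : sum of r_s over the sessions s whose path uses link l
    d_q, d_t  : queuing and transmission delays of link l
    mu        : packet length
    """
    sub_lam = capacity - link_load - 0.5 * mu / d_q
    sub_psi = capacity - mu / d_t
    # Minimizing g: step against the subgradient, then project onto [0, inf).
    lam_new = max(0.0, lam - beta_lam * sub_lam)
    psi_new = max(0.0, psi - beta_psi * sub_psi)
    return lam_new, psi_new


# Hypothetical values: the link has spare capacity, so both prices decrease.
print(dual_update(lam=1.0, psi=0.01, capacity=2.0, link_load=1.0,
                  d_q=1.0, d_t=1.0, mu=1.0, beta_lam=0.1, beta_psi=0.1))
```

All inputs to this update are quantities of the link itself or of the sessions crossing it.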

Performance Evaluation Model
To compare the convergence performance of distributed and centralized implementations, we develop a performance evaluation model for the processing and communication overheads. The per iteration processing overhead is the time required to update the primal and dual variables, and the per iteration communication overhead is the time spent in obtaining the required updated primal and dual variables.

Processing Overhead Model.
Processing overhead comprises gradient and projection evaluations, whose times are denoted by $t_\nabla(\cdot)$ and $t_\Pi(\cdot)$, respectively. For instance, the per iteration processing overhead of the delay allocation subproblem for the centralized, $t^{(\mathrm{cent.})}_{d\,\mathrm{proc.}}$, and distributed, $t^{(\mathrm{dist.})}_{d\,\mathrm{proc.}}$, implementations is given by (9). The higher processing overhead in (9) for the distributed case is attributed to the projection evaluation at each node along the path, in contrast to the centralized implementation, where it is evaluated once. Expressions similar to (9) can be obtained for the rate, power, and dual subproblems, with the exception that there is no associated projection overhead due to the absence of any linear constraints. The per iteration processing overhead for the distributed case is then given by $t^{(\mathrm{dist.})}_{\mathrm{proc.}} = t^{(\mathrm{dist.})}_{d\,\mathrm{proc.}} + t^{(\mathrm{dist.})}_{r\,\mathrm{proc.}} + t^{(\mathrm{dist.})}_{P\,\mathrm{proc.}} + t^{(\mathrm{dist.})}_{\lambda\,\mathrm{proc.}} + t^{(\mathrm{dist.})}_{\psi\,\mathrm{proc.}}$, and a similar expression can be obtained for the centralized case.
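The structure of this model can be illustrated with a small sketch. The total is the sum of the five subproblem overheads, as stated above. The delay term is written under a simplifying assumption, not taken from (9): one gradient evaluation plus one projection per evaluating node, so the centralized case uses a single projection and the distributed case one per node along the path. All names and timing values are hypothetical.

```python
def delay_processing_overhead(t_grad, t_proj, projection_evaluations):
    """Assumed shape of the delay-subproblem overhead: one gradient
    evaluation plus a projection at each evaluating node."""
    return t_grad + projection_evaluations * t_proj


def total_processing_overhead(per_subproblem):
    """Sum of the delay, rate, power, lambda, and psi overheads."""
    return sum(per_subproblem.values())


# Hypothetical timings in milliseconds, for illustration only.
t_grad, t_proj = 1.0, 2.0
centralized_delay = delay_processing_overhead(t_grad, t_proj, 1)
distributed_delay = delay_processing_overhead(t_grad, t_proj, 3)  # 3 nodes on the path

distributed_total = total_processing_overhead({
    "delay": distributed_delay,
    "rate": 0.5,
    "power": 0.5,
    "lambda": 0.25,
    "psi": 0.25,
})
print(centralized_delay, distributed_delay, distributed_total)
```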

Communication Overhead Model.
The communication overhead is a function of the number of node pairs involved in information exchange, the associated number of hops, and the link packet success rate $\rho_l$. Let $h_l(N_j)$ be the number of hops from the transmitter of link $l$ to node $N_j$. For the distributed case, the communication overhead is the sum of the subproblem overheads. For instance, the communication overhead for the rate subproblem is given by $\sum_{s \in S} \sum_{l \in L(s)} h_l(N_s)/\rho_l$, where the summation over the path $L(s)$ accounts for receiving all dual variables needed to evaluate $r_s$ for a given $s$. The per iteration communication overhead is then obtained as the sum of such terms, where $N_k$ is the receiver of link $k$ and the second and third terms on the right-hand side correspond to the communication overheads for the link delay and power allocation subproblems, respectively. There is no communication overhead associated with the dual subproblem, as all the required variables are available either locally or from the evaluation of other subproblems at that node. For the centralized implementation, the communication overhead is independent of the number of iterations and corresponds to transmitting the optimal variables to the required nodes once the algorithm has converged.
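The rate-subproblem term $\sum_{s \in S} \sum_{l \in L(s)} h_l(N_s)/\rho_l$ can be sketched as follows. The data layout (dictionaries keyed by session and link names), the function name, and the example values are assumptions for illustration; $N_s$ is taken here to be the node that updates $r_s$.

```python
def rate_comm_overhead(session_paths, hops, success_rate):
    """Communication overhead of the rate subproblem, in units of
    expected link transmissions.

    session_paths : {session: [links on its path L(s)]}
    hops          : {(link, session): h_l(N_s)}, hops from the transmitter
                    of the link to the node updating that session's rate
    success_rate  : {link: rho_l}, packet success rate of the link
    """
    return sum(
        hops[(link, session)] / success_rate[link]
        for session, path in session_paths.items()
        for link in path
    )


# Hypothetical single-session example with a two-link path.
paths = {"s1": ["l1", "l2"]}
hop_counts = {("l1", "s1"): 1, ("l2", "s1"): 2}
rho = {"l1": 0.5, "l2": 0.8}
print(rate_comm_overhead(paths, hop_counts, rho))  # 1/0.5 + 2/0.8 = 4.5
```

A lower packet success rate inflates the overhead through the $1/\rho_l$ factor, so this term is sensitive to errors in the measured $\rho_l$.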

Experimental Results
For convergence performance comparison, we use the example network shown in Figure 1(a), where the dashed lines represent the transmission sessions between source and destination nodes. The centralized implementation is realized at node N 1, while for the distributed implementation we consider two cases. In the first case (Dist 1), the assignment is based on subproblem functionality: the rate and delay optimization subproblems are assigned to nodes N 1 and N 5, respectively, and the link power and dual variable updates are assigned to N 6. In the second case (Dist 2), the assignment is based on variable localization: the rate (r s) updates are assigned to the source nodes (e.g., the rate update for session s 4 is assigned to node N 4), and the link delay, power, and dual variable updates are assigned to the transmitting node of that link (e.g., P l3, d l3, ψ l3, and λ l3 are updated at the transmitting node N 2). To validate the processing and communication overhead models, we obtain t ∇ (•), t Π (•), and the packet success rate ρ l experimentally. The overhead model is then validated by comparing its predicted convergence performance with the implementation results, as shown in Figure 1(b). For the centralized case, the model matches the implementation results well, while for the two distributed cases the minor difference is due to the dominant communication overhead, for which accurate evaluation of the parameter ρ l is challenging. The results in Figure 1(b) also show that the distributed implementation leads to a higher convergence time than its centralized counterpart. It can also be noticed that the Dist 1 implementation converges faster than the Dist 2 implementation, because Dist 2 incurs a larger communication overhead. The higher communication overhead of the two distributed cases compared to the centralized counterpart is attributed to the fact that, in a distributed realization, updated variables are exchanged at each iteration among the nodes executing the different subproblems.
Table 1 provides the distribution of the processing time and the communication overhead for the centralized and distributed cases. The lower processing time of 27.4 ms for the centralized case, when realized at node N 1, compared to the sum of processing times for the distributed case, totaling 51.9 ms, is mainly due to the projection evaluation for the end-to-end delay at each node along the shortest path in the distributed case, compared to a single evaluation at the central node. It should be noted that processing at different nodes is performed sequentially to avoid any variable update synchronization issues.
Next, for the distributed implementation, we analyze how the termination criterion for the convergence of the subproblems affects the convergence of the overall network resource optimization. Figure 2 shows the overall convergence time as a function of the subproblem convergence error-tolerance for the two cases of distributed implementation. From the result in Figure 2, we observe that there is an optimal value of the subproblem convergence error-tolerance that leads to the minimum overall convergence time. With a larger error-tolerance for subproblem convergence, the algorithm needs more iterations to achieve overall convergence, so the communication overhead, which is incurred at every iteration, grows. On the other hand, a smaller error-tolerance for each subproblem leads to a higher computation overhead per iteration of the algorithm. The optimal subproblem convergence error-tolerance effectively achieves a tradeoff between the communication overhead and the associated computation time for the distributed implementation. The processing and communication times in Table 1 are obtained for the optimal termination criterion shown in Figure 2.
Finally, we use the performance evaluation model to compare the convergence performance of the centralized implementation when realized at nodes N 1, N 4, and N 6; the comparison is shown in Figure 3. It is observed that both the choice of the node for the centralized implementation and the termination criterion of the algorithm affect the convergence performance considerably. As a result, improved convergence performance is achievable by carefully selecting the node for the centralized implementation of the algorithm.

Conclusion
An iterative algorithm is implemented in distributed and centralized realizations to solve the network utility maximization problem for wireless control networks. It is observed that the convergence performance of the centralized case is better than that of the distributed implementation due to the dominance of communication delay over processing delay. For the distributed implementation, we observe a tradeoff between the processing delay and the associated communication overhead of the different subproblems, which determines the optimal overall convergence performance. A further performance improvement for the centralized case can be achieved by using faster algorithms and higher processing power at the central node. However, where the processing delay dominates the communication overhead (e.g., a fiber-optic link with a relatively slower centralized processing capability), we expect the distributed implementation to perform better than the centralized implementation.

Figure 1: (a) Example network with each node equipped with TI's DSP (TMS320C6713) running at 225 MHz and MicroLinear's ML2722 RF transceiver; (b) Convergence performance comparison of overhead model with experimental realization for centralized and distributed implementations.

Figure 2: Convergence time as a function of percentage tolerance for the termination of iterative subproblem algorithm.

Figure 3: Convergence performance comparison for centralized implementation at the network nodes N 1 , N 4 , and N 6 .

Table 1: Overall convergence time components.