Delay-Optimal Scheduling for Two-Hop Relay Networks with Randomly Varying Connectivity: Join the Shortest Queue-Longest Connected Queue Policy

We consider a scheduling problem for a two-hop queueing network where the queues have randomly varying connectivity. Customers arrive at the source queue and are later routed to multiple relay queues. A relay queue can be served only if it is in connected state, and the state changes randomly over time. The source queue and relay queues are served in a time-sharing manner; that is, only one customer can be served at any instant. We propose the Join the Shortest Queue-Longest Connected Queue (JSQ-LCQ) policy as follows: (1) if there exist nonempty relay queues in connected state, serve the longest queue among them; (2) if there are no relay queues to serve, route a customer from the source queue to the shortest relay queue. For symmetric systems in which the connectivity has symmetric statistics across the relay queues, we show that JSQ-LCQ is strongly optimal, that is, minimizes the delay in the stochastic ordering sense. We use stochastic coupling and show that the systems under coupling exist in two distinct phases, due to dynamic interactions among source and relay queues. By careful construction of coupling in both phases, we establish the stochastic dominance in delay between JSQ-LCQ and any arbitrary policy.


Introduction
We consider a scheduling problem in queueing systems with random connectivity of servers. For example, in wireless communication systems, the communication channel may randomly become unavailable for data transmissions due to fluctuation of channel quality over time. To cover areas with poor channel quality, relay networks have been widely adopted [1][2][3] in which there exist relay nodes responsible for relaying data packets in a hop-by-hop manner to the destination. In this paper, we investigate the minimum delay scheduling in two-hop relay networks with random connectivity. Delay-optimal scheduling in multihop networks with random connectivity is an open problem and has eluded researchers even for very simple models. Low-latency communications have recently attracted much attention in upcoming 5G (5th generation) communication networks [4]. There exist many 5G applications which require extremely low latency, for example, autonomous vehicles, remote surgery, and automated factories.
We consider a queueing model depicted in Figure 1. The system consists of one source queue (SQ) and $N$ relay queues (RQs). We consider a time-slotted system, where a customer arrives at the SQ with probability $\lambda$ at each time slot. The customers at the SQ are routed to one of the RQs selected by the scheduler. Each RQ is associated with a service state called connectivity: the scheduler can serve a RQ only if the RQ is in connected state. The connectivity of the RQs changes randomly over time. Customers are routed or served in a time-sharing manner; at a given time slot, either a customer is routed from the SQ to a RQ, or a customer is served from a RQ in connected state and exits the system. This is a common model for relay networks in half-duplex operation; that is, the queues in adjacent hops cannot be served simultaneously.
Potential engineering applications of our queueing model include wireless relay networks. In 5G communication systems, wireless relays are expected to be widely used to enhance capacity and coverage of the network [5]. In such networks, data packets can be delivered to a mobile user through multiple relay devices, for example, as in device-to-device (D2D) communications [6,7]. Our model is applicable to downlink packet transmissions in such next-generation cellular networks as follows. In the downlink scenario, we regard the SQ as the queue located at the base station (BS). Also a RQ is a queue located at a relay node (RN), and the customers correspond to data packets. Suppose the BS intends to send a stream of packets to a mobile user, say user $u$, which is located at the cell edge with poor channel quality. Instead of transmitting packets directly to user $u$, which is likely to fail most of the time, the BS chooses to transmit packets to one of the multiple RNs. The RNs typically have better channels to user $u$ and later transmit the temporarily stored packets to user $u$. Each channel from a RN to user $u$ may undergo fading due to, for example, the mobility of user $u$. As a result, the channel between a RQ and user $u$ may randomly switch between "on" and "off" state; in our model, this can be viewed as the RQs being randomly "connected" to user $u$ over time. A related technique of utilizing multiple relays over time-varying channels, called cooperative relaying or opportunistic relaying, has been studied extensively [8][9][10][11][12][13]. Meanwhile, our model is applicable to the uplink transmission for cellular networks as well. In the uplink scenario, a user node becomes the source queue and utilizes multiple relay nodes equipped with queues. The packets are eventually relayed to the base station, which is the destination node, by the relay nodes.
The time-sharing service of customers between SQ and RQs is analogous to half-duplex transmissions of packets in wireless relay networks, that is, only either the BS or a RN can transmit at a time slot. While full-duplex relays are recently under investigation [14,15], half-duplex relays are still widely used [14] since they are simple to design and cost-effective. In addition, the connectivity from the SQ to RQs is assumed to be always in connected state in our model. A common architecture for cellular networks with relays proposes using nodes dedicated for relaying called Relay Stations (RSs). The RSs are typically installed in fixed locations which are in line-of-sight (LOS) to the BS [16]. Therefore, the channel quality between the BS and a RS is typically very high. Also, in D2D networks in which mobile nodes are selected to act as RNs, it makes sense to select mobiles which have good channels to the BS in the first place. Such a high quality channel between the BS and RSs can be modelled as being always in "on" state, which is assumed in our model.
We introduce a policy called Join the Shortest Queue-Longest Connected Queue (JSQ-LCQ).
Definition 1. The JSQ-LCQ policy is defined as follows: (1) If there exist RQs which are nonempty and connected, serve a customer from the longest queue among the connected RQs.
(2) If there is no RQ to serve, route a customer from the SQ to the shortest queue among the RQs.
It is observed that JSQ-LCQ is a simple and greedy policy with the following properties: (i) JSQ-LCQ prioritizes serving the RQs over serving the SQ. If there is any chance to serve a RQ, it will do so.
(ii) Among the connected RQs, it will serve the longest one. This is an attempt to make the RQs as "balanced" as possible.
(iii) When JSQ-LCQ routes a customer to a RQ, it chooses the shortest RQ. Again the policy attempts to balance the RQs.
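The decision rule of Definition 1 can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `jsq_lcq_action`, the action labels, and the tie-breaking rule (lowest index wins among equally long or equally short queues) are assumptions made here and are not specified in the paper.

```python
from typing import List, Tuple


def jsq_lcq_action(x: int, y: List[int], c: List[int]) -> Tuple[str, int]:
    """Pick the JSQ-LCQ action for one time slot.

    x    : length of the source queue (SQ)
    y[n] : length of relay queue (RQ) n
    c[n] : 1 if RQ n is connected in this slot, 0 otherwise
    Returns ("serve", n), ("route", n) or ("idle", -1).
    Ties are broken by lowest index (an assumption of this sketch).
    """
    # Rule (1): serve the longest RQ among the nonempty, connected ones.
    candidates = [n for n in range(len(y)) if y[n] > 0 and c[n] == 1]
    if candidates:
        return ("serve", max(candidates, key=lambda n: y[n]))
    # Rule (2): no RQ can be served, so route one customer to the shortest RQ.
    if x > 0:
        return ("route", min(range(len(y)), key=lambda n: y[n]))
    # Nothing to serve and nothing to route.
    return ("idle", -1)
```

For instance, with `x = 3`, `y = [2, 5, 1]` and `c = [1, 0, 1]`, the longest RQ (index 1) is disconnected, so the sketch serves index 0, the longest among the connected nonempty RQs.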
JSQ-LCQ focuses on balancing queues during the service and routing so as to maximize the "opportunism" of time-varying connectivity, that is, so that as many nonempty queues as possible can observe the connected state. LCQ is inspired by the policy in [17] with the same name. In the single-hop case where the queues are served in parallel, the LCQ policy is delay-optimal for the symmetric system; that is, connectivity and arrival processes have identical statistics across the queues [17]. Meanwhile, in a two-hop network, if all the queues are in connected state and are served in parallel without time-sharing constraints, it is well-known that the JSQ policy is delay-optimal [18]. However, two-hop networks with the time-sharing constraint and random connectivity are considered in our problem. Two-hop networks are very different from one-hop networks, since the customers served in the first hop do not leave but stay in the system, waiting to be served; moreover, the service availability randomly changes over time. The question is as follows: does there exist a delay-optimal policy for such two-hop networks? If so, can we find the optimal policy? In this paper, we answer these questions in the affirmative as follows. Assume a symmetric system; that is, the connectivity has a symmetric distribution across the RQs. Our claim is that JSQ-LCQ is optimal in a strong sense, that is, in the stochastic ordering sense.
In this paper, we establish the delay optimality of a two-hop network model with time-varying connectivity. We show that the JSQ-LCQ policy is strongly optimal, such that it minimizes the number of customers in the system in the stochastic ordering sense. We use the coupling argument to show that the queue length process under JSQ-LCQ is stochastically dominated by that under any other feasible policy. However, unlike typical coupling, we show that the coupled systems can exist in two distinct phases. Such a system behaviour is attributable to dynamic interactions among random connectivity, queue states, and the time-sharing constraint of service. The system phases are characterized in terms of certain relations between the queue states of JSQ-LCQ and another arbitrary policy in comparison. Specifically, the phases are defined based on (i) the difference in the source queue lengths and (ii) the weak majorization relations between the vectors of relay queue lengths, of the compared policies. We carefully develop the coupling argument for these phases, which leads to the stochastic dominance of the total number of customers in the system. To our knowledge, delay optimality for two-hop relay networks, even for simple channel models, has not been established yet. Considering that there exist few works on delay-optimal scheduling, we believe that our work opens a new possibility for deeper understanding of the problem. In summary, the key contributions of our work are listed as follows: this paper (i) proposes JSQ-LCQ and proves the delay optimality of the algorithm for two-hop relay networks with time-varying connectivity, which is of theoretical significance; (ii) introduces a novel coupling technique associated with the transition of system phases defined in terms of the majorization relations among source and relay queues.
This paper is organized as follows. We present related works in Section 2. In Section 3, we describe the system model. The optimality of JSQ-LCQ is proved in Section 4. Simulation results are reported in Section 5. Section 6 concludes the paper.

Related Work
Delay-optimal scheduling is not only important from an engineering perspective but also of theoretical and mathematical interest. Delay optimality is notoriously hard to achieve with time-varying service capacity, and there exist only a few results which we review below. In their seminal work [17], Tassiulas and Ephremides considered a single-hop scheduling of parallel queues with time-varying connectivity, that is, with "on-off" channels. They proposed LCQ and showed that, under the symmetry assumptions, LCQ is delay-optimal in the stochastic ordering sense. Yeh and Cohen [19] considered single-hop scheduling over multiaccess fading channels when the capacity region of the users' service has a polymatroid structure [20]. They proposed the Longest Queue Highest Possible Rate (LQHPR) policy which successively allocates higher rates to longer queues. LQHPR is shown to be delay-optimal when the arrival statistics and capacity region are symmetric across users. For the multihop case, the authors of [21] consider scheduling tandem queues with interference constraints; that is, the adjacent queues cannot be served simultaneously. In their model, queues are always in connected state. The delay-optimal policy is obtained by serving the nonempty queue closest to the destination and then serving the next nonempty and noninterfering queue closest to the destination, which is iteratively done over the entire network. Interestingly, we observe that the JSQ-LCQ policy has a similar principle to the policy in [21]: JSQ-LCQ prioritizes serving the RQs ("closer" to the destination) whenever possible, and the SQ has the lower priority. However, it is highly nontrivial to extend the optimality result to networks with time-varying connectivity like our model. Recently, Cui et al. [22] studied a two-hop network with "on-off" channels. However, there exists only one relay queue in their model, and the authors propose a policy which is asymptotically optimal using a dynamic programming (DP) approach. We observe that delay-optimal scheduling has been found only for simple models, for example, single-hop networks with symmetric service capacity. To our knowledge, even with symmetric connectivity, there is no known delay-optimal scheduling for two-hop relay networks involving multiple relay queues under the time-sharing (half-duplex) constraint.
Our scheme can be regarded as a relay selection and scheduling scheme for two-hop cooperative relay networks, that is, a cooperative transmission utilizing multiple RNs, for example, [8-10, 23, 24]. Bletsas et al. [8] consider a relay selection algorithm for two-hop cooperative relay networks where there exist multiple RNs for a source-destination pair. The "best" relay is chosen based on either the minimum or harmonic mean of the instantaneous channel gains of source-relay (S-R) and relay-destination (R-D) links. A time slot is divided by half, and S-R transmission occurs at the first half of the time slot, and R-D transmission occurs at the second half. Cui et al. [9] considered two-hop relay networks with multiple source, relay, and destination nodes. Their scheme selects RNs in an opportunistic manner, that is, based on favorable Signal-to-Noise-Ratio (SNR) among multiple S-R and R-D pairs. In [23], the authors considered a model in which the RNs have buffers where the transmission occurs over two phases (time slots) as follows. In the first time slot, a S-R pair with the best channel is scheduled, and the transmitted packet is stored in the selected relay for later transmission. In the second time slot, the best R-D pair is scheduled in which the RN transmits previously stored data. Note that these works focus on maximizing transmission rates or throughput; however, they do not consider the delay issue. For example, the queues are assumed to be infinitely backlogged in the aforementioned works. Also the links are scheduled in a fixed manner; for example, the selected S-R and R-D pairs are scheduled to transmit over two consecutive time slots. In contrast, our work not only considers relay selection but also link scheduling. For example, when our scheme selects the shortest RQ to route (JSQ), it can be regarded as selecting the "best" relay. In other words, by balancing the RQs, the policy will enhance opportunism by making as many nonempty RQs as possible observe the connected state.
In a general approach to ensure delay optimality for multihop cooperative networks, one needs a problem formulation via a Markov Decision Process (MDP), for example, [25,26]. To achieve delay optimality, one requires solving an infinite time horizon MDP. However, it is difficult to apply MDP-based policies to large systems due to the "curse of dimensionality." Wang et al. [11] considered queue-based cooperative relaying by approximately solving the MDP using a stochastic learning approach. The authors proposed a distributed online algorithm which is shown to be asymptotically optimal under the heavy-traffic limit.
In contrast to delay optimality, throughput optimal policies are relatively well-known; a policy is said to be throughput optimal if the policy makes a queueing system stable whenever stability is feasible. Tassiulas and Ephremides [27] showed that the routing and scheduling under the backpressure algorithm based on backlog differentials are throughput optimal for multihop networks with link constraints. However, the backpressure algorithm only ensures throughput optimality but does not provide any guarantee on achievable delay. A number of enhancements to the backpressure algorithm have been proposed to address the issue of delay performance [28][29][30][31][32][33][34]. The backpressure algorithm often suffers from long delays in large multihop networks. This is because the algorithm explores all possible paths from source to destination. The routing based solely on backpressure may create a long path resulting in large delays. To alleviate this problem, in [31,32] an algorithm is proposed which adaptively exploits short paths, while maintaining the stabilizing property of the backpressure algorithm. Specifically, the algorithm uses shortest paths under light traffic but utilizes longer paths with increasing traffic to ensure stability. Practical implementations of the backpressure algorithm are proposed in [35][36][37]. Note that the aforementioned works considered throughput optimal schemes; however, throughput optimality is a relatively weak form of performance as compared to delay optimality.

System Model
Consider a time-slotted system consisting of one SQ and $N$ RQs. Only one customer can be served at a time slot: either a customer is routed from the SQ to one of the RQs, or a customer is served at one of the RQs. A customer can be served from a RQ only if the RQ is in connected state. The service at the RQs has randomly varying connectivity. A RQ is connected with probability $p$, and the connectivity is independent over time slots and across the RQs (similar to [17], we do not need the independence of connectivity across the RQs. We only need the assumption that the joint distribution of connectivity is symmetric across the RQs, under which our optimality result will hold as well. However, the independence assumption will simplify the arguments on, e.g., stability and stochastic coupling, which we discuss later). The arrival of a customer at the SQ is i.i.d. over time slots with probability $\lambda$.
The number of customers at the SQ (resp., RQs) at time $t$ is denoted by $X(t) \in \mathbb{Z}_+$ (resp., $\mathbf{Y}(t) = (Y_1(t), \ldots, Y_N(t)) \in \mathbb{Z}_+^N$). Let $\mathbf{C}(t) = (C_1(t), \ldots, C_N(t)) \in \{0, 1\}^N$ be a random process indicating the connectivity at the RQs. Specifically, $C_1(t), \ldots, C_N(t)$ are i.i.d. Bernoulli random variables with parameter $p$. Denote the arrival process by $A(t) \in \{0, 1\}$. The state of the system at time $t$ is denoted by $\mathbf{S}(t) := (A(t), \mathbf{C}(t), X(t), \mathbf{Y}(t))$. We define the set of actions which a policy can take at a given time slot. An action takes a value from the set of symbols defined by $\mathcal{A} := \{I, R_1, R_2, \ldots, R_N, S_1, S_2, \ldots, S_N\}$. Symbol $I$ stands for the policy being idle, $R_n$ stands for routing a customer from the SQ to the $n$th RQ, and $S_n$ represents serving a customer from the $n$th RQ. Policy $\pi(t)$ is defined as a scheduling decision at time $t$ which takes a value from $\mathcal{A}$. Note that $\pi(t)$ is based on the entire history of scheduling actions ($\{\pi(s), s < t\}$) and system states ($\{\mathbf{S}(s), s \le t\}$). The SQ and RQs evolve as follows (1): the SQ gains one customer upon an arrival and loses one customer whenever the policy routes, while the $n$th RQ gains one customer whenever the policy takes action $R_n$ and loses one customer whenever it takes action $S_n$, for $n = 1, \ldots, N$; here $\mathbb{1}(\cdot)$ denotes the indicator function of these events.
Next we consider the condition for stability. The arrival rate to the system is $\lambda$. The service rate from the SQ to RQs is one per time slot, whereas the overall service rate at the RQs is $1 - (1 - p)^N$ on average. Hence the utilization at the first and second hops is given by $\lambda$ and $\lambda / \{1 - (1 - p)^N\}$, respectively. Due to the time-sharing constraint, the combined utilization must be less than 1; that is,
$$\lambda + \frac{\lambda}{1 - (1 - p)^N} < 1. \qquad (2)$$
It is known that throughput optimal policies such as the backpressure algorithm [27] can stabilize the system under condition (2). Later, we will show that JSQ-LCQ is delay optimal; that is, the average number of customers in the system under JSQ-LCQ is no more than that under any other policy including the backpressure algorithm. Thus, JSQ-LCQ is a stable policy under condition (2); note that delay optimality implies throughput optimality. Due to the memoryless property of connectivity, it suffices to consider $\pi(t)$ which is a static state-feedback policy; that is, there exists optimal $\pi(t)$ which bases its decision only on the present state of the system. Under such $\pi(t)$, the queue state process $(X(t), \mathbf{Y}(t))$, $t = 0, 1, 2, \ldots$, forms a Markov chain; if $\pi(t)$ is throughput optimal, the Markov chain is stationary and ergodic under stability condition (2).
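As a small numerical companion to the stability discussion, the following Python sketch evaluates the combined utilization of the two hops and the largest arrival rate for which it stays below 1. The symbols are assumptions of this sketch: `lam` is the per-slot arrival probability, `p` the per-slot connection probability of each RQ, and `n_relays` the number of RQs; the function names are hypothetical.

```python
def second_hop_service_rate(p: float, n_relays: int) -> float:
    """Probability that at least one of n_relays independent RQs is connected."""
    return 1.0 - (1.0 - p) ** n_relays


def combined_utilization(lam: float, p: float, n_relays: int) -> float:
    """First-hop utilization plus second-hop utilization under time sharing."""
    return lam + lam / second_hop_service_rate(p, n_relays)


def is_stable(lam: float, p: float, n_relays: int) -> bool:
    """Stability condition: combined utilization strictly below 1."""
    return combined_utilization(lam, p, n_relays) < 1.0


def max_stable_arrival_rate(p: float, n_relays: int) -> float:
    """Supremum of arrival rates meeting the condition: mu / (1 + mu)."""
    mu = second_hop_service_rate(p, n_relays)
    return mu / (1.0 + mu)
```

The closed form in `max_stable_arrival_rate` follows from solving `lam * (1 + 1/mu) < 1` for `lam`, where `mu` is the second-hop service rate; as the number of relays grows, `mu` tends to 1 and the bound tends to 1/2, reflecting that every customer needs two slots of the shared server.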

Delay Optimality of JSQ-LCQ
In this section, we will prove the delay optimality of JSQ-LCQ. We will show that JSQ-LCQ is optimal in the stochastic ordering sense which we define as follows. For two random variables $U$ and $V$ in $\mathbb{R}$, let $U \sim V$ denote that they have the identical distribution.
Definition 2. A random variable $U$ is said to be stochastically smaller than $V$, denoted by $U \le_{st} V$, if there exist random variables $\hat{U}$ and $\hat{V}$ defined on a common probability space such that (1) $\hat{U} \sim U$, (2) $\hat{V} \sim V$, and (3) $\hat{U} \le \hat{V}$ almost surely. Equivalently,
$$\Pr(U > z) \le \Pr(V > z) \quad \text{for all } z \in \mathbb{R}. \qquad (3)$$
For vector $\mathbf{x} \in \mathbb{R}^N$, let $\|\mathbf{x}\| := \sum_{i=1}^{N} |x_i|$. We state the main theorem as follows.
Theorem 3. Let $X(t)$ and $\mathbf{Y}(t)$ denote the queue length processes of the SQ and RQs under JSQ-LCQ. Also let $\tilde{X}(t)$ and $\tilde{\mathbf{Y}}(t)$ denote the length of the SQ and RQs under an arbitrary policy. Suppose $(X(t), \mathbf{Y}(t))$ and $(\tilde{X}(t), \tilde{\mathbf{Y}}(t))$ are in an arbitrary initial state $(x_0, \mathbf{y}_0)$ at time $t = 0$. Then, for $t \ge 0$,
$$X(t) + \|\mathbf{Y}(t)\| \le_{st} \tilde{X}(t) + \|\tilde{\mathbf{Y}}(t)\|. \qquad (4)$$
Theorem 3 states that the number of customers in the system under JSQ-LCQ is stochastically smaller than that under any other policy. By Little's law, the theorem implies that JSQ-LCQ will minimize the delay in the stochastic ordering sense. Theorem 3 is a much stronger statement than, for example, achieving the minimum average delay, as we can see from (3). To prove Theorem 3, we will use stochastic coupling arguments leveraging the forward induction technique [18]. Specifically, we show that one can construct the sample paths of queue length processes under JSQ-LCQ and another arbitrary policy by properly coupling the arrival and connectivity processes so that (4) holds.

4.1. Coupling. Let $\pi(t)$ denote the JSQ-LCQ policy. Let $\tilde{\pi}(t)$ denote another arbitrary policy. $(X(t), \mathbf{Y}(t))$ (resp., $(\tilde{X}(t), \tilde{\mathbf{Y}}(t))$) in Theorem 3 denotes the queue length processes under $\pi(t)$ (resp., $\tilde{\pi}(t)$). In the following we use forward induction [18, Section 8.3] by coupling the connectivity of the RQs under $\pi(t)$ and $\tilde{\pi}(t)$ as follows. Suppose the process of RQs under $\pi(t)$, or $\mathbf{Y}(t)$, has the connectivity given by $\mathbf{C}(t) = (C_1, \ldots, C_N)$ at time $t$. We will couple $\mathbf{C}(t)$ with the connectivity variables for $\tilde{\mathbf{Y}}(t)$ as follows: if the $k$th longest queue of $\mathbf{Y}(t)$ has the connectivity $c \in \{0, 1\}$, then let the $k$th longest queue of $\tilde{\mathbf{Y}}(t)$ have the same connectivity $c$, for $k = 1, \ldots, N$. In other words, $Y_{[k]}(t)$ and $\tilde{Y}_{[k]}(t)$ see the same connectivity variable by the coupling for $k = 1, \ldots, N$. This coupling will not change the marginal distributions of the connectivity seen by $\mathbf{Y}(t)$ and $\tilde{\mathbf{Y}}(t)$ as is required in (1) and (2) of Definition 2 (see also [18, Proposition 8.3.2]), because this coupling involves simply permuting the connectivity variables across the RQs. Specifically, the connectivity variables are i.i.d. across the RQs and thus their joint distribution is symmetric or invariant to permutation; that is,
$$\Pr(C_1 = c_1, \ldots, C_N = c_N) = \Pr(C_{P(1)} = c_1, \ldots, C_{P(N)} = c_N),$$
where $P$ is an arbitrary permutation of the index set $\{1, 2, \ldots, N\}$. Next, the arrivals to the system are coupled as follows: if an arrival occurs at the SQ under $\pi(t)$, then let there be an arrival at the SQ under $\tilde{\pi}(t)$. In the rest of the proof, we will assume that the queue length processes under $\pi$ and $\tilde{\pi}$ are coupled in the above fashion.
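The rank-based connectivity coupling described above can be illustrated with a short Python sketch. It is a sketch under assumptions: the function names are hypothetical, and ties between equally long queues are broken by lowest index, which the text does not specify. The arguments `y` and `c` are the RQ lengths and connectivity under JSQ-LCQ, and `y_other` holds the RQ lengths under the compared policy.

```python
from typing import List


def rank_order(y: List[int]) -> List[int]:
    """Indices of y sorted from longest to shortest queue (ties by index)."""
    return sorted(range(len(y)), key=lambda n: (-y[n], n))


def couple_connectivity(y: List[int], c: List[int], y_other: List[int]) -> List[int]:
    """Connectivity vector seen by the compared policy under the coupling.

    The k-th longest queue of y_other receives the connectivity bit of the
    k-th longest queue of y, so the bits are only permuted across queues.
    """
    order = rank_order(y)
    order_other = rank_order(y_other)
    c_other = [0] * len(y_other)
    for k in range(len(y)):
        c_other[order_other[k]] = c[order[k]]
    return c_other
```

Because the output is a permutation of `c`, the number of connected queues is preserved, which mirrors the argument that the coupling leaves the marginal distribution of the connectivity unchanged when the bits are i.i.d. across the RQs.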
Unlike previous works on single-hop scheduling, the coupling argument in our problem must consider dynamic interactions among queues, connectivity, and the half-duplex constraint, as follows. JSQ-LCQ prioritizes serving the RQs; that is, it will serve the RQs whenever possible, in a balanced manner. Thus, the RQs will tend to be short under JSQ-LCQ. This means that, due to half-duplex operation, the SQ will get relatively long. However, if many RQs become empty due to prioritized service, the number of nonempty and connected RQs will become small. Hence JSQ-LCQ may be forced to frequently route customers to the RQs, in a balanced manner, in which case the RQs will build up. However, as the number of nonempty RQs grows, there will be many nonempty and connected RQs, and JSQ-LCQ will again begin to actively serve the RQs. Thus, we observe that the system exhibits some cyclic patterns in the services and evolution of the queue states.
Based on this observation, we identify that there exist two distinct phases in our coupling process. In the first phase, the RQs tend to be short and the SQ tends to be long under JSQ-LCQ. In the second phase, the SQ tends to be short and the RQs tend to be long. The first phase is called the weak majorization (WM) phase, and the second phase is called the water-filling majorization (WFM) phase. We will explain the phases in more detail in the subsequent sections. We show that, by introducing the concept of phases, we are able to handle the aforementioned patterns in system behaviour. Specifically, under proper coupling, we show that the system remains in either of the two phases or makes the transition to the other phase. Later we will show that this construction implies the desired stochastic majorization given by (4).

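4.2. Weak Majorization (WM) Phase. For vector $\mathbf{x} \in \mathbb{R}^N$, let $x_{[i]}$ denote the $i$th largest entry of $\mathbf{x}$.
Definition 4. For vectors $\mathbf{x}$ and $\mathbf{y}$ in $\mathbb{R}^N$, if
$$\sum_{i=1}^{k} x_{[i]} \le \sum_{i=1}^{k} y_{[i]}, \quad k = 1, \ldots, N,$$
$\mathbf{x}$ is said to be weakly majorized by $\mathbf{y}$, which is denoted as $\mathbf{x} \prec_w \mathbf{y}$.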
Recall that $(X(t), \mathbf{Y}(t))$ (resp., $(\tilde{X}(t), \tilde{\mathbf{Y}}(t))$) is the queue length process under JSQ-LCQ or $\pi$ (resp., an arbitrary policy or $\tilde{\pi}$).
Definition 5. We say the system is in weak majorization (WM) phase if (1) the following relation holds:
$$0 \le X(t) - \tilde{X}(t) \le \tfrac{1}{2}\left(\|\tilde{\mathbf{Y}}(t)\| - \|\mathbf{Y}(t)\|\right); \qquad (8)$$
equivalently, there exists integer $\delta \ge 0$ such that
$$X(t) = \tilde{X}(t) + \delta, \qquad \|\mathbf{Y}(t)\| + 2\delta \le \|\tilde{\mathbf{Y}}(t)\|;$$
(2) $\mathbf{Y}(t)$ is weakly majorized by $\tilde{\mathbf{Y}}(t)$; that is,
$$\mathbf{Y}(t) \prec_w \tilde{\mathbf{Y}}(t). \qquad (9)$$
We discuss the implication of the WM phase. Firstly, JSQ-LCQ will greedily serve the RQs whenever possible, making the overall length of the RQs small. By contrast, the customers that arrived at the SQ will have to wait relatively long due to the half-duplex constraint. Condition (8) represents these properties; that is, the SQ under JSQ-LCQ is relatively long, and the sum-length of the RQs is relatively small. In addition, if we rearrange the inequality on the right of condition (8), we obtain
$$\|\mathbf{Y}(t)\| + \delta \le \|\tilde{\mathbf{Y}}(t)\| - \delta, \qquad (10)$$
which is interpreted as follows. There is an excess of $\delta$ customers backlogged at the SQ under JSQ-LCQ. Thus, if we compare only the SQs, JSQ-LCQ appears to be $\delta$ customers "behind" $\tilde{\pi}$. Now suppose JSQ-LCQ adds these $\delta$ customers to $\|\mathbf{Y}(t)\|$ by serving the SQ $\delta$ times to "catch up" with $\tilde{\pi}$, which takes $\delta$ time slots. During this time interval, $\tilde{\pi}$ can serve $\delta$ customers from the RQs to push the customers out of the system as much as possible, in which case $\|\tilde{\mathbf{Y}}(t)\|$ is reduced by $\delta$. However, (10) shows that, even after such actions by $\tilde{\pi}$, the number of customers under JSQ-LCQ still is no more than that under $\tilde{\pi}$. Thus, condition (8) indicates that JSQ-LCQ is in fact sufficiently "ahead" of $\tilde{\pi}$ in WM phase. Secondly, not only will the RQs be short, but also they are well "balanced" due to the JSQ-routing and LCQ-scheduling principles. Condition (9) represents this property. An example of the system in WM phase is depicted in Figure 2. However, the system may get out of WM phase if most of the RQs are emptied out, after which JSQ-LCQ will mainly serve the SQ. Consequently, the RQs will become relatively long but the SQ will become relatively short, in contrast to WM phase. In that case, the system makes the transition to WFM phase, which we will define and discuss in detail in Section 4.3.
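To make the two WM-phase conditions concrete, here is a hedged Python sketch of a checker. It assumes that condition (8) reads as "the SQ under JSQ-LCQ exceeds the SQ under the compared policy by some integer `delta >= 0`, and the RQs of the compared policy hold at least `2 * delta` more customers in total", which is how the condition is used later in the proof. The function names and argument names (`x_other`, `y_other` for the compared policy) are hypothetical.

```python
from typing import List


def weakly_majorized(y: List[int], y_other: List[int]) -> bool:
    """True if every top-k partial sum of y is at most that of y_other."""
    a = sorted(y, reverse=True)
    b = sorted(y_other, reverse=True)
    sum_a = sum_b = 0
    for u, v in zip(a, b):
        sum_a += u
        sum_b += v
        if sum_a > sum_b:
            return False
    return True


def in_wm_phase(x: int, y: List[int], x_other: int, y_other: List[int]) -> bool:
    """Check the two WM-phase conditions for a pair of coupled queue states."""
    delta = x - x_other  # excess backlog at the SQ under JSQ-LCQ
    # Condition (8), as assumed here: delta >= 0 and a surplus of 2*delta in the RQs.
    sq_condition = delta >= 0 and sum(y) + 2 * delta <= sum(y_other)
    # Condition (9): RQs under JSQ-LCQ are weakly majorized by those of the other policy.
    return sq_condition and weakly_majorized(y, y_other)
```

Note that two identical states always pass the check (with `delta = 0`), which matches the observation in the proof that the system starts in WM phase at time 0.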
To use forward induction we will show that if the system is in WM phase at time $t$, under a proper coupling of connectivity and arrivals, the system will either remain in WM phase or make the transition to WFM phase at time $t + 1$. Later, we will make a similar coupling argument for the system in WFM phase: a system in WFM phase at time $t$ either stays in WFM phase or makes the transition to WM phase at time $t + 1$. This implies that the same argument holds for all time $\tau$ such that $\tau \ge t$ due to forward induction; that is, the aforementioned relation between the queue states will propagate over time through coupling [18,21].
Figure 2: Example of the system in weak majorization (WM) phase.
Lemma 6. Consider the queue length processes defined in Theorem 3. There exists coupling between $(X(t), \mathbf{Y}(t))$ and $(\tilde{X}(t), \tilde{\mathbf{Y}}(t))$ such that if the system is in WM phase at time $t \ge 0$, either the system remains in WM phase or it makes the transition to WFM phase at time $t + 1$.
Proof. Initially, at $t = 0$, the queue states are identical to initial condition $(x_0, \mathbf{y}_0)$. By definition, the system is in WM phase at time 0 because (8)-(9) are satisfied when the queue states are identical. Now consider the system at time $t$ where we make the induction hypothesis; that is, the system is in WM phase at time $t$. Once the connectivity and queue states are coupled, $\pi(t)$ and $\tilde{\pi}(t)$ may take different actions from $\mathcal{A}$. For the sake of simplicity, we will define new symbols for actions denoted by $S$, $R$, and $I$ with some abuse of notation: (i) $\pi(t) = S$ denotes that a service has occurred at one of the RQs (a precise notation will be $\pi(t) \in \{S_1, \ldots, S_N\}$) under $\pi$ at time $t$.
(ii) $\pi(t) = R$ denotes that a routing from the SQ to one of the RQs (a precise notation will be $\pi(t) \in \{R_1, \ldots, R_N\}$) has occurred under $\pi$ at time $t$.
(iii) $\pi(t) = I$ denotes that the policy idles at time $t$.
For instance, $(\pi(t), \tilde{\pi}(t)) = (S, R)$ denotes the event that $\pi$ served a RQ and $\tilde{\pi}$ routed a customer from the SQ to a RQ. We will consider a total of 9 cases, since $(\pi(t), \tilde{\pi}(t))$ can possibly take 9 action pairs. Recall that, in all cases, we will use the coupling introduced in Section 4.1; that is, $Y_{[k]}(t)$ and $\tilde{Y}_{[k]}(t)$ see the same connectivity variable for $k = 1, \ldots, N$. Also $X(t)$ and $\tilde{X}(t)$ see the same arrival variable.
Case 2 ($(\pi(t), \tilde{\pi}(t)) = (R, R)$). Both policies routed a customer to a RQ; thus (8) clearly holds at time $t + 1$. Next, suppose $\tilde{\pi}$ has routed a customer to the $k$th longest RQ, whereas $\pi$ has routed a customer to the shortest RQ. We have that, for any $m = 1, \ldots, N$,
$$\sum_{i=1}^{m} Y_{[i]}(t+1) \le \sum_{i=1}^{m} \tilde{Y}_{[i]}(t+1),$$
which holds due to $k \le N$ and the induction hypothesis. Thus, we have that $\mathbf{Y}(t+1) \prec_w \tilde{\mathbf{Y}}(t+1)$; that is, (9) holds at time $t + 1$. Thus the system is in WM phase at time $t + 1$.
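Suppose the service occurred at queue $i$ under $\pi$, and the routing occurred at queue $j$ under $\tilde{\pi}$. Also it is implied that $\tilde{X}(t) > 0$ in this case. Since the arrivals are coupled, we have that
$$X(t+1) - \tilde{X}(t+1) = X(t) - \tilde{X}(t) + 1, \qquad (13)$$
$$\|\tilde{\mathbf{Y}}(t+1)\| - \|\mathbf{Y}(t+1)\| = \|\tilde{\mathbf{Y}}(t)\| - \|\mathbf{Y}(t)\| + 2. \qquad (14)$$
Let $\delta = X(t) - \tilde{X}(t) \ge 0$. From (13), we have that $X(t+1) - \tilde{X}(t+1) = \delta + 1$. Also $\|\tilde{\mathbf{Y}}(t)\| - \|\mathbf{Y}(t)\| \ge 2\delta$ implies that, from (14), $\|\tilde{\mathbf{Y}}(t+1)\| - \|\mathbf{Y}(t+1)\| \ge 2(\delta + 1)$. Thus, (8) is satisfied at time $t + 1$. Consequently, the system is in WM phase at time $t + 1$.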
Next we check if weak majorization (9) holds at time +1.Let us define  fl  [] ().Suppose that there were  more queues which have had length  at time .
Case 5.2 ($X(t) = \tilde{X}(t)$). In this case, we have $X(t+1) < \tilde{X}(t+1)$. Thus, condition (8) ceases to hold, and the system makes the transition to WFM phase. We will discuss WFM phase in detail and prove the transition of phases in the next section.
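Case 6 ($(\pi(t), \tilde{\pi}(t)) = (I, S)$). This is the case where $\pi$ idles and $\tilde{\pi}$ serves a RQ. Since $\pi$ is idle, we must have $X(t) = 0$, which in turn implies $\tilde{X}(t) = 0$ due to induction hypothesis (8). Suppose the service has occurred on the $k$th longest RQ under $\tilde{\pi}$. We must have $Y_{[k]}(t) = 0$; otherwise $\pi$ would have served the $k$th longest RQ instead of idling, because the $k$th longest RQ is in connected state due to coupling. Given $Y_{[k]}(t) = 0$, we cannot have $\|\mathbf{Y}(t)\| = \|\tilde{\mathbf{Y}}(t)\|$: since $\tilde{Y}_{[k]}(t) \ge 1$, this would give $\sum_{i=1}^{k-1} Y_{[i]}(t) = \|\mathbf{Y}(t)\| > \sum_{i=1}^{k-1} \tilde{Y}_{[i]}(t)$, which will violate induction hypothesis (9). Thus, we have that $\|\mathbf{Y}(t)\| \le \|\tilde{\mathbf{Y}}(t)\| - 1$. This implies that $\|\mathbf{Y}(t+1)\| \le \|\tilde{\mathbf{Y}}(t+1)\|$. Combining this with the induction hypothesis (9), we conclude that $\mathbf{Y}(t+1) \prec_w \tilde{\mathbf{Y}}(t+1)$.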
Case 9 ($(\pi(t), \tilde{\pi}(t)) = (R, S)$). In this case, $\pi$ serves the SQ and $\tilde{\pi}$ serves a RQ. Suppose the service has occurred at the $k$th longest queue of $\tilde{\mathbf{Y}}(t)$. $\pi$ will route the customer to the shortest queue in $\mathbf{Y}(t)$, or $Y_{[N]}(t)$. Since a departure occurred at the $k$th longest queue of $\tilde{\mathbf{Y}}(t)$ but did not occur at $\mathbf{Y}(t)$, we must have $Y_{[i]}(t) = 0$ for $i \ge k$. Thus, the customer is routed to one of the empty RQs under $\pi$, which leads to $Y_{[k]}(t+1) \le 1$. We will consider two cases: $X(t) = \tilde{X}(t)$ and $X(t) > \tilde{X}(t)$.
Secondly, consider the case $\tilde{Y}_{[k]}(t) = 1$. Suppose there was more than one queue with length 1 under $\tilde{\pi}$ at time $t$.
After the policies took the actions, we have that By induction hypothesis, there have been at least 2 > 0 more customers for Ỹ() than those for Y().Thus if we combine ( 31)- (32) we have that Accordingly, we claim that the following holds for all  ≤ : Thus, (8) holds at time +1.In conclusion, the system remains in WM phase at time  + 1.
Case 9.2 ($X(t) = \tilde{X}(t)$). In this case, the system makes the transition to WFM phase which we introduce in the next section. The phase transition will be proved later as well.

4.3. Water-Filling Majorization (WFM) Phase.
We have the following definition.
Definition 7. Consider  and x in Z + and -dimensional vectors y and ỹ in Z  + .We say (, y) is water-filling majorized by ( x, ỹ) denoted by if the following holds: (1)  < x.
(2) Let  denote the difference between  and x; that is,  = x − . Let ỹ ∈ Z  + be a vector formed by adding  to the entries of ỹ in a "water-filling" manner. Specifically, let () be a number that satisfies Let us add [() − ỹ[] ] + to ỹ[] for  = 1, . . ., , and let ỹ denote the resulting vector. Then we have that

We say that the system is in WFM phase if, at time , In other words, we have

Figure 3: Example of the system in water-filling majorization (WFM) phase. The SQ under π̃ has an excess of X̃(t) − X(t) = 2 customers compared to that under π, which satisfies (40). The RQs at time t under JSQ-LCQ are not necessarily better balanced than those under π̃; we observe that Y(t) is not weakly majorized by Ỹ(t) if we compare the systems in the upper left and the bottom left. However, if we perform a "water-filling" routing of these 2 customers from X̃(t) to Ỹ(t), we observe that Y(t) is better balanced than the resulting water-filled vector, if we compare the queues in the upper right and the bottom right. Thus, Y(t) is weakly majorized by the water-filled vector, which satisfies (41).
We discuss the implication of the WFM phase. As mentioned earlier, JSQ-LCQ prioritizes serving the RQs and hence tends to generate many empty RQs. As a result, JSQ-LCQ may be forced to route customers, which results in a short SQ, for example, as in (40), and causes the RQs to build up. During the WFM phase, it is possible that Y(t) is not weakly majorized by Ỹ(t); even the number of customers in the RQs under JSQ-LCQ can be larger than that under π̃, as in the example in Figure 3. However, from a broader perspective, the queues are still "well-balanced" under JSQ-LCQ in the WFM phase, in the following sense. The SQ under π̃ holds more customers than the SQ under JSQ-LCQ; if we distribute those excess customers to the RQs in the "water-filling" manner (equivalently, route them one after another using JSQ), then Y(t) is weakly majorized by the resulting vector of RQs. Put differently, a shorter SQ means that JSQ-LCQ is still "ahead" of π̃ in pushing customers closer to the destination. Even if π̃ attempts to "catch up" in a balanced manner, that is, by routing the excess head-of-line customers in its SQ to the RQs in a water-filling manner, the RQs under JSQ-LCQ remain better balanced. In summary, if we compare only the RQs in the WFM phase, JSQ-LCQ may appear worse than other policies; however, if we consider the SQ and RQs jointly, we find that JSQ-LCQ is better off in terms of balancing queues, which leads to enhanced opportunism.
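The water-filling comparison described above can be illustrated with a minimal Python sketch. It assumes the standard partial-sum definition of weak majorization (the sum of the k largest entries of one vector is at most that of the other, for every k); the function names and the numerical example are hypothetical and are not taken from Figure 3.

```python
import heapq


def water_fill(queues, extra):
    """Route `extra` customers one at a time to the currently shortest queue (JSQ).

    Returns the resulting queue lengths sorted in decreasing order.
    """
    heap = list(queues)
    heapq.heapify(heap)
    for _ in range(extra):
        shortest = heapq.heappop(heap)
        heapq.heappush(heap, shortest + 1)
    return sorted(heap, reverse=True)


def weakly_majorized(y, y_tilde):
    """Return True if y is weakly majorized by y_tilde.

    Assumed definition: for every k, the sum of the k largest entries of y
    does not exceed the sum of the k largest entries of y_tilde.
    """
    assert len(y) == len(y_tilde)
    sum_y = sum_t = 0
    for a, b in zip(sorted(y, reverse=True), sorted(y_tilde, reverse=True)):
        sum_y += a
        sum_t += b
        if sum_y > sum_t:
            return False
    return True


# Hypothetical example: the SQ under the other policy holds 2 more customers.
y_jsq_lcq = [2, 2, 1]   # RQs under JSQ-LCQ
y_other = [3, 0, 0]     # RQs under the other policy
excess = 2              # difference between the two SQ lengths

filled = water_fill(y_other, excess)
print(weakly_majorized(y_jsq_lcq, y_other))  # RQs alone: comparison fails
print(weakly_majorized(y_jsq_lcq, filled))   # after water-filling: comparison holds
```

In this example the RQs under JSQ-LCQ hold more customers than those under the other policy, so the direct comparison fails, yet the comparison holds once the excess SQ customers are water-filled into the other policy's RQs.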
In the proof of Lemma 6, we argued that in Cases 5.2 and 9.2 the system makes the transition to the WFM phase. We now check that the transition actually occurs, that is, that (X(t + 1), Y(t + 1)) is water-filling majorized by (X̃(t + 1), Ỹ(t + 1)) in those cases. Below we continue and conclude the proof.
Suppose there was a service from the th longest queue from Ỹ().The th longest queue of Y() must be empty; otherwise the queue must have been served under .Thus, we have that In order to construct Ỹ ( + 1), we first need to take action ((), π()) = (, ) and then perform water-filling routing of  customers to Ỹ( + 1).Suppose these steps are taken in the following sequence: (1) π() takes action .
Next we consider the coupling of queues under π and π̃ in the WFM phase.

Lemma 8. There exists a coupling between the queue length processes (X(t), Y(t)) and (X̃(t), Ỹ(t)) such that if the system is in the WFM phase at time t ≥ 0, either the system remains in the WFM phase or it makes the transition to the WM phase at time t + 1.
Proof. As previously, the queue connectivity is coupled such that the RQs of the same rank in Y(t) and Ỹ(t) (the longest under each policy, the second longest under each, and so on) have the same connectivity variable. The arrivals at X(t) and X̃(t) are coupled; that is, they see the same arrival variable. Similar to the WM phase, we consider a total of 9 action pairs (π(t), π̃(t)).
Secondly, suppose that the th longest RQ has been served under  and the th longest RQ has been served under π.In the proof of Case 1 of Lemma 6, we have shown that  ≥  holds due to the LCQ policy in .We further showed that the following holds: Since Ỹ ( + 1) is obtained by routing  fl X() − () > 0 customers to Y( + 1), thus clearly we have that Ỹ( + 1) ≺  Ỹ (+1) holds.Consequently, Y(+1) ≺  Ỹ(+1) holds, which shows that (41) holds at time +1.Therefore, the system remains in WFM phase at time  + 1.
Next we will show that Y( + 1) ≺  Ỹ ( + 1).Suppose the service has occurred at th longest queue of Ỹ().Let us denote the index of this th longest queue by   .Due to coupling of connectivity, we must have that the th longest queue of Y() is zero.In order to construct Ỹ ( + 1), we need to take two steps on Ỹ(): (i) serve RQ   and (ii) route  + 1 customers to the RQs in a water-filling manner.We will rearrange these steps to compare Y() and Ỹ () as follows: (1) Serve a customer from RQ   under π.
(2) Route  customers to the RQ in the water-filling manner under π.(3) Perform JSQ of a customer from () and X() so as to yield Y( + 1) and Ỹ ( + 1), respectively.
Let Z() denote the vector of RQs under π after steps (1) and (2) are completed.We will consider two cases.
Case 4.1.This is the case where the following is assumed: Since  [] () = 0, we have that  Case 4.2.This is the case where the following is assumed: In order for (45) to hold, we must have Next, we will show that the following holds: Next, we consider Z().From ( 46) and ( 47), ( − 1)th longest queue is the shortest queue in Z(); otherwise Y() and Z() will differ by more than two customers, violating induction hypothesis (41).Thus, in step (3), one customer from X() will be routed to the ( − 1)th shortest queue of Z().Thus we have that, from (46), (50) Also, in (47), we must have  [−1] () > 0 because it is formed by water-filling of  > 0 customers.Then we have that Thus, we have that Y( + 1) ≺  Ỹ ( + 1); that is, (41) holds at time  + 1.We conclude that the system remains in WFM phase at time  + 1.
Case 5 (((), π()) = (, )).In this case, we can show that the system remains in WFM phase at time  + 1 in a similar manner to that used for (, ), because action pair (, ) can be regarded as a special case of (, ) with () = 0 (i.e.,  "routes" zero customers to the RQ).

Proof and Remarks.
We are now ready to prove Theorem 3.
Proof of Theorem 3. Lemmas 6 and 8 imply that we can couple the queue length processes such that the system is either in WM phase or in WFM phase for all  ≥ , by using forward induction [18].
which implies (55) as well. In conclusion, using the proposed coupling, we can construct sample paths under which (55) is satisfied for all t = 0, 1, 2, .... This completes the proof of (3).
Remark 9. One could ask whether we can construct a direct coupling between the sum-queue processes that leads to delay optimality. That is, if we define Z(t) := X(t) + ‖Y(t)‖ and Z̃(t) := X̃(t) + ‖Ỹ(t)‖, can we directly couple Z(t) and Z̃(t) to show optimality, instead of coupling the vectors of SQ and RQs as in our proof? We believe this is quite unlikely, because the information on individual queue lengths, which is lost if we take the sum of queue lengths, is vital in proving the optimality of JSQ-LCQ. For example, consider the action pair in which JSQ-LCQ routes a customer while π̃ serves an RQ: the number of customers in the system reduces by one under π̃, whereas it remains fixed under JSQ-LCQ. Thus π̃ appears to get ahead of JSQ-LCQ in terms of reducing the sum-queue length. However, this is not true in the long term, as we have analyzed through coupling in the WM and WFM phases, which were defined based on the (weak) majorization of queue vectors. Importantly, in a half-duplex two-hop network, we route a customer at the expense of the opportunity to serve an RQ in that time slot, and vice versa. Routing and service are therefore inherently in a tradeoff relation, and it is crucial that the scheduling and routing decisions properly balance the queues so as to exploit the time-varying opportunity in channel connectivity. We have rigorously shown that JSQ-LCQ captures those aspects and, as a result, achieves delay optimality in the stochastic ordering sense.

Simulation
In this section, we evaluate the performance of JSQ-LCQ via simulations. We used MATLAB as the discrete event simulator. A time-slotted system is simulated with the following parameters: the probability that a customer arrives at the source queue in a slot; the probability that a relay queue is in the ON state; and the number of relay nodes. The performance metric in our simulation is the average number of customers in the system. We compared JSQ-LCQ with the well-known backpressure (BP) scheduling algorithm. Note that the BP algorithm is known to be throughput optimal; however, it does not guarantee delay optimality.
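A time-slotted simulation of this kind can be sketched as follows. This is a minimal Python illustration rather than the MATLAB simulator used for the reported results; the function and parameter names are hypothetical, ties are broken by the lowest queue index, and the ordering of events within a slot (connectivity, then the scheduling decision, then the arrival) is an assumption.

```python
import random


def jsq_lcq_step(x, y, connected):
    """Apply one JSQ-LCQ decision; y is updated in place.

    x: source queue length, y: list of relay queue lengths,
    connected: list of booleans (True if the relay queue is ON).
    Returns the new source queue length and the action taken.
    """
    servable = [i for i in range(len(y)) if y[i] > 0 and connected[i]]
    if servable:
        # (1) Serve the longest nonempty connected relay queue.
        i = max(servable, key=lambda k: y[k])
        y[i] -= 1
        return x, "serve"
    if x > 0:
        # (2) Otherwise route one customer to the shortest relay queue.
        j = min(range(len(y)), key=lambda k: y[k])
        y[j] += 1
        return x - 1, "route"
    return x, "idle"


def simulate(arrival_prob, on_prob, num_relays, num_slots, seed=0):
    """Return the time-averaged number of customers in the system."""
    rng = random.Random(seed)
    x, y = 0, [0] * num_relays
    total = 0
    for _ in range(num_slots):
        connected = [rng.random() < on_prob for _ in range(num_relays)]
        x, _action = jsq_lcq_step(x, y, connected)
        if rng.random() < arrival_prob:  # assumed: arrival joins after the decision
            x += 1
        total += x + sum(y)
    return total / num_slots


print(simulate(arrival_prob=0.2, on_prob=0.8, num_relays=4, num_slots=100_000))
```

Replacing `jsq_lcq_step` with another decision rule while reusing the same seed gives a comparison on common random numbers, in the spirit of the coupling used in the proofs.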
Figure 4 shows the performance of JSQ-LCQ and BP against the number of relay nodes, which we vary from 2 to 9 while the other two parameters are fixed at 0.14 and 0.1. We observe that JSQ-LCQ achieves a lower average number of customers for every number of relay nodes. We also observe that the average number of customers decreases as the number of relay nodes increases, for both policies. This is because the overall service rate at N relays is given by 1 − (1 − p)^N, where p denotes the probability that a relay queue is in the ON state, and this rate increases in N. The rate of decrease in the number of customers is higher at lower values of N, which can also be explained by the dependency of the service rate 1 − (1 − p)^N on N; that is, the service rate increases faster for smaller N.
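The diminishing gain from additional relays can be checked numerically. The short Python sketch below evaluates 1 − (1 − p)^N for N = 2, ..., 9; the value p = 0.1 and the function name are chosen for illustration only.

```python
def relay_service_rate(on_prob, num_relays):
    """Probability that at least one of the relay queues is in the ON state."""
    return 1.0 - (1.0 - on_prob) ** num_relays


on_prob = 0.1  # illustrative value
rates = [relay_service_rate(on_prob, n) for n in range(2, 10)]
# Increment in service rate obtained by adding one more relay.
gains = [later - earlier for earlier, later in zip(rates, rates[1:])]

for n, rate in zip(range(2, 10), rates):
    print(f"N={n}: service rate = {rate:.4f}")
```

The rates increase in N while the per-relay increments shrink geometrically by the factor 1 − p, which matches the faster decrease in the number of customers observed at small N.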
Figure 5 shows the performance of JSQ-LCQ and BP against the probability p that a relay queue is in the ON state. The number of relay nodes N and the arrival probability are set to 4 and 0.2, respectively. We observe that the number of customers is smaller with JSQ-LCQ for all values of p. We also observe that the average number of customers decreases as a convex function of p, similar to Figure 4; however, the rate of decrease is not as steep as that with respect to N, because the service rate 1 − (1 − p)^N is a polynomial function of p, in contrast to its exponential dependence on N.
Figure 6 shows the performance of JSQ-LCQ and BP against the arrival rate. The connectivity probability and the number of relay nodes are set to 0.8 and 4, respectively. We observe that JSQ-LCQ outperforms BP over all values of the arrival rate. We also observe that the number of customers increases with the arrival rate and blows up as the arrival rate approaches the boundary of condition (2). In conclusion, the simulation results confirm that JSQ-LCQ achieves a smaller average number of customers in the system than BP for all simulated parameters, as expected from our theoretical result on optimality.

Conclusion
In this paper, we studied a delay-optimal policy for half-duplex two-hop relay networks with symmetric connectivity on the service. We showed that the JSQ-LCQ policy is strongly optimal; that is, the total queue length under JSQ-LCQ is stochastically dominated by that under any other policy. Our future work includes devising simple policies for two-hop relay networks with asymmetric connectivity and studying delay optimality for cooperative relay networks with more than two hops. Under certain realistic conditions for time-varying channels, for example, fading due to user mobility with power-law distributions [39], the i.i.d. assumption of channel connectivity over time may not hold, and JSQ-LCQ is not guaranteed to be delay-optimal. In fact, for many realistic channel models, it is difficult to guarantee delay optimality. Nevertheless, the significance of our work is that we have established delay optimality for a two-hop network model with time-varying connectivity, despite its simplicity, given that only a limited number of works exist on delay-optimal scheduling.

Figure 1: Two-hop queueing network model. The system consists of one source queue (SQ) and multiple relay queues (RQs). Only one customer can be served at any instant (half duplex). A binary variable represents the connectivity of each RQ.

Figure 4: Comparison of the average number of customers in the system versus the number of relay nodes.

Figure 5: Comparison of the average number of customers in the system versus the connectivity probability of relays.

Figure 6: Comparison of the average number of customers in the system versus the arrival rate.
That is, because only one customer is served from the th longest RQ in Ỹ(),  [−1] () and  [−1] () cannot differ by more than one customer; otherwise it would violate the induction hypothesis.
If the system is in WM phase at time , it is implied that