Load Balancing Opportunistic Routing for Cognitive Radio Ad Hoc Networks

Recent research activities have shown that opportunistic routing can achieve considerable performance gains in Cognitive Radio Ad Hoc Networks (CRAHNs). Most of these studies focused on designing appropriate metrics to select and prioritize the forwarding candidates. However, in multiple-flow networks, a small number of nodes may always hold the higher priority order for different flows. Thus, some nodes may easily become overloaded with too much traffic and be severely congested. To overcome this problem, we propose a load balancing opportunistic routing (LBOR) scheme to maximize the total throughput of the whole network. We first formulate the problem of maximizing the total throughput of the network as a linear programming problem. Then, we develop heuristic load balancing candidate forwarder sorting and selection algorithms. Simulation results and comparisons demonstrate that our proposed LBOR scheme outperforms existing opportunistic routing protocols for CRAHNs that do not balance load.


Introduction
Cognitive Radio Networks (CRNs) [1] have been considered a promising solution to improve the efficiency of spectrum usage and hence alleviate the spectrum scarcity problem. In CRNs, secondary users (SUs) can opportunistically access the spectrum as long as the primary users (PUs) do not occupy the licensed spectrum at a particular time and in a specific geographic area.
Cognitive Radio (CR) technology can be combined with ad hoc networks to form cognitive radio ad hoc networks (CRAHNs), in which wireless devices can dynamically establish networks using the vacant spectrum bands allocated to PUs without the need for fixed infrastructure. Due to the changing spectrum availability and dynamic network topology, routing is a significant challenge in CRAHNs.
In recent years, several routing protocols [2][3][4][5][6][7][8] have been proposed for CRAHNs. However, most of these studies focus on how to design a proper metric to measure the quality of a preselected route from the source to the destination. Unfortunately, owing to the unpredictable and intermittent nature of links between nodes, a prefixed path would incur a lot of retransmissions in CRAHNs. To alleviate this problem, opportunistic routing [9] was proposed to exploit the broadcast nature of wireless transmissions, in which one transmission can be overheard by multiple neighbors. In opportunistic routing, the decision of the next relay node can change dynamically following the network conditions. Thus, opportunistic routing is a better fit for the unreliable wireless links in CRAHNs.
Recently, some initial work on opportunistic routing can be found in [10][11][12][13][14][15][16][17]. Most of these schemes are designed based on a predefined performance metric, such as the geographical distance [10][11][12][13][14], the packet delivery ratio [15,16], and the expected transmission time (ETT) [17]. Indeed, these schemes can improve the performance of opportunistic routing. However, when one metric is adopted to select and prioritize the forwarding candidates for multiple paths, some nodes may easily become overloaded with too much traffic. For instance, multiple end-to-end paths may share some forwarding candidates. Under the same metric, a small number of nodes may always hold a higher priority order for different paths. Thus, these nodes may be severely congested and have to drop a large number of packets due to buffer limitations. Therefore, load balancing in opportunistic routing becomes a critical problem that may affect the throughput of the whole network.
In this paper, we propose a load balancing opportunistic routing (LBOR) scheme to maximize the total throughput of the whole network. To the best of our knowledge, no prior work jointly considers load balancing and opportunistic routing for CRAHNs. In summary, the major contributions are listed as follows.
(i) We analyze the throughput bound of one opportunistic routing module and formulate the problem of maximizing the total throughput of the network as a linear programming problem.
(ii) We present a new metric for selecting and prioritizing forwarders that takes the traffic load into account, and we develop load balancing candidate forwarder sorting and selection algorithms.
(iii) We conduct simulations to show that our algorithm can provide better throughput performance than the state-of-the-art opportunistic routing approaches for CRAHNs.
The rest of this paper is organized as follows. Section 2 briefly overviews the related work. Section 3 introduces the system model. We analyze the throughput of one-hop opportunistic routing and formulate the maximum throughput problem in Section 4. In Section 5, we propose algorithms to solve the problem. Simulation results are presented in Section 6. Finally, the paper is concluded in Section 7.

Related Work
Owing to the frequent dynamic changes in CRNs, traditional routing protocols are not well suited to CRNs, and many routing protocols have been proposed specifically for them. To address the routing problem in CRNs caused by dynamic channel availability, Saleem et al. [2][3][4] presented a cluster-based routing scheme, called SMART, which consisted of clustering mechanisms and an artificial intelligence approach. Zareei et al. [5] proposed a novel on-demand cluster-based hybrid routing protocol for CRAHNs. They first introduced a novel spectrum-aware clustering mechanism, which divided nodes into clusters based on the spectrum availability, power level, and stability. Then they developed a routing algorithm to minimize the delay while achieving an acceptable delivery ratio. To provide secure and reliable routing in CRNs, SEARCH [6] implemented secure and reliable routing based on a distributed Boltzmann-Gibbs learning algorithm. The authors considered the trust value as well as the total delay for the successful and reliable transmission of the packet. Guirguis et al. [7] presented a multihop routing protocol for CRNs in which they integrated the collaborative beamforming technique with routing. Lu et al. [8] first derived the actual spectrum accessible probability of SUs from the perspective of social activities and then proposed a greedy spectrum-aware routing algorithm. However, most of these studies focus on how to design a proper metric to measure the quality of a preselected route from the source to the destination. Unfortunately, owing to the unpredictable and intermittent nature of links between nodes, a prefixed path would incur a lot of retransmissions in CRAHNs.
Opportunistic routing was proposed to exploit the broadcast nature of wireless transmissions, in which one transmission can be overheard by multiple neighbors. Instead of adopting a fixed relay path, a source node broadcasts packets to neighboring nodes and selects a relay based on the received responses under current link conditions. Coutinho et al. [10] proposed the GEDAR routing protocol for underwater wireless sensor networks, which utilized the location information of the neighbor nodes and some known sonobuoys to select a next hop forwarder set of neighbors. Rahman et al. [11] proposed an energy-efficient cooperative opportunistic routing (EECOR) protocol for underwater acoustic sensor networks. Tang et al. [12] proposed a novel distributed protocol that divides the whole path from the source to the destination into several smaller opportunistic route segments. In each opportunistic route segment, there is a temporary source node, a temporary destination node, and a set of potential relay nodes. The packets are transmitted through a number of intermediate destination nodes step-by-step until the ultimate destination is reached. Since their scheme utilizes only local spectrum opportunities, topology information, and geometric conditions to compute the forwarding set for each short-term opportunistic route segment, it can better adapt to dynamic spectrum environments and changing network topologies in CR ad hoc networks. Furthermore, they discuss a geographical opportunistic routing scheme combined with network coding for CRAHNs [13,14]. Wang et al. [15] implemented a spectrum-aware any-path routing (SAAR) scheme that considers both the salient spectrum uncertainty of CRNs and the unreliable transmission characteristics of the wireless medium. SAAR significantly increases the packet delivery ratio and reduces the end-to-end delay with low communication and computation overhead, which makes it suitable and scalable for use in CRAHNs. Sanchez-Iborra et al. [16] introduced a novel opportunistic routing protocol, called JOKER, which takes into account both the packet delivery reliability of the links and the distance-progress towards the final destination. Cui et al. [17] proposed a DCSS-OCR protocol for CRAHNs to discover stable routing opportunities with novel metrics referred to as the path access probability and ETT. Zhong et al. [18] jointly considered energy, trust, and social features and designed a secure opportunistic routing scheme, called ETOR. However, the problem of misbehaving nodes attacking normal nodes is still unsolved. Tang et al. [19] presented a centralized and a distributed opportunistic routing relay algorithm to achieve the optimal system throughput in multihop WLANs. Ben Fradj et al. [20] described an energy-efficient opportunistic routing scheme, which focused on the selection of the forwarding list to minimize the energy consumption. Zikria et al. [21] proposed a heuristic forwarder selection scheme for wireless sensor networks, called HASORF. Most of the studies mentioned above consider only routing performance when designing the priority metrics for candidate nodes and fail to take load balancing into account, which leaves the network prone to congestion.
Load balancing in wireless networks has also been studied extensively. The OELR [22] algorithm addresses the shortened network lifetime caused by the excessive load of some nodes in mobile ad hoc networks. Although the algorithm can reduce the energy consumption and ensure the reliability of the data transmission, it can only achieve load balancing for the local network. The ORPL-LB [23] algorithm addresses the influence of node active time on network load and energy consumption in a wireless sensor network. To balance the energy consumption, the algorithm first adjusts the active time of the node according to the energy and the flow. Then it regulates the current working cycle of the node based on the working period of the target. The LBIA [24] algorithm addresses the network congestion caused by the excessive load of some nodes. In addition to load balancing, this algorithm considers the interdomain data flow interference and the domain data flow interference. So et al. [25] proposed a load balancing opportunistic routing algorithm for wireless sensor networks. This opportunistic routing scheme designs a new forward node selection metric based on the residual energy of network nodes. Xu et al. [26] proposed a dynamic resource allocation method, named DRAM, which combines static resource allocation and dynamic service migration to achieve load balance in fog computing systems. However, the existing network load balancing algorithms cannot be directly applied in CRAHNs.
To the best of our knowledge, jointly considering load balancing and opportunistic routing is novel in CRAHNs. First, the existing load balancing schemes [22][23][24][25][26] are designed for traditional wireless networks and cannot be directly applied in CRAHNs due to the varying availability of spectrum. Second, the forwarder set selection metrics of existing opportunistic routing schemes [12][13][14][15][16][17][18] consider only routing parameters such as the end-to-end delay, the throughput, and the number of transmissions. Different from these works, our metric is load-dependent and takes the link quality into account as well. Most importantly, our candidate forwarder selection and ordering algorithms can bypass blocked links and distribute the excessive loads of highly loaded nodes to nearby nodes.

System Model
3.1. Network Model. We consider a dense-scaling secondary network (SN) coexisting with a primary network (PN) deployed in a specific area. The PN consists of M Poisson distributed primary users. We assume that the interference range of a PU is $R_I^{PU}$. If a PU is active and its transmission frequency overlaps an SU's channel, an SU inside the interference range of the PU is not permitted to transmit at this time. Meanwhile, the transmission range of a PU is assumed to be $R_T^{PU}$ (without loss of generality, $R_I^{PU} > R_T^{PU}$). The SN is composed of N SUs that know their locations. We assume SUs are static or quasi-static. Similar to the PN, the transmission and interference ranges of SUs are assumed to be $R_T^{SU}$ and $R_I^{SU}$, respectively (without loss of generality, $R_I^{SU} > R_T^{SU}$). In our proposed scheme, we need to compute the distance between neighboring nodes and the destination. Thus, we assume that all SUs can acquire their own location information from an onboard localization device whose accuracy is around 5 meters.
In our system, multiple PUs and SUs share a set of orthogonal channels, denoted by $CH = \{ch_1, ch_2, \ldots, ch_K\}$. SUs can exchange messages over a common control channel (CCC). Each SU is equipped with two radios: one half-duplex cognitive radio that can switch among the channels in $CH$ for data transmissions and one half-duplex normal radio tuned to the CCC for exchanging control messages. Note that PUs have the right to access the licensed bands for communications, while SUs can only opportunistically access the spectrum for ad hoc device-to-device transmissions. We model the occupation time of PUs in each data channel as an independent and identically distributed alternating ON (PU is active) and OFF (PU is inactive) process. For ease of simulation and explanation, we assume that the durations of the ON and OFF states of channel $ch$ follow exponential distributions with parameters $\lambda_{ON}^{ch}$ and $\lambda_{OFF}^{ch}$, that is, with expectations $1/\lambda_{ON}^{ch}$ and $1/\lambda_{OFF}^{ch}$, respectively. Then, the probabilities of channel $ch$ being busy (ON) and idle (OFF) are, respectively,

$$P_{ON}^{ch} = \frac{\lambda_{OFF}^{ch}}{\lambda_{ON}^{ch} + \lambda_{OFF}^{ch}}, \qquad P_{OFF}^{ch} = \frac{\lambda_{ON}^{ch}}{\lambda_{ON}^{ch} + \lambda_{OFF}^{ch}}.$$

3.2. Traffic Model. In this model, each node $n_i$ can sense its environment and find a set $CH_i$ of available frequency bands (i.e., those bands that are not currently used by primary users). We use $CH_{ij} = CH_i \cap CH_j$ to denote the set of channels common to node $n_i$ and node $n_j$, which can be used for communications. Besides, we use $d_{ij}$ to represent the Euclidean distance between the two nodes. We assume that SUs can communicate with other nodes within their transmission ranges when they have common channels (i.e., there exists a link/edge between the SUs). We use $p_{ij}$ to represent the packet reception ratio (PRR) of the link $l_{ij}$ between $n_i$ and $n_j$. We use an indicator $\delta_{ij}$ to describe the state of the link $l_{ij}$: there exists a link between $n_i$ and $n_j$ if $CH_{ij} \neq \emptyset$ and $d_{ij} < R_T^{SU}$, in which case $\delta_{ij} = 1$; otherwise, $\delta_{ij} = 0$.
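The channel occupancy and link conditions described above can be illustrated with a minimal sketch. It assumes the ON/OFF process is summarized by its mean ON and OFF durations (for an alternating process, the long-run busy fraction is the mean ON time divided by the mean cycle time) and that node positions are 2-D coordinates. The function names `steady_state_probs` and `link_exists` are hypothetical and are not part of the proposed scheme.

```python
import math

def steady_state_probs(mean_on, mean_off):
    """Long-run fractions of time a channel is busy (ON) and idle (OFF).

    mean_on and mean_off are the mean durations of the ON and OFF states.
    """
    cycle = mean_on + mean_off
    return mean_on / cycle, mean_off / cycle

def link_exists(pos_i, chans_i, pos_j, chans_j, tx_range):
    """A link exists if the nodes share a channel and are within range."""
    common = set(chans_i) & set(chans_j)      # channels usable by both nodes
    distance = math.dist(pos_i, pos_j)        # Euclidean distance
    return bool(common) and distance < tx_range

p_busy, p_idle = steady_state_probs(mean_on=2.0, mean_off=6.0)
print(p_busy, p_idle)
print(link_exists((0, 0), {1, 2}, (3, 4), {2, 3}, tx_range=10))
```

Parameterizing by mean durations rather than by rates keeps the sketch independent of which rate symbol is attached to which state.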
Thus, the SN can be modeled as a graph G = (V, E), where V is the set of nodes (SUs) and E is the set of all possible links formed by nodes in V. We divide the network time into time slots of fixed length. We use $C$, $C_u^l$, and $C_a^l$ to represent the maximum capacity, the used capacity, and the available capacity of link $l$ at the beginning of each time slot. $C_u^l$ can be obtained through the exchange of information between nodes. Meanwhile, $C_a^l$ can be calculated from the used capacity $C_u^l$ and the maximum capacity $C$ as follows:

$$C_a^l = \begin{cases} C - C_u^l, & C_u^l < C, \\ 0, & C_u^l \geq C, \end{cases} \tag{4}$$

where the first case of (4) applies when the link is not overloaded and the second applies when the link is overloaded. The available capacity of a whole link set can be derived from the available capacity of each link in the set. For a link set LS, we denote the available capacity of LS as $C_a^{LS}$. Then, we have

$$C_a^{LS} = \sum_{l \in LS} C_a^l, \tag{5}$$

where $\sum_{l \in LS} C_a^l$ represents the sum of the available capacity of each link.
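As a small illustration of the capacity bookkeeping described above, the following sketch computes the available capacity of a single link and of a link set. It assumes that an overloaded link (used capacity at or above the maximum) contributes zero available capacity; that reading of the overloaded case, and the function names, are assumptions made for illustration.

```python
def available_capacity(max_capacity, used_capacity):
    """Available capacity of one link; zero when the link is overloaded."""
    return max(max_capacity - used_capacity, 0.0)

def link_set_available_capacity(max_capacity, used_capacities):
    """Available capacity of a link set: the sum over its links."""
    return sum(available_capacity(max_capacity, u) for u in used_capacities)

# Three links sharing the same maximum capacity of 10 units per slot.
print(available_capacity(10, 4))
print(available_capacity(10, 12))
print(link_set_available_capacity(10, [4, 12, 10]))
```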

Flow Model.
We assume that a data flow  is composed of all the packets sent by a node at a time slot, where   is the maximum bits of the data flow.The average size of  is V  (V  (0,   ]).We use ϝ to represent the data flow set, which includes all the end-to-end data flows in the network.For a data flow  (ϝ), we use s() and d() to represent its source node and destination node, respectively.For a link  ( ∈ ), we use V   to represent the average size of the data flow  transmitted among the link  at a time slot.Then, according to the flow conservation, for each node n (n ∈ V), we have where   () and   () are the input link set and output link set of node n, respectively.The first equation in ( 6) represents the flow conservation formula of the source node, where V  is the data sent and ∑ ∈  () V   is the data successfully received.Similarly, the second equation represents the flow conservation of the relay node.Note that all the symbols on data in this paper represent the data sent/received within a time slot.
According to (4) and ( 5), we can obtain the following property.
Property 1 (the data requirement). The total amount of data transmitted on a link $l$ ($l \in E$) in a time slot should be less than the available capacity of that link. Likewise, the total amount of data transmitted on a set of links should be less than the available capacity of the link set.
A summary of major notations used in the paper is given in Table 1 for easy reference; its entries include the time required for one-hop relay and the load score Θ.

Problem Formulation
In this section, based on the system model, we first describe the working process of one-hop opportunistic routing in CRAHNs and then analyze its throughput.

Opportunistic Routing Primer.
We use one-hop opportunistic routing to represent the process in which a sender node $n_i$ ($n_i \in V$) sends data to its candidate node set $F_i$. Any node $n_j$ ($n_j \in F_i$) in the candidate node set should satisfy $d_r(n_i, n_j) \geq 0$, where the relay distance $d_r(n_i, n_j)$ is computed as

$$d_r(n_i, n_j) = d_{n_i D} - d_{n_j D}, \tag{7}$$

where D is the destination node, $d_{n_i D}$ is the Euclidean distance between $n_i$ and D, and $d_{n_j D}$ is the Euclidean distance between $n_j$ and D.
Then, we describe how to define an opportunistic module. For any node $n_i$ ($n_i \in V$), we use $N_i$ to represent its one-hop neighboring set. Every node in $N_i$ must meet the following conditions. (1) The node operates on the same channel as node $n_i$. (2) The node is in the transmission range of node $n_i$. Then, a subset $F_i$ of $N_i$ can be selected as the forwarding candidate set of node $n_i$. We use $(n_i, N_i)$ and $(n_i, F_i)$ to denote the neighboring module and the opportunistic module of node $n_i$, respectively.
We now use an example to illustrate the concept of an opportunistic module. Nodes S and D represent the source node and the destination node, respectively. Node $n$ is the forwarding node, and the neighbor set of $n$ is $N = \{n_1, n_2, n_3, n_4, n_5\}$. In the neighboring set N, nodes $n_1$, $n_2$, and $n_3$ meet the condition that the relay distance from $n$ to $n_j$ is nonnegative ($1 \leq j \leq 3$). Then, the forwarding candidate node set of $n$ is $F = \{n_1, n_2, n_3\}$. Therefore, the neighboring module of $n$ is $(n, N)$, and the opportunistic module is $(n, F)$ (see Figure 1).
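The candidate filtering in this example can be sketched as follows. The sketch takes the relay distance to be the sender's distance to the destination minus the neighbor's distance to the destination, so that a nonnegative value means the neighbor is no farther from the destination than the sender. The coordinates are made up for illustration and the function names are hypothetical.

```python
import math

def relay_distance(sender, neighbor, dest):
    """Progress toward the destination gained by relaying via neighbor."""
    return math.dist(sender, dest) - math.dist(neighbor, dest)

def forwarding_candidates(sender, neighbors, dest):
    """Keep the neighbors whose relay distance is nonnegative.

    neighbors maps a node name to its (x, y) position.
    """
    return [name for name, pos in neighbors.items()
            if relay_distance(sender, pos, dest) >= 0]

sender, dest = (0.0, 0.0), (10.0, 0.0)
neighbors = {
    "n1": (2.0, 1.0), "n2": (3.0, -1.0), "n3": (1.0, 2.0),
    "n4": (-2.0, 0.0), "n5": (-1.0, 3.0),   # these two lie behind the sender
}
print(forwarding_candidates(sender, neighbors, dest))  # ['n1', 'n2', 'n3']
```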
In the opportunistic module, every node has its relay priority. After the sender node broadcasts the packet to its forwarding candidate set, one of the candidate nodes continues the forwarding process based on the relay priorities. A forwarding candidate sends the message only when all the nodes with higher priorities fail to transmit. The sender retransmits the packet only when none of the forwarding candidates has successfully received it. The forwarding process repeats until the packet is delivered to the destination.
In CRAHNs, the process of one-hop opportunistic routing is shown in Figure 2. The period from $t_1$ to $t_2$ is for channel detection, and the period from $t_2$ to $t_3$ is for data transmission.
In the channel sensing step, similarly to [27], the sender searches for a temporarily unoccupied channel in collaboration with its neighbors using the energy detection technique. Before detecting the data channel, the sender broadcasts a short message (i.e., a sensing invitation (SNSINV) in the CCC) to inform neighboring nodes of the selected data channel, the location information of the sender and the destination, and the used capacity of the output link. The transmission of the SNSINV message in the CCC follows the CSMA/CA mechanism as specified in the IEEE 802.11 MAC. After receiving the SNSINV, neighboring SUs set the selected data channel to the nonaccessible state, so that no SU can transmit in the chosen data channel during the sensing period of the sender. Using the location information in the SNSINV, the neighboring SUs evaluate whether they are eligible relay candidates (i.e., whether the relay distance is nonnegative). Eligible relay candidates collaborate with the sender in channel sensing and data transmission. In this situation, other SUs cannot transmit in the selected data channel during the reserved period specified in the SNSINV (the time for Energy Detection in Figure 2). When the channel is sensed idle (i.e., no PU activity is detected), the sender and its candidate nodes forward the data. Otherwise, the sender selects another channel and repeats the channel sensing process.
In the data transmission step, the sender broadcasts the packet to its candidate nodes.The candidate nodes transmit ACK messages in priority order after a shorter back-off window (SIFS).A candidate SU keeps listening to the data channel until it overhears an ACK or it prepares to transmit the ACK.The waiting time required for each candidate node is .If the sender receives no ACK message, it will repeat the channel sensing and data transmission steps.

Throughput Bound of One Opportunistic Module.
In this subsection, we study the capacity region of one opportunistic module $(n_i, F_i)$. This capacity region will serve as a bound on the throughput of the links in the opportunistic module.
Assume that the data transfer rate of the sender node   is   and its candidate forwarding set is   = { 1 , . . .,   }, in which the priority of nodes satisfies  1 > . . .>   .We use   = { 1 , . . .,   } to represent its forwarding candidate link set.For a candidate node   (1 ≤  ≤ ), its corresponding candidate link is   , whose link state and PRR are   and   , respectively.Assume that the probability of channel ℎ (ℎ) to be idle at  0 is  ℎ  .Then, we can estimate the idle probability of channel ℎ at  1 ( 1 >  0 ), according to its state (busy or idle) at  0 , which is shown as follows: where  =    +    , and  is a 0-1 variable.The variable  indicates the state of channels, where  = 1 indicates the channel ℎ is idle at  1 , and  = 0 indicates the channel ℎ is busy at  1 .Then, the probability of channel's idle state in the process of channel sensing, denoted by  ℎ  , can be obtained by where  ℎ  ( 1 ,  2 ) is the probability that channel stays idle over the time ( 1 ,  2 ).According to [16],  ℎ  ( 1 ,  2 ) can be derived by In the data transmission step, due to the interference, some candidate nodes cannot receive the packet successfully.When the SU node is disrupted by PUs' appearance, the link between it and the sender becomes unusable.Therefore, the throughput is directly related to the success rate of the links in the opportunistic module.According to the analysis result above, we introduce the metric of effective forwarding rate (EFR), which indicates the success rate of data transmission of each candidate link in the opportunistic module, denoted by .Thus, for the candidate link   of the candidate node   (1 ≤  ≤ ) in the opportunistic module (  ,   ), the effective forwarding rate can be denoted by (  ), which is defined in the following equation: where is the probability that   received the packet successfully when all candidate nodes with higher priority than node   are failed to receive the packet.
According to (11), by accumulating the EFR values of all links in the candidate link set, we can obtain the EFR of the opportunistic module, in which the product over the candidate links of one minus the link state indicator times the PRR is the probability that none of the candidate nodes has received the packet.
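The verbal description of the EFR can be made concrete with a short sketch. It uses the standard prioritized-forwarding calculation, in which candidate j counts as the forwarder only if it receives the packet and every higher-priority candidate fails, and it assumes independent receptions. This is offered as an illustration consistent with the description above and is not claimed to match the exact form of (11) and (12). The function name and its arguments are hypothetical.

```python
def effective_forwarding_rates(prr, link_up=None):
    """Per-candidate forwarding probabilities in priority order.

    prr[j] is the packet reception ratio of the j-th highest-priority link;
    link_up[j] is its 0/1 link state indicator (all links up by default).
    Returns (rates, miss) where miss is the probability nobody received.
    """
    if link_up is None:
        link_up = [1] * len(prr)
    rates = []
    nobody_yet = 1.0                      # all higher-priority nodes failed
    for p, up in zip(prr, link_up):
        q = p * up                        # effective reception probability
        rates.append(nobody_yet * q)
        nobody_yet *= 1.0 - q
    return rates, nobody_yet

rates, miss = effective_forwarding_rates([0.5, 0.5])
print(rates, miss)          # [0.5, 0.25] 0.25
print(sum(rates) + miss)    # 1.0
```

The module-level rate is the sum of the per-link rates, which equals one minus the probability that no candidate received the packet.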
Assume that the sender node   chooses the channel ℎ as the data transmission channel.Then, according to (9)-( 11), we can deduce the equation for the probability of the candidate node   successfully receiving the packet, denoted as   (  ), which is shown as follows: According to the conclusions in [5], we can state that when the link state is stable, the channel remains idle throughout the data forwarding process (i.e.,  ℎ  •  ℎ  ( 2 ,  3 ) = 1).
Based on (13), we can compute the cognitive transport throughput (CTT) of one opportunistic module (  ,   ) in channel ℎ as follows: where ∑  =1  ℎ  (  ) is the data transmission success rate of the whole opportunistic module, L is the data size transmitted by the opportunistic module, and   is the time required for the whole process.To facilitate the subsequent research, we assume that the transmission time of all opportunistic modules in the network is the same.Therefore, ( 14) can be simplified as follows: In the problem P, we aim to maximize the throughput of the network, as shown in (16).Equation (17) represents the flow conservation condition of the source, and (18) represents the flow conservation conditions of other nodes.At each node, except for the source and the destination, the amount of incoming flow is equal to the amount of outgoing flow.Equation (19) indicates that the amount of flow received by each node is nonnegative.Equation (20) ensures that the outgoing flow from the destination of each data flow is 0, where () is the end node of link .The physical meaning of ( 21) is that the actual flow delivered on each link is constrained by the total amount of flow residing in the network.Equations ( 22) and (23) indicate that the amount of traffic passed through the opportunistic module is less than the available capacity and the throughput of the module, respectively.
In cognitive radio networks, the state of a link changes frequently due to the changing states (busy or idle) of the channel. The dynamic link state may also change the network topology and make the candidate set unavailable or nonoptimal. Thus, we cannot verify the accuracy and effectiveness of a load balancing algorithm that is designed based on local network topology information. Moreover, we cannot guarantee that packets are always transmitted through the links with lower traffic load. Therefore, a possible practical solution to this problem is to distribute the traffic according to the load and the data forwarding capacity of each link. In this way, we can improve the network throughput while guaranteeing the communication quality.

Opportunistic Routing
In this section, we first propose a new metric to measure the priority of candidate nodes and provide the sorting algorithm. Then, we design the load balancing based candidate forwarder selection algorithm under the stable condition of the link.

Load Balancing Candidate Forwarder Sorting Algorithm.
First, we take the link reliability metric as an example to analyze the reason for the uneven distribution of link load. In algorithms based on link reliability, the higher the delivery rate of a candidate link, the higher the priority of the corresponding candidate node. In opportunistic routing, a node with higher priority receives more data. As a result, within the candidate link set, the load on a link rises with its priority. As the amount of transmitted data increases, the traffic load of the higher priority links also increases, which may cause these links to become blocked. In the meantime, the amount of data allocated to the links with lower priority is small, which may waste link resources. Therefore, the higher the delivery rate of a candidate link, the lower its available capacity tends to be.

Metric Design.
Based on the considerations above, we design a new metric called the Load Score, denoted by Θ, to measure the priority of candidate nodes based on the delivery rate and available capacity of each candidate link of an opportunistic module. In the opportunistic module (n, F), $F = \{n_1, \ldots, n_m\}$ is the candidate node set, and $LS = \{l_1, \ldots, l_m\}$ is the candidate link set. For any candidate node $n_j$ ($1 \leq j \leq m$), the Load Score can be derived as follows:

$$\Theta_j = p_j \cdot C_a^{l_j}, \tag{24}$$

where $p_j$ is the delivery rate (PRR) of candidate link $l_j$ and $C_a^{l_j}$ is its available capacity. The physical meaning of the metric is the maximum amount of data transmitted in a time slot when the link has the highest priority. This metric considers both the transmission quality and the load of the link. We prioritize candidate nodes based on the Load Score. Thus, the scheme can achieve a balanced distribution of traffic load and reduce the number of packet drops and the queuing delays. Meanwhile, it ensures that the link transfers the maximum amount of data. Next, we analyze the special situations that may occur in the process of sorting candidate nodes.
According to the condition that multiple candidate nodes may have the same Load Score, we propose the following proposition.
Proposition 2. When multiple candidate nodes have the same Load Score, a higher priority candidate node is always associated with a lower packet delivery rate.
Proof. We assume there are m candidate nodes $\{n_1, \ldots, n_m\}$ in the opportunistic module (n, F), listed in priority order, and assume they have the same Load Score, denoted as Θ. The corresponding links of the nodes are denoted by $l_1, \ldots, l_m$, with delivery rates $p_1, \ldots, p_m$. We use $T$ to represent the maximum throughput of the link set $\{l_1, \ldots, l_m\}$ in a steady state. Then, we have

$$T = \Theta \cdot \Big[ 1 + (1 - p_1) + \cdots + \prod_{k=1}^{m-1} (1 - p_k) \Big]. \tag{25}$$

In (25), m and Θ are constants, so the value of $T$ is decided by the bracketed sum. For convenience, we use Υ to denote $1 + (1 - p_1) + \cdots + \prod_{k=1}^{m-1} (1 - p_k)$. Then, the maximum value of $T$ is achieved when Υ is maximal. Υ is a sum of product terms, and it attains its maximum when each of its terms is as large as possible, that is, when $(1 - p_1), \ldots, \prod_{k=1}^{m-1} (1 - p_k)$ take their maximum values at the same time. Each of these terms decreases as the delivery rates of the higher priority links increase, and the earliest factors appear in the most terms. Therefore, the maximum Υ is achieved when $p_1 < p_2 < \cdots < p_m$, and Proposition 2 holds.
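The ordering claim of the proposition can be checked numerically with a brute-force sketch. It assumes the sum in the proof has the form 1 + (1 - p1) + (1 - p1)(1 - p2) + ..., with one term per candidate, and it enumerates every priority order for a small set of made-up delivery rates. The helper name `upsilon` is hypothetical.

```python
from itertools import permutations

def upsilon(prrs):
    """1 + (1-p1) + (1-p1)(1-p2) + ... with one term per candidate."""
    total, prod = 0.0, 1.0
    for p in prrs:
        total += prod          # term for this priority level
        prod *= 1.0 - p        # extend the product for the next level
    return total

rates = [0.9, 0.3, 0.6]
best = max(permutations(rates), key=upsilon)
print(best)   # (0.3, 0.6, 0.9): lowest delivery rate gets highest priority
```

With equal Load Scores, the order that maximizes the sum places the candidates in ascending order of delivery rate, which agrees with the proposition for this example.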

Load Balancing Candidate Forwarder Sorting Algorithm.
In this subsection, we explain how to sort the candidate nodes in F. In the opportunistic module (n, F), $F = \{n_1, \ldots, n_m\}$ is the candidate node set, $FL = \{l_1, \ldots, l_m\}$ is the candidate link set, and $PR = \{p_1, \ldots, p_m\}$ is the PRR set. Each PRR corresponds to a candidate link. Based on (24), we can calculate the Load Score of each candidate node, and the resulting set is denoted as $H = \{\Theta_1, \ldots, \Theta_m\}$. We develop a simple selection sorting algorithm to sort $H = \{\Theta_1, \ldots, \Theta_m\}$, which is shown in Algorithm 1.
In Algorithm 1, the inputs are the candidate node set F, the PRR set PR, and the Load Score set H. The algorithm sorts candidate nodes based on their Load Scores. In the algorithm, line (4) marks the highest priority node for each iteration, and lines (6) to (12) implement the priority ordering process. The time complexity of this algorithm is $O(m^2)$, where m represents the number of candidate nodes.
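A runnable sketch of this sorting procedure is given below. It follows the selection-sort structure described in the text: in each pass the remaining candidate with the largest Load Score is selected, and ties are broken in favor of the lower PRR, as the proposition on equal Load Scores suggests. The function name and the exact outer-loop bookkeeping are assumptions, and the Load Scores are passed in as given values.

```python
def sort_candidates(nodes, prrs, scores):
    """Selection sort of candidates by descending Load Score.

    Ties on the Load Score are broken in favor of the lower PRR.
    nodes, prrs, and scores are parallel lists; a new list is returned.
    """
    order = list(range(len(nodes)))
    m = len(order)
    for a in range(m):
        b = a                                   # best candidate seen so far
        for k in range(a + 1, m):
            score_b, score_k = scores[order[b]], scores[order[k]]
            if score_b < score_k:
                b = k
            elif score_b == score_k and prrs[order[b]] > prrs[order[k]]:
                b = k
        order[a], order[b] = order[b], order[a]
    return [nodes[i] for i in order]

nodes = ["n1", "n2", "n3", "n4"]
prrs = [0.9, 0.5, 0.8, 0.4]
scores = [4.0, 6.0, 4.0, 2.0]
print(sort_candidates(nodes, prrs, scores))  # ['n2', 'n3', 'n1', 'n4']
```

The two nested loops give the quadratic cost stated above; for the small candidate sets typical of one-hop neighborhoods this is unlikely to matter in practice.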
Algorithm 1: Load balancing candidate forwarder sorting algorithm, steps (5) to (13).
(5) for each $\Theta_k \in H$ and $a + 1 \leq k \leq m$ do
(6) if $\Theta_b < \Theta_k$ then
(7) b ← k
(8) else if $\Theta_b = \Theta_k$ then
(9) if $p_b > p_k$ then
(10) b ← k
(11) end if
(12) end if
(13) end for

For the ordered candidate node set, we can calculate the maximum throughput of its opportunistic module according to (11), which is shown as follows:

Load Balancing Based Relay Selection Algorithm.
In this section, we further study the load balancing solution. In CRAHNs, a packet is forwarded to the destination through several opportunistic modules until it arrives at its destination. In each opportunistic module, the selection of the last hop is limited by the forwarding capability of the next hop. To achieve the global optimum, it is necessary to predict the forwarding capability of the opportunistic module that is the final hop. Then we determine the candidate node set of the first hop opportunistic module, namely, the optimal candidate node set for the first hop opportunistic module. However, this method of selecting the optimal opportunistic module is not appropriate. In a cognitive radio network, the link status is dynamic due to the dynamic activities of PUs. Hence, we cannot verify that the prediction of the forwarding ability of the last hop opportunistic module converges, and we cannot prove that the resulting optimal opportunistic module is accurate. In [16], the authors chose an intermediate destination node to alleviate the influence of the dynamic change of link state. Inspired by this method, we adopt an opportunistic module iteration method to determine the optimal candidate node set of the current opportunistic module. Since this method is based on the forwarding ability of the next hop opportunistic module, it is necessary to predict the occupied capacity of each candidate link in this opportunistic module. In this paper, we use a Minimum Mean Square Error (MMSE) [28] based flow prediction model to estimate the occupied capacity of the link in the next time slot. The formula of the model is as follows: where the series indexed by $t = 0, 1, 2, \ldots$ contains the current flow and the historical flow statistics. The prediction scheme is described in detail in [28], and readers can refer to it for more details. For the opportunistic module (n, F), the candidate node set is $F = \{n_1, \ldots, n_m\}$, the candidate link set is $LS = \{l_1, \ldots, l_m\}$, and the corresponding Load Score set is $H = \{\Theta_1, \ldots, \Theta_m\}$.

Wireless Communications and Mobile Computing
Algorithm 2: Load Balancing Relay Selection Algorithm.
Input: $F = \{n_1, \ldots, n_m\}$, $H = \{\Theta_1, \ldots, \Theta_m\}$
Output: The optimal node set $F_{optimal}$
(1) Initialize arrays of H  , F  , F  , $F_{optimal}$ and LS 
(2) Initialize values of $o_1$, $o_2$ and ctt

The selection steps of the optimal opportunistic module are described as follows.
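(1) For any candidate node $n_e \in F$, we first determine its opportunistic module $(n_e, F_e)$ according to relay distances. Then, we perform the following steps: (i) predicting the occupied capacity of each link in the candidate link set of $(n_e, F_e)$ based on (27); (ii) calculating the available capacity of each link in this candidate link set based on (14); (iii) computing the LS of each node in the candidate node set $F_e$ based on (24); (iv) sorting the candidate node set $F_e$ and obtaining an ordered candidate node set.
(2) We compare the LS value $\Theta_e$ of candidate node $n_e$ with CTT($(n_e, F_e)$) and assign the smaller value to $\Theta_e$.
(3) We sort the candidate node set F based on Algorithm 1, and the ordered set of candidate nodes is the optimal candidate node set.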
The details of the Load Balancing Relay Selection Algorithm are shown in Algorithm 2. In Algorithm 2, lines (1) and (2) define the relevant parameters and their initial values. Lines (3) to (18) show the selection process for the optimal candidate forwarding node. According to the process of Algorithm 2 and the time complexity of Algorithm 1, the time complexity of Algorithm 2 is O(m|  |^2), where m is the number of candidate nodes of the opportunistic module (n, F), and |  | is the maximum number of candidate nodes in all the next-hop opportunistic modules. We can select the optimal opportunistic module for each hop according to Algorithm 2. It balances the traffic load of each candidate link while ensuring the quality of communication. Meanwhile, it improves the throughput of the local network.
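The selection steps above can be illustrated with a minimal Python sketch. It is not the authors' Algorithm 1 or Algorithm 2. It assumes that a larger LS value means a higher forwarding priority, that the CTT of each candidate's next-hop module has already been computed, and that ties are broken by a hypothetical secondary key `tie_break`. The names `Candidate`, `sort_candidates`, and `select_relays` are illustrative only. The quadratic selection sort mirrors the per-module cost that the complexity analysis relies on.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Candidate:
    name: str
    ls: float               # LS value (Theta) of this candidate node
    next_hop_ctt: float     # CTT of the candidate's own next-hop module (assumed given)
    tie_break: float = 0.0  # hypothetical secondary key used when LS values tie


def sort_candidates(cands: List[Candidate]) -> List[Candidate]:
    """Selection sort: largest LS first, ties resolved by the larger tie_break."""
    order = list(cands)
    m = len(order)
    for a in range(m - 1):
        b = a  # index of the best candidate found so far in order[a:]
        for k in range(a + 1, m):
            if order[b].ls < order[k].ls:
                b = k
            elif order[b].ls == order[k].ls and order[k].tie_break > order[b].tie_break:
                b = k
        order[a], order[b] = order[b], order[a]
    return order


def select_relays(cands: List[Candidate]) -> List[Candidate]:
    """Cap each LS value by the next-hop CTT, then sort the candidates."""
    capped = [replace(c, ls=min(c.ls, c.next_hop_ctt)) for c in cands]
    return sort_candidates(capped)


candidates = [
    Candidate("A", ls=5.0, next_hop_ctt=2.0),
    Candidate("B", ls=4.0, next_hop_ctt=10.0),
    Candidate("C", ls=4.0, next_hop_ctt=4.0, tie_break=1.0),
]
ranking = [c.name for c in select_relays(candidates)]
```

In this toy input, node A has the largest raw LS value but a weak next-hop module, so capping its LS value by the next-hop CTT moves it to the end of the ranking.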

Performance Evaluation
In this section, we evaluate the performance of our proposed scheme against two well-known routing protocols for CRAHNs and present the simulation results on the SU-PU interference ratio, the packet delivery ratio, the throughput, and the end-to-end delay under different numbers of flows, PU activity levels, numbers of channels, and numbers of PUs.
6.1. Simulation Settings. The network parameter settings are summarized in Table 2. In the simulations, multiple SUs and PUs are randomly deployed in a 1000 m × 1000 m area, and they are assumed to be stationary throughout the simulation. As stated in Section 3.1, the PU activities of each channel are modeled as an exponential ON-OFF process.
The status of each channel is associated with two parameters: the expected channel OFF time and the probability of the idle state. The channel status is estimated by periodic spectrum sensing and by on-demand sensing before data transmissions. In the simulations, 30 constant bit rate (CBR) flows are generated between randomly chosen source-destination pairs of SUs, each with a packet size of 512 bytes and a flow rate of 5 Kbps. Notice that the most representative opportunistic routing scheme, ExOR [9], adopted the method in [29] to obtain the packet reception ratio between two nodes. Accordingly, in this simulation, the packet reception ratios of the links are also based on the distance-to-delivery ratio relationship measured in [29]. Thus, we require the position information of the nodes to compute the distance between neighboring nodes and obtain the packet reception ratios of the links. We evaluate our scheme through NS-2 simulation [30]. In NS-2, our LBOR scheme is implemented as the routing agent. In this work, we focus only on the network layer without considering MAC and physical layer issues. Since the IEEE 802.11 MAC protocol is customized for CR support, we use the modified 802.11cr as the CSMA MAC agent in our simulation script. We also apply the CRN patch [31], which supports multiple channels at the physical layer, to our simulations.
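To make the exponential ON-OFF channel model concrete, the following Python sketch generates the idle and busy periods of a single channel. It is an illustration under stated assumptions, not the NS-2 configuration used in the paper. It assumes that "OFF" means the PU is inactive (the channel is idle for SUs) and that the idle probability is the stationary value E[OFF] / (E[ON] + E[OFF]), from which the mean ON time is derived. The function names are hypothetical.

```python
import random
from typing import List, Tuple

Interval = Tuple[float, float, str]  # (start, end, "idle" or "busy")


def expected_on_time(expected_off: float, p_idle: float) -> float:
    """Mean PU ON period implied by p_idle = E[OFF] / (E[ON] + E[OFF])."""
    if not 0.0 < p_idle < 1.0:
        raise ValueError("p_idle must lie strictly between 0 and 1")
    return expected_off * (1.0 - p_idle) / p_idle


def simulate_channel(expected_off: float, p_idle: float,
                     duration: float, seed: int = 0) -> List[Interval]:
    """Alternate exponentially distributed idle (PU OFF) and busy (PU ON) periods."""
    rng = random.Random(seed)
    mean_on = expected_on_time(expected_off, p_idle)
    t = 0.0
    idle = rng.random() < p_idle  # draw the initial state from the stationary law
    intervals: List[Interval] = []
    while t < duration:
        mean = expected_off if idle else mean_on
        end = min(t + rng.expovariate(1.0 / mean), duration)
        intervals.append((t, end, "idle" if idle else "busy"))
        t, idle = end, not idle
    return intervals


def idle_fraction(intervals: List[Interval]) -> float:
    """Fraction of the simulated time during which the channel was idle."""
    total = sum(end - start for start, end, _ in intervals)
    idle = sum(end - start for start, end, state in intervals if state == "idle")
    return idle / total
```

For example, `simulate_channel(60.0, 0.6, 500_000.0)` produces a trace for one channel with an expected OFF time of 60 ms over 500 seconds, with all times in milliseconds. Over a long run, `idle_fraction` of the trace should approach the configured idle probability.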
In the simulations, we compare our LBOR scheme with the SAAR scheme [15] and the SAOR protocol [27]. The SAOR protocol is a well-known multipath opportunistic routing protocol, coupled with spectrum sensing and sharing in multichannel CRAHNs. The SAAR scheme is a recently published anypath opportunistic routing scheme that considers the unreliable transmission feature and the uncertain spectrum availability characteristic of CRAHNs.
The performance metrics for our experiments are defined as follows:
(i) The throughput is defined as the total amount of traffic (in bits) an SU receiver receives from the sender divided by the time it takes for the receiver to obtain the last packet, expressed in bits per second. A routing scheme with higher throughput is desirable for CRAHNs.
(ii) The end-to-end delay is defined as the average delay, which is calculated by summing up the end-to-end delays of all packets received by all destination nodes and dividing the sum by the total number of received packets. A routing scheme with lower end-to-end delay is preferred for CRAHNs.
(iii) The SU-PU interference ratio is defined as the ratio of the number of SUs' packets interrupted by PUs' activities to the total number of packets delivered by an SU source node. A routing scheme with a lower SU-PU interference ratio is favorable for CRAHNs.
(iv) The packet delivery ratio is defined as the ratio of the data packets that are fully received by the SU destination node to the total data packets generated by the SU source node. A routing scheme with a higher packet delivery ratio is desirable for CRAHNs.
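The four metrics defined above can be computed from per-packet records, as in the following Python sketch. It is an illustration, not the authors' NS-2 post-processing. It assumes that time is measured from the start of the run, that the denominator of the SU-PU interference ratio is the number of packets sent by the source, and that the record type `PacketRecord` and the function `summarize` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class PacketRecord:
    size_bits: int
    sent_at: float                 # seconds since the start of the run
    received_at: Optional[float]   # None if the packet never arrived
    interrupted_by_pu: bool = False


def summarize(records: List[PacketRecord]) -> Dict[str, float]:
    """Compute throughput, delay, delivery ratio, and SU-PU interference ratio."""
    if not records:
        raise ValueError("no packets were generated")
    delivered = [r for r in records if r.received_at is not None]
    if delivered:
        last_rx = max(r.received_at for r in delivered)
        bits = sum(r.size_bits for r in delivered)
        throughput = bits / last_rx if last_rx > 0 else 0.0  # bits per second
        delay = sum(r.received_at - r.sent_at for r in delivered) / len(delivered)
    else:
        throughput, delay = 0.0, float("nan")
    return {
        "throughput_bps": throughput,
        "end_to_end_delay_s": delay,
        "packet_delivery_ratio": len(delivered) / len(records),
        "su_pu_interference_ratio":
            sum(r.interrupted_by_pu for r in records) / len(records),
    }


# Four 512-byte packets; the third is interrupted by a PU and never arrives.
sample = [
    PacketRecord(4096, sent_at=0.0, received_at=0.5),
    PacketRecord(4096, sent_at=1.0, received_at=1.5),
    PacketRecord(4096, sent_at=2.0, received_at=None, interrupted_by_pu=True),
    PacketRecord(4096, sent_at=3.0, received_at=4.0),
]
result = summarize(sample)
```

In the sample, three of the four packets arrive and the last one is received at 4.0 s, so the throughput is 3 × 4096 bits divided by 4.0 s.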
To ensure the statistical significance of the results, we run each experiment for 500 seconds, repeat it 1000 times with different seeds, and report the average value as the final result.

Impact of Number of Flows.
In this part, we evaluate the impact of the offered traffic load on performance by changing the number of flows. We vary the number of CBR flows from 15 to 35.
As shown in Figure 3(a), LBOR provides higher throughput than SAAR and SAOR. Recall that the throughput defined here is the aggregate throughput of all the flows. We first consider the case in which the number of flows is below the saturation level. In this case, when the number of flows increases from 15 to 25, the throughput of all three schemes increases as more SU receivers are involved and more concurrent transmissions occur. We then consider the case in which the number of flows is larger than 25, which indicates that the network traffic is becoming saturated. In this case, the throughput of SAAR and SAOR starts to decline. As the network resources (available channels and links) are limited, the increasing number of flows intensifies the congestion and interference, which explains the throughput degradation of SAAR and SAOR. However, the throughput of LBOR still increases as more flows are involved. This is because LBOR has a load balancing feature and relieves congestion by routing the traffic through alternate forwarders.
Figure 3(b) shows that LBOR has superior end-to-end delay performance compared to SAAR and SAOR. When the number of flows is equal to 15, the performance gap between LBOR and both SAAR and SAOR is small. In this case, each relay node does not need to handle much traffic from multiple flows. Thus, the unfair distribution of network traffic does not lead to extra end-to-end delay. Nevertheless, if the number of flows is above 25, the end-to-end delay of SAAR and SAOR increases more sharply than that of LBOR. This can be attributed to the ability of LBOR to send packets over links with lower traffic load and delay. Moreover, the result indicates that LBOR facilitates a better distribution of traffic, and the consequent reduction of queuing times contributes to a lower overall delay.

Impact of PUs' Activities.
In this part, we evaluate the performance of LBOR under different PU activity patterns. The expected channel OFF time varies from 20 ms to 120 ms. A small expected channel OFF time means that the PUs are highly active, and thus the delivery of secondary users' packets is more likely to be interrupted. Unlike the previous experiment, the number of CBR flows is fixed at 30.
Figure 4(a) shows the SU-PU interference ratio of the three schemes under different values of the expected channel OFF time. In most cases, LBOR produces significantly lower SU-PU interference than SAAR and SAOR. This is because the traffic load of LBOR is distributed efficiently and fairly. In LBOR, fewer flows share the same SU node, which is highly influenced by PUs' behaviors. Moreover, we notice that the improvement from the load balancing scheme is significant under high PU activity (the expected channel OFF time is below 60 ms), but not under low PU activity (the expected channel OFF time is above 60 ms). The reason behind this phenomenon is that when nodes are highly influenced by PUs' behaviors and overloaded, LBOR gives priority to other nodes with a lower traffic load to relieve the burden on these nodes.

Figure 4(b) displays the throughput of LBOR, SAAR, and SAOR with different values of the expected channel OFF time. It shows that the throughput of these schemes increases as the expected channel OFF time becomes larger. The result also shows that LBOR achieves higher throughput than SAAR and SAOR. When the PUs' activities decrease (the expected channel OFF time increases), the performance difference between LBOR and both SAAR and SAOR becomes negligible. Moreover, the traffic load of LBOR is evenly distributed across the SUs and the different available channels. Therefore, it is expected that more improvement can be obtained from LBOR when PUs appear frequently.
The end-to-end delay and the packet delivery ratio under different PU activity levels are shown in Figures 4(c) and 4(d). As the expected channel OFF time increases, the packet delivery ratio increases and the end-to-end delay decreases. When the expected channel OFF time is around 20 ms, LBOR reduces the end-to-end delay by up to 36% and 40% compared to SAAR and SAOR, respectively. However, LBOR increases the delivery ratio by only up to 2% and 3%, respectively, compared to these two schemes. There are two main reasons for this. First, LBOR is based on a multipath opportunistic routing strategy, where the paths and relay nodes are determined on the fly. SAAR and SAOR utilize a similar opportunistic routing principle, and they also aim to increase the packet delivery ratio in a dynamically changing spectrum environment. Second, SAAR and SAOR always choose paths or forwarding nodes with the best packet delivery ratio, whereas LBOR jointly considers load balancing when selecting the relay nodes. Thus, the improvement in packet delivery ratio from LBOR is limited.

Impact of Number of Channels.
In this part, we compare the performance of the three schemes under different numbers of channels. The number of channels varies between 2 and 10, and the expected channel OFF time is set to 60 ms.
Figure 5(a) compares the SU-PU interference ratio of the three schemes as the number of channels in the network varies. We can observe that the SU-PU interference ratio of LBOR is lower than that of SAAR and SAOR. One significant observation is that when the number of channels is larger than 6, the SU-PU interference ratio of SAAR and SAOR tends to decrease as the number of channels increases. In CRAHNs, a larger number of channels can serve a larger number of SUs. Thus, in this case, SUs can choose more stable channels to avoid PUs' interruptions. In addition, if an SU is interrupted by a PU's appearance, it can easily find another available channel for data transmission. Another important observation is that when the number of channels increases from 2 to 6, the SU-PU interference ratio of LBOR increases, whereas that of the other two schemes decreases. An explanation for this phenomenon is as follows. Following its load balancing strategy, LBOR tries its best to distribute traffic evenly among multiple nodes. When there are not enough available channels to avoid mutual interference among SUs, the packets cannot be correctly decoded at the receiver. For the same reason, we can also observe from Figures 5(b), 5(c), and 5(d) that when the number of channels is below 6, the improvements from load balancing are not significant.
Figure 5(b) shows the throughput of LBOR, SAAR, and SAOR with different numbers of channels in the network. We observe that the throughput of these schemes increases as the number of channels becomes larger. The result also confirms that no matter how many channels are available, LBOR always performs better than SAAR and SAOR. When the number of channels is above 6, more improvement can be obtained from LBOR.
Figures 5(c) and 5(d) provide similar results and demonstrate that LBOR achieves lower end-to-end delay and higher packet delivery ratio in most cases compared to SAAR and SAOR. The improvements come from the load balancing based metric and the load balancing based relay selection algorithm. On the one hand, the proposed scheme adopts a load balancing based metric to choose the relay nodes with higher throughput and lower delay. On the other hand, the load balancing based relay selection algorithm can predict the occupied capacity of each link and thus ensure that the traffic load is evenly balanced among all of the relay nodes.

Impact of Number of PUs.
In this experiment, we evaluate the performance of the three schemes under different numbers of PUs. The number of PUs varies between 10 and 40, and the number of channels is set to 6.

Figure 6(a) compares the SU-PU interference ratio of the schemes under different numbers of PUs. In most cases, the SU-PU interference ratios of SAAR and SAOR rise more sharply than that of LBOR. That is because the network traffic of the non-load-balancing methods (SAAR and SAOR) tends to concentrate on a limited set of SUs and links. If these SUs' channels are occupied by PUs, most packets of the network may be affected. On the contrary, LBOR distributes traffic to a larger number of SUs. Thus, packets have less chance of being affected by the PUs' appearance, which yields an improvement from load balancing compared with SAAR and SAOR. When the number of PUs is below 30, the benefit from load balancing is limited, which is not surprising since SUs are not heavily affected. An interesting observation is that when the number of PUs is 35, the SU-PU interference ratio of LBOR is lower than in the case of 30. The reason is that the aggregated benefit from the load balancing scheme outweighs the negative effect of the increased number of PUs. More specifically, in LBOR, SUs are not overloaded and the packets are properly distributed. Thus, when data transmission failures are caused by the appearance of PUs, it is much easier for LBOR than for the other two methods to reroute the blocked traffic flows to lightly loaded SUs whose channels are available. When the number of PUs is 40, we observe that the SU-PU interference ratio of LBOR is higher than in the case of 35. In this situation, there are not enough available channels for SUs to transmit the traffic of the whole network. Thus, it is difficult to find lightly loaded SUs to reroute packets for overloaded SUs, and more packets are interrupted by PUs' activities.
Figure 6(b) shows the throughput of the three schemes as the number of PUs in the network varies. We observe that the proposed LBOR scheme achieves higher throughput than SAAR and SAOR. We can also see that when the number of PUs increases, the throughput decreases because the available resources diminish. Moreover, when the number of PUs is small, the improvement from LBOR is limited. On the contrary, when the number of PUs rises above 25, the throughput of SAAR and SAOR drops drastically, but the throughput degradation of LBOR is limited. This is because LBOR can predict the availability and the occupied capacity of each link and thus relieve network congestion.
Figures 6(c) and 6(d) also confirm that LBOR provides the lowest end-to-end delay and the highest packet delivery ratio among the three schemes. When the number of PUs increases, the unbalanced distribution of load among nodes becomes a critical issue and has a negative impact on the end-to-end delay and the packet delivery ratio. This can be understood intuitively as follows. On the one hand, the previous link may become unavailable due to a PU's appearance, and the packets that arrived earlier have to wait for late packets in reordering buffers at the receiving destination, which increases the queuing delay. On the other hand, if late packets do not arrive within the imposed deadline, they are treated as lost, which increases the packet loss.

Conclusion
In this paper, we have proposed a load balancing opportunistic routing scheme for cognitive radio networks. Considering the loading constraint of each node, we analyzed the throughput bound of one opportunistic module and formulated the problem of maximizing the total throughput of the network as a linear programming problem. Moreover, we designed a new metric, which considers the traffic load, to select and prioritize the candidate nodes. We also proposed load balancing based candidate forwarder sorting and relay selection algorithms. Simulation results have shown the advantages of LBOR over SAAR and SAOR concerning the SU-PU interference ratio, the packet delivery ratio, the throughput, and the end-to-end delay of CRAHNs.

Figure 2: The process of one-hop opportunistic routing for CRAHNs.

Figure 4(b): Throughput of LBOR, SAAR, and SAOR with different values of the expected channel OFF time.

Figure 5 (panel titles): SU-PU interference ratio versus varying number of channels; End-to-end delay versus varying number of channels; Packet delivery ratio versus varying number of channels.

Table 1: Summary of notations.
The set of common channels between two nodes (the intersection of their channel sets)
The Euclidean distance between two nodes
The link between two nodes
The packet reception ratio of a link
The state of
The probability of channel ch to be busy/idle
The available capacity of the link set LS
V: The average size of the data flow
The maximum bits of the data flow
F: The data flow set
The data flowing through link l
The input/output link set of node n
The relay distance between two nodes
The one-hop neighboring set of a node
The forwarding candidate set of a node
The data transfer rate of a node
The priority order of a node
The sum of    and  
The state of channels
The effective forwarding rate
The probability of a candidate node successfully receiving the packet
The cognitive transport throughput of an opportunistic module

4.3. Maximize the Total Throughput of the Network. In this subsection, we formulate the optimization problem of network throughput as a linear programming (LP) problem:

Algorithm 2 (continued):
(2) Initialize values of $o_1$, $o_2$ and ctt
(3) for each $n_e \in F$ and $1 \le e \le m$ do
(4)     Search the opportunistic module $(n_e, F_e)$ of $n_e$
(5)     F  ← the candidate node set of $(n_e, F_e)$
(6)     LS  ← the candidate link set of $(n_e, F_e)$

Sub-steps of selection step (1): (1) predicting the occupied capacity of each link in the candidate link set based on (27); (2) calculating the available capacity of each link in the candidate link set based on (14); (3) computing the LS of each node in the candidate node set based on (24); (4) sorting the candidate node set and obtaining an ordered candidate node set.