Sum Rate Maximization of D2D Communications in Cognitive Radio Network Using Cheating Strategy

This paper focuses on a cheating algorithm for device-to-device (D2D) pairs that reuse the uplink channels of cellular users. We are concerned with how D2D pairs are matched with cellular users (CUs) to maximize their sum rate. In contrast with Munkres’ algorithm, which gives the optimal matching in terms of maximum throughput, the Gale-Shapley algorithm also ensures the stability of the system and achieves a men-optimal stable matching. In our system, D2D pairs play the role of “men,” so that each D2D pair is matched to the CU that ranks as high as possible in the D2D pair’s preference list. Previous studies have found that, by unilaterally falsifying preference lists in a particular way, some men can get better partners while no man becomes worse off. We utilize this theory to derive the best cheating strategy for D2D pairs. We find that, to acquire such a cheating strategy, we need to seek as many and as large cabals as possible. To this end, we develop a cabal finding algorithm named RHSTLC and prove that it reaches Pareto optimality. In comparison with algorithms proposed in related works, the results show that our algorithm can considerably improve the sum rate of D2D pairs.


Introduction
The ever-increasing demand for wireless communication services motivates system designers and researchers to seek ways of reusing frequency resources or enhancing the system capacity. Cognitive radio (CR), as a promising technology in future wireless communication, enables secondary users (usually unlicensed) to reuse the frequency band of primary users with the latter's permission [1,2]. The merits of CR technology are that, first, it can offload the communication overhead of the base station in a centralized system and, second, it substantially improves the spectral efficiency (SS) and energy efficiency (ES). Specifically, there are two modes of sharing the spectral resources, that is, overlay and underlay. In the overlay mode, secondary users first need to sense spectrum holes, that is, temporarily unused channels, and then commence real transmissions [3]. On the other hand, if the channel gains between the primary and secondary users are sufficiently small that both sides are able to successfully convey their signals, this situation is also acceptable in CR networks and is named the underlay mode [4].
With the explosive global growth of services in today's wireless communication systems pushing concepts like massive machine type communication (mMTC), ultradense networks (UDN), and so on, the communication environment tends to be saturated in terms of spectrum resources. An important related technology supporting CR networks is device-to-device (D2D) communication, which can build a direct link from end to end [5,6]. In [7], the system capacity of a relay-assisted CR network is studied, where D2D users can work in both overlay and underlay modes. The paper [8] investigates energy efficiency maximization for the secondary users subject to a QoS requirement for the primary user, using precoder design and energy resource allocation.
To fully utilize CR networks and D2D communication technology, we usually need to solve an allocation problem. For example, assume we have $N$ D2D pairs that hope to reuse the channels of $M$ CUs; how to arrange the matching between the two sides for a certain purpose (e.g., SS or ES maximization) becomes an interesting problem, since different allocation schemes can differ dramatically in terms of system performance. For one-to-one optimal matching, Munkres' algorithm, also known as the Hungarian algorithm, is able to solve the problem in polynomial time [9,10]. However, finding the matching by Munkres' algorithm requires a centralized computation, which could cause some communication overhead, whereas the formation of a D2D network is more like a natural process. Hence, we introduce another concept, namely, stability, which indicates the possibility that a system structure might change. For instance, if a secondary user prefers to be allocated to another primary user, and coincidentally this primary user prefers the secondary user to its current partner, they might drop their current partners and form a new channel-reusing relationship. To solve the stable matching problem, Gale and Shapley proposed the celebrated GS algorithm based on the preference lists of both sides [11], which has been adopted in a broad range of studies [8,12-15]. For a particular system, there could be more than one stable matching, while the GS algorithm gives the men's side the best opportunity to get partners that rank as high as possible in the men's preference lists. The paper [16] proposed a novel cheating strategy for the men's side to further enhance their benefit. The "cheating" needs a group of men to falsify their preference lists carefully. The paper [13] first applied this cheating strategy in a CR network with secondary users being D2D pairs. Compared with the existing studies, the main contributions of our work are as follows: first, in order to maximize the sum rate of D2D pairs, we need to search for as many and as large cabals as possible, so we developed cabal finding algorithms that greatly outperform existing algorithms; second, for the proposed algorithm RHSTLC, we proved its Pareto optimality, whereas the algorithms in previous studies do not have this property; third, we numerically demonstrated and evaluated the performance of our proposed algorithms.
The remainder of this paper is organized as follows. Section 2 elaborates the system model of D2D communication involving cognitive radio networks. In Section 3, we detail the cheating strategy for D2D pairs, including how to find a cabal, how to define the corresponding accomplice, and how to falsify their preference lists. The performance of our proposed algorithms and existing algorithms is shown and discussed in Section 4. Finally, the study is concluded in Section 5.

System Model and Problem Formulation
The device-to-device cognitive network system is illustrated in Figure 1, where $N$ D2D pairs $\mathcal{D} = \{d_1, \ldots, d_i, \ldots, d_N\}$, $M$ CUs $\mathcal{C} = \{c_1, \ldots, c_j, \ldots, c_M\}$, and a base station (BS) are involved. In this study, we assume $N = M$, even though unbalanced cases seem more practical. Nevertheless, we can drop $M - N$ poor CUs when $M > N$ and do likewise for D2D pairs when $N > M$, which always leads to the balanced situation. The CUs and the base station work in frequency division duplexing mode, which means each CU occupies an orthogonal frequency band to upload its data. Meanwhile, D2D pairs expect to reuse the uplink channels of the CUs to fully utilize the local spectral resources. The channel gain between the two devices in the D2D pair $d_i$ is denoted by $g_i$, and that between its transmit device and the base station by $g_{i,B}$. On the other hand, the channel gain between the CU $c_j$ and the base station is denoted by $h_{j,B}$, and that between the CU and the D2D pair $d_i$ by $h_{j,i}$. Each CU is constrained by a minimum signal to interference and noise ratio (SINR) $\gamma_{c,\min}$; otherwise, the transmission is unsuccessful. The maximum transmit power of CUs is $P_c$, and that of D2D pairs is $P_d$.
However, in our system, D2D users are not aware of the CUs' SINR constraints, so they always adopt the maximum transmit power. Each D2D pair is allowed to reuse at most one uplink channel, and the same goes for the other side; that is, each CU is allowed to accept at most one D2D pair. For instance, if $c_j$ agrees to the channel-reusing proposal from $d_i$, with $N_0$ denoting the noise power, the SINR of the D2D pair is given by
$$\gamma^{d}_{i,j} = \frac{P_d\, g_i}{N_0 + P_c\, h_{j,i}},$$
and correspondingly the SINR at the BS side from the CU $c_j$ is written as
$$\gamma^{c}_{j,i} = \frac{P_c\, h_{j,B}}{N_0 + P_d\, g_{i,B}}.$$

Algorithm 1: Gale-Shapley algorithm.
(1) Initialize: Obtain the whole set of D2D pairs $\mathcal{D}$ and the set of CUs $\mathcal{C}$. Obtain the preference lists of all the D2D pairs $L_D$ and all the CUs $L_C$. Build a temporary set TD = $\mathcal{D}$.
(2) while TD is not empty do
(3) An arbitrary D2D pair $d_i$ in TD proposes to the CU ranking highest in its preference list, say $c_j$.
(4) if $c_j$ has not been proposed to yet then
(5) $c_j$ accepts the proposal of $d_i$.
(6) else
(7) if $c_j$ prefers $d_i$ to its current partner $d_{i'}$ then
(8) $c_j$ accepts the proposal of $d_i$; $d_{i'}$ goes back to TD.
(9) else
(10) $d_i$ proposes to the next CU in its preference list.
(11) end if
(12) end if
(13) end while
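The two SINR expressions of this section can be evaluated numerically with a few lines of Python. This is only a minimal sketch: it assumes the usual interference-limited form (signal power over noise plus the cross-link interference), assumes that both sides transmit at their maximum power, and uses hypothetical function and parameter names (`d2d_sinr`, `cu_sinr`, `cu_accepts`) that do not come from the paper.

```python
def d2d_sinr(p_d, g_i, p_c, h_ji, n0):
    """SINR at the D2D receiver of pair i when it reuses the channel of CU j.

    p_d: D2D transmit power, g_i: gain of the D2D link,
    p_c: CU transmit power, h_ji: gain from CU j to D2D pair i,
    n0: noise power (all in linear scale; names are assumptions).
    """
    return p_d * g_i / (n0 + p_c * h_ji)


def cu_sinr(p_c, h_jb, p_d, g_ib, n0):
    """SINR at the base station for CU j while D2D pair i shares its channel."""
    return p_c * h_jb / (n0 + p_d * g_ib)


def cu_accepts(p_c, h_jb, p_d, g_ib, n0, gamma_min):
    """True if the CU's minimum SINR requirement is still met with sharing."""
    return cu_sinr(p_c, h_jb, p_d, g_ib, n0) >= gamma_min
```

A CU would call `cu_accepts` for a proposing D2D pair and keep the channel to itself whenever the result is false.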
A preference list $L$ of a participant (D2D pair or CU) is a set including all of its counterparts, sorted by the predicted SINR from the highest to the lowest. If $c_j$ is located leftmost in $d_i$'s preference list $L_{d_i}$, $c_j$ is the most wanted counterpart of $d_i$, and we denote $\mathrm{rank}(d_i, c_j) = 1$. In addition, we use $c_j \leftarrow_{d_i} c_{j'}$ to indicate that $d_i$ prefers $c_j$ to $c_{j'}$, and mathematically we can define the expression by
$$c_j \leftarrow_{d_i} c_{j'} \iff \mathrm{rank}(d_i, c_j) < \mathrm{rank}(d_i, c_{j'}).$$
A matching $\mathcal{M}$ arranges every D2D pair to be matched with a CU; however, it does not mean that the D2D pair can eventually commence a transmission, since the final say belongs to the CUs. In other words, as long as the D2D pair satisfies the corresponding CU's SINR requirement, they can share the channel; otherwise the CU solely occupies the channel. The counterpart of a participant $x$ in $\mathcal{M}$ is given by $\mathcal{M}(x)$. For example, $\mathcal{M}(d_i) = c_j$ indicates that $d_i$ is allocated to $c_j$ in matching $\mathcal{M}$, and conversely $\mathcal{M}(c_j) = d_i$ must hold for one-to-one matching. Now, we can introduce the concept of blocking pair and stable matching.
Definition 1. For a matching $\mathcal{M}$, if there exist a D2D pair $d_i$ and a CU $c_j$ such that $c_j \leftarrow_{d_i} \mathcal{M}(d_i)$ and $d_i \leftarrow_{c_j} \mathcal{M}(c_j)$, then $(d_i, c_j)$ is called a blocking pair. On the contrary, we say $\mathcal{M}$ is stable if it has no blocking pair.
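The notion of a blocking pair can be checked mechanically. The following is a minimal sketch, assuming complete preference lists stored as Python lists ordered from most to least preferred and a one-to-one matching stored as a dictionary from D2D pairs to CUs; the helper names `blocking_pairs` and `is_stable` are hypothetical.

```python
def blocking_pairs(matching, d2d_prefs, cu_prefs):
    """List all (d2d, cu) pairs that prefer each other to their partners.

    matching: dict d2d -> cu (assumed one-to-one and complete).
    d2d_prefs / cu_prefs: dict participant -> list of counterparts,
    most preferred first.
    """
    cu_partner = {c: d for d, c in matching.items()}
    found = []
    for d, prefs in d2d_prefs.items():
        for c in prefs:
            if c == matching[d]:
                break  # every later CU ranks below d's current partner
            c_list = cu_prefs[c]
            # d prefers c; the pair blocks if c also prefers d to its partner
            if c_list.index(d) < c_list.index(cu_partner[c]):
                found.append((d, c))
    return found


def is_stable(matching, d2d_prefs, cu_prefs):
    """A matching is stable exactly when it has no blocking pair."""
    return not blocking_pairs(matching, d2d_prefs, cu_prefs)
```

For two D2D pairs that both rank `c1` first while `c1` ranks `d2` first, the matching that gives `c1` to `d1` is blocked by `(d2, c1)`.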
In plain terms, a stable matching $\mathcal{M}$ of our network means that when a D2D pair $d_i$ proposes to any CU ranking higher than its partner $\mathcal{M}(d_i)$, the CU will definitely refuse the D2D pair's proposal, so that the topology of the network remains stable. To obtain a stable matching, the Gale-Shapley algorithm is adopted, which is a classical algorithm in graph theory solving the stable marriage problem. We demonstrate the way of using the GS algorithm for our system in Algorithm 1.
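A compact Python rendering of the D2D-proposing Gale-Shapley procedure may help make Algorithm 1 concrete. It is a sketch under the assumptions of this section (equal numbers of D2D pairs and CUs, complete preference lists); the function name `gale_shapley` and the dictionary-based data layout are illustrative choices rather than the authors' implementation.

```python
def gale_shapley(d2d_prefs, cu_prefs):
    """Return the proposer-optimal stable matching as a dict d2d -> cu.

    d2d_prefs / cu_prefs: dict participant -> list of counterparts,
    most preferred first (complete lists, equal set sizes assumed).
    """
    # rank[c][d] is the position of d in c's list (0 = most preferred)
    rank = {c: {d: r for r, d in enumerate(lst)} for c, lst in cu_prefs.items()}
    next_choice = {d: 0 for d in d2d_prefs}  # next CU each pair will try
    holder = {}                               # cu -> currently accepted d2d
    free = list(d2d_prefs)                    # the temporary set TD

    while free:
        d = free.pop()
        c = d2d_prefs[d][next_choice[d]]
        next_choice[d] += 1
        current = holder.get(c)
        if current is None:
            holder[c] = d                     # first proposal is accepted
        elif rank[c][d] < rank[c][current]:
            holder[c] = d                     # c trades up
            free.append(current)              # old partner returns to TD
        else:
            free.append(d)                    # rejected; d tries its next CU
    return {d: c for c, d in holder.items()}
```

The order in which free D2D pairs propose does not change the result; the proposer-optimal stable matching is unique.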
In the GS algorithm, the proposers gain the most benefit; that is to say, every D2D pair is assigned the best possible CU among all the stable matchings. Hence, the stable matching obtained by the GS algorithm is also called the men-optimal stable matching (in the original GS algorithm, men are the proposers). Consequently, the sum rate of D2D pairs according to Shannon's theory can be written as
$$R_D(\mathcal{M}) = \sum_{d_i \in \mathcal{D}} B \log_2\left(1 + \gamma^{d}_{i,j}\right), \quad c_j = \mathcal{M}(d_i),$$
where $B$ is the channel bandwidth and $\gamma^{d}_{i,j}$ is the SINR of the D2D pair $d_i$ when it reuses the channel of $c_j$. As we can see, the preference lists $L_D$, $L_C$ and the GS algorithm determine the men-optimal matching $\mathcal{M}$ and hence determine the D2D pair sum rate. So, the problem in this study is how D2D pairs falsify their preference lists to maximize their average data rate: that is,
$$\max_{L_D^{*}} \; \frac{1}{N} R_D(\mathcal{M}_f) \quad \text{s.t.} \quad \mathrm{rank}(d_i, \mathcal{M}_f(d_i)) \le \mathrm{rank}(d_i, \mathcal{M}_0(d_i)), \; \forall d_i \in \mathcal{D},$$
where $\mathcal{M}_0$ is the original men-optimal stable matching without D2D users falsifying preference lists and $\mathcal{M}_f$ is the new stable matching according to the falsified preference lists $L_D^{*}$. The constraint indicates that every D2D pair must get a CU at least as good as the one in the original matching.
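The Shannon sum rate of a given matching can be computed as follows. This is a sketch with assumed inputs: `sinr` maps each (D2D pair, CU) combination to a linear-scale SINR, and `accepted` records whether the CU's SINR requirement is met; following the matching discussion in this section, a rejected pair is assumed to contribute zero rate. All names are hypothetical.

```python
import math


def d2d_sum_rate(matching, sinr, accepted, bandwidth):
    """Sum of B * log2(1 + SINR) over the D2D pairs admitted by their CU.

    matching: dict d2d -> cu; sinr: dict (d2d, cu) -> linear SINR;
    accepted: dict (d2d, cu) -> bool; bandwidth: channel bandwidth in Hz.
    """
    total = 0.0
    for d, c in matching.items():
        if accepted[(d, c)]:  # assumption: rejected pairs do not transmit
            total += bandwidth * math.log2(1.0 + sinr[(d, c)])
    return total
```

Dividing the result by the number of D2D pairs gives the average data rate used as the objective of the maximization problem.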

Coalition Strategy.
The study [16] introduced a novel cheating strategy for proposers, which requires some of them to falsify their preference lists. As a result, some of the proposers are better off, while none of them are worse off. In a more complicated system model, [13] adopted this strategy to achieve better throughput for D2D users. This study simplifies the model and focuses on how to improve the cheating strategy. Now, we introduce the cheating method. First we need to find a cabal $\mathcal{K} = \{k_m \mid m = 1, 2, \ldots, K\}$ whose members are all D2D pairs: that is, $k_m \in \mathcal{D}$ and $\mathcal{M}_0(k_{m-1}) \leftarrow_{k_m} \mathcal{M}_0(k_m)$. That is to say, the D2D pair $k_m$ prefers $k_{m-1}$'s cellular partner to its own partner in the matching $\mathcal{M}_0$ (except that $k_1$ prefers $k_K$'s partner). If we could manage to let all the members of the cabal exchange their partners, they would be better off in the new matching. However, we should keep two things in mind. First, the new matching must also be obtained by the GS algorithm so that it remains stable. Second, besides the members of the cabal, the remaining D2D pairs should not be worse off; otherwise they lose the motivation to help the cabal.
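The cabal condition can be verified directly from the preference lists and the original matching. The sketch below assumes the cyclic definition given in this section (each member prefers the partner of the preceding member, with the first member wrapping around to the last); `is_cabal` and `swapped_partners` are hypothetical helper names.

```python
def is_cabal(members, m0, d2d_prefs):
    """Check whether the ordered list `members` forms a cabal.

    members: ordered list of D2D pairs [k_1, ..., k_K];
    m0: dict d2d -> cu, the original men-optimal matching;
    d2d_prefs: dict d2d -> list of CUs, most preferred first.
    """
    size = len(members)
    if size < 2 or len(set(members)) != size:
        return False
    for idx in range(size):
        me = members[idx]
        prev = members[idx - 1]  # idx 0 wraps around to the last member
        prefs = d2d_prefs[me]
        # `me` must strictly prefer prev's partner to its own partner
        if prefs.index(m0[prev]) >= prefs.index(m0[me]):
            return False
    return True


def swapped_partners(members, m0):
    """Partners after the exchange: k_m receives the CU of k_{m-1}."""
    return {members[i]: m0[members[i - 1]] for i in range(len(members))}
```

Note that this only checks the cabal condition; whether the exchange can be realized as a stable matching still depends on the accomplices falsifying their lists.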
However, there are two sorts of D2D pairs that get in the cabal's way of obtaining the new, better partners. So, we need them to falsify their preference lists, and this set of D2D pairs is called the accomplice of the cabal, $A(\mathcal{K})$. We assume that the accomplices are willing to help the cabal get their new, better partners under one crucial condition: that they do not get a worse partner in the new matching. To satisfy the accomplices' condition, we find the following Theorem 2, proposed and proved by [16], to be important. Before stating the theorem, we have to define some notation. $\pi_r(\cdot)$ stands for the random permutation operation that shuffles a set. $PL(d_i)$ and $PR(d_i)$ are the CUs located on the left and right side of $\mathcal{M}(d_i)$ in $d_i$'s preference list, respectively, so that the preference list of $d_i$ can be expressed as $\{PL(d_i), \mathcal{M}(d_i), PR(d_i)\}$.
According to this theorem, [13,16] give the cheating strategy for the men's side, which is summarized in Algorithm 2.
By submitting the falsified lists, the accomplices enable the cabal members to exchange their partners in the new matching, and hence the average data rate of D2D pairs can be enhanced.

Searching Large Cabals.
Generally speaking, the larger the cabal is, the more improvement in data rate the D2D pairs can achieve. Therefore, the maximization problem in the last section reduces to a classical problem in graph theory, namely, finding the longest cycle in a directed graph, which is an NP-hard problem. In detail, each vertex is a D2D pair, and each directed edge shows a preference relationship. Hence, [13] suggests a random cabal search (RCS), which starts from an arbitrary D2D pair and stops as soon as a cycle is found. In order to obtain a relatively larger cabal, [13] conducts the RCS for each D2D pair and adopts the largest cabal found during the process. In this study, inspired by the method in [17], we develop a heuristic search for the largest cabal (HSTLC), which aims at finding a cycle as long as possible.
Before presenting the HSTLC algorithm, we have to build the directed graph; hence, we need the following things: (i) the whole set of D2D pairs $\mathcal{D}$, (ii) the D2D pairs' preference lists $L_D$, and (iii) the original men-optimal stable matching $\mathcal{M}_0$. For each parent vertex $d_i$, we find all its child vertices by
$$\mathrm{Child}(d_i) = \{ d_k \in \mathcal{D} \mid \mathcal{M}_0(d_k) \leftarrow_{d_i} \mathcal{M}_0(d_i) \}.$$
We draw a directed edge from $d_i$ to every vertex in its child vertex set; in this way, the directed graph $\mathcal{G}$ is constructed.
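The graph construction described above can be sketched in Python. The code assumes a complete one-to-one matching and preference lists ordered from most to least preferred; the adjacency-list representation and the name `build_preference_graph` are illustrative assumptions.

```python
def build_preference_graph(d2d_prefs, m0):
    """Build the directed graph as a dict: d2d -> list of child vertices.

    An edge d_i -> d_k means that d_i prefers d_k's partner in m0
    to its own partner in m0.
    """
    owner = {c: d for d, c in m0.items()}  # cu -> d2d currently holding it
    graph = {}
    for d, prefs in d2d_prefs.items():
        children = []
        for c in prefs:
            if c == m0[d]:
                break  # CUs ranked below the current partner add no edge
            children.append(owner[c])
        graph[d] = children
    return graph
```

A D2D pair that already holds its top-ranked CU gets an empty child list and therefore cannot be part of any cycle.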
The pseudocode of HSTLC is given in Algorithm 3, which mainly consists of two parts: the main function ensures that each vertex is reached at least once; the visit-next function defines a coloring scheme that contributes to finding a large cabal. In detail, we start from an arbitrary vertex $d_i$ and color it gray. Then we explore $d_i$'s child vertices, that is, the D2D pairs whose cellular partner is preferred by $d_i$ to its own partner in $\mathcal{M}_0$. Next the algorithm chooses a white child vertex to move on to, and if during the process any explored child vertex is found to be gray, a cabal is formed. Each time we find a cabal, it is compared with the largest cabal obtained so far, that is, $\mathcal{K}_{\max}$. If it is larger than $\mathcal{K}_{\max}$, we take it as the new largest cabal and drop the preceding one; otherwise, we drop it and move on. Eventually, the algorithm stops only when all vertices have been explored.

Algorithm 3: Heuristic search for the largest cabal (HSTLC).
...
(6) A cabal $\mathcal{K}$ is found;
(7) if size($\mathcal{K}$) > size($\mathcal{K}_{\max}$) then
(8) $\mathcal{K}_{\max} = \mathcal{K}$;
(9) end if
(10) else
(11) continue;
(12) end if
(13) end for
(14) Color($d_i$) = black
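The coloring scheme described above corresponds to a depth-first search with white, gray, and black vertices. The following Python sketch is one possible reading of that description, not a transcription of Algorithm 3: it keeps the current search path on a stack so that a cycle can be read off when a gray child is met. The names `hstlc` and `visit_next` and the adjacency-list input are assumptions.

```python
WHITE, GRAY, BLACK = 0, 1, 2


def hstlc(graph):
    """Return the longest cycle met during one colored DFS (may be empty).

    graph: dict vertex -> list of child vertices. Each vertex is visited
    once, so the result is a heuristic, not a guaranteed longest cycle.
    """
    color = {v: WHITE for v in graph}
    path = []   # vertices on the current search path (all gray)
    best = []   # largest cabal found so far

    def visit_next(v):
        nonlocal best
        color[v] = GRAY
        path.append(v)
        for u in graph[v]:
            if color[u] == WHITE:
                visit_next(u)
            elif color[u] == GRAY:
                cycle = path[path.index(u):]  # u ... v closes a cycle
                if len(cycle) > len(best):
                    best = cycle
            # black children are fully explored; nothing to do
        path.pop()
        color[v] = BLACK

    for v in graph:  # main function: reach every vertex at least once
        if color[v] == WHITE:
            visit_next(v)
    return best
```

In the returned list each vertex prefers the partner of the vertex that follows it, so the order is the reverse of the cabal indexing used in the text. For large graphs the recursion could be replaced by an explicit stack to avoid Python's recursion limit.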

Algorithm 4: Recurrent HSTLC (RHSTLC).
Finding a large cabal suffices to improve the average data rate of D2D pairs; however, after removing the cabal found by HSTLC, the remaining graph may still include some cabals, which means there are possibilities to further improve the system performance. For example, in Figure 2, the first run of HSTLC might find a large cabal $\mathcal{K}_1 = \{d_1, d_2, d_3\}$; however, apart from $\mathcal{K}_1$, we can see that $\{d_4, d_5\}$ is also a cabal. Based on this scenario, we propose a recurrent HSTLC (see Algorithm 4) that runs HSTLC over and over again until no more cabals can be found.
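The recurrent procedure can be sketched as a loop that repeatedly searches for a cycle, records it, and deletes its vertices from the graph. This is an illustrative reading of the description above rather than Algorithm 4 itself; the colored depth-first search is repeated here so that the block is self-contained, and the names `find_large_cycle` and `rhstlc` are hypothetical.

```python
def find_large_cycle(graph):
    """One colored DFS pass; returns the longest cycle it meets (or [])."""
    color = {v: 0 for v in graph}  # 0 white, 1 gray, 2 black
    path, best = [], []

    def visit(v):
        nonlocal best
        color[v] = 1
        path.append(v)
        for u in graph[v]:
            if color[u] == 0:
                visit(u)
            elif color[u] == 1:
                cycle = path[path.index(u):]
                if len(cycle) > len(best):
                    best = cycle
        path.pop()
        color[v] = 2

    for v in graph:
        if color[v] == 0:
            visit(v)
    return best


def rhstlc(graph):
    """Repeat the search until no cycle remains; return all cabals found."""
    remaining = {v: list(children) for v, children in graph.items()}
    cabals = []
    while True:
        cabal = find_large_cycle(remaining)
        if not cabal:
            break  # terminating condition: no more cabals can be found
        cabals.append(cabal)
        gone = set(cabal)
        # remove the cabal's vertices and every edge pointing to them
        remaining = {
            v: [u for u in children if u not in gone]
            for v, children in remaining.items()
            if v not in gone
        }
    return cabals
```

On a graph shaped like the example in the text, the first pass returns the three-member cabal and the second pass returns the remaining two-member one.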

Computational Complexity.
To obtain the men-optimal stable matching, the GS algorithm requires a computational complexity of $O(N^2)$. Also, after determining an appropriate cabal, we need to change the preference lists of D2D pairs by the coalition strategy (CS), whose cost grows with the size $K$ of the cabal. Since the size of a cabal cannot exceed the size of $\mathcal{D}$, the complexity of the coalition strategy is at most $O(N^2)$. Searching for a random cycle from a given vertex costs $O(N + E)$ computations, where $E$ is the number of edges in the graph $\mathcal{G}$. Hence, the ergodic RCS (ERCS) algorithm given by [13], which runs RCS from every vertex, has a complexity order of $O(N(N + E))$. On the other hand, HSTLC reaches each vertex once, leaving us with a complexity order of $O(N + E)$. In the worst case, each run of RHSTLC removes only a smallest cabal (2 D2D pairs) from the graph $\mathcal{G}$, which gives a complexity of $O(N(N + E))$. We summarize the above discussion in Table 1. Besides, since the entire process of falsifying preference lists follows the procedure GS (to find the original stable matching $\mathcal{M}_0$) → cabal finding algorithm → CS, we also list the overall complexity order in the table.

Pareto Optimality.
In this section, we solely view the optimality from the D2D pair side, so we have the following theorem.
Theorem 3. If the D2D pairs follow RHSTLC to find cabals and change their preference lists according to the coalition strategy, then after the GS algorithm, the D2D pairs reach Pareto optimality in the final stable matching $\mathcal{M}_f$.
Proof. Suppose not; then there is a subset of D2D pairs $\mathcal{S} = \{d_i\} \subseteq \mathcal{D}$ that can improve their own data rates without reducing other D2D pairs' data rates, which means all the D2D pairs in $\mathcal{S}$ are better off while the D2D pairs outside remain the same. Since $d_i$ is better off, it must be allocated to some CU $c_j$ that is located on the left-hand side of $\mathcal{M}_f(d_i)$ in $d_i$'s preference list. Therefore, the D2D pair $d_k$ that was originally allocated to $c_j$ must seek a new CU $c_l$ that ranks higher than $\mathcal{M}_f(d_k)$ in $d_k$'s preference list. Eventually some D2D pair $d_q$ must be allocated to $\mathcal{M}_f(d_i)$. The D2D pairs $\{d_i, d_k, \ldots, d_q\}$ swap their partners and everyone is better off, so this group is a cabal. However, the terminating condition of RHSTLC is that no more cabals can be found; therefore, a contradiction occurs, which completes the proof. In contrast, Pareto optimality cannot be guaranteed for ERCS and HSTLC, since cabals may remain in the graph when they terminate. Note also that Pareto optimality does not guarantee optimality in terms of the average data rate of D2D pairs.

Performance Evaluation
In this section, we investigate the performance of the proposed algorithms. The default simulation environment is given in Table 2. All the CUs and D2D pairs are uniformly distributed in the cell, and we assume that the D2D communication distance is uniformly distributed over 10-30 m. Munkres' algorithm (MK) and the original GS algorithm are benchmarks, which, respectively, show the upper bound and lower bound of the performance of our proposed algorithm. In particular, the MK curves indicate the optimal matching that achieves the maximal throughput of the system, regardless of stability. The GS curves show the status of D2D pairs without falsifying their preference lists. In addition, the outcomes of the ERCS algorithm are also plotted to validate the efficacy of HSTLC and RHSTLC.
In our simulation, the channel gain between CUs (or D2D users) and the base station is given by
$$h_{j,B} = 128 + 37.6 \log_{10}(r_j) - 10 \log_{10}(\alpha) + \beta, \tag{7}$$
where $r_j$ is the distance from the CU to the base station, $\alpha$ follows the exponential distribution Exp(1), which models the Rayleigh fading effect, and $\beta$ is normally distributed as $\mathcal{N}(0, 3)$, modeling the shadowing effect. This equation also applies to $g_{i,B}$, that is, the channel gain of D2D users. In addition, we believe the path loss between users should be larger than in (7) due to the complex indoor environment, which leads us to
$$h_{j,i} = 148 + 40 \log_{10}(r_{j,i}) - 10 \log_{10}(\alpha) + \beta,$$
where $r_{j,i}$ is the distance from the CU $c_j$ to the D2D pair $d_i$; the same equation applies to $g_i$, that is, the channel gain between the two devices in the D2D pair $d_i$.
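The two channel models can be sampled as follows. This is a sketch with several labeled assumptions: the expressions are treated as path loss in dB and converted to a linear gain, the second parameter of the normal distribution is interpreted as a standard deviation of 3 dB, and the distance unit is left to the caller because the text does not state it. All function names are hypothetical.

```python
import math
import random


def cellular_path_loss_db(distance, fading, shadowing_db):
    """User-to-BS model: 128 + 37.6 log10(r) - 10 log10(alpha) + beta."""
    return 128 + 37.6 * math.log10(distance) - 10 * math.log10(fading) + shadowing_db


def d2d_path_loss_db(distance, fading, shadowing_db):
    """User-to-user model: 148 + 40 log10(r) - 10 log10(alpha) + beta."""
    return 148 + 40 * math.log10(distance) - 10 * math.log10(fading) + shadowing_db


def loss_db_to_gain(loss_db):
    """Convert a loss in dB to a linear power gain (assumed interpretation)."""
    return 10 ** (-loss_db / 10)


def sample_cellular_gain(distance, rng):
    """Draw one random realization of the user-to-BS linear gain."""
    fading = max(rng.expovariate(1.0), 1e-12)  # Exp(1); guard against log10(0)
    shadowing_db = rng.gauss(0.0, 3.0)         # assumption: 3 is the std dev
    return loss_db_to_gain(cellular_path_loss_db(distance, fading, shadowing_db))
```

Passing a seeded `random.Random` instance as `rng` makes a simulation run reproducible.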
In Figure 3, we present the relationship between the system sum rate of all users, including CUs as well as D2D pairs, and the number of users. We recall that in our system the number of CUs equals the number of D2D pairs, $N = M$. As can be seen clearly in the figure, RHSTLC outperforms all other stable matchings, achieving 20.03% higher system throughput than GS when $N = 30$. Though slightly below RHSTLC, HSTLC still outperforms ERCS and the original GS algorithm. According to this result, our proposed algorithms are conducive to the enhancement of system throughput, even though they are designed solely to improve the sum rate of D2D pairs. Figure 4 illustrates the sum rate of D2D pairs versus the number of D2D pairs or CUs. Because more D2D pairs participate in cabals, the effect is strong and clear: that is, RHSTLC outperforms the other stable matching algorithms. We can see that the improvement in the sum rate of D2D pairs is larger than the improvement in the throughput of the whole system, which means that, on the other side, the sum rate of CUs is reduced. This phenomenon is understandable, since D2D pairs and CUs compete for communication resources. For comparison, we plot the sum rate of CUs in Figure 5. It can be clearly seen that the order of the curves is reversed compared with Figure 4, which verifies the above observation. With D2D pairs falsifying their preference lists and extracting more communication resources, the performance of cellular users degrades.
In Figure 6, we show how many D2D pairs on average get their favorite CU, that is, the one that ranks first in their preference lists, how many get their second favorite one, and so on. We set the cell range to 300 m and the number of users to $N = 20$ in the simulation; 23.08%, 18.03%, and 13.75% of the total D2D pairs get channel reuse granted by their favorite CUs in RHSTLC, HSTLC, and ERCS, respectively. Generally speaking, D2D pairs in RHSTLC are most likely to get partners that rank high in their preference lists. From the simulation data, we calculate the expected values of the rank of CUs in RHSTLC, HSTLC, and ERCS, which are 6.252, 7.123, and 8.986, respectively.
We show the sum rate of D2D pairs versus cell range in Figure 7, where the simulation result is averaged over 2000 system realizations with $M = 30$ CUs. RHSTLC still achieves a higher sum rate than the other algorithms, while being slightly below the MK algorithm, which is unstable. Interestingly, the sum rate of D2D pairs increases at first but then decreases as the range continues to rise. This can be explained by the fact that when the cell range is short, as the range increases, the interference caused by CUs decreases so that the D2D pairs gain more data rate. However, when the cell range is beyond some point, more and more CUs are likely to refuse D2D pairs' proposals due to their own SINR constraints; therefore, fewer D2D pairs can eventually reuse the channel and the sum rate starts to fall. For the same reason, when the cell range is large, the set of eligible CUs for each D2D pair shrinks, which means the number of edges in the graph $\mathcal{G}$ shrinks as well. Even if we recurrently search for cabals, few more of them can be found, so the RHSTLC and HSTLC curves tend to converge as the cell range becomes larger. We can also observe the phenomenon in Figure 8, which shows the throughput versus cell range. The overall system performance decreases monotonically with the cell range.

Conclusion
We considered a cognitive radio system in which D2D pairs can reuse the uplink channels of CUs. To build a stable system, we adopted the GS algorithm, while allowing D2D pairs to falsify their preference lists to gain more data rate. Since the more D2D pairs join cabals, the more they gain, we developed algorithms to search for cabals that are as large as possible. We computed and compared the computational complexity of all the cabal finding algorithms discussed and proved the Pareto optimality of RHSTLC. In the simulation results, among all stable matchings, the one obtained by RHSTLC shows superiority over the others. Also, the D2D pairs following RHSTLC are more likely to be matched to the CUs that rank as high as possible in the D2D pairs' preference lists.

Figure 3: System throughput comparison among different algorithms versus user size.

Figure 5: Comparison of cellular users' sum rate among different algorithms versus user size.

Figure 8: System throughput comparison among different algorithms versus cell range.