Playing Radio Resource Management Games in Dense Wireless 5G Networks

This paper considers the problem of providing an efficient and flexible tool for interference mitigation in ultradense heterogeneous cellular 5G networks. Several game-theoretic approaches are studied, focusing on noncooperative games, where each base station tries to maximize its own payoff. An analysis of the backhaul requirements of the investigated approaches is carried out, and a mechanism for reducing these requirements is proposed. Moreover, improvements in terms of energy use optimization are proposed to further increase the system gains. The presented simulation results for a detailed ultradense 5G wireless system show that the discussed game-theoretic approaches are very promising solutions for interference mitigation, outperforming the algorithm proposed for LTE-Advanced in terms of achieved spectral efficiency. Finally, it is shown that the introduction of energy-efficient and backhaul-optimized operation does not significantly degrade the performance achieved with the considered approaches.


Introduction
Whereas increased signal coverage, throughput guarantees for efficient service delivery, and highly effective spectrum utilization were regarded as the main goals of the second, third, and fourth generations of cellular systems [1,2], the integration of various networks combined with energy efficiency is foreseen as the key factor in the further development of wireless networks. Higher internetwork integration results in a greater exchange of steering/control data, which drives traffic growth in backhaul and core networks. At the same time, energy efficiency in future wireless networks has become a significant research area in recent years. It is of high interest to mobile network operators (MNOs), since sophisticated solutions can reduce the overall energy consumption and thereby the operational expenditures [2][3][4][5].
1.1. Related Work. In general, the minimization of energy consumption has been considered in various scenarios, including cellular networks (where optimization across the access network [6,7], backhauling, and core networks [8,9] can be considered), noncellular wireless networks [10][11][12], and wired networks (including optical networks [13][14][15]). Various solutions have been proposed, targeting different aspects of communications and networking, such as advanced radio resource and interference management enabling throughput enhancement or transmit power reduction (e.g., [16][17][18]), switching selected network nodes (e.g., base stations, optical modules [19][20][21]) into sleep mode depending on the traffic requirements, or energy-efficient routing (e.g., [22,23]). One can observe that a holistic view of all network elements is necessary in order to assess the real gains that can be achieved by the application of selected energy-aware solutions. Indeed, the benefits observed in one part of the network as a result of applying a selected algorithm may well be lost through an increase of energy consumption in another part. In other words, it is important to analyze as wide a spectrum of aspects as possible during the development of any new energy-efficient solution.
1.2. Scope and Novelty. In this work, we concentrate on the application of advanced radio resource and interference management algorithms for dense urban wireless networks (access network) which maximize the total cell throughput and optimize the energy usage of the base stations. The detailed and accurate simulation scenario proposed in the EU METIS 2020 project for such an environment has been selected in order to keep the simulations close to reality [24]. The algorithms for interference coordination proposed for 4G networks (known as enhanced intercell interference coordination (eICIC) [25][26][27]) have been compared with our solutions based on the application of game-theoretic tools. Let us note that game theory has already been considered many times as a valuable tool in the context of cellular networks. In our work, we analyze the effectiveness of the proposed game-theoretic tools from the point of view of their practical implementation. Thus, we not only discuss the convergence time (which results in a delay in the system) but also analyze how the traffic observed in the backhaul and core networks will be affected by the application of our algorithms. The novel aspects covered by this paper are the following:
(a) Provision of new solutions for radio resource and interference management in dense urban 5G networks that maximize cell throughput while minimizing overall energy consumption, taking into account the traffic increase in the backhaul network
(b) Summary and definitions of various game-theoretic tools that can be used for achieving this goal and their comparison with the eICIC technique (please note that scenarios with and without network node coordination have been considered)
(c) Analysis of the influence of applying the proposed tools on backhaul traffic and the potential increase of the energy consumed in that part of the network
Let us stress that the main aim of this paper is to compare different slow-time-scale interference management mechanisms that can be considered as candidates for future 5G networks, assuming an identical simulation setup. However, we do not intend to use identical game and auction models for every case, as different definitions can be assumed depending on the game type and optimization criteria.
The paper is organized as follows. First, the system model considered for experimentation is discussed in Section 2. Once it is presented, the proposal for energy-efficient gaming in such a scenario is provided. Then, Section 3 presents detailed game definitions together with suggested ways of achieving the equilibrium expected for each game. The analysis of backhaul traffic, which is often a heavy burden of new sophisticated algorithms such as coordinated multipoint (CoMP), is provided in Section 4, while the achieved computer simulation results are shown and described in Section 5. Finally, the paper is concluded.

System Model of a Dense Network
In order to analyze the efficiency of the proposed algorithms in reducing energy consumption in the context of next-generation networks, a dense urban scenario has been selected for consideration, where multiple outdoor user terminals communicate with macro or micro base stations deployed on buildings and managed by an MNO.

Detailed Model Characterization.
In this work, the use case developed in the METIS 2020 project [24] has been considered, where a Manhattan-like dense urban wireless network is modeled. It is assumed that mobile users utilize the orthogonal frequency division multiple access (OFDMA) technology with the frequency reuse factor equal to 1 to communicate with one of the sectoral antennas deployed by the MNO on surrounding buildings, as shown in Figure 1. One can observe that $N = 21$ antennas are installed in the considered area, with three main 120° sectoral macro antennas mounted on the central building and 9 pairs of 120° sectoral micro antennas deployed close to the neighboring buildings. While the macro base stations are mounted on rooftops (5-meter-high masts have been used), the micro base stations' antennas are located at a height of 10 m, 3 m away from the building wall. Each building contains six floors (3.5 m high each) and has a square base of 120 m × 120 m. The street width, including both sidewalks and lanes, is set to 12 m. The number of full-buffer, static, and uniformly distributed user equipment devices (UEs) in the considered area was set to $J = 520$. This means that the average number of users served by one base station is $\hat{J} = J/N = 520/21 \approx 24.8$.
The bandwidth available for each of the base stations (BSs) is divided into time-frequency blocks, with the BSs transmitting in each of the blocks using one of the selected power-per-subcarrier levels. These levels are selected out of the set $\mathcal{P} = \{P_{\mathrm{low}}, P_{\mathrm{high}}\}$, as shown in Figure 2.
Let us now describe the scenario using mathematical formalism. The set of all users is represented hereafter as $\mathcal{J}$, with $\mathcal{J}_n$ denoting the set of users served by BS $n$. At each time interval, each BS divides the available resources among up to 10 UEs according to the proportional fairness (PF) rule. Let $|h^{(k)}_{n,j}|^2$ denote the channel gain between the $n$th BS and the $j$th UE on subcarrier $k$ ($h^{(k)}_{n,j} \in \mathbb{C}$), and let $\sigma_j^2$ be the noise variance at receiver $j$ (we assume that each mobile user possesses the knowledge on channel attenuation based on the observed pilot signals from all base stations; this information can be then delivered to the serving base station). The signal-to-interference-plus-noise ratio (SINR) for UE $j$ served by BS $n$ on subcarrier $k$ is given as follows:

$$\gamma^{(k)}_{n,j} = \frac{P^{(k)}_{n}\,|h^{(k)}_{n,j}|^2}{\sigma_j^2 + \sum_{m \in \mathcal{N},\, m \neq n} P^{(k)}_{m}\,|h^{(k)}_{m,j}|^2}, \quad (1)$$

where $P^{(k)}_{n}$ denotes the transmit power of BS $n$ on subcarrier $k$ and $\mathcal{N}$ represents the set of base stations (players), macro and micro alike.
In our work, we assume that all BSs are interested in achieving at least the minimum throughput $R_{\min}$ and minimizing the operating costs expressed by the total consumed energy. The throughput achieved by the $j$th UE served by BS $n$ can be then calculated as

$$R_{n,j}(t) = \sum_{k} x^{(k)}_{n,j}(t)\, r^{(k)}_{n,j}. \quad (2)$$

Here $x^{(k)}_{n,j}(t)$ represents the allocation of subcarrier $k$ at time $t$ to UE $j$ by BS $n$, with $x^{(k)}_{n,j}(t) \in \{0, 1\}$, and the rate of UE $j$ served by BS $n$ on subcarrier $k$, $r^{(k)}_{n,j}$, is calculated depending on the transmitted transport block size determined as specified in the 3GPP Long Term Evolution (LTE) specification [28].
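The SINR definition above can be illustrated with a short numerical sketch. The function name `sinr` and all power, gain, and noise values below are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from the simulated scenario.

```python
import math


def sinr(tx_power_w, gains, serving, noise_w):
    """SINR of one UE on one subcarrier.

    tx_power_w[m] is the transmit power of BS m on the subcarrier (W),
    gains[m] is the channel gain |h|^2 between BS m and the UE,
    serving is the index of the serving BS, noise_w is the noise variance (W).
    """
    signal = tx_power_w[serving] * gains[serving]
    # Every BS other than the serving one contributes interference.
    interference = sum(
        p * g
        for m, (p, g) in enumerate(zip(tx_power_w, gains))
        if m != serving
    )
    return signal / (noise_w + interference)


# Hypothetical example: three BSs, the UE is served by BS 0.
powers = [1.0, 1.0, 0.1]
gains = [1e-9, 1e-10, 1e-10]
noise = 1e-11

value = sinr(powers, gains, serving=0, noise_w=noise)
value_db = 10 * math.log10(value)
print(f"SINR = {value:.3f} ({value_db:.2f} dB)")
```

Lowering the power of an interfering BS on a given subcarrier (the third BS in this example) directly raises the SINR seen by UEs of neighboring cells, which is the lever used by the power allocation strategies considered later.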

Backhaul Connection.
All BSs can exchange control and signaling information using a dedicated interface. For further backhaul traffic analysis (see Section 4), it is arbitrarily assumed that each micro base station is fiber-connected directly to the macro base station, as illustrated in Figure 3. For example, fiber-to-the-antenna together with radio-over-fiber technology can act as the technical enabler for the realization of such a backhaul network. Please note, however, that other realizations of backhaul connections are possible (wired but nonfiber, wireless, etc.); such a detailed analysis is out of the scope of this paper. Let us note that backhaul load optimization is not directly included in any of the compared interference management solutions, with backhauling considered only as a contribution to overall energy consumption.

Energy-Efficient Gaming in Dense Networks.
Spectrum and energy efficiency are the key figures of merit in the context of next-generation networks, especially in the case of dense user (UE) deployment. Referring to the former requirement, in the considered scenario, all base stations share the same frequency spectrum; that is, a full frequency reuse case is implemented together with advanced interference management solutions among base stations and served users. In the context of contemporary cellular networks, such techniques as intercell interference coordination (ICIC) in LTE or eICIC with almost-blank subframes (ABS) in Long Term Evolution-Advanced (LTE-A) have been proposed [25][26][27]. We would like to concentrate on a solution that will achieve at least the same efficiency level as these current solutions. Thus, among various proposals for such radio resource management (RRM), we select game-theoretic tools where each wireless network entity is treated as a player in a specified game. These tools have already been shown to be effective in RRM in various scenarios, for example, [29][30][31][32][33][34].
In this work, we have decided to utilize game-theoretic tools in a practical scenario but focusing on achieving both spectrum and energy efficiency.In particular, the following assumptions have been made: first, the game is played among base stations (mobile users do not participate actively in that game) in a noncooperative way; second, the same set of game strategies is defined for each player; third, the game and players' payoffs are defined in a way that promotes energy efficiency while achieving throughput/rate better than or at least comparable to eICIC solutions.Following the approach known from LTE-A RRM algorithms, we assume that energy efficiency can be achieved by the adaptive assignment of power levels and radio resource blocks among users, depending on their position and effective signal-to-noise ratio.Taking into account the assumptions, 16 transmission strategies have been identified.They define the transmit power levels on certain frequency subbands that might be selected by BSs, as illustrated in Figure 4.
First, the selection of the given strategy will be made for the period of one frame, which consists of 10 subframes (each corresponding to one transmission time interval (TTI)) of 1 ms each; thus the potential change of the power allocation scheme in the time-frequency plane will be made every 10 ms. In our proposal, the considered frequency band of $N_{\mathrm{RB}} = 100$ resource blocks (RBs) (equivalent to 20 MHz) is further divided into three subbands of 33, 33, and 34 RBs, respectively. In other words, the bandwidth of the first two subbands equals 33 ⋅ 200 kHz = 6.6 MHz and that of the third equals 34 ⋅ 200 kHz = 6.8 MHz. Each player (macro or micro base station) can transmit with either the base or the reduced transmit power, denoted as $P_{\mathrm{TX,high}}$ and $P_{\mathrm{TX,low}}$, respectively. Sixteen power allocation schemes over the 10 TTIs have been selected, as represented in Figure 4. It has been arbitrarily assumed that the reduced transmit power is 10 dB lower than the base transmit power in the entire 20 MHz frequency band. Furthermore, the total transmit power is set to 46 dBm for a macro base station (equivalently 26 dBm per RB) and to 33 dBm for a micro base station (equivalently 13 dBm per RB).
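The per-RB power levels and subband widths quoted above follow from simple decibel arithmetic. The sketch below reproduces them; the helper names are hypothetical, and an equal split of the total power across all 100 RBs is assumed.

```python
import math


def per_rb_dbm(total_dbm, n_rb):
    """Power per resource block when the total power is split equally."""
    return total_dbm - 10 * math.log10(n_rb)


def dbm_to_mw(dbm):
    """Convert a power level in dBm to milliwatts."""
    return 10 ** (dbm / 10)


N_RB = 100
RB_WIDTH_KHZ = 200                      # 20 MHz / 100 RBs
SUBBANDS_RB = [33, 33, 34]

macro_rb = per_rb_dbm(46, N_RB)         # base level of a macro BS, per RB
micro_rb = per_rb_dbm(33, N_RB)         # base level of a micro BS, per RB
macro_rb_reduced = macro_rb - 10        # reduced level is 10 dB below base

# A 10 dB reduction corresponds to one tenth of the power in linear scale.
linear_ratio = dbm_to_mw(macro_rb_reduced) / dbm_to_mw(macro_rb)

subband_khz = [n * RB_WIDTH_KHZ for n in SUBBANDS_RB]

print(macro_rb, micro_rb, macro_rb_reduced, linear_ratio, subband_khz)
```

Note that the difference between two power levels is expressed in dB, whereas the absolute levels themselves are expressed in dBm.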
One can observe that in such an approach the energy efficiency will be achieved by the advanced assignment of selected strategies among playing base stations. However, as will be discussed later, the application of such an approach results in a high need for distributing detailed and accurate context information between network nodes (players). For example, the exchange of the exact values of signal-to-noise ratios between the base station and each user could be necessary. In our work, we will also discuss this practical aspect of our solution; that is, a backhaul traffic analysis will be provided, as the inclusion of this part of the network is crucial for a fair overall analysis of energy efficiency.
In order to assess the energy efficiency of the proposed solutions, we express this variable arbitrarily in terms of the average number of bits per Hz per Joule per cell.The second metric used for the assessment of the algorithm quality is the total power consumed in a given period of time.

Energy Efficiency Game Definition
3.1. Game Definition. The problem of intercell interference mitigation can be described using the following normal-form game definition:

$$\mathcal{G} = \big(\mathcal{N}, \mathcal{A}, \{u_n\}_{n \in \mathcal{N}}\big), \quad (3)$$

where $\mathcal{N}$ is the set of players (BSs), $\mathcal{A}$ is the set of actions, and $u_n$ is the payoff of BS $n$. Let us assume that, at each time instant $t$, BS $n$ selects its action from the finite set $\mathcal{A}$, following the probability distribution

$$\pi_n(t) = \big(\pi_{n,a^{(1)}}(t), \pi_{n,a^{(2)}}(t), \ldots, \pi_{n,a^{(|\mathcal{A}|)}}(t)\big), \quad (4)$$

where $\pi_{n,a^{(i)}}(t)$ denotes the probability that BS $n$ plays action $a^{(i)}$. As a result of playing one of the strategies, the $n$th base station will receive a payoff, denoted hereafter as $u_n(a^{(i)}_n)$. Such a payoff can be defined as, for example, the total throughput observed by the base station reduced by the costs that this base station has to pay for playing this strategy (e.g., energy consumption). Thus, in general, the aim of each BS is to maximize its payoff with or without cooperation with other BSs, achieving the so-called game equilibrium. In what follows, we will extend and adapt this simple model according to the requirements of the selected equilibrium.

Selected Equilibria
3.2.1. Nash Equilibrium. One of the most popular concepts is the one known as the Nash equilibrium. When the base station plays the Nash equilibrium strategy denoted as $a_n^*$ [35,36], the following relation will hold:

$$u_n(a_n^*, a_{-n}) \geq u_n(a_n, a_{-n}), \quad \forall a_n \in \mathcal{A},\ \forall n \in \mathcal{N}, \quad (5)$$

where $a_n$ represents a possible strategy of the $n$th BS, whereas $a_{-n}$ defines the set of strategies chosen by the other BSs; that is, $a_{-n} = \{a_m\}$, $m \neq n$, and $\mathcal{N}$ is the BSs set of cardinality $N$. The idea behind the Nash equilibrium is to find the point of an achievable rate region (which is related to the selection of one of the available strategies) from which no player can increase its utility (total payoff) by unilaterally changing its own strategy while the other players keep theirs.
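As a minimal sketch of the Nash condition, the snippet below enumerates the pure-strategy Nash equilibria of a toy two-BS game by checking every unilateral deviation. The payoff table is hypothetical (action 0 stands for the reduced power level, action 1 for the base power level) and is not derived from the simulated scenario; the function name is likewise an assumption.

```python
from itertools import product


def pure_nash_equilibria(payoffs, n_actions):
    """Return all action profiles from which no single player gains by deviating.

    payoffs maps an action profile (tuple of action indices) to a tuple
    with one payoff per player; n_actions[n] is the number of actions of player n.
    """
    equilibria = []
    for profile in product(*(range(k) for k in n_actions)):
        stable = True
        for n, k in enumerate(n_actions):
            current = payoffs[profile][n]
            for alt in range(k):
                deviation = profile[:n] + (alt,) + profile[n + 1:]
                if payoffs[deviation][n] > current:
                    stable = False  # player n would profit from switching to alt
        if stable:
            equilibria.append(profile)
    return equilibria


# Hypothetical rates (payoffs) of two interfering BSs.
payoffs = {
    (0, 0): (3, 3),  # both use reduced power: little mutual interference
    (1, 0): (4, 1),  # BS 0 uses base power and gains at the expense of BS 1
    (0, 1): (1, 4),
    (1, 1): (2, 2),  # both use base power: strong mutual interference
}

print(pure_nash_equilibria(payoffs, (2, 2)))
```

In this toy example the only equilibrium is the profile in which both BSs use the base power, although both would be better off with reduced power. This illustrates why a purely noncooperative equilibrium need not maximize the overall system performance.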
Payoff Definition. The achievement of the Nash equilibrium will be considered as an example of a noncooperative game, where base stations optimize their own payoffs and do not consider the overall system performance. Moreover, we assume that no additional information needs to be sent between the players. In such a case, the utility (or, better, the payoff) of BS $n$ will be defined as the rate observed by this player:

$$u_n(a_n, a_{-n}) = R_n(a_n, a_{-n}). \quad (6)$$

Boltzmann-Gibbs Distribution. An immediate question arises: how can the achievement of the Nash equilibrium be guaranteed? One of the well-known learning approaches, effective in terms of the speed of convergence, is to use logit learning, also known as Boltzmann-Gibbs learning. The equilibrium achieved following the Boltzmann-Gibbs rule is a special case of the $\epsilon$-Nash equilibrium that can be learned in a fully distributed manner [37]. The Boltzmann-Gibbs distribution can be described using

$$\beta_{n,t}\big(u_{n,t}(a_n, a_{-n})\big) = \frac{\exp\big(\kappa_n u_{n,t}(a_n, a_{-n})\big)}{\sum_{a' \in \mathcal{A}_n} \exp\big(\kappa_n u_{n,t}(a', a_{-n})\big)}, \quad (7)$$

where, for player $n$, $\beta_{n,t}(u_{n,t}(a_n, a_{-n}))$ is the probability of choosing action $a_n$ at time $t$, $\mathcal{A}_n$ is the set of available actions, and, for $\kappa_n = \kappa$, $\forall n$, the value $1/\kappa$ is interpreted as the temperature parameter that impacts the convergence speed.
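The Boltzmann-Gibbs (logit) rule is a softmax over the utilities observed for each action. The following sketch, with hypothetical utility values and a hypothetical function name, shows how the inverse temperature (called `kappa` here as an assumed name) trades exploration against exploitation.

```python
import math
import random


def boltzmann_gibbs(utilities, kappa):
    """Probability of each action under the Boltzmann-Gibbs (logit) rule.

    utilities[i] is the utility estimated for action i;
    1/kappa plays the role of the temperature.
    """
    top = max(utilities)
    # Subtracting the maximum does not change the result but avoids overflow.
    weights = [math.exp(kappa * (u - top)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


utilities = [1.0, 2.0, 3.0]              # hypothetical per-action utilities

uniform = boltzmann_gibbs(utilities, kappa=0.0)   # infinite temperature
soft = boltzmann_gibbs(utilities, kappa=1.0)
greedy = boltzmann_gibbs(utilities, kappa=50.0)   # very low temperature

# Draw the next action of a BS according to the distribution.
rng = random.Random(0)
action = rng.choices(range(len(utilities)), weights=soft)[0]
print(uniform, soft, greedy, action)
```

A high temperature (small `kappa`) keeps all actions nearly equally likely, whereas a low temperature concentrates the probability mass on the action with the highest utility, which speeds up convergence at the cost of less exploration.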
Strategy Identification.In our work we are comparing various equilibria.In order to ensure the clarity and readability in each section, we provide a unique identifier used further on in the text with reference to a certain scenario.Hereafter, the Nash equilibrium and the scenario described in this subsection will be referred to as NE.

Correlated Equilibrium.
Let us now focus on the idea of the correlated equilibrium, where, in a nutshell, the joint probability of performing selected actions by players is taken into account [38]. Contrary to the Nash equilibrium, the achievement of the correlated equilibrium assumes in our case active information exchange among players. In general, at each time instant, each BS plays one of the $|\mathcal{A}|$ strategies $a^{(i)}$, $1 \leq i \leq |\mathcal{A}|$. Therefore, assuming that set $\mathcal{A}$ is discrete and finite, at least one equilibrium exists that represents the system state when a player cannot improve its payoff (utility) when other players do not change their behavior. Such a state is known as the correlated equilibrium (CE), which is defined as follows:

$$\sum_{a_{-n}} p(a_n^*, a_{-n}) \big[ u_n(a_n^*, a_{-n}) - u_n(a_n, a_{-n}) \big] \geq 0, \quad \forall a_n, a_n^* \in \mathcal{A},\ \forall n \in \mathcal{N}. \quad (8)$$

In (8), $p(a_n^*, a_{-n})$ is the probability of playing strategy $a_n^*$ in the case when other BSs select their own strategies $a_m$, $m \neq n$. Probability distribution $p$ is a joint probability mass function over the different combinations of BS strategies. As in [29], the inequality in the correlated equilibrium definition means that when the recommendation for BS $n$ is to choose action $a_n^*$, then choosing any other action instead of $a_n^*$ cannot result in a higher expected payoff for this BS.
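Payoff Definition. Let us formulate the set of actions selected by all BSs as $a = \{a_n \cup a_{-n}\}$, where $a_{-n}$ is the set of actions selected by all BSs other than $n$. We can introduce the rate-dependent Vickrey-Clarke-Groves (VCG) [30] auction mechanism, where each BS aims to maximize utility $u_n$, $\forall n$, defined as

$$u_n(a) = R_n(a) - \zeta_n(a), \quad (9)$$

where $\zeta_n$ denotes the cost (rate loss) introduced by BS $n$ to all other BSs, which is evaluated as follows:

$$\zeta_n(a) = \sum_{m \neq n} R_m(a_{-n}) - \sum_{m \neq n} R_m(a). \quad (10)$$

The use of the VCG auction mechanism based on rate leads to the maximization of the overall performance of the system by exploiting cooperation between nodes. However, in modern wireless systems, UEs are more interested in fulfilling their quality of service (QoS) requirements than in maximizing their rate. Therefore, as an alternative, one can consider a satisfaction-based VCG auction mechanism, with satisfaction $v_n$ defined as in (19) and (21), that can be formulated as

$$u_n(a) = v_n(a) - \tilde{\zeta}_n(a), \quad (11)$$

where $\tilde{\zeta}_n$ denotes the satisfaction-based cost evaluated as follows:

$$\tilde{\zeta}_n(a) = \sum_{m \neq n} v_m(a_{-n}) - \sum_{m \neq n} v_m(a). \quad (12)$$

Regret-Matching Learning. To achieve CE, a centralized approach can be applied, which is, however, very complex [30]. According to [39], the procedure of regret-matching learning can be used to iteratively achieve CE. In [29][30][31], a modified regret-matching learning algorithm is proposed to learn in a distributed fashion how to achieve the correlated equilibrium by solving the VCG auction, which aims at minimizing the regret of selecting a certain action. Regret $\mathrm{REG}^t_n$ of BS $n$ at time $t$ for playing action $a^{(i)}_n$ instead of other actions is given as follows:

$$\mathrm{REG}^t_n\big(a^{(i)}_n, a^{(-i)}_n\big) = \max\big\{ D^t_n\big(a^{(i)}_n, a^{(-i)}_n\big),\ 0 \big\}, \quad (13)$$

where

$$D^t_n\big(a^{(i)}_n, a^{(-i)}_n\big) = \frac{1}{t} \sum_{\tau \leq t} \Big[ u^{\tau}_n\big(a^{(-i)}_n, a_{-n}\big) - u^{\tau}_n\big(a^{(i)}_n, a_{-n}\big) \Big], \quad (14)$$

and $u^t_n(a^{(\cdot)}_n, a_{-n})$ is the utility at time $t$. $D^t_n(a^{(i)}_n, a^{(-i)}_n)$ is the average payoff gain that BS $n$ would have obtained if it had played another action instead of $a^{(i)}_n$ every time in the past. Thus, a positive value of $D^t_n(a^{(i)}_n, a^{(-i)}_n)$ means that BS $n$ would have obtained a higher average payoff when playing a different action than $a^{(i)}_n$. Finally, given the regrets for all $|\mathcal{A}|$ actions, the probability of BS $n$ selecting strategy $a^{(-i)}_n$ can be formulated as follows:

$$\pi^{t+1}_n\big(a^{(-i)}_n\big) = \frac{1}{\mu}\, \mathrm{REG}^t_n\big(a^{(i)}_n, a^{(-i)}_n\big), \qquad \pi^{t+1}_n\big(a^{(i)}_n\big) = 1 - \sum_{a^{(-i)}_n \neq a^{(i)}_n} \pi^{t+1}_n\big(a^{(-i)}_n\big), \quad (15)$$

where $\mu$ is a normalization constant chosen large enough that all probabilities in (15) are nonnegative.

Strategy Identification. The correlated equilibrium achieved through the application of the regret-matching algorithms described above will hereafter be denoted as CE pure.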
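To make the regret-matching step concrete, the sketch below implements the classic regret-matching update in the form described by Hart and Mas-Colell, which is an assumption here: the modified algorithm of [29][30][31] may differ in its details. The function name, the payoff history, and the normalization constant `mu` are all hypothetical.

```python
def regret_matching_probs(history, played, n_actions, mu):
    """Next-round action probabilities of one BS under regret matching.

    history is a list of (action_played, payoffs) pairs, one per past round,
    where payoffs[a] is the payoff the BS would have received for action a
    given what the other BSs did in that round.
    played is the action used in the most recent round.
    mu is a normalization constant, assumed large enough to keep
    the resulting probabilities valid.
    """
    t = len(history)
    probs = [0.0] * n_actions
    for alt in range(n_actions):
        if alt == played:
            continue
        # Average gain from having used alt in every round where played was used.
        gain = sum(pay[alt] - pay[played] for act, pay in history if act == played) / t
        regret = max(gain, 0.0)
        probs[alt] = regret / mu
    # The remaining probability mass stays with the current action.
    probs[played] = 1.0 - sum(probs)
    return probs


# Hypothetical history of three rounds with three available actions.
history = [
    (0, [1.0, 3.0, 0.5]),
    (0, [1.0, 2.0, 0.0]),
    (1, [2.0, 1.0, 1.0]),
]

probs = regret_matching_probs(history, played=0, n_actions=3, mu=4.0)
print(probs)
```

Only actions with a positive regret receive a nonzero switching probability, so a BS that has no better alternative keeps playing its current action.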

Correlated Equilibrium with Reduced Complexity.
One of the main burdens related to the practical application of the correlated equilibrium concept described in the previous subsection is the need for a fast exchange of detailed information about channel states for each mobile user or at least payoffs observed by each base station (BS) (player).The exchange of accurate data will result in a high traffic increase observed in the backhaul network, as will be discussed in detail in Section 4. Thus, let us now introduce the concept of achieving equilibrium with the regret-matching algorithm where the complexity is reduced.In this approach, the utility functions as well as the whole regret-matching algorithm and VCG auctions are kept unchanged, except for the fact that in each iteration only the payoff of the strategy selected by each BS is circulated in the network among other players instead of the whole table of payoffs.
Strategy Identification.In order to distinguish this solution from the application of the pure correlated equilibrium we denote this scenario as CE reduced .

Generalized Nash Equilibrium: Satisfaction Equilibrium.
The achievement of the satisfaction equilibrium [40], representing a specific case of the so-called generalized Nash equilibrium, is an example of the goal of a noncooperative game with assumed information exchange. The process of learning the satisfaction equilibrium (SE) can be described using the elements of the following game:

$$\hat{\mathcal{G}} = \big(\mathcal{N}, \mathcal{A}, \{f_n\}_{n \in \mathcal{N}}\big), \quad (17)$$

where $\mathcal{N}$ represents the set of players (BSs), $\mathcal{A}$ denotes the set of available actions, and $f_n$ is the satisfaction correspondence of player $n$, which indicates whether the player is satisfied. The correspondence is defined as $f_n(a_{-n}) = \{a_n \in \mathcal{A} : u_n(a_n, a_{-n}) \geq \Gamma_n\}$, with $u_n(a_n, a_{-n})$ representing the player's observed utility when playing action $a_n$ and $\Gamma_n$ denoting the minimum utility level required by player $n$. A state of the game in which all players satisfy their individual constraints simultaneously is referred to as the satisfaction equilibrium (SE), which is defined as follows [40].

Mobile Information Systems
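Action profile $a^+$ is an equilibrium for the game $\hat{\mathcal{G}}$ if the action of every player belongs to that player's satisfaction correspondence:

$$\forall n \in \mathcal{N}, \quad a_n^+ \in f_n\big(a_{-n}^+\big). \quad (18)$$

Satisfaction Correspondence and Payoff Definition. The existence of SE depends mainly on the set of constraints imposed on the utility function, with the feasibility of the constraints as a necessary condition.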
For the considered scenario, where BSs act as game players, the satisfaction correspondence can be defined in relation to the satisfaction level of all users served by the BS, as shown below: where  , (  ,  − ) is the satisfaction of UE  when BS selects action   .Individual UE satisfaction can be defined using the binary representation: Alternatively, one can consider a relaxed version of individual UE satisfaction using the sigmoid function: Please note that compared to the case of the correlated equilibrium the payoff of each player is strictly defined by the satisfaction correspondence and does not refer explicitly to the rate, as it is the case with the correlated equilibrium.
Learning Satisfaction Equilibrium. We assume that the game players undertake actions in consecutive time intervals, with only one action selected per interval. At each interval, a player also observes whether it is satisfied or not. The selection of actions in each time interval is done based on the probability distribution

$$\pi_n(t) = \big(\pi_{n,a^{(1)}}(t), \pi_{n,a^{(2)}}(t), \ldots, \pi_{n,a^{(|\mathcal{A}|)}}(t)\big), \quad (22)$$

which is known as the probability distribution of exploration [40]. Under such assumptions, SE can be found using the behavioral rule, which states that the next action taken by player $n$ is as follows:

$$a_n(t) = \begin{cases} a_n(t-1), & \text{if player } n \text{ was satisfied at time } t-1, \\ a_n(t) \sim \pi_n(t), & \text{otherwise}. \end{cases} \quad (23)$$

The choice of probability distribution $\pi_n(t)$ may impact the convergence time and should also allow for the exploration of all actions (thus all actions should have nonzero probability).
A simple choice may be to use the uniform probability distribution $\pi_{n,a^{(i)}}(t) = 1/|\mathcal{A}|$. On the other hand, more sophisticated probability distribution update methods may be used that increase the convergence speed, for example, based on the number of times an action has been selected previously [40].
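The behavioral rule described above (keep the current action while satisfied, otherwise draw a new one from the exploration distribution) can be sketched in a few lines. The function names are hypothetical, and the uniform exploration distribution over 16 strategies is the simple choice mentioned in the text.

```python
import random


def uniform_exploration(n_actions):
    """Uniform probability distribution of exploration over all actions."""
    return [1.0 / n_actions] * n_actions


def next_action(prev_action, satisfied, exploration_probs, rng):
    """Behavioral rule for learning the satisfaction equilibrium.

    A satisfied player repeats its previous action; an unsatisfied player
    draws a new action from the exploration distribution.
    """
    if satisfied:
        return prev_action
    actions = range(len(exploration_probs))
    return rng.choices(actions, weights=exploration_probs)[0]


rng = random.Random(1)
probs = uniform_exploration(16)          # 16 transmission strategies

kept = next_action(prev_action=5, satisfied=True, exploration_probs=probs, rng=rng)
explored = next_action(prev_action=5, satisfied=False, exploration_probs=probs, rng=rng)
print(kept, explored)
```

Because a satisfied player never changes its action, the dynamics stop as soon as all players are satisfied simultaneously, which is exactly the equilibrium condition.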
The main problem with the learning solution presented above is that it neglects the utilities observed by players in the process of updating the probability distribution.An alternative approach has been proposed in [41], where the decentralized optimization is performed using the modified behavioral rule that accounts for observed utilities.This approach, known as the satisfaction equilibrium search algorithm (SESA), utilizes the knowledge of individual utilities to increase the probability of selecting actions that provide a higher payoff.
Strategy Identification.Solutions based on the use of the satisfaction equilibrium will be denoted hereafter as SE binary and SE sigmoid , where the former describes a case where the satisfaction is represented by a binary function as in (20) and the latter identifies a case where the sigmoid function defined in ( 21) is used.

Energy Efficiency Factor in Game Definitions.
So far, we have discussed strategies that guarantee rate maximization through appropriate resource and interference management. Let us note that the achievement of the satisfaction equilibrium can be interpreted as a game where energy efficiency is taken into account: once a player is satisfied, it will not be considered in the ongoing process of resource allocation for other players. This means that the wastage of redundantly assigned resources will be minimized, thus leading to better energy utilization in the system. However, the solutions presented in the previous section that utilize the correlated equilibrium have to be modified in order to lead to an overall energy efficiency improvement in the system. Thus, we propose including the cost of energy consumption in the game definition, and in particular we propose modifying the payoff functions.
Payoff Definition.In order to achieve better energy utilization by the base stations while keeping the average cell rate unchanged, we propose defining the payoff of the th BS as follows: where  ()  stands for the total transmit power of the th BS and ζ denotes the cost (rate loss) introduced by BS  to all other BSs, which is evaluated as follows: Similarly, to account for the energy utilization factor when considering the satisfaction equilibrium, ( 19) is modified as follows: with the satisfaction of UE   , (  ,  − ) calculated using (21).
Strategy Identification.The above concept has been applied to the following strategies: CE pure , CE reduced , and SE sigmoid .In order to uniquely distinguish the new solutions, we use the subscript (⋅)  ; that is, the following identifiers will be used: CE pure, , CE reduced, , and SE sigmoid, .

Backhaul Traffic Reduction.
As has been discussed in Section 3.2.3, the application of each of the solutions described in this section requires the exchange of a relatively high amount of traffic.Although its influence on the energy consumed by the backhaul network will be discussed later, it is obvious that any reduction of the amount of control information is beneficial.Thus, following the solutions discussed in Section 3.3, we propose a further simplification of the algorithm by the application of the quantization procedure to the payoff values distributed among players.So far, we have assumed that each base station circulates either the full information about the channel between itself and all served users or the table of payoffs.In the strategy denoted as CE reduced, , only the values related to the selected strategy have to be delivered to other players.However, both the channel information and payoff value are double values, which have to be binary represented using, for example, 32 bits.
In order to reduce such information overhead, we propose quantizing the information about the payoff value to four bits; that is, the index of one of only sixteen representative values of the payoff can be circulated.One can observe that by the application of such an approach the backhaul traffic will be reduced at least 8 times (if the 32-bit representation is used).
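A minimal sketch of the four-bit payoff quantization is given below. The text does not specify how the sixteen representative values are chosen, so a uniform quantizer over an assumed payoff range `[lo, hi]` is used here purely for illustration; the function names and the numerical range are hypothetical.

```python
def quantize_payoff(value, lo, hi, bits=4):
    """Map a payoff to the index of the nearest of 2**bits uniform levels."""
    levels = 2 ** bits
    clipped = min(max(value, lo), hi)    # values outside the range saturate
    step = (hi - lo) / (levels - 1)
    return round((clipped - lo) / step)


def dequantize_payoff(index, lo, hi, bits=4):
    """Recover the representative payoff value from its index."""
    step = (hi - lo) / (2 ** bits - 1)
    return lo + index * step


LO, HI = 0.0, 15.0                       # assumed payoff range (arbitrary units)

index = quantize_payoff(7.4, LO, HI)
restored = dequantize_payoff(index, LO, HI)

# Only the 4-bit index is circulated instead of a 32-bit value.
reduction_factor = 32 // 4
print(index, restored, reduction_factor)
```

The receiving BS works with the representative value rather than the exact payoff, so the quantization error is bounded by half the step size within the assumed range.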
Strategies Identification.As previously, the idea of information quantization has been applied to the strategies: CE pure , CE reduced , and SE sigmoid , which are now denoted as CE pure, , CE reduced, , and SE sigmoid, .

Mixed Solution.
Finally, we have jointly applied the concepts described in Sections 3.3 and 3.4.The strategies are denoted as CE pure,, , CE reduced,, , and SE sigmoid,, .

Strategy Comparison.
In order to briefly compare the solutions described above, we gathered the concise information in Table 1.

Backhaul Traffic Analysis
One of the key problems in the practical realization of this approach is the amount of data that has to be circulated among the active players, that is, the BSs (e.g., in the correlated equilibrium scenario). This parameter is strictly related to the observed delays in data delivery to the BS; thus it is important to assess the information burden added to the backhaul network. Moreover, it is easy to predict that even solutions that are highly sophisticated in terms of the guaranteed rate or throughput will not be practically deployed if they either cause unacceptable delays in the network or require high infrastructure investments. In both cases, the application of such an algorithm will result in a direct or indirect cost increase that may not be acceptable for the MNO. As we deal with a highly realistic scenario of dense wireless networks in this paper, the key goal of this section is to discuss the technical feasibility of the considered game-theoretic solutions.
Following the assumption presented in Section 2.2, let us recall that each player has a fiber connection with another in a star topology (i.e., one or two hops are required to deliver data between any two nodes). Fiber connection is our technology of choice, since solutions like fiber-to-the-antenna (FTTA) and/or radio-over-fiber (RoF) based techniques are often of the highest interest from the MNO point of view, for example, [42][43][44]; an interested reader is encouraged to also follow the related work on the connection between the wireless and optical parts of the communications network [45][46][47]. Although we select optical fibers as a way of ensuring base station connectivity, other solutions, such as Gigabit Ethernet or wireless backhauling, can also be considered. In the use case discussed in this paper, we intentionally chose a fiber-based network as a quite mature technology that enables the achievement of high data rates in the network. Once the reference technology is selected, we need to estimate the traffic load due to the application of the proposed solutions. In our calculations, we will focus on the algorithm utilized in the correlated equilibrium case, since it is characterized by the highest needs for data exchange. However, let us again note that backhaul optimization is not part of any of the considered games and is used only for the purpose of comparing the energy consumption.
In the simplest case, every base station needs to exchange with other nodes the information about the channel characteristics $h$ to the served users and the probability distribution for each of the possible playing strategies. In the following, we fix the binary representation of each value exchanged between nodes to 32 bits. The matrix with channel information contains $N_{\mathrm{RB}} \cdot \hat{J}$ entries, where $N_{\mathrm{RB}}$ stands for the total number of RBs and $\hat{J}$ for the average number of users served by one base station (i.e., $\hat{J} = J/N$, where $J$ is the total number of users and $N$ is the total number of base stations). Assuming uniform user deployment in the considered area (i.e., $\hat{J} = 520/21 \approx 25$), the required number of bits to be transferred equals $32 \cdot \hat{J} \cdot N_{\mathrm{RB}}$ times the first-hop and second-hop multipliers, that is, $T_1 = 32\ \text{bits} \cdot 25\ \text{users} \cdot 100\ \text{RBs} \cdot 18\ (\text{first hop}) \cdot 17 \cdot 18\ (\text{second hop}) = 440.64 \cdot 10^{6}$ bits per 10 TTIs, which corresponds to approx. 44 Gbps. Additionally, in order to circulate the selected strategy (one of 16 in our case), each base station needs to send 4 bits, resulting in a total traffic of $T_2 = 1296$ bits per 10 TTIs once the same first-hop and second-hop multipliers are applied. Such a large number of bits to be exchanged would disqualify the algorithm from further consideration as impractical. Fortunately, this burden can be strongly reduced, since payoff matrices can be exchanged instead of channel information. In particular, in order to distribute the matrix of payoffs (of the size calculated as the number of strategies times the number of base stations), we need to send $T_3 = 32\ \text{bits} \cdot 16\ \text{strategies} \cdot 18\ (\text{first hop}) \cdot 18\ (\text{second hop}) = 10512$ bits per 10 TTIs. Thus, the total traffic increase observed in the backhaul network equals approx. $T = 1.1$ Mbps.
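The conversion from "bits per 10 TTIs" to a bit rate can be sketched as follows. This is only an illustrative check of the totals quoted above; it assumes a TTI of 1 ms (an LTE-style value that is not stated in this section), and the helper name `rate_bps` is hypothetical.

```python
import math

TTI_MS = 1.0  # assumption: LTE-style TTI of 1 ms; not stated in this section


def rate_bps(bits: float, n_tti: int = 10, tti_ms: float = TTI_MS) -> float:
    """Average bit rate when `bits` are sent once every `n_tti` TTIs."""
    return bits * 1000.0 / (n_tti * tti_ms)


# Totals quoted in the text, all in bits per 10 TTIs.
CHANNEL_INFO_BITS = 440.64e6  # full channel-information exchange
STRATEGY_BITS = 1296          # indices of the selected strategies
PAYOFF_BITS = 10512           # payoff matrices

full_rate = rate_bps(CHANNEL_INFO_BITS)
reduced_rate = rate_bps(STRATEGY_BITS + PAYOFF_BITS)

print(f"channel-information exchange: {full_rate / 1e9:.2f} Gbps")
print(f"strategy + payoff exchange:   {reduced_rate / 1e6:.2f} Mbps")
print(f"reduction factor:             {full_rate / reduced_rate:.0f}x")
```

Under this assumption, exchanging payoffs instead of channel information lowers the backhaul load by several orders of magnitude, which is the point of the comparison above.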
In order to assess the cost of the energy consumption increase due to the higher traffic in the backhaul network, we have evaluated the typical values of power consumed by the contemporary equipment used by operators of fiber networks. A detailed analysis of this problem is presented in [14]. Due to the short distances between the nodes in the network (less than 2 km), there is no need for any in-line amplifiers. Thus, one has to account for the power consumed by the booster, the power amplifier, and, possibly, the optical cross connects with the regenerator deployed on the intermediate node (macro base station). Following [14], two models of energy consumption can be analyzed: one that relies on the measurements of contemporary devices available on the market and another that is based on an analytical model. In the former case, a change in traffic of 1.1 Mbps has, in fact, no measurable effect on the power consumed by the optical devices. This is because the power consumption values refer to particular classes of optical devices or simply do not depend on the traffic load (e.g., an optical line amplifier, OLA, used for a short span of 2 km consumes a constant power of 65 W, while a transponder/muxponder for 2.5G traffic and for 10G traffic needs 25 W and 50 W, respectively). A change of the overall traffic by 1.1 Mbps will not result in a change of, for example, the transponder class. In other words, following the first approach, the backhaul power consumption remains unchanged. Let us now discuss the problem of power consumption with the application of the analytic models presented in formula (4) in [14]. The exact power consumption depends on various parameters, such as the power efficiency values (denoted as $P/C$), the cooling and facilities overhead $\eta_{c}$, the traffic protection $\eta_{pr}$, the hop count $H$, the demand capacity $D_{C}$, and the number of traffic demands $N_{d}$. The formula for the power consumed by, for example, the OLAs additionally involves the optical amplification span length $L_{\mathrm{OLA}}$ and the average (lightpath) link length $\alpha$. For a given backhaul network topology, all of these components remain the same, except for the number of traffic demands and the average demand capacity. Analogous conclusions can be drawn for the power model of any other backhaul network element. Thus, one needs to find the value of the relation $P_{\mathrm{OLA}}^{(1)} / P_{\mathrm{OLA}}^{(2)} = N_{d}^{(1)} D_{C}^{(1)} / \bigl(N_{d}^{(2)} D_{C}^{(2)}\bigr)$, where the superscripts (1) and (2) represent the states of the system with and without the backhaul traffic generated by the algorithms discussed in this paper. Based on the discussion and results (Figures 8-10) from [14], it can be concluded that the increase of the total consumed power due to a traffic increase of 1.1 Mbps is negligible.
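Both lines of reasoning above can be illustrated with a short sketch. The transponder class powers are the values quoted in the text; the baseline load of 1 Gbps, the single traffic demand, and the function names are hypothetical and serve only to show the mechanics, not to reproduce the model of [14].

```python
# (capacity in bit/s, power in W) for the two transponder/muxponder
# classes quoted in the text: 2.5G -> 25 W and 10G -> 50 W.
TRANSPONDER_CLASSES = [(2.5e9, 25.0), (10e9, 50.0)]


def transponder_power_w(traffic_bps: float) -> float:
    """Power of the smallest listed transponder class that carries the traffic."""
    for capacity_bps, power_w in TRANSPONDER_CLASSES:
        if traffic_bps <= capacity_bps:
            return power_w
    raise ValueError("traffic exceeds the largest class listed here")


def analytic_power_ratio(n_d_with, d_c_with, n_d_without, d_c_without):
    """Power ratio when only the number of demands and demand capacity differ."""
    return (n_d_with * d_c_with) / (n_d_without * d_c_without)


BASELINE_BPS = 1e9  # hypothetical existing load on one backhaul link
EXTRA_BPS = 1.1e6   # backhaul traffic increase estimated in the text

# Measurement-based view: the device class, and hence its power, is unchanged.
same_class = (transponder_power_w(BASELINE_BPS)
              == transponder_power_w(BASELINE_BPS + EXTRA_BPS))

# Analytic view: one demand whose capacity grows by the extra traffic.
ratio = analytic_power_ratio(1, BASELINE_BPS + EXTRA_BPS, 1, BASELINE_BPS)

print(f"same transponder class: {same_class}, analytic power ratio: {ratio:.4f}")
```

With these illustrative numbers the device class does not change and the analytic ratio stays very close to one, in line with the conclusion that the added traffic has a negligible effect.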
The above discussion is valid for the most demanding solution, that is, the one that utilizes the concept of the correlated equilibrium with full information exchange. Since all other algorithms require less steering data to be exchanged in the backhaul network, it can be concluded that, from the point of view of energy consumption by the backhaul network, the application of any algorithm discussed in this paper has only a negligible effect. Clearly, such an analysis has to be repeated for the specific technologies and solutions applied by a given MNO.

Simulation Results and Analysis
To investigate the properties and validity of the considered game-based solutions, system-level Monte-Carlo simulations of the system described in Section 2 have been carried out.
As a simulation parameter, the micro BS cell range expansion factor has been used, with five values considered: {5, 7.5, 10, 12.5, 15} dB. As a reference, four configurations have been used: (i) a plain system with no interference mitigation (denoted as no ICIC), (ii) a system utilizing LTE-A fixed eICIC with four ABSs specified for macro BSs (hereafter denoted as LTE-A eICIC), (iii) a system using the dynamic LTE-A eICIC mechanism proposed in [48] (denoted as dynamic eICIC), and (iv) a system using the adaptive mechanism of Fast Muting Adaptation with the PF criterion, proposed in [49] (denoted as eICIC FMA).
Among the game-based solutions, three of them, namely, CE pure, CE reduced, and the proposed version of SE sigmoid, have also been considered in the energy-efficient, backhaul-optimized, and mixed forms.
5.1. Baseline Solutions. In Figure 5, the average cell spectral efficiency (number of bits per second per unit bandwidth) for the baseline solutions is presented. The highest spectral efficiency is achieved by the approaches that assume rich information exchange, with CE pure outperforming the other solutions. The gain observed for CE pure versus the plain system or the system using LTE-A eICIC is over 20%, with CE reduced performing only slightly worse (up to a 5% decrease compared to CE pure). Therefore, in terms of spectral efficiency, the game-based approach using the correlated equilibrium is a very promising solution for interference mitigation. One can also notice that there is no improvement, or even a decrease, in the spectral efficiency when eICIC methods are used, compared to the case with no interference mitigation. This indicates that the main victims of interference in the considered case are the macro UEs that are affected by the transmission of micro BSs.
In such a situation, eICIC cannot improve the throughput of macro UEs, as it specifies almost-blank subframes (ABSFs) for macro BSs only. On the other hand, the proposed approaches using CE or SE optimize the use of resources in both macro and small BSs, thus improving the performance of macro UEs experiencing high interference.
The performance of all methods improves with the increase of the range expansion (RE) parameter, which corresponds to a higher number of UEs connected to small BSs. In the case of high RE, the UEs that would otherwise be assigned to a macro BS and experience high interference are connected to small BSs and benefit from the use of eICIC or other interference mitigation solutions. Moreover, small BSs usually provide services for a smaller number of UEs than macro BSs. Thus, offloading users to small BSs results in a higher number of UEs being scheduled for transmission.
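The effect of the RE parameter on user association can be sketched with the commonly used biased-association rule, in which the RE value is added to the received power of small BSs before the strongest BS is selected. This is an assumption about the association rule (the exact rule of Section 2 is not reproduced here), and the received power values and names are hypothetical.

```python
def serving_bs(rx_power_dbm: dict, small_bs: set, re_bias_db: float) -> str:
    """Return the BS with the highest received power after biasing small BSs."""
    def biased(bs: str) -> float:
        # The RE bias is applied only to small (micro) BSs.
        return rx_power_dbm[bs] + (re_bias_db if bs in small_bs else 0.0)
    return max(rx_power_dbm, key=biased)


# Hypothetical received powers for one UE, in dBm.
rx = {"macro": -80.0, "micro": -90.0}

for re_db in (5, 7.5, 10, 12.5, 15):  # RE values used in the simulations
    print(f"RE = {re_db:>4} dB -> served by {serving_bs(rx, {'micro'}, re_db)}")
```

For this hypothetical UE, a low RE value keeps the macro BS as the serving node, while a sufficiently high RE value offloads the UE to the micro BS.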
The solutions based on the SE concept perform worse than CE because of the nature of the correspondence function. For UEs that achieve the required throughput, any further increase in data rate does not increase their satisfaction. Therefore, the system spectral efficiency is traded for an improvement in general user satisfaction, represented by achieving a certain required throughput.
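The saturation effect described above can be illustrated with a generic logistic (sigmoid) satisfaction function. This is a sketch under stated assumptions: the midpoint at half of the required rate and the steepness value are illustrative choices, and the function is not claimed to be the correspondence function used for SE sigmoid in this paper.

```python
import math


def satisfaction(rate_bps: float, required_bps: float, steepness: float = 10.0) -> float:
    """Logistic satisfaction in (0, 1) that saturates around the required rate.

    Assumption: the midpoint is placed at half of the required rate so that
    the value is already close to 1 when the requirement is met.
    """
    x = steepness * (rate_bps / required_bps - 0.5)
    return 1.0 / (1.0 + math.exp(-x))


REQUIRED_BPS = 1e6  # hypothetical required throughput of one UE

gain_below = satisfaction(1.0e6, REQUIRED_BPS) - satisfaction(0.5e6, REQUIRED_BPS)
gain_above = satisfaction(2.0e6, REQUIRED_BPS) - satisfaction(1.0e6, REQUIRED_BPS)

print(f"gain when reaching the requirement:   {gain_below:.3f}")
print(f"gain when doubling past requirement:  {gain_above:.3f}")
```

Once the requirement is met, extra throughput yields almost no additional satisfaction, so a satisfaction-driven player has little incentive to push the spectral efficiency further.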

5.2. Energy-Efficient Solutions.
The properties of the considered game-based solutions can also be used to improve the energy efficiency of the system. Therefore, energy-efficient versions of selected algorithms have also been considered in the investigation. Figures 6 and 7 present the comparison of the baseline and energy-efficient versions in terms of the achieved average cell energy efficiency and the total power consumed by the system, respectively. A large gain in energy efficiency can be achieved when using the energy-optimized versions of CE pure and SE sigmoid for high RE values, where most of the traffic is offloaded to micro BSs. Moreover, it can be observed in Figure 7 that 3-4 times lower transmit power is used in the case of the energy-optimized solutions when compared to the plain system or the system using eICIC.
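The relation between the two metrics used in this comparison can be sketched as follows. Spectral efficiency follows the definition given for Figure 5 (bits per second per unit bandwidth); energy efficiency is assumed here to be throughput divided by consumed power (bits per joule), which is a common definition but is not spelled out in this section. All numerical values are hypothetical.

```python
def spectral_efficiency(throughput_bps: float, bandwidth_hz: float) -> float:
    """Bits per second per hertz of bandwidth."""
    return throughput_bps / bandwidth_hz


def energy_efficiency(throughput_bps: float, power_w: float) -> float:
    """Bits per joule (assumed definition: throughput over consumed power)."""
    return throughput_bps / power_w


# Hypothetical cell: same throughput, transmit power reduced four times.
THROUGHPUT_BPS = 40e6
BANDWIDTH_HZ = 20e6
BASELINE_POWER_W = 40.0
OPTIMIZED_POWER_W = 10.0

se = spectral_efficiency(THROUGHPUT_BPS, BANDWIDTH_HZ)
ee_gain = (energy_efficiency(THROUGHPUT_BPS, OPTIMIZED_POWER_W)
           / energy_efficiency(THROUGHPUT_BPS, BASELINE_POWER_W))

print(f"spectral efficiency: {se:.1f} bit/s/Hz, energy-efficiency gain: {ee_gain:.1f}x")
```

Under this definition, a power reduction at unchanged throughput translates directly into a proportional energy-efficiency gain while leaving the spectral efficiency untouched.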
In most cases, the highest energy efficiency is achieved when using the SE sigmoid approach. The reason for such behavior is the nature of the SE correspondence function. For UEs that achieve the required throughput levels, a further increase in their utility can be achieved by decreasing the transmit power. Therefore, energy efficiency increases at the cost of a slight reduction of spectral efficiency, which is indicated in Figure 8.
An interesting observation can be made about the behavior of the energy-optimized CE reduced solution. Due to the reduced exchange of information between the BSs, the energy optimization mechanisms cannot determine the gains from the use of other strategies; thus, they mostly operate on the basis of throughput analysis. As a result, the performance of the energy-optimized approach is almost identical to that of the baseline CE reduced.
The introduction of energy efficiency optimization does not degrade the gains of the considered approaches in terms of spectral efficiency. As shown in Figure 8, the spectral efficiency achieved with the energy-optimized solutions is almost the same as that for the baseline algorithms. The only exception is CE pure in the case of the RE factor equal to 15 dB, where the very high energy efficiency is achieved at the cost of reduced spectral efficiency, which is still higher than for the plain system.

5.3. Backhaul-Optimized Solutions.
The algorithms based on CE or SE can yield high gains in terms of spectral and energy efficiency. However, control information needs to be exchanged between BSs. In order to reduce the burden on the backhaul network, an approach based on payoff quantization has been proposed, where four bits are used to represent each payoff value exchanged in the backhaul. Figure 9 presents the comparison of the spectral efficiency achieved with the baseline and backhaul-optimized solutions. The introduction of quantization results in a minor performance degradation in several cases; however, the gains in terms of spectral efficiency are still significant when compared to the plain system. An interesting observation is that there is hardly any loss in spectral efficiency when using CE reduced or SE sigmoid, in contrast to CE pure. This indicates that quantization has a bigger impact on solutions based on the full exchange of information. When using the SE approach, the characteristics of the correspondence function reduce the cost of quantization. Similarly, for CE reduced, the limited exchange of information, and thus slower convergence, mitigates the impact of quantization errors.
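The four-bit payoff representation can be sketched with a simple uniform quantizer. This is a minimal illustration under stated assumptions: the text only specifies that four bits are used per payoff value, so the uniform step, the known payoff range, and the function names below are hypothetical.

```python
def quantize(payoffs, lo: float, hi: float, bits: int = 4):
    """Map each payoff to an integer code in [0, 2**bits - 1] (uniform steps)."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    levels = (1 << bits) - 1
    codes = []
    for p in payoffs:
        clipped = min(max(p, lo), hi)  # keep values inside the agreed range
        codes.append(round((clipped - lo) / (hi - lo) * levels))
    return codes


def dequantize(codes, lo: float, hi: float, bits: int = 4):
    """Recover approximate payoff values from the integer codes."""
    levels = (1 << bits) - 1
    return [lo + c / levels * (hi - lo) for c in codes]


payoffs = [0.0, 0.13, 0.5, 0.87, 1.0]  # hypothetical normalized payoffs
codes = quantize(payoffs, 0.0, 1.0)
restored = dequantize(codes, 0.0, 1.0)
max_err = max(abs(a - b) for a, b in zip(payoffs, restored))

print(f"codes: {codes}")
print(f"bits sent: {4 * len(codes)} instead of {32 * len(codes)}")
print(f"largest reconstruction error: {max_err:.4f}")
```

Compared with the 32-bit representation assumed in the backhaul analysis, four bits per value reduce the payload eightfold, at the price of a reconstruction error bounded by half a quantization step.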
Similarly, the mixed approach has been evaluated, where both the energy-efficient and backhaul-optimized approaches are applied simultaneously. Figures 10 and 11 present the achieved average cell spectral efficiency and average cell energy efficiency, respectively. The use of the mixed approach does not significantly degrade the performance of the considered algorithms with reference to the baseline solutions. The conclusions are the same as those for the energy- and backhaul-optimized solutions. Thus, both improvements are promising and applicable approaches that provide the means for the practical implementation of the considered game-theoretic schemes.

5.4. Summary.
The proposed game-based solutions have been evaluated in terms of spectral efficiency, energy efficiency, and total consumed power. The simulation results indicate that the most promising solutions are the CE pure and SE sigmoid algorithms, both of which provide a significant increase in spectral and energy efficiency. In practical systems, SE might be a more suitable solution, as UEs usually use services that require some minimal or aggregate data rate, which can be represented using the satisfaction correspondence function. By using different satisfaction definitions for different services, one can easily distinguish the payoffs of individual UEs based on the services they use. CE and SE provide a significant improvement compared to the eICIC methods, as the latter are not suitable for scenarios with small-to-macro-layer interference. By treating the micro and macro BSs equally in the case of CE and SE, we can improve the performance of UEs connected to the macro BS that are victims of strong interference. Moreover, the use of the dynamic eICIC approach does not bring any improvement, as the considered scenario is a low-mobility one.
The proposed energy-efficient modification further increases the energy savings of both the CE and SE solutions. High energy efficiency has been observed especially in the case of the SE approach. In the case of CE, a high reduction of energy consumption is achieved only for a high RE parameter, where most of the UEs are connected to small BSs.
From the practical point of view, the most suitable solution would be the CE reduced one, as it requires the exchange of the smallest amount of information in the backhaul network. However, it benefits less from the energy-optimized approach than the other solutions. Furthermore, the practical application of the considered approaches with full information exchange is possible and cost-effective thanks to the robustness of the considered algorithms against the inaccuracy of payoff values caused by quantization. Only a minor reduction in the achieved spectral efficiency has been observed for the backhaul-optimized CE pure and SE sigmoid when compared to their baseline solutions.
Finally, one can state that the use of different power levels in selected subbands by the BSs provides an increase in the spectral efficiency of the system. This increase is independent of the type of users that are affected by the highest interference. Regardless of whether macro UEs or small-cell UEs are the interference victims, an improvement in system performance can be observed when using the considered game-theoretic solutions based on CE and SE. This is in contrast to the eICIC solutions based on the use of ABSFs, as these aim at the reduction of interference from the macro to the small-cell layer in the downlink.

Conclusion
The goal of this work was to propose an efficient and flexible tool for radio resource management in the context of its practical implementation in future wireless networks. Based on the presented results of simulations carried out for a highly accurate model of a dense urban network, it can be concluded that the application of the proposed game-theoretic algorithms achieves high cell throughput while at the same time reducing the energy consumed by the base stations. In their energy-efficient versions, the discussed algorithms prefer strategies that minimize the consumed energy while ensuring high data rates observed by all users. All of the algorithms have been compared with eICIC, the solution known from the LTE-A standards, and have proved their effectiveness. Moreover, based on the detailed discussion of the traffic increase in the backhaul network, it can be concluded that the cost of the practical implementation of the proposed RRM solutions is negligible when considering the backhaul requirements.

Figure 2: Resource allocation among three base stations.

Figure 5: Average cell spectral efficiency for the considered baseline solutions.

Figure 7: Average consumed power for the considered baseline and energy-efficient solutions.

Figure 9: Average cell spectral efficiency for the considered baseline and backhaul-optimized solutions.

Figure 10: Average cell spectral efficiency for the considered baseline and mixed solutions.

Figure 11: Average cell energy efficiency for the considered baseline and mixed solutions.

Table 1: Comparison of strategies.
Figure 6: Average cell energy efficiency for the considered baseline and energy-efficient solutions.
Figure 8: Average cell spectral efficiency for the considered baseline and energy-efficient solutions.