Dynamic Resource Allocation in an Adversarial Urban IoBT Environment

The advances of the Internet of Battlefield Things (IoBT) would improve the flexibility and efficiency of military operations. Without an effective dynamic adversarial mechanism, soldier devices might malfunction, and machine intelligence technologies could hardly support military operations on a battlefield. In this paper, we propose a game theoretical model considering the adversarial and dynamic nature of the urban IoBT environment. Our algorithm is designed to optimize the whole network's efficiency with the premise that both the attacking and defending parties can maximize their benefit. Meanwhile, we also attempt to consider the interactive effects of channel fading by using a Nakagami distribution based Markov process. The experimental results show that considering the impact of the adversarial and dynamic nature of urban IoBT, our proposed algorithm can improve network performance by 30%-50%.


Introduction
In Internet of Things (IoT) applications, a remarkable amount of data is produced by intelligent mobile devices such as sensors, mobile computers, and drones [1][2][3][4][5][6][7]. As a promising technology, IoT has been applied and promoted in various industrial domains. For example, in an application of the Internet of Medical Things (IoMT), an intelligent hospital collects information through sensors and uploads it to a doctor's device for real-time monitoring. In case of emergency, the patient's medical examination reports can be transferred to remote experts in a timely manner, thereby reducing the risk of accidental death [8][9][10].
Unlike the consistent network environment in a regular IoT, e.g., IoMT, the Internet of Battlefield Things (IoBT) has highly changeable and adversarial characteristics in nature [11,12]. The overall vision of an IoBT is to minimize soldier mortality by collecting battlefield information through intelligent devices and by enabling human decision-making with intelligent means [13]. However, in an extreme battlefield scenario, physical devices and channels are vulnerable to various adversarial attacks [11]. For example, high-power electromagnetic weapon attacks might lead to the physical destruction of base stations or end devices for network communication [14]. Thus, the dynamic adaptability for a highly adversarial environment is the most dominant feature in IoBT, so it is crucial to establish a dynamic mechanism to optimize the entire network's utility [11,13,15].
With a special interest in mitigating the adversarial problems in urban IoBT environments, we propose a dynamic adversarial mechanism under a Stackelberg game theoretic framework that takes channel fading effects into account. Despite a considerable body of literature on the optimization of IoBT networks [16][17][18], most existing algorithms may not work well in urban adversarial scenarios, which face the challenges of an adversarial battlefield environment and changeable fading channels. The main contributions of our work are summarized as follows:
(1) We address the architecture of communication networks in adversarial urban scenarios and formulate a network utility optimization problem that accounts for each player's benefit on a battlefield. To our knowledge, there are very few studies on network optimization in a secure IoBT network in an adversarial scenario.
(2) We establish a Stackelberg game theoretic model to characterize the dynamic adversarial process between both parties of a battlefield in an IoBT. The existence of a Nash equilibrium is proved for this game, and a closed-form mathematical expression of the equilibrium is presented for the case in which an inequality constraint holds.
(3) We also show the existence of a Nash equilibrium in the proposed Stackelberg game when the inequality constraint does not hold, and present a numerical algorithm to compute the equilibrium.
In Section 2, we review the literature on adversarial cases in IoBT. Section 3 then presents a system model of an adversarial game with both defenders and attackers. Based on the system model, we describe the optimal solutions in Section 4. Section 5 shows the simulation results. Finally, Section 6 concludes the paper.

Related Work
Military missions depend on real-time information processing and data analysis for making accurate decisions in IoBT networks. Any connectivity problem might result in inaccurate decisions on military operations, so connectivity has attracted much academic attention. For example, a mechanism for connectivity reestablishment in the presence of dumb nodes, i.e., nodes that cannot transmit data to their neighboring nodes, is proposed in [15]; it enables the reestablishment of connectivity between dumb nodes and the centralized node. In [17], a fusion-based defense scheme is employed to defend against attacks at the network level. By characterizing the attack and defense as a zero-sum game, the proposed method can effectively improve network stability even with a fragile network structure. The abovementioned studies assume that all sensors/devices are of the same type and have the same capabilities. To remedy this issue, Abuzainab and Saad take the heterogeneous characteristics of the devices in a network into account and use a multistage Stackelberg game to mitigate the IoBT connectivity problem by either activating sleeping nodes or changing the roles of current nodes [13].
For adversarial IoBT networks, security attacks can be categorized into two types: disruption attacks and manipulation attacks. Disruption attacks try to paralyze IoT networks by launching physical destruction or by jamming the entire system, e.g., distributed denial-of-service (DDoS) attacks, whereas manipulation attacks seek to control a few nodes in the network to inject false information. The attacks mentioned in the previous paragraph are mostly disruption attacks. However, other literature also considers the impact of actions that inject misinformation or impose human interventions on IoBT nodes. In [18], the misinformation attack is countered by determining the optimal probability of accepting the information. Similarly, building on psychological game theory, the authors of [19] focus on how misinformation from human psychological interventions influences game-theoretic decision making on the battlefield.
Most previous studies on IoBT have paid particular attention to connectivity problems, in which the researchers optimize the network by measuring the number of connected nodes at the network layer. However, few studies have attempted to use a specific indicator such as bit error rate (BER) or power to optimize network resources at the physical layer. Building on the work [20], a power control based connectivity reconstruction game can reduce energy consumption while maintaining the performance of localization. The number of connected nodes is a fairly broad indicator that can hardly reflect the quality of service (QoS) in an IoBT network. Even if all nodes in the network are successfully connected, the quality of communication might still be unsatisfactory. Thus, it is necessary to use a specific indicator such as power to measure the QoS of IoBT. Furthermore, the current relevant studies primarily

System Model
As discussed in the previous section, two factors are particularly important in the urban IoBT environment: dynamic adjustment to the fading channels and reasonable allocation of network resources. In the following, we first discuss a model that reflects the effects of channel fading in the urban IoBT environment. Using Stackelberg game theory, we then propose a dynamic channel-based adversarial game model that reflects the adversarial process between both parties of a battlefield. The strategies of the network players depend on those of their competitors. To be specific, the attackers in the network actively deploy strategies, while the defenders make passive adjustments based on the attackers' strategies. This framework matches the adversarial situation in an IoBT well. The key notations in Table 1 will be used in the following sections.
3.1. Channel Fading Models. Intelligent tools used in urban IoBT scenarios to facilitate effective decision making may include on-board servers, sensors, mobile computers, and drones. In cities, vehicle speed is limited to less than 60 meters per minute, and the intervehicle distance ranges from a few meters to approximately 100 meters. As shown in Figure 1, two sides, defenders and attackers, are involved in an IoBT network. The center on the defending side plays an essential role in monitoring and decision making. An electronic defending system could facilitate communication tasks between user ends (UEs) and the centralized node, as well as monitoring tasks. The collected information and data would be forwarded to the center for further analysis and actions.
Considering both the urban IoBT environment and widely used channel models, the researchers in [21][22][23] conclude that the channel fading in all line-of-sight and non-line-of-sight cases can be modeled as Nakagami distributions with particular parameters. The Nakagami distribution can be used to capture the changes of signal amplitude after channel fading in an urban IoT scenario. The channel characteristics of Nakagami fading are determined by the parameters $\phi$ and $m$, and thus the generalized Nakagami distribution of the channel fading $\alpha$ can be shown as

$$f(\alpha) = \frac{2 m^{m} \alpha^{2m-1}}{\Gamma(m)\,\phi^{m}} \exp\!\left(-\frac{m \alpha^{2}}{\phi}\right), \quad \alpha \ge 0, \tag{1}$$

where $\Gamma(\cdot)$ is the gamma function and $\phi$, $m$ are the two determinant parameters of a Nakagami distribution. Let $h_{t_1}$ and $h_{t_2}$ denote the channel fading characteristics at the time slots $t_1$ and $t_2$, respectively. Building on the generalized Nakagami channel model in Formula (1), the joint probability density function of $h_{t_1}$ and $h_{t_2}$ follows [24] (Formula (2)), where $I_m(\cdot)$ denotes the $m$-order Bessel function, $\phi$ and $m$ denote the parameters of the Nakagami fading channel in (1), and $\rho$ denotes the correlation between channels [24] (Formula (3)).

Algorithm 1.
Step 1: Initialize the relevant parameters: $B$, which represents the bandwidth; $h_{mi}$, which represents the set of interferences in the first round ($h_{mi}$ changes dynamically according to the Markov transition probability matrix); $\delta$, which represents the power of noise; and $P_m$, which represents a reasonable maximum power that the attackers can accept. Additionally, set $U_a = U_d = P_a = 0$.
Step 2: Using a search algorithm, find the maximum value of $U_a$. The maximum $U_a$ corresponds to the optimal solution of the attacker's power $P^*_a$. Let X = 0 : ΔP :
Step 3: Building on the above steps, the algorithm searches for the optimal solution within the closed interval $[0, P_m]$ to determine the game equilibrium $P^*_a$ and the corresponding utilities of the attackers and defenders, $U^*_a$ and $U^*_d$.

In the following, we consider $M$ channel states, i.e., $S_i$ $(i = 1, 2, \ldots, M)$.
$S_i$ depends on the value of the channel fading $h_k$ at time slot $k$. Let $h_k \in (S^t_{i-1}, S^t_i)$, and let $h_k$ and $h_{k+1}$ denote the channel fading at the $k$th and $(k+1)$th time slots, respectively. Thus, the transition probability $p_{ij}$ can be characterized as in Equation (4). By substituting (2) into (4), we obtain the probability of a transition between channel states and can simulate the future channel states based on previous information. We refer to Equation (4) as the one-step transition probability between channel states, and collect these probabilities into the one-step transition matrix. Mathematically, we can denote the one-step transition matrix as

$$P = \begin{bmatrix} p_{11} & \cdots & p_{1M} \\ \vdots & \ddots & \vdots \\ p_{M1} & \cdots & p_{MM} \end{bmatrix}. \tag{5}$$

Based on the Markov properties of Nakagami fading channels [25], we can compute the $N$-step transition matrix as $P^N$. In practice, the one-step transition probability can be estimated by statistically averaging observation data collected over a long period.

Adversarial Game Model. As shown in Figure 2, attackers might launch adversarial attacks on defenders by using high-power electromagnetic weapons. The performance of a base station or edge server would degrade or even fail in the presence of attackers, so computing tasks need to be shifted from the cloud server to heterogeneous edge servers or end devices. Thus, it is essential to build a dynamic algorithm to optimize insufficient network resources in such a dynamic, adversarial, and unpredictable scenario. At each stage of the adversarial game between defenders and attackers, both sides have their respective strategy sets. In the following, we consider the utility function from the attacker's and the defender's perspectives. First, the attackers' strategies primarily depend on two factors: the decrease of network performance and the cost of interfering with the defenders. Therefore, the overall utility of attacker $k$, i.e., $U_{a_k}$, can be expressed as

$$U_{a_k} = \sum_{i=1}^{N} L_{d_i} - L_{a_k}, \tag{6}$$

where $L_{d_i}$ represents the decrease of channel capacity of defender $i$, $L_{a_k}$ represents the cost of attacker $k$ to degrade the network performance, and $N$ represents the number of defenders in the network. Building on the channel fading model in the previous section, the transition of the channel fading $h^k_{mi}$ between stage $k$ and stage $k+1$ follows the transition matrix shown in Equation (5). The dynamic channel fading $h^k_{mi}$ is illustrated in Figure 3, where $S_1, S_2, \ldots, S_M$ refer to the $M$ states of channel fading.
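As a rough illustration of the finite-state Markov channel described above, the following Python sketch estimates a one-step transition matrix by counting transitions in a long sequence of fading observations and then raises it to the $N$th power. It is only a sketch under stated assumptions: the amplitudes come from a first-order autoregressive complex Gaussian process (Rayleigh fading, which is the Nakagami case $m = 1$), the state thresholds are empirical quartiles, and all function names, variable names, and numeric values are hypothetical rather than taken from the paper.

```python
import numpy as np

def simulate_rayleigh_amplitudes(n, rho, rng):
    """Correlated fading amplitudes from an AR(1) complex Gaussian process.

    The amplitude of a unit-variance complex Gaussian is Rayleigh distributed,
    i.e. Nakagami with m = 1. `rho` is the AR(1) coefficient (an assumption of
    this sketch, not necessarily the paper's correlation parameter).
    """
    g = np.empty(n, dtype=complex)
    scale = np.sqrt(0.5)
    g[0] = scale * (rng.standard_normal() + 1j * rng.standard_normal())
    for t in range(1, n):
        w = scale * (rng.standard_normal() + 1j * rng.standard_normal())
        g[t] = rho * g[t - 1] + np.sqrt(1.0 - rho**2) * w
    return np.abs(g)

def estimate_transition_matrix(amplitudes, thresholds):
    """Estimate one-step transition probabilities by counting transitions."""
    states = np.digitize(amplitudes, thresholds)  # state index in 0..M-1
    n_states = len(thresholds) + 1
    counts = np.zeros((n_states, n_states))
    for s, s_next in zip(states[:-1], states[1:]):
        counts[s, s_next] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    # Rows that were never visited fall back to a uniform distribution.
    return np.where(row_sums > 0, counts / np.maximum(row_sums, 1), 1.0 / n_states)

rng = np.random.default_rng(0)
amps = simulate_rayleigh_amplitudes(20000, rho=0.9, rng=rng)
thresholds = np.quantile(amps, [0.25, 0.5, 0.75])  # four channel states
P1 = estimate_transition_matrix(amps, thresholds)   # one-step matrix
PN = np.linalg.matrix_power(P1, 5)                  # N-step matrix, N = 5
print(np.round(P1, 3))
```

Counting transitions corresponds to the long-run averaging of observations mentioned above; it avoids evaluating the joint density directly, at the cost of needing a sufficiently long observation record.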
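Thus, $L_{d_i}$ can be denoted as

$$L_{d_i} = B \log_2\!\left(1 + \frac{P_{d_i}}{\delta}\right) - B \log_2\!\left(1 + \frac{P_{d_i}}{\sum_{k=1}^{M} P_{a_k} h^k_{mi} + \delta}\right), \tag{7}$$

where $h^k_{mi}$ represents the dynamic interference of defender $i$ at the $k$th stage, and $\delta$ represents the power of noise. In Equation (7), $B \log_2(1 + P_{d_i}/\delta)$ represents the channel capacity of defender $i$ before the network is attacked, and $B \log_2\big(1 + P_{d_i}/(\sum_{k=1}^{M} P_{a_k} h^k_{mi} + \delta)\big)$ represents the channel capacity of defender $i$ after the network is attacked. $M$ represents the number of attackers in the network.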
Similarly, the cost of the $k$th attacker, $L_{a_k}$, can be denoted as

$$L_{a_k} = \alpha P_{a_k}, \tag{8}$$

where $P_{a_k}$ represents the transmission power of attacker $k$ and $\alpha$ represents the cost per unit power consumption by an attacker.

Wireless Communications and Mobile Computing
Substituting Equations (7) and (8) into Equation (6), we can mathematically denote $U_{a_k}$ as

$$U_{a_k} = \sum_{i=1}^{N} \left[ B \log_2\!\left(1 + \frac{P_{d_i}}{\delta}\right) - B \log_2\!\left(1 + \frac{P_{d_i}}{\sum_{k=1}^{M} P_{a_k} h^k_{mi} + \delta}\right) \right] - \alpha P_{a_k}. \tag{9}$$

On the other hand, a defender's strategy primarily depends on two factors: the defender's channel capacity after being attacked and the defender's transmission power. Therefore, the overall utility of defender $i$, i.e., $U_{d_i}$, can be expressed as

$$U_{d_i} = C_{d_i} - L^c_{d_i}, \tag{10}$$

where $C_{d_i}$ is defined as the channel capacity of defender $i$ after being attacked and $L^c_{d_i}$ is defined as the cost of defender $i$ to maintain the capacity of his/her channel.
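By the Shannon formula, $C_{d_i}$ can be denoted as

$$C_{d_i} = B \log_2\!\left(1 + \frac{P_{d_i}}{\sum_{k=1}^{M} P_{a_k} h^k_{mi} + \delta}\right). \tag{11}$$

The cost $L^c_{d_i}$ can be denoted as

$$L^c_{d_i} = \eta P_{d_i}, \tag{12}$$

where $\eta$ represents the cost per unit power consumption by a defender.
Similarly, substituting Equations (11) and (12) into Equation (10), $U_{d_i}$ can be formulated as

$$U_{d_i} = B \log_2\!\left(1 + \frac{P_{d_i}}{\sum_{k=1}^{M} P_{a_k} h^k_{mi} + \delta}\right) - \eta P_{d_i}. \tag{13}$$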
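The utilities in this section can be evaluated numerically with a few lines of Python. The sketch below assumes linear power costs ($\alpha P_{a_k}$ for an attacker and $\eta P_{d_i}$ for a defender) and reads the interference on defender $i$ as $\sum_k P_{a_k} h^k_{mi}$; the function names, argument layout, and example numbers are hypothetical and only meant to make the definitions concrete.

```python
import numpy as np

def capacity(bandwidth, signal_power, interference_plus_noise):
    """Shannon capacity B * log2(1 + S / (I + noise))."""
    return bandwidth * np.log2(1.0 + signal_power / interference_plus_noise)

def interference(p_attack, h):
    """Total interference on one defender: sum_k P_ak * h_k."""
    return float(np.dot(p_attack, h))

def capacity_loss(B, p_def, p_attack, h, delta):
    """L_di: capacity before the attack minus capacity after the attack."""
    before = capacity(B, p_def, delta)
    after = capacity(B, p_def, interference(p_attack, h) + delta)
    return before - after

def attacker_utility(k, B, p_defs, p_attack, H, delta, alpha):
    """U_ak: total capacity loss of all defenders minus attacker k's power cost.

    H[i] holds the fading gains from every attacker towards defender i.
    """
    loss = sum(capacity_loss(B, p_defs[i], p_attack, H[i], delta)
               for i in range(len(p_defs)))
    return loss - alpha * p_attack[k]

def defender_utility(B, p_def, p_attack, h, delta, eta):
    """U_di: capacity after the attack minus the defender's power cost."""
    return capacity(B, p_def, interference(p_attack, h) + delta) - eta * p_def

# Hypothetical example: two defenders, one attacker.
p_defs = [3.0, 3.0]
p_attack = [1.0]
H = [[1.0], [1.0]]
print(attacker_utility(0, 1.0, p_defs, p_attack, H, delta=1.0, alpha=0.2))
print(defender_utility(1.0, p_defs[0], p_attack, H[0], delta=1.0, eta=0.5))
```

With zero attack power the capacity loss vanishes, which matches the reading of the first term as the pre-attack capacity.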

Optimal Solution
Each party on a battlefield adjusts its transmission power to maximize its own utility. This utility optimization problem can be formulated as a Stackelberg game, and in this section we show that a Nash equilibrium of the game is eventually reached by both sides. The defender always adjusts its strategy based on the attacker's, so the attacker acts as the leader while the defender acts as the follower. Building on this mechanism, we explore and prove the existence and the exact solution of the best responses of both parties on a battlefield.

Optimal Strategy of the Defender.
In the following, we discuss the optimal solution to maximize the utility of (13), and it can be characterized as Theorem 1.

Theorem 1.
The optimal solution of $U_{d_i}$ in (13) for the defender exists and can be denoted as $\hat{P}_{d_i}$.
Proof. We first prove the existence of the optimal solution of $U_{d_i}$. The existence can be proved by computing the second-order derivative of the utility function with respect to $P_{d_i}$. Since the bandwidth $B$ is positive, the second-order derivative $-B/\big(\ln 2\,(\sum_{k=1}^{M} P_{a_k} h^k_{mi} + \delta + P_{d_i})^2\big) < 0$, which proves that the utility function $U_{d_i}$ is concave. The optimal solution that maximizes the utility of the defender can therefore be computed by setting the first-order derivative to 0.
Solving the resulting equation, we obtain the optimal transmission power of the defender, $\hat{P}_{d_i}$.
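The steps of this proof can be written out explicitly. The derivation below is a sketch that assumes the defender utility has the form of a base-2 Shannon capacity minus a linear power cost $\eta P_{d_i}$, and it introduces the shorthand $I_i$ for the total interference on defender $i$ (a symbol not used in the paper). The constant factor $1/\ln 2$ from the base-2 logarithm is kept explicit; it does not affect the sign of the second derivative.

```latex
\begin{align*}
U_{d_i} &= B \log_2\!\left(1 + \frac{P_{d_i}}{I_i + \delta}\right) - \eta P_{d_i},
  \qquad I_i = \sum_{k=1}^{M} P_{a_k} h^{k}_{mi}, \\
\frac{\partial U_{d_i}}{\partial P_{d_i}}
  &= \frac{B}{\ln 2\,\left(I_i + \delta + P_{d_i}\right)} - \eta, \\
\frac{\partial^2 U_{d_i}}{\partial P_{d_i}^2}
  &= -\frac{B}{\ln 2\,\left(I_i + \delta + P_{d_i}\right)^2} < 0, \\
\frac{\partial U_{d_i}}{\partial P_{d_i}} = 0
  \;&\Longrightarrow\; \hat{P}_{d_i} = \frac{B}{\eta \ln 2} - I_i - \delta .
\end{align*}
```

Under these assumptions the stationary point is a valid transmission power only when $B/(\eta \ln 2) \ge I_i + \delta$; otherwise the concave utility is maximized at the boundary $P_{d_i} = 0$.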

Optimal Strategy of the Attacker.
In the following, we discuss the optimal solution to maximize the utility of (9), and it can be characterized as Theorem 2.

Theorem 2.
In consideration of the optimal strategy of the defender, the optimal solution of $U_{a_k}$ in function (9) for the attacker can be achieved when the following equation holds:
Proof. Substituting the optimal solution of the defender into function (9) transforms it into function (18). Based on function (18), we first prove the existence of the optimal solution of $U_{a_k}$. The existence can be proved by computing the second-order derivative of the utility function.
It is not hard to reach the conclusion that the second-order derivative of $U_{a_k}$ is less than 0, so the utility function $U_{a_k}$ is concave. The optimal solution that maximizes the utility of the attacker can therefore be computed by setting the first-order derivative to 0.
Letting $\partial U_{a_k} / \partial P_{a_k} = 0$, we obtain the optimal transmission power of the attacker.
4.3. Stackelberg Equilibrium Algorithm. The mutual best response is the Nash equilibrium of the Stackelberg game, at which both attackers and defenders maximize their utility given the other side's strategy. Building on the above description of the Stackelberg game, we present an algorithm that determines the Nash equilibrium by searching within a reasonable range of power that the attackers can accept. As shown in the pseudocode of Algorithm 1, we first initialize the relevant parameters: $B$, which represents the bandwidth; $h_{mi}$, which represents the set of interferences in the first round ($h_{mi}$ changes dynamically according to the Markov transition probability matrix); $\delta$, which represents the power of noise; and $P_m$, which represents a reasonable maximum power that the attackers can accept. Following this, we use a search algorithm over the power range $[0, P_m]$ of the attackers to determine the game equilibrium $P^*_a$ and the corresponding utilities of the attackers and defenders, $U^*_a$ and $U^*_d$.
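The search described above can be sketched in Python as a simple grid search. This is an illustrative sketch rather than the paper's implementation: it assumes a single attacker, linear power costs with coefficients `alpha` and `eta`, and a defender best response obtained by setting the derivative of a base-2 capacity minus linear cost to zero and clipping at zero. All names and numeric values are hypothetical.

```python
import numpy as np

def defender_best_response(B, eta, interf, delta):
    """Stationary point of B*log2(1 + P/(I + delta)) - eta*P, clipped at 0."""
    return np.maximum(0.0, B / (eta * np.log(2.0)) - interf - delta)

def stackelberg_search(B, h, delta, alpha, eta, p_max, step):
    """Grid search over the attacker power in [0, p_max].

    h[i] is the fading gain from the attacker towards defender i.
    Returns (attacker power, attacker utility, defender utilities,
    defender powers) at the grid point with the highest attacker utility.
    """
    h = np.asarray(h, dtype=float)
    grid = np.arange(0.0, p_max + step / 2, step)
    best = None
    for p_a in grid:
        interf = p_a * h                                  # interference per defender
        p_d = defender_best_response(B, eta, interf, delta)
        before = B * np.log2(1.0 + p_d / delta)           # capacity without attack
        after = B * np.log2(1.0 + p_d / (interf + delta))  # capacity under attack
        u_a = float(np.sum(before - after) - alpha * p_a)
        if best is None or u_a > best[1]:
            u_d = after - eta * p_d
            best = (float(p_a), u_a, u_d, p_d)
    return best

p_star, u_a_star, u_d_star, p_d_star = stackelberg_search(
    B=1.0, h=[0.5, 1.0], delta=0.1, alpha=0.2, eta=0.5, p_max=2.0, step=0.01
)
print(p_star, u_a_star, u_d_star)
```

In a dynamic setting, the gains `h` would be redrawn each round according to the Markov transition matrix and the search repeated; a finer `step` trades run time for a more accurate equilibrium power.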

Simulation Results
We conduct the experiments on a hardware platform that includes an NI-PXIe 1085 and three USRP-RIO-1082 devices. As shown in Figure 4, the NI-PXIe 1085 device is used to display graphic results, while the three USRP-RIO-1082 devices are used to simulate transmitters, receivers, and interference generators. Two USRP devices are equipped with four antennas, and we use them to simulate two transmitters and two receivers. The third USRP device is equipped with two antennas to simulate two interference generators. Additionally, we use the NI-PXIe 1085 platform to monitor the graphic results of interference, shown in Figure 5. We also load the data generated by the USRP-RIO-1082 devices into MATLAB for subsequent numerical analysis. We compare the analytic results of the proposed algorithm with those of the algorithm in [19]. The simulation parameters are set as follows: the parameters of the Nakagami channel model are $m = 1$ and $\phi = 0.5$; the signal-to-noise ratio ranges from 0 dB to 20 dB; the signal-to-interference ratio ranges from 0 dB to 20 dB; and the cost parameter of transmission power by defenders ranges from 0.1 to 1.

5.1. Individual Utility of Defenders. In this section, we compare the individual utility of defenders under our proposed algorithm with that under the algorithm in [19], which serves as a benchmark. As shown in Figure 6, for each defender, our proposed algorithm achieves a higher individual utility than the benchmark algorithm. Specifically, the proposed algorithm achieves 30%-50% more individual utility than the benchmark algorithm. The reason is that the proposed algorithm considers the dynamic variation of the system environment, so defenders can adjust their own strategies in view of both the attackers' strategies and the channel fading. Across the defenders, the individual utility primarily depends on the strategies of attackers and the channel fading. Given the same attackers' strategies in each round of the game, the difference in individual utility among defenders depends on their own channel fading. In the following, we investigate the channel fading of each defender. As shown in Figure 7, the individual utility decreases with the channel fading. For example, defender 4 has the lowest channel fading and the highest individual utility, whereas defender 7 has the highest channel fading and the lowest individual utility.

Total Utility of Defenders.
In this section, we investigate the total utility of defenders using both our proposed algorithm and the benchmark algorithm in [19]. Specifically, we consider the total utility under different values of the cost parameter $\eta$, the bandwidth, the signal-to-noise ratio (SNR), and the Nakagami channel model parameters $m$ and $\phi$.
As shown in Figure 8, the total utility of defenders decreases with the values of η, and the proposed algorithm can achieve a higher utility than the benchmark algorithm across various η. The reason is that a higher η indicates the defenders need to achieve a certain level of utility with a higher level of cost, and thus the total utility of defenders decreases with the values of η.
As shown in Figure 9, the total utility of defenders increases with the bandwidth, and the proposed algorithm can achieve a higher utility than the benchmark algorithm across various bandwidths. The reason is that a higher bandwidth indicates the defenders can achieve a higher utility with the same cost of transmission power, and thus the total utility of defenders increases with the bandwidth.
As shown in Figure 10, the total utility of defenders increases with the values of SNRs, and the proposed algorithm can achieve a higher utility than the benchmark algorithm across various SNRs. The reason is that a higher level of SNR indicates the defenders can achieve a higher utility when occupying the same amount of bandwidth, and thus the total utility of defenders increases with the values of SNRs.
As shown in Figure 11, the total utility of defenders decreases as the Nakagami parameter $m$ increases, and the proposed algorithm can achieve a higher utility than the benchmark algorithm across various values of $m$. The reason is that a larger value of $m$ leads to a higher channel fading, and thus the utility achieved by defenders is lower when occupying the same amount of bandwidth and paying the same cost of transmission power.
As shown in Figure 12, the total utility of defenders increases as the Nakagami parameter $\phi$ increases, and the proposed algorithm can achieve a higher utility than the benchmark algorithm across various values of $\phi$. The reason is that a smaller value of $\phi$ leads to a higher channel fading, and thus the utility achieved by defenders is lower when occupying the same amount of bandwidth and paying the same cost of transmission power.

Conclusion
This paper has proposed a game theoretic model that accounts for the adversarial and dynamic nature of the urban IoBT environment. By employing a Stackelberg game theoretic method, our proposed framework can effectively leverage network resources and improve network performance in an adversarial scenario. We also consider the interactive effects of dynamic channel fading by using a Nakagami distribution based Markov process. The detailed analysis illustrates that, by considering the impact of the adversarial and dynamic nature of urban IoBT, our proposed algorithm can improve the entire network performance. Security is known to be one of the most significant concerns in the IoBT environment. In our future work, we will therefore explore an authentication model that fits the characteristics of the IoBT scenario and combine it with network optimization.

Data Availability
No data were used in this work.

Conflicts of Interest
The authors declare that they have no conflicts of interest.