Location Aided Cooperative Detection of Primary User Emulation Attacks in Cognitive Wireless Sensor Networks Using Nonparametric Techniques

Primary user emulation (PUE) attacks are a major security challenge to cognitive wireless sensor networks (CWSNs). In this paper, we propose two variants of the PUE attack, namely, the relay and replay attacks. Such threats are conducted by malicious nodes that replicate the transmissions of a real primary user (PU), thus making them resilient to many defensive procedures. However, we show that those PUE attacks can be effectively detected by a set of cooperating secondary users (SUs), using location information and received signal strength (RSS) measurements. Two strategies for the detection of PUE relay and replay attacks are presented in the paper: parametric and nonparametric. The parametric scheme is based on the likelihood ratio test (LRT) and requires the existence of a precise path loss model for the observed RSS values. In contrast, the nonparametric procedure is not tied to any particular propagation model, so it does not require any calibration process and is robust to changing environmental conditions. Simulations show that the nonparametric detection approach is comparable in performance to the LRT under moderate shadowing conditions, especially in the case of replay attacks.


Introduction
Cognitive radio (CR) is considered a key concept for the development of future wireless networks [1]. One of the technologies enabled by CR systems is dynamic spectrum access (DSA), which promises a much more efficient use of spectrum resources by reusing frequency channels in different networks as they become available. The usual approach in DSA is to assume the existence of a set of licensed primary users (PUs) who have unrestricted use of the available spectrum and a network of unlicensed secondary users (SUs), equipped with CR devices, who are allowed to occupy the portions of the spectrum left temporarily free by the PUs. The stringent requirement here is that SUs should not interfere with the PUs (interweaving paradigm), a fact that forces the SUs to continually perform spectrum sensing to determine the existence of a frequency band with no PU activity in the neighborhood of the SUs (also known as "white space" or "spectrum hole") so as to allocate their own transmissions. Notice that, for a given frequency, a spectrum hole can occur both in time (during a period when the PUs are not transmitting) and in space (if the communication ranges of the PUs and SUs do not overlap).
As the SUs are required to refrain from using the unlicensed band as soon as PU activity is detected, a potential attacker could exploit this fact by filling white spaces with fake emissions that mimic those of a real PU transmitter. This is the so-called primary user emulation (PUE) attack [2], which, if successful, could potentially render the CR network inoperative, because the SUs would not be able to find spectrum holes to establish links between them. A PUE attack can be launched by a set of malicious nodes, equipped with radio interfaces compatible with those of the PUs.
Although a number of detection schemes of PUE attacks have been suggested [3], the most effective countermeasures require some form of cooperation between the PUs and the CR network by means of cryptographic protocols. For instance, the PUs and SUs could interchange in advance some secret keys, and then the PUs would use them to authenticate themselves by including digital signatures in their transmissions. In this way, as long as the malicious nodes do not have access to the secret keys, SUs could distinguish legitimate PU transmissions from fake ones, because the latter would not carry the correct signatures. However, as we will see, cryptographic protocols are vulnerable to two kinds of threats, namely, the relay (or "wormhole") and the replay (or "playback") attacks.
A wormhole is a direct high-speed private communication link between two distant malicious nodes [4]. To launch a PUE relay attack, one of the wormhole nodes is placed near a PU and the other in the neighborhood of the CR network. Then, the node in the vicinity of the PU captures its transmissions and sends them through the wormhole to the other node, which simply relays the packets unaltered so that they are received by the SUs. Therefore, even if the SUs are too far from the PUs to cause interference, they are deceived into believing there is a nearby PU, which prevents them from using the unlicensed channel. On the other hand, a PUE replay attack can be carried out by a single malicious node that records network packets from an active PU and retransmits them whenever the PU is inactive. In this way, the malicious node fills white spaces with spurious PU transmissions and, as the SUs in the neighborhood of the malicious node are continuously detecting emissions carrying valid PU signatures, they cannot use the channel for their own communications.
As PUE attacks using relay or replay techniques are resilient to cryptographic countermeasures, other strategies should be used to detect them. The conventional approach is to exploit the fact that the malicious nodes and the PU transmitter are located at different places and, assuming that the actual position of the PU is known by the SUs, use localization techniques to validate the origin of the detected PU transmissions. Such localization procedures are based on the capability of any radio receiver to obtain some kind of measurement related to its distance to the transmitter; in the case of a wireless sensor network (WSN) device, this is typically the average power measured at the receiver input, usually defined as received signal strength (RSS). However, most approaches assume the existence of a precise analytical model that relates RSS values to distances, which, in many cases, is a rather unrealistic assumption.
In this paper, we address the problem of detecting PUE relay and replay attacks based on localization techniques using RSS measurements in cognitive wireless sensor networks (CWSNs). In case the observations are guaranteed to fit a predefined path loss model, we propose the use of a cooperative detection strategy based on the likelihood ratio test (LRT); on the other hand, for most situations, we propose the use of a nonparametric model-free scheme to detect PUE attacks.
The rest of the paper is organized as follows: Section 2 reviews related work concerning PUE attack detection. Section 3 defines the particular attacks to be counteracted. Section 4.1 formulates the cooperative location-based PUE detection problem under the framework of statistical hypothesis testing and derives the LRT. Section 4.2 presents our nonparametric approach to PUE detection. Section 5 evaluates the performance of the proposed PUE attack detection strategies through simulations. Finally, Section 6 draws some conclusions.

Related Work
The topic of PUE attacks has been extensively studied in recent years, and many different defensive measures are described in the related literature.
The first reference to this kind of attack is in [2], where they are classified under two different categories: selfish (the attacker tries to maximize its own spectrum usage) and malicious (the objective of the attacker is to disturb the DSA process of SUs). In that work, a transmitter verification scheme is proposed to counteract PUE attacks that uses both signal characteristics and location information; the localization of the active transmitter is performed (using RSS values and a parametric measurement model) by an auxiliary WSN composed of a high number of devices distributed along the deployment area of the SUs.
The authors of [5] use Wald's sequential probability ratio test to detect PUE attacks conducted by a set of malicious nodes. Their approach is based on RSS observations and location information but assumes that there is no cooperation between SUs, so each node performs its own test independently of the others.
The use of a cooperative belief propagation procedure is proposed in [6] to detect a potential attacker by using RSS measurements and interactions between neighboring SUs. This approach also requires a parametric model relating the received power to the distance between transmitter and receiver.
In [7], the authors propose a method for the localization of attackers using time difference of arrival (TDOA) instead of RSS. Thus, it requires modifications of the physical layer to measure signal time delays.
The authors of [8] propose to analyze anomalous behavior (such as changes in transmitted power and/or bandwidth) and use collaboration between SUs to detect PUE attacks in CWSN scenarios. Therefore, for this method to be effective, what is considered an anomaly must be precisely defined.
In [9], a cryptographic method is proposed to authenticate the PU transmissions, but it requires modifications in the PU transmitter and receivers to accommodate encrypted data and also a mechanism to securely distribute secret keys to both the PUs and the SUs. Another cryptographic mechanism is proposed in [10] to authenticate PU transmissions with the aid of a helper node placed physically close to the PU transmitter. As stated before, cryptographic protocols can be defeated by relay and replay attacks.
Device-specific features ("fingerprints") along with unique IDs are used in [11][12][13] to detect potential attackers in CR networks. Those fingerprints are features extracted directly from the analog signals transmitted by the network nodes, which vary from device to device because of the inevitable differences between hardware components.
Most of the PUE attack detection methods described above have significant drawbacks when applied to CWSNs. In particular, those based on the use of location information lack robustness because they are tied to a parametric model for the measurements that must be precisely tuned to every specific environment.

Attack Models
We will consider two types of PUE attacks, namely, relay and replay attacks, as shown in Figure 1. Both have in common the existence of an adversary who tries to deceive the SUs into believing that there is a PU transmitting in their neighborhood, so that they are forced to refrain from using the PU frequency channel (i.e., they are malicious PUE attacks [2]).
Figure 1(a) represents an example of a relay attack. Here, we assume that there is an active PU transmitter, but it is located far away from the SUs so that the coverage areas of the PUs and SUs do not intersect; therefore, the SUs could use the unlicensed channel without causing interference to PUs. However, a malicious node located in the neighborhood of the PU transmitter is sending the PU emissions through a private link to another node that, in turn, broadcasts them to the SUs. Thus, a PUE relay attack fills spectrum holes in the space domain by making remote PUs appear as local in the CR network.
Figure 1(b) illustrates a replay attack. Such an attack assumes that the SUs are in the vicinity of a PU transmitter, which is presently inactive. However, a malicious node that has previously recorded a set of PU transmissions is now replaying them to the SUs. Therefore, a PUE replay attack fills spectrum holes in the time domain by making inactive PUs appear as active in the CR network.
Notice that as the malicious nodes do not manipulate the information contained in the packets, these PUE attacks resist defensive measures solely based on cryptographic protocols. In the sequel, we will see how location information, along with RSS measurements collected by the SUs, can be effectively used to defeat both relay and replay attacks.

PUE Detection Using RSS
Our detection schemes will be based on the following assumptions:
(i) There is one (and only one) active transmitter that can be either a legitimate PU or a malicious attacker.
(ii) There are $N$ SUs carrying out a spectrum sensing stage, during which every SU takes one RSS measurement and sends it to a central fusion node.
(iii) The fusion node is aware of the locations of the PU and SUs and uses this information and the $N$ RSS values to decide whether there is an attack or not.
Therefore, following an approach analogous to [14], we state a PUE attack detection procedure as a binary hypothesis testing problem: given a vector of $N$ RSS observations, we must decide between hypotheses $H_0$ (no attack) and $H_1$ (a PUE attack is launched).
In the sequel, we will use the following notation: $\mathbf{p}$ is the position in the plane of the PU. So, the observations collected by the SUs are, under $H_0$ (no attack), the RSS values of packets transmitted by the PU and received by the SUs. Thus, according to our notation, the null hypothesis (no PUE attack) can be formulated as

$$H_0:\; r_i = r(\mathbf{p}, \mathbf{s}_i), \qquad i = 1, 2, \ldots, N. \tag{1}$$

Now, depending on the assumed model for the observations, we can define both parametric and nonparametric tests, as explained in the following subsections.

Parametric Approach.
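In order to formalize a parametric test, we need a suitable statistical description of the observations. In our approach, we will use the standard log-distance path loss model [15] that links RSS values (in logarithmic scale) to distances between nodes as

$$r(\mathbf{x}, \mathbf{y}) = P_0 - 10\,\alpha \log_{10} d(\mathbf{x}, \mathbf{y}) + \varepsilon, \tag{2}$$

where $P_0$ is the mean RSS at unit distance of the transmitter (depends on the transmitted power and antenna gains), $\alpha$ is the path loss exponent (depends on the specific environment), and $\varepsilon$ is a zero-mean Gaussian random variable with standard deviation $\sigma$ (in dB) that takes into account shadowing effects. Therefore, $r(\mathbf{x}, \mathbf{y})$ is also a Gaussian random variable with standard deviation $\sigma$ and mean $\mu(\mathbf{x}, \mathbf{y})$, with

$$\mu(\mathbf{x}, \mathbf{y}) = E[r(\mathbf{x}, \mathbf{y})] = P_0 - 10\,\alpha \log_{10} d(\mathbf{x}, \mathbf{y}), \tag{3}$$

where $E[\cdot]$ stands for "expected value."
Now, assuming that the shadow fading is spatially uncorrelated, the shadowing terms are independent and identically distributed (IID), so the observations are independent and the distribution of the vector of RSS values $\mathbf{r}$ is multivariate normal with mean vector $\boldsymbol{\mu} = [\mu_1, \mu_2, \ldots, \mu_N]^T$ and covariance matrix $\sigma^2 \mathbf{I}$, where $\mathbf{I}$ is the identity matrix of order $N$; thus, the joint probability density function (PDF) of the RSS measurements is

$$f(\mathbf{r}; \boldsymbol{\mu}) = (2\pi\sigma^2)^{-N/2} \exp\left(-\frac{1}{2\sigma^2} \sum_{i=1}^{N} (r_i - \mu_i)^2\right). \tag{4}$$

The only parameter of (4) that depends on the position is $\boldsymbol{\mu}$, so we will formulate the PUE attack detection problem as a test of the mean vector of $\mathbf{r}$:

$$H_0:\; \boldsymbol{\mu} = \boldsymbol{\mu}_0, \qquad H_1:\; \boldsymbol{\mu} \neq \boldsymbol{\mu}_0, \tag{5}$$

where $\boldsymbol{\mu}_0$ is determined assuming that there is no attack present:

$$\boldsymbol{\mu}_0 = [\mu(\mathbf{p}, \mathbf{s}_1), \mu(\mathbf{p}, \mathbf{s}_2), \ldots, \mu(\mathbf{p}, \mathbf{s}_N)]^T. \tag{6}$$

We can see that $H_0$ is a simple hypothesis, because all the magnitudes involved in (6), namely, the path loss model parameters ($P_0$ and $\alpha$) and the positions of the PU and SUs ($\mathbf{p}, \mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_N$), are assumed known. On the other hand, under $H_1$ (PUE attack), the packets received by the SUs are transmitted by the local malicious node, as shown in Figure 1; therefore, the RSS values for these packets will be unrelated to position $\mathbf{p}$ of the emulated PU. Therefore, we can obtain the likelihood ratio test (LRT) [16] as

$$\Lambda(\mathbf{r}) \underset{H_0}{\overset{H_1}{\gtrless}} \eta, \tag{7}$$

where

$$\Lambda(\mathbf{r}) = \frac{\max_{\boldsymbol{\mu}} f(\mathbf{r}; \boldsymbol{\mu})}{f(\mathbf{r}; \boldsymbol{\mu}_0)} \tag{8}$$

is the likelihood ratio and $\eta$ is a threshold selected so that we have a given probability of false alarm (PFA). The numerator of (8) is easily obtained, according to (4), by setting $\boldsymbol{\mu} = \mathbf{r}$:

$$\max_{\boldsymbol{\mu}} f(\mathbf{r}; \boldsymbol{\mu}) = (2\pi\sigma^2)^{-N/2}, \tag{9}$$

while the denominator of (8) is, taking into account (4) and (6),

$$f(\mathbf{r}; \boldsymbol{\mu}_0) = (2\pi\sigma^2)^{-N/2} \exp\left(-\frac{T(\mathbf{r})}{2\sigma^2}\right), \tag{10}$$

with $T(\mathbf{r})$ the "sum of squares"

$$T(\mathbf{r}) = \sum_{i=1}^{N} \left(r_i - \mu(\mathbf{p}, \mathbf{s}_i)\right)^2. \tag{11}$$

Now, taking into account (8), (9), and (10), we can compute the logarithm of the likelihood ratio as

$$\ln \Lambda(\mathbf{r}) = \frac{T(\mathbf{r})}{2\sigma^2}, \tag{12}$$

so that a test equivalent to (7) is

$$T(\mathbf{r}) \underset{H_0}{\overset{H_1}{\gtrless}} \eta', \tag{13}$$

where $\eta'$ is another suitable threshold selected so that

$$\Pr[T(\mathbf{r}) > \eta' \mid H_0] = P_{FA}, \tag{14}$$

with $P_{FA}$ the probability of false alarm. To determine the value of the threshold $\eta'$ we can use the fact that, under $H_0$, $T(\mathbf{r})/\sigma^2$ can be expressed as the sum of the squares of $N$ independent standard normal random variables, and so it has a chi-square distribution with $N$ degrees of freedom; hence $\eta'$ equals $\sigma^2$ times the $(1 - P_{FA})$ quantile of that distribution. We can see from (11) that $T(\mathbf{r})$ is a measure of "goodness of fit" of the PU position to the observations, assuming that the path loss model (2) is correct. Notice also that as the number of terms in (11) is $N$, the complexity of the LRT grows linearly with the number of SUs.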
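As an illustration of the parametric test, the following Python sketch evaluates the sum-of-squares statistic of (11) and a threshold for test (13). It is only a minimal sketch under stated assumptions, not the implementation used for the simulations reported here: the function names are hypothetical, positions are plain coordinate pairs, and the chi-square quantile is approximated with the Wilson-Hilferty formula so that only the standard library is needed.

```python
import math
from statistics import NormalDist


def mean_rss(p0, alpha, tx, rx):
    """Mean RSS in dBm predicted by the log-distance path loss model."""
    return p0 - 10.0 * alpha * math.log10(math.dist(tx, rx))


def sum_of_squares(rss, pu_pos, su_positions, p0, alpha):
    """Squared misfit between the observed RSS values and the means expected
    if the transmitter were located at the known PU position."""
    return sum((r - mean_rss(p0, alpha, pu_pos, s)) ** 2
               for r, s in zip(rss, su_positions))


def lrt_threshold(n, sigma, pfa):
    """Approximate threshold giving false-alarm probability pfa.

    Under H0 the statistic divided by sigma**2 is chi-square with n degrees
    of freedom; its (1 - pfa) quantile is approximated here with the
    Wilson-Hilferty formula (an assumption made to avoid external libraries).
    """
    z = NormalDist().inv_cdf(1.0 - pfa)
    c = 2.0 / (9.0 * n)
    return sigma ** 2 * n * (1.0 - c + z * math.sqrt(c)) ** 3


def lrt_detects_attack(rss, pu_pos, su_positions, p0, alpha, sigma, pfa):
    """Declare an attack when the misfit exceeds the threshold."""
    stat = sum_of_squares(rss, pu_pos, su_positions, p0, alpha)
    return stat > lrt_threshold(len(rss), sigma, pfa)
```

With hypothetical values such as `p0 = -40` dBm and `alpha = 2.3`, RSS values that match the PU position exactly give a statistic of zero, whereas values produced by a transmitter much closer to the SUs inflate it well beyond the threshold.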

Nonparametric Approach.
The PUE attack detection strategy of Section 4.1 assumes the existence of a well-defined measurement model that describes the statistical relationship between observed RSS values and distances. However, in most instances, such a model can only be stated under idealized conditions or is tied to a specific scenario; in this latter case, estimating its parameters often requires a costly calibration phase which must be repeated every time the environmental conditions change. Therefore, it would be desirable to devise a PUE attack detection procedure that is "nonparametric" or "model-free" in the sense that, unlike test (13), this alternative strategy does not impose a particular distribution for the observations; thus, such a test will be robust against departures from any predefined model.
As we will see, in order to obtain a nonparametric test, we only need a simple assumption: we will accept the universal validity (under ideal conditions) of the "monotonicity constraint" that loosely relates the Euclidean distances of a single transmitting node ($\mathbf{x}$) to two different receiving nodes ($\mathbf{y}$ and $\mathbf{z}$) and the RSSs measured by the receivers as

$$d(\mathbf{x}, \mathbf{y}) < d(\mathbf{x}, \mathbf{z}) \;\Longrightarrow\; r(\mathbf{x}, \mathbf{y}) > r(\mathbf{x}, \mathbf{z}). \tag{15}$$

Such a constraint is based on the fact that radio waves attenuate with distance and assumes that transmitters and receivers are using omnidirectional antennas. In practice, channel effects such as multipath and shadow fading could cause some violations of condition (15); however, we expect that if the number of SUs participating in the detection process is sufficiently high, such outlying RSS measurements will be averaged out in the computation of the test statistics. Notice that (15) implies the existence of an inverse relationship or negative correlation between RSS values and distances.
To carry out the PUE attack detection, a fusion node with complete location information collects RSS measurements from the SUs and uses them to look for possible violations of monotonicity constraint (15). If the CWSN has been compromised by a PUE attack like those of Figure 1, the source of those packets will be the local malicious node, whose position is, with a high probability, different from that of the PU, so that many of the measured RSSs will be in complete disagreement with the distances between the legitimate PU and the SUs.
As a measure of dissimilarity between distances and RSS measurements, we have used a slight modification of the classical Kendall tau distance [17], which is a metric that counts the number of pairwise disagreements between two lists. In our case, the test statistic counts the number of violations of monotonicity constraint (15) for every possible pair of PU to SU distances and the measured RSS values as

$$K(\mathbf{r}) = \left|\left\{(i, j) : d(\mathbf{p}, \mathbf{s}_i) < d(\mathbf{p}, \mathbf{s}_j) \text{ and } r_i \le r_j\right\}\right|, \tag{16}$$

where $|\Omega|$ denotes the cardinal number of a set $\Omega$. As the test statistic $K(\mathbf{r})$ is a discrete random variable (it only takes integer values), the decision procedure should include two parameters to exactly obtain a predefined PFA: an integer detection threshold $k$ and a real number $\rho$ (with $0 \le \rho \le 1$), such that

$$\Pr[K(\mathbf{r}) > k \mid H_0] + \rho \Pr[K(\mathbf{r}) = k \mid H_0] = P_{FA}, \tag{17}$$

where $P_{FA}$ is the desired probability of false alarm. An attack is then declared whenever $K(\mathbf{r}) > k$ and, with probability $\rho$, when $K(\mathbf{r}) = k$.
The number of comparisons between RSS measurements needed to obtain $K(\mathbf{r})$ in (16) is $\binom{N}{2} = N(N-1)/2$, so the complexity of this test is approximately proportional to the square of the number of SUs.
Notice that the proposed test is not exactly a nonparametric version of the test described in Section 4.1: while the sum of squares (11) tries to determine if the RSS observations are consistent with the PU position (assuming a particular path loss model), the Kendall tau distance (16) evaluates the correlation of the observed RSS values with the distances from the SUs to the PU transmitter.
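The counting statistic and the randomized decision rule described above can be sketched in Python as follows. This is a hedged illustration rather than the exact procedure used in the paper: the function names are hypothetical, equidistant SU pairs are skipped and RSS ties are counted as violations (both are assumptions about how ties are handled), and the pair (threshold, randomization probability) is fitted to a list of attack-free statistic values supplied by the caller.

```python
import math
import random


def violation_count(rss, pu_pos, su_positions):
    """Count SU pairs whose RSS ordering contradicts their distance ordering
    to the PU (the closer SU does not report the stronger signal)."""
    dist = [math.dist(pu_pos, s) for s in su_positions]
    n = len(rss)
    count = 0
    for i in range(n):
        for j in range(i + 1, n):
            if dist[i] == dist[j]:
                continue  # equidistant pair: the constraint says nothing
            near, far = (i, j) if dist[i] < dist[j] else (j, i)
            if rss[near] <= rss[far]:
                count += 1
    return count


def randomized_rule(null_samples, pfa):
    """Pick an integer threshold k and a probability rho from attack-free
    samples so that P(K > k) + rho * P(K = k) equals pfa on the empirical
    distribution of those samples."""
    m = len(null_samples)
    for k in sorted(set(null_samples)):
        tail = sum(1 for v in null_samples if v > k) / m
        if tail <= pfa:
            at_k = sum(1 for v in null_samples if v == k) / m
            return k, (pfa - tail) / at_k
    raise ValueError("null_samples must not be empty")


def decide_attack(stat, k, rho, rng=random):
    """Declare an attack if stat > k, or with probability rho if stat == k."""
    return stat > k or (stat == k and rng.random() < rho)
```

The double loop makes the quadratic cost in the number of SUs explicit: every one of the N(N-1)/2 pairs is visited once.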

Simulation Results
We have conducted some simulations to evaluate and compare the performance of the PUE attack detection strategies described in Sections 4.1 and 4.2. We consider a static scenario, where the location of a given network node is always modeled as a two-dimensional random variable, independent of the positions of other nodes, and uniformly distributed inside a region of the plane that depends on the type of the node and the simulated attack.
We consider three different types of nodes:
(i) Secondary users: they constitute a CWSN composed of a set of $N$ nodes at positions $\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_N$, deployed in an area modeled as a circle with radius $R_1$ and centered around the coordinate origin.
(ii) Primary user: it is a single transmitter located at position $\mathbf{p}$, which is assumed known by all the SUs.
(iii) Attackers: they can be two malicious nodes (relay attack) or a single node (replay attack).
The locations of the PU and the attackers depend on the kind of attack:
Relay attack: the PU is located outside the deployment area of the SUs, so we choose its position $\mathbf{p}$ lying inside a ring with inner radius $R_1$ and outer radius $R_2$ (with $R_2 > R_1$) and centered around the coordinate origin. Under this kind of attack, a wormhole is established between two malicious nodes: a remote receiver near the PU transmitter (whose exact position is irrelevant in our simulations) and a local wormhole transmitter located at a point $\mathbf{m}$ inside the deployment area of the SUs.
Replay attack: both the PU and the (single) attacker (at positions $\mathbf{p}$ and $\mathbf{m}$, resp.) are located inside the deployment area of the SUs.
In our simulations, we have chosen the following values for the location parameters: $R_1 = 20$ m and $R_2 = 40$ m. On the other hand, the RSS values are computed using the log-distance path loss model (2), with parameter $\alpha = 2.3$, as suggested in [18]. We assume that both the PU and the attacker use the same transmission power, so the mean received power at unit distance $P_0$ is the same in all cases.
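A scenario of this kind could be generated along the following lines. The Python sketch below is illustrative only: the function names are hypothetical, the 20 m and 40 m values are read as the radius of the SU deployment disc and the outer radius of the PU ring (an assumed interpretation of the location parameters), and the default `p0` and `sigma` values are placeholders that are not taken from the paper.

```python
import math
import random


def uniform_in_ring(rng, r_in, r_out):
    """Uniform point in the annulus r_in <= radius <= r_out around the origin.
    Drawing the squared radius uniformly keeps the density constant per unit
    area; r_in = 0 gives a uniform point in a disc."""
    radius = math.sqrt(rng.uniform(r_in ** 2, r_out ** 2))
    angle = rng.uniform(0.0, 2.0 * math.pi)
    return (radius * math.cos(angle), radius * math.sin(angle))


def draw_scenario(rng, attack, n_su=40, r_su=20.0, r_out=40.0):
    """Draw SU, PU and local attacker positions for a 'relay' or 'replay' attack."""
    if attack not in ("relay", "replay"):
        raise ValueError("attack must be 'relay' or 'replay'")
    sus = [uniform_in_ring(rng, 0.0, r_su) for _ in range(n_su)]
    attacker = uniform_in_ring(rng, 0.0, r_su)  # local malicious transmitter
    if attack == "relay":
        pu = uniform_in_ring(rng, r_su, r_out)  # PU outside the SU area
    else:
        pu = uniform_in_ring(rng, 0.0, r_su)    # PU inside the SU area
    return sus, pu, attacker


def observe_rss(rng, tx, sus, p0=-40.0, alpha=2.3, sigma=4.0):
    """One RSS value (dBm) per SU: log-distance path loss plus Gaussian
    shadowing in dB. The p0 and sigma defaults are placeholder values."""
    return [p0 - 10.0 * alpha * math.log10(math.dist(tx, s))
            + rng.gauss(0.0, sigma) for s in sus]
```

Passing the PU position as `tx` produces attack-free observations, while passing the attacker position produces the observations seen under either attack, since the PU and the attacker are assumed to share the same transmission power.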
For the simulated scenarios, we have run both the LRT (Algorithm 1) and the proposed nonparametric test (Algorithm 2). The detection thresholds are obtained empirically, so that the predetermined PFA is attained when no attack is present.
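One simple way to calibrate a threshold empirically is to take a quantile of the statistic over attack-free runs and then measure how often attack runs exceed it. The Python sketch below shows this idea for a continuous-valued statistic; it is an assumed procedure with hypothetical function names, and the exact calibration used in the simulations may differ.

```python
import math


def empirical_threshold(null_stats, pfa):
    """Sample value t such that at most a fraction pfa of the attack-free
    statistics is strictly greater than t."""
    ordered = sorted(null_stats)
    index = math.ceil((1.0 - pfa) * len(ordered)) - 1
    return ordered[max(index, 0)]


def detection_probability(attack_stats, threshold):
    """Fraction of attack runs in which the statistic exceeds the threshold."""
    return sum(s > threshold for s in attack_stats) / len(attack_stats)
```

For an integer-valued statistic such as a pair count, the quantile alone generally cannot hit the target PFA exactly, which is why a randomized decision on the threshold value is needed in that case.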
Some results are represented in Figure 2, where we have plotted the attained probability of detection for the PUE attack detection schemes of Sections 4.1 and 4.2 under different situations. The PFA is fixed at 0.05, the number of SUs is 40, and we conducted 1000 simulation runs with varying path loss standard deviation $\sigma$ (that controls the effect of shadowing). In a real-world situation, the parametric approach requires a previous calibration phase during which the propagation model parameters (namely, the path loss exponent $\alpha$ and the mean received power at unit distance $P_0$) are estimated. So, for the case of the parametric test (LRT) of Section 4.1, we considered two situations: an ideal one that assumes that the environment is perfectly calibrated and the exact value of the path loss exponent $\alpha$ is known (perfect estimation) and another one that assumes that the value of $\alpha$ is underestimated by 10%, and so the PUE attack detector uses in (11) a value of the path loss exponent that is slightly lower than the one used to generate the observations (biased estimation).
By examining Figure 2(a), we can observe that, for a relay attack, the parametric approach performs quite well as long as the path loss model parameters are well estimated; this can be easily explained because the local malicious node uses the same transmitting power as the remote PU, and so the values of the RSS at the SUs are usually too high to be accepted as originating from distant PU transmissions (assuming that shadowing effects are not too severe). In contrast, the nonparametric test does not use any model parameters and so it cannot be aware of the dissimilarity between the observed RSSs and their expected values provided by the path loss model. Notice, however, that when the model parameters used in the LRT depart from the "true" ones, the performance of the parametric test suffers a significant degradation and can even be surpassed by the nonparametric procedure.
In the case of a replay attack, we can see in Figure 2(b) that the nonparametric test achieves a comparable performance to that of the LRT and even performs better for moderately high values of the path loss standard deviation (especially if there is bias in the estimation of the model parameters). This is because, under this kind of attack, the LRT cannot take advantage of any significant differences between the average values of the RSSs induced by the attacker transmitter and those attributable to an active PU, as both the PU and the malicious node lie in the same region and, therefore, their average distances to the SUs are approximately the same. Notice also from Figure 2(b) that the detection performance of the LRT rapidly decreases with increasing shadowing, while the nonparametric attack detector only degrades smoothly. This may be explained because this latter detector is based on a nonparametric estimation of the correlation between the RSS measurements and the distances from the SUs to the PU, which appears to be fairly robust to significant deviations of the RSSs from their mean values due to channel impairments.

Conclusions
In this paper, we presented two models of PUE attacks on a CWSN that are resilient to naive defensive measures based on cryptographic protocols but can be effectively counteracted by using location information and RSS measurements. The attack detection process can be formulated as a problem of hypothesis testing for which we derived the LRT, assuming the RSS values follow a standard log-normal path loss model. In order to make the attack detection process robust to changing environmental conditions, we further proposed a nonparametric procedure that is not tied to any propagation model but instead assumes a simple monotonicity constraint for the RSS values. Simulations suggest that the nonparametric test can compete in performance with the parametric scheme and even performs better under moderate shadowing conditions or significant departures from the assumed measurement model, especially in the case of replay attacks.

$\mathbf{s}_1, \mathbf{s}_2, \ldots, \mathbf{s}_N$ are the positions of the SUs.
$\mathbf{r} = [r_1, r_2, \ldots, r_N]^T$ (where the superscript $T$ denotes "transpose") is the vector of RSS values (in dBm) measured by the SUs.
$d(\mathbf{x}, \mathbf{y}) = \|\mathbf{x} - \mathbf{y}\|$ is the Euclidean distance between two arbitrary network nodes at positions $\mathbf{x}$ and $\mathbf{y}$.
$r(\mathbf{x}, \mathbf{y})$ is the RSS (in dBm) measured by the receiver of a node at position $\mathbf{y}$ for a signal transmitted by a node at position $\mathbf{x}$.