A Support Vector Data Description Approach to NLOS Identification in UWB Positioning

Non-line-of-sight (NLOS) propagation is one of the most important challenges in radio positioning, and in recent years significant attention has been drawn to the identification and mitigation of NLOS signals. This paper focuses on the identification of NLOS signals. We treat NLOS identification as a one-class classification problem and apply the support vector data description (SVDD), which provides accurate data descriptions by means of kernel techniques, to perform NLOS identification in ultrawide bandwidth (UWB) positioning. Our work is based on the fact that some features extracted from the received signal waveforms, such as the kurtosis, the mean excess delay, and the root mean square delay spread, differ between line-of-sight (LOS) and NLOS signals. Numerical simulations on a dataset derived from a measurement campaign demonstrate the performance.


Introduction
The availability of positional information plays an important role in many sectors, such as commercial, public safety, and military applications [1]. Currently, the global positioning system (GPS) is the most important technology for providing location awareness around the globe through a constellation of at least 24 satellites [2]. Unfortunately, since the received GPS signals have extremely low power, they can be easily obstructed. For instance, GPS-based techniques fail to provide satisfactory performance due to signal blockage in many scenarios, such as in urban terrain, inside buildings, and in forests or jungles. For the same reason, GPS receivers are also susceptible to jamming and unintentional interference [3].
To overcome the above problems, ground-based wireless positioning systems [4], which can serve as an augmentation or complement to GPS, have been introduced to provide positional information. Driven by the attractive characteristics of ultrawide bandwidth (UWB) transmission in cluttered environments, such as superior multipath resolution, obstacle-penetrating propagation, and high-resolution ranging capabilities [5][6][7], UWB techniques have attracted increasing attention in the field of positioning.
However, UWB systems still face a number of technical challenges in practical scenarios, such as signal acquisition, multiuser interference, multipath effects, and non-line-of-sight (NLOS) propagation [8,9]. The last of these, NLOS propagation, results in a positive bias in the distance measurement and has been recognized as one of the major challenges for high-resolution localization systems. The scenarios described previously, in which GPS does not work well, are usually also those in which NLOS conditions frequently occur, and these are precisely the areas where UWB systems are expected to operate. Therefore, it is necessary to identify the NLOS signals and to mitigate their effects on positioning systems.
The goal of NLOS identification is to determine whether the range measurement between a transmitter and a receiver is biased or not. The topic has been discussed extensively in the literature, both for cellular networks and for UWB-based networks. For more information about NLOS identification and mitigation, we refer the reader to [8][9][10][11][12][13] and the references therein.
It is worth noting that, in [8], a nonparametric approach to NLOS identification relying only on UWB waveform measurements has been evaluated. The approach in [8] performs NLOS identification based on support vector machines (SVM), without requiring a statistical characterization of the waveforms under LOS and NLOS conditions. In their subsequent publication [9], the authors propose two nonparametric regression techniques, built on machine learning tools, that estimate the ranging error based solely on the received waveform and the estimated distance.
SVM is one of the most widely used classification techniques. However, it requires labeled inputs from at least two classes, and in some cases only one class of samples can be acquired (due to the complexity or the cost of collecting the other). Furthermore, due to the diversity of defect modes (NLOS caused by different kinds of obstruction, NLOS under various environments, etc.) and their occurrence frequencies, the true distribution of the defective received signal patterns is difficult to obtain. In contrast, normal signals are easy to collect and exhibit only small variations, as shown in Figure 1. Hence, the NLOS identification problem can be regarded as the problem of separating normal received signals (LOS signals) from defective received signals (NLOS signals), and one-class classification is more appropriate than two-class classification for solving it.
In this paper, we apply the SVDD [14] to perform NLOS identification in ultrawide bandwidth (UWB) positioning. The SVDD is one of the best-known one-class support vector learning methods; it uses balls defined in the feature space to distinguish a set of normal data from all other possible abnormal objects [15]. Compared to the existing SVM-based method, our approach has the added benefit that it can be applied when only LOS training data is available. To the best of the authors' knowledge, this is the first work to consider NLOS identification as a one-class classification problem (also called outlier detection or novelty detection). Our findings are validated on a dataset of UWB waveforms.
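The remainder of the paper is organized as follows. Section 2 presents the problem formulation. In Section 3, the proposed techniques for NLOS identification are described. In Section 4, we perform numerical simulations based on a dataset derived from measurements. Finally, conclusions are given in Section 5.

Problem Formulation
Under the assumption that the first $L$ range measurements are biased, the range measurements between device $i$ and beacon $j$ can be expressed as
$$\hat{d}_{ij} = \begin{cases} \|\mathbf{x}_i - \mathbf{x}_j\| + b_{ij} + v_{ij}, & \text{for the first } L \text{ (biased) measurements}, \\ \|\mathbf{x}_i - \mathbf{x}_j\| + v_{ij}, & \text{otherwise}, \end{cases} \tag{1}$$
where the symbol ‖⋅‖ denotes Euclidean distance, $i \in \mathcal{M}$ (the set of devices), $j \in \mathcal{N}$ (the set of beacons); $b_{ij}$ denotes the NLOS measurement error between them, expressed in distance units; and $v_{ij}$ is the measurement noise.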
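The effect of the NLOS term in the range model can be illustrated with a minimal simulation sketch. The function name `simulate_ranges`, the exponential distribution of the bias, and the default parameter values are assumptions made only for illustration; they do not come from the paper or its dataset.

```python
import numpy as np

def simulate_ranges(device, beacons, nlos_mask, bias_mean=1.0, noise_std=0.1, seed=0):
    """Return (measured ranges, true distances) for one device and several beacons.

    Measured range = Euclidean distance + noise, plus a positive bias on NLOS links.
    """
    rng = np.random.default_rng(seed)
    device = np.asarray(device, dtype=float)
    beacons = np.asarray(beacons, dtype=float)
    nlos_mask = np.asarray(nlos_mask, dtype=bool)

    true_dist = np.linalg.norm(beacons - device, axis=1)
    noise = rng.normal(0.0, noise_std, size=true_dist.shape)
    # Assumption: the NLOS bias is exponentially distributed (hence non-negative).
    bias = rng.exponential(bias_mean, size=true_dist.shape) * nlos_mask
    return true_dist + bias + noise, true_dist

ranges, true_dist = simulate_ranges([0.0, 0.0], [[3.0, 4.0], [6.0, 8.0]], [False, True])
print(ranges - true_dist)  # the second (NLOS) link carries the additional bias term
```

With the noise switched off (`noise_std=0.0`), the LOS range equals the true distance and the NLOS range is never smaller than it, which is the positive bias the identification step is meant to detect.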
The basic idea in this paper is to use the received signal waveforms to identify whether the range measurement $\hat{d}_{ij}$ is biased or not, based on features extracted directly from the received waveform $r(t)$. The following features are considered in the existing literature: energy of the received signal (energy); maximum amplitude of the received signal (max); rise time (rise time); mean excess delay (MED); root mean square (RMS) delay spread (RMS-DS); kurtosis.

NLOS Identification Building on SVDD
Several features have been extracted directly from the received waveform in order to capture the differences between LOS and NLOS signals. In [10], the kurtosis, mean excess delay, and RMS delay spread are utilized to perform NLOS identification. Four cases, namely, the residential case (case 1), the indoor office case (case 2), the outdoor office case (case 3), and the industrial case (case 4), are considered, and the log-normal probability density functions (PDFs) for these four cases of the IEEE 802.15.4a channel models are shown in Figure 1. The results derived in [8] demonstrate that the set of features consisting of the energy of the received signal, the rise time, and the kurtosis provides a good complexity/performance trade-off.
In this section, we propose a technique to identify the NLOS signals. Our technique is nonparametric and based on SVDD. We first describe the features for distinguishing LOS and NLOS situations, followed by a brief introduction to SVDD. We then describe how SVDD can be used for NLOS identification in positioning applications, without needing to determine parametric joint distributions of the features for the LOS and NLOS conditions and without needing an NLOS training dataset.
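3.1. Feature Selection for NLOS Classification. We have selected several features that have been presented in the literature. These features were selected with the numerical simulations in Section 4 in mind. With $r(t)$ denoting the received waveform, the features we consider are as follows:
(i) energy of the received signal:
$$\mathcal{E}_r = \int_{-\infty}^{+\infty} |r(t)|^2 \, dt, \tag{2}$$
(ii) maximum amplitude of the received signal:
$$r_{\max} = \max_t |r(t)|, \tag{3}$$
(iii) kurtosis:
$$\kappa = \frac{1}{\sigma_{|r|}^4 T} \int_T \left[ |r(t)| - \mu_{|r|} \right]^4 dt, \tag{4}$$
where $\mu_{|r|} = \frac{1}{T} \int_T |r(t)| \, dt$ and $\sigma_{|r|}^2 = \frac{1}{T} \int_T \left[ |r(t)| - \mu_{|r|} \right]^2 dt$; $T$ is the observation window,
(iv) mean excess delay:
$$\tau_{\mathrm{MED}} = \int_{-\infty}^{+\infty} t \, \psi(t) \, dt, \tag{5}$$
where $\psi(t) = |r(t)|^2 / \mathcal{E}_r$,
(v) RMS delay spread:
$$\tau_{\mathrm{RMS}} = \sqrt{\int_{-\infty}^{+\infty} \left( t - \tau_{\mathrm{MED}} \right)^2 \psi(t) \, dt}, \tag{6}$$
(vi) rise time:
$$t_{\mathrm{rise}} = t_H - t_L, \tag{7}$$
where
$$t_L = \min \{ t : |r(t)| \ge \alpha \sigma_n \}, \qquad t_H = \min \{ t : |r(t)| \ge \beta r_{\max} \}, \tag{8}$$
and $\sigma_n$ is the standard deviation of the thermal noise. The values of $\alpha$ and $\beta$ are chosen empirically in order to capture the rise time. In this paper, the values are the same as in [8].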
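The six features listed above can be computed from a sampled waveform roughly as in the following sketch. It assumes a uniformly sampled waveform with sampling interval `dt`, approximates the integrals by Riemann sums, and takes the square root in the RMS delay spread. The function name `extract_features` and the default values of `alpha` and `beta` are illustrative placeholders, not the settings confirmed by the paper.

```python
import numpy as np

def extract_features(r, dt, noise_std, alpha=6.0, beta=0.6):
    """Compute waveform features from samples r taken every dt seconds.

    alpha and beta are placeholder thresholds for the rise time (assumption).
    """
    a = np.abs(np.asarray(r, dtype=float))
    t = np.arange(a.size) * dt

    energy = np.sum(a**2) * dt                  # energy of the received signal
    r_max = a.max()                             # maximum amplitude

    mu = a.mean()                               # mean of |r(t)| over the window
    var = np.mean((a - mu) ** 2)                # variance of |r(t)| over the window
    kurtosis = np.mean((a - mu) ** 4) / var**2

    psi = a**2 / energy                         # normalized power delay profile
    tau_med = np.sum(t * psi) * dt              # mean excess delay
    tau_rms = np.sqrt(np.sum((t - tau_med) ** 2 * psi) * dt)  # RMS delay spread

    # First crossing times; argmax returns the first True entry.
    # If no sample reaches alpha * noise_std, t_low falls back to the first sample.
    t_low = t[np.argmax(a >= alpha * noise_std)]
    t_high = t[np.argmax(a >= beta * r_max)]

    return {
        "energy": energy,
        "r_max": r_max,
        "t_rise": t_high - t_low,
        "tau_med": tau_med,
        "tau_rms": tau_rms,
        "kurtosis": kurtosis,
    }

# Toy waveform: a rectangular pulse, used only to exercise the function.
pulse = np.zeros(100)
pulse[40:60] = 1.0
print(extract_features(pulse, dt=1.0, noise_std=0.01))
```

Returning a dictionary keeps the feature names explicit, which makes it straightforward to form the feature subsets used later in the evaluation.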

3.2. SVDD.
SVDD, which only depends on a few target objects, is one of the best-known support vector learning methods for one-class classification problems.
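For a dataset containing $N$ LOS training samples $\{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_N\}$, SVDD attempts to find a spherically shaped boundary, with center $\boldsymbol{\mu}$ and radius $R$, which contains as many LOS observations as possible while keeping its volume as small as possible [16]. To allow for outliers (such as NLOS samples) in the training data, a slack variable $\xi_i$ is introduced for each sample to penalize the amount by which the distance between $\mathbf{x}_i$ and $\boldsymbol{\mu}$ exceeds the radius. The primal formulation of the optimization problem is then given as follows:
$$\min_{R, \boldsymbol{\mu}, \boldsymbol{\xi}} \; R^2 + C \sum_{i=1}^{N} \xi_i \quad \text{s.t.} \quad \|\mathbf{x}_i - \boldsymbol{\mu}\|^2 \le R^2 + \xi_i, \quad \xi_i \ge 0, \quad i = 1, \ldots, N, \tag{9}$$
where the regularization parameter $C$ gives the trade-off between the number of errors made on the training set and the size of the sphere description. By introducing Lagrange multipliers for each inequality constraint of (9), the Lagrangian function is constructed as
$$L(R, \boldsymbol{\mu}, \boldsymbol{\xi}; \boldsymbol{\alpha}, \boldsymbol{\gamma}) = R^2 + C \sum_{i=1}^{N} \xi_i - \sum_{i=1}^{N} \alpha_i \left( R^2 + \xi_i - \|\mathbf{x}_i - \boldsymbol{\mu}\|^2 \right) - \sum_{i=1}^{N} \gamma_i \xi_i, \tag{10}$$
where the Lagrange multipliers satisfy $\alpha_i \ge 0$ and $\gamma_i \ge 0$. Setting the partial derivatives of $L$ with respect to $R$, $\boldsymbol{\mu}$, and $\xi_i$ to zero, the following equations are obtained:
$$\sum_{i=1}^{N} \alpha_i = 1, \qquad \boldsymbol{\mu} = \sum_{i=1}^{N} \alpha_i \mathbf{x}_i, \qquad C - \alpha_i - \gamma_i = 0. \tag{11}$$
The corresponding dual formulation is given by
$$\max_{\boldsymbol{\alpha}} \; \sum_{i=1}^{N} \alpha_i (\mathbf{x}_i \cdot \mathbf{x}_i) - \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j (\mathbf{x}_i \cdot \mathbf{x}_j) \quad \text{s.t.} \quad \sum_{i=1}^{N} \alpha_i = 1, \quad 0 \le \alpha_i \le C. \tag{12}$$
According to the Karush-Kuhn-Tucker conditions, the following equations should hold at the optimal solution:
$$\alpha_i \left( R^2 + \xi_i - \|\mathbf{x}_i - \boldsymbol{\mu}\|^2 \right) = 0, \qquad \gamma_i \xi_i = 0. \tag{13}$$
From (13), it follows that SVDD needs only the samples $\mathbf{x}_i$ with $\alpha_i > 0$, namely, the support vectors, which lie on the boundary of or outside the hypersphere. Once the SVDD is constructed on the training data, we need to decide whether a given test sample $\mathbf{z}$ is LOS or NLOS. The criterion for the decision can be stated as
$$f(\mathbf{z}) = I\left( \|\mathbf{z} - \boldsymbol{\mu}\|^2 \le R^2 \right) = I\left( (\mathbf{z} \cdot \mathbf{z}) - 2 \sum_{i=1}^{N} \alpha_i (\mathbf{z} \cdot \mathbf{x}_i) + \sum_{i=1}^{N} \sum_{j=1}^{N} \alpha_i \alpha_j (\mathbf{x}_i \cdot \mathbf{x}_j) \le R^2 \right), \tag{14}$$
where $I(\text{condition})$ equals one if the condition is true and zero otherwise, so that $f(\mathbf{z}) = 1$ corresponds to a LOS decision. In order to express more complex decision regions, the kernel trick can be used in SVDD by replacing the inner products of samples $(\mathbf{x}_i \cdot \mathbf{x}_j)$ by a kernel function $K(\mathbf{x}_i, \mathbf{x}_j)$. In this paper, the Gaussian kernel function is utilized, which is expressed as follows:
$$K(\mathbf{x}_i, \mathbf{x}_j) = \exp\left( -\frac{\|\mathbf{x}_i - \mathbf{x}_j\|^2}{\sigma^2} \right), \tag{15}$$
where $\sigma$ is the kernel width.
In the NLOS identification problem, the receiver receives several signals from the anchors at a given time instant, with both LOS and NLOS signals included. Note that the objective of SVDD is to find the support of the LOS objects, and anything outside the support is viewed as NLOS. To identify the NLOS signals, we only need to obtain training data for the LOS signals; after the training procedure, a decision can be made as to whether a received signal is LOS or not.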
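The training and decision steps described in this subsection can be sketched in a few lines of NumPy. The sketch below covers only the special case in which the regularization parameter is at least one, so the upper bound on the multipliers never binds and the dual reduces to a maximization over the probability simplex. It solves that problem with a Frank-Wolfe iteration, which is a solver swapped in here for brevity and not necessarily the one used by the authors. The names `gaussian_kernel`, `fit_svdd`, and `is_los`, the kernel width convention, and the iteration count are all assumptions.

```python
import numpy as np

def gaussian_kernel(A, B, sigma):
    """Gaussian kernel matrix exp(-||a - b||^2 / sigma^2) between rows of A and B."""
    d2 = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(d2, 0.0) / sigma**2)

def fit_svdd(X, sigma=1.0, n_iter=2000):
    """Hard-margin SVDD (assumes C >= 1) trained on LOS samples only."""
    X = np.asarray(X, dtype=float)
    K = gaussian_kernel(X, X, sigma)
    k_diag = np.diag(K).copy()
    alpha = np.full(X.shape[0], 1.0 / X.shape[0])
    for k in range(n_iter):
        grad = k_diag - 2.0 * K @ alpha   # gradient of the dual objective
        i = np.argmax(grad)               # sample currently farthest from the center
        step = 2.0 / (k + 2.0)            # standard Frank-Wolfe step size
        alpha *= 1.0 - step
        alpha[i] += step                  # multipliers stay non-negative and sum to one
    const = alpha @ K @ alpha
    dist2 = k_diag - 2.0 * K @ alpha + const
    # Radius chosen so that every training sample lies inside the sphere.
    return {"X": X, "alpha": alpha, "sigma": sigma, "const": const, "R2": dist2.max()}

def is_los(model, Z):
    """Return True for samples whose kernel distance to the center is within the radius."""
    Z = np.atleast_2d(np.asarray(Z, dtype=float))
    Kz = gaussian_kernel(Z, model["X"], model["sigma"])
    dist2 = 1.0 - 2.0 * Kz @ model["alpha"] + model["const"]  # K(z, z) = 1
    return dist2 <= model["R2"] + 1e-9    # small tolerance for rounding

# Synthetic stand-in for LOS feature vectors (not the measured dataset).
rng = np.random.default_rng(0)
los_train = rng.normal(0.0, 0.5, size=(50, 2))
model = fit_svdd(los_train, sigma=2.0)
print(is_los(model, [[0.0, 0.0], [10.0, 10.0]]))
```

A soft-margin variant with a smaller regularization parameter would need a solver that also enforces the upper bound on each multiplier, for example a general quadratic programming routine.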

Numerical Simulation
In this section, to investigate how the proposed method works, we perform numerical simulations based on a dataset of UWB waveforms. For the purpose of comparison, we also evaluate the detection performance of the least-squares SVM (LSSVM) utilized in [8]. In addition, an example of how the classification performance is affected by the number of training samples is given. We use 10-fold cross-validation to assess the performance of the LSSVM and SVDD.
In 10-fold cross-validation for LSSVM, both LOS and NLOS signals in the dataset are utilized, while, for SVDD, only the LOS signals in the dataset are utilized and randomly partitioned into 10 parts of equal size.The SVDD is trained on 9 parts and the performance is evaluated on the remaining LOS part and a randomly selected NLOS set with an equal size.
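The SVDD evaluation protocol described above can be sketched as an index generator. The function name `svdd_cv_splits` and the use of a fixed seed are assumptions for illustration; the sketch only produces indices and leaves training and scoring to the caller.

```python
import numpy as np

def svdd_cv_splits(n_los, n_nlos, n_folds=10, seed=0):
    """Yield (train_los, test_los, test_nlos) index arrays for each fold.

    Only LOS samples are partitioned; each fold is tested on the held-out LOS
    part plus a randomly drawn NLOS set of the same size.
    """
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_los), n_folds)
    for k in range(n_folds):
        test_los = folds[k]
        train_los = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        test_nlos = rng.choice(n_nlos, size=test_los.size, replace=False)
        yield train_los, test_los, test_nlos

for train_los, test_los, test_nlos in svdd_cv_splits(510, 510):
    print(train_los.size, test_los.size, test_nlos.size)
```

The NLOS indices are never used for training, which mirrors the one-class setting: they appear only on the test side of each fold.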
Data Overview. We compare the detectors using a dataset derived from a measurement campaign, which was performed by the Wireless Communication and Network Sciences Laboratory of MIT during fall 2007. The dataset provides 1024 measurements, including 512 LOS and 512 NLOS signals. For more information about the channel model and the UWB measurement campaign, we refer the reader to [8,9].
Preprocessing. There are two preprocessing steps involved. Firstly, for reasons of numerical stability, features in the dataset are converted to the log-domain before training and evaluation. Then, for the sake of averaging the results of 10-fold cross-validation, a step was taken to reduce the total
In Figure 2, the ROC curves are provided to evaluate the ability of the SVDD to detect NLOS signals compared to the existing LSSVM. Several subsets of the features are considered, namely, set A ($r_{\max}$), set B ($r_{\max}$, $t_{\mathrm{rise}}$), set C ($\mathcal{E}_r$, $t_{\mathrm{rise}}$, $\kappa$), set D ($\mathcal{E}_r$, $r_{\max}$, $t_{\mathrm{rise}}$, $\kappa$), set E ($\mathcal{E}_r$, $r_{\max}$, $t_{\mathrm{rise}}$, $\tau_{\mathrm{MED}}$, $\kappa$), and set F ($\mathcal{E}_r$, $r_{\max}$, $t_{\mathrm{rise}}$, $\tau_{\mathrm{MED}}$, $\tau_{\mathrm{RMS}}$, $\kappa$). The corresponding AUC values are given in Table 1.
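The log-domain conversion and the selection of a feature subset can be sketched as follows. The column order in `FEATURES`, the feature names, and the helper `select_log_features` are assumptions for illustration, and the subsets follow the list of sets A to F as read from the text. The sketch also assumes that every feature value is strictly positive, since the logarithm is undefined otherwise (a rise time of exactly zero would need special handling).

```python
import numpy as np

# Assumed column order of the feature table (hypothetical naming).
FEATURES = ["energy", "r_max", "t_rise", "tau_med", "tau_rms", "kurtosis"]

FEATURE_SETS = {
    "A": ["r_max"],
    "B": ["r_max", "t_rise"],
    "C": ["energy", "t_rise", "kurtosis"],
    "D": ["energy", "r_max", "t_rise", "kurtosis"],
    "E": ["energy", "r_max", "t_rise", "tau_med", "kurtosis"],
    "F": ["energy", "r_max", "t_rise", "tau_med", "tau_rms", "kurtosis"],
}

def select_log_features(table, set_name):
    """Pick the columns of one feature set and convert them to the log-domain."""
    cols = [FEATURES.index(name) for name in FEATURE_SETS[set_name]]
    return np.log(np.asarray(table, dtype=float)[:, cols])

# Two made-up feature rows, used only to show the shapes involved.
table = np.array([[1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                  [2.0, 3.0, 4.0, 5.0, 6.0, 7.0]])
print(select_log_features(table, "D").shape)
```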
From Table 1, it can be seen that, for both the LSSVM and SVDD algorithms, set D provides the best performance among the feature subsets. As shown by the ROC curves in Figure 3, SVDD outperforms LSSVM in the region where the NLOS signals accepted rate is smaller than 0.04, while LSSVM outperforms SVDD elsewhere. In terms of AUC, the performance of SVDD is comparable to that of the LSSVM, although the latter is somewhat better.
In Figure 4, the performance of SVDD as a function of the number of available LOS samples is shown. It can be observed that the case utilizing 510 LOS samples yields the best performance, and the detection performance of the SVDD detectors generally increases with the number of available samples.

Conclusion
In this paper, we presented a novel NLOS identification approach building on SVDD. The technique is based on the fact that there exists some dissimilarity between LOS and NLOS signals, and several features extracted from the received waveforms can be utilized to indicate this dissimilarity. In contrast to the existing parametric approaches, the proposed technique does not rely on any statistical models and relies solely on received UWB waveforms. Furthermore, in contrast to the existing LSSVM-based approaches, our technique is built on a single class, which is valuable in cases where only one class of samples can be acquired. Our results were validated on a dataset derived from an extensive indoor measurement campaign. As a next step, since the run-time complexity of SVDD is linear in the number of support vectors, we will develop a fast SVDD to reduce the run-time complexity.
Consider a positioning system utilizing UWB techniques that includes two types of nodes: beacons are nodes with known positions, while devices are nodes with unknown positions. Let $\mathcal{M}$ be the set of devices and $\mathcal{N}$ the set of beacons; denote by $|\mathcal{M}|$ the cardinality of $\mathcal{M}$ and by $|\mathcal{N}|$ the cardinality of $\mathcal{N}$. The positional states of beacon $j \in \mathcal{N}$ and of device $i \in \mathcal{M}$ are indicated, respectively, by the coordinate vectors $\mathbf{x}_j$, $j = 1, \ldots, |\mathcal{N}|$, and $\mathbf{x}_i$, $i = 1, \ldots, |\mathcal{M}|$. A set of unknowns $\mathbf{U} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_{|\mathcal{M}|}]$ is to be determined.

Figure 1: Log-normal PDFs of the kurtosis, mean excess delay, and RMS delay spread for various scenarios of the IEEE 802.15.4a channel models.

Figure 2: The ROC curves for the LSSVM and SVDD. For LSSVM, both LOS and NLOS signals are needed for training, and the number of each of them is 510. For SVDD, only LOS signals are needed for training, and the number of available LOS signals is 510.

Figure 3: The ROC curves for the best cases in the LSSVM and SVDD algorithms.

Figure 4: Illustration of how the performance of the SVDD detectors varies with the number of available LOS samples. Set D is utilized.

Table 1: The AUC values for the LSSVM and SVDD.

Performance Evaluation. The detectors are evaluated by comparing their receiver operating characteristic (ROC) curves and the corresponding area under the ROC curve (AUC) values. The ROC curve shows how the true positive fraction (in this paper, the LOS signals accepted rate, i.e., the rate of deciding LOS when the signal was LOS) varies with the false positive fraction (in this paper, the NLOS signals accepted rate, i.e., the rate of deciding LOS when the signal was NLOS). The AUC summarizes a ROC curve in a single number, and higher values indicate a better separation between LOS and NLOS signals.
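The ROC curve and AUC described above can be computed from per-sample scores as in the following sketch. It assumes that each detector outputs a score for which smaller values are more LOS-like (for SVDD, the distance to the center of the sphere) and that a signal is accepted as LOS when its score does not exceed a threshold. The function names `roc_points` and `auc_value` are hypothetical.

```python
import numpy as np

def roc_points(scores_los, scores_nlos):
    """Return (fpr, tpr) arrays obtained by sweeping the acceptance threshold.

    tpr: LOS signals accepted rate; fpr: NLOS signals accepted rate.
    """
    scores_los = np.asarray(scores_los, dtype=float)
    scores_nlos = np.asarray(scores_nlos, dtype=float)
    # Start below every score so that the curve begins at (0, 0).
    thresholds = np.concatenate(([-np.inf], np.unique(np.concatenate((scores_los, scores_nlos)))))
    tpr = np.array([(scores_los <= th).mean() for th in thresholds])
    fpr = np.array([(scores_nlos <= th).mean() for th in thresholds])
    return fpr, tpr

def auc_value(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Made-up scores in which every LOS score is below every NLOS score.
fpr, tpr = roc_points([0.1, 0.2, 0.3], [0.7, 0.8, 0.9])
print(auc_value(fpr, tpr))
```

When the two groups of scores overlap, the same functions trace the trade-off between the LOS accepted rate and the NLOS accepted rate that the ROC figures report.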