An Analysis of Multiple Criteria and Setups for Bluetooth Smartphone-Based Indoor Localization Mechanism

Bluetooth Low Energy (BLE) 4.0 beacons will play a major role in the deployment of energy-efficient indoor localization mechanisms. Since BLE4.0 is highly sensitive to fast fading impairments, numerous ongoing studies are exploring the use of supervised learning algorithms as an alternative approach to exploit the information provided by indoor radio maps. Despite the large number of results reported in the literature, there are still many open issues regarding the performance evaluation of this approach. In this paper, we start by identifying, in a simple setup, the main system parameters to be taken into account in the design of BLE4.0 beacon-based indoor localization mechanisms. In order to shed some light on the evaluation process using supervised learning algorithms, we carry out an in-depth experimental evaluation in terms of the mean localization error, local prediction accuracy, and global prediction accuracy. Based on our results, we argue that, in order to fully assess the capabilities of supervised learning algorithms, it is necessary to include all three metrics.


Introduction
A large number of proposals aiming to develop accurate indoor localization mechanisms have been reported in the literature. Most recent studies use the Received Signal Strength Indication (RSSI) of various reference wireless transmitters as a means of estimating the position of a smartphone device. Among the technologies being considered, Wi-Fi networks have attracted the attention of many researchers and practitioners over recent years. Many experimental studies have been conducted to construct radio maps and models enabling the estimation of the distance between a reference transmitter and a smartphone device. Due to the characteristics of the wireless signal, the use of Kalman filters [1,2], among others, has been required to remove the noise. Novel Bluetooth Low Energy (BLE) devices have become a strong alternative to Wi-Fi-based indoor location mechanisms. The lower cost, low energy consumption, and small size of Bluetooth devices are among the most important design features for systems involving battery-operated devices, mainly smartphones and tablets.
Several studies have been conducted aiming to develop RSSI-based localization systems [3,4] or systems based simply on computer vision using Kalman [5] or particle filters [6]. Early studies limited the use of Bluetooth localization mechanisms to determining the locations of stationary smartphone devices at room level [7]. Moreover, recent studies have shown that BLE4.0 signals are very susceptible to fast fading impairments, making it difficult to apply the RSSI/distance models commonly used in the development of Wi-Fi-based localization mechanisms [8,9]. In [10], the authors explore various methods used in smartphone-based indoor localization with different techniques and technologies, analyzing in depth map, trilateration, and fingerprint techniques.
Other recent studies reported in the literature have explored alternative methods. In [11], Pei et al. proposed a hybrid method combining fingerprinting with trilateration. In [12], the same authors explored the impact of the presence of people on the wireless signal used in the development of the localization mechanism. In [13], Guo et al. analyzed the RSSI in different indoor environments, improving the accuracy and mean positioning error for smartphones with BLE4.0 beacons. The authors of [14] evaluate the mean localization error under various scenarios.
In [15], the authors distribute the BLE4.0 beacons efficiently and make use of the information provided by additional sensors attached to the smartphone devices. Finally, Pagano et al. proposed a system based on the ranging time of arrival between anchor nodes and a BLE4.0 beacon node [16].
In this work, in order to properly justify our proposal, we first study the signal propagation of the BLE4.0 beacon. From this first analysis, we justify the use of supervised learning algorithms as a feasible methodology to characterize the BLE4.0 beacon signal propagation, to be used as a basis for developing indoor localization mechanisms. We then analyze the main configurable parameters of the BLE4.0 beacons and the algorithms. The results obtained in a real-world scenario validate the proposal.
Figure 1 shows the overall schema proposed in this work. The rest of the paper is organized as follows. Section 2 reviews the related work and describes the main contribution of our work. Section 3 analyzes the BLE4.0 signal propagation and justifies the use of classification algorithms in developing BLE4.0 beacon-based indoor localization mechanisms. Section 4 presents an in-depth RSSI attenuation study analyzing the impact of the physical materials in our laboratory and the noise introduced by other peripheral devices. Subsequently, Section 5 describes the experimental tools, including the challenges to be faced when developing a BLE4.0 fingerprint-based localization mechanism. We also include a brief description of the two classification algorithms used in our proposal, the experimental setups, and the survey campaign characteristics. Based on these preliminary results, Sections 6.1 and 6.2 present the results obtained in two different scenarios using BLE4.0 beacons as transmitters and a smartphone as receiver. Moreover, we analyze the performance of the two algorithms in terms of three main metrics: (i) global accuracy; (ii) mean positioning error; and (iii) local accuracy. Section 7 summarizes the results obtained in the two experimental areas and highlights the main findings and lessons learned in two main areas: (i) the system configuration of a BLE4.0 beacon-based indoor localization setup and (ii) the performance evaluation of supervised learning algorithms in the development of indoor localization mechanisms. Finally, Section 8 concludes the paper and outlines our future work plan.

Related Work
Wireless indoor localization is currently a very active research topic. Depending on the wireless sensor network technology, one technique or algorithm may be more suitable or feasible than others. In this section, we first overview the major trends and results. We then overview the various approaches being explored when using BLE4.0 beacons.

Standard Wireless Positioning.
Nowadays, the three main technologies being explored to develop indoor localization mechanisms are ZigBee [17,18], Wi-Fi, and Bluetooth. The main techniques for indoor localization are based on trilateration, on indoor channel propagation models [9], or on classification algorithms [19,20]. The main metrics used in indoor localization are global accuracy and mean positioning error [19,21].
In [22], the authors compare the performance of Wi-Fi and BLE4.0 beacons and conclude that BLE4.0 beacons outperform Wi-Fi by 27% in terms of mean positioning error. In [19], the authors show that the positioning error of Wi-Fi is around 5-10 m and 1-2 m using BLE4.0, that is, an overall 50% improvement. Nevertheless, in [23] good results are presented for Wi-Fi fingerprinting, with an average error of 2.5 meters. In [6], the authors make use of RANSAC with the aim of improving the quality of the information used to feed a particle filter. Their main goal has been to enhance the mean positioning error and accuracy of Wi-Fi-based indoor localization mechanisms. In [24], the authors have shown that the use of particle filters or Kalman filter algorithms may not always be a good choice for BLE4.0 beacons, where the use of different classification algorithms seems to provide better results. From the above, it is clear that the development of accurate and robust wireless indoor localization mechanisms still has a long way to go. Besides the technological development, mainly of radio systems and antennas, the use of filtering and/or classification algorithms is still one of the main research topics [25].

BLE4.0-Based Localization Mechanisms.
Nowadays, it is widely recognized that multipath fading is one of the main challenges faced in the development of robust and accurate BLE4.0-based indoor localization mechanisms [12].
In order to overcome this challenge, the research community is actively working on defining the best system configuration, for example, the density of BLE4.0 beacons and their relative placement [26], and on identifying the most suitable data processing methodologies, that is, filtering and classification algorithms. Some works have explored the use of regression models and separate channel fingerprints supplemented by Extended Kalman filters (EKF) [27] or particle filters [28]. In the latter work, a mean positioning error of less than 4 m has been reported. Both works have shown that the physical area plays a major role in the results, not only its materials but also its dimensions.
In order to reduce the mean positioning error, different classification algorithms have been studied as new localization techniques based on fingerprinting. However, one of the main challenges is to properly tune the various parameters of the classification algorithms, since they play a major role in the achievable accuracy and mean positioning error [29,30]. In [29], the authors have compared three different classification algorithms: Neural Networks, SVM, and k-NN.
Their results have shown that k-NN reports the best mean positioning error, approximately 4 m. In [30], better results are presented by using a combination of BLE4.0 beacon and Wi-Fi technologies and the same classification algorithms. A more in-depth analysis of the parameters of different classification algorithms is presented in [31], where the best results have been obtained using a weighted distance (WD) for k-NN. In [32], Peng et al. have obtained similar results using k-NN, testing different values for "k." In summary, all the abovementioned works present localization results studying different parameters, ranging from the dimensions of the area under study to the hyperparameters of the classification algorithms. In this paper, we discuss in depth the impact of two sets of parameters: (i) system configuration: the deployment and setting of the BLE4.0 beacons, namely, density and transmission power, and (ii) algorithms: the parameters governing the different classification algorithms. In this context, Figure 2 shows the overall system and algorithmic parameters studied in this paper.

BLE4.0 Signal Characterization
Recent studies have shown that BLE4.0 beacon signals are highly sensitive to interference and fast fading. Like Wi-Fi, BLE4.0 operates in the 2.4 GHz band, which is divided into 40 channels, each 2 MHz wide. In order to avoid interference between BLE4.0 and Wi-Fi devices, BLE4.0 mainly uses channels 37 (2402 MHz), 38 (2426 MHz), and 39 (2480 MHz) [21]. BLE4.0 devices transmit on these channels cyclically, and they only make use of the other channels when paired with other devices. The multipath effect, in turn, requires the development of tools and methodologies enabling the setup planning of robust and accurate BLE4.0-based indoor localization systems. In this section, we experimentally study the channel propagation of the BLE4.0 signal. Our main goal is to set up a baseline experimental prototype allowing us to identify the key system parameters. We then motivate the use of supervised learning algorithms as a viable methodology to enhance the accuracy and robustness of BLE4.0-based indoor localization mechanisms.
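The channel layout mentioned above can be made concrete with a short sketch. The three advertising-channel frequencies are the ones quoted in the text; the mapping of the remaining 37 data channels follows the usual Bluetooth LE channel plan and is not taken from this paper. The function name `ble_channel_frequency_mhz` is hypothetical.

```python
def ble_channel_frequency_mhz(channel: int) -> int:
    """Return the centre frequency (MHz) of a BLE channel index in 0..39."""
    # Advertising channels, as quoted in the text.
    advertising = {37: 2402, 38: 2426, 39: 2480}
    if channel in advertising:
        return advertising[channel]
    # Data channels 0..10 sit between advertising channels 37 and 38.
    if 0 <= channel <= 10:
        return 2404 + 2 * channel
    # Data channels 11..36 sit between advertising channels 38 and 39.
    if 11 <= channel <= 36:
        return 2428 + 2 * (channel - 11)
    raise ValueError(f"invalid BLE channel index: {channel}")


if __name__ == "__main__":
    for ch in (37, 38, 39):
        print(ch, ble_channel_frequency_mhz(ch))
```

Under this plan the 40 channels tile the band from 2402 MHz to 2480 MHz in 2 MHz steps, with the advertising channels placed at the two edges and in one gap of the band.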

Radiopropagation Model.
A large number of recent efforts on developing RSSI-based localization mechanisms have made use of the RSSI-distance mapping function [9] given by the following equation:

$$\mathrm{RSSI} = -10\, n \log_{10}\!\left(\frac{d}{d_0}\right) + A, \tag{1}$$

where $A$ denotes the RSSI in dB at distance $d_0$ (typically 1 meter), $n$ is the path loss coefficient factor, and $d$ is the distance in meters between two wireless devices: a transmitter and a receiver. All the parameters in (1) can be experimentally measured except $n$, which needs to be estimated. Classical approaches determine the value of $n$ that minimizes the estimation error over a set of ground truth preliminary measurements [24]. This is usually done using the lowest squared error as the metric.
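The least-squares estimation of the path loss coefficient can be sketched as follows, assuming the usual log-distance form RSSI = A − 10·n·log10(d/d0) with A measured at d0. The function names and the synthetic data are illustrative assumptions and do not reproduce the measurements of this paper.

```python
import math


def predict_rssi(d_m: float, a_db: float, n: float, d0_m: float = 1.0) -> float:
    """Log-distance model: RSSI = A - 10 * n * log10(d / d0)."""
    return a_db - 10.0 * n * math.log10(d_m / d0_m)


def fit_path_loss_exponent(distances_m, rssi_db, a_db, d0_m=1.0):
    """Closed-form least-squares estimate of n with A held fixed.

    Minimises sum((rssi_i - A - n * x_i)^2) with x_i = -10 * log10(d_i / d0),
    which gives n = sum(x_i * y_i) / sum(x_i^2), where y_i = rssi_i - A.
    """
    xs = [-10.0 * math.log10(d / d0_m) for d in distances_m]
    ys = [r - a_db for r in rssi_db]
    denom = sum(x * x for x in xs)
    if denom == 0.0:
        raise ValueError("need at least one distance different from d0")
    return sum(x * y for x, y in zip(xs, ys)) / denom


if __name__ == "__main__":
    # Synthetic, noise-free samples generated with hypothetical A and n.
    distances = [float(d) for d in range(1, 9)]
    samples = [predict_rssi(d, a_db=-60.0, n=2.0) for d in distances]
    print(fit_path_loss_exponent(distances, samples, a_db=-60.0))
```

With real traces the residual of this fit is the squared error the text refers to; a large residual indicates that a single exponent does not describe the environment well.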

BLE4.0 Beacon RSSI Distance Modelling.
In order to verify the suitability of applying this approach using BLE4.0 beacons, we conducted a preliminary experimental test. We initially placed the smartphone device at 1 m from the BLE4.0 beacon and progressively moved it away from the BLE4.0 beacon in steps of 1 m up to a maximum distance of 15 m. At each location, we sampled the RSSI level for a period of one minute. We made use of an Android 5.1 smartphone, from now on referred to as the receiver. We conducted three sets of independent measurements by varying the transmission power (Tx) level of the BLE4.0 beacon, namely 0x01, 0x02, 0x03, 0x04, 0x05, 0x06, and 0x07, which, respectively, correspond to 4 dBm, 0 dBm, −4 dBm, −8 dBm, −12 dBm, −16 dBm, and −20 dBm [33]. All measurements were conducted under Line-of-Sight (LoS) conditions.
Figures 3(a), 3(b), and 3(c) show the results for the Tx = 0x01, Tx = 0x04, and Tx = 0x07 transmission power levels, respectively. As seen in the figures, the RSSI decreases as a function of the distance between the BLE4.0 beacon and the target receiver. We also include in the figures the results of fitting the samples to the model given by (1). Table 1 provides the mean squared error and standard deviation for all Tx levels tested. The results show the infeasibility of looking for a direct relation using the aforementioned RSSI-distance model in combination with a triangulation technique: a given RSSI value may correspond to different distance estimates [6]. Nevertheless, the results obtained establish the basis for exploring alternative solutions, as well as guidelines for setting up the BLE4.0 beacons. In fact, a closer look at the results depicted in Figure 3(c) shows that the RSSI-distance model may hold up to a distance of approximately 7 to 8 meters. In all cases, we notice that the RSSI drops substantially between one and three meters, but it then exhibits smaller variations in the range between three and six meters. When the transmission power is set to Tx = 0x04 and Tx = 0x07, we notice a larger decrease in the RSSI in the 6 to 8 m interval than when Tx = 0x01. Beyond a distance of 8 m, the RSSI levels show a severe discrepancy with the RSSI/distance model.
Table 2 provides the mean squared error and standard deviation for all transmission power levels for a maximum distance of 8 m. According to these results, it is clear that the use of the lowest transmission power, Tx = 0x07, offers the best solution. This is an important result, since one of our main aims is to limit the power consumption as a means to extend the lifetime of the BLE4.0 beacons. The results also show that limiting the size of the experimental area will play an important role in the localization process. For instance, in the case of Tx = 0x07, a mean RSSI value of −96 dBm may correspond to a distance of 8 m when limiting the maximum distance to 8 m, while −95 dBm, a higher RSSI, may correspond to 11 m or 12 m when we consider a maximum distance of 15 m. These results provide the basis for determining the initial deployment and power setting of the BLE4.0 beacons. In fact, this analysis provides us with the basis to configure the setup used to obtain the data required to guide the supervised learning algorithms.
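To see why small RSSI fluctuations translate into large distance errors at long range, the log-distance model RSSI = A − 10·n·log10(d/d0) can be inverted for d. The sketch below uses hypothetical values of A and n, not the ones fitted in this paper, and the function name `distance_from_rssi` is an assumption.

```python
import math


def distance_from_rssi(rssi_db: float, a_db: float, n: float,
                       d0_m: float = 1.0) -> float:
    """Invert RSSI = A - 10 * n * log10(d / d0) to obtain d in metres."""
    return d0_m * 10 ** ((a_db - rssi_db) / (10.0 * n))


if __name__ == "__main__":
    # Hypothetical parameters chosen only for illustration.
    A_DB, N_EXP = -75.0, 2.0
    for rssi in (-81.0, -93.0, -94.0, -95.0):
        print(rssi, round(distance_from_rssi(rssi, A_DB, N_EXP), 2))
```

Under this model every 1 dB of error multiplies the estimated distance by 10^(1/(10n)), about 12% for n = 2, so the absolute error grows with the distance itself. This is consistent with the observation that the model becomes unreliable beyond a few meters.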

Bluetooth Signal Attenuation
In this section, we analyze in depth the RSSI fingerprint throughout the experimental area and its behaviour at different times of the day. Our main aim is to gain insight into the factors that may drastically affect the localization process [23,30]. For this experiment we chose a medium transmission power level, that is, Tx = 0x04. We defined an experimental area divided into 15 zones of 1 m² each, separated by a guard zone of 0.5 m² to better differentiate the RSSI of adjacent sectors (see Figure 4 for details). The experimental setup covers a total area of 9.6 m by 6.3 m, where the minimum distance between a BLE4.0 beacon and the receiver is 1.5 m.
Figure 5 shows four different views of the physical area represented in Figure 4 (our laboratory), taken from four different BLE4.0 beacon positions. As seen in the pictures, BLE4.0 beacons "Be10" and "Be11" have been placed by the window side, while BLE4.0 beacons "Be07," "Be08," and "Be09" have been placed by the drywall side.

RSSI Fingerprint.
During our first survey, we monitored the RSSI of each BLE4.0 beacon at each one of the fifteen sectors of our experimental area.
Figures 6(a)-6(e) depict the average RSSI values of each of the five BLE4.0 beacons over the experimental area. Note that the RSSI reported by the BLE4.0 beacons placed close to the windows, namely, BLE4.0 beacons "Be10" and "Be11," is characterized by a lower signal strength; see Figures 6(d) and 6(e). In the case of BLE4.0 beacon "Be11," the signal vanishes quickly starting at neighbouring sectors.
Hence, from these figures we can draw the following conclusions:
(i) We can easily identify the location of each BLE4.0 beacon from the RSSI fingerprint.
(ii) The RSSI level of the beacons placed by the drywall side is higher than the one reported by the beacons located by the window side.
These results show the need to evaluate the attenuation of the BLE4.0 beacons under different conditions.

Intraday RSSI Surveys.
In order to illustrate the challenges faced in developing RSSI-based indoor localization mechanisms, we carried out three survey campaigns. As in our previous survey campaign, we monitored the RSSI levels of the various BLE4.0 beacons throughout the experimental area. The campaigns were carried out at three different times throughout a day: morning, midday, and afternoon. We will refer to the sample traces as Take 1, Take 2, and Take 3, respectively. In the following, we discuss our main findings from the analysis of the data obtained for BLE4.0 beacon "Be09." Our choice is based on the fact that BLE4.0 beacon "Be09" was placed at the midpoint of the drywall side. This should allow us to compare the RSSI levels at the two opposite sides of the experimental area over the same distance.
Figure 7 shows the RSSI traces for Sectors 1, 3, 9, and 15. Recall that BLE4.0 beacon "Be09" is located at the right side of Sector 7. We have also included the mean RSSI corresponding to each one of the traces. Table 3 summarizes the main statistics of all three traces. From the analysis of the traces, we can make the following observations:
(i) The RSSI varies substantially over time. The values reported for Sector 13, located close to the corridor, exhibit the largest differences between the highest and lowest RSSI value; see Table 3. This clearly shows the need to take the floor plan into account when developing an indoor localization mechanism.
(ii) The mean RSSI levels of sectors located at the same distance from "Be09" vary substantially from one to another. For instance, the RSSI levels of Take 3 for Sectors 3 and 15, both located at the same distance from BLE4.0 beacon "Be09," exhibit a difference as high as 8 dB. The gap between the means of the three traces for these two sectors is consistently high.
(iii) The RSSI varies substantially from one survey campaign to another in the sectors located by the windows. We notice that the RSSI level of Sector 1, the one located at the corner of two drywalls, exhibits a more constant value. From these results, we confirm that having the floor plan of the experimental area is a must to be able to properly analyze the results.
According to the previous analysis and the experimentation carried out in Section 3, we can highlight the following characteristics of the RSSI when using BLE4.0 beacons:
(i) Some of the RSSI levels for the sectors close to the window side are more than 10% lower than the ones reported for the sectors located close to the drywall. This behaviour affects the classification process, since the RSSI varies substantially.
(ii) The RSSI from BLE4.0 beacons is very sensitive to, and dependent on, the structural characteristics of the surrounding walls.
(iii) The RSSI levels reported for sectors located at the same distance from the reference BLE4.0 beacon, "Be09" in our case, may vary substantially from one sector to another.
Our results show the need to explore alternative data processing mechanisms towards the development of an RSSI-based localization solution. In order to be able to focus on the characterization of the signal in an indoor environment taking into account only the floor plan, we restricted access to the lab premises during our experiments.

Experimental Apparatus and Algorithms
In this section, we introduce the specifications and technical details of our experimental setting. Firstly, we describe the experimental tools developed in our research. Next, we explain the two classification algorithms used in our experiments, followed by the experimental setups and the classification metrics analyzed. Finally, we describe the physical layout of the testbed and the survey campaign we have used to carry out all the indoor localization experiments.

Experimental Tools.
From the analysis in the previous sections, we argue that the following holds:
(i) Supervised learning algorithms are worth exploring. We therefore suggest evaluating the use of Support Vector Machine (SVM) and k-Nearest Neighbour (k-NN) algorithms.
(ii) The actual distance between the BLE4.0 beacon transmitter and the target plays a central role in the estimated RSSI. We should therefore consider multiple experimental setups by varying the number of reference BLE4.0 beacons and their distance to the target.
(iii) Line-of-Sight seems to be an essential requirement in order to get a first insight into the Bluetooth capabilities towards the development of indoor location fingerprints.
(iv) Data preprocessing. Given that our interest is to evaluate the system configuration, that is, beacon density and power, we have decided not to apply any filtering or outlier detection technique. We therefore use the raw data collected during our surveys.
(v) The transmission power (Tx) level should also be carefully considered to ensure a long lifetime of the BLE4.0 beacons. We use Tx = 0x07 throughout our first set of experiments and Tx = 0x04 and Tx = 0x07 during the second set of experiments. These two transmission power levels offer the best characteristics for our study: lower power consumption and an almost monotonic decrease of the mean RSSI level as a function of the distance in the 0-8 m interval.

Supervised Learning Algorithms.
In this work, we propose making use of supervised learning algorithms (SLAs) to estimate the position of the receiver. SLAs consist of two phases. The first is a training phase, in which the input data must have been previously annotated with their corresponding category; this phase generates a classification model. The second is a classification phase, in which the model is used to infer the category of the provided test data. That is to say, when applied to localization, an SLA is used to generate the RSSI fingerprint from which the location can be obtained.
In this work, we explore the use of two popular SLAs, namely, the k-Nearest Neighbour (k-NN) [21,29-31] and the Support Vector Machine (SVM) [28,29] algorithms. A brief description of these two algorithms is included in the following:
(i) k-NN: given a test instance, this algorithm selects the k nearest neighbours from the training set, based on a predefined distance metric. In our case, we use the Euclidean distance since our predictor variables (features) share the same type, that is, the RSSI values, properly fitting the indoor localization problem [34]. Although k-NN uses the most common category among the k located neighbours (that is, the mode of the category), some variations (e.g., weighted distance) are used to avoid discarding relevant information. In this paper, we have set the hyperparameter k to the values 1, 3, and 5. We have verified that further increasing k does not improve our results. We use both mentioned versions of the algorithm: the weighted distance (WD) and the mode (MD).
(ii) SVM: given the training data, a hyperplane is defined to optimally discriminate between different categories. If a linear classifier is used, SVM constructs a line that performs an optimal discrimination. For the nonlinear classifier, kernel functions are used, which maximize the margin between categories. In this paper, we have explored the use of a linear classifier and of a polynomial kernel with two different degrees, namely, 2 and 3. Finally, we present only the best results, which were obtained with a polynomial kernel with a quadratic function [34].
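The difference between the MD and WD modalities of k-NN can be sketched as follows. This is a minimal illustration using only the standard library: the inverse-distance weighting used for WD is an assumption, since the exact weighting scheme is not specified here, and the function names, sector labels, and RSSI values are hypothetical.

```python
import math
from collections import Counter, defaultdict


def euclidean(a, b):
    """Euclidean distance between two RSSI vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_predict(train, query, k=3, weighted=False):
    """Classify `query` from `train`, a list of (rssi_vector, sector_label).

    weighted=False -> MD: the mode of the k nearest labels.
    weighted=True  -> WD: each neighbour votes with weight 1 / distance
                      (assumed weighting, for illustration only).
    """
    neighbours = sorted(train, key=lambda item: euclidean(item[0], query))[:k]
    if not weighted:
        return Counter(label for _, label in neighbours).most_common(1)[0][0]
    votes = defaultdict(float)
    for vector, label in neighbours:
        # Small epsilon avoids division by zero on exact matches.
        votes[label] += 1.0 / (euclidean(vector, query) + 1e-9)
    return max(votes, key=votes.get)


if __name__ == "__main__":
    # Hypothetical fingerprints: (RSSI from two beacons, sector label).
    fingerprints = [
        ((-60.0, -80.0), "S1"),
        ((-69.0, -80.0), "S2"),
        ((-70.0, -80.0), "S2"),
    ]
    sample = (-61.0, -80.0)
    print(knn_predict(fingerprints, sample, k=3, weighted=False))
    print(knn_predict(fingerprints, sample, k=3, weighted=True))
```

In this toy case the query lies next to the single "S1" fingerprint: MD is outvoted by the two more distant "S2" neighbours, while WD gives the close neighbour enough weight to win, which is the effect of weighting information by its relevance.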

Experimental Setups.
From our preliminary experimental analysis, it is clear that we should keep in mind the following aspect. If the locations of the BLE4.0 beacons change, we must carry out a new off-line/on-line sample collection campaign. The measurement campaigns mainly consisted of the following three steps.
(i) Off-Line. RSSI measurement collection phase: during this phase, we use the receiver to collect a set of RSSI samples at predetermined locations spread over the experimental field covered by the BLE4.0 beacons.
(ii) Data Storage. The data is organized into a vector with one RSSI entry per BLE4.0 beacon and labelled with the coordinates of the receiver position.
(iii) Classification. We evaluate the performance of the two SLAs using the RSSI measurements as our source data. The evaluation is measured in terms of the accuracy of the estimated location of the target.

Classification Metrics.
Prior to the training phase, RSSI measurements are obtained by placing the receiver at different locations. These captures are then stored in a database during an off-line phase, including the ⟨x, y⟩ coordinates and RSSI level for each sample. Afterwards, the RSSI measurements of the receiver are captured again in an on-line phase. These latter instances are then compared with the derived model in order to predict the location of the receiver, that is, to generate the RSSI-based location fingerprint. We evaluate the localization performance of the two classification algorithms in terms of the following metrics:
(i) Global accuracy: the algorithm's precision in the classification phase. The value is calculated as the percentage (%) of exact positioning operations over the total number of positioning operations in the whole experimental area.
(ii) Local accuracy: the algorithm's precision in the classification phase for each individual sector of the experimental area. The value is also calculated as a percentage (%).
(iii) Mean positioning error: the average error over the whole experimental area. This error is calculated in meters (m), taking into account the total dimensions of each area. From now on it is referred to as the "mean error."
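The three metrics above can be computed from pairs of true and predicted sector positions, as in the following sketch. It assumes that each sector is represented by the ⟨x, y⟩ coordinates of its centre in meters and that the positioning error is the Euclidean distance between the true and the predicted centre; both are assumptions made for illustration, and the function name is hypothetical.

```python
import math
from collections import defaultdict


def localization_metrics(true_xy, pred_xy):
    """Return (global accuracy %, {sector: local accuracy %}, mean error m).

    true_xy, pred_xy: equally long lists of (x, y) sector centres in metres.
    """
    if not true_xy or len(true_xy) != len(pred_xy):
        raise ValueError("inputs must be non-empty and of equal length")
    hits = 0
    per_sector = defaultdict(lambda: [0, 0])  # sector -> [hits, samples]
    total_error = 0.0
    for truth, guess in zip(true_xy, pred_xy):
        correct = truth == guess
        hits += correct
        per_sector[truth][0] += correct
        per_sector[truth][1] += 1
        total_error += math.dist(truth, guess)
    count = len(true_xy)
    global_accuracy = 100.0 * hits / count
    local_accuracy = {s: 100.0 * h / c for s, (h, c) in per_sector.items()}
    return global_accuracy, local_accuracy, total_error / count


if __name__ == "__main__":
    truth = [(0.5, 0.5), (0.5, 0.5), (1.5, 0.5), (1.5, 0.5)]
    guess = [(0.5, 0.5), (1.5, 0.5), (1.5, 0.5), (1.5, 0.5)]
    print(localization_metrics(truth, guess))
```

A misclassification into an adjacent sector lowers the accuracy as much as one into a distant sector, but contributes far less to the mean error, which is why the metrics can rank configurations differently.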

Survey Campaign Characteristics.
Our experiments were conducted in a lab of our research institute. We placed four BLE4.0 beacons, one at each of the four corners of a rectangular area, and we considered two experimental areas with different dimensions. A fifth BLE4.0 beacon was placed in the middle of one of the longest edges of the room.
For the experimental areas used in this paper, we carried out a survey campaign as follows:
(i) We fixed the Tx of all BLE4.0 beacons to the same level.
(ii) We placed the receiver at the center of each one of the sectors, and we measured the RSSI of each one of the five BLE4.0 beacons for a number of minutes that depends on the experimental area.
(iii) The survey was carried out over a period of five days. The lab occupancy was limited to two people, the same two who were in charge of collecting the data during the afternoon hours.
Once the initial parameters are established, in the next sections we proceed to analyze other parameters for indoor localization in the two different environments, taking into account the physical area represented in Figure 5.

Performance Evaluation Results
This section analyzes the results for the classification algorithms in two different areas with different physical characteristics. This analysis has been performed taking into account the three classification metrics: (i) global accuracy; (ii) local accuracy; and (iii) mean error.

Experimental Area 1.
In this first setup, we explore the distribution and number of BLE4.0 beacons in the experimental field. The experimental field used for this first experiment is an area of 4 m × 3 m subdivided into twelve sectors of 1 m², as shown in Figure 8. Five BLE4.0 beacons, denoted by "Be07," "Be08," "Be09," "Be10," and "Be11," were placed around the area. With the main goal of identifying blind spots in the experimental field and the number of required BLE4.0 beacons, we carried out six independent trials: in the first configuration, we used five BLE4.0 beacons while, in the following five trials, we removed one BLE4.0 beacon at a time. As already mentioned, by limiting the maximum distance between the BLE4.0 beacon and the target to less than 8 m, we avoid large discrepancies in the distance-estimation model.
RSSI samples were collected in each of the 12 sectors for approximately five to six minutes. We evaluate and store the arithmetic mean of all the collected samples. No samples were discarded during this phase.
For each trial, the training set consisted of 231 vectors and the validation set of 99 vectors, randomly selected for each experiment. The results report the classification metrics of the algorithm executed 50 times.
Global Accuracy. Table 4 shows the global accuracy for the different BLE4.0 beacon setups in Experimental Area 1 using the MD modality for the different k values. The best global accuracy is obtained for k = 1, the best configuration being the one without BLE4.0 beacon "Be10." Moreover, the worst setup occurs when BLE4.0 beacon "Be07" is removed, with k = 5 giving the worst results. As seen from the results, the global accuracy decreases as k increases. In fact, this case represents the worst case; it clearly shows that attempting to estimate the position of the target using neighbouring values without taking into account their relevance has a negative impact on the results.
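The repeated random split described at the start of this trial description (231 training vectors, 99 validation vectors, 50 executions) can be sketched as below. Averaging the per-run scores is an assumption about how the 50 executions are aggregated, and `repeated_holdout` and `evaluate` are hypothetical names; `evaluate` stands for any function that trains on one subset and returns a score on the other.

```python
import random


def repeated_holdout(samples, evaluate, n_train=231, n_valid=99,
                     runs=50, seed=0):
    """Average `evaluate(train, valid)` over `runs` random disjoint splits."""
    if n_train + n_valid > len(samples):
        raise ValueError("not enough samples for the requested split")
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    scores = []
    for _ in range(runs):
        shuffled = list(samples)
        rng.shuffle(shuffled)
        train = shuffled[:n_train]
        valid = shuffled[n_train:n_train + n_valid]
        scores.append(evaluate(train, valid))
    return sum(scores) / len(scores)


if __name__ == "__main__":
    data = list(range(330))  # placeholder for 330 labelled RSSI vectors
    # Dummy evaluator reporting the validation share of each split.
    print(repeated_holdout(data, lambda tr, va: len(va) / (len(tr) + len(va))))
```

Drawing a fresh split in every run reduces the dependence of the reported metrics on one particular partition of the survey data.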
Mean Error. Table 5 shows the mean error for the different setups.
As seen from the table, increasing the number of neighbours k has a negative impact on the mean error when using the MD modality. On the contrary, the mean error is reduced by approximately 20% in most cases when increasing k from 1 to 5 in the WD modality. In this case, the setup with BLE4.0 beacons at the corners (i.e., without BLE4.0 beacon "Be09") gives the best results for k = 5. The worst results are obtained when we remove one of the two BLE4.0 beacons, "Be07" or "Be08," placed by the drywall side of the experimental area. Figure 9(a) shows the positioning error heatmap for the best global accuracy, and Figures 9(b) and 9(c) show the best positioning error using MD and WD, respectively.
From the results shown in the tables and heatmaps, we can conclude the following:
(i) A closer look at the heatmaps reveals very good results, with an estimated positioning error as low as 0.4 m.
(ii) Table 5 shows that the lowest mean errors are obtained using only four BLE4.0 beacons placed at the corners.
(iii) Table 5 shows that increasing k from 1 to 5 has a positive impact when using the WD modality of the k-NN algorithm but a negative impact when the MD modality is preferred. This shows the importance of weighting the information according to its relevance.
(iv) Figure 9(c) reveals that the use of the WD modality of k-NN considerably improves the accuracy at the center of the experimental area: fusing the weighted distances of the BLE4.0 beacons proves effective.
(v) The heatmaps reveal a higher mean error in the sectors close to the BLE4.0 beacons, being considerably lower in the case of "Be10." This latter BLE4.0 beacon has exhibited the lowest RSSI level among all BLE4.0 beacons; see Figure 6(d). This may translate into the estimation of only a slight change in distance as the signal decreases.
Local Accuracy. In this section, we evaluate the local accuracy for the best global accuracy and the lowest mean error cases; see Figures 10(a), 10(b), and 10(c), respectively. We are mainly interested in defining the guidelines to configure the localization setup according to the user needs. Figure 10(c), for k = 5, shows that the sectors close to "Be07" and "Be11" exhibit a mean error of around 2.25 m (refer to Figure 9(c)), with an accuracy of 40%, while the accuracy at the center of the experimental area, corresponding to the lowest mean error, is approximately 6%. We also notice that the lowest accuracy is reported in the sector close to BLE4.0 beacon "Be10." Figure 10(a) shows that the accuracy over the whole experimental area, with respect to the accuracy of the other two figures, can be improved by discarding BLE4.0 beacon "Be10." However, we notice that the mean error in sectors close to BLE4.0 beacons "Be07" and "Be11" gets severely affected. Figure 10(b) shows similar results to the case when BLE4.0 beacon "Be10" is removed. These results clearly show that the placement of four BLE4.0 beacons at the corners of the experimental area provides a more uniform localization accuracy over the whole area. However, from Figure 10(c) it is also clear that it is important to consider the relevance of the information provided by the BLE4.0 beacons, that is, the difference on the initial RSSI levels of each BLE4.0 beacon.

Case 2: SVM.
For the SVM classifier, we use three different kernels: linear, polynomial of degree 2, and polynomial of degree 3. As in the k-NN case, the analysis has been developed for the same experimental area and the same number of training and validation samples.
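For reference, the three kernels named above can be written out explicitly. The sketch below uses the common parametrization (gamma * <u, v> + coef0)^degree for the polynomial kernel; the values of `gamma` and `coef0`, as well as the function names and the example vectors, are assumptions for illustration, since the kernel offsets and scaling used in the experiments are not stated here.

```python
def linear_kernel(u, v):
    """Linear kernel: plain dot product of two feature vectors."""
    return sum(a * b for a, b in zip(u, v))

def polynomial_kernel(u, v, degree, coef0=1.0, gamma=1.0):
    """Polynomial kernel: (gamma * <u, v> + coef0) ** degree."""
    return (gamma * linear_kernel(u, v) + coef0) ** degree

def gram_matrix(vectors, kernel):
    """Pairwise kernel values, the quantity an SVM solver works with."""
    return [[kernel(u, v) for v in vectors] for u in vectors]

# Hypothetical (already scaled) RSSI feature vectors.
u, v = (1.0, 2.0), (3.0, 0.5)

print(linear_kernel(u, v))                 # 4.0
print(polynomial_kernel(u, v, degree=2))   # 25.0
print(polynomial_kernel(u, v, degree=3))   # 125.0
print(gram_matrix([u, v], linear_kernel))  # symmetric 2 x 2 matrix
```

A higher degree lets the decision boundary between sectors bend more, at the cost of a model that is more sensitive to noise in the RSSI samples.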
Global Accuracy. Global accuracy values for SVM are presented in Table 6. In all system configurations but the one not making use of "Be07," the table shows that a polynomial kernel of degree 3 provides better results than the other two kernels, that is, linear and polynomial of degree 2. When comparing one to one the results for each BLE4.0 beacon system configuration, we find that SVM (with the polynomial kernel of degree 2) reports a global accuracy that is between approximately 1% and 4% lower, depending on the BLE4.0 beacon setup, than the one obtained using k-NN with k = 1.
Mean Error. Table 7 shows the mean error for different BLE4.0 beacon setups. As can be observed, all mean error values are very similar, but the lowest mean error is obtained for the SVM with a polynomial kernel of degree 2. Furthermore, we note that all values are higher than the ones reported by the k-NN algorithm using the WD modality. Figure 11(a) shows the positioning error heatmap for the best global accuracy, and Figures 11(b) and 11(c) show the heatmaps for the lowest mean error. Similar to the results reported by the k-NN algorithm, sectors close to the BLE4.0 beacons exhibit a higher mean error. However, in contrast to the mean error heatmaps for k-NN (see Figure 9), the heatmaps for SVM are more uniform throughout the central sector of the experimental area. Moreover, the sector around BLE4.0 beacon "Be10" exhibits worse results than the ones obtained using the k-NN algorithm. This clearly shows the benefits of taking into account the RSSI levels reported by the various BLE4.0 beacons. These RSSI levels do not exclusively depend on the distance but also on the structural characteristics of the environment.
Local Accuracy. Figure 12 shows the local accuracy corresponding to the highest global accuracy and the lowest mean errors using SVM. Figure 12(a) shows similar behaviour to the results depicted in Figure 10(b): the sectors close to BLE4.0 beacons "Be07," "Be08," and "Be11" show better local accuracy results. However, the local accuracy for the sectors close to the BLE4.0 beacons is considerably lower than the one obtained when using the k-NN algorithm. This effect also causes a lower global accuracy. Figures 12(b) and 12(c) show similar behaviours, that is, lower local accuracies with respect to the ones reported by the k-NN algorithm.
From the results obtained with both algorithms (k-NN and SVM), we can conclude that, in order to assess the capabilities of BLE4.0-based wireless indoor mechanisms, it is essential to consider all three metrics: mean error, global accuracy, and local accuracy. To date, most studies limit their evaluation to reporting the mean error and global accuracy. We argue that, by providing the local accuracy together with the system parameters, transmission power, and actual placement of the BLE4.0 beacons, the system designer should be able to identify the shortcomings to overcome.
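To make the distinction between the three metrics concrete, the following sketch computes them from a list of true and predicted sector labels. The definitions used are assumptions for illustration: global accuracy is the fraction of correctly classified samples over the whole area, local accuracy is the same fraction computed separately for each true sector, and the mean error is the average Euclidean distance between the centres of the true and predicted sectors. The sector centres and labels are hypothetical.

```python
import math
from collections import defaultdict

def global_accuracy(true, pred):
    """Fraction of samples assigned to the correct sector."""
    return sum(t == p for t, p in zip(true, pred)) / len(true)

def local_accuracy(true, pred):
    """Per-sector fraction of correctly classified samples."""
    hits, totals = defaultdict(int), defaultdict(int)
    for t, p in zip(true, pred):
        totals[t] += 1
        hits[t] += (t == p)
    return {sector: hits[sector] / totals[sector] for sector in totals}

def mean_error(true, pred, centres):
    """Mean distance (m) between true and predicted sector centres."""
    errors = [math.dist(centres[t], centres[p]) for t, p in zip(true, pred)]
    return sum(errors) / len(errors)

# Hypothetical example: three sectors in a row, 1.5 m apart.
centres = {0: (0.0, 0.0), 1: (1.5, 0.0), 2: (3.0, 0.0)}
true = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]

print(global_accuracy(true, pred))      # 4 of 6 samples correct
print(local_accuracy(true, pred))       # {0: 0.5, 1: 1.0, 2: 0.5}
print(mean_error(true, pred, centres))  # 0.75
```

In this toy case, sectors 0 and 2 have the same local accuracy, yet the single misclassification in sector 2 contributes twice the positioning error of the one in sector 0. This is the kind of information that is lost when only one or two of the metrics are reported.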
In the following, we will carry out a second set of experiments. Our main aim is twofold:
(1) We aim to explore the system localization parameters. In this case, we will consider a slightly larger experimental area and the use of a medium transmission power level, Tx = 0x04.
(2) We aim to use larger training and validation datasets with respect to the ones used in the previous study.

Experimental Area 2.
In this second experimental setup, we further explore the performance of the k-NN and SVM algorithms using two different transmission power settings, namely, Tx = 0x04 and Tx = 0x07. We define an experimental area fragmented into 15 zones of 1 m² each, separated by a guard zone of 0.5 m² to better differentiate the RSSI of adjacent sectors. The experimental setup consists of a total area of 9.6 m by 6.3 m, where the minimal distance between a BLE4.0 beacon and the receiver is 1.5 m; see Figure 13. Similar to the previous experimental trials, we sampled the RSSI during two minutes at each of the fifteen zones. Table 8 shows the RSSI and the size of the training and validation datasets used for the two transmission power levels under study.

Case 1: 𝑘-NN.
Similar to the previous study, we proceeded to analyze k-NN with the same classification metrics for Tx = 0x04 and Tx = 0x07.
Global Accuracy. Table 9 shows the results for the different BLE4.0 beacon setups used in our environment, where the best configuration is obtained by removing BLE4.0 beacon "Be09" for Tx = 0x04. Furthermore, for both transmission power levels we can see that the BLE4.0 beacons placed at the corners are essential in this experimental area. Moreover, a higher value of k improves the global accuracy considerably.
Therefore, the use of a larger area with guard zones enables an improvement on the global accuracy: the classification algorithm is able to better differentiate the RSSI between the different sectors. A comparison with Table 4 shows an improvement as high as 7% with respect to the results obtained for the Experimental Area 1 setup. Also, as expected, an intermediate transmission power level and a higher value of k improve the global accuracy. Furthermore, our results also show that the BLE4.0 beacons should be placed at the corners of the experimental area setup.
Mean Positioning Error. Table 10 depicts the mean error for all system configurations for the k-NN algorithm. As can be observed, when MD is used, increasing the value of k does not always improve the mean error, as in the previous case. When WD is used, similar to the previous experiment, the mean error is considerably reduced. Also, in this latter case, the configuration with the four BLE4.0 beacons located at the corners and the one using five BLE4.0 beacons offer very similar results. From this table, we also notice that better results are obtained when a higher transmission power level is used, that is, for Tx = 0x04. Thus, better mean error results are obtained at the expense of using a higher transmission power level.
Figure 14 depicts different error heatmaps. Figures 14(a) and 14(b) show the heatmaps for the best results when setting Tx = 0x04 using MD and WD, respectively. Using WD proves effective in reducing the mean error across the whole experimental area. On the contrary, the heatmaps produced when using the MD modality show that the classification algorithm is unable to take into account the difference on the RSSI levels of the different BLE4.0 beacons. The sectors close to BLE4.0 beacon "Be07" are characterized by the largest mean error. In other words, since the RSSI value of BLE4.0 beacon "Be07" is higher than the ones characterizing the other BLE4.0 beacons, the estimated distance is longer than expected. This effect is worsened by the inclusion of BLE4.0 beacon "Be09"; see Figure 14(c). Also, when MD is used, the classification splits the area into two main sectors; see Figures 14(a) and 14(c). Finally, the use of the higher transmission power, Tx = 0x04, results in a more uniform positioning error throughout the whole area; see Figure 14(b). In fact, this system configuration provides the best results in terms of the mean positioning error, as shown in Table 10.
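Local Accuracy. As mentioned earlier in this section, the BLE4.0 beacons located at the corners are essential, as can be seen in Figure 15, which represents the behaviour of the local accuracy throughout the experimental area. Specifically, Figures 15(a) and 15(b) are related to the best global accuracy and mean error for Tx = 0x04, respectively. In this case, we notice that a more uniform local accuracy results in a lower mean positioning error; see Figures 15(b) and 14(b). More specifically, comparing Figure 15(c) with Figure 15(d), we can see that the use of BLE4.0 beacon "Be09" provides better local accuracy in remote sectors. The local accuracy for Tx = 0x07 exhibited similar results.

Case 2: SVM.
As with Experimental Area 1, we proceeded to analyze the SVM experimental results with the same classification metrics for Tx = 0x04 and Tx = 0x07.
Global Accuracy. Table 11 shows the global accuracy for Tx = 0x04 and Tx = 0x07. We can see that the best results, for both transmission power levels, are obtained when the configuration with all BLE4.0 beacons is used. Moreover, using a linear kernel for Tx = 0x04 provides significantly better results than the other SVM configurations. For Tx = 0x07, the best result is obtained using a polynomial kernel of degree 2, but it is far from the results obtained with Tx = 0x04. Comparing the results with Experimental Area 1 (see Table 6), we can see that SVM improves the accuracy by 10% when using a bigger experimental area.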
Mean Positioning Error. Table 12 shows the mean error values. Again, the best results for Tx = 0x04 are obtained when all BLE4.0 beacons are used with the linear kernel function, but for Tx = 0x07 the best performance is obtained when the BLE4.0 beacons are located at the corners of the experimental area (configuration without BLE4.0 beacon "Be09") and a polynomial kernel function of degree 2 is used.
Figure 16(a) depicts the positioning error heatmap for the best global accuracy for Tx = 0x04, which is also the configuration with the lowest mean error. Figures 16(b) and 16(c) depict the best global accuracy and lowest mean error for Tx = 0x07, respectively. The results are similar to those of the previous experiments: a good global accuracy does not, in general, yield a lower mean error, but a balanced mean error throughout the area provides better results. In Experimental Area 2, comparing Figure 14 with Figure 16, we also observe that k-NN has a more uniform mean error than SVM, since the central sectors normally have a lower mean error. Finally, comparing SVM for Experimental Areas 1 and 2 (see Figures 11 and 16), we observe that the mean error is improved in bigger areas, especially in sectors close to the BLE4.0 beacons, with the mean error being more uniform throughout the whole area.
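Local Accuracy. Figure 17 shows the local accuracy behaviour for different setups and transmission power levels.
We obtained similar results as in previous sections, where in general a balanced local accuracy provides a lower mean error (comparing Figures 17(a) and 16(a)). Typically, sectors placed in the corners have higher local accuracy, and BLE4.0 beacon "Be09" usually improves the local accuracy throughout the area, as we can see in the k-NN experiments, comparing Figure 15 with Figure 17.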

Lessons Learned
In this section, we summarize the main guidelines on setting the system configuration and algorithm parameters enabling the setting of a more accurate and robust localization mechanism. In our discussion, we will present our main findings by following the parameters related to the classification algorithms and the localization system setup. In order to guide our discussion, we will follow Figure 2, where we have listed the main parameters to be set. Furthermore, we will refer to the setting of our experimental system as a means to illustrate the applicability of our guidelines. Since the results obtained using the second experimental setup were clearly superior to those obtained in the first setup, we will derive the main guidelines from the lessons learned through our experimental trials. Our main aim is to provide guidelines allowing us to identify the main system and algorithm parameters to be tuned on the development of a robust and accurate BLE4.0-based indoor localization mechanism.
Table 13 summarizes the best system and algorithm setups derived from our study. As already mentioned, we focus on our second experimental setting. In fact, the size and organization of the experimental area were among the first parameters to be set; see Figure 2. From our preliminary study on the channel characterization, Section 3, we were able to identify the signal propagation allowing us to better distinguish the various sectors. As for the actual organization of the experimental area, our results have shown that the use of a guard zone proves to be effective in improving the classification process.
As for all the other parameters related to the system setting, the following guidelines can be derived.
(i) BLE4.0 Beacon Transmission Power. The setting of this parameter has to be derived taking into account the information that it may provide in order to enable the classification process. In this case, it should provide enough information to enable distinguishing the various sectors of interest. In our particular study, we found that the use of a medium power level showed slightly better results in terms of the mean positioning error for the k-NN (WD) and the SVM classification algorithm setups. In the case of the k-NN (MD) algorithm, the results exhibited a higher discrepancy. For this latter setup, we notice that both system configurations include all five BLE4.0 beacons. It is therefore clear that the information of BLE4.0 beacon "Be09" helps to compensate for the discrepancies on the RSSI levels reported by the BLE4.0 beacons located close to the windows and those located close to the drywall. As for the accuracy reported for the two transmission power setups, we notice that the use of a medium transmission power level considerably improves the accuracy of the localization mechanisms; see Tables 9 and 11.
(ii) BLE4.0 Beacons Position and Topology. From our preliminary study, Section 3, we have found the importance of identifying the materials composing the various walls. In a more complex setup where, for instance, big metal cabinets may be present, the designers should take care to evaluate the RSSI levels close to and around such objects. The information obtained from such a preliminary study should condition the actual topology of the system. In our case, we have found that the levels of the RSSI detected may vary considerably depending on whether the BLE4.0 beacon has been placed close to a drywall or a window. As seen in Table 13, the system configurations for the k-NN (WD) and SVM algorithms when using transmission power level Tx = 0x07 do not include BLE4.0 beacon "Be09." Since in this case the RSSI levels reported by the BLE4.0 beacons close to the window and those close to the drywall do not greatly differ, there is no requirement for BLE4.0 beacon "Be09." However, in the case when Tx = 0x04 is preferred, the inclusion of BLE4.0 beacon "Be09" provides some extra information and therefore compensates for the discrepancies on the RSSI levels reported by all the other BLE4.0 beacons.
(iii) BLE4.0 Beacons Density and Spacing. From our results, it is clear that the number of BLE4.0 beacons required to cover a given area will depend on the size of the area, the transmission power, and the RSSI levels reported by the BLE4.0 beacons. As our results show, the discrepancies on the RSSI reported by the BLE4.0 beacons due to the structural characteristics of the surrounding walls will require the use of additional BLE4.0 beacons. In our case, the use of four BLE4.0 beacons placed at the four corners provided the best results when using the lowest transmission power; see Table 13. The inclusion of BLE4.0 beacon "Be09" under these latter conditions exhibited slightly worse results; see Tables 10 and 12.
Regarding the operation and setting of the algorithm parameters, the following guidelines can be derived:
(i) Classification algorithms: a classification process needs to be able to differentiate the RSSI levels of the BLE4.0 beacons at different sectors. Therefore, a steeper fall on the RSSI should provide the best results. Furthermore, ambiguities should be removed in order to reduce errors on the classification process. In our particular setup, we had to limit the distance to 8 m and use medium and low transmission power levels.
(ii) k-NN classifier: the use of values of k higher than or equal to five provides good results given the limitations on the mean positioning error and accuracy reported by BLE-based localization mechanisms. In our setup, we have observed that the use of k = 5 may compensate for the discrepancies on the reported RSSI levels when using low transmission power levels. In fact, when using Tx = 0x07, the best configuration for k-NN (WD) does not make use of BLE4.0 beacon "Be09." However, the overall best results were obtained for the system configuration k-NN (WD), with k = 5 and using all five BLE4.0 beacons. This clearly shows that the use of a transmission power level enabling differentiating the various sectors, together with the use of a compensating BLE4.0 beacon "Be09," proves effective.
(iii) WD criteria work better than MD criteria because the former uses all the k neighbours to refine the final result with a weighted average distance.
(iv) SVM algorithm: similar to the results obtained for the k-NN algorithm, the choice of the transmission power plays a major role in the setting of the algorithm parameters. When the lower transmission power is used, the best results are obtained for a system configuration not making use of BLE4.0 beacon "Be09." Furthermore, the use of a higher transmission power level simplifies the configuration of the SVM. In this latter case, a linear classifier is used.
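The first guideline above (a steeper fall on the RSSI makes neighbouring sectors easier to tell apart) can be illustrated with the textbook log-distance path loss model. This model is an assumption introduced here for illustration only; it is not the channel characterization carried out in Section 3, and the reference level `rssi_1m` and the `path_loss_exponent` are hypothetical values.

```python
import math

def rssi_at(distance_m, rssi_1m=-60.0, path_loss_exponent=2.0):
    """Log-distance model: RSSI(d) = RSSI(1 m) - 10 * n * log10(d)."""
    return rssi_1m - 10.0 * path_loss_exponent * math.log10(distance_m)

def rssi_gap(d_near, d_far, **model):
    """RSSI difference (dB) between two distances from the same beacon."""
    return rssi_at(d_near, **model) - rssi_at(d_far, **model)

# Two sectors 1 m apart, close to the beacon and far from it.
near_gap = rssi_gap(1.0, 2.0)
far_gap = rssi_gap(7.0, 8.0)
print(round(near_gap, 2), round(far_gap, 2))
```

Under this model, the same 1 m separation produces a much smaller RSSI difference at 7 to 8 m than at 1 to 2 m, so fast fading is more likely to mask it. This is consistent with limiting the distance between the BLE4.0 beacons and the sectors to be classified.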
Finally, Figure 18 summarizes the recommendations for tuning up the indoor localization mechanism. The recommendations include the setting of the parameters of the classification algorithms and the configuration of the BLE4.0 beacons.

Conclusions and Future Plans
In this paper, we have explored the use of two supervised learning algorithms towards the development of BLE4.0 beacon-based location mechanisms. From our study, we have identified that the use of k-NN and SVM algorithms may prove effective in developing an indoor location fingerprinting mechanism. Furthermore, our results have provided us with some useful insight into the key parameters of both the physical infrastructure and the supervised learning algorithm. The Tx level and the number and placement of BLE4.0 beacons are the main physical parameters to be looked at, while the number of neighbours to be used plays a major role in the performance of the k-NN algorithm.
Moreover, in order to improve the indoor localization parameters, it is necessary to have a well-defined floor plan, for example, with guard zones, for the data acquisition phase, so that the RSSI in contiguous sectors can be differentiated.
With respect to the environment, it must be configured so that there is a fall on the RSSI across sectors. This is due to the fact that sectors that are close to the transmitters do not provide good results. Other important aspects to improve the indoor localization mechanisms are the topology, the Tx levels, and, above all, the hyperparameters of the classification algorithms.
Our immediate research activities will focus on the impact of room occupancy using multiple additional sensors and on the impact of using a wider range of machine learning algorithms for the location estimation.

Figure 3: RSSI-distance correlation for three different transmission power (Tx) levels.

Figure 4: Experimental area for Bluetooth signal attenuation experiments.

Figure 5: Pictures of each of the four corners of the laboratory.
) and 15(b) are related to the best global accuracy and mean error for Tx = 0x04, respectively.
) and 14(b). More specifically, comparing Figures 15(c) with 15(d), we can see that the use of BLE4.0 beacon "Be09" provides better local accuracy in remote sectors. The results of local accuracy for Tx = 0x07 provided similar results. 6.2.2. Case 2: SVM. Similarly as with Experimental Area 1, we have proceeded to analyze the SVM experimental results with the same classification metrics for Tx = 0x04 and Tx = 0x07.
) and 16(a)). Typically, sectors placed in the corners have higher local accuracy, and BLE4.0 beacon "Be09" usually improves the local accuracy throughout the area as we can see in k-NN experiments, comparing Figure 15 with Figure 17.

Figure 18: Overall recommended values for the parameters.

Table 1: Mean squared error and standard deviation obtained for a distance of up to 15 m for all transmission power (Tx) levels.

Table 5: k-NN, mean error (m), using statistical mode (MD) and weighted distance (WD), for different BLE4.0 beacon setups for Tx = 0x07. Best results are shown in bold.

Table 7: SVM, mean error (m) for different BLE4.0 beacon setups for Tx = 0x07. Best results are shown in bold.

Table 8: Average RSSI and training and validation data for the two transmission power (Tx) levels.

Table 10: k-NN, mean error (m), using statistical mode (MD) and weighted distance (WD), for different BLE4.0 beacon setups for Tx = 0x04 and Tx = 0x07. Best results are shown in bold.

Table 12: SVM, mean error (m) for different BLE4.0 beacon setups for Tx = 0x04 and Tx = 0x07. Best results are shown in bold.