Optimal Uncalibrated RSS Indoor Positioning and Optimal Reference Node Placement Using Cramér-Rao Lower Bound

In this paper we propose a global positioning algorithm of multiple assets based on Received Signal Strength (RSS) measurements that takes into account the gain uncertainties of each hardware transceiver involved in the system, as well as the uncertainties on the Log-Distance Path Loss (LDPL) parameters. Such a statistical model is established and its Maximum Likelihood Estimator (MLE) is given with the analytic expression of the Cramér-Rao Lower Bound (CRLB). Typical values of those uncertainties are given considering whether calibration is done in production, in situ, or if hardware is used uncalibrated, in order to know what is the expected accuracy in function of the calibration setup. Results are tested by numerical simulations and confronted to real measurements in different room configurations, showing that the theoretical bound can be reached by the proposed MLE algorithm.


Introduction
Assets positioning raised a great interest in the last decade, especially with the Internet of Things (IoT) business. In this context, we expect indoor and outdoor positioning to be achieved with the same hardware, a low energy constraint for an autonomy of a few years. Economical aspect can even constrain each reference node, named Access Point (AP), used to locate the target assets, to run on battery.
Outdoor positioning is mainly achieved using Global Navigation Satellite System (GNNS) with an accuracy of a few meters that is enough for this kind of applications. Even if GNNS receiver energy consumption has been greatly improved in the past decade [1][2][3][4][5], this still mainly limits the asset autonomy to a few years. There is more energy efficient but less accurate work in progress systems where the measurement effort and position estimation are reported from the asset to the infrastructure using mainly a Low Power Wide Area Network (LPWAN) connection [6][7][8][9][10]. Sub-GHz bands allow using the same transceiver for both indoor and outdoor positioning.
Indoor positioning using GNNS requires additional infrastructure because of the poor signal level; the monolithic solutions [11][12][13][14] can be very precise but the need of a great number of external antenna does not meet most economical requirements.
This paper focuses on indoor solutions that meet the constraints of low energy (for assets and APs) with minimal infrastructure and setup to reach a precision of a few meters. We particularly focus on the static assets positioning usecase, which implies that some techniques such as pedestrian dead reckoning are not applicable for the solutions we are studying. This area of indoor positioning using LPWANs, WIFI, or Bluetooth Low Energy (BLE) raised a great interest in the last decade and many solutions have been proposed, based on Received Signal Strength Indication (RSSI), Time of Arrival (ToA), and Angle of Arrival (AoA). Reviews on recent advances and capacities can be found [15][16][17], and theoretical bounds like Cramér-Rao Lower Bound (CRLB) and 2 Journal of Sensors algorithms have been given and reviewed for most techniques [18][19][20].
The common framework widely used proceeds in two steps instead of direct estimation to reduce complexity even if this is suboptimal in general [19]: (i) A given number of measurements related to distance are collected in a short time slot; outliers and noise are filtered and an approximate distance measurement is inferred from those, which are not coherent one with the other due to multipath and measurement noise.
(ii) Those distances are then aggregated using algorithms, which can belong to the following domains: ToA based solutions can be very accurate (within a meter or centimeter), such as Ultra Wide Band (UWB) or collaborative positioning, but they require expensive hardware and moreover synchronization signals are involved which increases the power consumption. On the other hand such a precision is not needed for asset positioning.
This paper then considers less precise but lower cost solutions like Received Signal Strength (RSS) positioning using low energy protocols like BLE. In this case the accuracy is strongly limited by the fast and slow fading effect arising in dense multipath environments. Fast fading can be mitigated using several measurements in time and their first step preprocessing (averaging and removing outliers from the average, i.e., values that are far from the mean value, or using median value) before running the second step positioning algorithm (which is known as time diversity [21]).
Contrariwise, slow fading effects need spatial diversity or frequency diversity to be reduced. For a given multipath configuration, frequency diversity changes fading effects on the RSS allowing simple algorithms like averaging or maximum selection to improve the measurement at the preprocessing stage of the positioning [22,23]; this solution is promising because it is only at the energy cost of a few emissions using an agile emitter. Theoretical bounds and finding algorithms should be more developed in the future to tackle the slow fading effect.
Fingerprinting methods include the multipath effect for each specific room in the propagation model to compensate the static part of slow fading effects. Depending on the AP placement, this method can drastically improve the accuracy of positioning compared to physical models based solutions. The main drawback is the learning process energy consumption and hardware cost that should be deployed at the setup for static environments, and moreover the continuous learning process needed to correct slow changes in the environment.
An energy efficient solution to reduce slow fading is to use a sufficient number of APs in sparse places of the room to give redundant measurements, with different multipath biases. This spatial diversity may increase the accuracy of the positioning algorithm when it is able to take advantage of redundant measurements. There is then a trade-off between the hardware cost of a great number of APs and the desired accuracy of positioning. Part of the hardware cost resides in its installation process which is greatly reduced when those receivers are battery powered (so they do not need to be connected to a power source), when no calibration of the hardware gain is needed, and moreover when precise coordinates of AP placement in the room are not needed.
Calibration of APs gains can be done in the fabrication process or at setup time which takes into account the antenna coupling with the environment, or calibration can be run jointly to positioning as an unsupervised learning process. Many works deal with the calibration of radio map with pure machine learning [24][25][26][27], where others introduce path-loss model into the learning process [17,24,28]. It is difficult to find how much radio map learning can improve the positioning performances and what will be quantitatively the positioning error after the learning phase.
To help the design of a positioning system, the authors of the previous work proposed theoretical bounds of positioning error with uncalibrated propagation model that are reachable in an analytic way [29]. This bound has already been derived for RSS, ToA, and Difference Time of Arrival (DToA) techniques but not in the specific case of uncalibrated Aps [19]. In the case of RSS phenomena like measurements quantification are taken into account but lead to numerical expression of the CRLB [30]. The CRLB is generally used in statistics to bound the error variance of a given estimation problem using the FIM derived from the likelihood function of an estimate. This bound cannot be used to find an estimation algorithm but gives a limit to the performance of any estimator that can be proposed to solve this specific estimation problem. Moreover this bound can sometimes be optimistic when there is no existing algorithm capable of reaching the performances of the bound.
We first consider in this paper the whole estimation of multiple assets positions, as it may improve performances compared to the independent estimation of each single target in presence of common APs miss-calibrated parameters. Then full equations are given and compared to previous paper single asset positioning [29]. The Maximum Likelihood Estimator (MLE) is also confronted to real experimentation and measurement. Comparing error performances to CRLB gives the efficiency of each solution so one can judge the loss of precision in regard to the algorithm complexity and energy efficiency. Moreover the analytic expression of the bound helps to design, without any experimentation or simulation, the best APs coverage density and calibration efforts to be run with a given precision objective.
In the next section, we first give the stochastic propagation model and state the positioning problem as an uncertain static parameter estimation. Then analytic CRLB and the Journal of Sensors 3 MLE reaching this bound are derived. Section 3 aims to verify previous results using simulations: the analytic bound is compared with variance and RMSE of numerically computed estimates.

Optimal RSS Positioning Algorithm with Uncertain Calibration Parameters
We consider the positioning problem of I target assets which sends N signals to J APs placed in a given room. The positioning algorithm takes part of a propagation model which contains uncertain parameters model for each asset and APs. It must be noted that this estimation problem is not a joint estimation of the model parameters and the position at the same time (like it is often done when dealing with fingerprinting) but rather a position estimator which takes into account the residual uncertainties of model parameters from the calibration process to estimate a position.
The propagation model and estimation problem are stated in the next section; the likelihood and maximum likelihood estimator is derived in Section 2.2. The analytic CRLB is given in Section 2.3, and finally the impact of asset gain, APs gain, and reference model gain are shown using this analytic formula at the end of Section 2.3.

Problem Statement and Uncertain Model.
We consider the position estimation problem to be a static parameter estimation: knowing the RSSI channel model, the uncertainties on the parameters model, we want to find̂, the best estimate of position , inside a given room from some measurements ( ) by maximizing the likelihood function L( | ( )): The use of probabilistic algorithms requires a propagation model; the Log-Distance Path Loss (LDPL) model is a common choice for indoor (NLOS) propagation [31], which defines the received power in relation with the distance as it follows: where 0 is the loss at a reference distance of 1 m, is commonly known as path loss exponent, path loss factor, or path loss gradient, ( , resp.) is the loss due to receiver's antenna (transmitter, resp.), and V( ) is the measuring noise modeled as a random Gaussian variable.
Fast fading phenomenon is removed by the preprocessing step, usually by removing outliers using median or Kalman filtering. NLOS situation is taken into account by the lognormal probability of shadowing V( ) and does not worth to be modeled separately from measurement noise as for time delay techniques.
Path Loss parameters are known with relative precision depending on the calibration setup when it exists; its value differs from the theoretical value of 2 (valid in the case of isotropic propagation in vacuum). Density of obstacles and antenna directivity change this value which can be different from an AP to another. However, it is possible to find average values for those parameters working fine in most case and improving them with the collected RSSI over time [27,31] and measure a mean value and a variance for and .
If we do not calibrate ( , 0 , and , resp.), then we can consider it as a random variable normally distributed around an average value with a variance . Similarly, we can model all those parameters by a mean value and a variance expressing the residual uncertainty of the measure. We want to provide some typical values for three different calibration modes: production calibration where the static gains are measured prior to installation, in situ calibration where the gains are measured with the device fixed at its final position (more complexes and being costly for industrials, but it takes into account the possible change in gain because antenna coupling with the material the device is attached on), and finally no calibration where we only know the general Probability Density Function (PDF) of the gains from mean and variance measurements. We conducted experiments in three different rooms (confined, semiconfined, and open environment) to get typical values for our hardware: we measured RSSIs at known distance from multiple APs with several target assets and measured the mean and standard deviation of each parameter of the model using a Root Minimum Square (RMS) optimization. We also measured the static gain in an anechoic chamber using an USRP B200 from Ettus Research, measuring as well the variance of our measurement; results are compiled in Table 1. It must be noted that mean gains ( and ) are set to zero for in situ calibration because their mean value is reported to be 0 as we do not have a reference measuring tool such as in production calibration.
We propose studying the influence of those variances over the accuracy and precision of the likelihood estimate using CRLB, for instance, to know which accuracy we could expect by skipping or not the calibration of the receivers or the APs.
Let us consider that assets send signals to APs in the same room. The path loss model (2) applied to the ∈ N = { , 0 ≤ < } signal sent from the i th asset and received at the j th AP gives a strength measurement (in dB) and its expectation is expressed by with Δ ( , ) = 5log 10 ( ( , ) 2 ) being the log-distances and ( , ) 2 = ( − )( − ) being the distance between the target i and the AP number j.
It is shown in the Appendix that the vector of all × × measurement can be modeled as the statistical expectation Table 1: Typical uncertain parameters models when production calibration is run, or when calibration is done at the setup process in situ, or for the uncalibrated scenario. The notation is the value obtained from calibration where N(], 2 ) is the normal law of mean ] and variance 2 .

Parameters Production calibration
In situ calibration Not calibrated Thus the measurements are affected linearly by a multivariate covariant random uncertain vector, so the likelihood and MLE can be obtained.

Log-Likelihood Expression.
The positioning problem is to find all the assets coordinates components of the vector with ∈ K = { , 0 ≤ < }, = 3 , for 3D positioning and = 2 for 2D where = [ 0 ⋅ ⋅ ⋅ ]. The likelihood of a measurement vector is simply the probability density of the multivariate random vector W to equal − . Then the probability density function of such a vector follows a multivariate normal law and can be expressed in matrix form [32], which gives the following expressions for likelihood (L) and log-likelihood (LL): Then the maximum likelihood to be solved iŝ Analytic solution to this optimization problem seems complicated as long as uncertainties on (involved in the covariance matrix Σ W ( )) are to be taken into account: this involves a covariance matrix which depends on the unknown optimization. This is the scope of future work because the part of Σ W that depends on is the diagonal matrix Σ W in an additive way and further calculations may be solved analytically in the future.
In this paper we consider that all path-loss exponents are certain, or sufficiently calibrated, with known values . Then only miss-calibrated gains on assets and APs are considered uncertain as follows. Then Σ W does no longer depend on , which simplifies (9) and (15) to The positioning problem becomes the following nonlinear least square formulation: which is efficiently solved in an iterative way [33]:

Analytic Cramer-Rao Lower Bound.
To compute the theoretical bound, the FIM matrix should be derived using Once again, the log-likelihood derivative with respect to does not permit analytic expression of the bound because of Σ W depending on : In this case the FIM can only be obtained in a numerical way. Taking the assumption of a well-calibrated path-loss exponents (14) and (11) gives Then the CRLB can be used to obtain the inequality Then as / is overdetermined compared to the dimension of W we can use the Moore-Penrose Pseudo Inverse (MPPI) [34]: where + and ( ) + stand for the MPPI of . As the measurements sensibility is independent from an asset to another and the expectation of measurements is independent from a measure to another, the derivative / is a block diagonal repetition of the form diag ∈I ( / ⊗ 1 ) where = ( ) ∈J is the vector of all expected measurement for the i th asset. Then pseudoinverse is expressed as a vertical vector as Then using the covariance matrix structure (A.5) of W with expression (18) the inverse FIM is simplified in The inverse FIM shows that each asset positioning accuracy is independent one from the other, which shows that estimating positions of multiple assets is equivalent to making different estimations of one asset. Hence, information measured by each asset does not improve positioning of the others, and results from previous work [29] can also be used for multiple assets without loss of optimality. Then the CRLB of the i th target positioning can bound the RMSE of its estimate error with This formula shows that the increase of assets to be positioned does not improve accuracy (as the positioning algorithm does not calibrate the gain mismatch, this result seems fine). The number of measurements only helps to mitigate the fast fading and noises measurement 2 and does not compensate the calibration of the APs gains. Hence, we see that we can stop measuring values when we reach 2 ≫ 2 / . Two terms depending on the geometrical configuration represent the Geometrical Dilution of Precision (GDoP) of the positioning. This means that the optimal AP placement depends on the quality of assets, APs gain calibrations, and fading probability.

Simulations and Experiments
In this section we compare analytic expression with simulations. The first subsection validates the analytic form of the FIM with simulations, showing analytic versus simulated covariance matrices. Then Section 3.2 compares results from real measurements against CRLB expectations for various results from a numerical simulation. Those results were made for several different configurations of APs.

Numerical Verification of Cramer-Rao Lower Bound.
For the numerical verification of previous equations, we used a simulator generating RSSIs with an additional pseudorandom noise. From those values we computed a position estimate using the MLE, and we measured the covariance matrix of those estimates. We then plotted the estimates coordinates, the 2-sigma ellipsis of the numerical verification, and the CRLB covariance matrix. An example of the results is shown in Figure 1 with a room of11 by 6 meters, one AP at each corner, = = √ 2 and = √ 3, and two target assets at positions , = [5 5] and , = [2 2]. In this figure, we can see that the simulated covariance is bigger than the CRLB which is consistent with the theory. On regions closer to APs (position of transmitter 1 in Figure 1), the likelihood function is highly nonlinear due to the nonlinearity of the LDPL model close to zero; hence the estimated positions will no longer be spread linearly but rather in circle around the AP combined with the fact that we have further all APs and the GDOP is not good in this configuration; some noisy measurements might be estimated outside the room, which is distorting the covariance ellipsis: the CRLB ellipsis less fits the simulated values; however estimates covariance is always above the CRLB, so the results are still theoretically correct. We can see that filtering those outliers leads to a closest match between CRLB and simulated values. On the rightmost part of the figure you can see the RMSE of the Euclidean distance ((1/ ) ∑ √(̂− ) 2 + (̂− ) 2 wherêand̂are the estimates of the real coordinates and ) versus the trace of the CRLB (the criterion in equation (21)).

Comparison with Real Data.
To make sure that the CRLB matches real case scenarios, we conducted experiments in different room configurations, with several APs configurations at various known positions and computed the Cumulated Density Function (CDF) of the distance error between estimated and real positions. This subsection depicts the setup and the results.

Presentation of Hardware and Experimental Setup.
After a numerical verification of the equations, we confronted the CRLB results with real measurements. Experiments took place in 11 by 6 meters office with two devices from firm FFLY4U: FFLYdot and Myria, respectively, for APs and tracked devices (see Figure 2 for a schematic of the setup and pictures of the devices). Both devices use a Nordic NRF52 for Bluetooth communication and RSSI ranging. It must be noted that for practical issues in this scenario the APs were set as BLE advertisers (iBeacon) and the RSSIs were measured by the tracked asset. This is done without any loss of generality and its impact on the algorithm results in a change of gains between receivers and transmitters ( and values are swapped). This inverted scenario is necessary because the APs only have a broadcast capability. The APs were broadcasting with a period of 25 milliseconds and a total of 140 measurements were collected for each AP. The tracked device was placed on a regular grid of 135 points spaced with a distance of 0.6 meter, using existing marks on the ground, whose size was fine-measured using a laser rangefinder. To be able to easily change the APs coordinates for position optimization tests, each AP was mounted on a mobile wooden pillar.

Cumulative Density Function of the Error for Simulated Data, Real Data, and CRLB in Various AP Configurations.
To compare those measurements, we also computed a CDF using our RSSI simulator: for all 135 positions of measurements, we simulated 140 measurements, from which we computed the estimated position. For the CRLB, we generated 1000 positions estimates from the covariance matrix at each measuring point. It must be noted that the number of estimates is significantly higher than the CRLB because we want to generate a theoretical smooth curve and the random generated position is just an easier way to compute the theoretical curve from the analytic expression, taking into account all the different values of the CRLB at the different positions of the room. We made measurements with 7 and 4 APs placed at easiest mount points in the room (close to existing pillars, e.g., not on windows); the resulting CDF curves of those simulations and measurements are shown in Figure 3, showing that the CRLB is as expected slightly better than reality but not too much conservative or optimistic. Hence, the CRLB could be used as a good indicator of the performance of the AP topology, giving a metric of the error in meters. Moreover, simulated values also produced a more slightly better result as they do not include effect of multipath propagation and fading.

Impact of Calibration on the Expected Accuracy.
Calibration is a step costly in time, which involves measuring in postproduction the static gain of all devices (APs and tracked assets), which might also vary in time for devices on battery. However, it could be skipped by measuring the average and variance of gains and model parameters and using those values in the model. As it could save time and Journal of Sensors   money for industrial deployments, we want to study its impact on the expected accuracy. Now, we showed that our CRLB is consistent with real measurements; we will use it as a criterion to see if calibration is required depending on the expected accuracy. We simulated the CRLB in the case of a squared room with an AP at each corner in function of the size of the room (Δ, which symbolizes the meshing density), our relative coordinates [ , ] = [ , ] /Δ, and the number of measures . We compared the result in the case where no calibration has been made ( = = 2), where APs are calibrated in production ( = 2, ≈ 0), and where both APs and tracked devices are calibrated in production ( = ≈ 0). Results are shown in Table 2. What we can see on those results is that if we do not calibrate the APs gains, the lower error achievable is equivalent to Δ/2; that means the positioning is equivalent to a cellular algorithm. If we want a higher precision than cellular setup, we necessarily have to calibrate the APs.

Conclusion
The positioning problem of multiple assets with uncertain receivers gain and uncertain propagation model has been addressed. Typical values of uncertainties have been given from the observation of multiple setups and calibration realized in different industrial environments. Based on this model, the global MLE algorithm is given and formulated, having an iterative nonlinear least square solver. The CRLB is expressed in the general form and analytically for the specific case of well-calibrated path-loss exponents. Simulations show that the bound is reached by the MLE and moreover real measurements and estimations show that this result is not too much conservative; that is, the bound is a good estimation of the expected accuracy. Using this analytic CRLB, we discuss the impact on the accuracy of access points gain calibration, assets gain calibration, precision of measurements, number of measurements repetitions, and number of joint assets to be positioned.
It first shows that in contrary to joint calibration and positioning algorithms the number of assets to be estimated has no effect on the accuracy of each estimation. One can then reduce the solver complexity by running independent estimations without loss of optimality.
Secondly, it shows that the positioning of RMSE is split into two terms involving APs calibration, measurement precision and repetition, weighted by a parameter connected to the geometric position of the APs inside the room on the one hand. And on the other hand the asset calibration gain and LDPL reference gain are weighted by a different geometrical factor to impact the accuracy. Then it shows that depending on different calibration quality or efforts different geometrical terms can be involved and then different optimal Journal of Sensors 9 APs configurations can arise. Using a numerical application, we also showed that with typical hardware the error was close to those of cellular positioning if we do not calibrate APs. More generally, this can be used for dimension and to optimize a setup to reach a desired accuracy. It showed an example of how to infer the number of APs to reach accuracy in a given room. Future work could infer the number of measurements required for a given accuracy in function of the calibration error of the APs or even to find the optimal AP disposition in a given room for a given calibration.

Full Measurement Vector Construction
Equation (3) givesthe RSS measurement obtained for signal number ∈ N received by the AP number ∈ J = { , 0 ≤ < } sent from asset number ∈ I = { , 0 ≤ < }. In this section we use matrix algebra with the Kronecker product, noted ⊗, to give the full measurements vector and its expectation value .
We first stack in the vector the expectations of the strength measurement transmitted between a couple ( , ) ∈ I × J of target and AP. This expectation is the same for all measurements; only the measurement and shadowing effect noise V ∼ N(0, 2 ) change from one measurement to another. We then define the Gaussian random vector variable where O is the null vector of size N, and I is the identity matrix of size , and express those measurements as ( ) = ( + 0 + + Δ ( , )) ⊗ 1 ( ) = ( ) For stacking all APs on the vector we need to define the mismatch gain vector of those J receivers as = [ 0 . . . The two multivariate random vectors W and W are independent so they can be added to form a single multivariate vector which simplify expression to