An Adaptive Unscented Particle Filter Algorithm through Relative Entropy for Mobile Robot Self-Localization

Self-localization is a basic skill for mobile robots in the dynamic environments. It is usually modeled as a state estimation problem for nonlinear system with non-Gaussian noise and needs the real-time processing. Unscented particle filter (UPF) can handle the state estimation problem for nonlinear system with non-Gaussian noise; however the computation of UPF is very high. In order to reduce the computation cost of UPF and meanwhile maintain the accuracy, we propose an adaptive unscented particle filter (AUPF) algorithm through relative entropy. AUPF can adaptively adjust the number of particles during filtering to reduce the necessary computation and hence improve the real-time capability of UPF. In AUPF, the relative entropy is used to measure the distance between the empirical distribution and the true posterior distribution. The least number of particles for the next step is then decided according to the relative entropy. In order to offset the difference between the proposal distribution, and the true distribution the least number is adjusted thereafter. The ideal performance of AUPF in real robot self-localization is demonstrated.


Introduction
Autonomous mobile robots have increasingly been used in a wide range of applications [1,2].Self-localization is a fundamental and challenging task for the autonomous mobile robots [3].In order to successfully make decisions or perform the missions, the mobile robots have to know their poses precisely [4].The self-localization of mobile robot can be defined as the estimation for the pose of the mobile robot relative to the environment [5].
Over the past years many different solutions for mobile robot self-localization have been presented, and they can be divided into two classes: the sensor-based method and the sensor assisted probabilistic method.The sensor-based method only uses the sensor information to realize the mobile robot self-localization.Reference [6] presents a natural landmark-based indoor localization algorithm.The natural landmarks are extracted from the input image which is obtained by the vision sensor.Then the map positions of the natural landmarks are identified by comparing the input image with the reference data set to acquire the mobile robot self-localization.Reference [7] has suggested an algorithm that only utilizes the Bluetooth device to realize the self-localization of the mobile robot.Although the sensorbased method is easy to implement, it has an inherent limitation.The range and resolution of the sensors are subject to the physical law, and the sensor measurements can be perturbed by the noises.These limitations influence the sensors' ability to extract the useful information from the measurements.On the other hand the sensor-based probabilistic method can utilize probabilistic law to deal with the noise and the uncertainty of the sensor information [5].
Bayesian filter is the most important probabilistic method for self-localization which is a state estimation problem for nonlinear system with non-Gaussian noise [8,9].Bayesian filter is a class of statistical filter techniques via Bayesian inference, and it can recursively calculate the posterior probability, the estimated state, and the predicted state [10].Bayesian filter is easy to implement and has a wide range of applications, so it has been extensively studied in recent years [11,12].
There are several commonly used Bayesian filter algorithms that can be applied in the mobile robot selflocalization.Kalman Filter (KF) is a recursive algorithm for the linear system state estimation with Gaussian noise.KF has high computational efficiency and good accuracy [13].
In [14], the applicability of KF to the mobile robot selflocalization is analyzed.This algorithm is only suitable for linear systems.Extended Kalman Filter (EKF) is another kind of Bayesian filter for the mobile robot self-localization.In [15] EKF is used to localize the four-wheeled mobile robot equipped with the wheel encoders and a laser-rangefinder sensor.EKF is derived for the nonlinear systems, which linearizes the measurements and evolution models using the Taylor series expansions.EKF only uses the first-order term of the Taylor series expansions of the nonlinear functions.This linearization often induces large errors in the estimated statistics of the posterior distributions.This is especially evident when the models are highly nonlinear, and it can lead to filter divergence [16].Unlike EKF, the Unscented Kalman Filter (UKF) does not approximate the nonlinear process and the observation model.It uses the true nonlinear models.UKF approximates the distribution of the state random variables by the scaled unscented transformation [17].In [18], the performances of EKF and UKF are compared for the localization of an autonomous robot.The conclusion is that UKF has shown better performance in terms of average localization accuracy.UKF does not need to solve the Jacobian matrix, so it has fewer computations and higher precision.
Self-localization is a state estimation problem for nonlinear system with non-Gaussian noise.Neither EKF nor UKF can deal with non-Gaussian noise, but particle filter (PF) can handle the state estimation problem of nonlinear system with non-Gaussian noise [19].The key idea of PF is to use particles to represent the posterior distribution of the state.As new information arrives, these particles are reallocated to update the estimation of the state of the system [20].As a useful tool for dealing with the nonlinear and non-Gaussian problems, PF has been widely used in various fields [21,22].
In PF, the particles are usually sampled from the proposal distribution which neglects the new observation, and this can cause particle degeneracy and influence the accuracy of PF.Then the unscented particle filter (UPF) algorithm is proposed to resolve the problem.In UPF, UKF is used to generate the new proposal distribution [23].This proposal can incorporate the latest observation and match the true posterior distribution more closely, so UPF improves the accuracy of state estimation evidently.In traditional UPF the number of particles is fixed, and in each step each particle utilizes UKF to obtain the new importance proposal that can bring huge computation and influence the realtime capability of the filter.During filtering fixed number of particles may bring unnecessary computation, so the computation will be reduced if the number of particles can be dynamically adjusted according to the estimation quality during filtering.
Mobile robots usually work under a dynamic environment; real-time self-localization is thus needed to be realized [24].So UPF should reduce its computation to accommodate mobile robots self-localization.To the best of our knowledge there is no related work about how to reduce the computation cost of UPF and meanwhile maintain accuracy.The computation cost of UPF depends on the number of particles, so in order to reduce the computation cost of UPF the number of particles should be adaptively adjusted during filtering.
In this paper an adaptive unscented particle filter algorithm through relative entropy (AUPF) is proposed to enhance the real-time capability of UPF and meanwhile ensure its accuracy.In AUPF, the relative entropy is used to measure the distance between the empirical distribution and the true posterior distribution; the least number of particles for the next step is then decided according to the relative entropy.In order to offset the difference between the proposal distribution and the true distribution the least number is adjusted thereafter.AUPF can adaptively adjust the number of particles during filtering to reduce the necessary computation and hence improve the real-time capability of UPF.
The rest of the paper is organized as follows.Section 2 describes the dynamic state-space model of mobile robot self-localization problem.In Section 3 the concept of UPF is introduced.In Section 4 an adaptive UPF algorithm through relative entropy is proposed.Then Section 5 shows the experiment results which include the simulation results and the utilization of AUPF in a real robot self-localization.Finally, Section 6 gives the conclusions.

Dynamic State-Space Model of Mobile Robot Self-Localization
To solve the mobile robot self-localization problem, we need a self-localization model for robot.In general, such a model can be described using a set of discrete-time equations given as follows: where   ∈    denotes the self-localization state vector of the mobile robot at time step t;   means the dimension of   ;   ∈    denotes the process noise at time step  which is usually non-Gaussian;   ∈    is the observation vector at time step t;   is the dimension of   ; V  ∈   V is the measurement noise at time step  which is also usually non-Gaussian;  V is the dimension of V  ; the states follow a firstorder Markov process, and the measurements are assumed to be independent.The function  represents the state transition model,  :    *     →    .The function ℎ represents the state measurement model, ℎ :    *   V  →    .

Unscented Particle Filter Algorithm
In PF, it is hard to design the proposal distribution which can approximate the true posterior distribution reasonably well.The common strategy is to sample the particles according to the state transition.This strategy cannot consider the new observations, so it can not generate an ideal proposal distribution.UPF is proposed to resolve this problem.In UPF, UKF is introduced to produce the proposal distribution of particle filter which considers the new observation, and the produced proposal distribution is close to the true posterior distribution.UKF is a minimum mean square error estimator for nonlinear system.It utilizes the scaled unscented transformation to calculate the statistics of a random variable.UKF has higher accuracy than EKF especially when the models are highly nonlinear.In UPF, before sampling of the particles UKF is utilized to generate the proposal distribution which can consider the new observation.The proposal distribution generated by UKF has a bigger overlap with the true posterior distribution than PF, so UPF has relatively high accuracy, while, in UPF, at each time step each particle will utilize UKF to generate the proposal distribution, so the computation cost of UPF is huge.
The basic UPF algorithm is described as in Algorithm 1:

An Adaptive Unscented Particle Filter Algorithm through Relative Entropy
It can be seen that UPF described in Section 3 may provide high accuracy for estimation.However, it is computationally expensive due to the utilization of UKF for each particle at each time step.In this section, in order to reduce the computation cost and meanwhile maintain the accuracy of UPF, we propose an adaptive unscented particle algorithm by adaptively adjusting the number of particles during the process.Specifically, we introduce the relative entropy to bind the error induced by sample-based representation of UPF and correspondingly determine the number of particles for each filter iteration.

Relative Entropy Based Adaptive Adjustment of UPF.
The sample-based representation of UPF will bring the error.
If the error can be bounded, the accuracy of UPF will be insured.So the adaptive adjustment of the number of particles needs to ensure that the error introduced by the sample-based representation must be within a bound.The error can be measured by the distance between the empirical distribution and the true posterior distribution.In order to maintain the accuracy of UPF, it is assumed that the error must be within a prespecified value .
The relative entropy is a measure of the distance between two distributions [25].The relative entropy between two probability functions  1 and  2 can be defined by In UPF N particles are drawn from the proposal distribution with k different bins. denotes the number of particles drawn from each bin,  = ( 1 , . . .,   ). means the true distribution. specifies the probability of each bin,  = ( 1 , . . .,   ).The maximum likelihood estimate of  is given by p =  −1 . ( The likelihood ratio statistic   for testing  is given by Reference [25] gives a conclusion that the likelihood ratio statistic can satisfy a chi-square distribution convergence: So Then we can obtain where  is a given probability;  2 −1,1− can be obtained according to  and Equation ( 10) is transformed from (9).From (10) it's found that when the number of particles  is big enough the relative entropy between the maximum likelihood estimate and the true distribution is less than  with the probability 1 − .

Compensation for the Influence of the Importance Proposal.
In Section 4.1,  is assumed to be the true distribution, while the true distribution cannot be obtained, and it is approximately described by the proposal distribution.Furthermore the quality of the match between the proposal distribution and the true distribution is one of the main elements that determines the accuracy of the filter, hence the suitable number of particles.To fix this problem, we need a way to quantify the degradation in the estimation using samples from the proposal distribution.
In [26], Var true can measure the influence of sampling from the true distribution (), and it is calculated by Var In [27], when the particles come from the proposal distribution (), the variance of the estimator Var  can be given by (1) Initialization:  = 0  = 1 do draw the particles   0 from the prior ( 0 ) and set  where   is the number of particles coming from the proposal distribution, In order to obtain the similar accuracy, let (11)

2𝑁𝜀 > 𝜒
So the minimum number of particles  min  can be obtained according to

An Adaptive Unscented Particle Filter Algorithm through
Relative Entropy. Figure 1 illustrates the relationship between several basic methods.The adaptive unscented particle filter algorithm through relative entropy is shown in Algorithm 2.

Experimental Results
In this section, the results of two experiments are presented.In the first experiment the performance of AUPF is compared with UPF and PF through a simulation to validate the effectiveness of AUPF.Then the utilization and performance of AUPF for a real robot self-localization are also illustrated in detail.

Simulation Results.
To illustrate the effectiveness of the proposed AUPF, a scalar nonlinear system is considered which is shown in ( 18) and ( 19).The simulation is carried out Equation ( 18) is the state transition model.Equation ( 19) is the state observation model.  is white Gaussian noise, and its covariance is 1.5; V  is white Gaussian noise, and its covariance is 1.
We compare the AUPF with traditional PF and UPF.The number of particles in PF and UPF is set to be 100.The initial number of particles in AUPF is also set to be 100.In the experiment we use a value 0.99 for  and a fixed bin size Δ 0.02.Random 50-step state estimation results are shown in Figure 2.
The root mean square error (RMSE) is an important value that can be used to reflect the accuracy of state estimation.RMSE can be calculated by We have run AUPF, UPF, and PF for 10000 steps; the RMSE of the three algorithms is shown in Table 1.Meanwhile the average number of particles of the three algorithms is shown in Table 2.
From Figure 2 and Table 1 we can see that the estimation result of PF is not satisfying.Particle filter uses a set of particles to represent the posterior distribution, so the accuracy of PF is decided by the number of particles.In this experiment the number of particles is 100, and these particles cannot approximate the true posterior density well especially when the nonlinearity of the system is strong.The performance of UPF is the most excellent, because in this algorithm the latest observation is incorporated into the proposal distribution, so 100 particles are enough to approximate the true posterior density.In Table 2 the average number of particles of PF and UPF is 100; it is a fixed number of these algorithms, while in AUPF this number is adaptively adjusted according to (17).According to the adaptive adjustment of the number of particles the runtime of AUPF is reduced which is proportional to this number.

AUPF for Real Mobile
Robot Self-Localization.We hereafter experimentally validate the effectiveness of AUPF for mobile robot self-localization.The robot "E-Robot" which is developed by Chinese Academy of Sciences is utilized for mobile robot self-localization.E-Robot is shown in Figure 3.
During the mobile robot self-localization, the odometer information and the visual information are used.The picture is obtained by the ultrahigh definition color gun camera AMPSON DSC-B468C, and it is processed by FPGA Cyclone II EP2C20.The image pixel of the picture obtained by the camera is 256 * 256.In one second about 30 pictures can be processed.

Mobile Robot Motion Model.
The odometer can obtain the offset of robot pose (Δ, Δ, and Δ) from time  − 1 to time t, so the motion model of mobile robot can be represented by where (  ,   ) is the position at time step t;   is the orientation at time step t;   ,   , and   are non-Gaussian noise.

Mobile Robot Observation Model.
In our experiment, the mobile robot can move under an environment with many features, such as the white line and the two-dimension codes.All the features can be extracted from robot vision information.In this paper, the special two-dimension (2D) codes have been used as the features or landmarks.As shown in Figure 4, the outer contour of each 2D code is regular pentagon.A different 2D code has different dots inside the regular pentagon.The 2D code is developed by Institute of Automation, Chinese Academy of Sciences.It is easy to be processed and can bring plenty of information.In [28], the image processing for 2D code is explained in detail.
In our experiments, each 2D code is different from the others.Each 2D code is located at a fixed position, and it brings the location information.So once the mobile robot detects one 2D code, it can obtain one numerical value of selflocalization with the help of image processing.
In our experiment, the mobile robot self-localization result is represented by a set of particles,    = ⟨   ,    ⟩, where    means the self-localization value maintained by the jth particle at time step t, and    is the importance weight of the jth particle at time step t.Utilizing the mobile robot motion model and  −1  , the jth particle can obtain an original selflocalization numerical value    .If the robot is localized at    , the distance between    and V  can be used to calculate the probability (  |  −1  ) with which the robot localized at    can observe the feature   .The probability will become bigger along with the decrease of the distance The extraction of the features just depends on the robot's position and orientation, so the features are mutually independent.The observation model ( |  −1  ) is formulated in Actually, the mobile robot can detect at least one feature in our experiment, and no more than two features will be chosen which are the nearest to the robot.

The Realization of AUPF for Mobile Robot Self-Localization.
In the experiment, the initial number of particles of AUPF is set to be 400.The number of particles during filtering at time instant t is represented by   .The process of the realization for mobile robot self-localization is reported as follows.
(1) Obtain the initial position of the mobile robot  0 according to the robot vision.
(2) Around  0 a field of 2 * 2 meters is chosen, and 400 particles are uniformly distributed in this field; the importance weight of each particle is 1/400.
(3) Obtain new particles according to UKF algorithm and the mobile robot motion model which utilizes the odometer information.
(4) Obtain the vision information, extract the features, and update the importance weight of each particle according to (17).
(5) Output the estimation of mobile robot position according to (18), then calculate the orientation of the robot according to (24), and then return to step 3 at the next time step.
(6) Calculate the number of particles   .
(7) Resample   particles according to their importance weight.
(8) Set all the importance weight of particles to be 1/  .
The robot position (  ,   ) is obtained by AUPF; then the orientation   can be calculated according to where (, ) is the position of one 2D code in the world coordinate, and this position is already known.All the positions used above are in the world coordinate.(   ,    ) is the position of the 2D code in the robot coordinate, and this position can be easily obtained by the image processing.

Experiment Results and Analysis.
In order to illustrate the effectiveness of AUPF, we designed a mobile robot selflocalization experiment.In the experiment, the E-Robot randomly moved on the ground, and a trajectory was generated.Then the trajectory was recorded.The trajectory is depicted in Figure 5 by a black line.From Figure 5 we can see that the 2D codes are located on the ground as landmarks.When the robot moved, AUPF was run to calculate the robot position at each time step.The effect of AUPF can be measured by the difference between the calculated position and the trajectory.Then we adopt AUPF, UPF, and PF to obtain the estimation of self-localization value.The estimation results of the three algorithms are shown in Figure 6.
The mobile robot self-localization result can be measured by the mean relative position error (MRPE) which is defined in where   true and   true are the true robot positions in Cartesian coordinates at time step t, tf is the total time instant of the three algorithms, and x and ŷ are the estimation results.
From Figure 6 and Table 3 we can see that the MRPE of AUPF is better than PF, and the MRPE of UPF is the best.That is, because in UPF each particle moves according to a new proposal distribution which utilizes the new observation, while in PF the particles cannot move appropriately without the direction of new observation.We also find that AUPF can acquire relatively ideal accuracy.The number of particles in AUPF is reduced, and the reduction is according to the estimation quality, so the reduction will not cause severe loss of accuracy.

Conclusions
In UPF the number of particles is fixed, and the computation cost of UPF is relatively large which influences its realtime capability.In order to accommodate the mobile robot self-localization under dynamic environment, an adaptive unscented particle filter algorithm through relative entropy has been proposed in this paper.This algorithm utilizes the concept of relative entropy to measure the distance between the empirical distribution and the true posterior; then the difference between the proposal distribution and the true distribution is offset to obtain the least number of particles of UPF for the next step.This algorithm can effectively reduce the computation cost and enhance the real-time capability for UPF.In the future, we will research on how to further reduce  the computation of UPF at the least cost of the accuracy.Moreover, the influence of the accuracy will be investigated and analyzed.

Figure 1 :
Figure 1: The relationship between several basic filter methods.

Figure 3 :
Figure 3: The E-Robot used in the experiments.

4 Figure 4 :
Figure 4: Some examples of 2D code developed by Chinese Academy of Science.

Figure 5 :
Figure 5: The real trajectory of the E-Robot in the experiment.