Robot Motion Planning Method Based on Incremental High-Dimensional Mixture Probabilistic Model

,


Introduction
In recent years, as robots play an increasingly important role in industrial production and daily life, the issue of motion planning has received extensive attention.Although modern robots have significant differences in configuration, size, perception, driving methods, and application scenes, autonomous navigation and planning in complex environments are common problems faced by almost all robots [1].
The motion planning problem refers to, given the related description of robot and environment, initial state and goal region, seeking a series of control inputs to drive the robot to complete the movement from the initial state to the goal region while satisfying the environmental constraints (without colliding with the obstacles).However, for some robots with high planning dimensions, their configuration space's obstacle region cannot be explicitly described.In response to this problem, sampling-based motion planning algorithms have developed rapidly and received widespread attention [2][3][4].This type of methods do not explicitly describe obstacles.Instead, they rely on the collision query module to provide feasibility information of the candidate trajectories and connect a series of collision-free samples to generate a feasible path from the initial state to the goal region.The collision query module is generally implemented by the robot kinematics calculation and the space bounding box principle [5], which requires a large computational overhead to make the module a major bottleneck for limiting the efficiency of the sampling-based motion planning algorithms [6].Sampling-based motion planning algorithms can generate a large number of labeled samples with spatial collision information in planning instances, which provides a necessary condition for the implementation of machine learning methods.If the robots can learn from past planning experience to guide the future planning tasks, more efficient motion planning can be achieved.Therefore, how to use machine learning methods to break the efficiency bottleneck of motion planning algorithms has become a research focus in this field in recent years.
A large amount of research work is devoted to performing adaptive sampling and guiding the planning algorithm to explore certain areas in the configuration space with machine learning methods.Dalibard et al. [7] use the principal component analysis (PCA) method for online analysis of samples to estimate the direction and position of the local narrow passage in collision-free space and increase the sampling density in the direction of the passage axis.Similar methods include bridge test [8] and retraction strategy [9,10].These methods can significantly improve the efficiency of the planning algorithm at the local narrow passage.Brock and Burns [11] propose an exploration strategy based on entropy guidance, which can guide the planning algorithm to sample the areas that maximize information gain.However, because of the high computational cost of this method, it is more suitable for the off-line path graph construction of multiqueue query algorithms like PRM.Arslan and Tsiotras [12] use the kernel function to learn the feasibility and heuristics of samples generated in the previous planning instances and increase the sampling density of the areas that may make current path cost lower.This method can improve the convergence rate of asymptotically optimum motion planning algorithms RRT # [13].Bialkowski et al. [14] propose to store empirically observed estimates of collisionfree space in a point proximity data structure and then use kd-tree algorithm to generate future valid samples.
In addition, by learning the past experience of motion planning, the prediction of the new samples' feasibility can be achieved.Burns et al. [15] present a model-based motion planning method.This method utilizes the local weighted regression to incrementally learn the approximate model of the configuration space, and the classification of new samples is implemented.Compared with the traditional collision query module, the method has a smaller computational complexity.Yang et al. [16] propose a neural networks framework to achieve robot automatic collision avoidance.By exploiting the joint space redundancy, the human operator would be able to only concentrate on motion of robots end effector without concern over possible collision.Pan et al. [17] report that the collision query results produced in previous motion planning tasks are stored in a data set.When the new sample is generated, the KNN algorithm is used to search the  nearest neighbors of the sample in the data set, and the probability of its feasibility is estimated according to the neighbors' collision information.Yang et al. [18] combine the flexibility of the Gauss process (GP) with the efficiency of the RRT algorithm and establish a motion model to predict the movement of obstacles, so that the robot can find a safe path in the dynamic environment constraints.Huh et al. [19] propose that the Gaussian mixture model can be used to fit the probability distribution of the high-dimensional space obstacle region, so as to quickly predict the feasibility of the new samples.However, the influence of the number of Gaussian components on the model prediction accuracy has not been analyzed and considered.Besides, some other learning methods like conditional variational autoencoder (CVAE) [20], neural learning [21], experience graphs (E-Graphs) [22], and dynamic movement primitives (DMPs) [23] are also used in robot motion planning problem.
The above methods improve the efficiency of motion planning to some extent, but there are the following problems: (1) Some lazy learning methods such as kd-tree and KNN require data sets with large sample size.In the motion planning problem of high-dimensional space, this will lead to great spatial complexity which is hard to realize.
(2) Models like PCA, CVAE, E-Graphs, and local weighted regression have poor flexibility.Even if the environment or the base of robot changes slightly, the model needs to be retrained, which greatly increases the computational overhead.
Therefore, in this article, we propose to use Gaussian mixture model (GMM) as a prior model of robot configuration space to improve the efficiency of robot motion planning.GMM can represent the unknown model by the linear combination of Gaussian probability density functions.It has the ability to fit the continuous probability density distribution in any dimensional space with small spatial complexity.On the other hand, the Greedy EM algorithm utilized in this paper can realize incremental training of GMM, so that the model can be updated according to the environment changes without retraining.
In general, the main contributions of this paper include the following: (1) The influence of number of components on the prediction accuracy is analyzed, and a method for adaptively training GMM of collision region in robot highdimensional configuration space based on the convergence of log-likelihood function is proposed.(2) The Greedy EM algorithm is used to update the GMM online with realtime 3D map to adapt to environmental changes.(3) The above method is applied to a variety of sampling-based motion planning algorithms, and the planning efficiency is significantly improved.
The paper is organized as follows: Section 2 introduces the main motivation of the research work; Section 3 introduces the incremental training process of GMM using Greedy EM clustering algorithm, and the integration with motion planning algorithms; Section 4 shows the simulations and the real world experiments' results; Section 5 summarizes the results and provides directions for future work.

Motivations and Problem Statement
The Gaussian mixture model [24] is a mixture probabilistic model that can fit the probability density distribution of arbitrary dimensions and arbitrary shapes by weighting the probability density functions of multiple Gaussian distributions.

𝑃 (𝑥
where   is the component weights and satisfied  1 + 2 +⋅ ⋅ ⋅+   = 1,   ≥ 0. ( |   ) is the probability density function of the th Gaussian distribution, and   is the parameter vector of the distribution:

𝜙 (𝑥 | 𝜃
where  is the dimension.  and   are the mean and covariance matrix of the Gaussian component, respectively.By using the sample set { 1 ,  2 , . . .,   } with sample size , the parameters of the Gaussian mixture model can be estimated by the basic EM algorithm.The basic EM algorithm process is as follows: Set the number of components  and the parameters' initial value  (0) properly, and start iterating.
Step E. According to the current model parameters, calculate the expected probability ( |   ) of each Gaussian component to each observation data   .

𝑃 (𝑘 | 𝑥
Step M. The weight of each Gaussian component   , the mean   , and the covariance matrix   are adjusted by maximum likelihood estimate method.
The E step and the M step are repeated to maximize the log-likelihood function L  and improve the fitting ability of GMM to the sample set.
For simplicity, first, the above basic EM clustering algorithm is used to train the GMMs with different number of components to fit the obstacle area in two-dimensional map (Figure 1(a)).This process will naturally extend later to  multiple dimensions.The trained GMM is used to calculate the collision probability of the new samples, so as to guide the RRT * [1] (a RRT-like motion planning algorithm with asymptotically optimal) search for the feasible path from the starting point to the end point.
Figures 1(b)-1(f) show the influence of number of Gaussian components on the model fitting effect and the motion planning result.When  = 7, the fitting accuracy of the GMM is low, resulting in a large number of feasible regions in the two-dimensional map being blocked, and the planned path repeatedly passes through the obstacle region.Obviously, this kind of GMM cannot meet the needs of motion planning problems.With the increasing number of model components, the fitting accuracy of the GMM to the obstacle region is continuously improved.When  = 15, the contour of the high collision probability area is basically consistent with the obstacles in the map, which indicates that the GMM has the ability to perform collision query on new samples.If the number of Gaussian components continues to increase, the GMM's fitting accuracy is not significantly improved.
Subsequently, the generalization error of GMM under different number of components is further calculated and analyzed.As shown in Figure 2(a), with the increasing number of Gaussian components, the probability of generalization error on new samples is reduced.After  = 15, if the number of components continues to increase, the performance of the GMM is not significantly improved, which is basically in accordance with the result shown in Figure 1.
On the other hand, the time to calculate collision probability of new samples under different number of components is calculated.As shown in Figure 2(b), the average calculation time is linearly positively correlated with the number of Gaussian components.This indicates that the increasing number of components will lead to greater computational overhead when performing probabilistic calculations on new samples and limit the efficiency of motion planning algorithms.Therefore, selecting the appropriate number of components  according to the actual environment can maximize the performance of GMM in the sampling-based motion planning algorithm.
Therefore, using the basic EM algorithm to train the GMM of the collision region has the following problems: (1) The environment in which the robots are located is very different and requires different number of components, so it is impossible to artificially specify  to suit the needs of all environments.
(2) When the robot's environment changes, the GMM trained with the basic EM algorithm will fail and need to be retrained, which indicates that it does not have the ability to update online according to the environment.

Algorithms
To solve the above problems, the Greedy EM algorithm is used to perform incremental training on the GMM, and the  value is adaptively selected according to the convergence of the log-likelihood function.The model has the online learning ability to adapt to the environment changes.The GMM is used to perform collision query routine in a variety of sampling-based motion planning frameworks.The algorithm framework is shown in Fig- . .e Training Method of the Probabilistic Model.As shown in Algorithm 1,   is the sample set of collision region in robot configuration space;   is the GMM to describe the distribution of the collision region;  is the number of components of the current GMM; a, m, S is an array of weights, mean values, and covariance matrices for all components of the current GMM; and L  is the value of the overall log-likelihood function under the current number of Gaussian components .
The incremental training process of GMM is divided into two parts: global EM iteration and binary EM iteration.The global EM iteration process is exactly the same as the basic EM algorithm.The maximum likelihood estimation is used to adjust the weights, mean values, and covariance matrices of all Gaussian components in the current GMM to improve fitting ability to the sample set   , as shown in ( 3), (4), and (5).
The new component's weight   is initialized by the following equation.
The mean   is set to a randomly selected sample, and the covariance matrix   is initialized to  2 .According to [24],  depends on the dimension of the planning problem , sample size , and a fixed number  which is set to half of the maximum singular value of samples' covariance matrix (8).
To make the new GMM better fit the sample set, the new component's weight and the parameters vector  need to be adjusted.The optimization goal is to maximize the binary loglikelihood function.
According to the maximum likelihood estimation, binary EM iterations are performed on the current GMM and the new Gaussian component.The binary EM iteration process is as follows.
Set the initial number of components  to 1, and initialize the parameters of the component according to the above method.
Step E. Initialize the new Gaussian component' parameters  and calculate its expected probability to each observation sample   .Step M. Maximize the log-likelihood function (9) by adjusting the new component's weight   , mean   , and covariance matrix   .

𝑃 (𝑘 + 1 | 𝑥
Then, as shown in line 20 to line 28 of Algorithm 1, the weights of all components in the current GMM are diluted, and the new component trained by the binary EM algorithm is added to the GMM.Then the new GMM's parameters are adjusted through the global EM iterations until convergence. As shown in line 3 to line 10 of Algorithm 1, the global EM iterations and binary EM iterations are alternated to implement the incremental training of GMM.When the overall log-likelihood function value L  becomes stable (which means L  /L −1 − 1 < 1 −3 , as shown in line 3 of Algorithm 1), the addition of a new Gaussian component is stopped and the training is completed.In this way, according to the convergence of the overall log-likelihood function, the number of Gaussian components  can be adaptively adjusted without being specified in advance, so as to reasonably weigh the accuracy and computational efficiency of the GMM.Beside, compared with the basic EM algorithm, the GMM trained by Greedy EM algorithm has better fitting accuracy [25].
. .Online Updating of Gaussian Mixture Model.The actual working environment of the robot is dynamic and uncertain.The Gaussian mixture model trained off-line by the above method may cause errors due to the changes of environment or robot base, so the GMM is required to have the ability to update online.Through the Greedy EM algorithm, a new Gaussian component is added to fit the new collision region due to environmental changes, and some useless Gaussian components are eliminated.
As shown in line 2 to line 4 of Algorithm 2, the octree map O  is updated according to the point cloud data P  acquired by the RGB-D camera.The sample set   of the high-dimensional collision region is updated according to the new octree map and robot kinematics model K  .Then, according to (1) and ( 2), the probability that each sample belongs to the current GMM is calculated.If the probability is small, it indicates that the sample is a new collision sample point due to environmental changes.In this way, new samples are screened, and a sample is randomly selected as the initial mean value of the newly added Gaussian component.Finally, the Greedy EM algorithm is used to train the GMM, and the new component is added continuously until the loglikelihood function converges.
As shown in line 13 of Algorithm 2,   is the GMM to fit the collision region in configuration space. in order to prevent the infinite increase of the number of components and inefficiency of calculation, some useless Gaussian components in the new GMM need to be eliminated, such as the weights being very small, or the components with covariance matrix determinant close to 0, as shown in Figure 4. Other components obtain the weight of the removed components according to their respective weights.The sum of weights of all components is always 1.This process enables the GMM to adapt quickly and accurately to environment changes.
. .GMM-Based Motion Planner.After the GMM of the high-dimensional collision region is established by the Greedy EM algorithm, the probability of the new samples belonging to the GMM can be calculated by ( 1) and ( 2), thereby determining whether the sample is feasible.In this way, the forward kinematics calculation and precise collision query are avoided, and the total collision query time is greatly shortened.
As shown in line 7 to line 13 of Algorithm 3, if the probability of the new sample P(  ) belonging to the current GMM is less than the upper probabilistic boundary of the collision-free region  l− , the sample is considered feasible.If the probability of the new sample P(  ) belonging to the GMM is greater than the lower probabilistic boundary   of the collision region, the sample is considered     unfeasible.If the probability is between the   and  − , the GMM's prediction result is ambiguous.Therefore, the precise collision query on these samples will be implemented by the kinematics and Flexible Collision Library (FCL) [26].
and  − can be determined by cross-validation.Cross-validation is a method commonly used in machine learning to evaluate the prediction effect of a model with a small sample set.1000 samples in the collision region and collision-free region are collected, respectively, to form two cross-validation sets.Through multiple tests, the probability of GMM' generalization error for two cross-validation sets under different decision boundaries  can be obtained.For robot motion planning, false positive means that the collision-free samples are classified as collision samples, which may cause the motion planner to fail to find a feasible solution.False negative means that the collision samples are classified as collision-free samples and may cause the final path to pass through collision region.
For example, in the environment of Figure 5, the GMM is trained by Greedy EM algorithm to fit collision region in 6-DOF robot's configuration space.Then, the above-mentioned cross-validation is performed on the six-dimensional GMM to obtain the generalization error curves as shown in Figure 6.
In order to guarantee the safety and success rate of motion planning, the tolerable generalization error probability is set to 3.0%, which is the empirical value obtained by many simulation experiments.The intersections with two generalization error curves are   and  − .

Simulations and Experiments
This section provides some simulations and real world experiments' results to support the method proposed in this paper.All experiments are conducted on an Intel Core i5-4590 3.3GHz personal computer.ROS and Gazebo are used to build a simulation platform.FCL is used to acquire the sample set of the collision region in robot configuration space and perform accurate collision query when the GMM-based collision query results are ambiguous.OMPL [27] (Open Motion Planning Library) provides a variety of samplingbased motion planning algorithms for comparison experiments.
. .Learning and Online Updating.In order to prove the effectiveness of the algorithm, the 6-DOF UR10 robot in Figure 5 is used for the simulation test.First, according to Algorithm 1, a six-dimensional Gaussian mixture model under the environment is trained.By alternately using the global EM iterations and binary EM iterations, the number of Gaussian components  gradually increases.When  = 25, the log-likelihood function L  converges, meaning that the GMM has completed training and stopped adding new Gaussian components, as shown in Figure 7.
After the model training is completed, the incremental GMM's online updating capability is tested by changing the environment.As shown in Figure 8, the table in Figure 5 is rotated and translated to change the environment.Besides, We have also carried out simulations to change the environment through moving the robot.Algorithm 2 is used to update the GMM online.The new Gaussian components are added and trained according to the binary EM iterations from (7) to (11), and the basic EM algorithm from (3) to ( 5) is used to train the GMM with new components added.This will allow the GMM to fit the new collision sample set.After the training is completed, the Gaussian components that cannot help to fit the new collision sample set are eliminated to reduce the computational overhead of the algorithm.As shown in Figures 8(b), 8(d), 8(f), and 8(h), the projections of the GMM on the first joint and the third joint subspace change according to the changes of environment and robot base.Therefore, GMM has the ability to respond to the environmental changes online.
-DOF Robot Motion Planning.The 6-DOF UR10 robot is used to perform motion planning in the simulation environment of Figure 5.After the completion of GMM's incremental training, the false positive and false negative generalization error curves (Figure 6) are obtained using the cross-validation method in Section 3.3, and the decision boundaries   and  − are determined accordingly.Finally, the GMM-based collision query strategy described above is applied to four sampling-based motion planning algorithms: RRT [3], RRT-Connect [28], BiEST [29], and KPIECE [30].
The initial state and the goal region of the planning problem are specified, and the above planners are used to solve the planning problem repeatedly to obtain the boxplots of the planning time (Figure 9).Besides, we provide the average planning time and the planning success rate, as shown in Table 1.The allowed planning time is set to 50s.It can be seen that the collision query strategy based on GMM is generally applicable to various sampling-based motion planning algorithms.Compared with the basic collision query methods based on forward kinematics and space geometry, the computational overhead is greatly reduced, and the overall motion planning efficiency is improved by 2-3 times.In order to further test the effectiveness of the proposed method in various environments, four common scenes (Figure 10) are built in a simulation environment.The RGB-D camera is used to acquire the point cloud data of each scene, and a three-dimensional octree map is created based on the OctoMap library [31].The Greedy EM algorithm is used to train the GMMs of the robot configuration space in the four scenes, respectively, and the GMM-based and the basic collision query methods are applied to the above four kinds of motion planners, respectively.In the four planning scenes, the task of picking and placing a cup is accomplished by motion planning of a robot, repeatedly testing this planning task and obtaining the average planning time and successful rate with the GMM-based and basic collision query methods, as shown in Table 2.The allowed planning time is set to 50s.
From Table 2, it can be seen that the actual performance of the proposed method varies due to different complexity of the planning tasks and number of Gaussian components, the latter of which is caused by different planning scenes.However, the average planning time of the GMM-based motion planning method is reduced by 2-4 times compared with the basic collision query method.
. .-DOF Robot Motion Planning.In order to verify the application effect of this algorithm in the actual scene, two UR10 robots and RealSense camera are used to build a 12-DOF dual-arm robot experimental platform, as shown in Figures 11(a) and 11(b).In order to prevent the robot obstruct camera's view, the "Eye in Hand" arrangement is adopted.
Due to the high planning dimensions of dual-arm robots, the successful rate under allowed planning time (50s) of most motion planning algorithms is too low, and it is difficult to obtain large amounts of data through multiple experiments.Therefore, only the RRT-Connect algorithm is used to perform multiple experiments, and the planning times based on the GMM method and the basic method are, respectively, obtained (Figure 11(d)).The success rate of both methods is 100%.Experiments show that the method proposed in this paper is also applicable to higher-dimensional motion planning problems.

Conclusion
In order to improve the planning efficiency of the samplingbased motion planning algorithm, this paper studies the influence of the number of components on the fitting accuracy and proposes a self-adapting model training method, which achieves a rapid collision query in the robot highdimensional configuration space.Using the machine learning method, the robot learns the collision query results generated during the previous motion planning instances, and the Gaussian mixture model is incrementally established to quickly estimate the new high-dimensional samples' collision probability.In order to eliminate the model fitting errors caused by environmental changes, the Greedy EM algorithm is used to adaptively select number of components of the GMM-based on the convergence of the log-likelihood function and achieved online response to the changes of external environment.This method is applied to several kinds of sampling-based motion planning algorithms.The simulations and real world experiments are performed on the 6-DOF and 12-DOF robots in various scenes.Compared with the basic methods, the planning time is greatly shortened.The effectiveness of the proposed algorithm is proved.
In the future work, we will continue to focus on improving the practical performance of this method in higher-dimensional robot motion planning problems, such as Humanoid robots and quadruped robots.The GMM will be used for guided sampling to further narrow the scope of the search algorithm.

Figure 1 :
Figure 1: The Gaussian mixture model with different number of components is used to fit the probability distribution of the obstacle regions in the two-dimensional map (a), and the GMM-based collision query method is used to guide the RRT * algorithm to plan the path from the starting point to the end point.The white lines represent the feasible edges that RRT * algorithm explores, the green points are the starting point and the end point of the planning problem, and the red line represents the final path.

Figure 2 :
Figure 2: Comparison of the GMMs' performance under different numbers of components.

Figure 3 :
Figure 3: Algorithm framework: the main contributions are highlighted in red.

Figure 4 :
Figure 4: The schematic diagram of GMM's online updating process.

Figure 10 :
Figure10: A 6-DOF UR10 robot is used to pick and place a cup in various home environments, and the universality of the proposed method is verified.The green grid represents occupied space stored in the octree data structure, and the blue line represents the trajectory of the end effector when the robot performs the pick and place task.

Figure 11 :
Figure 11: 12-DOF dual-arm robot experiment.(a),(b) The initial and goal state of the planning problem, respectively.(c) The environmental model represented by octree, and the lines represent the trajectory of end effectors.(d) The boxplots of planning time based on GMM and basic collision query method.

Table 1 :
Planning performance of the motion planning algorithms.

Table 2 :
Planning performance of the motion planning algorithms in four scenes.