A Novel Mittag-Leffler Kernel Based Hybrid Fault Diagnosis Method for Wheeled Robot Driving System

Wheeled robots have been successfully applied in many areas, such as industrial handling vehicles and wheeled service robots. To improve the safety and reliability of wheeled robots, this paper presents a novel hybrid fault diagnosis framework based on the Mittag-Leffler kernel (ML-kernel) support vector machine (SVM) and Dempster-Shafer (D-S) fusion. Using sensor data sampled under different running conditions, the proposed approach first establishes multiple principal component analysis (PCA) models for fault feature extraction. The fault feature vectors are then used to train probabilistic SVM (PSVM) classifiers that produce a preliminary fault diagnosis. To improve the accuracy of the preliminary results, a novel ML-kernel based PSVM classifier is proposed, and the positive definiteness of the ML-kernel is proved as well. The basic probability assignments (BPAs) are defined based on the preliminary fault diagnosis results and their confidence values. Finally, the fault diagnosis result is obtained by fusing the BPAs. Experimental results show that the proposed framework is not only capable of detecting and identifying the faults in the robot driving system but is also more stable and more accurate than the traditional methods.


Introduction
In recent years, wheeled robots have been widely developed and applied [1][2][3]. In the home service area in particular, various kinds of wheeled service robots, such as the elderly companion robot [4] and the sweeping robot [5], have become members of the family. However, robot users are usually not experts in robot technology, which means that faults occurring in the wheeled robot system may cause serious damage to their lives and property. The increasing demand for safety and reliability, together with the need for low cost, has become the bottleneck of wheeled robot applications with current technology. Therefore, it is meaningful to focus on novel fault diagnosis methods, particularly for environments in which humans and robots coexist.
Generally speaking, the existing fault diagnosis methods can be classified as model based and data driven [6,7]. In the earlier days, research on model based fault diagnosis methods drew much attention and constituted the mainstream of this field [8,9]. In [10], based on the mathematical model of the robotic manipulator, Caccavale et al. presented a discrete-time framework for the diagnosis of sensors and actuators of robotic manipulators. Using a particle filter, Yu et al. [11] proposed a fault-proneness prediction method for the robot dead reckoning system. Besides the abovementioned methods, the adaptive observer and some other model based methods have also been designed for fault diagnosis of robot platforms or robot manipulators [12,13]. Those model based fault diagnosis methods are effective and suitable for the diagnosis of robot manipulators or robot arms, because a robot arm usually works in a structured environment and it is relatively easy to obtain an accurate mathematical model. For wheeled robots the situation is different. First, their working environments are usually dynamic and unstructured. Second, wheeled robots are usually equipped with various kinds of equipment that are more complex, in both hardware and software, than those of manipulators. Thus it is hard to obtain an accurate mathematical model of a wheeled robot working in an unstructured environment, which restricts the use of model based methods.

Computational Intelligence and Neuroscience
Moreover, wheeled robots are equipped with multiple sensors, which implies that large volumes of data containing robot running status information are available. These large data volumes make system modeling difficult, but they provide the information required by data driven fault diagnosis methods.
Principal component analysis (PCA) is a typical representative of the data driven fault diagnosis methods. PCA is more suitable for fault detection than for diagnosis, because it does not use the input-output relationships [14]. Therefore, in order to improve the diagnosis ability, PCA is often combined with classifiers such as the neural network (NN) and the support vector machine (SVM). This hybrid method has been applied in the fault diagnosis of rotating machinery [15], power transmission systems [16], and some other areas [17,18]. PCA is useful for extracting and interpreting process information from massive data sets, and pattern recognition techniques can then be used to diagnose the specific running status of the robot.
Nevertheless, two main problems exist in the above hybrid diagnosis methods. On the one hand, most studies adopt an existing classical kernel (e.g., the Gaussian kernel or the polynomial kernel) as the kernel function of SVM, while new kernel functions with better classification performance still need to be proposed, proved, and applied to robot fault diagnosis. On the other hand, the diagnosed objects are usually complex and subject to varying degrees of uncertainty. A single PCA model cannot achieve full awareness of the diagnosed object, so information fusion at the data level or the decision level is needed to reduce the uncertainties.
Mittag-Leffler functions [19,20] play a fundamental role in fractional calculus, and they exhibit behavior intermediate among the exponential function, the power function, and the polynomial function. Fractional calculus has been successfully applied in many areas, such as the fractional Fourier transform in signal processing [21] and fractional order PI controllers [22]. Inspired by fractional calculus, a novel fractional Gaussian kernel named the ML-kernel is proposed in this paper, which is a generalization of the traditional Gaussian kernel. The proposed ML-kernel is proved to be positive definite and its diagnosis performance is discussed. Besides, a hybrid fault diagnosis framework for the robot driving system is presented based on Dempster-Shafer (D-S) fusion and the ML-kernel support vector machine (SVM). Multiple PCA models are established for fault feature extraction, and the fault feature vectors are used as the inputs of the ML-kernel SVM classifiers. The ML-kernel SVM classifiers output the preliminary fault diagnosis results, which are fused by D-S fusion, and the fusion result is taken as the final diagnosis result. Two sets of comparative experiments are carried out to validate the proposed method.
The remainder of this paper is organized as follows. Section 2 briefly introduces the SVM method and the positive definiteness of the presented ML-kernel is also proved in this section. In Section 3, the proposed fault diagnosis framework is described in detail. Section 4 illustrates the architecture of the experimental wheeled robot driving system and the application studies for various fault conditions. Section 5 is devoted to conclusions.

Conventional SVM Algorithm.
In the past few years, SVM has been one of the most highly studied topics in machine learning, and it has been successfully applied in practice, especially to classification problems (e.g., fault diagnosis) [23,24]. Based on the statistical theory of VC dimension and the structural risk minimization inductive principle, SVM seeks the best compromise between model complexity and learning ability in order to obtain the best generalization ability. The basic SVM [25] deals with linearly separable two-class cases, and it can cope with nonlinear problems by introducing kernel functions and a slack penalty. Consider a training set $\{(x_i, y_i)\}_{i=1}^{n}$, where $x_i \in \mathbb{R}^{d}$ is the $i$th training input vector, $n$ is the number of training data for SVM, $d$ is the dimension of the input data, and $y_i \in \{-1, 1\}$ is the classification tag for training. The optimal hyperplane separating the data can be obtained as a solution to the following optimization problem:
$$\min_{w, b, \xi} \ \frac{1}{2}\|w\|^{2} + C \sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad y_i (w^{T} x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, \ldots, n, \qquad (1)$$
where $C$ is the slack penalty, $w$ is the adjustable weight vector, $b$ is the offset of the hyperplane, and $\xi_i$ is the distance between the margin and the sample $x_i$ lying on the wrong side. The equivalent Lagrangian dual problem can be described as follows:
$$\max_{a} \ \sum_{i=1}^{n} a_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j y_i y_j \, x_i^{T} x_j \quad \text{s.t.} \quad \sum_{i=1}^{n} a_i y_i = 0, \quad 0 \le a_i \le C, \qquad (2)$$
where $a_i$ is the Lagrangian coefficient, from which we can obtain $w = \sum_{i=1}^{n} a_i y_i x_i$ and $b = y_i - w^{T} x_i$ (evaluated at any support vector with $0 < a_i < C$) to solve (1). The kernel function maps the input vector into a feature space and returns a dot product in that feature space. The linear discriminant function with kernel $K(x_i, x)$ is given by the following:
$$f(x) = \operatorname{sgn}\left( \sum_{i=1}^{n} a_i y_i K(x_i, x) + b \right), \qquad (3)$$
where $\operatorname{sgn}(\cdot)$ is the signum function.
The fault diagnosis of a robot driving system is a multiclass classification problem, while the conventional SVM was designed for binary classification, so it is not suitable for fault diagnosis in its original form. Several methods for multiclass SVM have been proposed [26]: one against one, one against others, directed acyclic graph, and so forth. This study employs the "one against one" multiclass SVM. In order to construct the BPAs, we need the probabilistic outputs of the SVM classifiers, and the "pairwise coupling" method [27] is used to obtain them.

Kernel Function.
The nonlinear pattern recognition problem in fault diagnosis can be transformed into a linear problem in some very high-dimensional feature space. The kernel function $K(x_i, x_j) = \varphi(x_i) \cdot \varphi(x_j)$ is able to handle feature spaces of any dimension without the explicit calculation of $\varphi(x_i)$ and $\varphi(x_j)$. It has been proven that any function that satisfies the Mercer theorem can be used as a kernel function [28]. Currently, there are three typical kinds of kernel functions:
(1) Polynomial kernel function
(2) Radial basis kernel function (RBF)
(3) Sigmoid kernel function
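For reference, the three classical kernels listed above are commonly written as in the following minimal NumPy sketch. The parameterizations are the usual textbook ones and are an assumption here, as are the function names and the parameters `gamma`, `coef0`, `degree`, and `sigma`; they are not necessarily the settings used in the experiments.

```python
import numpy as np


def polynomial_kernel(x, y, gamma=1.0, coef0=1.0, degree=3):
    """Polynomial kernel: (gamma * <x, y> + coef0) ** degree (textbook form)."""
    return (gamma * np.dot(x, y) + coef0) ** degree


def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel: exp(-||x - y||^2 / (2 * sigma^2)) (textbook form)."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return np.exp(-np.dot(diff, diff) / (2.0 * sigma ** 2))


def sigmoid_kernel(x, y, gamma=1.0, coef0=0.0):
    """Sigmoid kernel: tanh(gamma * <x, y> + coef0) (textbook form)."""
    return np.tanh(gamma * np.dot(x, y) + coef0)


# Two illustrative (hypothetical) input vectors.
x = np.array([1.0, 0.0])
y = np.array([0.0, 1.0])
print(polynomial_kernel(x, y), rbf_kernel(x, y), sigmoid_kernel(x, y))
```

Each function returns a scalar similarity for a pair of input vectors; a Gram matrix is obtained by evaluating the chosen kernel on all pairs of training vectors.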

Proof of the Positive Definiteness of the ML-Kernel.
As the core of SVM, the kernel function and the parameters of the model determine the performance of the SVM algorithm applied to the fault diagnosis system. In this paper, we employ the Mittag-Leffler function as a novel kernel function named the ML-kernel. The Mittag-Leffler function [29] is defined as follows:
$$E_{\alpha}(z) = \sum_{k=0}^{\infty} \frac{z^{k}}{\Gamma(\alpha k + 1)}, \qquad (7)$$
where $\Gamma$ is the Gamma function and $\alpha$ is an arbitrary positive constant. For $\alpha = 1$, (7) becomes $E_{1}(z) = e^{z}$. The presented ML-kernel function can be defined as
$$K(x_i, x_j) = E_{\alpha}\left( -\frac{\|x_i - x_j\|^{2}}{2\sigma^{2}} \right),$$
and for $\alpha = 1$ the ML-kernel is identical to the Gaussian RBF kernel.
Given a kernel, it is in general straightforward to verify its continuity and symmetry, while the positive definiteness is more important and essential for a kernel. Thus, the proof of the positive definiteness of the proposed ML-kernel is given as below.
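As an illustrative numerical check, which is not a substitute for the proof, the sketch below evaluates the Mittag-Leffler function by its truncated power series and inspects the smallest eigenvalue of a Gram matrix. It assumes the kernel takes the form $K(x_i, x_j) = E_{\alpha}(-\|x_i - x_j\|^{2}/(2\sigma^{2}))$; the function names, the truncation length `n_terms`, and the random test points are hypothetical choices. The plain series is adequate only for moderate $|z|$, since large negative arguments suffer from cancellation.

```python
import math

import numpy as np


def mittag_leffler(z, alpha, n_terms=100):
    """Truncated series E_alpha(z) = sum_k z^k / Gamma(alpha * k + 1).

    Uses lgamma to avoid overflow; intended for moderate |z| only.
    """
    return sum(z ** k * math.exp(-math.lgamma(alpha * k + 1.0))
               for k in range(n_terms))


def ml_kernel_matrix(X, alpha, sigma):
    """Gram matrix with entries E_alpha(-||x_i - x_j||^2 / (2 sigma^2)) (assumed form)."""
    sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    z = -sq_dist / (2.0 * sigma ** 2)
    return np.vectorize(lambda t: mittag_leffler(t, alpha))(z)


rng = np.random.default_rng(0)
X = rng.uniform(size=(15, 3))  # hypothetical normalized feature vectors
K = ml_kernel_matrix(X, alpha=0.95, sigma=1.0)
min_eig = np.linalg.eigvalsh(K).min()
print("smallest Gram eigenvalue:", min_eig)
```

A smallest eigenvalue that is nonnegative up to rounding error is consistent with positive definiteness for this sample; setting `alpha=1.0` reproduces the Gaussian RBF Gram matrix.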

Fault Diagnosis Method Based on ML-Kernel SVM and D-S Fusion
As shown in Figure 1, the proposed approach consists of two main processes, namely, the training process and the fault diagnosis process. Before the proposed approach is applied, the initial samples should be obtained from laboratory experiments. In the training process, multiple PCA models are set up based on the data sampled in the normal and faulty states. Then, those models are used for fault feature extraction and the ML-kernel SVM classifiers are trained. In the diagnosis process, newly sampled data are first normalized. Second, the PCA models established in the training process are applied for fault feature extraction. The fault feature vectors then serve as the inputs of the corresponding trained ML-kernel SVM classifiers, and the probabilistic outputs of the classifiers are taken as the preliminary fault diagnosis results. The BPAs are constructed based on the preliminary fault diagnosis results and the confidence values calculated from the confusion matrix. To reduce the uncertainties of the preliminary diagnosis results, the D-S fusion algorithm is introduced for decision fusion, and the final diagnosis results are given based on the fusion results. The proposed approach is elaborated in detail as follows.
Suppose that the robot driving system is equipped with $m$ sensors and that we get $n$ groups of samples in each running state, so the original sampled data sets can be written as $X_{\text{all}} = [X_0^{T}, \ldots, X_i^{T}, \ldots, X_h^{T}]^{T}$, $X_i \in \mathbb{R}^{n \times m}$ ($i = 0, 1, \ldots, h$). $X_0$ represents the samples under the normal condition $F_0$ and $X_i$ ($i = 1, \ldots, h$) represent the samples in the $i$th kind of faulty state $F_i$. To establish the PCA models, several steps are introduced.
Step 1 (data normalization). To reduce the influence of the different dimensions of the sensors, the training data should be normalized before establishing the PCA models. For a data set of $n$ observations and $m$ process variables $X_i \in \mathbb{R}^{n \times m}$ ($i = 0, 1, \ldots, h$), we can get the normalized data matrix $\bar{X}_i$ by
$$\bar{x}_{jk} = \frac{x_{jk} - \mu_k}{s_k}, \quad j = 1, \ldots, n, \quad k = 1, \ldots, m,$$
where $\mu_k$ and $s_k$ are the mean and standard deviation, respectively, of the $k$th variable, $x_{jk}$ is an element of the matrix $X_i$, and $\bar{x}_{jk}$ is an element of the normalized data matrix $\bar{X}_i$.
Step 2 (singular value decomposition). Consider
$$S = \frac{1}{n - 1} \bar{X}_i^{T} \bar{X}_i = V \Lambda V^{T},$$
where $S$ represents the covariance matrix of $\bar{X}_i$, $\Lambda$ is a diagonal matrix containing the eigenvalues of $S$ in decreasing order, and $V$ is orthogonal and contains the eigenvectors of $S$.
Step 3 (determine the loading matrix according to the number of PCs). Given $\theta_j = \lambda_j / \sum_{l=1}^{m} \lambda_l$, where $\lambda_j$ is the $j$th eigenvalue of $S$, the number of principal components (PCs) $k$ is determined to satisfy
$$\sum_{j=1}^{k} \theta_j \ge \delta, \qquad (17)$$
where $\delta$ is a constant that is usually required to be bigger than 0.85 [31].
The loading matrix $\hat{P} = [p_1, p_2, \ldots, p_k]$ consists of the first $k$ eigenvectors of the covariance matrix, and $\bar{X}_i$ can be decomposed as
$$\bar{X}_i = T \hat{P}^{T} + E, \qquad (18)$$
where $T = \bar{X}_i \hat{P}$ is named the score matrix and $E$ is the residual matrix.
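The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions: the function name `fit_pca`, the synthetic 100-by-9 data, and the use of a symmetric eigendecomposition (`numpy.linalg.eigh`) in place of an explicit SVD are choices made here for brevity, not details confirmed by the paper.

```python
import numpy as np


def fit_pca(X, cpv_threshold=0.85):
    """Fit one PCA model: normalize, decompose the covariance, choose k by CPV."""
    mu = X.mean(axis=0)
    s = X.std(axis=0, ddof=1)
    X_bar = (X - mu) / s                          # Step 1: z-score normalization
    S = X_bar.T @ X_bar / (X.shape[0] - 1)        # covariance of normalized data
    eigvals, eigvecs = np.linalg.eigh(S)          # Step 2: eigh returns ascending order
    order = np.argsort(eigvals)[::-1]             # re-sort in decreasing order
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cpv = np.cumsum(eigvals) / np.sum(eigvals)    # cumulative variance proportion
    # Step 3: smallest k whose cumulative proportion reaches the threshold.
    k = min(int(np.searchsorted(cpv, cpv_threshold)) + 1, eigvals.size)
    P_hat = eigvecs[:, :k]                        # loading matrix
    return mu, s, P_hat, cpv


# Synthetic stand-in for 100 samples of 9 sensor channels (hypothetical data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 3))
X = latent @ rng.normal(size=(3, 9)) + 0.1 * rng.normal(size=(100, 9))

mu, s, P_hat, cpv = fit_pca(X)
X_bar = (X - mu) / s
T = X_bar @ P_hat            # score matrix
E = X_bar - T @ P_hat.T      # residual matrix
print(P_hat.shape, T.shape)
```

Because the covariance matrix is symmetric, its eigendecomposition yields the same loading vectors (up to sign) as the SVD of the normalized data matrix.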

Feature Extraction and SVM Training.
During the process of PCA, the orthogonal loading matrix $\hat{P}$ can be considered as the main features of the original training data set. So, we can do data dimensionality reduction and feature extraction at the same time using the following equation:
$$T_i = \bar{X}_{\text{all}} \hat{P}_i, \qquad (19)$$
where $\bar{X}_{\text{all}} \in \mathbb{R}^{(h+1)n \times m}$ is the normalized training data set and $\hat{P}_i \in \mathbb{R}^{m \times k_i}$ is the loading matrix of the $i$th PCA model. $k_i$ is the number of principal components and $T_i$ is used as the final training set of SVM$_i$ ($i = 0, 1, \ldots, h$).
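The projection step can be illustrated as follows: every PCA model projects the same normalized data set onto its own loading matrix, giving one feature matrix per model. This is a sketch only; the function name `extract_features` is hypothetical, the data are random stand-ins, and the loading matrices are random orthonormal matrices rather than fitted ones. The dimensions (800 samples, 9 variables, and the listed numbers of PCs) follow the values reported for the experiments.

```python
import numpy as np


def extract_features(X_all_bar, loadings):
    """Return one feature matrix T_i = X_all_bar @ P_hat_i per PCA model."""
    return [X_all_bar @ P_hat for P_hat in loadings]


rng = np.random.default_rng(1)
X_all_bar = rng.normal(size=(800, 9))      # stand-in for the normalized training data
k = [5, 5, 5, 4, 5, 5, 5, 5]               # numbers of PCs reported in the experiments
# Hypothetical orthonormal loading matrices (reduced QR of random matrices).
loadings = [np.linalg.qr(rng.normal(size=(9, k_i)))[0] for k_i in k]

features = extract_features(X_all_bar, loadings)
print([T_i.shape for T_i in features])
```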
The novel ML-kernel presented in Section 2.3 is applied as the kernel function of SVM, and particle swarm optimization (PSO) [32] is adopted to tune the parameters $C$, $\sigma$, and $\alpha$. Here, $C$ is the slack penalty and $\sigma$ and $\alpha$ are the two parameters of the ML-kernel.
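To make the tuning step concrete, the sketch below implements a basic global-best PSO over three parameters. It is a generic textbook variant, not necessarily the one used in [32]: the inertia and acceleration constants, the search bounds, and the function names are assumptions, and the quadratic `toy_cv_error` is a hypothetical stand-in for the cross-validation error of an SVM.

```python
import numpy as np


def pso_minimize(objective, lower, upper, n_particles=30, n_iter=100, seed=0):
    """Basic global-best PSO; returns the best position and its objective value."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    pos = rng.uniform(lower, upper, size=(n_particles, dim))
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_val = np.array([objective(p) for p in pos])
    gbest = pbest[np.argmin(pbest_val)].copy()
    w, c1, c2 = 0.729, 1.494, 1.494   # commonly used constants (assumed)
    for _ in range(n_iter):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = np.clip(pos + vel, lower, upper)    # keep particles inside the bounds
        val = np.array([objective(p) for p in pos])
        improved = val < pbest_val
        pbest[improved] = pos[improved]
        pbest_val[improved] = val[improved]
        gbest = pbest[np.argmin(pbest_val)].copy()
    return gbest, float(pbest_val.min())


# Toy objective with an arbitrary optimum; a real run would return the
# cross-validation error of an SVM trained with (C, sigma, alpha).
target = np.array([5.0, 2.0, 0.8])


def toy_cv_error(params):
    return float(np.sum((params - target) ** 2))


best, best_val = pso_minimize(toy_cv_error,
                              lower=[0.1, 0.1, 0.1],
                              upper=[100.0, 10.0, 1.0],
                              n_iter=200)
print(best, best_val)
```

In practice each objective evaluation is one full cross-validation run, so the swarm size and iteration count trade accuracy of the search against training time.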

Decision Fusion.
To reduce the uncertainties and imprecisions of the preliminary fault diagnosis results, D-S fusion is introduced in the proposed fault diagnosis framework. The determination of BPAs is the first and most important step in evidence theory. In our approach, we construct BPAs based on the probabilistic outputs of the PSVM classifiers and their confidence values.
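Step 1 (calculation of the confidence values). The average classification accuracy of SVM$_i$ can be calculated by
$$\omega_i = \sum_{j=1}^{h+1} \frac{n_j}{N} c^{i}_{jj}, \qquad (20)$$
where $c^{i}_{jj}$ are the diagonal elements of the confusion matrix of SVM$_i$, $n_j = n$ is the number of samples under the $j$th kind of fault condition, and $N = (h + 1)n$ is the total number of training samples in the training set $T_i$. So, we can get $\omega_i = \sum_{j=1}^{h+1} c^{i}_{jj} / (h + 1)$, which can be used as the global confidence of SVM$_i$.
The $j$th column vector of the confusion matrix, $c^{i}_{\cdot j}$ ($j = 1, 2, \ldots, h + 1$), indicates the local confidence for the $j$th kind of fault, and the local confidence can be calculated by
$$\omega_{ij} = \frac{c^{i}_{jj}}{\sum_{l=1}^{h+1} c^{i}_{lj}}. \qquad (21)$$
Then we can incorporate the local confidence into the probabilistic output of SVM$_i$, and after normalization we get
$$\tilde{p}_{ij} = \frac{\omega_{ij} \, p_{ij}}{\sum_{l=1}^{h+1} \omega_{il} \, p_{il}}, \qquad (22)$$
where $0 \le p_{ij} \le 1$ is the output of SVM$_i$.
Step 2 (construction of BPAs and D-S fusion). In our approach, the BPAs are defined as
$$m_i(F_{j-1}) = \omega_i \, \tilde{p}_{ij} \ (j = 1, \ldots, h + 1), \quad m_i(\Theta) = 1 - \omega_i, \quad m_i(\emptyset) = 0. \qquad (23)$$
It can be seen from (23) that the frame of discernment is $P(\Theta) = \{\emptyset, F_0, F_1, \ldots, F_h, \Theta\}$. Here, $\emptyset$ denotes the empty set, and $F_h$ represents the $h$th kind of running condition of the robot. It is obvious that $\sum_{A \in P(\Theta)} m_i(A) = 1$ and $m_i(\emptyset) = 0$. With the BPAs, we use a fast fusion algorithm based on matrix analysis [33] to accomplish the D-S fusion.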
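The construction and fusion of BPAs can be sketched as follows. The sketch assumes BPAs whose focal elements are the singleton conditions plus $\Theta$, with the confidence-weighted, renormalized classifier output scaled by the global confidence and the remaining mass assigned to $\Theta$. It applies plain Dempster's rule rather than the fast matrix-based algorithm of [33]; the function names and all numerical values are hypothetical.

```python
import numpy as np


def build_bpa(prob, local_conf, global_conf):
    """BPA over [F_0, ..., F_h, Theta] from one classifier's probabilistic output."""
    weighted = np.asarray(prob, dtype=float) * np.asarray(local_conf, dtype=float)
    weighted = weighted / weighted.sum()           # renormalize after local weighting
    # Scale by the global confidence; the remaining mass goes to Theta.
    return np.append(global_conf * weighted, 1.0 - global_conf)


def dempster_combine(m1, m2):
    """Dempster's rule when the focal elements are the singletons and Theta."""
    s1, t1 = m1[:-1], m1[-1]
    s2, t2 = m2[:-1], m2[-1]
    singles = s1 * s2 + s1 * t2 + t1 * s2          # intersections equal to F_j
    theta = t1 * t2                                # Theta intersected with Theta
    total = singles.sum() + theta                  # equals 1 minus the conflict
    if total <= 0.0:
        raise ValueError("total conflict: the BPAs cannot be combined")
    return np.append(singles, theta) / total


# Two hypothetical classifiers over three running conditions.
m_a = build_bpa([0.7, 0.2, 0.1], local_conf=[0.90, 0.80, 0.95], global_conf=0.9)
m_b = build_bpa([0.6, 0.3, 0.1], local_conf=[0.85, 0.90, 0.90], global_conf=0.8)
fused = dempster_combine(m_a, m_b)
print(fused, "-> diagnosed condition index:", int(fused[:-1].argmax()))
```

More than two classifiers can be fused by applying `dempster_combine` repeatedly, since Dempster's rule is associative and commutative.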

Implementations on Wheeled Robot
A real application of robot driving system fault diagnosis is selected to illustrate the aforesaid theories and the proposed diagnosis framework. The experimental robot and its fault diagnosis problem are described briefly, followed by discussions of the three key components of the proposed diagnosis framework, namely, data collection and preprocessing, feature extraction and SVM training, and decision fusion. In addition, several groups of contrast experiments are given in this section.

Description of Experimental Robot.
As shown in Figure 2, we use the wheeled service robot developed by our research group as the experimental platform. This robot is driven by two differential wheels and it is equipped with various kinds of sensors, such as one gyroscope (L3GD20), two incremental encoders, two temperature sensors (DS18B20), current detecting circuits, and voltage detecting circuits. The architecture of the driving system is shown in Figure 3. In general, faults that occur in a wheeled robot driving system can be divided into two categories: mechanical faults and sensor faults. Each of the two categories can be subdivided into many smaller classes. However, only a few typical kinds of high risk faults occur often in the actual use of the robot [11]. In this paper, we mainly focus on the diagnosis of 7 common kinds of high risk faults, and the normal condition $F_0$ is treated as a special kind of "fault." As shown in Table 1, the fault space can be defined as $F_{\text{err}} = \{F_0, F_1, \ldots, F_7\}$.
In order to achieve effective detection and diagnosis of the faults presented in Table 1, the fault symptom space must be determined, which means that we should select the available and useful sensor signals in the robot driving system.

Data Collection and Preprocessing.
The robot motion controller (ARM chip) is responsible for data collection and uploading. In our experiments, we sample 200 sets of data under each of the running conditions ($F_0$-$F_7$). So, the raw data sets can be written as $X_{\text{all}} = [X_0^{T}, \ldots, X_i^{T}, \ldots, X_7^{T}]^{T} \in \mathbb{R}^{1600 \times 9}$, where $X_i \in \mathbb{R}^{200 \times 9}$ denotes the data sampled under the $i$th running condition. For simplicity and without loss of generality, 100 sets of data in each $X_i$ are randomly selected as the original training samples $X_i^{\text{train}} \in \mathbb{R}^{100 \times 9}$ and the remaining 100 sets of data are used as the original testing samples $X_i^{\text{test}} \in \mathbb{R}^{100 \times 9}$. With the normalized $\bar{X}_i^{\text{train}}$ ($i = 0, \ldots, 7$), 8 PCA models are established by (17) and (18). The cumulative variance proportion of the PCs for each PCA model is shown in Figure 4 and the threshold value $\delta$ is set to 0.85. We get $k = [5, 5, 5, 4, 5, 5, 5, 5]$, which represents the optimal number of PCs for PCA$_i$ ($i = 0, \ldots, 7$).

Feature Extraction and SVM Training.
The normalized training data are projected onto the principal component subspace of each PCA model, and we get the feature vectors $T_i \in \mathbb{R}^{800 \times k_i}$ ($i = 0, \ldots, 7$) by (19). Then $T_i$ is used to train SVM$_i$, with 5-fold cross validation and the PSO algorithm for parameter optimization.
Taking SVM 8 as an instance, Figure 5 shows the distribution of the particles during the parameter optimization process using the PSO algorithm, and the optimal parameters of SVM 8 are $\{C = 11.086, \sigma = 2.997, \alpha = 0.950\}$. The optimal parameters of the other SVM models are shown in Figure 6. With the trained SVM models, the global and local confidence values $(\omega_i, \omega_{i0}, \ldots, \omega_{i7})$ can be obtained using (20) and (21), respectively, and the confidence values of SVM$_i$ ($i = 0, \ldots, 7$) are presented in Figure 7.

Decision Fusion.
With the global and local confidence values elaborated in Figure 7, we can construct the BPAs for D-S fusion by (23). Taking $F_2$ and $F_4$ as examples, we randomly draw two sets of fusion records, and the details are presented in Table 2.
As shown in Table 2, there are three erroneous diagnoses, in the 7th, the 14th, and the 16th rows, because a single PCA model cannot achieve complete awareness of the robot driving system. In the proposed approach, by contrast, multiple PCA models are used for feature extraction and D-S fusion is applied to fuse the outputs of the ML-kernel PSVM classifiers. Thus, the proposed approach achieves better awareness of the system and diagnoses the faults accurately. Besides, the 5th and the 7th columns indicate that the confidence values after D-S fusion (0.959 and 0.995) are much bigger than those of any single PCA$_i$ model. In other words, the proposed approach can efficiently reduce the uncertainties of the diagnosis result.
The classification accuracy indicates the ability to diagnose all of the categories, which are defined as 7 kinds of faulty states $\{F_1, \ldots, F_7\}$ and 1 normal state $\{F_0\}$. In this study, we use the true positive rate as the diagnosis accuracy, which is defined as
$$\text{TPR}_i = \frac{\text{TP}_i}{\text{TP}_i + \text{FN}_i},$$
where $\text{FN}_i$ is the number of false negatives, defined as the number of faults in category $F_i$ that are not classified as category $F_i$, and $\text{TP}_i$ is the number of true positives. According to Figure  . Besides, PSO and 5-fold cross validation are applied to find the optimal parameters for each kernel function. The experimental results are presented in Table 3.
From the 5th row and the 6th row of Table 3, we can see that the proposed ML-kernel has a classification performance identical to that of the classical Gaussian RBF kernel when $\alpha = 1$. When $0 < \alpha \le 1$, we can see that the diagnosis ability of the proposed ML-kernel is better than that of the classical Gaussian RBF kernel. From the discussion in Section 2.3, we know that the proposed ML-kernel can be regarded as a generalized form of the Gaussian RBF kernel, and the experimental results here verify this conclusion again. Table 3 demonstrates that the proposed ML-kernel has the best performance for fault diagnosis of the wheeled robot driving system, followed by the Gaussian RBF kernel and the polynomial kernel, while the sigmoid kernel has the worst performance in our experiments.
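The per-category true positive rate can be read directly from a confusion matrix, as in the short sketch below. It assumes that rows index the true category and columns the predicted category; the function name and the example counts are hypothetical.

```python
import numpy as np


def true_positive_rate(conf_matrix):
    """Per-category TPR = TP / (TP + FN); rows are true classes, columns predictions."""
    cm = np.asarray(conf_matrix, dtype=float)
    tp = np.diag(cm)                 # correctly classified samples of each category
    fn = cm.sum(axis=1) - tp         # samples of each category assigned elsewhere
    return tp / (tp + fn)


# Hypothetical counts for three categories with 100 test samples each.
cm = [[95, 3, 2],
      [4, 90, 6],
      [1, 2, 97]]
tpr = true_positive_rate(cm)
print(tpr, tpr.mean())
```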

Evaluation of the Proposed Hybrid Diagnosis Framework.
The performance of the proposed framework can be evaluated by comparison with the traditional nonfusion diagnosis framework. Ten groups of new test data are sampled, and each group contains 800 samples collected under the running conditions ($F_0, \ldots, F_7$), with 100 samples for each running condition. The ML-kernel is adopted as the kernel function of the SVMs in both the proposed framework and the traditional framework. The experimental result is shown in Figure 9, from which we can see that the proposed framework achieves an average accuracy of 94.46% (the highest diagnosis accuracy and the standard deviation are 97.5% and 1.95, resp.). For the traditional framework, the average accuracy is 88.15% (the highest diagnosis accuracy and the standard deviation are 95.5% and 5.76, resp.). It is clear that the proposed framework achieves better performance in both diagnosis accuracy and stability, which can be attributed to the multiple PCA models and the fusion at the decision level.

Conclusion
A novel hybrid fault diagnosis framework for the wheeled robot driving system is proposed in this paper. The proposed framework is composed of three key components, namely, data collection and preprocessing, feature extraction and SVM training, and decision fusion. Besides, a novel fractional ML-kernel is presented, and its positive definiteness and diagnosis ability are discussed. In the proposed framework, multiple PCA models are first established for fault feature extraction. Second, the extracted fault feature vectors are used to train the ML-kernel PSVM classifiers, with the PSO algorithm and cross validation for parameter tuning. Based on the probabilistic outputs and confidence values of those classifiers, the BPAs are constructed. Finally, the BPAs are fused by the D-S fusion algorithm to yield the final diagnosis result. In contrast with the earlier studies, the proposed approach can achieve better awareness of the diagnosed system and significantly reduce the uncertainties of the diagnosis result. Through an illustrative application to wheeled robot driving system fault diagnosis, the proposed method is verified to be an efficient way of diagnosing the faults in the robot driving system, and it performs better in stability (standard deviation 1.95) and diagnosis accuracy (highest diagnosis accuracy 97.5%) than the traditional methods. In the future, the combination with parallel computing and a cost-sensitive fault diagnosis framework will be studied.