Modified Kernel Marginal Fisher Analysis for Feature Extraction and Its Application to Bearing Fault Diagnosis

The high-dimensional features of defective bearings usually include redundant and irrelevant information, which degrades diagnosis performance. Thus, it is critical to extract sensitive low-dimensional characteristics to improve diagnosis performance. This paper proposes modified kernel marginal Fisher analysis (MKMFA) for feature extraction with dimensionality reduction. Owing to its outstanding performance in enhancing intraclass compactness and interclass dispersibility, MKMFA can effectively extract the sensitive low-dimensional manifold characteristics beneficial to subsequent pattern classification, even with few training samples. An MKMFA-based fault diagnosis model is presented and applied to identify different bearing faults. It first uses MKMFA to extract the low-dimensional manifold characteristics directly from the raw time-series signal samples in the high-dimensional ambient space. Subsequently, the sensitive low-dimensional characteristics in the feature space are input into a K-nearest neighbor classifier to distinguish the various fault patterns. The results of the four-fault-type and ten-fault-severity bearing fault diagnosis experiments show the feasibility and superiority of the proposed scheme in comparison with the other five methods.


Introduction
Rolling bearings are widely used in rotating machinery. Damaged bearings are often the leading cause of catastrophic machine breakdown and heavy economic loss [1,2]. Therefore, it is important to implement fault diagnosis for rolling bearings so as to prevent fatal malfunction of rotating machinery and even human casualties [3,4]. Feature extraction is central to bearing fault diagnosis, and several approaches are in common use. Time-domain statistical features (e.g., mean and root mean square) [5] are extracted from the periodic time-series signals. Frequency-domain statistical features (e.g., mean frequency) [6] are extracted from the frequency spectra of faulty signals. Classical time-frequency analysis techniques, such as empirical mode decomposition (EMD) [7], are suitable for nonlinear and nonstationary signals. The features in the three domains are usually high-dimensional so as to capture comprehensive fault information. However, a large number of redundant and irrelevant characteristics degrades the diagnosis performance and increases the computational cost. Conversely, a few salient features improve the fault identification accuracy and alleviate the computational burden [8]. Consequently, extracting the sensitive low-dimensional characteristics of rolling bearings to improve fault diagnosis performance is a great challenge.
The vibration signals of faulty machinery are weakly or strongly nonlinear due to the instantaneous variations in friction, damping, and loads [9]. Kernel Fisher discriminant analysis (KFDA) [10] and kernel principal component analysis (KPCA) [11] are classical nonlinear feature extraction methods with dimensionality reduction. Liu et al. utilized KFDA to capture low-dimensional characteristics of planetary gearboxes [12]. Shao et al. employed KPCA to capture low-dimensional characteristics from the 16-dimensional wavelet packet energy features of a gear system [13]. However, KPCA neglects the discriminant information that matters for the subsequent pattern classification. Although KFDA is supervised, it may perform poorly because it overlooks the non-Gaussian distribution characteristics of faulty samples. Moreover, both KPCA and KFDA effectively discover only the global structure in Euclidean space. If the high-dimensional samples lie on or close to a low-dimensional manifold, these methods may fail to uncover the underlying manifold structure, which is more beneficial to classification than the global structure [14,15].
Many studies reveal that geometrically motivated manifold learning can handle high-dimensional nonlinear samples well and exploit the inherent low-dimensional manifold structure [16-18]. Recent investigations have demonstrated that manifold learning methods can extract the sensitive low-dimensional manifold characteristics beneficial to pattern classification for bearing fault diagnosis [19-22]. As one of the representative manifold learning techniques, the marginal Fisher analysis (MFA) algorithm [23] has been successfully applied to face recognition [23,24] and gait recognition [25]. It has also proved to be an effective methodology for bearing fault diagnosis [5]. MFA is in essence a linear method. Hence, kernel MFA (KMFA) was proposed by applying the kernel trick to MFA [23]. Although KMFA is a prominent approach, it has some flaws for bearing feature extraction. Firstly, most kernel-based algorithms, such as KMFA and KFDA, usually adopt the Gaussian radial basis function (RBF) as the kernel function. Nevertheless, selecting the best kernel parameters to improve their feature extraction performance remains an open issue [26]. Secondly, the classical KMFA algorithm (not utilizing PCA preprocessing) may encounter the singular problem if there are insufficient training (or labeled) faulty samples, especially for expensive and critical machines. In such cases, the classical KMFA algorithm fails to obtain a stable solution and cannot effectively extract the sensitive low-dimensional manifold features of mechanical equipment. Thirdly, the similarities of the two neighborhood graphs in most extended KMFA algorithms are defined to be either 1 or 0, which does not simultaneously utilize the label information and the distance relationship of sample points.
In view of the aforementioned deficiencies, this paper presents the modified kernel marginal Fisher analysis (MKMFA) algorithm to make KMFA more robust for feature extraction and pattern classification. MKMFA utilizes the data-dependent kernel function [27], which avoids selecting the best kernel parameters. Additionally, it introduces a manifold regularization term to solve the singular problem and incorporates both the label information and the distance relationship of sample points into the two similarities to further enhance its classification capability. Subsequently, the MKMFA-based fault diagnosis model is presented and applied to identify various bearing faults. Unlike the traditional signal-processing-based fault diagnosis techniques, the proposed scheme does not need to extract high-dimensional multidomain features by signal processing approaches and then reduce the feature dimension before pattern classification. By implementing the MKMFA algorithm, it directly extracts the optimal low-dimensional manifold characteristics from the time-series signal samples in the high-dimensional ambient space while preserving the inherent manifold structure related to the fault patterns. Finally, a K-nearest neighbor (KNN) classifier is employed to identify the various fault conditions of rolling bearings in the category space.
The remainder of the paper is organized as follows. The principle of the MKMFA algorithm is addressed in Section 2. In Section 3, the MKMFA-based fault diagnosis model is presented and applied to identify various bearing faults. Finally, Section 4 gives the concluding remarks.

MKMFA Algorithm
KMFA is designed to capture the low-dimensional manifold characteristics embedded in the high-dimensional ambient space based on the graph embedding framework. The graph embedding framework and the KMFA algorithm are first briefly outlined; for more details, readers can refer to [23]. Consider a $D$-dimensional sample set of $N$ samples $X = [x_1, x_2, \ldots, x_N]$, $x_i \in \mathbb{R}^D$. Let $c(x_i) \in \{1, 2, \ldots, N_c\}$ denote the class label of the sample $x_i$, where $N_c$ is the number of classes. The low-dimensional representation of the high-dimensional sample set $X$ is denoted by $Y = [y_1, y_2, \ldots, y_N]$, $y_i = Z^T x_i$ ($Z$ is a transformation matrix, $y_i \in \mathbb{R}^d$, and $d \ll D$).

Outline of Graph Embedding Framework.
Assume $G^w = \{X, S^w\}$ to be an intrinsic graph with vertex set $X$ and similarity matrix $S^w$. The similarity matrix $S^w \in \mathbb{R}^{N \times N}$ of the intrinsic graph represents the similarities between vertexes. The diagonal matrix $D^w$ of the intrinsic graph $G^w$ is defined as
$$D^w_{ii} = \sum_{j \neq i} S^w_{ij}. \tag{1}$$
Suppose $G^b = \{X, S^b\}$ to be a penalty graph with vertex set $X$ and similarity matrix $S^b$. The similarity matrix $S^b \in \mathbb{R}^{N \times N}$ of the penalty graph reflects the similarity properties between vertexes that are to be suppressed. The diagonal matrix $D^b$ of the penalty graph $G^b$ is defined as
$$D^b_{ii} = \sum_{j \neq i} S^b_{ij}. \tag{2}$$
Graph embedding aims at seeking low-dimensional representations of the vertexes in the high-dimensional space while preserving their similarities. The objective function of the similarity preserving criterion in the graph embedding framework is
$$Z^* = \arg\min_{Z^T X A X^T Z = c} \sum_{i \neq j} \|Z^T x_i - Z^T x_j\|^2 S^w_{ij} = \arg\min_{Z^T X A X^T Z = c} Z^T X (D^w - S^w) X^T Z, \tag{3}$$
where $c$ is a constant and $A$ is a constraint matrix, which may be the Laplacian matrix of a penalty graph.
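The quantities in this outline can be illustrated with a minimal Python sketch, assuming NumPy is available. The function names `graph_laplacian` and `similarity_cost` are hypothetical and do not come from the paper; the sketch only builds the diagonal matrix and the Laplacian of a similarity matrix and evaluates the pairwise similarity preserving cost for embeddings stored as columns.

```python
import numpy as np

def graph_laplacian(S):
    """Return (D, L) for a similarity matrix S.

    D is diagonal with D_ii = sum over j != i of S_ij, and L = D - S
    (the diagonal of S is ignored).
    """
    S = np.asarray(S, dtype=float)
    off_diag = S - np.diag(np.diag(S))      # drop self-similarities
    D = np.diag(off_diag.sum(axis=1))
    return D, D - off_diag

def similarity_cost(Y, S):
    """Sum over i != j of ||y_i - y_j||^2 * S_ij, embeddings as columns of Y."""
    S = np.asarray(S, dtype=float)
    diff = Y[:, :, None] - Y[:, None, :]    # shape (d, N, N)
    sq_dist = (diff ** 2).sum(axis=0)       # shape (N, N)
    off_diag = S - np.diag(np.diag(S))
    return float((sq_dist * off_diag).sum())

# Toy chain graph with three vertexes and a one-dimensional embedding.
S = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 0.0]])
Y = np.array([[0.0, 1.0, 3.0]])
D, L = graph_laplacian(S)
cost = similarity_cost(Y, S)
```

For a symmetric similarity matrix, the pairwise cost equals twice the trace form $\mathrm{tr}(Y L Y^T)$, which is why the criterion can be stated with the Laplacian.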
The linear graph embedding in (3) postulates that the low-dimensional embeddings of the vertexes are linearly projected from the high-dimensional space. Thereby, the kernel trick is applied to the linear graph embedding so as to acquire a nonlinear embedding and fully uncover the inherent geometric structure. Suppose the $D$-dimensional sample set $X$ is projected to a reproducing kernel Hilbert space (RKHS) $\mathcal{F}$ by a nonlinear mapping $\phi: X \subset \mathbb{R}^D \to \phi(X) \subset \mathcal{F}$. The inner product in the RKHS is defined as $\langle \phi(x), \phi(y) \rangle = k(x, y)$. The RBF is usually employed as the kernel function $k(x, y)$, with the kernel width $\sigma$ as the kernel parameter. The elements of the kernel Gram matrix $K$ are $K_{ij} = k(x_i, x_j)$, and thus the nonlinear graph embedding in the RKHS is expressed as
$$Z_\phi^* = \arg\min_{Z_\phi^T \phi(X) A \phi(X)^T Z_\phi = c} \sum_{i \neq j} \|Z_\phi^T \phi(x_i) - Z_\phi^T \phi(x_j)\|^2 S^w_{ij}. \tag{4}$$
By the reproducing kernel theory, the transformation matrix $Z_\phi$ is a linear combination of the $\phi(x_i)$. Accordingly, there exists a vector $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_N]^T$ satisfying
$$Z_\phi = \sum_{i=1}^{N} \alpha_i \phi(x_i). \tag{5}$$
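A minimal Python sketch of the kernel Gram matrix and of the resulting kernel projection may help here. It assumes NumPy and the common RBF convention $\exp(-\|x - y\|^2 / (2\sigma^2))$; the paper does not state which width convention it uses, and the names `rbf_gram` and `kernel_embed` are hypothetical.

```python
import numpy as np

def rbf_gram(X, sigma):
    """RBF Gram matrix for samples stored as the columns of X (shape D x N).

    Assumed convention: k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)).
    """
    sq_norms = (X ** 2).sum(axis=0)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X.T @ X)
    d2 = np.maximum(d2, 0.0)                # guard against round-off
    return np.exp(-d2 / (2.0 * sigma ** 2))

def kernel_embed(alpha, K):
    """Embeddings y_i = alpha^T K[:, i] for coefficients alpha of shape (N, d)."""
    return alpha.T @ K

rng = np.random.default_rng(0)
X = rng.standard_normal((3, 5))             # five 3-dimensional samples
K = rbf_gram(X, sigma=2.0)
Y = kernel_embed(np.ones((5, 2)), K)        # placeholder coefficients
```

Because the projection direction is a combination of the mapped samples, only the Gram matrix is needed; the mapping $\phi$ itself is never evaluated.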

Outline of KMFA.
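On the basis of the label information and the local adjacency relationship of samples in the high-dimensional space, KMFA defines two neighborhood graphs (both shown in Figure 1) to characterize the inherent geometrical structure. A vertex pair is linked in the intrinsic graph if one sample is among the $k_1$ nearest neighbors of the other and both fall into the same class. A vertex pair is linked in the penalty graph if one sample is among the $k_2$ nearest neighbors of the other and the two belong to different classes. Let $N_k(x)$ denote the set of the $k$ nearest neighbors of $x$. The intraclass similarity matrix $S^w \in \mathbb{R}^{N \times N}$ of the intrinsic graph is defined as
$$S^w_{ij} = \begin{cases} 1, & \text{if } x_i \in N_{k_1}(x_j) \text{ or } x_j \in N_{k_1}(x_i), \text{ and } c(x_i) = c(x_j), \\ 0, & \text{otherwise.} \end{cases} \tag{6}$$
The interclass similarity matrix $S^b \in \mathbb{R}^{N \times N}$ of the penalty graph is defined as
$$S^b_{ij} = \begin{cases} 1, & \text{if } x_i \in N_{k_2}(x_j) \text{ or } x_j \in N_{k_2}(x_i), \text{ and } c(x_i) \neq c(x_j), \\ 0, & \text{otherwise.} \end{cases} \tag{7}$$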
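The construction of the two neighborhood graphs can be sketched in Python as follows, assuming NumPy. The sketch takes, for each sample, its $k_1$ nearest same-class samples and its $k_2$ nearest different-class samples and then symmetrizes; this per-sample reading of the neighborhood rule is an assumption, and the names `pairwise_sq_dists` and `neighbor_masks` are hypothetical.

```python
import numpy as np

def pairwise_sq_dists(X):
    """Squared Euclidean distances between the columns of X (shape D x N)."""
    sq_norms = (X ** 2).sum(axis=0)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X.T @ X)
    return np.maximum(d2, 0.0)

def neighbor_masks(X, labels, k1, k2):
    """Boolean N x N masks of intraclass and interclass neighbor pairs."""
    labels = np.asarray(labels)
    d2 = pairwise_sq_dists(X)
    n = d2.shape[0]
    same = labels[:, None] == labels[None, :]
    intra = np.zeros((n, n), dtype=bool)
    inter = np.zeros((n, n), dtype=bool)
    for i in range(n):
        same_idx = np.flatnonzero(same[i] & (np.arange(n) != i))
        diff_idx = np.flatnonzero(~same[i])
        # k1 nearest samples sharing the label of sample i
        intra[i, same_idx[np.argsort(d2[i, same_idx])[:k1]]] = True
        # k2 nearest samples carrying a different label
        inter[i, diff_idx[np.argsort(d2[i, diff_idx])[:k2]]] = True
    # A pair is linked if either sample selects the other.
    return intra | intra.T, inter | inter.T

X = np.array([[0.0, 0.1, 5.0, 5.2]])        # four 1-dimensional samples
labels = [0, 0, 1, 1]
intra, inter = neighbor_masks(X, labels, k1=1, k2=1)
S_w = intra.astype(float)                   # binary intraclass similarities
S_b = inter.astype(float)                   # binary interclass similarities
```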

Shock and Vibration
The intraclass compactness of the intrinsic graph is depicted by the term
$$\tilde{S}_w = \sum_{i} \sum_{j} \|Z_\phi^T \phi(x_i) - Z_\phi^T \phi(x_j)\|^2 S^w_{ij} = 2 \alpha^T K (D^w - S^w) K \alpha. \tag{8}$$
The interclass dispersibility of the penalty graph is illustrated by the term
$$\tilde{S}_b = \sum_{i} \sum_{j} \|Z_\phi^T \phi(x_i) - Z_\phi^T \phi(x_j)\|^2 S^b_{ij} = 2 \alpha^T K (D^b - S^b) K \alpha. \tag{9}$$
KMFA seeks an optimal mapping direction that pulls the intraclass nearest neighbors in the intrinsic graph closer and pushes the interclass nearest neighbors in the penalty graph farther apart. Thus, the marginal Fisher criterion maximizes the interclass dispersibility while minimizing the intraclass compactness. The objective function of the KMFA algorithm is defined as follows:
$$\alpha^* = \arg\max_{\alpha} \frac{\alpha^T K (D^b - S^b) K \alpha}{\alpha^T K (D^w - S^w) K \alpha}. \tag{10}$$
By matrix transformation theory, the objective function in (10) can be converted into the following generalized maximum eigenvalue decomposition problem:
$$K (D^b - S^b) K \alpha = \lambda K (D^w - S^w) K \alpha. \tag{11}$$
If the number of features exceeds that of the training samples, the intraclass compactness matrix $K (D^w - S^w) K$ may encounter the singular problem. Under these circumstances, the eigenvalue decomposition problem in (11) is ill-posed and does not yield a stable solution. Consequently, the KMFA algorithm utilizing PCA preprocessing first embeds the original high-dimensional sample set into a PCA subspace ahead of constructing the two neighborhood graphs. Hence, the final mapping matrix of the KMFA algorithm is defined as
$$Z = Z_{\mathrm{PCA}} Z_{\mathrm{KMFA}}, \tag{12}$$
where $Z_{\mathrm{PCA}}$ is the mapping matrix of the PCA subspace and $Z_{\mathrm{KMFA}}$ is the mapping obtained in that subspace.
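A generalized maximum eigenvalue problem of the form $A v = \lambda B v$ can be solved numerically as in the following Python sketch, assuming NumPy. The small ridge added to $B$ is a generic Tikhonov term chosen here only to keep the Cholesky factorization well defined; it is neither the PCA preprocessing nor the manifold regularization discussed in this paper, and the name `top_generalized_eigvecs` is hypothetical.

```python
import numpy as np

def top_generalized_eigvecs(A, B, d, ridge=1e-6):
    """Leading d eigenpairs of A v = lambda B v.

    A is symmetric and B is symmetric positive semidefinite.  The ridge
    (an assumption of this sketch) makes B strictly positive definite.
    """
    n = B.shape[0]
    B_reg = B + ridge * np.eye(n)
    C = np.linalg.cholesky(B_reg)           # B_reg = C C^T
    C_inv = np.linalg.inv(C)
    M = C_inv @ A @ C_inv.T                 # symmetric standard problem
    M = (M + M.T) / 2.0                     # remove round-off asymmetry
    vals, vecs = np.linalg.eigh(M)
    order = np.argsort(vals)[::-1][:d]      # largest eigenvalues first
    return vals[order], C_inv.T @ vecs[:, order]

rng = np.random.default_rng(0)
M1 = rng.standard_normal((4, 4))
M2 = rng.standard_normal((4, 4))
A = M1 @ M1.T                               # stands in for the interclass matrix
B = M2 @ M2.T + np.eye(4)                   # stands in for the intraclass matrix
vals, V = top_generalized_eigvecs(A, B, d=2, ridge=0.0)
```

When $B$ is singular, as in the small-sample case described above, the factorization fails without some form of regularization, which is the practical face of the singular problem.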

Modified KMFA.
In the KMFA algorithm, the similarities of the two neighborhood graphs are defined to be either 1 or 0, which does not jointly exploit the label information and the distance relationship of samples. Although the intraclass similarity between vertex pairs in the supervised kernel locality preserving projection (SKLPP) algorithm [28] is defined by a heat kernel related to the distances of sample points, it involves the kernel width parameter and defines the interclass similarity to be 0. Furthermore, these two kinds of weights make the algorithms sensitive to noise and prone to overfitting the sample points [29]. To enhance the compactness of the samples in the same classes and the dispersibility of the samples from different classes, the MKMFA algorithm incorporates the label information and the distance relationship into the similarities of KMFA to guide the construction of the two neighborhood graphs. Motivated by [29], the definitions of the two similarities in the MKMFA algorithm are stated below. Let $N_k(x)$ denote the set of the $k$ nearest neighbors of $x$. In the intrinsic graph $R^w = \{X, H^w\}$ of the MKMFA algorithm, the entries of the intraclass similarity matrix $H^w \in \mathbb{R}^{N \times N}$ are defined as
$$H^w_{ij} = \begin{cases} 1 + \exp(-\|x_i - x_j\|^2 / t), & \text{if } x_i \in N_{k_1}(x_j) \text{ or } x_j \in N_{k_1}(x_i), \text{ and } c(x_i) = c(x_j), \\ 0, & \text{otherwise,} \end{cases} \tag{13}$$
where the parameter $t$ is a regulator, which is equal to the square of the average Euclidean distance between all samples.
In the penalty graph $R^b = \{X, H^b\}$ of the MKMFA algorithm, the entries of the interclass similarity matrix $H^b \in \mathbb{R}^{N \times N}$ are
$$H^b_{ij} = \begin{cases} 1 - \exp(-\|x_i - x_j\|^2 / t), & \text{if } x_i \in N_{k_2}(x_j) \text{ or } x_j \in N_{k_2}(x_i), \text{ and } c(x_i) \neq c(x_j), \\ 0, & \text{otherwise.} \end{cases} \tag{14}$$
Because $0 \le 1 - \exp(-\|x_i - x_j\|^2 / t) \le 1$ and $1 \le 1 + \exp(-\|x_i - x_j\|^2 / t) \le 2$, the interclass similarity of the MKMFA algorithm is smaller than that of the KMFA algorithm and the intraclass similarity of the MKMFA algorithm is larger than that of the KMFA algorithm. According to the marginal Fisher criterion, a higher similarity between two neighboring samples in the same class brings about a smaller distance between their low-dimensional representations. In contrast, a lower similarity between two neighboring samples from different classes leads to a larger distance between their low-dimensional embeddings. Thus, compared to the KMFA algorithm, the MKMFA algorithm pulls the neighboring sample points in the same classes closer and pushes the neighboring sample points from different classes farther apart. Additionally, the similarity of the intrinsic graph is larger than that of the penalty graph in the MKMFA algorithm. This means that two sample points with the same label have relatively high similarity, whereas two sample points with lower similarity are more likely to have different labels. Because the two similarities are confined to a fixed range, the influence of noise is also suppressed. Together, these properties make the two similarities of the MKMFA algorithm helpful for improving the discriminant ability and suppressing noise.
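The weighted similarities can be sketched in Python as follows, assuming NumPy. The forms $1 + \exp(-\|x_i - x_j\|^2/t)$ for linked intraclass pairs and $1 - \exp(-\|x_i - x_j\|^2/t)$ for linked interclass pairs are inferred from the bounds quoted above and are an assumption of this sketch, as is averaging the distances over distinct pairs to obtain $t$. The name `mkmfa_weights` and the boolean neighbor masks passed to it are hypothetical.

```python
import numpy as np

def mkmfa_weights(X, intra_mask, inter_mask):
    """Weighted similarities for samples stored as columns of X (D x N).

    intra_mask / inter_mask are boolean N x N matrices marking the linked
    intraclass and interclass neighbor pairs.  The weight forms are an
    assumption inferred from the stated bounds [1, 2] and [0, 1].
    """
    sq_norms = (X ** 2).sum(axis=0)
    d2 = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X.T @ X), 0.0)
    n = d2.shape[0]
    upper = np.triu_indices(n, k=1)          # each distinct pair once
    t = np.sqrt(d2[upper]).mean() ** 2       # squared average distance
    heat = np.exp(-d2 / t)
    H_w = np.where(intra_mask, 1.0 + heat, 0.0)
    H_b = np.where(inter_mask, 1.0 - heat, 0.0)
    return H_w, H_b, t

X = np.array([[0.0, 1.0, 4.0, 6.0]])
intra = np.zeros((4, 4), dtype=bool)
inter = np.zeros((4, 4), dtype=bool)
intra[0, 1] = intra[1, 0] = intra[2, 3] = intra[3, 2] = True
inter[1, 2] = inter[2, 1] = True
H_w, H_b, t = mkmfa_weights(X, intra, inter)
```

Under these assumed forms, a close intraclass pair receives a weight near 2 and a close interclass pair a weight near 0, so the distance information modulates both graphs.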
It is not easy for KMFA to select the best kernel width $\sigma$ for improving the feature extraction performance. Wang et al. proposed a manifold adaptive nonparametric kernel [30], which can capture the nonlinear property well. However, computing its kernel Gram matrix is neither easy nor efficient. Thus, the MKMFA algorithm utilizes the data-dependent kernel function [27], constructed from the covariance matrix, to reduce the influence of kernel parameter selection on the feature extraction performance. The data-dependent kernel function is given in (15), where $J$ denotes the covariance matrix of the sample set $X$.
The intraclass compactness of the MKMFA algorithm is depicted as
$$\tilde{H}_w = \sum_{i} \sum_{j} \|y_i - y_j\|^2 H^w_{ij} = 2 \alpha^T K (O^w - H^w) K \alpha, \tag{16}$$
where the diagonal matrix $O^w = (O^w_{ii})_{N \times N}$ has entries $O^w_{ii} = \sum_{j \neq i} H^w_{ij}$. The interclass dispersibility of the MKMFA algorithm is expressed as
$$\tilde{H}_b = \sum_{i} \sum_{j} \|y_i - y_j\|^2 H^b_{ij} = 2 \alpha^T K (O^b - H^b) K \alpha, \tag{17}$$
where the diagonal matrix $O^b = (O^b_{ii})_{N \times N}$ has entries $O^b_{ii} = \sum_{j \neq i} H^b_{ij}$. Therefore, the objective function can be characterized as follows:
$$\alpha^* = \arg\max_{\alpha} \frac{\alpha^T K (O^b - H^b) K \alpha}{\alpha^T K (O^w - H^w) K \alpha}. \tag{18}$$
Just like the KMFA algorithm, (18) may also suffer from the singular problem when only a small number of samples are available. To obtain good generalization capability and avoid the singular problem, a common approach is to map the original samples in the high-dimensional space into a PCA subspace ahead of constructing the neighborhood graphs [14,23]. Although this preprocessing scheme can suppress noise and avoid the singular problem, the unsupervised PCA algorithm does not employ the label information. Thus, the features extracted by the KMFA algorithm (utilizing PCA preprocessing) may discard some discriminant information that is useful for pattern classification. The second method is to transform the ratio form of the marginal Fisher criterion into the difference form [30,31]. The third method is to calculate the mapping direction in the null space of the intraclass compactness matrix; Lin et al. proposed Kernel Null Space MFA for face recognition [32]. None of these three techniques exploits the underlying geometry of the samples. Another way is to introduce a manifold regularization term so as to exploit the inherent manifold structure more deeply. Wei et al. utilized the Laplacian penalty function as the regularization term [33]. Regularized KMFA (RKMFA) [34] and semisupervised KMFA (SSKMFA) [35] were proposed to deal with the singular problem and applied to bearing feature extraction. Motivated by [36], the marginal Fisher criterion is modified here by introducing the underlying manifold structure as the regularization term. The modified criterion is given in (19), where $0 \le \beta \le 1$ controls the smoothness of the regularization term. The procedure of the MKMFA algorithm is stated below.
Step 1 (constructing two neighborhood graphs). According to the local neighborhood relationship and the label information of sample points, the MKMFA algorithm constructs the intrinsic graph $R^w = \{X, H^w\}$ and the penalty graph $R^b = \{X, H^b\}$. The intraclass similarity matrix $H^w \in \mathbb{R}^{N \times N}$ of the intrinsic graph is defined in (13) and the interclass similarity matrix $H^b \in \mathbb{R}^{N \times N}$ of the penalty graph is defined in (14).
Step 2 (calculating the kernel Gram matrix). The entries of the kernel Gram matrix $K \in \mathbb{R}^{N \times N}$ are $K_{ij} = k(x_i, x_j)$. Thus, the kernel Gram matrix $K$ is acquired according to (15).
Step 3 (seeking the optimal projecting direction). The optimal projecting direction $\alpha^*$ in (19) is given by solving the generalized maximum eigenvalue decomposition problem (20), where the diagonal matrix $O^w \in \mathbb{R}^{N \times N}$ of the intrinsic graph and the diagonal matrix $O^b \in \mathbb{R}^{N \times N}$ of the penalty graph are described in Section 2.3.
Step 4 (calculating the low-dimensional representations). The low-dimensional embedding of the original high-dimensional signal sample $x_i$ is obtained as
$$y_i = (\alpha^*)^T K(:, i), \tag{21}$$
where $K(:, i)$ denotes the $i$th column of the kernel Gram matrix $K$.
Three parameters need to be preset for MKMFA: the regularization parameter $\beta$, the intraclass neighboring point number $k_1$, and the interclass neighboring point number $k_2$. The regularization parameter $\beta$ is set to 0.01 by experience. As recommended in [23], the intraclass neighboring point number $k_1$ is selected as 5. Fivefold cross-validation is employed to select the best interclass neighboring point number $k_2$, which ranges from 5 to 70 with a step size of 5 [35].
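The fivefold cross-validation search over the interclass neighboring point number can be sketched in Python as follows, assuming NumPy. The callable `evaluate`, which would train on one index set and return the validation accuracy for a given candidate, is a hypothetical placeholder; the folds here are random rather than stratified by class, which is also an assumption.

```python
import numpy as np

def make_folds(n_samples, n_folds=5, seed=0):
    """Randomly partition the sample indices into n_folds disjoint folds."""
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(n_samples), n_folds)

def select_k2(n_samples, evaluate, candidates=range(5, 75, 5), n_folds=5, seed=0):
    """Return the candidate with the highest mean validation score.

    evaluate(train_idx, val_idx, k2) is a user-supplied (hypothetical)
    function returning a score such as the KNN accuracy on val_idx.
    The default candidates are 5, 10, ..., 70.
    """
    folds = make_folds(n_samples, n_folds, seed)
    best_k2, best_score = None, -np.inf
    for k2 in candidates:
        scores = []
        for f in range(n_folds):
            val_idx = folds[f]
            train_idx = np.concatenate(
                [folds[g] for g in range(n_folds) if g != f])
            scores.append(evaluate(train_idx, val_idx, k2))
        mean_score = float(np.mean(scores))
        if mean_score > best_score:          # keep the first best candidate
            best_k2, best_score = k2, mean_score
    return best_k2, best_score

# Dummy score peaking at k2 = 30, only to exercise the search loop.
best_k2, best_score = select_k2(40, lambda tr, va, k2: -abs(k2 - 30))
```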

Bearing Fault Diagnosis Based on MKMFA
To verify the effectiveness of the MKMFA algorithm for feature extraction and pattern classification, the MKMFA-based fault diagnosis model is presented and applied to identify various bearing faults.

The Structure of the Diagnosis System.
Rolling bearing fault classification is essentially a multiple-manifold learning problem [37]. From the viewpoint of geometry, the high-dimensional signal samples in the same fault state have the same topology or spatial distribution, and their low-dimensional embeddings reside on or near a submanifold [38]. On the other hand, the high-dimensional signal samples in different classes have different geometric properties, and their low-dimensional representations are located on different submanifolds. Because the MKMFA algorithm simultaneously considers the compactness within each submanifold and the dispersibility between different submanifolds, an MKMFA-based bearing fault diagnosis model is presented. Figure 2 shows the entire process of the proposed scheme. The time-series vibration signals $s(t)$ are collected from the monitored equipment by sensors. Subsequently, the signal samples $x_i = [s(i), s(i+1), \ldots, s(i+D-1)]$ are normalized to zero mean and unit variance, where $D$ denotes the feature dimension of the signal samples and equals the number of sampling points in each signal sample. Thus, the signal sample set $X = [x_1, x_2, \ldots, x_N]$ is acquired in the high-dimensional pattern space. By learning the underlying manifold structures of the high-dimensional signal samples and uncovering the inherent fault information of the different submanifolds, the MKMFA algorithm maps the signal samples from the high-dimensional pattern space to a low-dimensional feature space, in which the intraclass nearest neighbors become closer while the interclass nearest neighbors move farther apart. Thus, the sensitive low-dimensional manifold characteristics related to the fault patterns are extracted from the high-dimensional pattern space and finally input into a category space, where the various fault patterns of rolling bearings are identified by the KNN classifier.
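The first and last stages of this pipeline can be sketched in Python as follows, assuming NumPy. The sketch cuts a signal into non-overlapping windows (the paper does not state whether its windows overlap), normalizes each window to zero mean and unit variance, and classifies embeddings by a plain majority-vote KNN. The names `segment_and_normalize` and `knn_predict` are hypothetical, and the feature extraction stage between them is omitted.

```python
import numpy as np

def segment_and_normalize(signal, length, n_samples):
    """Cut a 1-D signal into n_samples non-overlapping windows (columns)
    of the given length and normalize each to zero mean, unit variance."""
    signal = np.asarray(signal, dtype=float)
    if length * n_samples > signal.size:
        raise ValueError("signal too short for the requested samples")
    X = signal[: length * n_samples].reshape(n_samples, length).T  # D x N
    X = X - X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # leave constant windows at zero
    return X / std

def knn_predict(Y_train, labels_train, Y_test, k=1):
    """Majority vote among the k nearest training embeddings.

    Samples are columns; ties go to the smallest label value.
    """
    labels_train = np.asarray(labels_train)
    diff = Y_test[:, :, None] - Y_train[:, None, :]
    d2 = (diff ** 2).sum(axis=0)             # shape (n_test, n_train)
    nearest = np.argsort(d2, axis=1)[:, :k]
    preds = []
    for row in labels_train[nearest]:
        values, counts = np.unique(row, return_counts=True)
        preds.append(values[np.argmax(counts)])
    return np.array(preds)

X = segment_and_normalize(np.arange(12.0), length=4, n_samples=3)
Y_train = np.array([[0.0, 0.2, 5.0, 5.1]])   # toy 1-D embeddings
pred = knn_predict(Y_train, [0, 0, 1, 1], np.array([[0.1, 4.9]]), k=3)
```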

Vibration Data Acquisition.
The experimental data of the rolling bearings are available from the Bearing Data Center [39]. The data have been validated in several studies [19,22,35,37,40] and have become a standard bearing data set. The experimental system is described in detail in [40]. Figure 3 shows the experimental setup, which is composed of a motor, a torque sensor, and a dynamometer controlled to obtain different torque load levels. The rotational speed of the motor ranged from 1730 to 1797 r/min according to the load (0, 1, 2, and 3 hp). The deep groove ball bearings at the drive end were tested in four conditions (normal, ball fault, inner race fault, and outer race fault), the faults being single-point faults. Each fault type covers three damage sizes (0.007, 0.014, and 0.021 inches).
Data set C comprises 1000 signal samples involving four kinds of loads (0, 1, 2, and 3 hp) and fault types (normal condition, ball fault, inner race fault, and outer race fault). Each fault type contains three kinds of damage sizes (0.007, 0.014, and 0.021 inches). Each operating condition consists of 100 signal samples, which are split into 50 training samples and 50 testing samples. It is a ten-submanifold learning problem corresponding to ten kinds of bearing fault severities. The purpose of the experiment on data set C is to further investigate the fault classification performance of the proposed fault diagnosis scheme under complicated operating conditions.

Feature Extraction and Pattern Classification.
To evaluate the effectiveness and demonstrate the superiority of the MKMFA algorithm for bearing feature extraction and fault identification, we conducted several experiments on the three data sets in Table 1 and made a comparison with KPCA, KFDA, KMFA, SKLPP, RKMFA, and SSKMFA. The feature dimension of each signal sample in the three data sets is 1024, which is larger than their training sample sizes.

Bearing Fault Categories Identification.
An investigation was performed on data set A to evaluate the feature extraction performance of the MKMFA algorithm. The low-dimensional features are extracted directly from the high-dimensional pattern space by the six feature extraction methods. The first two mapping results of these methods are plotted for intuitive display. Figure 4 reveals that the clustering result of the training set with KMFA is widely dispersed for the inner race fault, which produces an overlapping area between the normal condition and the inner race fault. Except for the outer race fault, KMFA could not clearly separate the other three faults for the testing set. Figure 5 [...] and RKMFA algorithm, MKMFA pulls the neighboring sample points in the same classes closer and pushes the neighboring sample points from different classes farther apart. The reason is that the two similarities of the MKMFA algorithm are weighted. These experimental results demonstrate that MKMFA is able to enhance the intraclass compactness and interclass dispersibility. Compared with the other five feature extraction techniques, MKMFA is more effective in capturing the sensitive low-dimensional manifold characteristics related to the nature of different bearing faults.

To assess the fault classification performance of the six feature extraction approaches objectively, their low-dimensional mapping results are fed into the KNN classifier, whose recognition accuracy serves as the final evaluation criterion. Table 2 displays their recognition accuracies and the corresponding parameter settings. It can be seen from Table 2 that the feature dimension of MKMFA ($d = 5$) is lower than that of KPCA ($d = 20$). Nevertheless, the classification performances of the former (100% for ball fault, normal condition, and inner race fault) surpass those of the latter (6.67%, 13.34%, and 80%, resp.). It results from the fact that KPCA does not take advantage of [...] Hence, the proposed approach is effective in extracting the most sensitive low-dimensional manifold characteristics beneficial to fault classification. The reason is that the proposed approach makes effective use of the class information and the underlying geometric structure of faulty samples. In addition, the modified intraclass and interclass similarities are helpful for exploiting the underlying manifold structure more deeply.
We used data set B to assess the influence of the training sample size on the recognition rates of the different feature extraction methods. The training samples are selected randomly, and each of the following experiments is repeated for ten trials. Table 3 and [...] is small compared with the other five feature extraction methods.
For the KMFA, RKMFA, and MKMFA algorithms, the $k_1$ neighboring points dominate the intraclass compactness and the $k_2$ neighboring points govern the interclass dispersibility. Hence, the two neighboring point numbers are critical for the construction of the two neighborhood graphs and the subsequent diagnosis task. Several experiments were carried out by changing the two neighboring point numbers. The training and testing sample sizes per class are both set to 50. As illustrated in Figure 11(a), the classification accuracies based on KMFA fluctuate below 90% when the values of $k_2$ are small. Figure 11(b) reveals that the identification accuracies of RKMFA have relatively small fluctuations. Figure 11(c) shows that the classification rates of MKMFA are stable and remain at a high level while the two neighboring point numbers vary. Compared with KMFA and RKMFA, the MKMFA-based bearing fault diagnosis approach is robust and convenient because little effort is needed to tune the two nearest neighbor numbers. The parameter $\beta$ controls the smoothness of the regularization term in the MKMFA algorithm. For training sample sizes of 10, 50, and 90 in each class, the effect of the parameter $\beta$ on the recognition rates is illustrated in Figure 12. As can be seen, the classification performances of KMFA are superior to those of MKMFA and RKMFA when $\beta$ is zero. This stems from the fact that, in this case, MKMFA and RKMFA reduce to the classical KMFA without mapping the original samples to a PCA subspace beforehand. Except when $\beta$ equals zero, the recognition rates of MKMFA are higher than those of KMFA for the different training sample sizes. Thus, the regularization term of MKMFA can improve the diagnosis performance of KMFA. Compared with RKMFA, the classification accuracies of MKMFA show relatively smaller fluctuations for different $\beta$. This reveals that it is not very difficult to select a suitable parameter $\beta$ for enhancing the classification capability of MKMFA.

Bearing Fault Severities Identification.
The experiments were conducted on data set C to recognize ten types of bearing fault severity conditions. To compare the six feature extraction methods quantitatively, the within-class scatter $S_w$ and the between-class scatter $S_b$ are defined as in [9], where $y_i$ ($i = 1, 2, \ldots, N$) is the feature vector, $m_c$ is the mean of the feature vectors in the $c$th class, and $m$ is the mean of all feature vectors.
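One common convention for the two scatter measures can be sketched in Python as follows, assuming NumPy. The unnormalized sums of squared deviations used here are an assumption; the definition in [9] may apply a different normalization. The name `class_scatter` is hypothetical.

```python
import numpy as np

def class_scatter(Y, labels):
    """Within-class and between-class scatter of embeddings (columns of Y).

    Assumed convention: S_w sums squared deviations from each class mean,
    and S_b sums class-size-weighted squared deviations of the class means
    from the overall mean.
    """
    labels = np.asarray(labels)
    overall_mean = Y.mean(axis=1, keepdims=True)
    s_w, s_b = 0.0, 0.0
    for c in np.unique(labels):
        Y_c = Y[:, labels == c]
        class_mean = Y_c.mean(axis=1, keepdims=True)
        s_w += float(((Y_c - class_mean) ** 2).sum())
        s_b += Y_c.shape[1] * float(((class_mean - overall_mean) ** 2).sum())
    return s_w, s_b

Y = np.array([[0.0, 2.0, 10.0, 12.0]])       # toy 1-D feature vectors
s_w, s_b = class_scatter(Y, [0, 0, 1, 1])
```

With this convention the two values add up to the total scatter about the overall mean, so a small $S_w$ together with a large $S_b$ indicates tight, well-separated classes.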
The within-class scatter $S_w$ describes the compactness of the samples in the same classes, and the between-class scatter $S_b$ characterizes the dispersibility of the samples from different classes. Thus, smaller $S_w$ values and larger $S_b$ values are beneficial to fault classification.
Some simple feature extraction methods are widely applied to bearing fault diagnosis. Hence, we employed simple features, including ten time-domain statistical features and six EMD energy entropies as described in [35], to analyze the above two cases for comparison. Table 5 shows the average recognition rates of the 16 simple features based on the KNN classifier. In comparison with Figure 10 and Tables 3 and 4, the simple features have lower recognition rates than the MKMFA features in both cases. The reason is that the MKMFA algorithm extracts the sensitive low-dimensional manifold characteristics related to fault patterns by learning the underlying manifold structures of the high-dimensional signal samples, whereas the simple feature extraction methods attend only to some specific aspects of faulty signals. Thus, it is necessary to explore advanced feature extraction methods to improve fault classification performance, which is the goal of our study.

Conclusions
This paper presents the modified kernel marginal Fisher analysis (MKMFA) algorithm for feature extraction with dimensionality reduction, which employs the label information and the distance relationship of faulty samples, introduces a manifold regularization term, and utilizes the data-dependent kernel function. MKMFA effectively extracts the optimal low-dimensional manifold characteristics from the time-series signal samples in the high-dimensional ambient space. It transforms the complicated two-stage procedure (feature extraction followed by dimensionality reduction) into a relatively simple one-step process, which boils down to a generalized maximum eigenvalue decomposition problem. Compared with KPCA, KFDA, KMFA, RKMFA, and SKLPP, the feature extraction experiments on four categories of bearing faults reveal that the proposed feature extraction scheme is more effective in capturing the sensitive low-dimensional manifold characteristics beneficial to pattern classification, owing to its good clustering and separation properties. The feature evaluation experiments on ten types of bearing fault severities show its superiority in comparison with KPCA, KFDA, KMFA, SSKMFA, and SKLPP. Based on the MKMFA algorithm, a fault diagnosis model is presented and applied to identify different bearing faults. When the training sample sizes are varied in the four-fault-type comparison experiments, the classification performance of MKMFA is shown to be significantly better even with insufficient training samples. The ten-fault-severity comparison experiments on rolling bearings exhibit its outstanding fault recognition capability compared with the other five feature extraction methods. The method is robust and can be applied to bearing fault classification without great effort to tune its parameters. The proposed diagnosis scheme has proved effective in recognizing bearing faults and can be readily applied to fault diagnosis of other components as well.

Figure 2: The flow chart of the proposed fault diagnosis procedure.
Data set A comprises 400 signal samples involving four kinds of loads (0, 1, 2, and 3 hp) and fault types (normal condition, ball fault, inner race fault, and outer race fault) with a damage size of 0.021 inches. Each operating condition consists of 100 signal samples, which are divided into 10 training samples and 90 testing samples. It is a four-submanifold learning problem corresponding to four kinds of bearing fault categories. The experiments were conducted on data set A to evaluate the feature extraction and fault classification performance of the MKMFA algorithm. Data set B is similar to data set A: it also consists of 400 signal samples, whose operating conditions are identical to those of data set A. However, data set B varies the training sample size of each class, which increases from 10 to 90 with a step size of 20; the remaining samples are used for testing. It is also a four-submanifold learning problem corresponding to four kinds of bearing fault categories. The experiments on data set B were aimed at assessing the effect of the training sample size on the fault recognition ability of the MKMFA algorithm.
[Figure legend residue: KPCA, KFDA, SKLPP, RKMFA, and MKMFA; panel: inner race fault.]

Figure 10 displays the average recognition rates of the six feature extraction methods with different training sample sizes per class. The two neighboring parameters of MKMFA are set to 5 and 10, respectively; those of RKMFA are set to 5 and 20. The results indicate that recognition performance improves as the number of training samples increases. This is because overfitting is less likely to occur when more training samples are available for KPCA, KFDA, and KMFA. The classification accuracies of KPCA, KFDA, KMFA, and SKLPP are much lower than that of MKMFA when the training sample number is equal to 10. By comparison, the classification rates based on RKMFA and MKMFA fluctuate the least as the training sample number varies. This results from the fact that RKMFA and MKMFA introduce a regularization term incorporating the intrinsic manifold structure to reduce the effect of insufficient training samples. In comparison with RKMFA, MKMFA utilizes fewer features to achieve higher diagnosis accuracies. Therefore, MKMFA has the best classification performance even though the training sample number

Figure 10: The comparison of the average accuracies for various training sample sizes.
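The recognition rates compared in Figure 10 are obtained by feeding the extracted low-dimensional features into a K-nearest neighbor (KNN) classifier. A minimal sketch of that evaluation step is given below. The Euclidean distance, the default `k=1`, the function names, and the toy feature vectors are all assumptions made for illustration; the source does not give the classifier settings in this section.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, x, k=1):
    """Predict the label of x by majority vote among its k nearest training samples."""
    dists = sorted((math.dist(x, t), label) for t, label in zip(train_X, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

def recognition_rate(train_X, train_y, test_X, test_y, k=1):
    """Return the percentage of test samples whose predicted label is correct."""
    hits = sum(knn_predict(train_X, train_y, x, k) == y
               for x, y in zip(test_X, test_y))
    return 100.0 * hits / len(test_y)

# Toy two-dimensional features, for illustration only.
train_X = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
train_y = ["N", "N", "I", "I"]
test_X = [(0.0, 0.5), (5.0, 5.5), (4.0, 4.0)]
test_y = ["N", "I", "N"]
rate = recognition_rate(train_X, train_y, test_X, test_y, k=1)
```

Averaging `recognition_rate` over repeated splits for each training sample size would give one point per method on a curve such as the one in Figure 10.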

Figure 12: The average recognition rate as the training sample sizes are (a) 10, (b) 50, and (c) 90.

Table 1: Description of the data sets.
I: inner race fault, B: ball fault, O: outer race fault, and N: normal.

Table 2: The recognition rates (%) of KNN classifier based on six feature extraction approaches.

Table 3: The average recognition rates (%) of KNN for various training sample sizes per class.

Although KFDA is supervised, it cannot uncover the underlying manifold structure. These results also indicate that the local structure information extracted by MKMFA could be more effective than the global feature information of the Euclidean space extracted by KPCA and KFDA. The proposed approach outperforms the KMFA-based and SKLPP-based fault diagnosis methods in classification performance. The reason is that, although KMFA and SKLPP are able to capture the manifold structure, they lose some useful discriminant information by using PCA as preprocessing. Compared with RKMFA, MKMFA employs fewer features to achieve better diagnosis results.
Table 4 shows the two parameter values and the classification accuracies of the six feature extraction methods based on KNN classifier. The two neighboring parameters of KMFA, SSKMFA, and MKMFA are set to 5 and 10, respectively. As can be seen, MKMFA features have the smallest $S_w$ value and the largest $S_b$ value compared with the other five feature extraction schemes. This reveals that MKMFA has the best clustering property and classification capability in comparison with KPCA, KFDA, KMFA, SSKMFA, and SKLPP. Compared with the other five feature extraction techniques, which are supervised, KPCA has the lowest recognition rate because it discards some discriminant information useful for pattern classification. Additionally, the MKMFA algorithm based on KNN classifier achieves the highest identification rate among all six feature extraction techniques. The reason is that the MKMFA algorithm employs the class label information and the distance relationships of sample points to guide the construction of local neighborhood graphs. Consequently, the discrimination performance of

Table 4: Performance comparisons for various feature extraction approaches.
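The two S values reported in Table 4 are not defined in this section. Assuming they are the usual within-class and between-class scatter measures (a small within-class value indicating compact clusters and a large between-class value indicating well-separated classes), they could be computed along the following lines. The function name `scatter_measures` and the toy features are hypothetical, and the exact definition used in the experiments may differ.

```python
def scatter_measures(X, y):
    """Return (within, between) scatter traces for feature vectors X with labels y.

    within  : sum of squared distances of samples to their own class mean
    between : sum over classes of (class size) * squared distance between
              the class mean and the overall mean
    These definitions are an assumption of this sketch.
    """
    dim = len(X[0])

    def mean(rows):
        return [sum(r[j] for r in rows) / len(rows) for j in range(dim)]

    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    overall = mean(X)
    within = 0.0
    between = 0.0
    for label in set(y):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centre = mean(rows)
        within += sum(sqdist(r, centre) for r in rows)
        between += len(rows) * sqdist(centre, overall)
    return within, between

# Toy features for two classes, for illustration only.
X = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
y = ["a", "a", "b", "b"]
s_within, s_between = scatter_measures(X, y)
```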

Table 5: The average recognition rates (%) of the simple features based on KNN.

The above experimental results demonstrate that the MKMFA algorithm is clearly superior to KPCA, KFDA, KMFA, RKMFA, SSKMFA, and SKLPP. The main reasons are as follows. Firstly, the MKMFA algorithm employs the discriminant information and local neighborhood relationships of signal samples to construct the two neighborhood graphs. Secondly, it incorporates the class label information and distance relationships of signal samples into the two similarities and thus further enhances the intraclass compactness and interclass dispersibility. Thirdly, it introduces a manifold regularization term to cope with the singularity problem and employs a nonparametric kernel function to reduce the influence of kernel parameter selection on feature extraction performance. As a result, the low-dimensional manifold characteristics extracted by the MKMFA algorithm reflect the nature of the bearing fault patterns, because the algorithm uncovers the inherent manifold structures of the different submanifolds.
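The construction of the two neighborhood graphs can be illustrated with a simplified sketch in the style of standard marginal Fisher analysis. It is not the MKMFA construction itself: distances are Euclidean in the input space rather than in the kernel feature space, edges are unweighted rather than carrying the label- and distance-based similarities described above, and the penalty graph is built per sample. The function name `mfa_graphs` and the parameters `k1` and `k2` are assumptions introduced for illustration.

```python
import math

def mfa_graphs(X, y, k1, k2):
    """Build unweighted intrinsic and penalty graphs as sets of undirected edges.

    intrinsic : each sample is linked to its k1 nearest neighbors of the same class
    penalty   : each sample is linked to its k2 nearest neighbors of other classes
    """
    n = len(X)
    intrinsic, penalty = set(), set()
    for i in range(n):
        same = sorted((math.dist(X[i], X[j]), j)
                      for j in range(n) if j != i and y[j] == y[i])
        diff = sorted((math.dist(X[i], X[j]), j)
                      for j in range(n) if y[j] != y[i])
        intrinsic.update(frozenset((i, j)) for _, j in same[:k1])
        penalty.update(frozenset((i, j)) for _, j in diff[:k2])
    return intrinsic, penalty

# Toy one-dimensional samples from two classes, for illustration only.
X = [(0.0,), (1.0,), (2.0,), (10.0,), (11.0,), (12.0,)]
y = [0, 0, 0, 1, 1, 1]
intrinsic, penalty = mfa_graphs(X, y, k1=1, k2=1)
```

In this toy case the intrinsic graph links neighbors inside each cluster, while the penalty graph links the samples lying on the margin between the two classes, which is the pair of structures that intraclass compactness and interclass dispersibility are measured on.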