A Novel Fault Diagnosis Model for Bearing of Railway Vehicles Using Vibration Signals Based on Symmetric Alpha-Stable Distribution Feature Extraction

Axle box bearings are among the most critical mechanical components of railway vehicles, and condition monitoring helps ensure that they remain healthy in service. In this paper, a novel fault diagnosis model for axle box bearings is proposed, based on vibration signals, symmetric alpha-stable (SαS) distribution feature extraction, and least squares support vector machines (LS-SVM). The model proceeds in three main steps. First, fast nonlocal means is used for denoising and ensemble empirical mode decomposition is applied to extract fault feature information. Then, a new statistical method of feature extraction, the symmetric alpha-stable distribution, is employed to obtain representative features from the intrinsic mode functions. Finally, the hybrid fault feature sets are input into LS-SVM to identify the fault type. To enhance the performance of LS-SVM for small-scale samples, the Morlet wavelet kernel function is combined with LS-SVM (LS-WSVM) to classify fault type and fault severity, and particle swarm optimization is used to optimize the LS-WSVM parameters. The experimental results demonstrate that the proposed approach performs more effectively and robustly than the other methods in small-scale samples for fault detection and classification of railway vehicle bearings.


Introduction
Rolling element bearings are widely used in industrial applications. Axle box bearings are one of the critical mechanical components of railway vehicles. Frequent failures of train bearings, including pitting, stripping, wear, cracking, and abrasion, have a great influence on traffic safety. Therefore, effective identification of bearing health status is indispensable for monitoring the working condition of axle box bearings during train maintenance [1,2]. Currently, vibration analysis and acoustic analysis are the two main approaches to defect detection [3,4]. Vibration-based diagnosis has become the most common monitoring technique because of its higher reliability.
In the process of fault diagnosis, extracting defect features from noisy vibration signals remains a great challenge. Many sources of signal contamination, including additive noise and the signals from shafts, gearboxes, and other mechanical components of railway vehicles, overlap the signals of interest in both time and frequency. Thus, it is vital to improve signal denoising methods in order to remove the noise and extract the fault characteristics. For that reason, many algorithms have been developed for vibration signal denoising. In recent years, methods based on discrete wavelet transform (DWT) coefficient shrinkage [5][6][7], empirical mode decomposition (EMD) [8][9][10][11][12][13], and nonlocal means (NLM) [14,15] have become three popular approaches to rotating machinery fault diagnosis. DWT analyzes signals on multiple scales and denoises them by discarding the lower-magnitude coefficients; its performance relies on the selection of the wavelet basis function. In the EMD method, the clean vibration signal is obtained by discarding the first few intrinsic mode functions (IMFs). Mode mixing [16], resulting from signal intermittence, is the disadvantage of EMD. To overcome this obstacle, the ensemble EMD (EEMD) was proposed by Wu and Huang [17]. The fast NLM (FNLM) approach is a very successful image denoising method [14], which has also been applied to rotating machinery fault diagnosis. In this study, the FNLM approach and the EEMD approach are combined to denoise the raw vibration signal.
After the denoising step, the feature parameters that correctly represent the health status of the axle box bearing should be extracted. According to previous studies, fault features such as permutation entropy [18,19], subband energy [20], and statistical features (variance, kurtosis) [13] can be extracted in the frequency domain, time domain, and time-frequency domain. However, these methods are not stable for complex signals. Because many non-Gaussian engineering signals are impulsive and heavy-tailed, the alpha-stable distribution has been widely applied in various fields [21][22][23]. Three estimation methods for the alpha-stable distribution were compared theoretically in [21], and the results showed that the empirical characteristic function (ECF) method ranked first in stability and estimation accuracy. The kurtogram and a stable-distribution parameter have been proposed to detect incipient bearing faults in [22]. After an analysis of the stability and sensitivity of the parameters, optimal parameters were selected for bearing fault diagnosis in [23].
However, the symmetry parameter β and the location parameter δ are calculated from the estimated values of the characteristic exponent α and the scale parameter γ, so estimation errors in α and γ propagate cumulatively into β and δ. Meanwhile, because the characteristic function of the α-stable distribution is discontinuous at α = 1, the estimation error is particularly serious when β ≠ 0 and α → 1. Furthermore, the geometry of the bearing structure is symmetric, so the SαS distribution is a more accurate statistical model for describing the bearing signals. Therefore, to enhance the computational efficiency and recognition accuracy of rolling bearing diagnosis, this paper extracts fault features using the symmetric α-stable distribution.
After feature extraction and selection, the early fault of the axle box bearing should be detected by classifying the selected fault characteristics. Based on statistical learning theory, the support vector machine (SVM) is widely used in pattern classification and fault diagnosis of rotating machinery due to its high classification accuracy [18,19,[24][25][26][27]. Owing to its low complexity and improved computational efficiency, LS-SVM performs better in applications. The kernel function of LS-SVM is critical to the classification result. Multiple kinds of kernel functions, including the Polynomial Kernel, Gaussian Kernel, and Sigmoid Kernel, are applicable. In order to obtain better performance, the wavelet SVM (WSVM), which combines the Morlet wavelet kernel with SVM, is used here. Compared with the RBF kernel, the Morlet wavelet kernel yields a more reasonable hyperplane. Thus, this article employs LS-SVM with a wavelet kernel function and parameters optimized by PSO to enhance the accuracy of fault diagnosis.
The remainder of this paper is organized as follows. We briefly describe the FNLM and EEMD denoising methods in Section 2. The feature extraction based on the symmetric alpha-stable (SαS) distribution is presented in Section 3. Section 4 describes the proposed PSO-LS-WSVM method. In Section 5 the proposed approach is validated by experimental data. The conclusion is drawn in Section 6.

Fast Nonlocal Means Algorithm.
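Under the additive noise model, the noisy signal can be expressed as $v(s) = u(s) + n(s)$, where $u$ is the true signal and $n$ is additive noise. For a given sample $s$, the estimate $\hat{u}(s)$ of the signal is a weighted sum of the values within its search neighbourhood $\mathcal{N}(s)$:

$$\hat{u}(s) = \frac{1}{Z(s)} \sum_{t \in \mathcal{N}(s)} w(s,t)\, v(t), \tag{1}$$

where $Z(s) = \sum_{t \in \mathcal{N}(s)} w(s,t)$, and the weights are [14]

$$w(s,t) = \exp\left(-\frac{\sum_{\delta \in \Delta} \left(v(s+\delta) - v(t+\delta)\right)^2}{2 L_\Delta \lambda^2}\right) \equiv \exp\left(-\frac{d^2(s,t)}{2 L_\Delta \lambda^2}\right). \tag{2}$$

The similarity is measured via the squared Euclidean distance $d^2(s,t)$ between patches. The weight $w(s,t)$ takes a large value if the patch centered at $s$ is similar to the patch centered at $t$, and a small value otherwise. In (2), $\lambda$ is a bandwidth parameter, while $\Delta$ stands for a local patch of samples surrounding $s$, with $L_\Delta$ samples included. To reduce the computing time, the fast NLM has been proposed. For a signal of length $N$ and a given translation $d_x$, $S_{d_x}$ is the discrete integration (running sum) of the squared difference between the signal $v$ and its translation by $d_x$:

$$S_{d_x}(p) = \sum_{k=0}^{p} \left(v(k) - v(k+d_x)\right)^2, \quad p = 0, 1, \ldots, N-1. \tag{3}$$

Now let $d_x = t - s$ and define $p = s + \delta$; the patch is $\Delta = [-P, P]$. Thus, $d^2(s,t)$ can be rewritten as follows:

$$d^2(s,t) = \sum_{p=s-P}^{s+P} \left(v(p) - v(p+d_x)\right)^2. \tag{4}$$

Splitting the sum and using the identity in (3), we obtain

$$d^2(s,t) = S_{d_x}(s+P) - S_{d_x}(s-P-1). \tag{5}$$

This is the key expression: once $S_{d_x}$ has been computed, the weight for any pair of samples is obtained in constant time.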
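The running-sum trick described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `fast_nlm_1d`, the reflective edge padding, and the default patch, search, and bandwidth values are all hypothetical choices.

```python
import numpy as np

def fast_nlm_1d(v, patch_half_width=5, search_half_width=50, bandwidth=0.5):
    """Nonlocal means for a 1-D signal using running sums of squared differences.

    Assumes len(v) > patch_half_width + search_half_width (needed for reflect padding).
    """
    v = np.asarray(v, dtype=float)
    n = v.size
    P, M = patch_half_width, search_half_width
    L = 2 * P + 1                      # samples per patch
    pad = P + M
    vp = np.pad(v, pad, mode="reflect")

    num = np.zeros(n)                  # weighted sum of neighbour values
    Z = np.zeros(n)                    # sum of weights (normalisation)
    base = vp[pad - P: pad + n + P]    # v(p) for p = -P .. n-1+P
    for dx in range(-M, M + 1):
        shifted = vp[pad - P + dx: pad + n + P + dx]      # v(p + dx)
        sq = (base - shifted) ** 2
        S = np.concatenate(([0.0], np.cumsum(sq)))        # running sum
        d2 = S[L: L + n] - S[:n]       # patch distance for every s at once
        w = np.exp(-d2 / (2.0 * L * bandwidth ** 2))
        num += w * vp[pad + dx: pad + dx + n]
        Z += w
    return num / Z
```

For each translation the patch distances for all samples come from one cumulative sum and one subtraction, which is where the saving over the direct patch-by-patch computation arises. The bandwidth would normally be tuned to the noise level of the measured signal.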

Ensemble Empirical Mode Decomposition.
As an improved version of EMD, EEMD reduces the mode mixing effect. The algorithm is as follows [17].
(1) Add white noise $n_m(t)$ with the given amplitude to the original signal $x(t)$ to generate a new signal:

$$x_m(t) = x(t) + n_m(t),$$

where $x_m(t)$ represents the noise-added signal of the $m$th trial, while $m = 1, 2, \ldots, M$.
(2) Decompose the noise-added signal $x_m(t)$ into $I$ IMFs $c_{i,m}(t)$ ($i = 1, 2, \ldots, I$) using EMD, where $c_{i,m}(t)$ denotes the $i$th IMF of the $m$th trial.
(3) Repeat steps (1) and (2) while $m < M$, with a different white noise series each time, to acquire an ensemble of IMFs.
(4) Calculate the ensemble mean of the corresponding IMFs of the decompositions; the final result is

$$c_i(t) = \frac{1}{M} \sum_{m=1}^{M} c_{i,m}(t),$$

where $c_i(t)$ is the $i$th IMF decomposed by EEMD, while $i = 1, 2, \ldots, I$ and $m = 1, 2, \ldots, M$. The IMFs cover different frequency bands ranging from high to low. In this study, the first five IMFs are chosen for analysis.
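The ensemble step can be sketched as follows. This is only an outline under assumptions: the EMD itself is not implemented here, and `emd_fn` is a hypothetical callable (any EMD routine that returns an array of IMFs, one per row) supplied by the caller. Scaling the noise by the signal's standard deviation is also an assumption.

```python
import numpy as np

def eemd(x, emd_fn, n_trials=100, noise_std=0.2, n_imfs=5, seed=0):
    """Average the first n_imfs IMFs over n_trials noise-added copies of x.

    emd_fn is a user-supplied (hypothetical) EMD routine: signal -> array (k, len(x)).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    acc = np.zeros((n_imfs, x.size))
    for _ in range(n_trials):
        # Step (1): add a fresh white-noise series to the original signal.
        noisy = x + noise_std * np.std(x) * rng.standard_normal(x.size)
        # Step (2): decompose the noise-added signal with EMD.
        imfs = np.asarray(emd_fn(noisy))[:n_imfs]
        # Trials that yield fewer IMFs contribute zeros to the missing rows.
        acc[:imfs.shape[0]] += imfs
    # Step (4): ensemble mean of the corresponding IMFs.
    return acc / n_trials
```

Because the added noise series are independent, their contribution to the ensemble mean shrinks roughly as one over the square root of the number of trials, which is why a larger ensemble gives cleaner IMFs at a proportionally higher cost.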

Symmetric Alpha-Stable Distribution
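3.1. Alpha-Stable Distribution. The alpha-stable distribution provides useful models for non-Gaussian signals with impulsive waveforms and heavy-tailed probability densities. Although the probability density function of an alpha-stable random variable generally cannot be given in closed form, its characteristic function can always be written as follows:

$$\varphi(t) = \exp\left\{ j\delta t - \gamma |t|^{\alpha} \left[ 1 + j\beta\, \mathrm{sign}(t)\, \omega(t,\alpha) \right] \right\},$$

where

$$\omega(t,\alpha) = \begin{cases} \tan(\alpha\pi/2), & \alpha \neq 1, \\ (2/\pi)\log|t|, & \alpha = 1, \end{cases} \qquad \mathrm{sign}(t) = \begin{cases} 1, & t > 0, \\ 0, & t = 0, \\ -1, & t < 0. \end{cases}$$

Thus, the alpha-stable distributions form a four-parameter family denoted by $S(\alpha, \beta, \gamma, \delta)$. The first parameter $\alpha$ ($0 < \alpha \le 2$) is the characteristic exponent, which describes the tail of the density function. The second parameter $\beta$ ($-1 \le \beta \le 1$) is the symmetry parameter, which controls the skewness. The parameters $\gamma$ ($\gamma > 0$) and $\delta$ ($-\infty < \delta < +\infty$) are the scale parameter and the location parameter, respectively.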

Symmetric Alpha-Stable Distribution.
In the case of $\beta = 0$, the distribution is symmetric about $\delta$ and is called symmetric alpha-stable (SαS); its characteristic function is

$$\varphi(t) = \exp\left( j\delta t - \gamma |t|^{\alpha} \right).$$

Furthermore, because of its large estimation error, the location parameter $\delta$ cannot describe the health condition of bearings. Thus, $\delta$ is set to zero to improve processing speed, and the characteristic function can be rewritten as

$$\varphi(t) = \exp\left( -\gamma |t|^{\alpha} \right).$$

Empirical Characteristic Function Parameter Estimation Method.
In practical engineering applications, real-time parameter estimation from a random sequence is crucial when using the alpha-stable distribution. In the literature, there are three major methods for obtaining the parameter values: (1) the quantile method, (2) the logarithmic moment method, and (3) the empirical characteristic function method. According to the comparative analysis in [21], the empirical characteristic function approach has the highest estimation accuracy and the best stability for the four parameters of the alpha-stable distribution. The parameter estimation process based on ECF is described as follows [28]: (1) Calculate the sample characteristic function:

$$\hat{\varphi}(t) = \frac{1}{N} \sum_{k=1}^{N} \exp(j t x_k),$$

where $x_k$ ($k = 1, 2, \ldots, N$) are the samples of the random variable.
The SαS distribution densities with different α and γ values are shown in Figure 1. For the faulty bearing signal, the defect characteristic parameters that represent the health status of the bearing, such as the characteristic exponent α, the scale parameter γ, and the maximum PDF (MPDF) value, can be obtained by the SαS distribution method.
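To make the role of the sample characteristic function concrete, the sketch below estimates the two SαS parameters from it by a simple regression. It relies on the zero-location SαS form, whose modulus is exp(-gamma |t|^alpha), so that log(-log|phi(t)|) is linear in log|t| with slope alpha and intercept log(gamma). This is a simplified regression-type estimator written for illustration and is not necessarily the procedure of [28]; the names `ecf` and `fit_sas_ecf` and the grid of `t` values are assumptions, and the samples are assumed to be roughly unit-scale.

```python
import numpy as np

def ecf(x, t):
    """Sample characteristic function (1/N) * sum_k exp(j * t * x_k) on a grid t."""
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    return np.exp(1j * np.outer(t, x)).mean(axis=1)

def fit_sas_ecf(x, t_grid=None):
    """Estimate (alpha, gamma) of a zero-location SaS sample by linear regression.

    Uses log(-log|phi(t)|) = log(gamma) + alpha * log(t) for t > 0.
    """
    if t_grid is None:
        t_grid = np.linspace(0.2, 1.0, 9)   # assumed grid; suits unit-scale data
    modulus = np.abs(ecf(x, t_grid))
    y = np.log(-np.log(modulus))
    slope, intercept = np.polyfit(np.log(t_grid), y, 1)
    return slope, np.exp(intercept)
```

Two familiar special cases give a quick sanity check: standard normal samples correspond to alpha = 2 with gamma = 0.5, and standard Cauchy samples correspond to alpha = 1 with gamma = 1.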
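The density and its maximum can be evaluated numerically from the characteristic function. The sketch below assumes the zero-location SαS form exp(-gamma |t|^alpha), for which the density is f(x) = (1/pi) ∫_0^∞ exp(-gamma t^alpha) cos(x t) dt and the peak at x = 0 has the closed form Gamma(1/alpha) / (pi * alpha * gamma^(1/alpha)). The function names, the truncation point, and the grid size are illustrative assumptions.

```python
import math
import numpy as np

def sas_pdf(x, alpha, gamma, n_grid=20001):
    """SaS density by trapezoidal inversion of exp(-gamma * t**alpha)."""
    t_max = (36.0 / gamma) ** (1.0 / alpha)       # integrand below ~1e-15 beyond this
    t = np.linspace(0.0, t_max, n_grid)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    integrand = np.exp(-gamma * t ** alpha) * np.cos(np.outer(x, t))
    dt = t[1] - t[0]
    # Trapezoidal rule: full sum minus half of the two end points.
    integral = dt * (integrand.sum(axis=1)
                     - 0.5 * (integrand[:, 0] + integrand[:, -1]))
    return integral / math.pi

def sas_mpdf(alpha, gamma):
    """Maximum of the SaS density, attained at x = 0."""
    return math.gamma(1.0 / alpha) / (math.pi * alpha * gamma ** (1.0 / alpha))
```

With alpha = 2 and gamma = 0.5 the density reduces to the standard normal, and with alpha = 1 and gamma = 1 to the standard Cauchy, which makes both functions easy to check against known values.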

Bearing Defect Diagnosis Methodology Based on PSO-LS-WSVM
Kernel mapping is applied to map the data in the input space to a high-dimensional feature space, where the problem is linearly separable. Therefore, the kernel function is a critical factor for classification accuracy. Several types of kernel functions, including the Sigmoid Kernel, Gaussian (RBF) Kernel, and Polynomial Kernel, are used in many applications; the Gaussian Kernel in particular has been widely used due to its excellent performance. In recent years, the wavelet kernel, a type of multidimensional wavelet, has been shown to approximate arbitrary nonlinear functions, and Zhang et al. reported that the wavelet kernel performs better than the Gaussian Kernel [26].
Wavelet analysis represents a function with a family of functions generated by dilating and translating a mother wavelet function:

$$h_{a,c}(x) = |a|^{-1/2}\, h\!\left(\frac{x - c}{a}\right),$$

where $x, a, c \in \mathbb{R}$, $a$ is a dilation factor, $c$ is a translation factor, and $h(x)$ is the mother wavelet. The product of one-dimensional wavelet functions can be written as follows:

$$h(\mathbf{x}) = \prod_{i=1}^{N} h(x_i),$$

where $\mathbf{x} = (x_1, \ldots, x_N) \in \mathbb{R}^N$. If $\mathbf{x}, \mathbf{x}' \in \mathbb{R}^N$, the dot-product wavelet kernels are

$$K(\mathbf{x}, \mathbf{x}') = \prod_{i=1}^{N} h\!\left(\frac{x_i - c_i}{a}\right) h\!\left(\frac{x'_i - c'_i}{a}\right),$$

and the translation-invariant wavelet kernels are

$$K(\mathbf{x}, \mathbf{x}') = \prod_{i=1}^{N} h\!\left(\frac{x_i - x'_i}{a}\right).$$

Without loss of generality, a translation-invariant wavelet kernel can be constructed from the Morlet wavelet function:

$$h(x) = \cos(1.75x)\, \exp\!\left(-\frac{x^2}{2}\right). \tag{19}$$

The Morlet wavelet function is shown in Figure 2. Equation (19) defines the mother wavelet, from which the wavelet kernel can be described as follows:

$$K(\mathbf{x}, \mathbf{x}') = \prod_{i=1}^{N} \cos\!\left(1.75\, \frac{x_i - x'_i}{a}\right) \exp\!\left(-\frac{(x_i - x'_i)^2}{2a^2}\right).$$
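As an illustration of how a Morlet wavelet kernel plugs into an LS-SVM, the sketch below builds the translation-invariant kernel K(x, x') = prod_i cos(1.75 d_i / a) exp(-d_i^2 / (2 a^2)), with d = x - x', and trains a binary classifier by solving the standard LS-SVM linear system. It is a minimal sketch under assumptions: the function names, the regularization parameter `reg`, and the restriction to two classes with labels +1 and -1 are illustrative choices, and a multi-class problem would need an additional scheme such as one-vs-rest.

```python
import numpy as np

def morlet_kernel(X1, X2, a=1.0):
    """Translation-invariant Morlet wavelet kernel matrix between rows of X1 and X2."""
    X1 = np.atleast_2d(np.asarray(X1, dtype=float))
    X2 = np.atleast_2d(np.asarray(X2, dtype=float))
    d = (X1[:, None, :] - X2[None, :, :]) / a          # pairwise scaled differences
    return np.prod(np.cos(1.75 * d) * np.exp(-0.5 * d ** 2), axis=2)

def lssvm_train(X, y, a=1.0, reg=10.0):
    """Solve the binary LS-SVM system; returns the bias b and the multipliers."""
    y = np.asarray(y, dtype=float)
    n = y.size
    omega = np.outer(y, y) * morlet_kernel(X, X, a)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = omega + np.eye(n) / reg                # regularised kernel block
    rhs = np.concatenate(([0.0], np.ones(n)))
    sol = np.linalg.solve(A, rhs)
    return sol[0], sol[1:]

def lssvm_predict(X_train, y_train, b, multipliers, X_new, a=1.0):
    """Predict labels in {-1, +1} for the rows of X_new."""
    y_train = np.asarray(y_train, dtype=float)
    K = morlet_kernel(X_new, X_train, a)
    return np.sign(K @ (multipliers * y_train) + b)
```

Training reduces to a single linear solve rather than a quadratic program, which is the source of the lower computational cost usually attributed to LS-SVM. The dilation factor `a` and the regularization parameter are the two quantities a search procedure would tune.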

Particle Swarm Optimization for Parameter of LS-WSVM.
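Particle swarm optimization (PSO) is a population-based stochastic optimization technique inspired by the social behavior of bird flocking or fish schooling [29]. Compared with the genetic algorithm (GA) [30], PSO is easy to implement and has few parameters to adjust. Thus, it performs well on optimization problems.
In PSO, suppose that the search space is $D$-dimensional and that there are $N$ particles in the population. The position of particle $i$ in generation $t$ is expressed as the $D$-dimensional vector $X_i(t) = (x_{i1}(t), x_{i2}(t), \ldots, x_{iD}(t))$, and its velocity is the vector $V_i(t) = (v_{i1}(t), v_{i2}(t), \ldots, v_{iD}(t))$. The position and velocity of each particle are updated continuously according to the following formulas:

$$v_{id}(t+1) = w\, v_{id}(t) + c_1 r_1 \left(p_{id} - x_{id}(t)\right) + c_2 r_2 \left(p_{gd} - x_{id}(t)\right),$$

$$x_{id}(t+1) = x_{id}(t) + v_{id}(t+1),$$

where $t$ is the iteration index of the particle, $p_{id}$ and $p_{gd}$ are the best positions found so far by particle $i$ and by the whole swarm, $c_1$ and $c_2$ are the acceleration constants, $r_1$ and $r_2$ are random numbers uniformly distributed in $[0, 1]$, and $w$ is the inertia weight, which decreases linearly with the iteration number:

$$w = w_{\max} - \left(w_{\max} - w_{\min}\right) \frac{\mathrm{iter}}{\mathrm{iter}_{\max}},$$

where $w_{\min}$ is the minimal inertia weight, $w_{\max}$ is the maximal inertia weight, iter is the current iteration number, and $\mathrm{iter}_{\max}$ is the maximum iteration number. The optimization procedure is illustrated in Figure 3.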
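A compact version of PSO with a linearly decreasing inertia weight is sketched below for a generic objective function. It is an illustration under assumptions rather than the authors' code: the inertia bounds of 0.9 and 0.4, the velocity clamp at 20% of the search range, and the clipping of positions to the bounds are hypothetical choices, while the defaults of 20 particles, 100 generations, and acceleration constants of 2.0 follow the settings reported in the experiments.

```python
import numpy as np

def pso_minimize(f, lower, upper, n_particles=20, n_iter=100,
                 c1=2.0, c2=2.0, w_max=0.9, w_min=0.4, seed=0):
    """Minimise f over the box [lower, upper] with a basic particle swarm."""
    rng = np.random.default_rng(seed)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    dim = lower.size
    v_max = 0.2 * (upper - lower)                     # assumed velocity clamp

    x = rng.uniform(lower, upper, size=(n_particles, dim))
    v = rng.uniform(-v_max, v_max, size=(n_particles, dim))
    pbest = x.copy()
    pbest_val = np.array([f(p) for p in x])
    g = int(np.argmin(pbest_val))
    gbest, gbest_val = pbest[g].copy(), pbest_val[g]

    for it in range(n_iter):
        # Inertia weight decreases linearly from w_max towards w_min.
        w = w_max - (w_max - w_min) * it / n_iter
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        v = np.clip(v, -v_max, v_max)
        x = np.clip(x + v, lower, upper)

        vals = np.array([f(p) for p in x])
        improved = vals < pbest_val                   # update personal bests
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmin(pbest_val))
        if pbest_val[g] < gbest_val:                  # update global best
            gbest, gbest_val = pbest[g].copy(), pbest_val[g]
    return gbest, float(gbest_val)
```

For classifier tuning, `f` would be a function that maps a candidate parameter vector (for example, a kernel dilation factor and a regularization parameter) to a validation error, so each particle evaluation involves training and scoring one classifier.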

The Proposed Intelligent Bearing Fault Diagnosis Methodology.
On the basis of the strengths of FNLM, EEMD, SαS, and PSO-LS-WSVM, we propose a new bearing fault diagnosis approach for distinguishing normal bearings from multiple types of faulty bearing. Figure 4 shows the proposed procedure, and the steps are as follows.
(1) Samples of vibration signals are taken by acceleration sensors at a particular sampling frequency under various operating conditions.

Experimental Results.
To examine the effectiveness of the proposed approach, axle box bearing vibration data are used as an example. Figure 5 displays the experimental test on railway axle box bearings conducted on the test rig. The test rig for data acquisition consists of two motors, two friction wheels, a hydraulic loading installation, and control electronics (not shown). The experimental bearing is mounted on the wheelset, which is fixed on the test rig and driven by a friction wheel. Figure 6 shows the artificial defects on the components of the axle box bearing. Every fault condition has two sizes: the width is set as 0.1 mm, and the depth is set as 0.23 mm and 0.43 mm, respectively. Figure 7 shows the typical time-domain waveforms obtained by the FNLM denoising method and the EEMD decomposition algorithm. Generally, defect information is contained in the first five IMF components, which can be used to extract defect features using SαS. The detailed steps to extract SαS features have already been discussed in Section 3.2.
Table 3 shows the characteristic exponent values of all five modes. The third mode (c3) gives a better fault indication: the three values observed under the normal condition and under the abnormal conditions with inner race fault (0.43 mm) and roller fault (0.43 mm) are 1.1757, 1.1618, and 1.1409, respectively. These three values are very close. Therefore, the α values of mode c3 alone fail to show the difference in bearing health status. The scale parameter and MPDF values of mode c3 in Tables 4 and 5 show the distinctions.
After feature extraction, the different feature sets, including α, γ, MPDF, and the hybrid feature set, are used as input to the wavelet-based LS-SVM for fault diagnosis. Based on experience and experimental tests, and taking computational complexity into consideration, the parameters of the PSO optimization are set as follows: the number of particles is 20, both acceleration constants are 2.0, and the number of evolutionary generations is 100. As shown in Table 6, when 40 training samples and 40 testing samples are used as the input to the proposed classifier, it yields recognition rates of 88.57% for the α feature set, 92.86% for the γ feature set, 93.57% for the MPDF feature set, and 95.71% for the hybrid feature set. This shows that the hybrid feature set contains more information characterizing the condition of the axle box bearing. Moreover, Figure 9 shows that the time consumed with the Morlet wavelet kernel is shorter than with the RBF kernel.
Because the performance of diagnosis methods is closely related to the number of training samples, we study the recognition rates for different numbers of training samples. The results are shown in Figure 10, which indicates that the proposed method reaches higher recognition accuracy than the RBF-based LS-SVM for different numbers of training samples. As the number of samples increases, the classification accuracy also rises, and the proposed approach performs well even with a very small number of samples.
To obtain better recognition accuracy, many optimization algorithms, including PSO, GA, and Grid Search (GS), have been combined with the LS-SVM classifier. Since the PSO algorithm is used in this work, the GA and Grid Search algorithms are compared with PSO for parameter optimization. The parameters of GA are set as follows: the population size is 20, the iteration number is 100, and the crossover probability and the mutation probability are 0.5 and 0.1, respectively. The comparison result for 40 samples of each fault class is shown in Table 7 and Figure 11; the classification accuracy of PSO-LS-WSVM is 95.71%, compared with 92.14% and 93.57% using GS-LS-WSVM and GA-LS-WSVM, respectively. As shown in Figure 11, the time consumed by PSO-LS-WSVM is longer than that of the other two approaches. The main reason is that PSO is not good at binary coding. Moreover, the average classification accuracy of the wavelet-based LS-SVM optimized by PSO, GA, and GS is compared for different numbers of training and testing samples. From Figure 12, it can be seen that the recognition rate of the PSO algorithm is clearly higher than that of the other two methods. It can be concluded that the classification accuracy is affected by the number of training samples. It should be noted that the Morlet-LSSVM method performs better than RBF-LSSVM for small feature datasets. From the above analysis, the result of RBF-LSSVM is seriously affected by the number of training samples when the sample size is small. The training time and the testing time of the classifiers depend on the sample size and on how the program is coded. Hence, under the same conditions, the smaller the sample size, the less time is consumed. Furthermore, the time consumed by the Morlet-LSSVM method is less than that of the RBF-LSSVM. This phenomenon may be attributed to the fact that the Morlet kernel is approximately orthonormal, whereas the RBF kernel is not.

Conclusion
We proposed a novel bearing multifault diagnosis method based on FNLM and EEMD for denoising, the symmetric alpha-stable (SαS) distribution for feature calculation, and a PSO-LS-WSVM classifier. The experimental results suggest that the FNLM-EEMD denoising method improves the efficiency of defect feature extraction, and that the proposed SαS parameter extraction method produces discriminative and efficient features for fault diagnosis. By combining the SαS feature parameters with LS-SVM classifiers based on the Morlet wavelet kernel and on the RBF kernel, and then optimizing each with different algorithms, the classification capacity of these methods has been studied under various sizes of training and testing samples. All results reveal that the wavelet-LSSVM performs better than the RBF-LSSVM when the data sample is very small. Owing to its higher recognition accuracy and computational efficiency, bearing fault diagnosis based on FNLM-EEMD, SαS, and the PSO-LS-WSVM classifier is an effective and powerful tool for monitoring the health status of axle box bearings.

Figure 1: (a) The pdf of SαS with different α values; (b) the pdf of SαS with different γ values.

Figure 4: Structure diagram of the proposed fault diagnosis algorithm.
As can be seen in Figures 8(a), 8(b), and 8(c), no single one of the three parameters can identify the different fault types, but when the three parameters are combined, samples of the same class cluster well, as shown in Figure 8(d).

Figure 6: Artificial defects on the components of the axle box bearing: (a) axle box bearing; (b) defect on the outer race; (c) defect on the inner race; (d) defect on the rollers.

Figure 9: The consuming time of RBF kernel and Morlet kernel, respectively.

Figure 10: The diagnostic accuracy based on the Morlet wavelet kernel and RBF kernel LS-SVM optimized by PSO, respectively.

Figure 12: The diagnostic accuracy based on the LS-WSVM optimized by PSO, GA, and GS, respectively.

Table 1: Geometrical parameters of the bearings.

As shown in Table 2, the present research needs to distinguish 7 classes in total. For each condition, 80 samples are obtained. The gathered original signals are divided into training samples and testing samples for each condition, with each sample containing 5000 data points. The training samples are used to train the classifier model, and the testing samples are used to evaluate the effectiveness of the proposed fault diagnosis method.

Table 2: Specifications of the bearing defects.
The parameters α and γ and the MPDF value of 40 training samples are plotted in Figures 8(a), 8(b), and 8(c), respectively.

Table 6: Fault diagnosis result of the proposed fault diagnosis method based on SαS and PSO-LS-WSVM.

Table 7: Recognition accuracies of the feature classification with different methods.