A Novel Faults Diagnosis Method for Rolling Element Bearings Based on ELCD and Extreme Learning Machine

A rolling bearing fault diagnosis method based on ensemble local characteristic-scale decomposition (ELCD) and extreme learning machine (ELM) is proposed. Vibration signals were decomposed using ELCD, and numerous intrinsic scale components (ISCs) were obtained. Next, time-domain index, energy, and relative entropy of intrinsic scale components were calculated. According to the distance-based evaluation approach, sensitivity features can be extracted. Finally, sensitivity features were input to extreme learning machine to identify rolling bearing fault types. Experimental results show that the proposed method achieved better performance than support vector machine (SVM) and backpropagation (BP) neural network methods.


Introduction
Rolling bearings are among the most important components of mechanical equipment and are found in a wide range of industrial applications. Because of this widespread use, rolling bearing fault diagnosis is critical for preventing catastrophic machine failure and the resulting economic losses [1,2]. The status of rolling element bearings is typically monitored by processing vibration signals [3]. When a fault occurs, the collected vibration signals are nonstationary. Hence, reliable fault detection systems need to adopt appropriate methods to process vibration signals.
Traditional signal processing methods such as wavelet and Fourier transforms are widely used to process vibration signals. Rafiee et al. applied wavelets to fault diagnosis of rolling bearings and obtained good results [4]. Short-time Fourier transforms have also performed well in mechanical fault diagnosis [5]. However, wavelet and Fourier transform methods cannot accurately analyze nonstationary vibration signals because of their poor adaptability. Empirical mode decomposition (EMD) is a classical time-frequency analysis method that has been widely adopted in mechanical fault diagnosis, earthquake monitoring, and the state monitoring of bridges and other constructions [6-8]. However, EMD suffers from over-envelope, under-envelope, end-effect, and other shortcomings [9]. Local mean decomposition (LMD) has been widely used in fields such as electroencephalogram (EEG) processing and mechanical fault diagnosis because of its strong ability to deal with nonstationary signals and its superior time-frequency analysis performance. However, LMD requires a large amount of iterative computation and also suffers from end effects [10,11]. Recently, Cheng et al. proposed a new self-adaptive signal processing method, local characteristic-scale decomposition (LCD), which decomposes a nonstationary signal into several intrinsic scale components (ISCs) [12-14]. By analyzing each ISC, characteristic information of the original signal can be extracted effectively and with higher accuracy. Owing to its superior time-frequency analysis performance, LCD has been widely used to analyze nonstationary signals in mechanical fault diagnosis. As with EMD, however, LCD also suffers from mode-mixing [13]. Therefore, an improved LCD method, ensemble local characteristic-scale decomposition (ELCD), has been proposed to decompose vibration signals.

Shock and Vibration
This effectively eliminates mode-mixing and allows accurate intrinsic scale components to be obtained.
There are two major challenges in the development of real-time fault diagnostic systems. The first is that a large amount of data is collected from the real-time monitoring system, and these data are multivariate and nonlinear. The second is the demand for fault identification within a short time. It is well known that only a few seconds are needed for a fault to propagate and cause catastrophic failure, which would cause significant financial loss and could result in injury or death of personnel. Therefore, if any fault exists, a diagnostic system should be able to detect it immediately and send an alarm signal to the control center so that the necessary corrective action can be taken at once. Conventional pattern recognition methods, such as the backpropagation (BP) neural network and the support vector machine (SVM), are widely applied for fault diagnosis [15,16]. Yang et al. distinguished signals at different corrosion stages using BP neural networks in the acoustic emission testing of a tank bottom [17,18]. Nevertheless, the BP neural network requires many parameter settings, converges slowly, and is easily trapped in a local minimum. These issues restrict the accuracy and wide application of the diagnosis [19,20]. Compared with the BP neural network, SVM offers greatly improved generalization performance but requires manual selection of the kernel function and its parameters, which significantly restricts its application [21,22]. The extreme learning machine (ELM) is a new classifier based on neural networks [23,24]. In theory, this algorithm tends to provide good generalization performance at an extremely fast learning speed. As a result, it is widely used in gear fault diagnosis, energy fields, and sales forecasting. Moreover, ELM has been shown to require less human intervention and less running time than SVM [25]. Because of these advantages, ELM is adopted to realize real-time state classification of rolling bearings under variable conditions.
In this work, a new method based on ELCD and ELM is proposed to identify different rolling bearing working conditions. First, ELCD is used to decompose vibration signals into multiple intrinsic scale components. Next, feature values are calculated from the ISCs, and a distance-based evaluation method is applied to them to select the features that are sensitive to the different working conditions. Finally, these features are input into the ELM to identify rolling bearing fault patterns.

Ensemble Local Characteristic-Scale Decomposition.
LCD is a new self-adaptive signal decomposition method. Any two decomposed ISCs are mutually independent, and their instantaneous frequencies have physical significance. An ISC needs to meet the following two conditions [12,13].
(1) A signal $x(t)$ should have positive maxima and negative minima, and the signal should be monotonic between any adjacent maximum and minimum.
(2) For the data, let all the extreme points be denoted as $(\tau_k, X_k)$, $k = 1, 2, \ldots, M$. The line formed by two adjacent extreme points of the same kind, $(\tau_k, X_k)$ and $(\tau_{k+2}, X_{k+2})$, takes the value $A_{k+1}$ at $\tau_{k+1}$, specified as follows:
$$A_{k+1} = X_k + \frac{\tau_{k+1} - \tau_k}{\tau_{k+2} - \tau_k}\,(X_{k+2} - X_k).$$
Then, the relation
$$a A_{k+1} + (1 - a) X_{k+1} = 0$$
should be true, where $a \in (0, 1)$ is a constant, typically $a = 0.5$.
Any complex signal $x(t)$ may have its LCD result written as follows:
$$x(t) = \sum_{p=1}^{n} \mathrm{ISC}_p(t) + r_n(t),$$
where $r_n(t)$ denotes the residual component.
Intrinsic scale components with different characteristic scales are obtained via the LCD method. Mode-mixing in the decomposition process generates some ISCs that have unclear physical meaning. Therefore, the ensemble local characteristic-scale decomposition (ELCD) method is used in this study for signal processing. This method addresses the mode-mixing problem by exploiting a statistical property of white noise, namely, that its frequencies are evenly distributed. White noise of finite amplitude is repeatedly added to the signal to form a composite signal. Each composite signal is decomposed using LCD, and the resulting components are averaged over all repetitions. In this way, the mode-mixing effect of the LCD method is eliminated. The ELCD algorithm is shown in Figure 1.
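The noise-assisted ensemble averaging described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the function name `elcd`, its parameters, and the requirement that the supplied `decompose` callable always returns the same number of components are all assumptions made here. The stand-in `moving_average_split` is a trivial placeholder and is not LCD; a real LCD routine would be passed in its place.

```python
import numpy as np


def elcd(signal, decompose, n_trials=120, noise_ratio=0.01, seed=0):
    """Ensemble-average the components of a noise-assisted decomposition.

    `decompose` is a hypothetical callable mapping a 1-D signal to an array
    of shape (n_components, n_samples); it must return the same shape on
    every call so that the components can be averaged.
    """
    signal = np.asarray(signal, dtype=float)
    rng = np.random.default_rng(seed)
    # White-noise amplitude is set relative to the signal standard deviation.
    noise_std = noise_ratio * signal.std()
    total = None
    for _ in range(n_trials):
        noisy = signal + rng.normal(0.0, noise_std, size=signal.shape)
        comps = np.asarray(decompose(noisy), dtype=float)
        total = comps if total is None else total + comps
    # Averaging over the trials cancels most of the added noise.
    return total / n_trials


def moving_average_split(x, width=9):
    """Placeholder decomposer (NOT LCD): fast residual plus slow trend."""
    kernel = np.ones(width) / width
    slow = np.convolve(x, kernel, mode="same")
    return np.vstack([x - slow, slow])


t = np.arange(500) / 500.0
x = np.sin(2 * np.pi * 5 * t)
components = elcd(x, moving_average_split)
```

The defaults `n_trials=120` and `noise_ratio=0.01` mirror the settings reported for the simulation in this paper; other values may suit other signals.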

Algorithm Simulation.
In order to verify the algorithm, an impact component, a high-frequency sinusoidal wave, and a low-frequency sinusoidal wave are used to form a simulation signal. The results are shown in Figures 2(a)-2(d).
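A test signal of this kind can be built as in the following sketch. The paper does not state the frequencies, amplitudes, or impact shape, so every numeric value below (sampling rate, 5 Hz and 50 Hz sine waves, decaying 200 Hz bursts at 0.25 s and 0.65 s) is a hypothetical choice made only for illustration, as is the function name `simulated_signal`.

```python
import numpy as np


def simulated_signal(fs=1000, duration=1.0, f_low=5.0, f_high=50.0):
    """Return time axis, the three components, and their sum.

    All parameter values are illustrative assumptions, not the paper's.
    """
    n = int(fs * duration)
    t = np.arange(n) / fs
    low = np.sin(2 * np.pi * f_low * t)
    high = 0.5 * np.sin(2 * np.pi * f_high * t)
    # Impact component: short, exponentially decaying oscillatory bursts.
    impact = np.zeros(n)
    for start in (0.25, 0.65):
        mask = (t >= start) & (t < start + 0.03)
        tau = t[mask] - start
        impact[mask] = 0.8 * np.exp(-100.0 * tau) * np.sin(2 * np.pi * 200.0 * tau)
    mixed = impact + high + low
    return t, impact, high, low, mixed


t, impact, high, low, mixed = simulated_signal()
```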
The simulated signal is decomposed by the LCD method and by the ELCD method. In this study, the white noise added in ELCD has an amplitude of 0.01 times the signal standard deviation, and the ensemble average is taken over 120 repetitions. The results are shown in Figures 3 and 4.
Figure 3 shows the decomposition results obtained with the LCD method. It can be seen that the decomposed components of the simulated signal exhibit mode-mixing: high-frequency components and distorted components are both present in ISC1 and ISC2. As seen in Figure 4, the impact component and the high-frequency and low-frequency sine waves are accurately separated by ELCD, and mode-mixing does not occur.

Figure 1: Flowchart of the ELCD algorithm (local characteristic-scale decomposition of the noise-added signal; obtain ISCs; repeat n times; calculate the ensemble mean of the ISCs to give the final intrinsic scale components).

Feature Extractions
The following symbols are used: $\alpha$: skewness; $\beta$: kurtosis; $\mu$ and $\sigma$: the signal's mean and standard deviation; CF: peak indicator; SF: waveform index; IF: pulse index; CLF: margin index; and $E$: energy. The K-L divergence is calculated as follows.
A nonparametric estimation method is used to calculate the probability distributions of the signals, and the K-L distance is then given as
$$\delta(p, q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)},$$
where $p(x)$ and $q(x)$ are the probability distributions of the two signals.
The K-L divergence $D(p, q)$ is then calculated as
$$D(p, q) = \delta(p, q) + \delta(q, p).$$

The distance-based evaluation approach was proposed in [17]. Principal features are chosen from the entire feature set using this approach. It is one of the most popular feature selection methods and is widely used alongside the Pearson correlation coefficient and information gain [2].
The basic idea of the distance-based evaluation method is that a good feature yields small distances between samples of the same class and large distances between samples of different classes. The steps involved in this method are as follows.
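(1) Evaluate the average distance between samples of the same class, where the distance is given by
$$d_{c,j} = \frac{1}{M_c (M_c - 1)} \sum_{l,m=1}^{M_c} \left| q_{m,c,j} - q_{l,c,j} \right|, \quad l \neq m; \; c = 1, 2, \ldots, C; \; j = 1, 2, \ldots, J.$$
Here, $M_c$ stands for the number of samples belonging to the $c$th class; $J$ is the size of the feature set; and $q_{m,c,j}$ is the value of the $j$th feature of the $m$th sample in the $c$th class. The average distance $d_j^{(w)}$ of the $j$th feature over all $C$ classes is given by
$$d_j^{(w)} = \frac{1}{C} \sum_{c=1}^{C} d_{c,j}.$$
(2) Compute the average value of each feature within each class,
$$u_{c,j} = \frac{1}{M_c} \sum_{m=1}^{M_c} q_{m,c,j},$$
and evaluate the average distance $d_j^{(b)}$ between the $C$ different classes:
$$d_j^{(b)} = \frac{1}{C (C - 1)} \sum_{c,e=1}^{C} \left| u_{e,j} - u_{c,j} \right|, \quad c \neq e,$$
where $c$ and $e$ are two different classes.
(3) Calculate the assessment factor of the $j$th feature by
$$\eta_j = \frac{d_j^{(b)}}{d_j^{(w)}}.$$
The assessment factor reflects the sensitivity of the feature. Larger assessment factors denote more sensitive features.
(4) Calculate the threshold value and take each feature whose assessment factor is greater than the threshold as a sensitive feature. Repeated experiments showed that a value of 2 leads to optimal results.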
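The following Python sketch illustrates steps (1) to (3): for each feature it computes the ratio of the average between-class distance to the average within-class distance. It assumes the common form of the distance evaluation technique (mean absolute pairwise differences), and the function names `distance_evaluation` and `select_sensitive` are hypothetical. Because the threshold expression of step (4) is not reproduced here, the threshold is simply passed in as an argument.

```python
import numpy as np


def distance_evaluation(features, labels):
    """Assessment factor per feature: between-class / within-class distance.

    features: array of shape (n_samples, J); labels: array of n_samples.
    Assumes every class has at least two samples and nonzero spread.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    within, centers = [], []
    for c in classes:
        q = features[labels == c]                       # (M_c, J)
        m = q.shape[0]
        pair = np.abs(q[:, None, :] - q[None, :, :])    # all ordered pairs
        within.append(pair.sum(axis=(0, 1)) / (m * (m - 1)))
        centers.append(q.mean(axis=0))
    d_w = np.mean(within, axis=0)                       # within-class distance
    u = np.array(centers)                               # (C, J) class means
    n_cls = len(classes)
    d_b = np.abs(u[:, None, :] - u[None, :, :]).sum(axis=(0, 1))
    d_b = d_b / (n_cls * (n_cls - 1))                   # between-class distance
    return d_b / d_w


def select_sensitive(factors, threshold):
    """Indices of features whose assessment factor exceeds the threshold."""
    return np.flatnonzero(np.asarray(factors) > threshold)


# Toy data: feature 0 separates the classes, feature 1 does not.
X = np.array([[0.0, 1.0], [0.1, 5.0], [0.2, 3.0],
              [10.0, 2.0], [10.1, 4.0], [10.2, 3.0]])
y = np.array([0, 0, 0, 1, 1, 1])
factors = distance_evaluation(X, y)
selected = select_sensitive(factors, threshold=2.0)
```

In the toy data, only the first feature has well-separated class means relative to its within-class spread, so it is the only one selected.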

Extreme Learning Machine
ELM, proposed by Huang et al., was originally developed for single-hidden-layer feedforward neural networks and then extended to "generalized" single-hidden-layer feedforward networks (SLFNs). ELM is a learning algorithm with faster learning speed and better generalization performance [24,25]. Details of the ELM algorithm can be found in [24]. The ELM output expression reads
$$\sum_{i=1}^{L} \beta_i \, g(w_i \cdot x_j + b_i) = o_j, \quad j = 1, 2, \ldots, N,$$
where $w_i$, $b_i$, and $\beta_i$ are the input weight, hidden-layer bias, and output weight, respectively. The input and output vectors are denoted by $x_j$ and $o_j$. The number of samples and the activation function are denoted by $N$ and $g$, respectively. For the latter, a sigmoid function is often used in practice. Assuming $N$ samples $\{(x_j, t_j)\}_{j=1}^{N}$ and $L$ hidden-layer nodes for training and testing, the ELM procedure is as follows:
(1) Initialize the weights $w_i$ and biases $b_i$ and keep them fixed.
(2) Calculate the hidden-layer output matrix $H$.
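A minimal ELM classifier can be sketched as follows. This is an illustrative sketch rather than the implementation used in the paper: the class name `ELMClassifier`, the uniform initialization range, and the one-hot target encoding are assumptions. In addition to the steps listed above, standard ELM training solves for the output weights with the Moore-Penrose pseudoinverse of the hidden-layer output matrix, which is the last step in `fit`.

```python
import numpy as np


class ELMClassifier:
    """Single-hidden-layer ELM with a sigmoid activation (sketch)."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden-layer output matrix H for the inputs X.
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y)
        self.classes_ = np.unique(y)
        # One-hot target matrix T (an assumed encoding).
        T = (y[:, None] == self.classes_[None, :]).astype(float)
        # Step 1: random input weights and biases, kept fixed afterwards.
        self.W = self.rng.uniform(-1.0, 1.0, size=(X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        # Step 2: hidden-layer output matrix.
        H = self._hidden(X)
        # Standard final step: least-squares output weights via pseudoinverse.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        H = self._hidden(np.asarray(X, dtype=float))
        return self.classes_[np.argmax(H @ self.beta, axis=1)]


# Toy usage on two well-separated clusters (synthetic data, not bearing data).
rng = np.random.default_rng(1)
X_train = np.vstack([rng.normal(-2.0, 0.3, size=(40, 2)),
                     rng.normal(2.0, 0.3, size=(40, 2))])
y_train = np.array([0] * 40 + [1] * 40)
model = ELMClassifier(n_hidden=20, seed=0).fit(X_train, y_train)
train_accuracy = np.mean(model.predict(X_train) == y_train)
```

Because only the output weights are fitted, and in closed form, training involves no iterative weight updates, which is the source of the speed advantage claimed for ELM over BP networks.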
In this section, the proposed rolling bearing fault detection method based on ELCD and ELM is presented. Fault features can be obtained by processing the vibration signals collected by multiple sensors. As mentioned previously, some of these features are associated with fault information and others are irrelevant. Therefore, sensitive features of the fault are further extracted from these parameters. ELM is then used to identify rolling bearing fault patterns. The process of fault diagnosis using ELCD and ELM is shown schematically in Figure 5.

Experiments.
This study used the rolling bearing data set of Case Western Reserve University (USA). The experiments used 6205-2RS JEM SKF deep groove ball bearings, with a rotating motor load of 735.5 W. The rolling bearing speed was set to 1797 rpm, and electrical discharge machining (EDM) was used to introduce faults with a diameter of 0.3556 mm and a depth of 0.2794 mm. The sensor sampling frequency was 12 kHz. Vibration signals were collected in four working states (normal state, rolling element fault, inner race fault, and outer race fault), with each data sample having a length of 2500 points. The four collected signals are shown in Figure 6. The ISCs obtained by ELCD from a vibration signal of the inner race fault are shown in Figure 7.
As can be seen in Figure 7, eight ISCs were derived. Then, the skewness, kurtosis, peak indicator, waveform index, pulse index, margin index, energy, and relative entropy of the 8 ISCs were calculated. This yielded a series of features, some of which contained principal information while others contained little information. Therefore, the distance-based evaluation approach was adopted to calculate the distance factor and threshold value, as shown in Figure 8.
Figure 8 shows that 15 sensitive features are obtained by applying the threshold. In order to compare ELCD and LCD, the sensitive features of the rolling bearing in different working conditions were calculated and averaged over multiple experiments; the comparison results are shown in Figure 9.
Figure 9 shows that, because LCD suffers from mode-mixing, the sensitive features it yields have unevenly distributed means and show no obvious differences between working conditions. With ELCD, the sensitive features are evenly distributed, since mode-mixing in the decomposition is overcome. To accurately identify faults in different working conditions, the ELM classifier is employed.

Pattern Recognition.
In accordance with the results of the previous section, sensitive features were calculated for the different vibration signals and chosen as input data to train and test the extreme learning machine. For comparison, this study processed the vibration signals with the ELCD, LCD, EMD, and LMD methods. Results for the test samples are shown in Figure 10, where the other methods are compared with ELCD. The test accuracy of the methods is listed in Table 1.
As seen in Figure 11 and Table 1, both the EMD and LCD methods suffer from mode-mixing, and their results on the test samples are therefore poor. The LMD method involves a large amount of iterative computation and suffers from end effects; as a result, the test accuracy of LMD-ELM is also low. The ELCD method effectively eliminates mode-mixing and yields accurate intrinsic scale components. Hence, the experimental results show that the ELCD-ELM method can effectively identify different rolling bearing working conditions, with a recognition rate higher than that of the other methods. Sixty groups of data were chosen, of which 40 groups were used for training and 20 groups for testing. Three classifiers, SVM, BP, and ELM, were trained and tested on these data. Test results are shown in Figure 11.
As seen in Figure 11, all three classifiers can distinguish the different conditions of the rolling bearing, but compared with BP and SVM, the ELM classifier achieves the highest mean recognition rate while requiring less human intervention and less running time.

Conclusions
The collected vibration signals are often mixed with substantial ambient noise, which obscures the fault signal features needed for rolling bearing fault diagnosis. In this study, a novel fault diagnosis methodology for rolling bearings based on ELCD and ELM is proposed. The ELCD method is used to process nonstationary vibration signals and to overcome the mode-mixing of the LCD method. A distance-based evaluation method is adopted to select the bearing features that are sensitive to different working conditions. To address the disadvantages of traditional BP and SVM classifiers, such as complex parameter setting and slow convergence, ELM is used to identify rolling bearing fault patterns. Theoretical analysis and experimental results show that the ELCD-ELM method has higher accuracy than the other methods.

Feature Calculation.
Single time-domain or frequency-domain features cannot effectively represent mechanical faults and suffer from low diagnostic accuracy and low universality. In this work, time-domain, frequency-domain, and other parameters are used together to represent different rolling bearing working conditions. As dimensionless indices, skewness, kurtosis, peak indicator, waveform index, pulse index, and margin index can be used to represent rolling bearing fault features. These quantities are widely used in mechanical fault diagnosis [2]. Kullback-Leibler (K-L) divergence is also called relative entropy. It can be used to measure the similarity of two signals. The components decomposed from different vibration signals differ in their similarity to the original signal, and hence in their K-L divergence. Energy reflects signal strength, and bearings in different working conditions have different energy in different frequency bands. For any signal $x = (x_1, x_2, \ldots, x_n)$, the above parameter indicators are defined as follows:
$$\alpha = \frac{\sum_{i=1}^{n} (x_i - \mu)^3}{(n - 1)\,\sigma^3}, \qquad \beta = \frac{\sum_{i=1}^{n} (x_i - \mu)^4}{(n - 1)\,\sigma^4}.$$
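The feature set described in this section can be computed as in the sketch below. The definitions of the peak indicator (CF), waveform index (SF), pulse index (IF), margin index (CLF), and energy follow the usual textbook forms and are assumed, not confirmed, to match those used in the paper. The K-L estimate uses a simple histogram as the nonparametric density estimate, which is also an assumption; the function names, the number of bins, and the smoothing constant `eps` are hypothetical.

```python
import numpy as np


def time_domain_features(x):
    """Skewness, kurtosis, dimensionless indices, and energy of a signal."""
    x = np.asarray(x, dtype=float)
    n = x.size
    mu = x.mean()
    sigma = x.std(ddof=1)              # sample standard deviation
    abs_x = np.abs(x)
    peak = abs_x.max()
    rms = np.sqrt(np.mean(x ** 2))
    return {
        "skewness": np.sum((x - mu) ** 3) / ((n - 1) * sigma ** 3),
        "kurtosis": np.sum((x - mu) ** 4) / ((n - 1) * sigma ** 4),
        "CF": peak / rms,                          # peak indicator
        "SF": rms / abs_x.mean(),                  # waveform index
        "IF": peak / abs_x.mean(),                 # pulse index
        "CLF": peak / np.mean(np.sqrt(abs_x)) ** 2,  # margin index
        "energy": np.sum(x ** 2),
    }


def symmetric_kl(x, y, bins=32, eps=1e-12):
    """Symmetric K-L divergence between histogram estimates of two signals."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    lo = min(x.min(), y.min())
    hi = max(x.max(), y.max())
    p, _ = np.histogram(x, bins=bins, range=(lo, hi))
    q, _ = np.histogram(y, bins=bins, range=(lo, hi))
    # eps avoids log(0) and division by zero in empty bins.
    p = (p + eps) / np.sum(p + eps)
    q = (q + eps) / np.sum(q + eps)
    return np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p))


t = np.arange(1000) / 1000.0
x = np.sin(2 * np.pi * t)
feats = time_domain_features(x)
kl_same = symmetric_kl(x, x)
kl_diff = symmetric_kl(x, 3.0 * x)
```

For a pure sine wave the peak indicator is close to the square root of 2 and the skewness is close to zero, which gives a quick sanity check of the definitions.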

Figure 5: Flow chart of fault diagnosis using ELCD and ELM.