Fault Diagnosis of a Hydraulic Pump Based on the CEEMD-STFT Time-Frequency Entropy Method and Multiclass SVM Classifier

Fault diagnosis of hydraulic pumps is important for ensuring the normal operation of the entire hydraulic system. Considering the nonlinear characteristics of hydraulic-pump vibration signals and the mode mixing problem of the original Empirical Mode Decomposition (EMD) method, we first use the Complete Ensemble EMD (CEEMD) method to decompose the signals. Second, time-frequency analysis methods, namely the Short-Time Fourier Transform (STFT) and time-frequency entropy calculation, are applied to achieve robust feature extraction. Third, a multiclass Support Vector Machine (SVM) classifier is introduced to classify the fault mode automatically. An experiment on an actual hydraulic pump demonstrates the procedure, with complete feature extraction and accurate mode classification.


Introduction
Hydraulic systems are widely used in aeronautics, astronautics, automobiles, shipping, and other fields. As the heart of a hydraulic system, the hydraulic pump significantly affects the performance of the entire system [1]. Thus, real-time fault diagnosis of the hydraulic pump is essential to maintaining the entire system [2]. The structure of a hydraulic pump is complex, the relationships among its internal parameters are highly nonlinear, and there are strong couplings among various fault features. As a result, an accurate mathematical model is difficult to establish. Therefore, data-driven diagnostic methods based on vibration signals are commonly used for hydraulic pumps. Generally, the entire fault diagnosis process can be considered a pattern identification problem that mainly includes two procedures: feature extraction and mode classification.
Many data-driven feature extraction methods that differ from the traditional time-domain and frequency-domain analysis methods have emerged in recent years. The Empirical Mode Decomposition (EMD), which was developed by Huang et al., is a time-frequency analysis method with advantages in addressing nonlinear and nonstationary signals [3]. The EMD can decompose any signal into intrinsic mode functions (IMFs) based on the local timescale of the data, without using an a priori basis [4]. However, the EMD faces a serious problem, "mode mixing," in which oscillations of notably disparate amplitude appear within one mode, or notably similar oscillations occur in different modes. To address this problem, a new method was proposed: Ensemble Empirical Mode Decomposition (EEMD), which performs the EMD over an ensemble of the signal plus Gaussian white noise to obtain more regular modes. However, the EEMD also created new difficulties. The reconstructed signal contains residual noise, and different realizations of signal plus noise may produce different numbers of modes. To overcome these difficulties, another EMD variant, the complete EEMD (CEEMD), has been proposed and successfully applied to vibration signal analysis; it provides an exact reconstruction of the original signal and a better spectral separation of the modes [5,6]. Han and van der Baan used CEEMD to analyze synthetic and real seismic data and obtained good results [7]. In our study, CEEMD is selected to adaptively decompose signals into a small number of IMFs or modes, and the Short-Time Fourier Transform (STFT) algorithm and the time-frequency entropy analysis method are then used together to obtain fault feature vectors composed of multiscale time-frequency entropy. This feature extraction method is defined as the CEEMD-STFT time-frequency entropy method.
After the fault feature is extracted, a classifier is used to achieve mode classification automatically. Support Vector Machine (SVM) is a powerful machine learning method based on statistical learning theory and the structural risk minimization principle; it has been successfully applied to fault diagnosis and satisfactorily addresses the overfitting and local optimal solution problems [8]. However, the SVM is inherently a two-class classifier and offers no direct way to handle multiclass problems. A practical alternative is to construct a multiclass SVM from several two-class SVM classifiers [9]. In this paper, we build a multiclass SVM classifier to classify the fault mode over the feature vectors, whose dimensions have been compressed using the Principal Component Analysis (PCA) algorithm because the original feature vectors are often too large, complex, and variable for postprocessing.
This paper is organized as follows: Section 2 introduces the relevant feature extraction and mode classification methodology, which includes the CEEMD, STFT, time-frequency entropy, and multiclass SVM methods; Section 3 describes the case study used to validate the entire method; Section 4 presents the conclusions of this paper.

Methodology
As shown in Figure 1, the complete fault diagnosis scheme has three elements: data preprocessing, fault feature extraction, and fault mode classification. More details are provided in the following parts.

Feature Extraction Based on the CEEMD-STFT Time-Frequency Entropy Method

Complete Ensemble Empirical Mode Decomposition (CEEMD)

(A) Empirical Mode Decomposition (EMD).
The EMD is an adaptive signal analysis method based on the local extrema of the signal, which separates a signal into a certain number of IMF components. To be considered an IMF, a signal must satisfy two conditions: (1) the number of extrema and the number of zero-crossings must be equal or differ at most by one; (2) the mean value of the envelope defined by the local maxima and the envelope defined by the local minima is zero at any data location [10]. Assume that $x(t)$ is the signal to be decomposed; the concrete steps of the EMD are as follows.
Step 1 (Initialization). Set $k = 0$, where $k$ indicates the mode, and the residual component $r_0(t) = x(t)$.
Step 2. Calculate the mean envelope $m(t)$:
$$m(t) = \frac{e_{\max}(r_k(t)) + e_{\min}(r_k(t))}{2},$$
where $e_{\max}(r_k(t))$ and $e_{\min}(r_k(t))$ are the upper and lower envelopes, respectively, which are obtained through cubic-spline interpolation on the local maxima and minima of $r_k(t)$.
Each IMF obtained through the EMD contains different frequency components of the signal, ordered from high to low frequency, and represents the inherent mode characteristics of the signal.
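The mean-envelope computation in Step 2 can be illustrated with a short Python sketch. It assumes NumPy and SciPy are available; the function names `mean_envelope` and `sift_once` are hypothetical, and using the two end samples as extra spline knots is a simplification chosen here, not a boundary treatment stated in the paper.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema


def mean_envelope(r, t):
    """Mean of the upper and lower cubic-spline envelopes of r(t)."""
    max_idx = argrelextrema(r, np.greater)[0]   # interior local maxima
    min_idx = argrelextrema(r, np.less)[0]      # interior local minima
    if len(max_idx) < 2 or len(min_idx) < 2:
        raise ValueError("too few extrema to build envelopes")
    # Assumption: the first and last samples are added as knots so that the
    # splines cover the whole record (a simple boundary treatment).
    max_knots = np.concatenate(([0], max_idx, [len(r) - 1]))
    min_knots = np.concatenate(([0], min_idx, [len(r) - 1]))
    upper = CubicSpline(t[max_knots], r[max_knots])(t)
    lower = CubicSpline(t[min_knots], r[min_knots])(t)
    return 0.5 * (upper + lower)


def sift_once(r, t):
    """One sifting pass: subtract the mean envelope from the signal."""
    return r - mean_envelope(r, t)
```

For a symmetric oscillation such as a pure sine, the mean envelope is close to zero away from the record ends, so a single sifting pass changes the signal very little there.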

(B) Ensemble EMD (EEMD).
To solve the "mode mixing" problem, Wu and Huang proposed the Ensemble EMD (EEMD) method [11], which defines the "true" modes as the mean of the corresponding IMFs obtained via EMD over an ensemble of the original signal plus different realizations of finite-variance white noise [12]. Considering $x$ as an example signal, the EEMD algorithm can be described as follows.
Step 3. Assign $\mathrm{IMF}_k$ as the $k$th mode of $x$, which is obtained by averaging the corresponding modes:
$$\mathrm{IMF}_k = \frac{1}{I}\sum_{i=1}^{I}\mathrm{IMF}_k^{i},$$
where $\mathrm{IMF}_k^{i}$ is the $k$th mode of the $i$th noisy realization and $I$ is the number of realizations.
However, the EEMD method has some disadvantages: (1) the decomposition is not complete; (2) different realizations of signal plus white noise may generate different numbers of modes.
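The ensemble-averaging idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `emd` is a hypothetical callable standing in for any EMD routine that returns an array of shape (modes, samples), the noise is scaled relative to the signal's standard deviation, and realizations that yield fewer modes than requested contribute zeros for the missing ones.

```python
import numpy as np


def eemd(x, emd, n_realizations=50, noise_std=0.2, n_modes=3, seed=0):
    """Average the first n_modes modes over noisy copies of x.

    emd is a placeholder for a real EMD routine (not a library API); it must
    return an array of shape (modes, len(x)).
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    scale = noise_std * np.std(x)          # assumed noise scaling
    acc = np.zeros((n_modes, len(x)))
    for _ in range(n_realizations):
        noisy = x + scale * rng.standard_normal(len(x))
        modes = np.asarray(emd(noisy))
        # Realizations may produce different numbers of modes, which is the
        # second disadvantage noted in the text; missing modes count as zero.
        k = min(n_modes, modes.shape[0])
        acc[:k] += modes[:k]
    return acc / n_realizations
```

Because every realization carries its own noise, the averaged modes generally do not sum back to the original signal exactly, which is the incompleteness mentioned above.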
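(C) Complete EEMD (CEEMD).
Step 1. The first IMF is calculated in the same way as in EEMD. First, add white noise to the original signal and obtain the first EMD component of the noisy data. Repeat the decomposition with different noise realizations, and define the ensemble average as the first IMF, $\mathrm{IMF}_1$, of the original signal $x$, that is,
$$\mathrm{IMF}_1 = \frac{1}{I}\sum_{i=1}^{I} E_1\left(x + \varepsilon_0 w^{i}\right),$$
where $E_k(\cdot)$ is an operator that returns the $k$th mode obtained through EMD of a given signal, $x$ is the raw signal, $w^{i}$ ($i = 1, \ldots, I$) are the different white-noise realizations, and $\varepsilon_0$ is a ratio coefficient.
Step 2. Calculate a unique first residue as
$$r_1 = x - \mathrm{IMF}_1.$$
Then set $r_1 + \varepsilon_1 E_1(w^{i})$ ($i = 1, \ldots, I$) as the new signals for decomposition. Once the first IMF component of each has been obtained, their ensemble average is taken as the second component $\mathrm{IMF}_2$:
$$\mathrm{IMF}_2 = \frac{1}{I}\sum_{i=1}^{I} E_1\left(r_1 + \varepsilon_1 E_1(w^{i})\right).$$
Step 3. Repeat Steps 1-2 with the residue $r_k = r_{k-1} - \mathrm{IMF}_k$ to obtain the $(k+1)$th IMF component $\mathrm{IMF}_{k+1}$:
$$\mathrm{IMF}_{k+1} = \frac{1}{I}\sum_{i=1}^{I} E_1\left(r_k + \varepsilon_k E_k(w^{i})\right).$$
Step 4. Continue until the residue cannot be decomposed further, and obtain the last residual function $R = x - \sum_{k=1}^{K}\mathrm{IMF}_k$, where $K$ is the total number of IMFs. The signal is then described as
$$x = \sum_{k=1}^{K}\mathrm{IMF}_k + R.$$
The last step makes the proposed decomposition complete and provides an exact reconstruction of the original signal.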
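The completeness property can be checked by telescoping the residue recursion. The short derivation below assumes the residues are defined by $r_0 = x$ and $r_k = r_{k-1} - \mathrm{IMF}_k$, with $K$ the total number of IMFs and $R = r_K$ the final residue:

```latex
\begin{aligned}
R = r_K &= r_{K-1} - \mathrm{IMF}_K \\
        &= r_{K-2} - \mathrm{IMF}_{K-1} - \mathrm{IMF}_K \\
        &= \cdots = x - \sum_{k=1}^{K} \mathrm{IMF}_k , \\
\Longrightarrow \quad x &= \sum_{k=1}^{K} \mathrm{IMF}_k + R .
\end{aligned}
```

The identity holds regardless of the noise realizations, because each residue is computed from the already averaged mode rather than from the individual noisy decompositions.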

Short-Time Fourier Transform (STFT)
The time and frequency information in each IMF relates to the sampling frequency and changes with the signal itself, so research on the time-frequency domain characterization of signals has been a key component of signal analysis [13]. The STFT is a popular method for analyzing nonstationary signals. The STFT of the signal $x(t)$ is defined as
$$S(t, f) = \int_{-\infty}^{+\infty} x(\tau)\, h(\tau - t)\, e^{-j 2\pi f \tau}\, d\tau,$$
where $h(t)$ should be a low-pass filter with $\|h\|_2 = 1$. Note that $h(\tau - t)\, e^{j 2\pi f \tau}$ has its energy concentrated at time $t$ and frequency $f$. Thus, $|S(t, f)|^2$ can be considered the energy in $x(t)$ at frequency $f$ and time $t$. Generally, one displays the energy at each time and frequency pair; that is, $\mathrm{SP}(t, f) = |S(t, f)|^2$ is known as the spectrogram (SP) of $x(t)$ [14]. The spectrogram algorithm produces a two-dimensional image representation of vibration signals. The Power Spectrum Density (PSD) function $P(t, f)$ is expressed as a Pseudo Color Map (PCM), which is a spectrogram with a time axis and a frequency axis. This time-frequency spectrum, which can be called the "visual language," shows the modulation characteristics of the signals.
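A discrete version of the spectrogram can be sketched with NumPy alone. The default window length, overlap, and sampling frequency mirror the values reported in the experimental section; the Hamming window and the function name `spectrogram` are assumptions made for this illustration, since the paper does not name the window it used.

```python
import numpy as np


def spectrogram(x, fs=1000.0, win_len=256, overlap=254):
    """Return (freqs, times, sp) with sp[f, t] = |STFT|^2."""
    x = np.asarray(x, dtype=float)
    hop = win_len - overlap
    window = np.hamming(win_len)                  # assumed window shape
    window = window / np.linalg.norm(window)      # enforce ||h||_2 = 1
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] for i in range(n_frames)])
    stft = np.fft.rfft(frames * window, axis=1)   # one DFT per windowed frame
    sp = np.abs(stft).T ** 2                      # shape (freq bins, frames)
    freqs = np.fft.rfftfreq(win_len, d=1.0 / fs)
    times = (np.arange(n_frames) * hop + win_len / 2) / fs
    return freqs, times, sp
```

With a 1024-sample record and these defaults, the result has 129 frequency bins and 385 time frames, so each IMF yields one time-frequency matrix of that size.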

Time-Frequency Entropy.
The time-frequency distribution of the vibration signal obtained through the STFT method presents a modulation characteristic; that is, the energy distribution changes at different moments. Therefore, a fault can be detected by comparing the energy distributions of the signals with and without fault conditions in the time-frequency domain, since a change in the energy distribution over the time-frequency plane may indicate a fault occurrence [15]. Because the spectrogram can provide an accurate energy-frequency-time distribution, information entropy theory, which measures the uniformity of a probability distribution, can be applied to the time-frequency distribution to quantitatively describe the differences among operating conditions [16]. Let a time-frequency plane be divided into $N$ blocks of equal area, where the information source (here, the energy) for the entire plane is $A$ and for each block is $W_i$ ($i = 1, \ldots, N$), so the probability that each information source appears in the entire system is
$$q_i = \frac{W_i}{A}.$$
According to the information entropy calculation [17], the time-frequency entropy is defined as
$$s(q) = -\sum_{i=1}^{N} q_i \ln q_i.$$
The time-frequency entropies of the IMFs are then taken as the extracted feature vector, which serves as the input to the mode classification.
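The entropy calculation can be sketched as follows. The block size and sliding step are parameters of this hypothetical `time_frequency_entropy` function; normalising by the sum of the block energies, rather than by the energy of the whole plane, is an assumption made here so that the probabilities sum to one even when blocks overlap (the two coincide when the blocks tile the plane exactly).

```python
import numpy as np


def time_frequency_entropy(sp, block=64, step=32):
    """Shannon entropy of the block energies of a spectrogram (freq x time)."""
    sp = np.asarray(sp, dtype=float)
    n_f, n_t = sp.shape
    energies = []
    for i in range(0, n_f - block + 1, step):
        for j in range(0, n_t - block + 1, step):
            energies.append(sp[i:i + block, j:j + block].sum())
    if not energies:
        raise ValueError("spectrogram is smaller than one block")
    w = np.asarray(energies)
    q = w / w.sum()          # assumed normalisation: the q_i sum to one
    q = q[q > 0]             # treat 0 * ln(0) as 0
    return float(-(q * np.log(q)).sum())
```

A plane whose energy is spread evenly over the blocks gives the maximum value, the logarithm of the number of blocks, while energy concentrated in a single block gives zero.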

Support Vector Machine (SVM)
Support Vector Machine (SVM), which originated from statistical learning theory and the optimal separating hyper-plane in the linearly separable case, was developed by Cortes and Vapnik [18]. Through a nonlinear mapping function, the original mode space is mapped into the high-dimensional feature space Z. The optimal separating hyper-plane is then constructed in the feature space. Consequently, a nonlinear problem in the low-dimensional space corresponds to a linear problem in the high-dimensional space.

Two-Class SVM.
SVMs are primarily designed for 2-class classification problems. To illustrate the basic principle, a schematic diagram of a 2-class SVM is shown in Figure 2, where two different classes (circles and triangles) are separated by a linear boundary chosen so that the distance between the boundary and the nearest data point in each class is maximal.
Assume that the input vector is $x$, which is mapped into the high-dimensional space $Z$ through the nonlinear mapping function $\varphi(x)$; the linear function $(w \cdot \varphi(x)) + b = 0$ in the high-dimensional feature space can then be used to construct the optimal classification hyper-plane. The training data are $\{x_i, y_i\}$, $i = 1, 2, \ldots, n$, with $x_i \in R^d$ and $y_i \in \{-1, +1\}$, where $y_i$ is the label of $x_i$. Here $w$ is a weight vector, and the margin is $1/\|w\|$. Maximizing the margin $1/\|w\|$ leads to the following constrained optimization problem:
$$\min_{w, b, \xi}\ \frac{1}{2}\|w\|^2 + C\sum_{i=1}^{n}\xi_i \quad \text{s.t.}\quad y_i\left((w \cdot \varphi(x_i)) + b\right) \ge 1 - \xi_i,\quad \xi_i \ge 0,\quad i = 1, 2, \ldots, n,$$
where the coefficient $C$ is a penalty factor and $\xi_i$ is a slack factor [16]. Using the duality theory of optimization and the kernel function, the final decision function is
$$f(x) = \operatorname{sgn}\left(\sum_{i=1}^{n}\alpha_i y_i K(x_i, x) + b\right),$$
where $K(x_i, x)$ is the kernel function, which satisfies the Mercer condition; the constants $\alpha_i$ are Lagrange multipliers and are determined in the optimization procedure. Typical kernel functions are the polynomial kernel, Radial Basis Function (RBF) kernel, sigmoid kernel, and linear kernel. In many practical applications, the RBF kernel achieves a higher classification accuracy than the other kernel functions, so we mainly consider the RBF kernel in this paper.
The SVM was originally designed for binary classification, where it performs well, but it faces difficulties with multiclass classification problems. A single binary SVM is therefore not sufficient for practical situations that involve more than two fault modes.
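The decision function can be evaluated directly once the multipliers and the bias are known. The sketch below assumes the common RBF form $K(a, b) = \exp(-\gamma \|a - b\|^2)$, which the paper does not spell out; `rbf_kernel`, `decide`, and the parameter `gamma` are illustrative names, and the multipliers are taken as given rather than solved for.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """K(a, b) = exp(-gamma * ||a - b||^2) (assumed RBF parameterisation)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))


def decide(x, support_vectors, labels, alphas, b, gamma=1.0):
    """Evaluate sgn(sum_i alpha_i * y_i * K(x_i, x) + b)."""
    s = sum(a * y * rbf_kernel(sv, x, gamma)
            for sv, y, a in zip(support_vectors, labels, alphas))
    return 1 if s + b >= 0 else -1
```

Only the training samples with nonzero multipliers, the support vectors, contribute to the sum, which is why the trained classifier can be stored compactly.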

Multiclass SVM.
Currently, several methods based on the SVM have been proposed for multiclass classification, such as "one-against-all," "one-against-one," and Directed Acyclic Graph (DAG). Experiments indicate that the "one-against-one" and DAG-SVM methods are the most suitable for practical use. In this paper, the "one-against-one" method is selected for classification [19].
Suppose that the training data set is $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, with $x_t \in R^d$ and $y_t \in \{1, \ldots, k\}$. The "one-against-one" method constructs $C_k^2 = k(k-1)/2$ classifiers, each of which is trained using the data from two classes. For example, for the training data from the $i$th and $j$th classes, we solve the following binary classification problem:
$$\min_{w^{ij},\, b^{ij},\, \xi^{ij}}\ \frac{1}{2}\|w^{ij}\|^2 + C\sum_{t}\xi_t^{ij}$$
$$\text{s.t.}\quad (w^{ij} \cdot \varphi(x_t)) + b^{ij} \ge 1 - \xi_t^{ij}\ \text{ if } y_t = i,\qquad (w^{ij} \cdot \varphi(x_t)) + b^{ij} \le -1 + \xi_t^{ij}\ \text{ if } y_t = j,\qquad \xi_t^{ij} \ge 0.$$
When testing an unknown sample $x$, we apply all $k(k-1)/2$ classifiers and make the decision using the following voting strategy: if $\operatorname{sgn}((w^{ij} \cdot \varphi(x)) + b^{ij})$ indicates that $x$ is in the $i$th class, the vote for the $i$th class is increased by one; otherwise, the vote for the $j$th class is increased by one. Finally, we predict that $x$ is in the class with the largest vote [20].
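The voting strategy can be sketched independently of how the binary classifiers are trained. In this illustration, `pairwise` is a hypothetical dictionary that maps each class pair to a decision callable, and ties are broken in favour of the class listed first, a rule chosen here because the paper does not specify one.

```python
from itertools import combinations


def one_against_one_predict(x, classes, pairwise):
    """Predict the class of x by majority vote over all class pairs.

    pairwise[(i, j)] is a callable that returns a positive value when the
    binary classifier for classes i and j favours class i.
    """
    votes = {c: 0 for c in classes}
    for i, j in combinations(classes, 2):
        winner = i if pairwise[(i, j)](x) > 0 else j
        votes[winner] += 1
    # Assumed tie rule: the class that appears first in `classes` wins.
    return max(classes, key=lambda c: votes[c])
```

For the three pump states considered in this paper, this amounts to three binary classifiers and three votes per test sample.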

Experimental Verification
3.1. Experiment Setup. The plunger pump test rig is shown in Figure 3; the original vibration signals used to verify the proposed method were obtained from this test rig. The vibration data were acquired from the front side of the hydraulic pump at a stabilized motor speed of 528 r/min and a sampling rate of 1000 Hz. In this experiment, two commonly occurring faults were set: swash plate wear and rotor wear. For each of the three conditions (the two faulty conditions and the normal state), 20 groups of samples (1024 sampling points per group) were selected for the analysis.

Feature Extraction Based on the CEEMD-STFT and Time-Frequency Entropy
(1) CEEMD Model. The parameters of the CEEMD model were set as follows: the noise standard deviation (Nstd) was 0.2, the Number of Realizations (NR) was 600, and the maximum number of sifting iterations allowed (MaxIter) was 5000. The original signals of each state were decomposed into a series of IMFs; the first six IMFs were selected for further analysis, as shown in Figure 4.
(2) Procedure of the STFT and Time-Frequency Entropy Acquisition. The parameters of the STFT were selected as follows: the window length was 256, the number of overlapping samples was 254, the sampling frequency (fs) was 1000 Hz, and the length of the discrete Fourier transform was equal to the window length. The resulting time-frequency matrices, or spectrograms, of each state are shown in Figure 5.
The time-frequency entropy of each state can be calculated from the time-frequency matrices. The time-frequency block was set as length = width = 64, and both the lateral and longitudinal slip steps were 32. A six-dimensional time-frequency entropy vector, with one entry per selected IMF, was then obtained for each group and serves as its fault feature vector. All of the fault features are listed in Table 1.
(3) Feature Dimension Reduction Based on PCA. To improve the accuracy and robustness of the fault diagnosis, dimension reduction is necessary for the high-dimensional fault feature vectors. PCA, which is an important and powerful method for extracting the most significant information from data and compressing the size of the data [21], was used to obtain the three-dimensional feature vectors in Table 2.
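The reduction from six to three dimensions can be sketched with a singular value decomposition. This is a generic PCA written with NumPy, not the authors' implementation; `pca_reduce` is a hypothetical name, and the function also returns the share of variance carried by each retained component as a convenience.

```python
import numpy as np


def pca_reduce(features, n_components=3):
    """Project row-wise feature vectors onto their leading principal axes."""
    x = np.asarray(features, dtype=float)
    centered = x - x.mean(axis=0)
    # Rows of vt are the principal directions, ordered by singular value.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return centered @ vt[:n_components].T, explained[:n_components]
```

Applied to the 60 six-dimensional entropy vectors (20 groups for each of the three states), this returns a 60 x 3 matrix of scores whose columns are mutually uncorrelated.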
The clustering result of the fault features is displayed in Figure 6, which shows that the features of the different hydraulic-pump fault modes are well separated.

Fault Mode Classification Based on Multiclass SVM.
The extracted fault feature sets were divided into training data and testing data (for every state, the first ten groups were used as the training data and the remainder as the testing data). First, the multiclass SVM classifier was trained on the training data as described above. Then, the trained classifier was used to classify the fault mode of the testing data, and the recognition accuracy was calculated. The classification results of the testing data are shown in Table 3 and Figure 7. These results show that the recognition performance is high and that the multiclass SVM method is effective for mode classification.
Taken together, the clustering figure and the multiclass SVM classification results demonstrate the effectiveness and feasibility of this method for hydraulic-pump fault diagnosis.

Conclusion
An effective method for the feature extraction and mode classification of vibration signals has been presented in this paper, and the algorithm was verified on practical signals from a hydraulic pump. The CEEMD model, which is an improvement of EMD and can solve the "mode mixing" problem, was combined with the STFT analysis method and time-frequency entropy calculation to extract robust and significant fault features. The multiclass SVM classifier was selected to handle the small-sample, multiple-fault situation, and it obtained a perfect classification result on the testing data. These results demonstrate the accuracy and feasibility of this hydraulic-pump fault diagnosis method. Future work will concentrate on the application of this method to other objects or fields for signal analysis and fault diagnosis.

Figure 2: Illustration for data classification using a 2-class SVM.

Figure 5: Spectrograms of the first IMF of each state.

Figure 7: Classification results of the testing data.

Table 2: Feature vector set after dimension reduction.

Table 3: Classification results of the testing data.

Figure 6: Clustering result of the fault features.