Experimental Investigation for Fault Diagnosis Based on a Hybrid Approach Using Wavelet Packet and Support Vector Classification

To address the difficulty of obtaining a large number of fault samples under practical conditions for mechanical fault diagnosis, a hybrid method that combines wavelet packet decomposition and support vector classification (SVC) is proposed. The wavelet packet is employed to decompose the vibration signal to obtain the energy ratio in each frequency band. Taking the energy ratios as feature vectors, the pattern recognition results are obtained by the SVC. The rolling bearing and gear fault diagnostic results obtained on a typical experimental platform show that the present approach is robust to noise and achieves high classification accuracy and, thus, provides a better way to diagnose mechanical faults under the condition of small fault samples.


Introduction
The bearing and gear are the most critical and frequently encountered components in the vast majority of rotating machinery. Their operating state directly affects the machine performance, efficiency, and life. Therefore, fault identification of rolling element bearings and gears has been the subject of extensive research.
Vibration analysis has been established as the most common and reliable method of analysis. Generally, through an on-line monitoring and diagnosis system, the vibration signals can be used to detect incipient faults of the machine components and to reduce the possibility of catastrophic damage and the downtime [1,2]. The extracted features include time domain features such as root mean square, variance, skewness, and kurtosis [3][4][5], frequency domain features such as the content at the feature frequency and the amplitudes of the frequency spectrum [6,7], and time-frequency domain features such as the statistical characteristics of the short-time Fourier transform (STFT), Wigner-Ville distribution (WVD), wavelet transform (WT), and so forth [8][9][10]. The WT method has good localization properties in both the time domain and the frequency domain, and it is widely used in the field of machinery fault detection and identification [11][12][13]. However, the WT cannot split the high frequency band, where the modulation information of machine faults is often contained. The wavelet packet transform (WPT) can overcome this difficulty. Nikolaou and Antoniadis proposed a method for the analysis of vibration signals resulting from bearings with localized defects using the wavelet packet transform [14]. Fan and Zuo combined the Hilbert transform and the wavelet packet transform to extract the modulating signal and detect early gear faults [15]. Wang and Lin investigated fault signal denoising using wavelet packet decomposition coefficients to identify the weak fault characteristic frequency of rolling bearings under strong background noise [16]. However, these investigations did not combine intelligent fault diagnosis techniques to further recognize faults. Carefully selected vibration signals are needed to match the theoretical fault frequency, so the advantages of WPT are not fully exploited.
Many intelligent classification algorithms, such as artificial neural networks (ANNs) and support vector classification (SVC), have been proposed to detect mechanical faults and recognize machine conditions [1][2][3]. The main difference between ANNs and SVC lies in their risk minimization. SVC uses the structural risk minimization principle, which minimizes an upper bound on the expected risk, whereas ANNs use traditional empirical risk minimization, which minimizes the error on the training data. This difference leads to better generalization performance for SVC than for ANNs. Thukaram et al. [17] compared ANNs and SVC in fault identification. Crampton and Mason [18] found that, when the data contain noise, fault detection using a support vector machine (SVM) is more effective than other intelligent techniques. However, ANNs or SVC alone cannot obtain satisfactory classification results under high levels of ambient noise. Therefore, in recent years, more and more researchers have focused on hybrid approaches using WPT and SVC for fault classification. Bin et al. [19] combined WPT and empirical mode decomposition to extract the fault feature frequency and further employed ANNs to detect faults in rotating machinery. Hu et al. [20] presented a hybrid approach for bearing fault diagnosis using WPT and SVC. Xian and Zeng further developed Hu's scheme for bearing fault diagnosis using WPT and hybrid SVC [21]. Shen et al. proposed a new scheme using the extraction of statistical parameters from the WPT of the original signals, a distance evaluation technique, and a support vector regression (SVR) based generic multiclass solver [22]. However, due to the limitations of machinery fault simulators, the fault samples used in the above investigations came from a single data source, mostly the Case Western Reserve University official website. Therefore, the superiority of the hybrid approaches has not been confirmed.
For the above reasons, this paper presents a hybrid approach for bearing and gear fault diagnosis using wavelet packet decomposition and SVC. To validate the proposed method, we carry out experimental investigations using the machinery fault simulator (MFS-MG). A large amount of experimental data is collected for bearings and gears under different working conditions. Our test results show that the proposed approach is effective and can detect mechanical faults with acceptable precision.

Wavelet Packet Decomposition and Support Vector Classification Diagnostic Principles
The fault pattern recognition flowchart based on wavelet packet decomposition and SVC is shown in Figure 1. The wavelet packet is employed to decompose the vibration signal to obtain the energy ratio in each frequency band. Taking these energy ratios as feature vectors, the trained SVC then identifies the fault type. The detailed procedures are described in Sections 2.1 and 2.2.

3-Layer Decomposition for Each Signal Using Wavelet Packet.
When the vibration signal is decomposed using the wavelet packet, a binary tree structure is obtained. The energy ratio of each frequency band in the final layer of the binary tree can be obtained by calling the wenergy function. According to WPT theory, the index $(i, j)$ represents the $i$th layer and the $j$th node ($j = 1, 2, \ldots, 2^i$) as well as a certain signal component (frequency band). For example, (0, 1) represents the original signal, (1, 1) represents the low frequency wavelet packet decomposition coefficients of the first layer, and (1, 2) represents the high frequency wavelet packet decomposition coefficients of the same layer. For the 3-layer ($i = 3$) case, the corresponding node is $j = 1, 2, \ldots, 8$. Therefore, we have eight indexes $(3, j)$ representing eight signal components (frequency bands).
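The binary tree and its $(i, j)$ indexing can be illustrated with a minimal sketch. It assumes the Haar filter pair purely for brevity (the experiments in this paper use a Db18 wavelet in MATLAB), and the function names `haar_step` and `wavelet_packet` as well as the synthetic test signal are hypothetical choices made for this illustration.

```python
import math

def haar_step(signal):
    """Split a sequence into low-pass and high-pass halves (Haar filters, downsampled by 2)."""
    s = 1.0 / math.sqrt(2.0)
    low = [(signal[k] + signal[k + 1]) * s for k in range(0, len(signal) - 1, 2)]
    high = [(signal[k] - signal[k + 1]) * s for k in range(0, len(signal) - 1, 2)]
    return low, high

def wavelet_packet(signal, levels=3):
    """Return {(i, j): coefficients} for the full binary tree, with j = 1..2**i."""
    tree = {(0, 1): list(signal)}  # (0, 1) is the original signal
    for i in range(levels):
        for j in range(1, 2 ** i + 1):
            low, high = haar_step(tree[(i, j)])
            tree[(i + 1, 2 * j - 1)] = low   # child produced by the low-pass filter
            tree[(i + 1, 2 * j)] = high      # child produced by the high-pass filter
    return tree

# Synthetic two-tone signal, used only to exercise the code (assumption).
signal = [math.sin(0.3 * k) + 0.5 * math.sin(2.1 * k) for k in range(64)]
tree = wavelet_packet(signal, levels=3)
final_layer = [tree[(3, j)] for j in range(1, 9)]  # the eight nodes (3, 1) .. (3, 8)
```

With an orthonormal filter pair such as Haar, the total energy of the eight final-layer nodes equals the energy of the input, which is what makes a per-band energy ratio meaningful.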

Obtaining the Energy Ratio in Each Frequency Band.
Generally, the frequency bands are not arranged in accordance with the frequency order from low to high. When the original vibration signal is decomposed by the wavelet packet (through the high-pass filter and downsampling procedure), the spectral sequence is flipped. Therefore, to calculate the energy ratios, we should adjust the order of the corresponding frequency bands. For the 3-layer ($i = 3$) case, the eight frequency bands are rearranged in accordance with the frequency from low to high.
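As an illustration of this reordering, the sketch below uses a standard property of a full wavelet packet tree whose nodes are numbered in natural (filter-bank) order: the node holding the $k$th lowest frequency band is given by the Gray code of $k$. The function name `frequency_order` is hypothetical, and the 1-based node numbering follows the $(i, j)$ convention used above; the result is a general property of the transform and may or may not coincide with the numbering adopted in a particular toolbox.

```python
def frequency_order(level):
    """Natural-order node numbers (1-based), listed from the lowest to the highest band."""
    # k ^ (k >> 1) is the Gray code of k; adding 1 converts to 1-based node numbers.
    return [(k ^ (k >> 1)) + 1 for k in range(2 ** level)]

order = frequency_order(3)
print(order)  # [1, 2, 4, 3, 7, 8, 6, 5]
```

Under these assumptions, node (3, 4) carries a lower frequency band than node (3, 3), which reflects the spectral flip introduced by each high-pass filtering and downsampling step.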

Construct Feature Vector.
The energy of the vibration component in each band changes when a fault occurs in the mechanical system, and different faults change the energy ratios differently. Therefore, for the 3-layer ($i = 3$) case of wavelet packet decomposition, a feature vector $T$ can be constructed from the eight energy ratios as

$$T = [E_{31}, E_{32}, E_{33}, E_{34}, E_{35}, E_{36}, E_{37}, E_{38}]. \tag{1}$$

Here, $E_{3j}$ ($j = 1, 2, 3, \ldots, 8$) represents the energy ratio for the 3rd layer and the $j$th node.
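A minimal sketch of how such a feature vector could be computed from the final-layer coefficients is given below. It assumes that the energy of a band is the sum of its squared coefficients and that the ratio is taken with respect to the total energy of all bands; the function name `energy_ratios` and the toy coefficient values are hypothetical. Note that MATLAB's wenergy reports these quantities as percentages, whereas the sketch returns fractions that sum to 1.

```python
def energy_ratios(bands):
    """Return the fraction of the total energy held by each band."""
    energies = [sum(c * c for c in band) for band in bands]
    total = sum(energies)
    if total == 0:
        raise ValueError("signal has zero energy; ratios are undefined")
    return [e / total for e in energies]

# Toy coefficients for eight bands (illustrative values, not measured data).
bands = [[2.0, 2.0], [2.0, 0.0], [1.0, 1.0], [1.0, 1.0],
         [1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
T = energy_ratios(bands)  # 1 x 8 feature vector
```

Because the ratios are normalized by the total energy, the feature vector is insensitive to the overall amplitude of the signal and reflects only how the energy is distributed across bands.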

Fault Pattern Recognition Using SVC.
The basic theory for SVC is summarized in this section [24]. Assume that a training set is given by

$$S = \{(x_i, y_i)\}_{i=1}^{n}, \tag{2}$$

where $x_i \in R^d$ and $y_i \in \{-1, +1\}$. The goal of SVMs is to find an optimal hyperplane such that

$$y_i (w^T x_i + b) \ge 1, \quad i = 1, 2, \ldots, n, \tag{3}$$

where the weight vector $w \in R^d$ and the bias $b$ is a scalar. If the inequality in (3) holds for all training data, the problem is linearly separable. Therefore, to find the optimal hyperplane, one can solve the following constrained optimization problem:

$$\text{Minimize } \frac{1}{2} \|w\|^2 \quad \text{subject to } y_i (w^T x_i + b) \ge 1, \quad i = 1, 2, \ldots, n. \tag{4}$$

If the inequality in (3) does not hold for some data points in $S$, the problem is not linearly separable. To find an optimal hyperplane, we have to solve the following constrained optimization problem:

$$\text{Minimize } \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{subject to } y_i (w^T x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0, \quad i = 1, 2, \ldots, n. \tag{5}$$

By introducing a set of Lagrange multipliers $\alpha_i$, $\beta_i$ for the constraints, the problem becomes one of finding the saddle point of the Lagrangian. Therefore, the dual problem becomes

$$\text{Minimize } W(\alpha) = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j x_i^T x_j - \sum_{i=1}^{n} \alpha_i \tag{6}$$

subject to

$$\sum_{i=1}^{n} \alpha_i y_i = 0, \tag{7}$$

$$0 \le \alpha_i \le C, \quad i = 1, 2, \ldots, n. \tag{8}$$

If $0 < \alpha_i \le C$, the corresponding data points are called support vectors (SVs).

SVMs map the input vector into a higher dimensional feature space and, thus, can solve the nonlinear case. By choosing a nonlinear mapping function $\phi(x) \in R^m$, where $m > d$, the SVM can construct an optimal hyperplane in the new feature space. The kernel function is defined as

$$K(x_i, x_j) = \phi(x_i)^T \phi(x_j). \tag{9}$$

Therefore, the dual optimization problem becomes

$$\text{Minimize } W(\alpha) = \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j K(x_i, x_j) - \sum_{i=1}^{n} \alpha_i, \tag{10}$$

and the constraints are the same as shown in (7) and (8); the only requirement on the kernel $K(x_i, x_j)$ is that it satisfies Mercer's theorem [24]. Using kernel functions, every data point $x$ is classified as

$$x \in \begin{cases} \text{positive class}, & \text{if } f(x) > 0, \\ \text{negative class}, & \text{if } f(x) < 0, \end{cases} \tag{11}$$

in which the decision function is

$$f(x) = \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b. \tag{12}$$

Typical examples of kernel functions are the polynomial kernel, the radial basis function (RBF) kernel, the sigmoid kernel, and the linear kernel. In many practical applications [25][26][27], the RBF kernel obtains a higher classification accuracy than other kernel functions. Therefore, the RBF kernel is employed in the present investigation.

Support vector machines were originally designed for binary classification. How to effectively extend them to multiclass classification is still an ongoing research issue. Several methods have been proposed for multiclass classification, such as one-against-one, one-against-all, and directed acyclic graph (DAG). Hsu and Lin [28] gave a comparison of these methods and pointed out that the one-against-one method is more suitable for practical use than the others. In the present work, the one-against-one method is applied to detect the faults of bearings and gears.
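The kernel decision function and the one-against-one voting scheme can be sketched as follows. This is an illustration under stated assumptions, not the toolkit used in this paper: the RBF kernel is written in the common form $\exp(-\gamma \|x - z\|^2)$, which may be parameterized differently in a given toolbox, and the multipliers, bias, and "support vectors" are hand-picked toy values rather than the solution of the dual problem. All function and variable names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def rbf_kernel(x, z, gamma=1.0):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def decision_value(x, support_vectors, alphas, labels, bias, gamma=1.0):
    """Kernel decision function f(x) = sum_i alpha_i * y_i * K(x_i, x) + b."""
    return sum(a * y * rbf_kernel(sv, x, gamma)
               for sv, a, y in zip(support_vectors, alphas, labels)) + bias

def one_against_one(x, pairwise_models):
    """Majority vote over binary classifiers, one per pair of classes.

    pairwise_models maps (class_a, class_b) to a function f; f(x) > 0 votes
    for class_a, otherwise the vote goes to class_b.
    """
    votes = Counter()
    for (class_a, class_b), f in pairwise_models.items():
        votes[class_a if f(x) > 0 else class_b] += 1
    return votes.most_common(1)[0][0]

# Hypothetical toy data: one hand-picked "support vector" per class (not trained).
prototypes = {1: [0.0, 0.0], 2: [1.0, 0.0], 3: [0.0, 1.0]}

def make_model(class_a, class_b):
    """Binary model with class_a as the positive class and class_b as the negative class."""
    svs = [prototypes[class_a], prototypes[class_b]]
    return lambda x: decision_value(x, svs, alphas=[1.0, 1.0], labels=[+1, -1], bias=0.0)

models = {(a, b): make_model(a, b) for a, b in combinations(sorted(prototypes), 2)}
predicted = one_against_one([0.9, 0.1], models)  # the point lies closest to class 2
```

For $k$ classes the scheme needs $k(k-1)/2$ binary classifiers, that is, 10 for the five bearing cases and 6 for the four gear cases. Ties between classes are not handled specially in this sketch.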

Experimental Investigation
In this section, two typical fault diagnosis experiments for bearings and gears are given to verify the performance of the proposed hybrid approach.

Bearing Fault Diagnosis. The MFS-MG experimental platform for the bearing fault simulation [29] is shown in Figure 2. It includes a speed monitor, a manual speed governor, acceleration sensors, a speed sensor, a motor, a spindle, bearings, and so forth. During the experiment, the data are acquired by an accelerometer mounted on top of the left bearing holder. The vibration signals of five cases are collected, that is, the normal case, rolling element fault, inner race fault, outer race fault, and compound fault (including inner race, outer race, and rolling element faults). The spindle speed is 1792 rpm and the end of the experimental bearing (see Figure 2) is free of loading. The bearings with typical faults are shown in Figure 3: the bearing model is ER-12K, the bearing pitch diameter is 33.4772 mm, the number of rolling elements is 8, and the rolling element diameter is 7.9375 mm. Figures 3(a), 3(b), 3(c), and 3(d) show four fault cases, that is, a bearing with compound faults, a bearing with a rolling element fault, a bearing with an inner race fault, and a bearing with an outer race fault, respectively.
In the present investigation, the sampling frequency is 25.6 kHz. For each case, 163800 data points of the bearing vibration signal are collected in one run and divided into 50 segments, so that each segment contains 3276 data points. The Db18 wavelet [23] is used to decompose each segment into three layers, which gives 8 subbands in the final layer. The energy ratio of each band is obtained by calling the wenergy function in the Wavelet Toolbox of MATLAB. Therefore, as shown in (1), a 1 × 8 feature vector is obtained for each segment. Figures 4, 5, 6, 7, and 8 show the original signals and the corresponding distributions of the eight energy ratios for the five cases, that is, the normal case and the four fault cases. For each of the five cases, we collect two bearing vibration signals (each containing 163800 data points) in two runs. The first signal is used to train the SVM and the second is used for testing. According to the above description, 50 feature vectors can be extracted from each signal. Therefore, the 50 feature vectors (samples) from the first run serve as training samples and the 50 feature vectors (samples) from the second run are the fault samples to be tested (classified). Tables 1 and 2 give the first three training samples of the first signal and the first three test samples of the second signal. To represent the five cases numerically, we label the normal case, the rolling element fault, the inner race fault, the outer race fault, and the compound fault as 1 to 5, respectively. These are referred to as the standard labels.
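The segmentation arithmetic can be checked with a short sketch. It assumes equal-length, non-overlapping segments and reads the stated sampling frequency of 25.6 k as 25.6 kHz; the function name `split_into_segments` and the all-zero placeholder record are hypothetical.

```python
def split_into_segments(record, n_segments=50):
    """Split a record into equal-length, non-overlapping segments."""
    seg_len = len(record) // n_segments
    return [record[k * seg_len:(k + 1) * seg_len] for k in range(n_segments)]

SAMPLING_RATE_HZ = 25600          # assumed meaning of "25.6 k"
record = [0.0] * 163800           # placeholder for one measured vibration signal
segments = split_into_segments(record)

segment_length = len(segments[0])                       # 3276 data points
segment_duration = segment_length / SAMPLING_RATE_HZ    # seconds per segment
```

Under these assumptions each segment lasts roughly 0.128 s, which at a spindle speed of 1792 rpm corresponds to a little under four shaft revolutions per feature vector.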
In the present investigation, we adopt the SVM toolkit programmed by Franc and Hlavác of the Czech Technical University [30].
For general analysis, it is desirable to use normalized, nondimensional parameters. The normalized parameters also speed up the computational process. Therefore, prior to the training of the SVC model, all sample data are normalized to the range [0, 1].
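One common way to obtain values bounded by [0, 1] is min-max scaling, sketched below. Whether the scaling is applied per feature (column-wise, as assumed here) or over the whole data matrix is not specified in the text, so this is only one possible reading; the function name `minmax_normalize` and the toy matrix are hypothetical.

```python
def minmax_normalize(rows):
    """Scale every column of a sample matrix (list of rows) to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        # A constant column carries no information and is mapped to 0.0.
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]

# Toy 3 x 2 sample matrix (illustrative values only).
samples = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0]]
normalized = minmax_normalize(samples)
# normalized == [[0.0, 0.0], [0.5, 1.0], [1.0, 0.5]]
```

In practice the minimum and maximum would usually be taken from the training samples and reused for the test samples, so that both sets are scaled consistently.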

Here, $X$ and $y$ denote the samples and the labels, respectively, and $n$ is the number of samples. From the above, for each case the samples constitute a 50 × 8 matrix and the labels form a 50 × 1 vector. The first signal (50 segments) of every case is then fed to the support vector machine for training, so the training data comprise a 250 × 8 matrix of samples and a 250 × 1 vector of labels. After training, the second signal (50 segments) of each case is fed separately to the trained SVM for testing, so each test set is a 50 × 8 matrix of test samples. The predicted output (predicted labels) is obtained when the test is completed, and the recognition rate of every case is obtained by comparing the predicted labels with the standard labels. For each training and prediction, arg and $C$ (arg = 1, $C$ = 10, as suggested by Franc and Hlavác [30]) and the radial basis function are selected as the kernel argument, the regularization constant, and the kernel function, respectively. Table 3 shows the SVC results. From Table 3, we can see that the recognition rate is 86% for the rolling element fault and 96% for the inner race fault. For the other three cases, that is, the normal case, the outer race fault, and the compound fault, the recognition rate is 100%.
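The recognition rate of one case can be computed as the percentage of test samples whose predicted label equals the standard label, as in the sketch below. With 50 test samples per case, a rate of 86% corresponds to 43 correct predictions and 96% to 48. The function name `recognition_rate` is hypothetical, and the labels assigned to the misclassified samples in the example are invented for illustration; the text does not state which classes they were assigned to.

```python
def recognition_rate(predicted, standard):
    """Percentage of samples whose predicted label matches the standard label."""
    if len(predicted) != len(standard):
        raise ValueError("label sequences must have the same length")
    correct = sum(p == s for p, s in zip(predicted, standard))
    return 100.0 * correct / len(standard)

# Rolling element fault (standard label 2): 43 of 50 test samples correct.
standard = [2] * 50
predicted = [2] * 43 + [3] * 7   # the label 3 for the errors is a made-up example
rate = recognition_rate(predicted, standard)
print(rate)  # 86.0
```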

Gear Fault Diagnosis.
The MFS-MG experimental platform for the gear fault simulation [29] is shown in Figure 9. The gear fault simulator differs slightly from the bearing fault simulator; that is, a gearbox and a transmission belt are added. In addition, two normal bearings are installed in the left and right bearing holders. During the experiment, the data are acquired by an accelerometer mounted on top of the gearbox. The vibration signals of four cases are collected, that is, the normal case, broken teeth, wear, and missing teeth. The spindle speed is 1764 rpm, the end of the gearbox is free of loading, the gearbox transmission ratio is 1.5 : 1, the number of gear teeth is 18, the pitch diameter is 28.575 mm, and the helix angle is 33°41′. Two fault cases, that is, a gear with broken teeth and a gear with missing teeth, are shown in Figures 10(a) and 10(b), respectively.
As in the bearing fault diagnosis, we use the same sampling frequency and again collect 163800 data points (gear vibration signal). To detect the faults using SVC, we follow the same procedures as described in Section 3.1. The original signals and the corresponding distributions of the eight energy ratios for the four cases, including the normal case and three fault cases (broken teeth, wear, and missing teeth), are shown in Figures 11, 12, 13, and 14, respectively.
The first three training samples of the first signal and the first three test samples of the second signal are shown in Tables 4 and 5, respectively. To represent the four cases numerically, we define the standard labels of the normal case, the broken teeth case, the wear case, and the missing teeth case as 1 to 4, respectively. The SVC results are shown in Table 6. The recognition rates for the normal case, the broken teeth, the wear, and the missing teeth are 94%, 96%, 100%, and 100%, respectively.
Based on the above experimental investigations, the proposed hybrid approach is reasonably effective for detecting different kinds of faults in both bearings and gears.

Conclusion
This paper proposes a hybrid approach using wavelet packet and SVC to classify faults for bearings and gears. The collected vibration signals are directly employed as inputs without any pretreatment. The signals are decomposed by wavelet packet and the energy ratios of all frequency bands are calculated to construct the feature vectors, which are used to train and test the support vector machines that predict the fault type of bearings and gears. Experimental investigations for bearing fault diagnosis and gear fault diagnosis are carried out using the MFS-MG experimental platform (the bearing fault simulator and the gear fault simulator). The results show that the proposed hybrid approach is effective for both rotating and transmission structures. Moreover, the present approach has a good recognition rate not only for a single fault but also for the compound fault.