Aero-Engine Fault Diagnosis Using Improved Local Discriminant Bases and Support Vector Machine

This paper presents an effective approach to aero-engine fault diagnosis, with a focus on rub-impact, that combines improved local discriminant bases (LDB) with a support vector machine (SVM). The improved LDB algorithm, which uses both the normalized energy difference and the relative entropy as quantification measures, is applied to choose the optimal set of orthogonal subspaces for wavelet packet transform (WPT)-based signal decomposition. Two optimal sets of orthogonal subspaces are thereby obtained, and the energy features extracted from the subspaces that appear in both sets are used as input to an SVM classifier to diagnose aero-engine faults. Experimental studies conducted on an aero-engine rub-impact test system have verified the effectiveness of the proposed approach for classifying the working conditions of aero-engines.


Introduction
The aero-engine is one of the key components of an airplane, and its reliability directly affects flight safety. However, to maintain good performance under high-speed running conditions, the clearance between rotor and stator in the aero-engine is becoming smaller and smaller. This increases the possibility of rub-impact [1], which generates unexpected vibrations, degrades the operation of the aero-engine, and can even cause catastrophic consequences. Therefore, identifying the rub-impact fault in the aero-engine at an early stage is of great significance to both the research and industrial communities.
Rub-impact fault information is often carried by weak transient vibrations that are mixed with other vibration sources; as a result, it is difficult to observe the fault symptom directly from the measured signals. With the development of modern signal processing, advanced techniques such as the wavelet transform and the Hilbert-Huang transform have been used as viable tools for extracting fault-related features from vibration signals. As a classical time-frequency analysis method with a solid mathematical foundation, the wavelet transform in both continuous and discrete forms has been widely used for fault diagnosis [2][3][4][5][6][7]. As an extension of the discrete wavelet transform (DWT), the WPT has also been successfully applied to fault diagnosis. For example, Boškoski and Juričić [8] proposed an approach for diagnosing gearboxes under presumably nonstationary and unknown operating conditions, using information indices based on the Renyi entropy derived from the WPT coefficients of measured vibration records. Shen et al. [9] extracted statistical parameters from signals obtained via the WPT at different decomposition depths and proposed a support vector regression (SVR)-based generic multiclass solver to identify different fault patterns of rotating machinery. Keskes et al. [10] used the stationary WPT for feature extraction at a lower sampling rate to detect broken rotor bars and used a multiclass SVM to recognize the faults automatically. In [11], the WPT was used to decompose multiclass signals into a library of time-frequency subspaces, and the wavelet packet energy in each subspace was calculated to produce a feature vector for each signal for classification. Most of these studies use the wavelet packet coefficients at the last decomposition level to extract the defect features of signals. However, the WPT offers many wavelet packet subbands, so there are multiple ways (more than $2^{J}$) to analyze a signal using a $J$-level decomposition. This implies that the subbands at the last decomposition level may not best reflect the signal features, which makes it necessary to optimize the decomposition process. A widely applied criterion for optimal WPT-based signal decomposition is the Shannon entropy, which identifies orthogonal subspaces with high energy concentration that correlate with transients of interest by searching for the minimum Shannon entropy [12]. This criterion, however, is intended mainly for signal representation. For classification problems, it is better to find the optimal set of orthogonal subspaces that yields as much discriminant information as possible for differentiating the classes. In this study, the local discriminant bases (LDB) algorithm is employed for this purpose. It selects the optimal set of orthogonal subspaces that provides maximum dissimilarity information among different classes [13,14]. To date, LDB has been applied to real-world classification problems in audio signal analysis [15,16], physiological signal classification [17,18], and vibration data processing [13,19]. These applications show that the results of the LDB algorithm for a given dataset are driven by the nature of the dataset and by the dissimilarity measures. Various dissimilarity measures, such as Euclidean distance, symmetric relative entropy, relative entropy, energy difference, correlation index, and nonstationarity, have been used successfully in many cases. The accuracy of the classification results is strongly influenced by the extent of class separation in the feature space generated by the chosen dissimilarity measure, yet most researchers use only a single discriminant measure for optimal subspace selection.
Motivated by these research efforts, an integrated approach that combines an improved LDB algorithm with SVM is investigated for aero-engine fault diagnosis in this study. The improved LDB uses two dissimilarity measures to choose the optimal set of orthogonal subspaces derived from the WPT, and the SVM takes energy features derived from the optimal wavelet packet subspaces as input to classify the working conditions of the aero-engine. This paper is organized as follows. Section 2 introduces the principle of the WPT, Section 3 describes the improved LDB algorithm, and Section 4 presents a multiclass classification method based on SVM. Section 5 describes the fault diagnosis scheme using the improved LDB and SVM and reports an experimental study conducted on an aero-engine rub-impact device to verify the effectiveness of the proposed method. Finally, conclusions are drawn in Section 6.

Brief Introduction of WPT
WPT is an extension of DWT and can be obtained by a generalization of the fast pyramidal algorithm [20]. Mathematically, a wavelet packet consists of a set of linearly combined wavelet functions, which are generated using the following recursive relationships:

$$u_{2n}(t) = \sqrt{2}\sum_{k} h(k)\,u_{n}(2t-k), \qquad u_{2n+1}(t) = \sqrt{2}\sum_{k} g(k)\,u_{n}(2t-k), \tag{1}$$

where $u_0(t) = \phi(t)$ is the scaling function and $u_1(t) = \psi(t)$ is the wavelet function. The symbols $h(k)$ and $g(k)$ represent the coefficients of a pair of quadrature mirror filters (QMF) associated with the scaling function and the wavelet function. Furthermore, $h(k)$ and $g(k)$ are related to each other by $g(k) = (-1)^{k} h(1-k)$. Using the QMF, a time-domain signal $x(t)$ can be decomposed recursively as

$$d_{j+1}^{2n}(k) = \sum_{m} h(m-2k)\,d_{j}^{n}(m), \qquad d_{j+1}^{2n+1}(k) = \sum_{m} g(m-2k)\,d_{j}^{n}(m), \tag{2}$$

where $d_{j}^{n}(m)$ denotes the wavelet packet coefficients at the $j$th level and $n$th subband, and the index $m$ runs over the wavelet coefficients of the $n$th subband at level $j$. Using this equation, both the detail coefficient vector and the approximation coefficient vector can be decomposed into two parts, so that a signal contained in the space $\Omega_{0,0}$ can be decomposed into $2^{j}$ wavelet packet nodes at level $j$ (denoted as subspaces $\Omega_{j,k}$, $k = 0, 1, \ldots, 2^{j}-1$) in the form of a full binary tree, as shown in Figure 1. Each subspace $\Omega_{j,k}$ is spanned by a set of basis vectors $\{w_{j,k,m}\}_{m=0}^{2^{n_0-j}-1}$, where $2^{n_0}$ corresponds to the length of the signal. A signal $x_i$ can then be represented by a set of coefficients as

$$x_i = \sum_{j,k,m} \alpha_{j,k,m}\,w_{j,k,m}, \tag{3}$$

where the sum runs over the nodes $(j,k)$ of the selected set of orthogonal subspaces and $\alpha_{j,k,m}$ are the corresponding wavelet packet coefficients. Through the 3-level decomposition process shown in Figure 1, it can be seen that the WPT offers various choices of orthogonal subspace sets, such as $\{\Omega_{1,0}, \Omega_{2,2}, \Omega_{2,3}\}$ or $\{\Omega_{3,0}, \Omega_{3,1}, \ldots, \Omega_{3,7}\}$. Therefore, the optimal selection of the orthogonal subspace set needs to be investigated.
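As an illustration of the binary-tree decomposition and of how many orthogonal subspace sets it admits, the following minimal sketch builds a full wavelet packet tree and counts the admissible sets. It assumes the Haar filter pair purely for simplicity (it is not the wavelet selected later in this paper), a signal length that is a power of two, and natural (filter-bank) ordering of the node index $k$; the function names are hypothetical.

```python
import math

def haar_split(coeffs):
    """Split one node into low-pass and high-pass children (Haar QMF pair)."""
    s = 1.0 / math.sqrt(2.0)
    half = len(coeffs) // 2
    low = [s * (coeffs[2 * i] + coeffs[2 * i + 1]) for i in range(half)]
    high = [s * (coeffs[2 * i] - coeffs[2 * i + 1]) for i in range(half)]
    return low, high

def wavelet_packet_tree(signal, max_level):
    """Return {(j, k): coefficients} for the full binary tree down to max_level."""
    tree = {(0, 0): list(signal)}
    for j in range(max_level):
        for k in range(2 ** j):
            low, high = haar_split(tree[(j, k)])
            tree[(j + 1, 2 * k)] = low
            tree[(j + 1, 2 * k + 1)] = high
    return tree

def count_orthogonal_bases(max_level):
    """Count admissible subspace sets: each node is either kept or split further."""
    count = 1  # a depth-0 tree offers only the root
    for _ in range(max_level):
        count = count * count + 1
    return count

def energy(coeffs):
    """Sum of squared coefficients of one node."""
    return sum(c * c for c in coeffs)
```

Because the Haar pair is orthonormal, the node energies of any admissible set, for example the nodes (1,0), (2,2), and (2,3), add up to the energy of the original signal. For a 3-level tree, `count_orthogonal_bases(3)` gives 26 candidate sets, which is why a selection criterion is needed.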

Improved LDB Algorithm
The LDB algorithm is a pruning algorithm that identifies the subspaces exhibiting high discrimination between signal classes according to a given dissimilarity measure [21]. Given a set of data belonging to several classes, LDB selects from the dictionary of wavelet packet bases the orthogonal basis that best distinguishes the classes; here it is used to select the optimal set of complete orthogonal subspaces derived from the WPT.
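Suppose that $A_{j,k}$ represents the desired local discriminant basis restricted to the span of $B_{j,k}$, which is the set of basis vectors at node $(j,k)$. Then, for a given dataset consisting of $L$ classes of signals $\{\{x_i^{(l)}\}_{i=1}^{N_l}\}_{l=1}^{L}$, where $N_l$ is the number of signals in class $l$, the traditional LDB algorithm with an additive dissimilarity measure $D$ can be summarized as follows.
(1) The WPT is used to decompose the signals contained in the training dataset.
(2) At the given decomposition level $J$, set $A_{J,k} = B_{J,k}$ and evaluate the dissimilarity $\Delta_{J,k}$ of each node with the measure $D$, for $k = 0, \ldots, 2^{J}-1$.
(3) For $j = J-1, \ldots, 0$ and $k = 0, \ldots, 2^{j}-1$, evaluate $\Delta_{j,k}$ and compare it with that of the children nodes: if $\Delta_{j,k} > \Delta_{j+1,2k} + \Delta_{j+1,2k+1}$, keep $A_{j,k} = B_{j,k}$; otherwise, set $A_{j,k} = A_{j+1,2k} \oplus A_{j+1,2k+1}$ and $\Delta_{j,k} = \Delta_{j+1,2k} + \Delta_{j+1,2k+1}$.
(4) After a complete set of orthogonal subspaces has been found in the decomposition results, the corresponding basis functions are ranked from highest to lowest discrimination power, and the $t$ (much less than $n$) most discriminant basis functions can be used for constructing classifiers.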
From the algorithm above, it should be noted that the optimal choice of LDB subspaces for a given dataset is significantly affected by the dissimilarity measure used to distinguish among classes, which in turn indirectly controls the achievable classification accuracy. To obtain good classification results, a dissimilarity measure capable of discriminating among the classes as much as possible should be used. However, for complex datasets such as the vibration signals of aero-engines, a single dissimilarity measure may not capture all the characteristic information of each class, whereas multiple dissimilarity measures provide additional feature dimensions for classification. Hence, instead of a single dissimilarity measure, a combination of two dissimilarity measures ($D_1$ and $D_2$) of differing complexity is used in the presented approach to select LDBs with different characteristics and achieve high classification accuracy.
The first dissimilarity measure $D_1$ is defined as the difference in normalized energy between the corresponding wavelet packet nodes of the training signals from different classes. The normalized energy difference $D_1$ is given by

$$D_1 = \left| E^{1}_{j,k} - E^{2}_{j,k} \right|, \tag{5}$$

where $E^{1}_{j,k}$ and $E^{2}_{j,k}$ are the normalized energies of the corresponding wavelet packet nodes $(j,k)$ for the two classes, which can be calculated by

$$E_{j,k} = \frac{\sum_{m=0}^{2^{n_0-j}-1} \alpha_{j,k,m}^{2}}{E_{\mathrm{total}}}, \tag{6}$$

where $j = 0, 1, \ldots, J$, $k = 0, 1, \ldots, 2^{j}-1$, and $n_0 = \log_2 n \geq J$ ($n$ is the signal size and $n_0$ is the maximum level of signal decomposition). In addition, $\alpha_{j,k,m}$ is the wavelet packet coefficient of node $(j,k)$ at position $m$, and $E_{\mathrm{total}}$ represents the total energy of the vibration signals.
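The second dissimilarity measure $D_2$ is the relative entropy of (7), which for two normalized distributions $p$ and $q$ takes the standard form $D_2 = \sum_i p_i \log (p_i / q_i)$. It can be seen from (5) and (7) that $D_1$ and $D_2$ are always nonnegative and are zero if the distributions of $E_{j,k}$ or $p$ from the two classes are the same. Furthermore, the further apart the two distributions are, the higher the dissimilarity measures $D_1$ and $D_2$ will be.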
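The two dissimilarity measures can be sketched for a pair of classes as follows. This is a minimal illustration, not the exact formulation of (5) to (7): it assumes that $D_1$ is the absolute difference of normalized node energies and that $D_2$ is the standard relative entropy between two nonnegative sequences, each normalized to sum to one. The function names and the small floor `eps` (used to avoid division by zero) are hypothetical.

```python
import math

def normalized_node_energy(node_coeffs, total_energy):
    """Energy of one wavelet packet node divided by the total signal energy."""
    return sum(c * c for c in node_coeffs) / total_energy

def normalized_energy_difference(e1, e2):
    """D1: absolute difference of the normalized node energies of two classes."""
    return abs(e1 - e2)

def relative_entropy(p, q, eps=1e-12):
    """D2: sum_i p_i * log(p_i / q_i) after normalizing p and q to sum to 1."""
    sum_p, sum_q = sum(p), sum(q)
    total = 0.0
    for p_i, q_i in zip(p, q):
        p_i, q_i = p_i / sum_p, q_i / sum_q
        if p_i > 0.0:  # terms with p_i = 0 contribute nothing
            total += p_i * math.log(p_i / max(q_i, eps))
    return total
```

Both functions return zero for identical inputs and grow as the two classes move apart, which is the behavior the text relies on. Note that the relative entropy is not symmetric in `p` and `q`.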
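Similarly, for multiple-class ($L > 2$) problems, the normalized energy difference and the relative entropy can be expressed as the sum of the pairwise dissimilarities,

$$D\left(\{p^{(l)}\}_{l=1}^{L}\right) = \sum_{a=1}^{L-1} \sum_{b=a+1}^{L} D\left(p^{(a)}, p^{(b)}\right). \tag{8}$$

Based on the normalized energy difference and the relative entropy, the improved LDB selection process for searching the optimal wavelet packet subspaces is shown in Figure 2 and described below.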
The vibration signals are first decomposed by the WPT. Then the normalized energy difference and the relative entropy of each subspace are calculated among classes using (5) and (7). After that, the wavelet packet tree is pruned from bottom to top according to the following rule: if the dissimilarity measure of the parent node is larger than the cumulative dissimilarity measure of its children, the parent node is kept and the children are deleted; otherwise, the children are kept and the dissimilarity measure of the parent node is set to the sum of the dissimilarity measures of the children. At the end of this iterative process, the tree contains only those terminal nodes that contribute to maximizing the distance among different classes. Since two dissimilarity measures are used, two optimal local discriminant bases are obtained. $D_1$ is expected to reveal the energy concentration locations on the time-frequency plane for different types of vibration signals, while $D_2$ describes the degree of separation between different distribution series; the nodes that exist in both sets therefore possess high discriminatory values among all the classes for both dissimilarity measures and can be used to form feature vectors.
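The pruning rule and the intersection of the two resulting bases can be sketched as below. This is a minimal illustration under stated assumptions: the per-node dissimilarity values are taken as already computed and stored in a dictionary keyed by node $(j,k)$, and the function name `prune_ldb` and the toy values are hypothetical.

```python
def prune_ldb(delta, max_level):
    """Bottom-up pruning of a wavelet packet tree.

    delta maps each node (j, k) to its dissimilarity value.
    Returns the set of terminal nodes forming the discriminant basis.
    """
    score = dict(delta)  # working copy; parent scores may be overwritten
    basis = {(max_level, k): [(max_level, k)] for k in range(2 ** max_level)}
    for j in range(max_level - 1, -1, -1):
        for k in range(2 ** j):
            left, right = (j + 1, 2 * k), (j + 1, 2 * k + 1)
            children_score = score[left] + score[right]
            if score[(j, k)] > children_score:
                basis[(j, k)] = [(j, k)]            # keep parent, drop children
            else:
                basis[(j, k)] = basis[left] + basis[right]
                score[(j, k)] = children_score      # parent inherits the sum
    return set(basis[(0, 0)])

# Toy 2-level trees of dissimilarity values for two measures (made-up numbers).
delta_d1 = {(0, 0): 0.1, (1, 0): 0.5, (1, 1): 0.2,
            (2, 0): 0.1, (2, 1): 0.1, (2, 2): 0.3, (2, 3): 0.2}
delta_d2 = {(0, 0): 0.2, (1, 0): 0.9, (1, 1): 0.8,
            (2, 0): 0.3, (2, 1): 0.3, (2, 2): 0.1, (2, 3): 0.2}

basis_d1 = prune_ldb(delta_d1, 2)
basis_d2 = prune_ldb(delta_d2, 2)
common_nodes = basis_d1 & basis_d2  # nodes selected by both measures
```

With these toy values the first measure selects nodes (1,0), (2,2), and (2,3), the second selects (1,0) and (1,1), and only node (1,0) is common to both and would be passed on to feature extraction.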
In this study, the energy feature of each subspace is investigated for constructing the feature vector. The energy of each subspace is defined as

$$E = \sum_{i=1}^{M} \left| d(i) \right|^{2}, \tag{9}$$

where $M$ is the number of wavelet packet coefficients in the subspace and $d(i)$ is the $i$th wavelet packet coefficient. Then, for the $r$ chosen subspaces that exist in both LDB sets, a feature vector can be constructed from all the subspaces as

$$T = [E_1, E_2, \ldots, E_r]. \tag{10}$$

The vector $T$ is used as input to a classifier for identifying aero-engine working conditions.
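The construction of the energy feature vector can be sketched as follows, assuming the wavelet packet coefficients are held in a dictionary keyed by node $(j,k)$ and the nodes common to both LDB sets are given as a set. The function names are hypothetical; sorting the nodes is a design choice made here so that training and testing vectors share the same feature order.

```python
def subspace_energy(coeffs):
    """Energy of one subspace: sum of squared coefficient magnitudes."""
    return sum(abs(c) ** 2 for c in coeffs)

def energy_feature_vector(node_coeffs, selected_nodes):
    """Build the feature vector from the selected nodes in a fixed (sorted) order.

    node_coeffs: {(j, k): list of wavelet packet coefficients}
    selected_nodes: nodes that appear in both LDB sets
    """
    return [subspace_energy(node_coeffs[node]) for node in sorted(selected_nodes)]
```

For example, with coefficients `[3.0, 4.0]` at node (1,0) and `[1.0, -1.0]` at node (2,2), selecting both nodes yields the feature vector `[25.0, 2.0]`.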

Multiclass SVM Classifier
SVM is a linear learning method that finds an optimal hyperplane to separate two classes. As a supervised classification approach, SVM seeks to maximize the distance from the hyperplane to the closest training point of either class in order to achieve better classification performance on test data [22]. Because SVM performs well with small samples, it is suitable for distinguishing classes from a small amount of data. As the training data are often limited in real-time fault diagnosis, SVM is used as the classifier in this study to diagnose different aero-engine working conditions. However, the traditional SVM is designed for two-class problems and cannot be applied directly to multiclass problems. Multiclass problems are therefore handled by combining several two-class SVMs.
The most widely used multiclass SVM (MSVM) strategies are the one-against-all (OAA) strategy and the one-against-one (OAO) strategy. OAA is the simplest MSVM strategy. For $L$ classes, it involves $L$ binary SVM classifiers, one for each class, each trained to separate one class from the rest. The winning class is the one that corresponds to the SVM with the highest output. OAO involves $L(L-1)/2$ binary SVM classifiers, each trained to separate one pair of classes. The advantage of OAA is its fast classification; therefore, a multiclass classification method based on the OAA strategy is used in this study.
The multiclass classification method is illustrated in Figure 3. For training with $L$ classes, $L-1$ SVMs are trained: the samples of the first class are taken as positive samples and the samples of the other $L-1$ classes as negative samples to train SVM1; the first-class samples are then removed, and the same process is repeated until the $(L-1)$th classifier has been designed. During testing, a sample is first input to the first classifier. If the output is "1", the sample belongs to the class associated with that classifier and the test ends; otherwise, the sample is sent to the next classifier, and so on until it has been classified.
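The sequential training and testing procedure can be sketched as follows. To keep the example self-contained, a nearest-centroid rule stands in for each binary SVM; this substitution is an assumption made only for illustration, and in practice any binary SVM trainer with the same interface could be passed as `train_binary`. All function names and the toy feature values are hypothetical.

```python
def train_centroid_classifier(positives, negatives):
    """Stand-in for a binary SVM: outputs 1 if x is closer to the positive centroid."""
    def centroid(rows):
        return [sum(col) / len(rows) for col in zip(*rows)]
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    pos_c, neg_c = centroid(positives), centroid(negatives)
    return lambda x: 1 if sq_dist(x, pos_c) < sq_dist(x, neg_c) else -1

def train_cascade(samples_by_class, train_binary=train_centroid_classifier):
    """samples_by_class: ordered list of (label, samples). Trains L-1 classifiers."""
    stages = []
    for i in range(len(samples_by_class) - 1):
        label, positives = samples_by_class[i]
        # Negatives are all remaining classes; earlier classes were removed.
        negatives = [x for _, rows in samples_by_class[i + 1:] for x in rows]
        stages.append((label, train_binary(positives, negatives)))
    last_label = samples_by_class[-1][0]
    return stages, last_label

def predict_cascade(model, x):
    """Pass x through the classifiers in order; stop at the first output of 1."""
    stages, last_label = model
    for label, classifier in stages:
        if classifier(x) == 1:
            return label
    return last_label  # rejected by every stage, so it is the remaining class

# Toy two-dimensional features for three working conditions (made-up numbers).
training = [
    ("faultless", [[0.0, 0.0], [0.1, 0.0]]),
    ("rub-impact", [[5.0, 5.0], [5.1, 4.9]]),
    ("unbalance", [[0.0, 5.0], [0.2, 5.1]]),
]
model = train_cascade(training)
```

With three classes only two binary classifiers are trained, and a test sample needs at most two evaluations, which reflects the classification speed cited as the reason for this strategy.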

Fault Diagnosis Scheme with Experimental Verification
Based on the methods described in the previous sections, the proposed aero-engine fault diagnosis approach is depicted in Figure 4. It consists of a training part and a testing part. For training, vibration signals from each of the working conditions are decomposed into wavelet packet trees with a selected wavelet function. After that, the corresponding nodes of the trees are compared using a set of dissimilarity measures to identify the nodes that exhibit high discriminative values among the various aero-engine working conditions. After the significant LDB nodes have been selected, a new wavelet packet tree is constructed, and all the signals are then decomposed using this new wavelet packet tree. Features are finally extracted from the LDB nodes to train a multiclass SVM classifier. For testing, energy features extracted from the selected LDB nodes are input to the trained multiclass SVM classifier for working condition identification.
To verify the effectiveness of the proposed aero-engine fault diagnosis approach, an experimental study was carried out on a twin-shaft aero-engine test system. The vibration signals were acquired at a 64 kHz sampling rate by a velocity sensor mounted on the outside of the aero-engine casing. Due to the complex structure of the aero-engine, the signals often contain vibrations generated by the low-pressure shaft, the high-pressure shaft, and the transmission system, which causes nonstationarity. Three different working conditions, that is, faultless, rub-impact fault, and unbalance fault, were considered in this study. Figure 5 shows waveforms of the sampled signals.
The proposed approach is then used to process the vibration signals. An appropriate wavelet function should be chosen for the WPT, as it affects the decomposition performance. In this study, a mutual information criterion is used to guide the selection of the wavelet function [23]. In information theory, mutual information is commonly used to measure the degree of similarity between two data sequences: the greater the mutual information, the more similar the two sequences. This relationship can be applied to wavelet function selection for aero-engine fault diagnosis by taking the vibration signal and the wavelet packet coefficients as the data sequences $X$ and $Y$, respectively. The wavelet function that maximizes the mutual information between the vibration signal and the reconstructed signal is taken as the most appropriate wavelet for rub-impact vibration extraction. Based on this criterion, a total of 30 candidate wavelet functions (e.g., Haar, Db2, Db4, Coif1, Coif2, Bior1.3, and Bior5.5) are evaluated, and the Bior5.5 wavelet is found to be the most appropriate wavelet function for processing the rub-impact signals. The aero-engine vibration signals are then processed with the selected wavelet function in a 4-level decomposition, and the improved LDB is used to select the optimal subspaces from the decomposition results. Figure 6 shows the selected wavelet packet nodes (marked with black blocks) that contain the best discriminant information for classifying the working conditions under the dissimilarity measures $D_1$ (normalized energy difference) and $D_2$ (relative entropy), respectively. The blocks in each figure together represent the complete information of the signal with the capability of differentiating the aero-engine working conditions, as follows from the nature of the LDB algorithm. For the two dissimilarity measures ($D_1$ and $D_2$), altogether 30 LDB nodes are identified, as shown in Figure 6. Some of the nodes are selected by both dissimilarity measures. These nodes, which exist in both LDB trees, exhibit relatively high discriminatory behavior, that is, a large statistical distance, among all working conditions for both dissimilarity measures. Therefore, the LDBs appearing in both LDB trees are used to extract features. In general, the basis vector coefficients of each selected LDB node can be used directly as features. However, considering that more features do not necessarily improve the performance of a classifier, the energy content of the selected LDBs calculated by (9) is extracted as the features. In this study, the energy values of the 12 LDB nodes that exist in both LDB trees (40 groups of training signals and 20 groups of testing signals, each containing 1,024 data points) are input to the multiclass SVM classifier for characterizing the aero-engine working conditions.
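The mutual information criterion used above for wavelet selection can be sketched with a simple histogram estimator. This is only an illustration under stated assumptions: the estimator in [23] may differ, the number of bins is an arbitrary choice, the two sequences are assumed to have equal length, and the function names and candidate labels are hypothetical.

```python
import math

def mutual_information(x, y, bins=8):
    """Histogram estimate of I(X;Y) in nats for two equal-length sequences."""
    def bin_index(v, lo, hi):
        if hi == lo:
            return 0  # a constant sequence falls into a single bin
        return min(int((v - lo) / (hi - lo) * bins), bins - 1)
    n = len(x)
    x_lo, x_hi, y_lo, y_hi = min(x), max(x), min(y), max(y)
    joint, px, py = {}, {}, {}
    for a, b in zip(x, y):
        i, j = bin_index(a, x_lo, x_hi), bin_index(b, y_lo, y_hi)
        joint[(i, j)] = joint.get((i, j), 0) + 1
        px[i] = px.get(i, 0) + 1
        py[j] = py.get(j, 0) + 1
    mi = 0.0
    for (i, j), c in joint.items():
        # p(i,j) * log(p(i,j) / (p(i) * p(j))) written with raw counts
        mi += (c / n) * math.log(c * n / (px[i] * py[j]))
    return mi

def select_wavelet(signal, reconstructions, bins=8):
    """reconstructions: {wavelet_name: reconstructed signal}. Returns the best name."""
    return max(reconstructions,
               key=lambda name: mutual_information(signal, reconstructions[name], bins))
```

A reconstruction that follows the original signal closely shares more information with it than one that discards the signal content, so it receives the higher score and its wavelet is selected.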

Mathematical Problems in Engineering
Table 1 lists the classification results of this experimental study. The SVM classifier achieves high classification accuracies, reaching 100%, which indicates that the developed approach is suitable for aero-engine fault diagnosis.
For performance comparison, each single dissimilarity measure is also used on its own to select the LDB nodes, and the corresponding energy features are used as input to the SVM classifier. The classification results are shown in Table 2, which indicates that the diagnosis approach using multiple dissimilarity measures achieves better classification performance than that using a single dissimilarity measure.
The effect of different classifiers, namely the Bayes classifier, the hidden Markov model (HMM) classifier, and the backpropagation (BP) neural network (NN) classifier, on the classification performance is also studied. As shown in Table 3, the SVM classifier performs the best, which is attributed to its ability to deal with small sample sizes.
In addition, the effect of the wavelet function on the classification performance is investigated in this study. Three different wavelet functions, namely the Haar wavelet, the Db2 wavelet, and the Bior5.5 wavelet, are used to process the aero-engine vibration signals, and the results are listed in Table 4.

Conclusions
Based on the improved LDB and SVM, an integrated approach for aero-engine fault diagnosis has been developed. The results of an experimental study conducted on an aero-engine test system indicate that the proposed approach is able to classify different aero-engine working conditions well. Furthermore, the comparison study shows that the improved LDB algorithm improves the classification accuracy and that an appropriate wavelet function provides better signal decomposition. Further work is under way to provide effective and efficient solutions for aero-engine condition monitoring and fault diagnosis.
Figure 2: Flow chart of the improved LDB algorithm.

Table 1: Results of the experimental study.

Table 2: Classification performance using different dissimilarity measures.

Table 3: Classification performance using different classifiers.

Table 4: Classification performance using different wavelet functions.

It can be seen from Table 4 that the Bior5.5 wavelet function, chosen by the quantitative mutual information measure, leads to a higher classification rate than the other two wavelet functions.