Trace Ratio Criterion-Based Kernel Discriminant Analysis for Fault Diagnosis of Rolling Element Bearings Using Binary Immune Genetic Algorithm

The rolling element bearing is a core component of many systems such as aircraft, trains, steamboats, and machine tools, and its failure can lead to reduced capability, downtime, and even catastrophic breakdowns. Due to misoperation, manufacturing deficiencies, or the lack of monitoring and maintenance, it is often found to be the most unreliable component within these systems. Therefore, effective and efficient fault diagnosis of rolling element bearings plays an important role in ensuring the continued safe and reliable operation of their host systems. This study presents a trace ratio criterion-based kernel discriminant analysis (TR-KDA) for fault diagnosis of rolling element bearings. The binary immune genetic algorithm (BIGA) is employed to solve the trace ratio problem in TR-KDA. The numerical results obtained using extensive simulation indicate that the proposed TR-KDA using BIGA (called TR-KDA-BIGA) can effectively and efficiently classify different classes of rolling element bearing data, while also providing the capability of real-time visualization that is very useful for practitioners to monitor the health status of rolling element bearings. Empirical comparisons show that the proposed TR-KDA-BIGA performs better than existing methods in classifying different classes of rolling element bearing data. The proposed TR-KDA-BIGA may be a promising tool for fault diagnosis of rolling element bearings.


Introduction
The rolling element bearing is a core component of many systems such as aircraft, trains, steamboats, and machine tools, and its failure can lead to reduced capability, downtime, and even catastrophic breakdowns [1][2][3][4][5][6]. Due to misoperation, manufacturing deficiencies, or the lack of monitoring and maintenance, it is often found to be the most unreliable component within these systems. Therefore, effective and efficient fault diagnosis of rolling element bearings plays an important role in ensuring the continued safe and reliable operation of their host systems.
Over the past few years, much research effort has been devoted to developing approaches to fault diagnosis of rolling element bearings. When faults occur in rolling element bearings, vibration signals in the relevant time/frequency domain have been demonstrated to deviate from their normal ones because of the increased friction and impulsive forces [7][8][9][10]. Usually, several dozens or even hundreds of time/frequency-domain features are calculated from the bearing vibration signals to represent the different health statuses. In the current study, 9 time-domain features and 6 time-frequency-domain features are extracted from the bearing vibration signals to jointly construct a 15-dimensional feature vector. In that way, fault diagnosis of rolling element bearings is usually solved as a high-dimensional pattern recognition problem. However, for high-dimensional data, the intrinsic dimension may be small. For example, the number of features responsible for a certain type of fault pattern may be small. Moreover, projection of high-dimensional data onto a 2- or 3-dimensional subspace can provide real-time visualization, which is convenient for the user to monitor the health status of rolling element bearings. In addition, projection of high-dimensional data onto a low-dimensional subspace also serves as data compression, which is helpful for efficient storage and retrieval. Thus, dimensionality reduction techniques are often used to project the high-dimensional feature space to a lower-dimensional space while preserving most of the "intrinsic information" contained in the data [11][12][13][14][15].
Upon performing dimensionality reduction on the data, its compact representation can be utilized for succeeding tasks (e.g., visualization and classification). Among various dimensionality reduction methods [16][17][18][19][20][21][22][23][24], principal component analysis (PCA) and linear discriminant analysis (LDA) are the two most common methods [21]. The former is an unsupervised method, which pursues the direction of maximum variance for optimal reconstruction. The latter is a supervised method, which aims to maximize the between-class scatter while minimizing the within-class scatter. Owing to the utilization of label information, the latter generally outperforms the former if sufficient labeled samples are provided [21]. In the past few years, a series of studies have been conducted to formulate LDAs for pattern recognition by Fukunaga [21], Wang et al. [22], Sun and Chen [25], Guo et al. [26], Zhao et al. [27], Jin et al. [28], Jia et al. [29], and so on. Generally, the formulation of LDAs is based on the ratio trace criterion rather than the trace ratio criterion, because the ratio trace problem is more tractable than the trace ratio problem. Nevertheless, as pointed out by Wang et al. [22], solutions obtained based on the ratio trace criterion may deviate from the original intent of the trace ratio problem. To improve the behaviour of LDA implementation, Wang et al. [22], Guo et al. [26], Zhao et al. [27], Jin et al. [28], and Jia et al. [29] presented various trace ratio criterion-based LDAs (TR-LDAs), in which the numerator and denominator of the criterion directly reflect Euclidean distances between interclass and intraclass samples. Another advantage of the trace ratio criterion is that the calculated projection matrix is orthogonal, which can eliminate the redundancy between different projection directions. In addition, when Euclidean distance is used to evaluate the similarity between data points, the orthogonal projection preserves such similarities without any change [22]. Although the above TR-LDA formulation methods have the aforementioned advantages, they are criticized for their inability to deal with the redundancy among eigenvectors. For example, if the most discriminative eigenvector is duplicated several times, the above TR-LDA formulation methods are prone to selecting all of the copies. This is problematic for selecting an optimal subset of eigenvectors because other discriminative and complementary eigenvectors will be missed. A classifier built with eigenvectors selected in this way can give rise to poor classification performance. Therefore, the issue of TR-LDA formulation has remained unresolved.
A review of the related literature also indicates that most of the previous work in the area of applying LDA or TR-LDA to fault diagnosis assumed that samples in each class follow a linear distribution. However, in many fault diagnosis practices, samples in each class may follow a nonlinear distribution and thus violate this assumption. Without this assumption, the separation of different classes may not be well characterized by the scatter matrices, causing the classification results to be degraded [21]. To solve this problem, the kernel trick [30][31][32], which extends many linear methods to their nonlinear kernel versions, can be used to extend TR-LDA to handle nonlinear problems. Thus, this study develops a nonlinear kernel version of TR-LDA, that is, trace ratio criterion-based kernel discriminant analysis (TR-KDA), for fault diagnosis of rolling element bearings. However, like many other TR-LDA models, the TR-KDA model presented in this study shares the trace ratio problem in the formulation of the projection matrix, and with it the inability to handle redundancy in eigenvector selection: if the most discriminative eigenvector is duplicated several times, existing formulation methods are prone to selecting all of the copies, so that other discriminative and complementary eigenvectors are missed and the resulting classifier can perform poorly. Fortunately, the immune genetic algorithm (IGA), a novel evolutionary computation technique developed by Jiao and Wang [33], has the potential to determine a set of discriminative and mutually irredundant eigenvectors. In this study, we propose a method called TR-KDA-BIGA that uses binary IGA (BIGA) to formulate TR-KDA for dimensionality reduction of statistical and wavelet features extracted from the vibration signals, giving rise to effective and efficient fault diagnosis of rolling element bearings. In particular, the contributions are to (i) use an immune evolutionary computation technique such as BIGA to obtain a reduced set of discriminative and mutually irredundant eigenvectors for TR-KDA-BIGA formulation, (ii) provide the capability of two-dimensional representation of bearing data that is very useful for practitioners to monitor the health status of bearings, and (iii) build a TR-KDA-BIGA model architecture for the vibration measurements for effective and efficient fault diagnosis of rolling element bearings.
The rest of this study is structured as follows. Section 2 briefly reviews the basic concepts of TR-LDA and kernel extension. Section 3 presents the TR-KDA-BIGA method. Section 4 discusses its convergence and initialization. Section 5 conducts performance evaluations of the proposed TR-KDA-BIGA on benchmark problems. Section 6 describes an overall flowchart of the proposed TR-KDA-BIGA for fault diagnosis of rolling element bearings. Section 7 summarizes the conclusions drawn from this study.

Review of TR-LDA and Kernel Extension
2.1. Review of TR-LDA. Suppose we are given a set of $D$-dimensional samples $x_1, x_2, \ldots, x_N$ belonging to $c$ different classes. LDA seeks a linear projection matrix $W \in \mathbb{R}^{D \times d}$ that maps the original $D$-dimensional data $x_i$ onto $d$-dimensional data $y_i$ (usually $d \ll D$) by maximizing the between-class scatter while minimizing the within-class scatter. The between-class scatter matrix $S_B$ and the within-class scatter matrix $S_W$ are expressed as follows:

$$S_B = \sum_{i=1}^{c} N_i \left(m^{(i)} - m\right)\left(m^{(i)} - m\right)^T, \qquad S_W = \sum_{i=1}^{c} \sum_{j=1}^{N_i} \left(x_j^{(i)} - m^{(i)}\right)\left(x_j^{(i)} - m^{(i)}\right)^T, \tag{1}$$

where $m$ represents the total sample mean vector, $N_i$ represents the number of samples in the $i$th class, $m^{(i)}$ represents the mean vector of the $i$th class, and $x_j^{(i)}$ represents the $j$th sample in the $i$th class. The new mapped feature vectors $y_i \in \mathbb{R}^d$ can then be expressed as $y_i = W^T x_i$. The original LDA formulation, known as the Fisher LDA [21], only handles binary classification problems. However, many practical applications involve multiclass classification. To overcome this issue, a number of researchers have proposed optimization criteria for extending the Fisher LDA to multiclass classification problems. The first optimization criterion is in a ratio trace form (referred to as RT-LDA):

$$W^* = \arg\max_{W} \mathrm{Tr}\left[\left(W^T S_W W\right)^{-1}\left(W^T S_B W\right)\right], \tag{2}$$

where $\mathrm{Tr}[\cdot]$ denotes the matrix trace. To obtain a set of orthonormal vectors, the constraint $W^T W = I$ is usually added to (2), where $I$ is an identity matrix. The second optimization criterion is in a trace ratio form (referred to as TR-LDA):

$$W^* = \arg\max_{W^T W = I} \frac{\mathrm{Tr}\left[W^T S_B W\right]}{\mathrm{Tr}\left[W^T S_W W\right]}. \tag{3}$$

The optimization problem in (3) is usually solved approximately through the generalized eigenvalue decomposition (GED) method [22]:

$$S_B w_k = \lambda_k S_W w_k, \tag{4}$$

where $\lambda_k$ is the $k$th largest eigenvalue, $w_k$ is the eigenvector corresponding to $\lambda_k$, and $w_k$ constitutes the $k$th column vector of the matrix $W$. Although a closed-form solution for (3) can be approximately obtained with the GED, it does not necessarily guarantee the best trace ratio optimization. Thus, this approximation of trace ratio optimization by ratio trace optimization may lead to a loss of classification capability in the derived low-dimensional feature space. Moreover, the physical meaning of the trace ratio form is clearer than that of the ratio trace form. However, the optimization problem in (3) is generally nonconvex and a closed-form solution for it does not exist. Fortunately, a recent study conducted by Guo et al. [26] showed that, using the trace difference function

$$f(\lambda) = \max_{W^T W = I} \mathrm{Tr}\left[W^T \left(S_B - \lambda S_W\right) W\right],$$

the trace ratio problem can be solved iteratively: in each step, $W$ is formed from the eigenvectors corresponding to the $d$ largest eigenvalues of $S_B - \lambda S_W$. Although such a $W$ maximizes the trace difference, it does not necessarily maximize the trace ratio. Thus, the eigenvectors with the largest eigenvalues are not necessarily representative for discriminating one class from others, as previously mentioned in Section 1. To overcome the above shortcomings, this study presents a BIGA-based solution method for the trace ratio criterion (discussed in detail in Section 3).
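The scatter matrices and the iterative trace-difference scheme reviewed above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the identity-based initialization, the stopping tolerance, and the synthetic three-class data are all assumptions made for illustration.

```python
import numpy as np


def scatter_matrices(X, y):
    """Return (S_B, S_W) for samples stored as the columns of X (shape D x N)."""
    m = X.mean(axis=1, keepdims=True)            # total sample mean vector
    D = X.shape[0]
    S_B, S_W = np.zeros((D, D)), np.zeros((D, D))
    for label in np.unique(y):
        Xc = X[:, y == label]                    # samples of one class
        mc = Xc.mean(axis=1, keepdims=True)      # class mean vector
        S_B += Xc.shape[1] * (mc - m) @ (mc - m).T
        S_W += (Xc - mc) @ (Xc - mc).T
    return S_B, S_W


def iterative_trace_ratio(S_B, S_W, d, max_iter=100, tol=1e-10):
    """Iterate W_t = top-d eigenvectors of S_B - lambda_t * S_W."""
    def ratio(W):
        return np.trace(W.T @ S_B @ W) / np.trace(W.T @ S_W @ W)

    W = np.eye(S_B.shape[0])[:, :d]              # assumed initialization
    history = [ratio(W)]
    for _ in range(max_iter):
        lam = history[-1]
        _, vecs = np.linalg.eigh(S_B - lam * S_W)    # eigenvalues in ascending order
        W = vecs[:, -d:]                             # eigenvectors of the d largest
        history.append(ratio(W))
        if history[-1] - lam < tol:                  # no further improvement
            break
    return W, history


# Synthetic three-class data (illustrative only).
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 20)
X = rng.normal(size=(5, 60))
X[:2, y == 1] += 3.0
X[1:3, y == 2] -= 3.0

S_B, S_W = scatter_matrices(X, y)
W, history = iterative_trace_ratio(S_B, S_W, d=2)
```

The list `history` records the trace ratio value after each step, which makes it easy to inspect how quickly a given initialization converges.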

Kernel Extension.
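In some applications, it is insufficient to model the data using TR-LDA, which is a linear discriminating method. To address nonlinearities in the data, this section presents a nonlinear discriminating method using the kernel trick [30][31][32], that is, TR-KDA. The kernel trick maps the original data to a high-dimensional Hilbert space through a nonlinear mapping function $\phi$. Let $\phi(X)$ denote the data matrix in the Hilbert space: $\phi(X) = [\phi(x_1), \phi(x_2), \ldots, \phi(x_N)]$. The functional form of the mapping does not need to be known since it is implicitly defined by the choice of kernel function $k(x_i, x_j) = \phi(x_i)^T \phi(x_j)$, that is, the inner product in the kernel-induced feature space. The kernel function $k$ may be any positive kernel satisfying Mercer's condition. The radial basis function (RBF) kernel, one of the most popular kernel functions employed in kernel-based learning algorithms, is adopted in this study. Then, (3) in Hilbert space can be written as follows:

$$W^{\phi *} = \arg\max_{(W^{\phi})^T W^{\phi} = I} \frac{\mathrm{Tr}\left[(W^{\phi})^T S_B^{\phi} W^{\phi}\right]}{\mathrm{Tr}\left[(W^{\phi})^T S_W^{\phi} W^{\phi}\right]}, \tag{5}$$

where $W^{\phi}$, $S_B^{\phi}$, and $S_W^{\phi}$ are the matrices in Hilbert space corresponding to $W$, $S_B$, and $S_W$ in (3), respectively. Notably, the matrices $S_B$ and $S_W$ in (3) can be expressed as $S_B = X L_B X^T$ and $S_W = X L_W X^T$ through simple manipulation, where $X = [x_1, x_2, \ldots, x_N]$ is the data matrix. The matrices $L_B$ and $L_W$ are the graph Laplacian matrices [34] of the weighted undirected graphs reflecting the between-class and within-class relationships of the samples. Consider

$$S_B = \sum_{i=1}^{c} N_i \left(m^{(i)} - m\right)\left(m^{(i)} - m\right)^T = X \left( \sum_{i=1}^{c} \frac{1}{N_i} e^{(i)} \left(e^{(i)}\right)^T - \frac{1}{N} e e^T \right) X^T,$$

where $e = [1, 1, \ldots, 1]^T$ is an $N$-dimensional vector and $e^{(i)}$ is the $N$-dimensional indicator vector whose $j$th entry is 1 if $x_j$ belongs to the $i$th class and 0 otherwise. We can simplify the above equation even further by defining

$$L_B = \sum_{i=1}^{c} \frac{1}{N_i} e^{(i)} \left(e^{(i)}\right)^T - \frac{1}{N} e e^T. \tag{7}$$

Thus, we get

$$S_B = X L_B X^T. \tag{9}$$

Then, the matrix $S_W$ can similarly be computed as follows:

$$S_W = \sum_{i=1}^{c} X_i \left( I - \frac{1}{N_i} e_i e_i^T \right) X_i^T = \sum_{i=1}^{c} X_i L_i X_i^T,$$

where $X_i$ is the data matrix of the $i$th class, $e_i$ is the $N_i$-dimensional vector of ones, $L_i = I - (1/N_i) e_i e_i^T$ is an $N_i \times N_i$ matrix, and $X_i L_i X_i^T$ is the data covariance matrix of the $i$th class. Based on (7), the above equation can be simplified similarly by defining

$$L_W = I - \sum_{i=1}^{c} \frac{1}{N_i} e^{(i)} \left(e^{(i)}\right)^T.$$

Thus, we get

$$S_W = X L_W X^T. \tag{12}$$

Using the definitions in (9) and (12), (5) can be rewritten as follows:

$$W^{\phi *} = \arg\max_{(W^{\phi})^T W^{\phi} = I} \frac{\mathrm{Tr}\left[(W^{\phi})^T \phi(X) L_B \phi(X)^T W^{\phi}\right]}{\mathrm{Tr}\left[(W^{\phi})^T \phi(X) L_W \phi(X)^T W^{\phi}\right]}. \tag{13}$$

To obtain the matrix $W^{\phi *}$, $\phi(X)$ is decomposed into a matrix $Q$ with orthonormal columns (satisfying $Q^T Q = I$) and a right triangular matrix $R$ such that $QR = \phi(X)$. We have

$$\phi(X)^T \phi(X) = R^T Q^T Q R = R^T R. \tag{14}$$

Let us map $W^{\phi}$ into the span of $Q$. Since $Q$ is an orthonormal basis of $\phi(X)$, we have

$$W^{\phi} = Q V^{\phi}, \tag{15}$$

where $V^{\phi} \in \mathbb{R}^{N \times d}$ is a matrix with orthonormal columns satisfying $(V^{\phi})^T V^{\phi} = I$. Using the definitions in (14) and (15), (13) can be further rewritten as follows:

$$V^{\phi *} = \arg\max_{(V^{\phi})^T V^{\phi} = I} \frac{\mathrm{Tr}\left[(V^{\phi})^T R L_B R^T V^{\phi}\right]}{\mathrm{Tr}\left[(V^{\phi})^T R L_W R^T V^{\phi}\right]}. \tag{16}$$

Let $\tilde{S}_B = R L_B R^T$ and $\tilde{S}_W = R L_W R^T$; then (16) can be further rewritten as follows:

$$V^{\phi *} = \arg\max_{(V^{\phi})^T V^{\phi} = I} \frac{\mathrm{Tr}\left[(V^{\phi})^T \tilde{S}_B V^{\phi}\right]}{\mathrm{Tr}\left[(V^{\phi})^T \tilde{S}_W V^{\phi}\right]}.$$

After the matrix $V^{\phi *}$ is obtained with the BIGA-based solution method (discussed in detail in Section 3), the output points in the reduced data space can be expressed as

$$Y = (W^{\phi *})^T \phi(X) = (V^{\phi *})^T R.$$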
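As a hedged illustration of the kernel extension, the following NumPy sketch builds an RBF kernel matrix, the Laplacian matrices $L_B$ and $L_W$, and the kernel scatter matrices from a Cholesky factor of $K$. The function names, the kernel width `gamma`, and the small diagonal `jitter` added before factorization are assumptions for illustration; the source does not specify these details.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """K[i, j] = exp(-gamma * ||x_i - x_j||^2) for samples in the columns of X."""
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X.T @ X
    return np.exp(-gamma * np.maximum(d2, 0.0))


def laplacians(y):
    """Return (L_B, L_W) built from class-indicator vectors."""
    N = y.size
    E = (y[:, None] == np.unique(y)[None, :]).astype(float)  # columns are e^(i)
    P = (E / E.sum(axis=0)) @ E.T        # sum_i (1/N_i) e^(i) e^(i)^T
    L_W = np.eye(N) - P
    L_B = P - np.ones((N, N)) / N
    return L_B, L_W


def kernel_scatter(K, y, jitter=1e-8):
    """Return R with K ~= R^T R and the kernel scatter matrices R L R^T."""
    N = K.shape[0]
    # numpy returns the lower factor; its transpose is the right triangular R.
    R = np.linalg.cholesky(K + jitter * np.eye(N)).T
    L_B, L_W = laplacians(y)
    return R, R @ L_B @ R.T, R @ L_W @ R.T


# Small synthetic example (illustrative only).
rng = np.random.default_rng(1)
y = np.repeat([0, 1, 2], 4)
X = rng.normal(size=(3, 12))
K = rbf_kernel(X, gamma=0.5)
R, Sb_k, Sw_k = kernel_scatter(K, y)
```

Because the indicator vectors are built from the labels directly, the samples do not need to be sorted by class.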

The Proposed TR-KDA-BIGA
As previously mentioned, construction of TR-KDA requires selecting $d$ out of $N$ eigenvectors to form the matrix $V^{\phi}$ for dimensionality reduction. However, finding a subset of eigenvectors based on the trace ratio criterion is not an easy task, since the space of possible subsets is very large, especially when $N$ is large. Thus, it is impractical to use exhaustive search to find an optimal subset of $d$ eigenvectors. Instead, in this study, the BIGA is utilized to select $d$ out of the $N$ candidate eigenvectors as the bases of the projection matrix $V^{\phi}$ such that the trace ratio value $\mathrm{Tr}[(V^{\phi})^T \tilde{S}_B V^{\phi}]/\mathrm{Tr}[(V^{\phi})^T \tilde{S}_W V^{\phi}]$ is maximized, where $\tilde{S}_B = R L_B R^T$ and $\tilde{S}_W = R L_W R^T$ are the kernel scatter matrices of Section 2. The immune genetic algorithm, originally developed by Jiao and Wang [33], is a novel genetic algorithm based on biological immune theory, which combines the immune mechanism with the evolutionary mechanism. In what follows, the proposed TR-KDA-BIGA is discussed in further detail.

Chromosome Encoding.
Encoding a solution of a problem into a chromosome is an important issue when using BIGAs. In this study, every chromosome in a BIGA corresponds to a discrete binary selector $u = [u_1, u_2, \ldots, u_N]$, where a gene $u_k = 1$ indicates that the eigenvector $v_k^t$ ($k = 1, 2, \ldots, N$) of $\tilde{S}_B - \lambda_t \tilde{S}_W$ is used in forming the projection matrix $V_t^{\phi}$ of the $t$th step, while $u_k = 0$ denotes its absence. Thus, the length of the chromosome is $N$.

Genetic Operators.
Genetic operators give every chromosome the chance to become the fittest chromosome of its generation. If it is difficult to reach the target of trace ratio optimization, crossover and mutation may introduce degeneracy into generations of chromosomes.

Crossover Operator.
The crossover operator in a BIGA generates two new child chromosomes from two existing parent chromosomes selected from the current population according to a prespecified crossover rate. In this study, the "one-point" crossover operator was adopted: a cut point is selected at random, and the parts between the cut point and the end of the string are exchanged between the parent chromosomes. Specifically, suppose that two parent chromosomes $u_1$ and $u_2$ selected randomly from the population undergo the crossover operation at a randomly selected crossover point $p$ ($1 \le p \le N$), where $u_1 = (u_{1,1}, u_{1,2}, \ldots, u_{1,N})$ and $u_2 = (u_{2,1}, u_{2,2}, \ldots, u_{2,N})$. The two children are then $(u_{1,1}, \ldots, u_{1,p}, u_{2,p+1}, \ldots, u_{2,N})$ and $(u_{2,1}, \ldots, u_{2,p}, u_{1,p+1}, \ldots, u_{1,N})$.
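A minimal sketch of one-point crossover on binary selectors is given here for illustration. The function name and the example parents are hypothetical, and the sketch performs the plain tail exchange only; it does not restore the number of selected eigenvectors after the exchange, which a full implementation would have to handle in some way.

```python
import random


def one_point_crossover(parent1, parent2, rng=random):
    """Exchange the tails of two equal-length chromosomes after a random cut."""
    n = len(parent1)
    cut = rng.randint(1, n)              # cut point p with 1 <= p <= N
    child1 = parent1[:cut] + parent2[cut:]
    child2 = parent2[:cut] + parent1[cut:]
    return child1, child2, cut


rng = random.Random(0)
p1 = [1, 1, 0, 0, 1, 0]
p2 = [0, 0, 1, 1, 0, 1]
c1, c2, cut = one_point_crossover(p1, p2, rng)
```

When the cut point equals the chromosome length, the children are copies of the parents.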

Mutation Operator.
The mutation operator in a BIGA is used primarily as a mechanism for maintaining diversity in the population. For each gene in a chromosome that is undergoing mutation, a real-valued number is randomly selected within the range [0, 1]. If this number is less than the prespecified mutation rate, the gene changes from "0-bit" to "1-bit" or vice versa. Upon adding (or removing) one eigenvector in that way, a different eigenvector is randomly removed (or added) such that the number of eigenvectors included in the subset remains equal to $d$. The mutation operator helps the chromosomes to guide the search into new areas.
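The cardinality-preserving mutation described above can be sketched as a swap between a flipped gene and a randomly chosen gene of the opposite value. This is one possible reading of the description; the function name and the example chromosome are assumptions.

```python
import random


def mutate(chromosome, rate, rng=random):
    """Flip genes with probability `rate`, keeping the number of ones fixed."""
    u = list(chromosome)
    for k in range(len(u)):
        if rng.random() < rate:
            # Genes holding the opposite value: flipping gene k and one of
            # these adds one eigenvector and removes another (or vice versa).
            partners = [j for j, bit in enumerate(u) if bit != u[k]]
            if not partners:             # all genes equal, nothing to swap with
                continue
            j = rng.choice(partners)
            u[k], u[j] = u[j], u[k]
    return u


rng = random.Random(0)
u = [1, 0, 1, 0, 0, 0, 1, 0]
mutated = mutate(u, rate=0.2, rng=rng)
```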

Immune Operators.
The immune ability of BIGAs is realized through two kinds of immune operators: vaccination and immune selection. Vaccination is responsible for improving individuals' overall fitness levels. Immune selection is responsible for preventing deterioration.

Vaccination Operator.
Given a chromosome $u$, the vaccination operation in a BIGA modifies the genes on some bits according to a priori knowledge such that individuals with higher fitness have a greater probability of being selected. Let $U = (u_1, u_2, \ldots, u_{n_0})$ be a population; the vaccination operation on $U$ means that the operation is performed on $n_\alpha = \alpha n_0$ chromosomes selected from $U$ according to the proportion $\alpha$, where $n_0$ represents the population size of a BIGA. A vaccine is abstracted from the prior knowledge of the pending problem, and its information amount and validity play an important role in the performance of the algorithm.

Immune Selection Operator.
The immune selection operation consists of the following two steps. The first step is the immunity test: if the fitness of a chromosome $u$ is smaller than that of its parent chromosome, which indicates that degeneration occurred during crossover and mutation, then the parent chromosome is used for the next competition instead. The second step is the annealing selection [35]: a chromosome $u_i$ is selected from the current offspring population $U_k$ to join the new parents with the probability

$$P(u_i) = \frac{e^{f(u_i)/T_k}}{\sum_{j=1}^{n_0} e^{f(u_j)/T_k}},$$

where $f(u_i)$ is the fitness of the individual $u_i$ and $\{T_k\}$ is the temperature-controlled series tending towards 0.
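The two steps of immune selection can be sketched as follows. The function names and the example fitness values are hypothetical, and the maximum is subtracted before exponentiation purely for numerical stability, which leaves the probabilities unchanged.

```python
import math


def immunity_test(parent, child, fitness_fn):
    """Keep the parent when the child has degenerated (lower fitness)."""
    return parent if fitness_fn(child) < fitness_fn(parent) else child


def annealing_probabilities(fitness_values, temperature):
    """Selection probabilities proportional to exp(fitness / temperature)."""
    scaled = [f / temperature for f in fitness_values]
    shift = max(scaled)                  # stabilizes the exponentials
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


fitness_values = [1.0, 2.0, 3.0]
warm = annealing_probabilities(fitness_values, temperature=1.0)
cold = annealing_probabilities(fitness_values, temperature=0.1)
```

As the temperature decreases, the probability mass concentrates on the fittest chromosome, so selection becomes increasingly greedy in later generations.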

Fitness Evaluation.
Fitness evaluation plays a critical role in selecting offspring chromosomes from the current population for the next generation. In this study, the fitness function for eigenvector selection is defined as

$$f(u) = \frac{u h^T}{u g^T} = \frac{\sum_{k=1}^{N} u_k h_k}{\sum_{k=1}^{N} u_k g_k},$$

where $h_k$ denotes the value $v_k^T \tilde{S}_B v_k$ for the $k$th eigenvector, $g_k$ denotes the value $v_k^T \tilde{S}_W v_k$ for the $k$th eigenvector, $\tilde{S}_B$ and $\tilde{S}_W$ are the kernel scatter matrices, $h = [h_1, h_2, \ldots, h_N]$, $g = [g_1, g_2, \ldots, g_N]$, $u = [u_1, u_2, \ldots, u_N]$, $u_k \in \{0, 1\}$, $u \mathbf{1}^T = d$, and $k = 1, 2, \ldots, N$. Notably, $u$ is called the binary selector and $d$ is the desired lower feature dimension. Finally, according to the evolved binary selector $u$, the projection matrix $V_t^{\phi}$ of the $t$th step is formed by choosing the $d$ eigenvectors with $u_k = 1$ ($k = 1, 2, \ldots, N$). The procedures of the proposed TR-KDA-BIGA and the computational flow of the BIGA built from the aforementioned genetic and immune operators are summarized in the following two parts.
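To make the fitness function concrete, the sketch below evaluates it for a binary selector and, on a toy example small enough for exhaustive search, finds the best selector with exactly $d$ ones. The values of `h` and `g` are made up for illustration, and exhaustive search is used here only because the example is tiny; it is not the BIGA.

```python
from itertools import combinations


def trace_ratio_fitness(u, h, g):
    """Fitness f(u) = (sum_k u_k h_k) / (sum_k u_k g_k)."""
    numerator = sum(uk * hk for uk, hk in zip(u, h))
    denominator = sum(uk * gk for uk, gk in zip(u, g))
    return numerator / denominator


def best_selector_exhaustive(h, g, d):
    """Try every selector with exactly d ones (feasible only for tiny N)."""
    n = len(h)
    selectors = (
        tuple(1 if k in chosen else 0 for k in range(n))
        for chosen in combinations(range(n), d)
    )
    return max(selectors, key=lambda u: trace_ratio_fitness(u, h, g))


# Hypothetical per-eigenvector values h_k and g_k.
h = [4.0, 3.0, 1.0, 0.5]
g = [2.0, 1.0, 1.0, 0.25]
best = best_selector_exhaustive(h, g, d=2)
```

In this toy example the two eigenvectors with the largest $h_k$ values do not form the best pair, which illustrates why the subset has to be searched jointly rather than ranked gene by gene.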
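The Procedures of the Proposed TR-KDA-BIGA. The procedures are as follows:
(1) Construct the kernel matrix $K = \phi(X)^T \phi(X)$.
(2) Perform Cholesky decomposition of the kernel matrix: $K = R^T R$.
(3) Form the kernel scatter matrices as $\tilde{S}_B = R L_B R^T$ and $\tilde{S}_W = R L_W R^T$.
(5) Set the initial trace ratio value $\lambda_t$ to $\mathrm{Tr}(\tilde{S}_B)/\mathrm{Tr}(\tilde{S}_W)$.
In each subsequent iteration, the trace ratio value is updated as $\lambda_{t+1} = \mathrm{Tr}[(V_t^{\phi})^T \tilde{S}_B V_t^{\phi}]/\mathrm{Tr}[(V_t^{\phi})^T \tilde{S}_W V_t^{\phi}]$, $t = t + 1$, and the procedure returns to step (6). This procedure is repeated until convergence, which is declared when the trace ratio value does not increase in 5 consecutive iterations.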
The Computational Flow of the BIGA. The computational flow is as follows:
(1) Set $k$ (the generation counter) to 1.
(3) Evaluate each chromosome in the original population $A_k$.
(4) Abstract vaccines according to the prior knowledge.
(5) Check the termination criteria. If the fixed number of generations has not been reached or the best chromosome found thus far is not satisfactory, go to the next step. Otherwise, output the best chromosome as the final solution for further decision-making.

The Convergence of the Proposed TR-KDA-BIGA
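In this section, we analyze the convergence of the proposed TR-KDA-BIGA. It should first be noted that the BIGA itself is convergent: Jiao and Wang [33] demonstrated that, as long as enough iterations have been completed, the immune genetic population converges towards the true optimum with probability one. Let $\tilde{S}_B$ and $\tilde{S}_W$ denote the kernel scatter matrices, let $V_t$ denote the projection matrix of the $t$th step, and let $\lambda_{t+1} = \mathrm{Tr}[V_t^T \tilde{S}_B V_t]/\mathrm{Tr}[V_t^T \tilde{S}_W V_t]$. Recall the trace difference function

$$f(\lambda) = \max_{V^T V = I} \mathrm{Tr}\left[V^T \left(\tilde{S}_B - \lambda \tilde{S}_W\right) V\right];$$

it follows that

$$f(\lambda_t) = \mathrm{Tr}\left[V_t^T \left(\tilde{S}_B - \lambda_t \tilde{S}_W\right) V_t\right] \ge \mathrm{Tr}\left[V_{t-1}^T \left(\tilde{S}_B - \lambda_t \tilde{S}_W\right) V_{t-1}\right].$$

Since $\lambda_t = \mathrm{Tr}[V_{t-1}^T \tilde{S}_B V_{t-1}]/\mathrm{Tr}[V_{t-1}^T \tilde{S}_W V_{t-1}]$, the right-hand side equals zero, and we have

$$\mathrm{Tr}\left[V_t^T \tilde{S}_B V_t\right] - \lambda_t \mathrm{Tr}\left[V_t^T \tilde{S}_W V_t\right] \ge 0.$$

Consequently, because $\tilde{S}_W$ is positive semidefinite and $\mathrm{Tr}[V_t^T \tilde{S}_W V_t] > 0$, dividing by $\mathrm{Tr}[V_t^T \tilde{S}_W V_t]$ yields the following inequality, which gives the first expression of convergence of the proposed TR-KDA-BIGA:

$$\lambda_{t+1} = \frac{\mathrm{Tr}\left[V_t^T \tilde{S}_B V_t\right]}{\mathrm{Tr}\left[V_t^T \tilde{S}_W V_t\right]} \ge \lambda_t.$$

Further, suppose that $\lambda^*$ is the optimal trace ratio value; it follows that

$$\lambda^* = \frac{\mathrm{Tr}\left[(V^*)^T \tilde{S}_B V^*\right]}{\mathrm{Tr}\left[(V^*)^T \tilde{S}_W V^*\right]} = \max_{V^T V = I} \frac{\mathrm{Tr}\left[V^T \tilde{S}_B V\right]}{\mathrm{Tr}\left[V^T \tilde{S}_W V\right]},$$

where $V^*$ is the optimal projection matrix. We therefore have $f(\lambda^*) = 0$. Considering that $f(\lambda^*) = 0$, $f(\lambda_{t+1}) \ge 0$, and $\tilde{S}_W$ is positive semidefinite, so that $f$ does not increase with $\lambda$, we obtain the following inequality, which gives the second expression of convergence of the proposed TR-KDA-BIGA:

$$\lambda_{t+1} \le \lambda^*.$$

This also follows directly from the optimality of $\lambda^*$, because $V_t$ is a feasible projection matrix. We conclude therefore that, for a particular initial trace ratio value $\lambda_t$, the updated value $\lambda_{t+1}$ always satisfies (1) $\lambda_{t+1} \ge \lambda_t$ and (2) $\lambda_{t+1} \le \lambda^*$.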

Performance Evaluation on Benchmark Problems
In order to extensively verify the performance of the proposed TR-KDA-BIGA, it is first tested on a wide range of commonly used benchmark problems taken from the UCI machine learning repository and evaluated with the classification rate (i.e., the number of correctly identified examples/total number of examples) by comparison with existing methods such as PCA, LDA, KPCA [30,31], KDA [32], and TR-LDA. These data sets include the Heart-statlog, Ionosphere, Iris, Wine, Waveform, Balance, and Synthetic Control Chart Time Series (SCCTS) data sets (Table 1), which are of small sizes, low dimensions, large sizes, and/or high dimensions. For the comparative study, we randomly select 50% of the data points from each data set as the training set and use the rest of the data points as the test set. All methods use the training set in the output reduced space to train a one-nearest-neighbor (1NN) classifier for evaluating the classification rate on the test set. To restrict the influence of random effects, the experiments of PCA, LDA, KPCA, KDA, TR-LDA, and TR-KDA-BIGA on each benchmark problem are independently performed for 20 runs. Table 2 compares the classification rate of the proposed TR-KDA-BIGA on the benchmark problems with that of PCA, LDA, KPCA, KDA, and TR-LDA. As seen in Table 2, the proposed TR-KDA-BIGA performs better than all the compared methods, except in the case of Heart-statlog.

Table 3 (time-domain statistical features, including the standard deviation): $x(n)$ is a digital signal series, $n = 1, 2, \ldots, N$, $N$ is the number of elements of the digital signal, and $\bar{x} = \sum_{n=1}^{N} x(n)/N$ and $x_{\mathrm{rms}} = \sqrt{\sum_{n=1}^{N} x(n)^2/N}$ are the mean value and root-mean-square value of the digital signal series, respectively.
The results obtained demonstrate the ability of the proposed TR-KDA-BIGA to classify different classes well. Thus, the proposed TR-KDA-BIGA may be effectively employed for fault diagnosis of rolling element bearings.
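The evaluation protocol of this section (random 50/50 split, 1NN in the reduced space, 20 independent runs) can be sketched as follows. The two-cluster data below merely stand in for the output of a dimensionality reduction method; they are synthetic and not taken from the UCI data sets, and the function names are assumptions.

```python
import numpy as np


def one_nn_predict(Y_train, labels_train, Y_test):
    """Label each test column with the label of its nearest training column."""
    d2 = (
        np.sum(Y_test ** 2, axis=0)[:, None]
        + np.sum(Y_train ** 2, axis=0)[None, :]
        - 2.0 * Y_test.T @ Y_train
    )                                    # squared Euclidean distances
    return labels_train[np.argmin(d2, axis=1)]


def classification_rate(predicted, truth):
    """Fraction of correctly classified examples."""
    return float(np.mean(predicted == truth))


# Synthetic 2-D "reduced space" with two well-separated classes.
rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
Y = rng.normal(scale=0.1, size=(2, 40))
Y[:, labels == 1] += 5.0

rates = []
for _ in range(20):                      # 20 independent runs
    perm = rng.permutation(40)
    train, test = perm[:20], perm[20:]   # random 50% / 50% split
    predicted = one_nn_predict(Y[:, train], labels[train], Y[:, test])
    rates.append(classification_rate(predicted, labels[test]))

mean_rate, std_rate = np.mean(rates), np.std(rates)
```

Reporting `mean_rate` together with `std_rate` mirrors the mean and deviation format used for the benchmark results.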

The Proposed TR-KDA Using BIGA for Fault Diagnosis of Rolling Element Bearings
In this section, the proposed TR-KDA-BIGA is applied to fault diagnosis of rolling element bearings. Vibration signals from rolling element bearings are first filtered using a low-pass filter. Then, the filtered vibration signals are divided into sections of equal window length. One set of relevant features obtained from each window is used to characterize, to some extent, the health status of the rolling element bearings. Most of the faults occurring in rolling element bearings introduce increased friction and impulsive forces when the bearings are rotating, which generally cause the vibration signals in the time domain, frequency domain, and/or time-frequency domain to differ from the normal ones. In this study, 9 time-domain statistical features (Table 3) are extracted from the vibration signal. All of these 9 time-domain statistical features reflect the characteristics of time series data in the time domain. Moreover, 6 time-frequency-domain wavelet features describing the percentages of energy corresponding to wavelet coefficients are extracted from the vibration signal by using the Daubechies-4 (db4) wavelet to decompose the vibration signal into five levels [32]. Wavelet features extracted in such a way reflect the vibration energy distribution in the time-frequency domain to the greatest extent. Thus, 9 time-domain statistical features together with 6 time-frequency-domain wavelet features are used to represent each window's vibration signals.

... to detect such impacts that behaved like damped oscillations. Vibration signals were captured from four different health statuses of the bearing, that is, normal bearings (Normal), inner race fault (IR), ball fault (BA), and outer race fault (OR). For each of the three abnormal statuses (IR, BA, and OR), there are three different levels of severity with fault diameters of 0.007 inches, 0.014 inches, and 0.021 inches. All the experiments were done for three different load conditions (1 HP, 2 HP, and 3 HP). Figure 1 illustrates the experimental setup. Experimental data were collected from the drive end ball bearing of an induction motor (Reliance Electric 2 HP IQPreAlert) driven test rig. Table 4 gives a short description of the rolling element bearing data.
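The windowing and time-domain feature extraction described at the start of this section can be sketched as follows. Only the mean, standard deviation, and root-mean-square value are named in the surviving text; the peak value and crest factor, the `ddof=1` convention for the standard deviation, and the sinusoidal test signal are assumptions added for illustration. Low-pass filtering and the db4 wavelet features are not shown.

```python
import numpy as np


def split_windows(signal, window_length):
    """Cut a 1-D signal into consecutive windows of equal length."""
    n_windows = signal.size // window_length
    return signal[: n_windows * window_length].reshape(n_windows, window_length)


def time_domain_features(window):
    """A few time-domain statistics of one window (illustrative subset)."""
    mean = window.mean()
    rms = np.sqrt(np.mean(window ** 2))
    peak = np.max(np.abs(window))
    return {
        "mean": mean,
        "std": window.std(ddof=1),       # sample standard deviation (assumed)
        "rms": rms,
        "peak": peak,                    # assumed extra feature
        "crest_factor": peak / rms,      # assumed extra feature
    }


# Stand-in signal: 5 periods of a unit sine sampled at 1000 points.
t = np.arange(1000) / 1000.0
signal = np.sin(2.0 * np.pi * 5.0 * t)
windows = split_windows(signal, window_length=200)   # one period per window
features = [time_domain_features(w) for w in windows]
```

Stacking the per-window feature values row by row gives the feature matrix that a dimensionality reduction method would take as input.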

Visualization of Bearing Data.
The visualization performance of the proposed TR-KDA-BIGA is compared with that of PCA, LDA, KPCA, KDA, and TR-LDA using simulations, where KPCA and KDA are the kernel extensions of PCA and LDA, respectively. The two-dimensional visualization results of bearing data for three different load conditions (1, 2, and 3 HP) obtained with PCA, LDA, KPCA, KDA, TR-LDA, and the proposed TR-KDA-BIGA are summarized in Figures 2, 3, and 4, respectively. As seen in Figures 2, 3, and 4, the proposed TR-KDA-BIGA outperforms all the compared methods, not only in closely clustering bearing data belonging to the same class but also in clearly separating bearing data belonging to different classes, under all three load conditions (1, 2, and 3 HP). Compared with the unsupervised methods (i.e., PCA and KPCA), the supervised methods (i.e., LDA, KDA, TR-LDA, and TR-KDA-BIGA) preserve more of the discriminative information embedded in bearing data and obtain clearer and less overlapped boundaries. It can also be concluded from Figures 2, 3, and 4 that the methods using the kernel trick (i.e., KPCA, KDA, and TR-KDA-BIGA) performed better than the methods without the kernel trick (i.e., PCA, LDA, and TR-LDA) in separating samples from different classes in the learned subspace.

Classification of Bearing Data.
The classification performance of the proposed TR-KDA-BIGA is compared with that of PCA, LDA, KPCA, KDA, and TR-LDA. In order to show the robustness of the proposed TR-KDA-BIGA, we

Conclusions
The rolling element bearing is a core component of many systems, and its failure can lead to reduced capability, downtime, and even catastrophic breakdowns. Effective and efficient fault diagnosis of rolling element bearings plays an extremely important role in the safe and reliable operation of their host systems. The numerical results indicate that the proposed TR-KDA-BIGA can effectively and efficiently classify different classes of rolling element bearing data, while also providing the capability of real-time visualization that is very useful for practitioners to monitor the health status of rolling element bearings. Empirical comparisons show that the proposed TR-KDA-BIGA performs better than existing methods in classifying different rolling element bearing data.
The proposed TR-KDA-BIGA may be a promising tool for fault diagnosis of rolling element bearings. Three research directions are worth pursuing. First, although this study considers the specific fault diagnosis of rolling element bearings, the proposed method can be modified and extended to address the fault diagnosis of gearboxes [37,38] and cutting tools [39,40]. Second, frequency-domain information can be utilized for fault diagnosis of rolling element bearings [41,42]; it would thus be interesting to integrate frequency-domain features with time-domain and time-frequency-domain features. Third, empirical mode decomposition is a very powerful tool for nonlinear and nonstationary signal processing [43][44][45]; it would also be interesting to employ empirical mode decomposition to extract periodic components and random transient components from the bearing vibration signal mixture, which may be very helpful for extracting fault signatures from a collected bearing vibration signal.

(6) Perform the crossover operation on $A_k$ and then generate the population $B_k$.
(7) Perform the mutation operation on $B_k$ and then generate the population $C_k$.
(8) Perform the vaccination operation on $C_k$ and then generate the population $D_k$.
(9) Perform the immune selection operation on $D_k$ and then generate the next generational population $A_{k+1}$. Go to step (3).
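The computational flow of the BIGA (crossover, mutation, vaccination, immune selection) can be sketched end to end in plain Python. This is a simplified illustration under several assumptions that are not stated in the source: the repair step that restores $d$ selected genes after crossover, the vaccine (switching on the gene with the best individual ratio $h_k/g_k$), the temperature schedule $T_k = \ln(1 + 1/k)$, all parameter defaults, and the toy values of `h` and `g`.

```python
import math
import random


def fitness(u, h, g):
    """Trace ratio of the eigenvectors selected by the binary selector u."""
    return sum(a * b for a, b in zip(u, h)) / sum(a * b for a, b in zip(u, g))


def repair(u, d, rng):
    """Assumed repair step: restore exactly d ones."""
    u = list(u)
    while sum(u) > d:
        u[rng.choice([k for k, bit in enumerate(u) if bit == 1])] = 0
    while sum(u) < d:
        u[rng.choice([k for k, bit in enumerate(u) if bit == 0])] = 1
    return u


def swap_mutation(u, rate, rng):
    """Flip a gene together with a random opposite gene to keep d ones."""
    u = list(u)
    for k in range(len(u)):
        partners = [j for j, bit in enumerate(u) if bit != u[k]]
        if partners and rng.random() < rate:
            j = rng.choice(partners)
            u[k], u[j] = u[j], u[k]
    return u


def vaccinate(u, gene, rng):
    """Assumed vaccine: switch on one promising gene, switch off another."""
    u = list(u)
    if u[gene] == 0:
        u[rng.choice([k for k, bit in enumerate(u) if bit == 1])] = 0
        u[gene] = 1
    return u


def biga(h, g, d, pop_size=20, generations=50, pc=0.8, pm=0.05, alpha=0.3, seed=0):
    rng = random.Random(seed)
    n = len(h)
    score = lambda u: fitness(u, h, g)
    pop = [repair([0] * n, d, rng) for _ in range(pop_size)]   # initial population
    gene = max(range(n), key=lambda k: h[k] / g[k])            # step (4), assumed
    best = max(pop, key=score)
    for k in range(1, generations + 1):                        # step (5)
        offspring = []
        for i in range(0, pop_size - 1, 2):
            p1, p2 = pop[i], pop[i + 1]
            c1, c2 = p1, p2
            if rng.random() < pc:                              # step (6)
                cut = rng.randint(1, n)
                c1 = repair(p1[:cut] + p2[cut:], d, rng)
                c2 = repair(p2[:cut] + p1[cut:], d, rng)
            for parent, child in ((p1, c1), (p2, c2)):
                child = swap_mutation(child, pm, rng)          # step (7)
                if rng.random() < alpha:                       # step (8)
                    child = vaccinate(child, gene, rng)
                # Step (9), immunity test: keep the parent if the child degenerated.
                offspring.append(parent if score(child) < score(parent) else child)
        # Step (9), annealing selection with a temperature tending towards 0.
        temperature = math.log(1.0 + 1.0 / k)
        top = max(map(score, offspring))
        weights = [math.exp((score(u) - top) / temperature) for u in offspring]
        pop = rng.choices(offspring, weights=weights, k=pop_size)
        best = max(pop + [best], key=score)
    return best


# Hypothetical per-eigenvector values h_k and g_k.
h = [4.0, 3.0, 1.0, 0.5]
g = [2.0, 1.0, 1.0, 0.25]
best = biga(h, g, d=2)
```

Fixing the seed makes a run repeatable, which is convenient when comparing parameter settings such as the crossover rate `pc` or the vaccination proportion `alpha`.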

Figure 1: Experimental setup: (a) test rig; (b) schematic description of the test rig.

The ITR algorithm maximizes the trace ratio $\mathrm{Tr}[W^T S_B W]/\mathrm{Tr}[W^T S_W W]$ in an iterative and incremental manner. The $W$ in the $t$th iteration step (referred to as $W_t$) is obtained by solving the trace difference problem $\arg\max_{W^T W = I} \mathrm{Tr}[W^T (S_B - \lambda_t S_W) W]$, where $\lambda_t$ represents the trace ratio value derived from the $W$ in the previous iteration step (referred to as $W_{t-1}$). However, the initialization of $W$ substantially influences the convergence performance of the ITR algorithm. A good initialization generally makes the ITR algorithm converge quickly, whereas a bad initialization usually increases the number of iterations. Moreover, in the ITR algorithm, although the $W$ formed with the eigenvectors corresponding to the $d$ largest eigenvalues of $S_B - \lambda_t S_W$ maximizes the trace difference $\mathrm{Tr}[W^T (S_B - \lambda_t S_W) W]$, it does not necessarily maximize the trace ratio $\mathrm{Tr}[W^T S_B W]/\mathrm{Tr}[W^T S_W W]$. On the other hand, from the perspective of fault diagnosis, the aim is mainly to find a set of projection vectors that provide the highest levels of discrimination among the different fault patterns.

Table 1: Specification of benchmark problems.

Table 2: Results of the classification rate for the benchmark problems (mean ± standard deviation).

Table 4: Description of rolling element bearing data.

Table 5: Classification accuracy of bearing data under 1 HP motor load.

Table 6: Classification accuracy of bearing data under 2 HP motor load.

Table 7: Classification accuracy of bearing data under 3 HP motor load.