Dimensionality reduction is a crucial task in machinery fault diagnosis. Manifold learning, a popular dimensionality reduction technology, has recently been applied successfully in many fields. However, most manifold learning techniques are not suitable for this task, because they are unsupervised in nature and fail to discover the discriminant structure in the data. To overcome these weaknesses, a kernel local linear discriminate (KLLD) algorithm is proposed. KLLD is a novel algorithm that combines the advantages of neighborhood preserving projections (NPP), Floyd's algorithm, the maximum margin criterion (MMC), and the kernel trick. KLLD has four advantages. First, KLLD is a supervised dimensionality reduction method that can overcome the out-of-sample problem. Second, the short-circuit problem can be avoided. Third, the KLLD algorithm uses the between-class and within-class scatter matrices more efficiently. Finally, the kernel trick is incorporated into KLLD to find a more precise solution. The main feature of the proposed method is that it attempts both to preserve the intrinsic neighborhood geometry of the data and to extract the discriminant information. Experiments have been performed to evaluate the new method, and the results show that KLLD offers more benefits than traditional methods.
As information collection technology becomes more and more advanced, huge amounts of data are produced while mechanical equipment is running. The sensitive information that reflects the running status of the equipment is submerged in a large amount of redundant data, and effective dimensionality reduction can solve this problem. Dimensionality reduction is therefore one of the key technologies for equipment condition monitoring and fault diagnosis. Nonlinear and nonstationary vibration signals generated by the rolling bearing [
Manifold learning is a data-driven approach that can reveal the underlying nature of complex data structures, providing a new way to analyze the intrinsic dimension based on the data distribution. Manifold learning has achieved a series of research results in feature extraction [
NPP [
In view of these issues, a fault feature extraction method named KLLD is proposed in this paper. The method is applied, for dimensionality reduction, to both the iris dataset and an original rolling bearing feature dataset constructed from wavelet packet energy. Its effectiveness is verified by comparison with conventional analysis methods.
The rest of this paper is organized as follows. In Section
The LLE (local linear embedding) algorithm is a nonlinear dimensionality reduction technique that uses local neighborhood relations to learn the global structure of nonlinear manifolds.
For the given
In the LLE algorithm, the low-dimensional points preserve the local reconstruction relations of the high-dimensional data. Each of the high-dimensional data
Select neighbors by
Reconstruct with linear weights by (
Map to embed coordinates
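The three steps above can be sketched in NumPy. This is a minimal illustration, not the paper's implementation; the neighborhood size `k`, target dimension `d`, and regularization constant are illustrative choices:

```python
import numpy as np

def lle(X, k=8, d=2, reg=1e-3):
    """Minimal LLE sketch: neighbors -> reconstruction weights -> embedding."""
    n = X.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        # Step 1: k nearest neighbors by Euclidean distance (skip the point itself)
        idx = np.argsort(np.linalg.norm(X - X[i], axis=1))[1:k + 1]
        # Step 2: weights that best reconstruct x_i from its neighbors
        Z = X[idx] - X[i]
        C = Z @ Z.T
        C += reg * np.trace(C) * np.eye(k)   # regularize for numerical stability
        w = np.linalg.solve(C, np.ones(k))
        W[i, idx] = w / w.sum()              # weights constrained to sum to one
    # Step 3: embedding from the bottom eigenvectors of (I - W)^T (I - W)
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    _, vecs = np.linalg.eigh(M)              # eigh returns ascending eigenvalues
    return vecs[:, 1:d + 1]                  # drop the constant eigenvector
```

The embedding uses the eigenvectors with the smallest nonzero eigenvalues, which is why the constant eigenvector (eigenvalue zero) is skipped.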
The LLE algorithm possesses many advantages; however, its computational cost is higher than that of other, linear dimensionality reduction methods. Moreover, it cannot map new test samples into the low-dimensional space directly, which is known as the out-of-sample problem. NPP derives from LLE and inherits LLE's neighborhood-preserving property, but it modifies LLE by introducing a linear matrix
The low-dimensional feature datasets
This algorithm was proposed by Floyd and is used to calculate the shortest distance between any two specified points.
Initialization: computing
For
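The initialization and relaxation steps can be sketched as follows (a minimal NumPy version of the standard algorithm, not the paper's exact pseudocode):

```python
import numpy as np

def floyd(D):
    """Floyd's algorithm: all-pairs shortest-path distances.

    D is the matrix of direct edge lengths, with np.inf where two
    points are not directly connected and 0 on the diagonal.
    """
    G = D.copy()
    n = G.shape[0]
    for k in range(n):
        # keep the shorter of the current path and the path through node k
        G = np.minimum(G, G[:, [k]] + G[[k], :])
    return G
```

For example, with edges 0-1 of length 1 and 1-2 of length 2 and no direct 0-2 edge, the algorithm returns a 0-2 distance of 3 through node 1.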
LDA (linear discriminant analysis) is a popular linear feature extractor. The key step is to find a transform matrix
However, we can find drawbacks, because (
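As a sketch of the quantities involved, the between-class scatter Sb and within-class scatter Sw that LDA (and the maximum margin criterion, which maximizes tr(Sb - Sw)) build on can be computed as follows; the function name and layout are illustrative:

```python
import numpy as np

def scatter_matrices(X, y):
    """Between-class (Sb) and within-class (Sw) scatter matrices."""
    mean_all = X.mean(axis=0)
    d = X.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mc = Xc.mean(axis=0)
        # class-mean deviation from the global mean, weighted by class size
        Sb += len(Xc) * np.outer(mc - mean_all, mc - mean_all)
        # spread of the samples around their own class mean
        Sw += (Xc - mc).T @ (Xc - mc)
    return Sb, Sw
```

A useful sanity check is that Sb + Sw equals the total scatter matrix computed around the global mean.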
According to the LLTSA derivation process in the literature [
Suppose that
To find local linear discrimination information in the feature space
The traditional Euclidean distance has advantages such as being perceptually intuitive and easy to understand and compute. However, the Euclidean method can easily lead to the short-circuit problem [
The short-circuit problem.
Figure
According to Sections
This step involves computing the Euclidean distance between any two points
Set connection threshold
The distance between
Reconstruct weighting matrix
Calculate matrix
Calculate
Besides the input and output of LLD, kernel parameter
Reconstruct weighting matrix
Calculate
Calculate
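The kernel step can be illustrated with a Gaussian (RBF) kernel, a common choice for the kernel trick; whether the paper uses this exact kernel is not shown here, so treat `sigma` as an illustrative parameter:

```python
import numpy as np

def rbf_kernel_matrix(X, sigma=1.0):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    # squared pairwise Euclidean distances via broadcasting
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-sq / (2.0 * sigma ** 2))
```

The resulting Gram matrix is symmetric with ones on the diagonal, and it replaces the raw inner products wherever the linear algorithm only needs dot products between samples.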
We evaluated the performance of the new approach on the iris plants database.
Scatter plots of iris dataset by different features: (a) KPCA Matrix feature of dataset1, (b) KPCA Matrix feature of dataset2, (c) NPP Manifold feature of dataset1, (d) NPP Manifold feature of dataset2, (e) LLD Manifold feature of dataset1, (f) LLD Manifold feature of dataset2, (g) KLLD Manifold feature of dataset1, and (h) KLLD Manifold feature of dataset2.
The KPCA and NPP methods can hardly discriminate the three types of plants, as shown in Figures
Comparison of four dimensionality reduction methods for iris dataset classification.
Method | Class 1 | Class 2 | Class 3 | Average accuracy
---|---|---|---|---
KPCA | 0.960 | 0.760 | 0.760 | 0.8267 |
NPP | 1.0 | 0.80 | 1.0 | 0.9333 |
LLD | 1.0 | 0.920 | 0.920 | 0.9467 |
KLLD | 1.0 | 0.920 | 1.0 | 0.9733 |
The calculation steps are designed as follows.
Collect the vibration signal of the rolling bearing.
Wavelet packet energy is used to construct the original features datasets.
Projection matrix
Find out dimension reduction result by
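The feature-construction step can be sketched as follows. Note this is a stand-in: it splits the FFT spectrum into equal bands instead of performing a true wavelet packet decomposition, which would require a wavelet library; the normalization to unit total energy mirrors the wavelet packet energy features used in the paper:

```python
import numpy as np

def band_energy_features(signal, n_bands=8):
    """Normalized energy per frequency band.

    Stand-in for the paper's wavelet packet energy features: here the
    power spectrum is simply split into n_bands equal bands via the FFT.
    """
    spectrum = np.abs(np.fft.rfft(signal)) ** 2
    bands = np.array_split(spectrum, n_bands)
    energies = np.array([band.sum() for band in bands])
    return energies / energies.sum()   # normalize so the bands sum to one
```

Each vibration record then yields one feature vector, and stacking these vectors gives the original feature dataset that the projection matrix is applied to.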
The KLLD dimensionality reduction process for the rolling bearing fault datasets is shown in Figure
Scheme of KLLD dimension reduction for the rolling bearing dataset.
Under normal and fault operating conditions, time-domain waveform signals are shown in Figure
Waveform of vibration signals from bearing under four conditions: (a) normal, (b) inner race fault, (c) ball fault, and (d) outer race fault.
The time-domain waveform of a bearing inner race fault shows typical shock components. The waveform of a normal rolling bearing is stable, with little fluctuation in amplitude. The waveform of a rolling element fault contains random single impact components, while the time-domain waveform of a bearing outer race fault is very similar to that of the inner race fault. It is therefore hard to distinguish the different fault conditions of the rolling bearing from the time-domain waveform alone. Wavelet packet analysis is a precise method for signal analysis and is currently widely used in bearing fault diagnosis, so we use it to construct the original features. Typical wavelet packet energy distributions for each fault are shown in Figure
Wavelet packet energy spectrums of bearing vibration signals under four conditions: (a) normal, (b) inner race fault, (c) ball fault, and (d) outer race fault.
To verify the validity of the KLLD method, experiments were performed on the rolling bearing vibration database of the Electrical Engineering Laboratory of Case Western Reserve University. We selected bearing model SKF6203, with a running speed of 1730 rpm, under normal, inner race fault, ball fault, and outer race fault conditions. The signals were processed by wavelet packet decomposition, and two original feature datasets, named dataset1 and dataset2, were constructed. Both datasets have 40 points. Table
Original features constructed from the wavelet packet energy spectrums of rolling bearing dataset1.
Energy band | Normal | | | | Inner race fault | | | | Outer race fault | | |
---|---|---|---|---|---|---|---|---|---|---|---|---
1 | 0.6363 | 0.6593 | … | 0.6046 | 0.0172 | 0.0181 | … | 0.0160 | 0.0303 | 0.0287 | … | 0.0284
2 | 0.1865 | 0.1622 | … | 0.2291 | 0.4220 | 0.4204 | … | 0.4222 | 0.1484 | 0.1579 | … | 0.2012
3 | 0.0074 | 0.0075 | … | 0.0071 | 0.0648 | 0.0701 | … | 0.0742 | 0.1505 | 0.1522 | … | 0.1463
4 | 0.1654 | 0.1666 | … | 0.1554 | 0.0451 | 0.0373 | … | 0.0338 | 0.0220 | 0.0223 | … | 0.0275
5 | 0.0004 | 0.0004 | … | 0.0004 | 0.0107 | 0.0110 | … | 0.0114 | 0.0010 | 0.0010 | … | 0.0009
6 | 0.0013 | 0.0012 | … | 0.0012 | 0.0311 | 0.0290 | … | 0.0256 | 0.0085 | 0.0087 | … | 0.0073
7 | 0.0005 | 0.0004 | … | 0.0005 | 0.3345 | 0.3383 | … | 0.3387 | 0.6169 | 0.6076 | … | 0.5681
8 | 0.0023 | 0.0025 | … | 0.0018 | 0.0745 | 0.0758 | … | 0.0782 | 0.0225 | 0.0216 | … | 0.0203
According to [
The parameter
LLE is the basic version of NPP, LLD, and KLLD, and it can be used to determine the optimal
Finding optimal
It can be seen that the distances between points of different classes are numerically large, while those within the same class are small, as listed in Table
Distances between each pair of points of rolling bearing dataset1.
 | x1 | x2 | … | x10 | x11 | x12 | … | x20 | … | x31 | x32 | … | x40
---|---|---|---|---|---|---|---|---|---|---|---|---|---
x1 | 0.0000 | 0.0062 | … | 0.0206 | 3.9698 | 3.9639 | … | 3.8892 | … | 3.3080 | 3.3006 | … | 3.0511
x2 | 0.0062 | 0.0000 | … | 0.0482 | 4.1455 | 4.1385 | … | 4.0637 | … | 3.3738 | 3.3704 | … | 3.1379
⋮ | | | | | | | | | | | | |
x10 | 0.0206 | 0.0482 | … | 0.0000 | 3.6692 | 3.6591 | … | 3.5795 | … | 3.1847 | 3.1706 | … | 2.8929
x11 | 3.9698 | 4.1455 | … | 3.6692 | 0.0000 | 0.0083 | … | 0.0442 | … | 2.3464 | 2.2919 | … | 2.1105
⋮ | | | | | | | | | | | | |
x20 | 3.8892 | 4.0637 | … | 3.5795 | 0.0442 | 0.0153 | … | 0.0000 | … | 2.1124 | 2.0596 | … | 1.8712
⋮ | | | | | | | | | | | | |
x40 | 3.0511 | 3.1379 | … | 2.8929 | 2.1105 | 1.9920 | … | 1.8712 | … | 0.0322 | 0.0239 | … | 0.0000
The KPCA, NPP, and LLD algorithms are also used to verify the effectiveness of KLLD. Table
Scatter plots of bearing vibration samples by different features: (a) KPCA Matrix feature of dataset1, (b) KPCA Matrix feature of dataset2, (c) NPP Manifold feature of dataset1, (d) NPP Manifold feature of dataset2, (e) LLD Manifold feature of dataset1, (f) LLD Manifold feature of dataset2, (g) KLLD Manifold feature of dataset1, and (h) KLLD Manifold feature of dataset2.
As shown in Figures
The within-class distance in the low-dimensional space is calculated and is shown in Table
Comparison of within-class distance of low dimension using four dimensionality reduction methods.
Method | Normal | Inner race fault | Ball fault | Outer race fault
---|---|---|---|---
KPCA | | | |
NPP | | | |
LLD | | | |
KLLD | | | |
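One common formulation of the within-class distance, the mean distance of each sample to its class centroid, can be computed as follows; the paper does not spell out its exact formula, so this particular definition is an assumption:

```python
import numpy as np

def within_class_distance(Y, labels):
    """Mean distance of the samples of each class to their class centroid."""
    result = {}
    for c in np.unique(labels):
        Yc = Y[labels == c]
        centroid = Yc.mean(axis=0)
        # average Euclidean distance from each sample to the centroid
        result[c] = np.linalg.norm(Yc - centroid, axis=1).mean()
    return result
```

A smaller within-class distance (relative to the between-class distance) indicates tighter, better-separated clusters after dimensionality reduction.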
Ratio of between-class distance and within-class distance using four dimensionality reduction methods.
 | KPCA | NPP | LLD | KLLD
---|---|---|---|---
Dataset1 | 159.5 | 517.7 | 226.6 |
Dataset2 | 38.97 | 47.28 | 120.8 | 186.4
Comparison of four dimensionality reduction methods for classification.
Method | Normal | Inner race fault | Ball fault | Outer race fault | Average accuracy
---|---|---|---|---|---
KPCA | 1.0 | 1.0 | 0.15 | 1.0 | 0.7875 |
NPP | 1.0 | 1.0 | 0.775 | 1.0 | 0.9438 |
LLD | 1.0 | 1.0 | 0.90 | 1.0 | 0.975 |
KLLD | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 |
A novel dimensionality reduction algorithm for the purpose of discrimination, called kernel local linear discriminate (KLLD), has been proposed in this paper. The most prominent property of KLLD is that it preserves both the discriminant and the local geometrical structures in the data, whereas traditional dimensionality reduction algorithms cannot properly preserve the discriminant structure. We applied the algorithm to dimensionality reduction of the iris database; the experiment demonstrated that it can extract the features of the different kinds of iris and is suitable for classification. We then applied KLLD to machinery fault diagnosis. First, the original feature space of the rolling bearing dataset was constructed from wavelet packet energy. Second, the KLLD algorithm and other dimensionality reduction methods were applied to the original feature space. Finally, SVM was used for classification. The experiments show that our method has an excellent capability for clustering and dimensionality reduction.
The authors declare that there is no conflict of interests regarding the publication of this paper.
This work is supported by the National Natural Science Foundation of China (no. 51175316), the Specialized Research Fund for the Doctoral Program of Higher Education (no. 20103108110006), and the Shanghai Science and Technology Commission Basic Research Project (no. 11JC1404100).