Vibration signals of rotating machinery are nonstationary and nonlinear under complicated operating conditions, so extracting effective features from the raw signal is essential for accurate fault diagnosis. To handle the nonlinearity, an enhanced deep feature extraction method based on the Gaussian radial basis kernel function and the autoencoder (AE) is proposed. First, a kernel function is employed to enhance the feature learning capability, and a new AE, termed the kernel AE (KAE), is designed. Subsequently, a deep neural network is constructed with one KAE and multiple AEs to extract inherent features layer by layer. Finally, softmax is adopted as the classifier to identify different bearing faults, and the error backpropagation algorithm is used to fine-tune the model parameters. Aircraft engine intershaft bearing vibration data are used to verify the method. The results confirm that the proposed method has a better feature extraction capability, requires fewer iterations, and achieves a higher accuracy than the standard stacked AE method.

Effective health diagnosis of rolling bearings is of great importance in modern industry. The bearings of rotating machinery inevitably develop various faults under harsh working conditions such as heavy loads, strong impacts, and high speeds [

Methods based on vibration signals have been widely studied and applied because vibration signals usually carry rich information [

Deep learning represents a novel pattern recognition approach and was proposed by Hinton and Salakhutdinov in 2006 [

The kernel function is an effective method utilized in machine learning to solve the nonlinear problems; examples include SVM [

In this study, an enhancement deep feature extraction method is developed with one KAE and

The remainder of this paper is organized as follows. Section

A common AE is a three-layer network consisting of an encoder network and a decoder network. The encoder network connects the input layer to the hidden layer and extracts the features of the original data. The decoder network connects the hidden layer to the output layer and reconstructs the input from the low-dimensional code, so the target output is equal to the input.

The encoder network is defined as an encoding function denoted by

The decoder network is defined as a reconstruction function denoted by

The parameter set

When the loss function is sufficiently small, it can be assumed that the coding vector is capable of reconstructing the original input vector; that is, most of the information contained in the original data is retained in the encoding vector. The AE is therefore also a nonlinear dimensionality reduction method, in which the encoding vector has fewer dimensions than the input vector.
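
The encoder, decoder, and reconstruction loss described above can be illustrated with a minimal NumPy sketch. The defining equations are not reproduced in this section, so the sigmoid activation, the squared-error loss, and all names (`encode`, `decode`, `reconstruction_loss`, `W1`, `b1`, `W2`, `b2`) are assumptions chosen for illustration rather than the exact formulation used in this study.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def encode(x, W1, b1):
    """Encoder: map each input row to a lower-dimensional code (assumed sigmoid)."""
    return sigmoid(x @ W1 + b1)

def decode(h, W2, b2):
    """Decoder: reconstruct the input from the code (assumed sigmoid)."""
    return sigmoid(h @ W2 + b2)

def reconstruction_loss(x, x_hat):
    """Assumed loss: half the squared reconstruction error, averaged over samples."""
    return 0.5 * np.mean(np.sum((x - x_hat) ** 2, axis=1))

# Toy data: 5 samples with 8 inputs, compressed to a 3-dimensional code.
rng = np.random.default_rng(0)
n_in, n_hidden = 8, 3
x = rng.random((5, n_in))
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_in))
b2 = np.zeros(n_in)

h = encode(x, W1, b1)          # encoding vectors, shape (5, 3)
x_hat = decode(h, W2, b2)      # reconstructions, shape (5, 8)
loss = reconstruction_loss(x, x_hat)
```

Training would adjust the four parameter arrays to reduce `loss`; a small value indicates that the 3-dimensional code retains most of the information in the 8-dimensional input.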

An AE is an unsupervised three-layer learning network, but its information extraction ability is limited and it lacks sufficient structure to represent the deep characteristics of the signal. A stacked autoencoder (SAE) uses multiple AE layers to develop more hidden layers; each AE layer performs a nonlinear transformation of the input samples from the preceding layer to the following one. During the training process, the hidden-layer output of each AE serves as the input to the succeeding AE, and the network uses an unsupervised learning algorithm layer by layer to extract the features from the input data.
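
The layer-by-layer unsupervised training can be sketched as follows. This is a simplified illustration, not the training procedure of this study: it assumes sigmoid units, a squared reconstruction error, and plain batch gradient descent, and the helper names `train_ae` and `pretrain_stack` as well as the layer sizes are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_ae(x, n_hidden, lr=0.5, epochs=200, seed=0):
    """Train one AE by batch gradient descent; return encoder weights and loss history."""
    rng = np.random.default_rng(seed)
    n, d = x.shape
    W1 = rng.normal(scale=0.1, size=(d, n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(scale=0.1, size=(n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        h = sigmoid(x @ W1 + b1)
        x_hat = sigmoid(h @ W2 + b2)
        err = x_hat - x
        losses.append(0.5 * np.mean(np.sum(err ** 2, axis=1)))
        # Backpropagate the reconstruction error through decoder and encoder.
        d_out = err * x_hat * (1.0 - x_hat) / n
        d_hid = (d_out @ W2.T) * h * (1.0 - h)
        W2 -= lr * (h.T @ d_out)
        b2 -= lr * d_out.sum(axis=0)
        W1 -= lr * (x.T @ d_hid)
        b1 -= lr * d_hid.sum(axis=0)
    return (W1, b1), losses

def pretrain_stack(x, layer_sizes):
    """Greedy pretraining: the hidden output of each AE is the input of the next."""
    params, all_losses = [], []
    layer_input = x
    for n_hidden in layer_sizes:
        (W, b), losses = train_ae(layer_input, n_hidden)
        params.append((W, b))
        all_losses.append(losses)
        layer_input = sigmoid(layer_input @ W + b)
    return params, layer_input, all_losses

# Toy data: 40 samples with 16 inputs, stacked down to 8 and then 4 features.
rng = np.random.default_rng(1)
x = rng.random((40, 16))
params, features, all_losses = pretrain_stack(x, [8, 4])
```

Only the encoder half of each trained AE is kept; the decoders are discarded once their layer has been pretrained.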

The input layer and the first hidden layer of the SAE are regarded as the encoder network of the first AE. After the first AE is trained through minimizing the reconstruction error in (

Then the encode vector

Subsequently, a backpropagation (BP) algorithm is used to fine-tune the network parameters using a supervised approach. The SAE is a type of deep neural network (DNN) that combines a supervised and an unsupervised approach [

The encoding process of the AE is a nonlinear calculation but for low-dimensional raw data an accurate classification requires a large number of iterations and a long calculation time; this approach is also prone to misclassifications. In order to solve this problem, the kernel function method is combined with the AE.

A kernel function is defined as a nonlinear mapping

Based on the above theory, an improved method, the KAE, which combines the kernel function and the AE, is proposed. First, the Gram matrix of the kernel function is calculated and used as the input of the new autoencoder; the coding process changes to

Correspondingly, the decoding function is changed to

The improved AE network firstly maps the data to a high-dimensional space; then the high-dimensional data are coded and calculated and the nonlinear low-dimensional features are obtained. By adding the kernel functions, the original signal components are mapped to a high-dimensional space, which speeds up the coding process and improves the efficiency of extracting the signal characteristics and the classification accuracy. The algorithm structure diagram is shown in Figure

The structure of KAE.

The proposed method can be summarized as follows: choose the KAE as the first layer of the deep network and use the hidden layer of the KAE as the input of the next layer of the AE. Connect multiple AE layers to form a deep network and use the BP algorithm to fine-tune the parameters and obtain the diagnosis model.
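
A forward pass through such a network might look like the sketch below. Several points are assumptions made only for illustration: the kernel is evaluated between each input sample and a set of reference samples (here called `refs`) to form the input of the first encoder, the units are sigmoid, the weights are random rather than trained, and the layer sizes are toy values. The KAE equations themselves are not reproduced in this section, so this should be read as one plausible reading of the summary, not as the authors' implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gaussian_kernel(a, b, sigma):
    """Pairwise Gaussian RBF kernel between the rows of a and the rows of b."""
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma ** 2))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # subtract the row maximum for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def forward(x, refs, sigma, layers, W_out, b_out):
    """Kernel map (KAE input), then stacked encoders, then a softmax classifier."""
    h = gaussian_kernel(x, refs, sigma)
    for W, b in layers:
        h = sigmoid(h @ W + b)
    return softmax(h @ W_out + b_out)

rng = np.random.default_rng(0)
refs = rng.normal(size=(30, 12))   # hypothetical reference (training) samples
x = rng.normal(size=(6, 12))       # samples to classify
sizes = [30, 16, 8, 4]             # kernel features, then hidden sizes (illustrative)
layers = [(rng.normal(scale=0.1, size=(m, n)), np.zeros(n))
          for m, n in zip(sizes[:-1], sizes[1:])]
n_classes = 10
W_out = rng.normal(scale=0.1, size=(sizes[-1], n_classes))
b_out = np.zeros(n_classes)

probs = forward(x, refs, sigma=3.0, layers=layers, W_out=W_out, b_out=b_out)
pred = probs.argmax(axis=1) + 1    # class labels numbered from 1
```

In a trained model, the encoder weights would come from layer-wise pretraining, and all weights, including `W_out` and `b_out`, would then be fine-tuned with backpropagation on the labeled samples.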

Based on the Mercer theorem, any symmetric positive semidefinite function can be used as a kernel function. Common kernel functions include the linear kernel function, polynomial kernel function, and radial basis kernel function. A radial basis function is a real-valued function that depends only on distance. The Gaussian radial basis kernel function uses the Euclidean distance as the distance function; its kernel matrix has good properties and it has only one undetermined parameter, so the complexity of the model is low. Therefore, the Gaussian radial basis function is used in this study; its mathematical expression is as follows:
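
A commonly used form of the Gaussian radial basis kernel is $k(x, y) = \exp\left(-\|x - y\|^2 / (2\sigma^2)\right)$, with $\sigma$ as the single kernel parameter. The sketch below assumes this parametrization, which may differ from the one used in this study (some authors write the exponent as $-\gamma\|x - y\|^2$ instead). The function name `gaussian_gram` and the toy data are illustrative.

```python
import numpy as np

def gaussian_gram(x, sigma):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)) for the rows of x."""
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * x @ x.T
    sq_dist = np.maximum(sq_dist, 0.0)   # clip tiny negatives caused by rounding
    return np.exp(-sq_dist / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
x = rng.normal(size=(20, 5))
K = gaussian_gram(x, sigma=2.0)

# Mercer condition in practice: the Gram matrix is symmetric with
# nonnegative eigenvalues (up to rounding error).
eigvals = np.linalg.eigvalsh(K)
```

The matrix depends only on pairwise Euclidean distances, has ones on its diagonal, and approaches the identity matrix as `sigma` shrinks, which is why the kernel parameter has to be chosen with care.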

The Gaussian kernel function has only one undetermined parameter, but its performance depends directly on the choice of this kernel parameter [

In this study, an enhanced deep learning method is developed for the fault diagnosis of rotating machinery. A flowchart of the proposed method is shown in Figure

The flowchart of the proposed method.

The fault diagnosis process takes place in a stepwise manner.

The original vibration signal of the bearing is obtained and the selected signal is divided into samples.

The Gaussian kernel parameters

The input data are used to train the KAE, and the hidden-layer output of the KAE is used to train the next AE.

The output of the last layer is classified with softmax, and the error backpropagation algorithm is used to fine-tune the network parameters.

The kernel parameters

The intershaft bearing is one of the key components of an aircraft engine. In order to verify the proposed method for the bearing fault diagnosis, a test rig for aircraft engine intershaft bearing based on a double rotor is used to simulate the different fault types of the bearing. Subsequently, the vibration signal data are analyzed. The test rig is shown in Figure

The test rig of aeroengine intershaft bearing.

The intershaft bearing is fixed at the joint of the low-voltage axis and the high-voltage axis and is connected to two motors. The bearing’s outer ring is connected to the high-voltage end of the shaft and the inner ring is connected to the low-voltage end of the shaft. Four acceleration sensors are installed on the support bearing pedestals of the high-voltage and low-voltage axes to collect the vibration signal of the intershaft bearing. The hardware acquisition system uses an NI acquisition card to collect the data, and the sampling frequency is 25.6 kHz.

In this case, ten operating conditions are considered, including the inner race fault, outer race fault, and roller fault. The faults are introduced to the intershaft bearing under the running conditions of high-voltage-motor single rotation (HR), low-voltage-motor single rotation (LR), and high-voltage-motor/low-voltage-motor relative rotation (HLR), respectively. In addition, a normal condition of a two-motor relative rotation is tested. The rotation speed of the motors is 20 Hz. The artificial axial crack faults on the inner race, outer race, and roller are all 2.0 mm wide and 0.8 mm deep. The fault grooves in the bearing are produced by electrical discharge machining, as shown in Figure

Faults in the intershaft bearing.

Because the sensors are located at some distance from the bearing, the bearing signals attenuate during transmission and contain noise. In this study, the vertical acceleration sensor closest to the intershaft bearing, installed on the high-pressure axis bracket seat, was chosen for recording the vibration signal. The experiment was performed four times under the same conditions, and each condition consisted of 10 seconds of data. The data for each condition were divided into 200 samples, each a measured vibration signal of 1200 sampling points. Three-quarters of the samples were randomly selected as the training set and the remaining quarter was used as the test set; the details of the samples are shown in Table

Description of the intershaft operation conditions.

Abbreviation | Bearing operating condition | Motor speed (Hz) | Size of training/testing samples | Label |
---|---|---|---|---|
ILR | Inner fault (LR) | 0/20 | 150/50 | 1 |
IHR | Inner fault (HR) | 20/0 | 150/50 | 2 |
IHLR | Inner fault (HLR) | 20/20 | 150/50 | 3 |
OLR | Outer fault (LR) | 0/20 | 150/50 | 4 |
OHR | Outer fault (HR) | 20/0 | 150/50 | 5 |
OHLR | Outer fault (HLR) | 20/20 | 150/50 | 6 |
RLR | Roller fault (LR) | 0/20 | 150/50 | 7 |
RHR | Roller fault (HR) | 20/0 | 150/50 | 8 |
RHLR | Roller fault (HLR) | 20/20 | 150/50 | 9 |
NHLR | Normal (HLR) | 20/20 | 150/50 | 10 |
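
The sample preparation for one operating condition (200 samples of 1200 points, split 150/50) can be sketched as follows. The text does not say how the record is cut, so consecutive non-overlapping segments are an assumption, the record itself is synthetic noise standing in for measured data, and the function names `segment` and `split_train_test` are hypothetical.

```python
import numpy as np

def segment(signal, n_samples=200, sample_len=1200):
    """Cut a 1-D record into consecutive, non-overlapping samples (assumed scheme)."""
    needed = n_samples * sample_len
    if signal.size < needed:
        raise ValueError("record is too short for the requested segmentation")
    return signal[:needed].reshape(n_samples, sample_len)

def split_train_test(samples, train_fraction=0.75, seed=0):
    """Randomly assign three-quarters of the samples to training, the rest to testing."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(samples))
    n_train = int(round(train_fraction * len(samples)))
    return samples[idx[:n_train]], samples[idx[n_train:]]

fs = 25_600                                    # sampling frequency in Hz
record = np.random.default_rng(0).normal(size=10 * fs)   # synthetic 10 s record
samples = segment(record)                      # shape (200, 1200)
train, test = split_train_test(samples)        # 150 training and 50 testing samples
```

A 10-second record at 25.6 kHz contains 256,000 points, of which 200 × 1200 = 240,000 are used under this scheme.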

In this case study, three experiments are conducted. In order to verify the diagnosis result, the proposed method is compared with the standard SAE and standard DBN methods. A polynomial kernel function (PK) and a power exponent kernel function (PEK) are also used for comparison.

The raw vibration data were used as the input without performing any signal preprocessing or manual feature extraction.

The fast Fourier transform was applied to each signal to obtain 1200 Fourier coefficients, which were then used as the input to the different methods for fault classification.

The eighteen statistical features, same as [

Four trials are carried out for the diagnosis and the results of the different methods are shown in Figure

Detailed diagnosis results of the 4 trials in Experiment

Multiclass confusion matrix of the proposed method for the second trial.

In Experiment

Parameter description of the five methods in Experiment

Methods | Parameter description |
---|---|
The proposed method | The network structure parameters are 1200-800-100-20-10, learning rate is 0.3, momentum is 0.5, training iteration number is 10, fine-tuning iterations are 50, the Gaussian kernel parameter is 26.40. |
Standard SAE | The network structure parameters are 1200-800-100-20-10, learning rate is 0.3, momentum is 0.5, training iteration number is 10, fine-tuning iterations are 50. |
Standard DBN | The network structure parameters are 1200-800-100-20-10, learning rate is 0.3, momentum is 0.5, training iteration number is 10, fine-tuning iterations are 50. |
The proposed method with PK | The network structure parameters are 1200-800-100-20-10, learning rate is 0.3, momentum is 0.5, training iteration number is 10, fine-tuning iterations are 50, the PK parameters are |
The proposed method with PEK | The network structure parameters are 1200-800-100-20-10, learning rate is 0.3, momentum is 0.5, training iteration number is 10, fine-tuning iterations are 50, the PEK parameter is 30.00. |

The mean and standard deviation of the test results.

Method | Average accuracy, Ex 1 (%) | Average accuracy, Ex 2 (%) | Average accuracy, Ex 3 (%) | Standard deviation, Ex 1 (%) | Standard deviation, Ex 2 (%) | Standard deviation, Ex 3 (%) |
---|---|---|---|---|---|---|
The proposed method | | | | | | |
Standard SAE | 44.90 | 96.35 | 43.45 | 4.45 | 1.40 | 5.25 |
Standard DBN | 19.65 | 92.55 | 12.50 | 7.09 | 2.46 | 4.33 |
The proposed method with PK | 24.25 | 87.30 | 86.15 | 4.40 | 0.87 | 6.15 |
The proposed method with PEK | 65.55 | 98.80 | 97.15 | 6.66 | 0.93 | 2.23 |

Ex 1: Experiment

It can be seen from Figure

The classification accuracy of the PK is in the range of 18–30.2%, which is lower than that of the standard SAE method. The classification accuracy of the PEK is about 60% in three of the trials and 77% in the fourth. This is higher than the accuracy of the standard SAE but lower than that of the Gaussian kernel function method, and it fluctuates more between trials.

Figures

Detailed diagnosis results of the 4 trials in Experiment

Detailed diagnosis results of the 4 trials in Experiment

The average accuracy of the PK is 87.30% in Experiment

In Experiment

In Experiment

To further investigate the reasons behind the higher accuracy of the proposed method in Experiment

Curves of the training error of the proposed method, standard SAE method, and standard DBN method.

The accuracy rates of the proposed method, standard SAE, and DBN are all very high in Experiment

Curves of the accuracy and fine-tuning iteration of the proposed method, standard SAE, and standard DBN.

In order to verify the feature extraction ability of the proposed method, the principal components (PCs) of the last AE layer are extracted using the principal component analysis method (PCA).
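
The PCA projection used for this kind of feature visualization can be sketched with a singular value decomposition. The feature matrix below is random placeholder data standing in for the outputs of the last AE layer, and the feature dimension, the number of retained components, and the function name `pca_project` are illustrative assumptions.

```python
import numpy as np

def pca_project(features, n_components=2):
    """Project the rows of `features` onto their first principal components."""
    centered = features - features.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = centered @ vt[:n_components].T
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

rng = np.random.default_rng(0)
features = rng.normal(size=(50, 20))   # placeholder for last-layer features
scores, ratio = pca_project(features, n_components=3)
```

Plotting the columns of `scores` against each other, colored by fault label, gives a scatter map in which well-separated clusters indicate discriminative features; `ratio` reports the share of variance carried by each retained component.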

As shown in Figure

Feature visualization maps of standard SAE in Experiment

Feature visualization maps of the proposed method in Experiment

In Experiment

Feature visualization maps of standard SAE in Experiment

Feature visualization maps of the proposed method in Experiment

In Experiment

Feature visualization maps of standard SAE in Experiment

Feature visualization maps of the proposed method in Experiment

In conclusion, the combination of the Gaussian kernel function and the deep AE network expands the applications of the traditional SAE method, achieves a higher accuracy with improved extraction capability, a more obvious clustering center, and a better diagnosis effect, and requires fewer iterations.

In this paper, a novel method that combines a Gaussian kernel function with the deep AE network is proposed for bearing fault diagnosis. The proposed method can be divided into three major steps. First, a kernel function is employed to enhance the feature learning capability, and a new AE, termed the kernel AE (KAE), is designed. Subsequently, a deep neural network is constructed with one KAE and multiple AEs to extract inherent features layer by layer. Finally, softmax is adopted as the classifier to identify different bearing faults, and the error backpropagation algorithm is used to fine-tune the model parameters.

Compared to conventional deep learning methods, the proposed method has a better feature extraction ability with a better clustering effect, and it achieves a better diagnosis effect, higher accuracy, and a wider range of application. The main contributions of this paper are (i) to introduce the kernel function to the AE and form a new KAE network for better processing of the nonlinear component; (ii) to propose a new deep feature learning method constructed with one KAE and multiple AEs to automatically and effectively learn the fault features from the raw vibration signals; and (iii) to conduct three experiments on the aircraft engine intershaft bearing test rig, which verify that the proposed method has a better feature extraction ability and better fault diagnosis results than the standard SAE.

In addition, the investigation indicates that the proposed fault diagnosis method has great potential to be an effective tool for fault diagnosis of rolling bearings and the authors will continue to investigate this topic in the future.

The authors declare that they have no conflicts of interest.

This work is financially supported by the National Natural Science Foundation of China (no. 51375067).