Remaining useful life (RUL) prediction plays a significant role in developing condition-based maintenance and improving the reliability and safety of machines. This paper proposes an RUL prediction scheme that combines a deep-learning-based health indicator with a new relevance vector machine (RVM). First, both one-dimensional time-series information and two-dimensional time-frequency maps are input into a hybrid deep-learning network consisting of a convolutional neural network (CNN) and a long short-term memory network (LSTM) to construct a health indicator (HI). Then, the prediction results and confidence interval are calculated by a new RVM enhanced by a polynomial regression model. The proposed method is verified on the public PRONOSTIA bearing datasets. Experimental results demonstrate the effectiveness of the proposed method in improving the prediction accuracy and analyzing the prediction uncertainty.
Rotating machinery has played an essential role in industrial applications. However, most rotating machinery operates under severe working conditions which may cause different types of faults. Therefore, timely maintenance is vital for the reliability of the rotating machinery [
The data-driven techniques for RUL of machinery mainly consist of two steps: health indicator (HI) construction and remaining useful life prediction based on the constructed HI [
Recently, the deep-learning network has shown great potential in dealing with big data [
The relevance vector machine (RVM) is an artificial intelligence method that learns machinery degradation patterns from available data instead of building statistical models. It can handle the prognostics of sophisticated machinery whose degradation process is difficult to capture with a statistical model [
RUL prediction is subject to many sources of uncertainty, such as measurement error, load randomness, degradation feature extraction error, and modeling error. These need to be quantified and managed during the prediction process, and a confidence interval for the forecast should be given to facilitate maintenance planning. At present, research on RUL uncertainty focuses mainly on statistical data-driven methods, which are based on the theory of probability and statistics. With a statistical or stochastic model, the probability distribution of the remaining life follows naturally, which makes it easy to quantify the uncertainty of the RUL prediction. Liao et al. [
Although the deep-learning-based HI construction methods and RVM-based RUL prediction methods have been widely studied, the methods combining them are relatively lacking. To fill the research gap, a new RUL prediction scheme that combines a new deep-learning structure-based HI construction method and a new RVM-based RUL prediction method is proposed. The new RUL prediction scheme can not only learn the degradation process features from different types of data and get RUL prediction result automatically but also provide a confidence interval (CI).
The contributions of this paper can be summarized as follows: (1) A new deep-learning structure that can deal with one-dimensional time-series data and two-dimensional image data simultaneously is proposed to construct the HI. The constructed HI has better performance compared with other deep-learning-based HI construction methods. (2) The proposed systematic approach integrates the deep-learning-based HI and a new RVM-based prediction method into one framework to estimate the RUL automatically and provide a confidence interval. (3) A new RVM model is proposed by combining the traditional RVM with a polynomial regression model, improving the long-term prediction accuracy.
The paper is organized as follows: Section
The proposed RUL prediction scheme mainly consists of four functional layers, which are time-series information learning layer, time-frequency map information learning layer, fully connected layer, and RUL prediction layer. The hybrid deep-learning structure consists of two parallel paths followed by a fully connected multilayer neural network to use the information contained in the original data fully. The two parallel paths are time-series information learning layer constructed by long short-term memory (LSTM) neural network and time-frequency map information layer made up of convolutional neural network (CNN), respectively. The LSTM is used to extract temporal features, while the CNN is utilized to extract spatial features, which are then fused by fully connected layer to construct an HI. Finally, the HI is put into RUL prediction layer to get the remaining useful time and its confidence intervals. The theoretical background of each layer is introduced as follows.
The time-series information learning layer mainly consists of the long short-term memory network. The long short-term memory network is a state-of-the-art sequence data processing method. It develops from the recurrent neural network with a memory cell, which overcomes the problem of gradient vanishing or exploding. Figure
Structure of LSTM.
The memory cell of LSTM mainly consists of an input gate, output gate, and forget gate. Equations (
In the above equations,
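As a rough illustration of how the three gates interact, the sketch below implements one step of a standard LSTM cell for a scalar input and scalar state. It follows the common textbook formulation, which may differ in notation from the equations referenced above; the function name `lstm_step` and the dictionary keys are illustrative assumptions, not the authors' implementation.

```python
import math


def sigmoid(z):
    """Logistic function used by the three gates."""
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x, h_prev, c_prev, W, U, b):
    """One step of a standard LSTM cell with scalar input and scalar state.

    W, U and b are dicts keyed by 'i' (input gate), 'f' (forget gate),
    'o' (output gate) and 'g' (candidate cell state). Names are illustrative.
    """
    i = sigmoid(W["i"] * x + U["i"] * h_prev + b["i"])    # input gate
    f = sigmoid(W["f"] * x + U["f"] * h_prev + b["f"])    # forget gate
    o = sigmoid(W["o"] * x + U["o"] * h_prev + b["o"])    # output gate
    g = math.tanh(W["g"] * x + U["g"] * h_prev + b["g"])  # candidate state
    c = f * c_prev + i * g   # memory cell: keep part of the old state, add new
    h = o * math.tanh(c)     # hidden state exposed to the next layer
    return h, c


# With all weights set to zero every gate equals 0.5 and the candidate is 0,
# so the cell simply halves its previous memory.
W = {k: 0.0 for k in "ifog"}
U = {k: 0.0 for k in "ifog"}
b = {k: 0.0 for k in "ifog"}
h, c = lstm_step(x=1.0, h_prev=0.0, c_prev=2.0, W=W, U=U, b=b)
```

The additive update of `c` is what lets gradients flow across many time steps, which is the property the text attributes to the memory cell.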
The time-frequency map information learning layer is made up of a deep convolutional neural network, consisting of a convolutional layer and a pooling layer.
In the convolutional layer, local features are generated by convolutional kernels from the feature maps. Then, the convolutional results are input into the activation layer to construct the feature maps of the current layer, whose equation process is as follows [
In the above equation,
In the pooling layer, the feature is extracted from feature maps with the subsampling method to increase computational efficiency. The max-pooling method is given as
In the above equation,
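Max pooling can be sketched in a few lines of plain Python. The window size and stride below are illustrative defaults rather than the values used in the paper, and the function name `max_pool2d` is an assumption made for this example.

```python
def max_pool2d(feature_map, size=2, stride=2):
    """Subsample a 2-D feature map by keeping the maximum of each window."""
    rows = len(feature_map)
    cols = len(feature_map[0])
    pooled = []
    for r in range(0, rows - size + 1, stride):
        out_row = []
        for c in range(0, cols - size + 1, stride):
            window = [
                feature_map[r + dr][c + dc]
                for dr in range(size)
                for dc in range(size)
            ]
            out_row.append(max(window))
        pooled.append(out_row)
    return pooled


fmap = [
    [1, 3, 2, 0],
    [4, 6, 5, 1],
    [7, 2, 9, 8],
    [0, 1, 3, 4],
]
pooled = max_pool2d(fmap)  # 4 x 4 input reduced to 2 x 2
```

Each output value summarizes one window, so the map shrinks by the stride in each direction while the strongest local responses are kept.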
The fully connected layer is added after the time-series information learning layer and the time-frequency map information learning layer. The features learned from these two layers are flattened to construct the fully connected layer, which can be represented by the following equation:
In the above equation,
This layer can filter the unwanted measurement noise and manage the uncertainty in prognostics. The RUL prediction layer is constructed with a new relevance vector machine (RVM) that combines the traditional RVM method with polynomial models.
RVM is a kernel function algorithm based on Bayesian inference framework [
According to the Bayesian inference, the likelihood of the dataset
Maximum-likelihood estimation of
In the above equation,
The posterior over the unknowns could be computed with Bayes’ rule, given the defined noninformative prior distribution.
Equation (
The posterior distribution of the weights is
The posterior covariance and the mean of equation (
As can be seen from equations (
In the training process, most
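For reference, the standard RVM regression results can be summarized as below. This is the conventional textbook formulation with conventional symbols ($\boldsymbol{\Phi}$ for the design matrix, $\mathbf{A} = \operatorname{diag}(\alpha_1, \ldots, \alpha_N)$ for the hyperparameters, $\sigma^2$ for the noise variance, $\mathbf{t}$ for the targets); the notation is an assumption and may differ from that of the equations referenced in this section.

```latex
\begin{align}
  p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \sigma^{2})
    &= \mathcal{N}(\mathbf{w} \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}), \\
  \boldsymbol{\Sigma}
    &= \left( \sigma^{-2} \boldsymbol{\Phi}^{\mathsf{T}} \boldsymbol{\Phi} + \mathbf{A} \right)^{-1}, \\
  \boldsymbol{\mu}
    &= \sigma^{-2} \boldsymbol{\Sigma} \boldsymbol{\Phi}^{\mathsf{T}} \mathbf{t}, \\
  y_{*}
    &= \boldsymbol{\mu}^{\mathsf{T}} \boldsymbol{\phi}(x_{*}), \qquad
  \sigma_{*}^{2}
     = \sigma^{2} + \boldsymbol{\phi}(x_{*})^{\mathsf{T}} \boldsymbol{\Sigma}\, \boldsymbol{\phi}(x_{*}).
\end{align}
```

The predictive variance $\sigma_{*}^{2}$ is what allows a confidence interval to be attached to each predicted value, for example $y_{*} \pm 1.96\,\sigma_{*}$ for an approximate 95% interval under the Gaussian assumption.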
The polynomial models are suitable for long-term RUL prediction. Polynomial regression belongs to the least-square curve fitting family. Specifically speaking, it estimates the coefficients of a polynomial function to approximate the curve closely. The mathematical expression of polynomial regression is as follows:
In this paper, we take advantage of the RVM and polynomial model, the response variable
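The least-squares polynomial fit itself can be sketched without external libraries. The example below solves the normal equations for the coefficients and then extrapolates; it illustrates plain polynomial regression only and does not reproduce how the authors couple the polynomial model with the RVM. The names `polyfit` and `polyval` are local helpers defined here, not library calls.

```python
def polyfit(xs, ys, degree):
    """Least-squares coefficients [a0, a1, ..., ad] of a degree-d polynomial."""
    n = degree + 1
    # Normal equations (V^T V) a = V^T y with Vandermonde entries xs[i] ** j.
    A = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            rhs[r] -= factor * rhs[col]
    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(A[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (rhs[r] - tail) / A[r][r]
    return coeffs


def polyval(coeffs, x):
    """Evaluate a0 + a1 * x + ... + ad * x ** d."""
    return sum(a * x ** k for k, a in enumerate(coeffs))


# Synthetic, noise-free degradation-like curve: y = 1 + 0.5 x + 0.25 x^2.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 0.5 * x + 0.25 * x * x for x in xs]
coeffs = polyfit(xs, ys, degree=2)
forecast = polyval(coeffs, 6.0)  # extrapolation beyond the observed range
```

Because the fitted curve is a global function of time, it can be evaluated well beyond the last observation, which is the property that makes polynomial models attractive for long-term prediction.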
In the process of performance degradation of rolling bearings, vibration acceleration signals have nonstationary characteristics. Time-frequency analysis includes both time-domain information and frequency-domain information, which can effectively characterize the characteristics of nonstationary signals. Continuous wavelet transform is a time-frequency analysis method commonly used in state monitoring of rotating machinery. The calculation formula is as follows:
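In its standard textbook form, the continuous wavelet transform of a signal $x(t)$ can be written as shown below. The symbols are conventional ($a$ is the scale, $b$ the translation, $\psi$ the mother wavelet, and $^{*}$ the complex conjugate) and may differ from the authors' notation.

```latex
\begin{equation}
  W_{x}(a, b) = \frac{1}{\sqrt{|a|}} \int_{-\infty}^{\infty} x(t)\,
  \psi^{*}\!\left( \frac{t - b}{a} \right) \mathrm{d}t
\end{equation}
```

Evaluating $|W_{x}(a, b)|$ over a grid of scales and translations gives the two-dimensional time-frequency map that serves as the image input.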
A hybrid deep-learning structure that can learn temporal features and spatial features simultaneously is proposed to take advantage of mutual information from multidimensional features for degradation assessment and RUL prediction. What is more, the training set is constructed with historical whole lifetime monitoring data. Then the training set consisting of different HI curves is used to train the RVM. The sparsity of RVM regression is highly dependent on the choice of kernel functions. The common kernel functions are classified into local kernels and global kernels. In local kernels, only the data points that are close or in proximity of each other have an effect on the kernel values.
In contrast, a global kernel allows data points that are far away from each other to affect the kernel values as well. Furthermore, the common global kernels are polynomial function, spline function, and so forth [
The proposed RUL scheme.
First, the time-series information, including time-domain features and frequency-domain features, and the time-frequency map information of the whole lifetime are extracted from the original vibration signal. The different kinds of information are processed by different information learning layers. Then, a fully connected layer is used to combine the features learned from the time-series information learning layer and the time-frequency map information learning layer. The HI is constructed by a three-layer neural network using the combined information. Finally, the constructed HI curve is used to predict the RUL with the RUL prediction layer, which is constructed from the RVM and the polynomial model.
At the inspection time
In the above equation,
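One common way to turn a predicted HI trajectory into an RUL estimate is to find the first time at which the HI reaches a failure threshold and subtract the inspection time. The sketch below assumes that definition, an HI that increases toward the threshold, and linear interpolation between samples; all three are assumptions made for illustration, and the function names are hypothetical.

```python
def first_crossing_time(times, hi_values, threshold):
    """First time the HI reaches the threshold, or None if it never does."""
    if hi_values[0] >= threshold:
        return times[0]
    for k in range(1, len(times)):
        prev, curr = hi_values[k - 1], hi_values[k]
        if prev < threshold <= curr:
            # Linear interpolation inside the interval that brackets the threshold.
            frac = (threshold - prev) / (curr - prev)
            return times[k - 1] + frac * (times[k] - times[k - 1])
    return None


def remaining_useful_life(inspection_time, times, hi_values, threshold):
    """RUL = predicted failure time minus inspection time (None if no crossing)."""
    t_fail = first_crossing_time(times, hi_values, threshold)
    return None if t_fail is None else t_fail - inspection_time


# Hypothetical predicted HI trajectory starting at the inspection time.
times = [100.0, 110.0, 120.0, 130.0]
predicted_hi = [0.6, 0.7, 0.9, 1.1]
rul = remaining_useful_life(100.0, times, predicted_hi, threshold=1.0)
```

Applying the same function to the upper and lower bounds of the predicted HI would give an interval for the RUL rather than a single value.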
In this section, the run-to-failure data acquired from accelerated degradation tests of rolling element bearings are used to verify the effectiveness and superiority of the proposed RUL scheme in practical applications. The experimental data come from PRONOSTIA in the IEEE PHM 2012 Data Challenge [
The experiment platform.
In this experiment, 17 rolling element bearings working under three different conditions are tested. The experimental conditions are listed in Table
The experimental conditions.
Working condition | Rotate speed (rpm) | Load (N) |
---|---|---|
1 | 1800 | 4000 |
2 | 1650 | 4200 |
3 | 1500 | 5000 |
Dataset.
Dataset | Condition 1 | Condition 2 | Condition 3 |
---|---|---|---|
Training set | Bearing1_1, Bearing1_2 | Bearing2_1, Bearing2_2 | Bearing3_1, Bearing3_2 |
Testing set | Bearing1_3, Bearing1_4, Bearing1_5, Bearing1_6, Bearing1_7 | Bearing2_3, Bearing2_4, Bearing2_5, Bearing2_6, Bearing2_7 | Bearing3_3 |
The whole lifetime data of the first bearing is selected to be analyzed. The acceleration signal on the horizontal direction shown in Figure
The acceleration signal on the horizontal direction.
Features extracted from the vibration signal.
Type | Features |
---|---|
Time-domain features | A1: root mean square; A2: kurtosis; A3: peak-peak value; A4: shape factor; A5: peak factor; A6: impulse factor; A7: clearance factor; A8: mean absolute; A9: standard deviation; A10: crest factor |
Frequency-domain features | B1: mean value; B2: standard deviation; B3: skewness; B4: kurtosis; B5–B12: entropy of different frequency bands |
Time-frequency features | C1: time-frequency map |
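Several of the time-domain features in the table can be computed directly from a window of the vibration signal. The sketch below uses the usual textbook definitions with population (divide-by-n) moments; the exact conventions used by the authors are not stated, so these choices, along with the function name and dictionary keys, are assumptions.

```python
import math


def time_domain_features(signal):
    """Compute a subset of common time-domain features for one signal window."""
    n = len(signal)
    mean = sum(signal) / n
    abs_mean = sum(abs(v) for v in signal) / n
    rms = math.sqrt(sum(v * v for v in signal) / n)
    peak = max(abs(v) for v in signal)
    var = sum((v - mean) ** 2 for v in signal) / n    # population variance
    m4 = sum((v - mean) ** 4 for v in signal) / n     # fourth central moment
    return {
        "rms": rms,
        "kurtosis": m4 / var ** 2,
        "peak_to_peak": max(signal) - min(signal),
        "shape_factor": rms / abs_mean,
        "crest_factor": peak / rms,
        "impulse_factor": peak / abs_mean,
        "mean_absolute": abs_mean,
        "std": math.sqrt(var),
    }


# A square wave is a convenient check: every amplitude-ratio feature equals 1.
features = time_domain_features([1.0, -1.0, 1.0, -1.0])
```

In practice each feature would be computed per acquisition window, producing one time series per feature over the bearing lifetime.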
The training data can be presented by
In the proposed deep-learning network, the convolutional structure follows the classical AlexNet network, and the time-series information learning layer is constructed by stacking three LSTM layers. The literature shows that this network structure can effectively extract the characteristics of time-series data. The CNN + LSTM connected layer joins the information extracted by the time-series information learning layer and the time-frequency map information learning layer, so that the degradation information is captured comprehensively. Finally, a fully connected layer is constructed to output the final result. Detailed network parameters can be seen in Table
Parameters of hybrid deep learning network.
Network layer | Input picture size | Number of channels | Convolution kernel size | Step size | Number of nodes |
---|---|---|---|---|---|
Input layer | [100 × 100] | 3 | — | — | — |
Convolutional layer 1 | — | 96 | [11 × 11] | 4 | — |
Pooling layer 1 | — | 96 | [3 × 3] | 2 | — |
Convolutional layer 2 | — | 256 | [5 × 5] | 1 | — |
Pooling layer 2 | — | 256 | [3 × 3] | 2 | — |
Convolutional layer 3 | — | 384 | [3 × 3] | 1 | — |
Convolutional layer 4 | — | 384 | [3 × 3] | 1 | — |
Convolutional layer 5 | — | 256 | [3 × 3] | 1 | — |
Pooling layer 5 | — | 256 | [3 × 3] | 2 | — |
CNN flatten layer | — | 7424 (6400 + 1024) | — | — | — |
LSTM layer 1 | — | — | — | — | 80 |
LSTM layer 2 | — | — | — | — | 60 |
LSTM layer 3 | — | — | — | — | 30 |
LSTM flatten layer | — | — | — | — | 30 |
CNN + LSTM connected layer | — | — | — | — | 7454 (7424 + 30) |
Fully connected layer 1 | — | 4096 | — | — | — |
Fully connected layer 2 | — | 1000 | — | — | — |
Output layer | — | 1 | — | — | — |
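The layer sizes in the table can be checked with the usual output-size formula for convolution and pooling. The table does not list padding, so the sketch below assumes no padding in the first convolution and pooling layers and "same" padding in convolutional layers 2 to 5, as in AlexNet; these padding values are assumptions, and the layer names are shorthand for the table rows.

```python
def out_size(size, kernel, stride, padding=0):
    """Spatial output size of a convolution or pooling layer (floor division)."""
    return (size + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding, channels); padding values are ASSUMED.
layers = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, 96),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, 256),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, 256),
]

size = 100  # input picture is 100 x 100
flat = {}   # flattened length (height * width * channels) after each layer
for name, kernel, stride, padding, channels in layers:
    size = out_size(size, kernel, stride, padding)
    flat[name] = size * size * channels

cnn_flatten = flat["pool2"] + flat["pool5"]
connected = cnn_flatten + 30  # plus the 30 outputs of the LSTM flatten layer
```

Under these assumed paddings, the outputs of pooling layers 2 and 5 flatten to 6400 and 1024 values, which is consistent with the 7424 (6400 + 1024) entry in the table. That the flatten layer concatenates these two stages is an inference from the arithmetic, not something the text states.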
The HI constructed in Section
RUL prediction result of bearing 1_5 at inspection time
Figure
The HI curves constructed by different methods. (a) Root-mean-square-based HI. (b) Kurtosis-based HI. (c) Crest-factor-based HI. (d) Peak-peak-based HI. (e) CNN-based HI. (f) LSTM-based HI. (g) Hybrid-structure-based HI.
Figures
Figure
The RUL prediction results of bearing1_3. (a) LSTM-based HI for RUL. (b) CNN-based HI for RUL. (c) Hybrid-structure-based HI for RUL.
The RUL prediction results of all the test bearings are shown in Figure
The RUL prediction results of all the test bearings. (a) Bearing 1_3 performance degradation trend prediction. (b) Bearing 1_4 performance degradation trend prediction. (c) Bearing 1_5 performance degradation trend prediction. (d) Bearing 1_6 performance degradation trend prediction. (e) Bearing 1_7 performance degradation trend prediction. (f) Bearing 2_3 performance degradation trend prediction. (g) Bearing 2_4 performance degradation trend prediction. (h) Bearing 2_5 performance degradation trend prediction. (i) Bearing 2_6 performance degradation trend prediction. (j) Bearing 2_7 performance degradation trend prediction. (k) Bearing 3_3 performance degradation trend prediction.
To illustrate the superiority of the proposed scheme, it is compared with six other studies that use the same dataset; the results are listed in Table
The RUL prediction method proposed by Sutrisno et al. [
Guo et al. [
In Table
The RUL prediction results of different methods.
Bearing | Current time (s) | Actual RUL (s) | Predicted RUL (s) | Er (%), Sutrisno | Er (%), Hong | Er (%), Lei | Er (%), Guo | Er (%), Yoo | Er (%), Wang | Er (%), proposed method |
---|---|---|---|---|---|---|---|---|---|---|
1_3 | 18010 | 5730 | 5600 | 37 | −1.04 | −0.35 | 43.28 | 1.05 | −1.05 | 2.27 |
1_4 | 11380 | 339 | 320 | 80 | −20.94 | 5.6 | 67.55 | 20.35 | −17.99 | 5.60 |
1_5 | 23010 | 1610 | 1410 | 9 | −278.26 | 100 | −22.98 | 11.18 | 21.74 | 12.42 |
1_6 | 23010 | 1460 | 1300 | −5 | 19.18 | 28.08 | 21.23 | 34.93 | 6.16 | 10.96 |
1_7 | 15010 | 7570 | 9270 | −2 | −7.13 | −19.55 | 17.83 | 29.19 | 7.79 | −22.46 |
2_3 | 12010 | 7530 | 7460 | 64 | 10.49 | −20.19 | 37.84 | 57.24 | 43.03 | 0.99 |
2_4 | 6110 | 1390 | 1310 | 10 | 51.8 | 8.63 | −19.42 | −1.44 | 1.44 | 5.76 |
2_5 | 20010 | 3090 | 2290 | −440 | 28.8 | 23.3 | 54.37 | −0.65 | 18.77 | 25.89 |
2_6 | 5710 | 1290 | 1430 | 49 | −20.93 | 58.91 | −13.95 | −42.64 | 2.33 | −10.85 |
2_7 | 1710 | 580 | 570 | −317 | 44.83 | 5.17 | −55.17 | 8.62 | −3.45 | 1.72 |
3_3 | 3510 | 820 | 850 | 90 | −3.66 | 40.24 | 3.66 | −1.22 | 13.41 | −3.66 |
Mean | | | | 100.27 | 44.28 | 28.18 | 32.48 | 18.96 | 12.47 | 9.32 |
SD | | | | 173.28 | 90.29 | 35.41 | 37.57 | 25.59 | 15.90 | 12.57 |
Score | | | | 0.31 | 0.36 | 0.43 | 0.26 | 0.57 | 0.62 | 0.64 |
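The error and score rows of the table can be cross-checked with a short script. The sketch assumes that Er is the percent error relative to the actual RUL and that the score is the mean of the asymmetric accuracy function used in the IEEE PHM 2012 Data Challenge, which penalizes overestimated RUL (Er ≤ 0) more heavily than underestimated RUL. Both definitions are assumptions here because the corresponding equations are not reproduced in this section; the function names are illustrative.

```python
import math


def percent_error(actual_rul, predicted_rul):
    """Percent error relative to the actual RUL; negative means overestimation."""
    return 100.0 * (actual_rul - predicted_rul) / actual_rul


def accuracy(er):
    """Asymmetric accuracy in (0, 1] as used in the PHM 2012 challenge scoring."""
    if er <= 0:
        return math.exp(-math.log(0.5) * er / 5.0)   # steep penalty for late estimates
    return math.exp(math.log(0.5) * er / 20.0)       # milder penalty for early estimates


def score(errors):
    """Mean accuracy over all test bearings."""
    return sum(accuracy(e) for e in errors) / len(errors)


# Er (%) values of the proposed method, copied from the table above.
proposed_er = [2.27, 5.60, 12.42, 10.96, -22.46, 0.99, 5.76, 25.89, -10.85, 1.72, -3.66]

mean_abs_error = sum(abs(e) for e in proposed_er) / len(proposed_er)
proposed_score = score(proposed_er)
er_bearing_1_3 = percent_error(5730, 5600)
```

With the tabulated errors, `mean_abs_error` and `proposed_score` come out at about 9.32 and 0.64, in line with the Mean and Score rows for the proposed method, which suggests the Mean row is computed from absolute errors.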
This paper proposes a new RUL prediction scheme combining deep learning and a new RVM method. Firstly, different types of degradation data are input into the deep-learning network with a hybrid structure to construct the health indicator. Then the new RVM model consisting of RVM and a polynomial model is used to predict the RUL and calculate confidence interval. Finally, the proposed method is compared with different RUL prediction methods to verify the effectiveness.
The proposed deep-learning network with a hybrid structure could learn from different types of degradation data. The constructed health indicator curve has better monotonicity and trendability than the single-structure deep-learning network, such as CNN and LSTM. The RVM is widely used in RUL prediction. On the one hand, the RVM could reduce the redundancy of the degradation curve to enhance the prediction accuracy. On the other hand, the prediction results of RVM are profoundly affected by kernel function and the long-term prediction ability is reduced. The proposed method retains the advantage of RVM and overcomes the disadvantage by combining the polynomial model with RVM. The final RUL prediction results show that the proposed method can enhance prediction accuracy and narrow down the confidence interval.
Although the proposed RUL scheme improves the prediction results, it is time-consuming. In future work, it is expected to raise the computational efficiency by researching a better deep-learning structure.
The experimental data are obtained from PRONOSTIA in the IEEE PHM 2012 Data Challenge by Nectoux, P.; Gouriveau, R.; Medjaher, K.; Ramasso, E.; Chebel-Morello, B.; Zerhouni, N. “PRONOSTIA: An Experimental Platform for Bearings Accelerated Degradation Tests,” presented at the IEEE Int. Conf. Prognostics Health Manage., Denver, CO, USA, 2012, 1–8.
The authors declare that there are no conflicts of interest.
This work has been supported in part by the National Natural Science Foundation of China (61640308).