Intelligent Diagnosis of Rolling Bearing Fault Based on Improved Convolutional Neural Network and LightGBM

Aiming at the problems of weak generalization ability and long training time in most fault diagnosis models based on deep learning and on classical algorithms such as support vector machines and random forests, an intelligent diagnosis method for rolling bearing faults based on an improved convolutional neural network and a light gradient boosting machine is proposed. First, convolution layers are used to extract features from the original signal. Second, the generalization ability of the model is improved by replacing the fully connected layer with a global average pooling layer. Then, the extracted features are classified by a light gradient boosting machine. Finally, a verification experiment is carried out. The results show that the average training and diagnosis times of the model are only 39.73 s and 0.09 s, respectively, and that the average classification accuracy is 99.72% on the same-load test sets and 95.62% on the variable-load test sets, which indicates that the diagnostic efficiency and classification accuracy of the proposed model are better than those of the comparison models.


Introduction
The rolling bearing is one of the most critical and widely used components in modern machinery. Under harsh working conditions such as high temperature, alternating load, and long-term fatigue, cracks, pitting corrosion, and other local damage or defects readily appear on the inner and outer ring raceways and on the rolling elements. Once a rolling bearing fails, it affects the safe operation of the mechanical equipment and may even damage the equipment and cause casualties. Identifying rolling bearing faults accurately, promptly, and intelligently, and carrying out maintenance as soon as possible, is therefore of great significance for the safe operation of mechanical equipment and for avoiding catastrophic accidents.
In recent years, great progress has been made in algorithm research in various fields. To overcome the slow convergence speed, poor global search ability, and difficulty of designing the rotation angle in the quantum-inspired evolutionary algorithm (QEA), Xing et al. [1] proposed an improved quantum-inspired cooperative coevolutionary algorithm, named MSQCCEA, which combines the strategies of cooperative coevolution, random rotation direction, and Hamming adaptive rotation angle; the results demonstrate that MSQCCEA has faster convergence speed and higher convergence accuracy. To overcome the low solution efficiency, insufficient diversity in the later search stage, slow convergence speed, and high probability of search stagnation of the differential evolution (DE) algorithm, Deng et al. [2] studied the quantum computing characteristics of the quantum evolutionary algorithm (QEA) and, combining them with the divide-and-conquer idea of the cooperative evolutionary algorithm (CCEA), proposed an improved differential evolution algorithm (HMCFQDE); the results showed that HMCFQDE has higher convergence accuracy, stronger stability, and a strong ability to optimize high-dimensional complex functions. Deep learning theory has also made great progress in the field of fault diagnosis, and as one of its important models, the convolutional neural network (CNN) has shown its value and great potential in bearing fault diagnosis. For example, Verstraete et al. [3] proposed a deep learning-enabled featureless methodology to automatically learn the features of the data, and the proposed CNN architecture achieved better results.
He [4] presented a deep learning-based approach for bearing fault diagnosis and built an optimized deep learning structure LAMSTAR neural network to diagnose the bearing faults, and the approach shows the accurate classification performance on various bearing faults under different working conditions. Hoang and Kang [5] provided a systematic review of deep learning-based bearing fault diagnosis, introduced the three popular deep learning algorithms for bearing fault diagnosis including autoencoder, restricted Boltzmann machine, and convolutional neural network, and reviewed their applications in the area of bearing fault diagnosis. Zhenghong et al. [6] proposed an adaptive deep transfer learning method for bearing fault diagnosis and verified the method with two kinds of datasets, and the results demonstrate the effectiveness and robustness of the proposed method. Sun et al. [7] proposed a novel intelligent diagnosis method for fault identification of rotating machines, which can not only reduce the amount of measured data that contained all the information of faults but also realize the automatic feature extraction in the transform domain, and the proposed method can reduce the need of human labor and expertise and provide a new strategy to more easily handle the massive data. Ding and He [8] proposed a novel energy-fluctuated multiscale feature mining approach based on wavelet packet energy (WPE) image and deep convolutional network (ConvNet) for spindle bearing fault diagnosis, which is quite suitable for spindle bearing fault diagnosis with multiclass classification regardless of the load fluctuation. Chen et al. [9] proposed a rolling bearing fault diagnosis method based on discrete wavelet transform and the convolution neural network so as to achieve the adaptive feature extraction and intelligent diagnosis of rolling bearing faults, and the experimental results showed that the proposed method has the better generalization ability and robustness. 
Haidong et al. [10] proposed a novel method for intelligent fault diagnosis of rolling bearings based on a deep wavelet autoencoder and an extreme learning machine; the method was applied to experimental bearing vibration signals, and the results showed that it is superior to traditional methods and standard deep learning methods. Ding and Jia [11] proposed a one-dimensional multiscale convolutional autoencoder fault diagnosis model for rolling bearings based on the standard convolutional autoencoder, and the test results show that the proposed model has a better recognition effect on rolling bearing fault data. These studies have achieved good diagnosis results. However, although the convolutional neural network has achieved good results in the field of fault diagnosis, using a softmax layer to classify the features extracted by the convolution layers does not separate the feature extraction and classification functions of the model well, which may lead to poor classification and generalization ability.
Machine learning algorithms play an important role in the field of fault diagnosis. Single machine learning algorithms, such as the support vector machine (SVM) [12,13] and K-nearest neighbor (KNN) [14], and ensemble learning algorithms, such as random forest [15] and extreme gradient boosting [16,17], have all made great achievements in mechanical fault diagnosis. However, it is difficult for these classification algorithms to meet efficiency and accuracy requirements in big data and high-dimensional environments. The light gradient boosting machine (LightGBM) [18,19] is a gradient boosting algorithm based on decision trees. It optimizes the classification accuracy and computational efficiency of the boosting algorithm and is more suitable for classification in a large-sample environment. However, if the original signals are input directly into LightGBM, a large amount of unprocessed redundant signal remains, which consumes too much memory during model training and easily causes the LightGBM classifier to overfit.
To solve the above problems, this study proposes a bearing fault diagnosis model that combines the LightGBM algorithm with an improved convolutional neural network in which the fully connected layer is replaced by a global average pooling (GAP) layer (hereinafter referred to as GCNN). Two kinds of data sets, under same-load and variable-load conditions, are constructed. The improvement that the global average pooling layer brings to the generalization ability of the model, and the effectiveness of the proposed model, are demonstrated through comparative analysis with other models.

Improved Convolution Neural Network and LightGBM
The convolution neural network is one kind of feed-forward neural network, which adopts an unsupervised or semisupervised learning mode. It contains convolution calculation and a deep structure and can classify the input information according to its hierarchical structure [20]. Figure 1 shows the structure of the convolution neural network [21,22], which includes the input layer, convolution layer, pooling layer, fully connected layer, and output layer. The convolution, pooling, and fully connected layers constitute the hidden layers.

Convolution Layer and Pooling Layer.
The convolution layer is the most basic structure of the convolution neural network. It is the feature extraction layer: its main function is to extract features from the input data by means of local connections, weight sharing, and multiple convolution kernels. Compared with a general deep learning network structure, the most significant characteristics of the convolution layer are local perception and parameter sharing, which greatly reduce the number of model parameters and ensure the sparsity of the network. The convolution layer formula is

$$y^{l(i,j)} = \sum_{j'=0}^{m-1} k_i^{l(j')} \, x^{l(j+j')}$$

where $y^{l(i,j)}$ is the convolution output, $k_i^{l(j')}$ is the $j'$-th weight of convolution kernel $i$ in layer $l$, $x^{l(j+j')}$ is the $j$-th convolved local region in layer $l$, and $m$ is the width of the convolution kernel.

The pooling layer is mainly used to select and filter the feature maps extracted by the convolution layer. It replaces the result at a single point of the feature map with a statistic of its adjacent region, which reduces the number of nodes in the final fully connected layer, reduces overfitting, and improves the fault tolerance of the model. The common pooling methods are maximum pooling and average pooling. Compared with average pooling, maximum pooling selects the most significant feature in the region; therefore, in this study, maximum pooling is used, and the maximum value in each region is taken as the pooled value of that region. The expression of the maximum pooling method is

$$p^{l(i,j)} = \max_{jn \le t < (j+1)n} \left\{ y^{l(i,t)} \right\}$$

where $p^{l(i,j)}$ is the pooled output, $t$ indexes the points inside the $j$-th pooling region, and $n$ is the width of the pooling area.

The comparison between the fully connected layer and the global average pooling layer is shown in Figure 2. It can be seen from Figure 2 that all the features of every feature map must be flattened before the fully connected layer can be used, whereas the global average pooling layer only needs to calculate the average value of each feature map. This simple structural comparison shows that replacing the fully connected layer with the global average pooling layer greatly reduces the number of parameters of the classical convolutional neural network.
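The convolution, maximum pooling, and global average pooling operations described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: it uses a single kernel, zero-based indices, no bias or activation, and non-overlapping pooling windows. The channel count, feature-map length, and class count in the parameter comparison are hypothetical and are not taken from the paper's tables.

```python
import numpy as np

def conv1d_valid(x, k):
    """y[j] = sum over j' of k[j'] * x[j + j'] for one kernel of width m."""
    m = len(k)
    return np.array([np.dot(k, x[j:j + m]) for j in range(len(x) - m + 1)])

def max_pool1d(y, n):
    """Non-overlapping max pooling of width n (a trailing remainder is dropped)."""
    usable = (len(y) // n) * n
    return y[:usable].reshape(-1, n).max(axis=1)

def dense_params(inputs, classes):
    """Weights plus biases of one fully connected layer."""
    return inputs * classes + classes

x = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 1.0])
k = np.array([1.0, 0.0, -1.0])
y = conv1d_valid(x, k)   # four outputs from a width-3 kernel on six points
p = max_pool1d(y, 2)     # one maximum per pair of neighbouring outputs

# Global average pooling: one mean per feature map, no trainable parameters.
maps = np.arange(6, dtype=float).reshape(2, 3)   # 2 feature maps of length 3
gap = maps.mean(axis=1)

# Hypothetical sizes, only to show the effect on the classifier head.
channels, length, classes = 32, 64, 10
flat_dense = dense_params(channels * length, classes)  # flatten -> dense
gap_dense = dense_params(channels, classes)            # GAP -> dense
print(y, p, gap, flat_dense, gap_dense)
```

With these illustrative sizes, the head that follows flattening needs 20,490 parameters, while the head that follows global average pooling needs 330, because each feature map contributes a single value instead of its full length.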

Light Gradient Boosting Machine.
Gradient boosting decision tree (GBDT) is a long-standing model in machine learning. Its main idea is to iteratively train weak classifiers (decision trees) to obtain the optimal model, which gives it a good training effect and makes it not easy to overfit. LightGBM is a framework that implements the GBDT algorithm. It supports efficient parallel training and has the advantages of faster training speed, lower memory consumption, better accuracy, support for distributed training, and the ability to process massive data quickly. LightGBM uses the negative gradient of the loss function as the residual approximation of the current decision tree to fit a new decision tree. It uses the histogram algorithm, which takes up less memory and reduces the complexity of data splitting. It adopts the leaf-wise growth strategy with a depth limit: each time, it finds the leaf with the largest splitting gain among all current leaves (generally also the one with the most data) and splits it. For the same number of splits, the leaf-wise strategy therefore reduces more error and achieves better accuracy. LightGBM solves the problem that traditional boosting algorithms are very time-consuming in a large-sample environment, and its key is the combination of two methods, gradient-based one-side sampling (GOSS) and exclusive feature bundling (EFB). GOSS balances reducing the amount of data against preserving accuracy. It reduces the amount of calculation by distinguishing instances with different gradients: it retains the instances with larger gradients and randomly samples the instances with smaller gradients, which improves efficiency. EFB reduces the feature dimension by bundling features so as to improve computing efficiency. Usually, the bundled features are mutually exclusive (they rarely take nonzero values at the same time), so bundling them does not lose information.
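The GOSS idea described above (keep the large-gradient instances, randomly sample the small-gradient ones) can be sketched as follows. This is a minimal NumPy illustration of the sampling step as published for LightGBM, not the library's internal implementation. The function name `goss_sample`, the fractions `a` and `b`, and the synthetic gradients are assumptions chosen for the example. The sampled small-gradient instances are up-weighted by (1 - a) / b so that their total contribution to the gain estimate stays roughly unchanged.

```python
import numpy as np

def goss_sample(gradients, a=0.2, b=0.1, seed=0):
    """Return selected instance indices and their weights.

    a: fraction of instances kept because they have the largest |gradient|
    b: fraction of instances drawn at random from the remaining ones
    """
    rng = np.random.default_rng(seed)
    n = len(gradients)
    top_n, rand_n = int(a * n), int(b * n)

    order = np.argsort(-np.abs(gradients))      # descending by |gradient|
    top_idx = order[:top_n]                     # always kept
    rest_idx = order[top_n:]                    # candidates for random sampling
    sampled_idx = rng.choice(rest_idx, size=rand_n, replace=False)

    idx = np.concatenate([top_idx, sampled_idx])
    weights = np.ones(len(idx))
    weights[top_n:] = (1.0 - a) / b             # compensate for under-sampling
    return idx, weights

# Synthetic gradients, for illustration only.
g = np.random.default_rng(1).normal(size=1000)
idx, w = goss_sample(g)
print(len(idx), w[0], w[-1])
```

Here only 300 of the 1,000 instances are used to evaluate candidate splits, which is where the reduction in computation comes from.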

Structure Diagram of the Model.
The structure diagram of the improved convolution neural network and LightGBM (GCNN-LightGBM) model is shown in Figure 3. It is mainly composed of convolution layers, pooling layers, a global average pooling layer, and a LightGBM classifier. Before the original one-dimensional vibration signal is input into the convolution layer, random deactivation (dropout) with a probability of 0.2 is applied to it, so as to improve the generalization ability of the trained model and the stability of fault diagnosis under variable load conditions. There are two convolution layers and two pooling layers. In the first layer, a large convolution kernel is used to obtain more effective information in the low- and medium-frequency bands of the original signal. The feature maps obtained by the two-layer convolution and pooling operation are input into the global average pooling layer, and secondary feature extraction and data dimension reduction are realized by averaging each feature map. Finally, the extracted low-dimensional features are input into the LightGBM classifier for classification.
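To make the data flow of the feature extractor concrete, the following NumPy sketch traces one 1,024-point sample through input dropout, two convolution and pooling stages, and global average pooling. It is a hedged illustration, not the authors' network: the kernel widths, strides, channel counts, ReLU activation, and random weights are all hypothetical, since the paper's parameter table is not reproduced here. Only the dropout probability of 0.2, the two-stage layout with a large first kernel, and the final averaging come from the text.

```python
import numpy as np

def input_dropout(x, rate, rng):
    """Zero each point with probability `rate` and rescale the survivors."""
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

def conv_relu(x, kernels, stride):
    """x: (C_in, L); kernels: (C_out, C_in, m). Valid convolution, then ReLU."""
    c_out, _, m = kernels.shape
    steps = (x.shape[1] - m) // stride + 1
    out = np.empty((c_out, steps))
    for j in range(steps):
        window = x[:, j * stride:j * stride + m]
        out[:, j] = np.tensordot(kernels, window, axes=([1, 2], [0, 1]))
    return np.maximum(out, 0.0)

def max_pool(x, n):
    """Non-overlapping max pooling of width n along the time axis."""
    usable = (x.shape[1] // n) * n
    return x[:, :usable].reshape(x.shape[0], -1, n).max(axis=2)

def extract_features(signal, k1, k2, rng, training=False):
    x = signal[np.newaxis, :]                     # (1, 1024)
    if training:
        x = input_dropout(x, 0.2, rng)            # dropout on the raw signal
    x = max_pool(conv_relu(x, k1, stride=8), 2)   # wide first kernel (assumed 64)
    x = max_pool(conv_relu(x, k2, stride=1), 2)   # small second kernel (assumed 3)
    return x.mean(axis=1)                         # GAP: one value per feature map

rng = np.random.default_rng(0)
signal = rng.normal(size=1024)                    # stand-in for a vibration sample
k1 = rng.normal(size=(16, 1, 64)) * 0.1           # hypothetical layer sizes
k2 = rng.normal(size=(32, 16, 3)) * 0.1
features = extract_features(signal, k1, k2, rng, training=True)
print(features.shape)
```

The resulting vector has one entry per feature map of the last convolution stage, and it is this low-dimensional vector, rather than the raw signal, that would be passed to the LightGBM classifier.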

Parameter Setting of the GCNN-LightGBM Model.
The GCNN-LightGBM model uses the improved convolution neural network to adaptively extract features from the bearing vibration signals. The selection of the hyperparameters of the convolution neural network is very important for the feature extraction effect. Therefore, the parameters of the convolution neural network are first trained with a softmax classifier (as shown in Table 1). After the convolution part is trained, LightGBM is used to replace the softmax layer. The LightGBM parameters are selected by a Bayesian parameter tuning algorithm. The meanings and values of some important parameters are shown in Table 2.

Test and Performance Analysis
[...] with the damage diameters of 0.178, 0.356, and 0.534 mm are selected, and the faulty bearing is at the drive end. A total of 10 kinds of bearing operation state data are selected. The sampling frequency is 12 kHz, and 1,024 data points are collected as one sample. The z-score normalization method is used to preprocess the data before feature extraction so as to accelerate the convergence of the convolution neural network. The expression of the z-score method is

$$x' = \frac{x - u}{\sigma}$$

where $x$ is the original sample value, $u$ is the mean value of all sample data, $\sigma$ is the standard deviation of all samples, and $x'$ is the normalized value. The selected data are divided into three data sets corresponding to the loads of 1 HP, 2 HP, and 3 HP. Each data set contains 10,000 samples: there are 10 kinds of bearing operation state, and each state includes 1,000 samples. About 70% of the samples are selected as the training set, 20% as the validation set, and 10% are randomly selected as the test set. The specific data sets are shown in Table 3.
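The preprocessing described above (z-score normalization, 1,024-point samples, and a 70/20/10 split) can be sketched as follows. This is an illustration under assumptions: the synthetic signal stands in for one bearing state record, samples are cut at random start positions (the paper does not state how the 1,000 samples per state are segmented), and the function names are hypothetical.

```python
import numpy as np

def z_score(x):
    """x' = (x - u) / sigma, with u and sigma computed over all data points."""
    return (x - x.mean()) / x.std()

def segment(signal, length=1024, count=1000, seed=0):
    """Cut `count` windows of `length` points at random start positions (assumed)."""
    rng = np.random.default_rng(seed)
    starts = rng.integers(0, len(signal) - length + 1, size=count)
    return np.stack([signal[s:s + length] for s in starts])

def split_70_20_10(samples, seed=0):
    """Shuffle, then split into training, validation, and test sets."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(samples))
    n_train = len(samples) * 7 // 10
    n_val = len(samples) * 2 // 10
    train = samples[order[:n_train]]
    val = samples[order[n_train:n_train + n_val]]
    test = samples[order[n_train + n_val:]]
    return train, val, test

# Synthetic record for one bearing state, for illustration only.
raw = np.random.default_rng(1).normal(loc=5.0, scale=3.0, size=20000)
normalized = z_score(raw)
samples = segment(normalized)
train, val, test = split_70_20_10(samples)
print(samples.shape, len(train), len(val), len(test))
```

Repeating this for each of the 10 bearing states would give the 10,000 samples per load described in the text.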
Generally, the distributions of the data sets differ because the amplitude, fluctuation period, and phase of the vibration signals are inconsistent under different working conditions. Therefore, the classifier must have strong generalization ability and robustness. However, it is not realistic to collect and label enough training samples to make the classifier robust to all working conditions. In this study, the fault diagnosis model is trained on a single load, and the test sets of the other loads are used for fault diagnosis [23]. For example, the model trained under the 1 HP load is required to have high classification accuracy not only on the 1 HP test set but also on the 2 HP and 3 HP test sets. The variable load adaptive data set constructed to achieve this goal is shown in Table 4.
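The evaluation protocol above (train on one load, test on the others) amounts to enumerating ordered pairs of loads. The short sketch below lists the same-load and variable-load cases and shows how an average accuracy over a set of cases could be computed. The names `same_load_pairs`, `variable_load_pairs`, and `average_accuracy` are hypothetical, and no accuracy values from the paper are used.

```python
from itertools import permutations

loads = ["1 HP", "2 HP", "3 HP"]

# (training load, test load) pairs
same_load_pairs = [(load, load) for load in loads]
variable_load_pairs = list(permutations(loads, 2))

def average_accuracy(accuracy, pairs):
    """accuracy maps (train_load, test_load) to the accuracy of that evaluation."""
    return sum(accuracy[pair] for pair in pairs) / len(pairs)

for train_load, test_load in variable_load_pairs:
    print(f"train on {train_load} -> test on {test_load}")
```

With three loads there are three same-load cases and six variable-load cases, so each reported variable-load average is taken over six train/test combinations.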

Model Validation.
In the experiment, the GCNN-LightGBM model is implemented with the deep learning framework Keras in Python, the classification module directly calls the LightGBM software package, and the established network is trained and tested on the different data sets. Because the initialization of the input data and neural network weights is random, each data set was trained 10 times and the average value was calculated so as to ensure the reliability of the test results.
To verify that the improved convolutional neural network has stronger generalization ability, a contrast model combining the classical convolutional neural network and LightGBM (CNN-LightGBM) is constructed; its network structure and training parameters are consistent with those of the GCNN-LightGBM model except for the fully connected layer. At the same time, to verify that LightGBM has a stronger classification ability than the softmax layer, a contrast model combining the improved convolutional neural network and softmax (GCNN-softmax) is also constructed; its feature extraction part is consistent with that of the model in this study, and only the classifier is replaced by softmax.
The recognition accuracy of each model under different load conditions is shown in Figures 4 and 5.
It can be seen from Figures 4 and 5 that the average classification accuracy of the GCNN-LightGBM model is slightly higher than that of the CNN-LightGBM model under the same load condition, while it is 2.72% higher than that of the CNN-LightGBM model under the variable load condition. This verifies that the improved convolutional neural network has a better anti-overfitting effect and can improve the generalization ability of the model. The average classification accuracy of the GCNN-LightGBM model is 1.05% and 0.77% higher than that of the GCNN-softmax model under the same load and variable load conditions, respectively, which indicates that LightGBM has a stronger classification ability than softmax. The LightGBM classifier alone achieves good classification results under the same load condition, but its average classification accuracy is less than 68% under the variable load condition. This indicates that, although LightGBM is a very powerful classifier, it easily overfits when it is trained directly on the original data, and it is necessary to extract features from the original data first. Under variable load conditions, the classification accuracy between adjacent loads is high, which indirectly reflects that the distribution difference between adjacent load data sets is small, whereas the distribution difference between nonadjacent load data sets is large.

Accuracy Rate of Fault Diagnosis.
Since the classification accuracy of the GCNN-LightGBM model is close to 100% on the same load test set, several deep learning models [24,25] that have achieved good classification results under the same load conditions are selected for comparative analysis on the variable load test sets, so as to highlight the advantages of the GCNN-LightGBM model in generalization ability and load migration ability. The classification accuracy of the different deep learning models on the variable load adaptive data sets is shown in Figure 6.
It can be seen from Figure 6 that the CNN-LSTM and WDCNN models have strong adaptability when they are trained under the load conditions of 1 HP and 2 HP, and their classification accuracy can reach more than 90% on the other variable load test sets. However, their load migration ability is not strong when they are trained under the load condition of 3 HP, and their classification accuracy only reaches about 80% under the load conditions of 1 HP and 2 HP. The classification accuracy of the CNN-SVM model trained under the load condition of 3 HP is close to 100% on the other test sets, but the classification accuracy of the CNN-SVM model trained under the load condition of 1 HP is even less than 80% on the other test sets. This shows that the overall robustness and load migration ability of the three comparison models are not very strong, although they can achieve good classification results on some variable load test sets. The lowest classification accuracy of the GCNN-LightGBM model under the variable load conditions is about 87.89%, which is 20.57%, 9.93%, and 11.10% higher than the worst cases of the CNN-LSTM, WDCNN, and CNN-SVM models, respectively. The average classification accuracy of the GCNN-LightGBM model is 94.64%, which is significantly higher than that of the other models. It can be seen that the GCNN-LightGBM model has a better overall classification effect under the variable load conditions and also has better generalization ability and load migration ability.

Efficiency of Fault Diagnosis.
To further highlight the advantages of the GCNN-LightGBM model in the efficiency of rolling bearing fault diagnosis, the training time and diagnosis time of each model, the number of training parameters of each deep learning model, and the number of layers with training parameters (excluding pooling layers) are recorded during the comparative test; these values are shown in Table 5. The average duration is obtained by averaging the 10 training runs of each model under each load condition and then averaging over the different loads.
It can be seen from Table 5 that the GCNN-LightGBM model requires the fewest training parameters and network layers; in particular, its number of training parameters differs from that of the other three networks by several orders of magnitude, and its average training and diagnosis times are the shortest among the four models.

Conclusions
To address the weak generalization ability and long training time of most fault diagnosis models based on deep learning, an intelligent diagnosis method for rolling bearing faults based on an improved convolution neural network and a light gradient boosting machine is proposed.
(1) First, random deactivation with a probability of 0.2 is applied to the original one-dimensional vibration signal, so as to improve the generalization ability of the trained model and the stability of fault diagnosis under variable load conditions. Second, the signal is input into the GCNN. In the first layer, a large convolution kernel is used to obtain more effective information in the low- and medium-frequency bands of the original signal. The feature maps obtained by the two-layer convolution and pooling operation are input into the global average pooling layer, and secondary feature extraction and data dimension reduction are realized by averaging each feature map. Finally, the extracted low-dimensional features are input into the LightGBM classifier for classification.
(2) The results show that (a) the average classification accuracy of the GCNN-LightGBM model is 99.72% on the same load test set and 95.62% on the variable load test set; (b) the GCNN-LightGBM model has a higher average classification accuracy on the variable load test set than the CNN-LSTM, WDCNN, and CNN-SVM models, and it has stronger generalization ability and load migration ability; (c) the GCNN-LightGBM model needs only two trainable layers and fewer than 3,000 parameters, and its training and fault diagnosis durations are 39.73 s and 0.09 s, respectively. These values are far lower than those of the comparison models, which shows that the GCNN-LightGBM model has the advantages of simple structure, few parameters, and high efficiency of training and fault diagnosis.
(3) In this study, the generalization ability of the model is improved with respect to changes of load. In the future, the robustness of the model will be further improved by adding noise interference to the samples.

Data Availability
The data used to support the findings of this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.