Study on Intelligent Diagnosis of Rotor Fault Causes with the PSO-XGBoost Algorithm

On the basis of fault category detection, the diagnosis of rotor fault causes is proposed, which makes a great contribution to the field of intelligent operation and maintenance. To improve diagnostic accuracy and practical efficiency, a hybrid model based on the particle swarm optimization-extreme gradient boosting algorithm, namely, PSO-XGBoost, is designed. XGBoost is used as a classifier to diagnose rotor fault causes, and it performs well thanks to its second-order Taylor expansion of the loss and its explicit regularization term. PSO is used to automatically optimize the process of adjusting the XGBoost parameters, which overcomes the shortcomings of adjusting the parameters of the XGBoost model by the empirical method or the trial-and-error method. The hybrid model combines the advantages of the two algorithms and can diagnose nine rotor fault causes accurately. Following the diagnostic results, maintenance measures referring to the corresponding knowledge base are provided intelligently. Finally, the proposed PSO-XGBoost model is compared with five state-of-the-art intelligent classification methods. The experimental results demonstrate that the proposed method achieves higher diagnostic accuracy and practical efficiency in diagnosing rotor fault causes.


Introduction
The steam turbine rotor plays an important role in transforming thermal energy into mechanical energy. Under high-speed rotating working conditions, any defect on the rotor will affect safe running and may even cause serious accidents [1][2][3].
Therefore, intelligent diagnosis of rotor fault causes is essential in addition to intelligently diagnosing rotor fault categories.
In the field of industrial intelligent operation and maintenance, research studies mainly focus on the detection of rotor fault categories [4][5][6], while far fewer studies address the diagnosis of rotor fault causes. The specific rotor fault causes support a reasonable and practical maintenance decision, ensuring the steam turbine's safe and stable running. The traditional diagnosis of rotor fault causes is mainly based on the expert system [7], but the required knowledge is difficult to obtain, and the portability is poor. A series of running parameters, such as temperature and pressure, can accurately assess the operating status of equipment [8], but they are rarely used to build an intelligent diagnosis system for rotor fault causes. Therefore, intelligent algorithms can be applied to diagnose rotor fault causes and realize intelligent operation and maintenance based on the running parameters of a rotor.
In essence, diagnosing rotor fault causes is a classification problem, and various intelligent classification methods have been applied. Support vector machine [9] (SVM) is a popular supervised learning algorithm that many researchers have trained for classification. Jan et al. [10] used SVM to classify sensor faults. Lobato et al. [11] used SVM for the classification of machinery condition. However, the intelligent diagnosis of rotor fault causes is a typical nonlinear problem. Because SVM is fundamentally a maximum-margin linear classifier, it does not work well on nonlinear problems. Random forest [12] (RF) and gradient boosting decision tree [13] (GBDT) are two well-known ensemble machine learning methods, and the weak learner used in both is the decision tree (DT) model. Wang et al. [14] proposed a hybrid approach based on a random forest classifier for fault diagnosis in rolling bearings. Quiroz et al. [15] used random forests to diagnose broken rotor bar failure in a line start-permanent magnet synchronous motor. Zhu et al. [16] proposed a novel performance fault diagnosis method for SaaS software based on the GBDT algorithm. Zhong et al. [17] used GBDT to predict railway accident types and analyze causes. Although RF and GBDT have advantages such as high classification accuracy, less overfitting, excellent generalization performance, and good interpretability, they also have some shortcomings for the intelligent diagnosis of rotor fault causes. RF may not produce good classifications on small or low-dimensional data. GBDT uses only the first-order Taylor expansion to calculate the loss, which is not accurate enough. On the basis of GBDT, the extreme gradient boosting (XGBoost) algorithm was proposed by Chen Tianqi [18]. The XGBoost algorithm introduces second-order derivatives and regularization terms, which improve classification accuracy whether the data scale is large or small. Zhang et al. 
[19] designed a data-driven method for fault detection of wind turbines using XGBoost. Lei et al. [20] diagnosed hydraulic valves by integrating PCA and XGBoost. Wu et al. [21] proposed a wind turbine fault diagnosis method based on the ReliefF algorithm and the XGBoost algorithm in order to improve fault diagnosis accuracy on wind turbines. Although XGBoost yields excellent classification results, the XGBoost model has many parameters, such as the learning rate, the subsample ratio of columns when constructing each tree, the subsample ratio of columns for each level, the regularization term on weights, and so on. Different combinations of these parameters determine the performance of the model to a large extent [22]. Usually, the parameters of the XGBoost model are set by fixing the values of several parameters and optimizing the others through a finite number of exhaustive trials, seeking the set of parameters that yields the best performance. But the different permutations and combinations increase the complexity of this work, and it is difficult to find the optimal parameters. Finding the most suitable parameters of the XGBoost model is an optimization problem. In recent years, various intelligent optimization algorithms have been proposed [23][24][25]. In [26], an improved PSO-based QEA method was proposed to allocate gate resources. In [27], an enhanced MSIQDE algorithm with multiple strategies was proposed to solve global optimization problems. In [28], an enhanced success-history adaptive DE with a greedy mutation strategy was employed to optimize the parameters of PV models. For the optimization of model parameters, particle swarm optimization (PSO) has a simple principle and is easy to implement. Many researchers have achieved better results by combining PSO with other classification methods. Wang et al. [29] used PSO to search for the optimal architecture of convolutional neural networks. Li et al. [30] used PSO to search for the penalty factor and kernel function of SVM.
To diagnose rotor fault causes accurately and efficiently, a hybrid model based on the particle swarm optimization-extreme gradient boosting algorithm (PSO-XGBoost) is proposed. XGBoost, a scalable end-to-end tree boosting system with a second-order Taylor expansion of the loss and an explicit regularization term, is used as a classifier to diagnose rotor fault causes. PSO is used to automatically optimize parameters such as the L1 and L2 regularization terms on weights during XGBoost model training, which overcomes the low accuracy and low efficiency of adjusting these parameters by the empirical method or the trial-and-error method. The hybrid model, combining the advantages of the two algorithms, can diagnose rotor fault causes more accurately. Following the diagnostic results, maintenance measures referring to the corresponding knowledge base are provided intelligently. The innovations and main contributions of this study are as follows: (1) on the basis of fault category detection, the diagnosis of rotor fault causes is proposed, which makes a great contribution to the field of intelligent operation and maintenance; (2) a novel hybrid model based on PSO and XGBoost is developed to effectively simplify the parameter adjustment process of the XGBoost model and improve the diagnostic accuracy. The remainder of this study is organized as follows. Section 2 introduces the preliminaries for the diagnosis of rotor fault causes. Section 3 conducts an experiment to validate the performance of the proposed method. Conclusions are elaborated in Section 4.

XGBoost Algorithm. XGBoost [18] is a highly scalable end-to-end tree boosting system, which provides a theoretically justified weighted quantile sketch for efficient proposal calculation, a novel sparsity-aware algorithm for parallel tree learning, and an effective cache-aware block structure for out-of-core tree learning.
For a given dataset with n examples and m features, D = {(X_i, y_i)} (|D| = n, X_i \in R^m, y_i \in R), a tree ensemble model uses K additive functions to predict the output. Here, X_i = [F_{i1}, F_{i2}, \ldots, F_{ir}, x_{i1}, x_{i2}, \ldots, x_{is}] is the characteristic parameter of the i-th sample, composed of the fault types F = {F_1, F_2, \ldots, F_r} and the operating parameters x = {x_1, x_2, \ldots, x_s}, where r is the number of fault types and s is the number of operating parameters. The predicted category of the fault cause is

\hat{y}_i = \sum_{k=1}^{K} f_k(X_i), \quad f_k \in \mathcal{F},   (1)

where K is the number of trees and f_k is a function in the functional space \mathcal{F}. The objective function is

Obj = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k).   (2)

The first term l(y_i, \hat{y}_i) is the training loss function, and the second term \Omega(f_k) is the regularization term. The training loss measures how predictive the model is with respect to the training data. The regularization term controls the complexity of the model, which helps to avoid overfitting.

Formally, let \hat{y}_i^{(t)} be the prediction of the i-th instance at the t-th iteration; then f_t is added to minimize the following objective:

Obj^{(t)} = \sum_{i=1}^{n} l\big(y_i, \hat{y}_i^{(t-1)} + f_t(X_i)\big) + \Omega(f_t).   (3)

A second-order approximation can be used to optimize equation (3) in the general setting:

Obj^{(t)} \approx \sum_{i=1}^{n} \big[ l(y_i, \hat{y}_i^{(t-1)}) + g_i f_t(X_i) + \tfrac{1}{2} h_i f_t^2(X_i) \big] + \Omega(f_t),   (4)

where g_i = \partial_{\hat{y}^{(t-1)}} l(y_i, \hat{y}^{(t-1)}) and h_i = \partial^2_{\hat{y}^{(t-1)}} l(y_i, \hat{y}^{(t-1)}) are the first- and second-order gradient statistics of the loss function. After removing all the constants, the specific objective at step t becomes

\widetilde{Obj}^{(t)} = \sum_{i=1}^{n} \big[ g_i f_t(X_i) + \tfrac{1}{2} h_i f_t^2(X_i) \big] + \Omega(f_t).   (5)

The definition of the tree f(X) is refined as

f_t(X) = \omega_{q(X)}, \quad \omega \in R^T, \; q: R^m \to \{1, 2, \ldots, T\}.   (6)

Here, \omega is the vector of scores on the leaves, q is a function assigning each data point to the corresponding leaf, and T is the number of leaves. The regularization term is defined as

\Omega(f) = \gamma T + \tfrac{1}{2} \lambda \sum_{j=1}^{T} \omega_j^2.   (7)

After reformulating the tree model, the objective value with the t-th tree can be written as

Obj^{(t)} = \sum_{j=1}^{T} \Big[ \Big(\sum_{i \in I_j} g_i\Big) \omega_j + \tfrac{1}{2} \Big(\sum_{i \in I_j} h_i + \lambda\Big) \omega_j^2 \Big] + \gamma T,   (8)

where I_j = \{ i \mid q(X_i) = j \} is the set of indices of data points assigned to the j-th leaf. By defining G_j = \sum_{i \in I_j} g_i and H_j = \sum_{i \in I_j} h_i, the expression can be compressed as

Obj^{(t)} = \sum_{j=1}^{T} \big[ G_j \omega_j + \tfrac{1}{2} (H_j + \lambda) \omega_j^2 \big] + \gamma T.   (9)

In equation (9), the \omega_j are independent of each other and the form G_j \omega_j + \tfrac{1}{2}(H_j + \lambda)\omega_j^2 is quadratic, so the best \omega_j for a given structure q(X) and the corresponding best objective reduction are

\omega_j^{\ast} = -\frac{G_j}{H_j + \lambda},   (10)

Obj^{\ast} = -\frac{1}{2} \sum_{j=1}^{T} \frac{G_j^2}{H_j + \lambda} + \gamma T.   (11)

Equation (11) measures how good a tree structure q(X) is. Typically, it is impossible to enumerate all the possible tree structures q, so a greedy algorithm that starts from a single leaf and iteratively adds branches to the tree is used instead. By splitting a leaf into left and right leaves, the score it gains is

Gain = \frac{1}{2} \left[ \frac{G_L^2}{H_L + \lambda} + \frac{G_R^2}{H_R + \lambda} - \frac{(G_L + G_R)^2}{H_L + H_R + \lambda} \right] - \gamma.   (12)

The four parts of equation (12) are the score on the new left leaf, the score on the new right leaf, the score on the original leaf, and the regularization on the additional leaf, respectively.
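As a numerical illustration of the leaf-weight and split-gain formulas above, the quantities in equations (10)-(12) can be computed directly from the summed gradient statistics of each side of a candidate split. This sketch is ours, not the paper's code, and the gradient values are made up:

```python
# Illustrative sketch of the XGBoost leaf-weight (eq. (10)) and
# split-gain (eq. (12)) formulas; gradient statistics are invented.

def leaf_weight(G, H, lam):
    """Optimal leaf score: w* = -G / (H + lambda)."""
    return -G / (H + lam)

def split_gain(GL, HL, GR, HR, lam, gamma):
    """Score gained by splitting one leaf into a left and a right leaf."""
    def score(G, H):
        return G * G / (H + lam)
    return 0.5 * (score(GL, HL) + score(GR, HR) - score(GL + GR, HL + HR)) - gamma

# Candidate split with summed first/second-order gradients on each side:
gain = split_gain(GL=-4.0, HL=3.0, GR=5.0, HR=4.0, lam=1.0, gamma=0.5)
print(round(gain, 4))  # 3.9375: positive, so the split improves the objective
```

A split is kept only when this gain is positive; the regularization constants lambda and gamma thus prune splits that barely reduce the loss.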

Particle Swarm Optimization Algorithm.
The particle swarm optimization algorithm [23] is a popular population-based heuristic algorithm inspired by the foraging behavior of bird flocks.
Suppose a population X = (X_1, X_2, \ldots, X_n) of n particles in a D-dimensional search space, where the i-th particle is represented as a D-dimensional position vector. According to the objective function, the fitness corresponding to each particle's position can be calculated. During each iteration, a particle updates its speed and position through the individual extremum and the population extremum as

V_{id}^{t+1} = \omega V_{id}^{t} + c_1 r_1 (P_{id} - X_{id}^{t}) + c_2 r_2 (P_{gd} - X_{id}^{t}),   (13)

X_{id}^{t+1} = X_{id}^{t} + V_{id}^{t+1}.   (14)

In equations (13) and (14), \omega is the inertia weight; d = 1, 2, \ldots, D; i = 1, 2, \ldots, n; t is the current iteration number; V_{id} is the velocity of the particle; P_{id} is the individual optimum; P_{gd} is the global optimum; c_1 and c_2 are the acceleration constants; and r_1 and r_2 are random numbers subject to a uniform distribution on the (0, 1) interval.
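The update rules in equations (13) and (14) can be sketched in a few lines of Python. This is our own illustrative helper with assumed coefficient values, not the authors' implementation:

```python
import random

def pso_step(X, V, P_best, g_best, w=0.7, c1=1.5, c2=1.5):
    """One PSO iteration: update every particle's velocity (equation (13))
    and position (equation (14)) in place."""
    for i in range(len(X)):
        for d in range(len(X[i])):
            r1, r2 = random.random(), random.random()
            V[i][d] = (w * V[i][d]
                       + c1 * r1 * (P_best[i][d] - X[i][d])
                       + c2 * r2 * (g_best[d] - X[i][d]))
            X[i][d] += V[i][d]
    return X, V

# When a particle already sits at both optima, only the inertia term remains:
X, V = pso_step([[1.0]], [[2.0]], P_best=[[1.0]], g_best=[1.0])
print(X, V)  # [[2.4]] [[1.4]]
```

The inertia weight w balances exploration (large w keeps old momentum) against exploitation (the two attraction terms pulling toward the individual and global optima).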

Improved XGBoost Algorithm Based on PSO.
Although XGBoost achieves excellent results in many applications, it has many parameters, and different combinations of parameters determine the performance of the model to a large extent. PSO is well suited to optimizing the parameters of XGBoost, which can effectively improve the efficiency and accuracy of diagnosing rotor fault causes. In this study, six parameters that have a great influence on the model are optimized by PSO. The information on each parameter is given in Table 1.
According to Table 1, the velocity vector and the position vector of the i-th particle at the t-th iteration can be expressed as

V_i^{(t)} = (V_{i1}^{(t)}, V_{i2}^{(t)}, \ldots, V_{i6}^{(t)}), \quad X_i^{(t)} = (X_{i1}^{(t)}, X_{i2}^{(t)}, \ldots, X_{i6}^{(t)}).   (15)

The position vector is assigned to the corresponding parameters of XGBoost, and the negative accuracy score of the XGBoost model is used as the fitness value to measure the performance of PSO. The fitness value of the i-th particle at the t-th iteration is

F_i^{(t)} = -\frac{1}{N} \sum_{j=1}^{N} I(\hat{y}_j = y_j),   (16)

where F_i^{(t)} represents the negative accuracy score of XGBoost; I(\cdot) is the indicator function, taking 1 and 0, respectively, when its argument is true and false; \hat{y}_j is the prediction label of XGBoost; y_j is the real label of the sample; and N is the total number of samples. The individual optimum of the i-th particle at the t-th iteration is

P_i^{(t)} = \arg\min_{X_i^{(\tau)},\, \tau \le t} F_i^{(\tau)},   (17)

and the global optimum of the population at the t-th iteration is

P_g^{(t)} = \arg\min_{P_i^{(t)},\, 1 \le i \le n} F\big(P_i^{(t)}\big),   (18)

where n is the number of particles. The XGBoost algorithm and the PSO-XGBoost algorithm are shown in Figure 1. The process of the improved XGBoost algorithm based on PSO is shown in Figure 1(b). Compared with the original XGBoost algorithm shown in Figure 1(a), PSO-XGBoost uses the final training accuracy score as the objective function to search for the optimal parameters. The optimal result can be obtained by running PSO-XGBoost once, whereas XGBoost needs to be adjusted manually many times, and the optimal result still may not be obtained.
The procedures of the proposed method for the diagnosis of rotor fault causes, shown in Figure 1(b), are as follows: Step 1. Initialize the particle swarm. Initialize the particle swarm parameters, including the particle number, learning factors, weighting coefficient, and the maximum number of iterations.
Step 2. Train the XGBoost model. The parameters to be optimized change along with the flying of the particles.
Step 3. Calculate and assess the fitness value. The fitness value, originating from the output negative accuracy score of the XGBoost model, is used to evaluate the performance of PSO. A smaller fitness value indicates better performance.
Step 4. Judge the stop condition. Terminate the iteration process and obtain the optimal parameters of the XGBoost model if the maximum number of iterations is reached. Otherwise, return to the iterative calculation.
Step 5. Validate the classification model. Use the optimization results to build the XGBoost model and output the results of diagnosing rotor fault causes.
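Steps 1-5 above can be sketched as a generic PSO search loop. This is our own minimal implementation, not the authors' code; the `fitness` callable would wrap XGBoost training and return its negative validation accuracy, but any function to be minimized works (a sphere function stands in below, and all names are illustrative):

```python
import random

def pso_optimize(fitness, bounds, n_particles=10, iters=30, w=0.7, c1=1.5, c2=1.5):
    """Minimize `fitness` over the box `bounds` (one (low, high) pair per
    hyperparameter). In PSO-XGBoost the fitness would train an XGBoost model
    with the candidate parameters and return its negative accuracy score."""
    D = len(bounds)
    X = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    V = [[0.0] * D for _ in range(n_particles)]
    P = [x[:] for x in X]                       # individual best positions
    Pf = [fitness(x) for x in X]                # individual best fitness values
    g = min(range(n_particles), key=Pf.__getitem__)
    g_best, g_fit = P[g][:], Pf[g]              # global best
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(D):
                r1, r2 = random.random(), random.random()
                V[i][d] = (w * V[i][d] + c1 * r1 * (P[i][d] - X[i][d])
                           + c2 * r2 * (g_best[d] - X[i][d]))
                # Clamp each position to the parameter's allowed range.
                X[i][d] = min(max(X[i][d] + V[i][d], bounds[d][0]), bounds[d][1])
            f = fitness(X[i])
            if f < Pf[i]:                       # update individual optimum
                P[i], Pf[i] = X[i][:], f
                if f < g_fit:                   # update global optimum
                    g_best, g_fit = X[i][:], f
    return g_best, g_fit

# Stand-in fitness instead of XGBoost training:
random.seed(0)
best, best_f = pso_optimize(lambda x: sum(v * v for v in x), [(-5, 5)] * 2)
```

The returned `best` vector would then be assigned to the six XGBoost parameters of Table 1 to build the final classifier, as in Step 5.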

Data Description.
In this study, 450 sets of operation data related to three kinds of high-pressure rotor faults of a 330 MW unit in a power plant are used for example verification. The specifications are given in Table 2. The three kinds of faults, represented by F1, F2, and F3, are the high-pressure rotor rubbing fault (F1), the mass imbalance fault (F2), and the self-excited oscillation fault (including oil film half-speed whirl and oil film oscillation) (F3). C1-C9 represent nine different fault causes. Among them, four causes lead to the rotor rubbing fault: rubbing at the shaft seal caused by cylinder deformation (C1), rubbing at the shaft seal caused by a fast rate of loading up (C2), rubbing at the shaft seal caused by a long time remaining at low load (C3), and rotor rubbing with the oil baffle (C4). Three causes lead to the mass imbalance fault: inadequate stiffness of the bearing pedestal (C5), fracture and falling off of rotating parts (such as blades and coupling windshields) (C6), and other reasons (C7). Two causes lead to the self-excited oscillation fault: poor stability of the bearing (C8) and excessive journal disturbance (C9). A total of 50 groups of data samples for each fault cause constitute the sample set.
In this study, ten running parameters with high correlation to the rotor rubbing fault, the mass imbalance fault, and the self-excited oscillation fault are selected, including the steam temperature of the high-pressure cylinder shaft seal and the cylinder expansion value of the high-pressure cylinder. The details are given in Table 3.

Data Preprocessing.
Data preprocessing aims to adapt the data to the model and match the model's needs. It mainly includes missing value processing, dimensionless processing of the data (including centering and scaling), categorical feature encoding (text to numeric), and continuous feature processing.

Missing Value Processing.
For missing values, in this study, the mean is used to fill numerical features, and the mode is used to fill character features.
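A minimal stdlib-only sketch of this fill rule follows; the column names and values are invented for illustration, not the paper's dataset:

```python
from statistics import mean, mode

def fill_missing(rows):
    """Replace None with the column mean (numeric columns) or the
    column mode (character columns)."""
    for col in rows[0]:
        present = [r[col] for r in rows if r[col] is not None]
        fill = mean(present) if isinstance(present[0], (int, float)) else mode(present)
        for r in rows:
            if r[col] is None:
                r[col] = fill
    return rows

rows = fill_missing([
    {"seal_temp": 540.0, "fault": "F1"},
    {"seal_temp": None,  "fault": "F2"},
    {"seal_temp": 542.0, "fault": None},
    {"seal_temp": 541.0, "fault": "F1"},
])
print(rows[1]["seal_temp"], rows[2]["fault"])  # 541.0 F1
```

Mean imputation keeps the column average unchanged, while mode imputation assigns the most frequent category, which is a reasonable default for labels such as the fault types here.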

Feature Coding of Character Features.
In the original dataset, the fault types in the classification features (rubbing fault (F1), mass imbalance fault (F2), and self-excited oscillation (F3)) and the fault cause category labels (rubbing at the shaft seal caused by cylinder deformation (C1), rubbing at the shaft seal caused by a fast rate of loading up (C2), and so on) are recorded as character strings rather than digits, so they must be encoded numerically before training.
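Such label encoding can be sketched with a small helper of our own (not the paper's code), which assigns integer codes in order of first appearance:

```python
def encode_labels(values):
    """Map character labels (e.g. 'F1'..'F3' or 'C1'..'C9') to integer codes,
    numbered in order of first appearance."""
    codes = {}
    return [codes.setdefault(v, len(codes)) for v in values]

print(encode_labels(["F1", "F2", "F1", "F3"]))  # [0, 1, 0, 2]
```

An ordinal code is sufficient for tree-based models such as XGBoost; distance-based models would instead typically need one-hot encoding.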

Data Standardization.
First, center the data by subtracting the mean (μ). Then, scale them by the standard deviation (σ). The conversion function is x* = (x − μ)/σ. After these two steps, the data follow the standard normal distribution, i.e., x* ∼ N(0, 1).
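The two steps amount to the z-score transform; a small stdlib sketch (with invented sample values) is:

```python
from statistics import mean, pstdev

def standardize(xs):
    """z-score standardization: x* = (x - mu) / sigma."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]

z = standardize([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(round(mean(z), 12), round(pstdev(z), 12))  # 0.0 1.0
```

In practice the mean and standard deviation should be computed on the training split only and then reused to transform the test split, to avoid information leakage.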
The preprocessed dataset is given in Table 4.

Experimental Results.
The test set is used to verify the performance of the PSO-XGBoost model. The model is quantitatively evaluated using evaluation indicators such as accuracy, the confusion matrix, precision, recall, and F1-score [31][32][33].
The results can be divided into four classes: true positive (TP), false positive (FP), true negative (TN), and false negative (FN). Here, TP is a correctly predicted positive instance, FP an incorrectly predicted positive instance, TN a correctly predicted negative instance, and FN an incorrectly predicted negative instance.
Accuracy is simply the ratio of correct predictions to the total dataset. Formally, the accuracy is A = (TP + TN)/(TP + FP + FN + TN).
Precision is the ratio of true positives to all predicted positive observations (both TP and FP). Formally, the precision is P = TP/(TP + FP).
Recall is the ratio of true positives to all instances of the actual positive class. Formally, the recall is R = TP/(TP + FN). The F1-score is the harmonic mean of precision and recall, calculated as F1 = 2 × P × R/(P + R). The confusion matrix is used for evaluating the model in a multiclassification problem. Each column of the confusion matrix represents a predicted category, and the column totals give the number of instances predicted to be in that category. Each row represents the actual category, and the row totals give the number of instances belonging to that category. For a confusion matrix, the larger the values on the diagonal and the smaller the values elsewhere, the better. The result is shown in Figure 2, and PSO-XGBoost's confusion matrix is shown in Figure 3. Figure 2 shows that the overall accuracy of the PSO-XGBoost model is 98.52%. From Figure 3, the accuracy for the rubbing fault caused by cylinder deformation is 92.86%, the accuracy for rubbing at the shaft seal caused by a fast rate of loading up is 92.31%, and the accuracy for the three faults caused by other reasons is 100%. Table 5 provides the accuracy, precision, recall, and F1-score for the PSO-XGBoost model. From this table, it can be seen that the accuracy, precision, recall, and F1-score of the proposed method are all above 98% in diagnosing rotor fault causes, enabling accurate and comprehensive identification of the various categories. Therefore, the proposed method performs well in accuracy, precision, recall, and F1-score.
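These four formulas can be checked with a small helper; the outcome counts below are made up for illustration, not the paper's results:

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, precision, recall, and F1-score from the four outcome counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)          # note: FN, not FP, in the denominator
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

a, p, r, f1 = classification_metrics(tp=45, fp=5, tn=40, fn=10)
print(a, p, round(r, 4), round(f1, 4))  # 0.85 0.9 0.8182 0.8571
```

For a multiclass problem such as the nine fault causes here, these quantities are computed per class (one-vs-rest) and then averaged.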

Comparative Analysis. An investigation of five other classifiers, namely, XGBoost, RF, GBDT, DT, and SVM, is performed to verify the superiority of PSO-XGBoost in classification performance. The classification results of these algorithms are shown in Figures 4-8. Their accuracies are 95.56%, 93.33%, 92.59%, 91.85%, and 84.44%, respectively. Compared with Figure 3, we can conclude that the PSO-XGBoost algorithm is superior to the other five algorithms in classification accuracy.
To provide a detailed quantitative analysis of each classifier's results, five confusion matrices from the five classification experiments are introduced, recording the recognition results and the percentage of misclassification for the different rotor fault causes. Figures 9-13 show the confusion matrices of XGBoost, RF, GBDT, DT, and SVM, respectively. Figures 10-12 show that the RF model and the DT model confuse C1, C2, and C3, while the GBDT model confuses C1, C2, C3, C6, and C7. Figure 13 shows that the SVM model performs worst. Figures 3 and 9 show that both the PSO-XGBoost model and the XGBoost model confuse C1 and C2, but in category C1, PSO-XGBoost has higher accuracy than XGBoost. Therefore, the PSO-XGBoost model is superior to the other five algorithms. The comprehensive model evaluation indicators are given in Table 6. The detailed comparison of the six algorithms in accuracy, precision, recall, and F1-score is shown in Figures 14-17.
In view of Figures 14-17, the SVM model's accuracy, precision, recall, and F1-score are the lowest of the six algorithms because SVM is essentially a maximum-margin linear classifier, which does not work well on nonlinear problems. DT performs better than SVM, but its accuracy, precision, recall, and F1-score are slightly lower than those of the other four algorithms; RF, GBDT, and XGBoost all use the DT model as their weak learner. The iterative convergence curves of the six methods are shown in Figure 18.
From Figure 18, in the initial iteration stage, the iterative curve of the proposed method shows a rapid downward trend, and the iterative process then converges quickly, whereas the other five methods still need several more iterations to reach their final convergence results. Therefore, the proposed method converges faster and has higher efficiency in practice.

Maintenance Strategy according to Fault Causes.
For the nine different rotor fault causes, we build a knowledge base mapping each rotor fault cause to a specific maintenance solution, in order to achieve the purpose of intelligent operation and maintenance. For example, when the rotor fault cause C1 is diagnosed, the computer automatically links to the solution M1. For C8 (poor stability of bearing), the measures include using bearings with good stability such as tilting-pad and elliptical-bush bearings, increasing the bearing specific pressure (e.g., reducing the bearing length and adjusting the bearing height), increasing the temperature of the lubricating oil to reduce its viscosity, and reducing the top clearance of the fixed-pad bearing to improve the bearing preload; for C9 (excessive journal disturbance), the measure M9 is to reduce the vibration of the shaft and the disturbing force on the journal. The other details of the knowledge base are given in Table 7.

Conclusions
On the basis of fault category detection, the diagnosis of rotor fault causes is proposed, which makes a great contribution to the field of intelligent operation and maintenance. This study proposes a hybrid model for diagnosing rotor fault causes using the PSO-XGBoost algorithm. Aiming at the low accuracy and low efficiency of empirical parameter adjustment in the XGBoost model, PSO is used to overcome the difficulty of parameter adjustment when diagnosing rotor fault causes with XGBoost and to improve the diagnostic accuracy at the same time. The experimental results show that: (1) compared with directly constructing the XGBoost model to diagnose rotor fault causes, the hybrid model achieves higher diagnostic accuracy and practical efficiency; (2) the hybrid model can effectively identify nine different failure causes under three types of failures, with classification accuracy, precision, recall, and F1-score all above 98%. Compared with XGBoost, RF, GBDT, DT, and SVM, in terms of comprehensive classification performance, the PSO-XGBoost model is more effective than the other algorithms for diagnosing rotor fault causes.

Data Availability
The CSV data used to support the findings of this study have been deposited in the Baidu Netdisk repository (https://pan.baidu.com/s/1A8jqMmykRYbOxqJwYPC3fw; password: suep).

Conflicts of Interest
The authors declare that they have no conflicts of interest.