A Fault Diagnosis Approach for Gas Turbine Exhaust Gas Temperature Based on Fuzzy C-Means Clustering and Support Vector Machine

As an important gas path performance parameter of gas turbine, exhaust gas temperature (EGT) can represent the thermal health condition of gas turbine. In order to monitor and diagnose the EGT effectively, a fusion approach based on fuzzy C-means (FCM) clustering algorithm and support vector machine (SVM) classification model is proposed in this paper. Considering the distribution characteristics of gas turbine EGT, FCM clustering algorithm is used to realize clustering analysis and obtain the state pattern, on the basis of which the preclassification of EGT is completed. Then, SVM multiclassification model is designed to carry out the state pattern recognition and fault diagnosis. As an example, the historical monitoring data of EGT from an industrial gas turbine is analyzed and used to verify the performance of the fusion fault diagnosis approach presented in this paper. The results show that this approach can make full use of the unsupervised feature extraction ability of FCM clustering algorithm and the sample classification generalization properties of SVM multiclassification model, which offers an effective way to realize the online condition recognition and fault diagnosis of gas turbine EGT.


Introduction
With the development of high efficiency and clean energy, gas turbine plays an increasingly significant role in different domains, such as aviation and marine propulsion systems, electric power stations, and natural gas and petroleum transportation [1]. With the increasing demand for safe operation of gas turbine, the traditional regular maintenance technology has been unable to fully keep up with the actual demand, and gas turbine health management technology has gradually become one of the problems of greatest concern to researchers and users in recent years [2]. In order to guarantee that the gas turbine runs efficiently under safe and reliable conditions, many sensors are often used to monitor the health state of gas turbine in practical applications. The massive amounts of data gathered by these sensors can complicate data analysis and affect maintenance decisions. Therefore, the choice of appropriate monitoring parameters, signal processing methods, and data mining techniques is very important to realize the health management of gas turbine.
Exhaust gas temperature (EGT) is an important gas path performance parameter of gas turbine, which can represent the thermal health condition of gas turbine [3,4]. Considering the characteristics of different gas path performance monitoring parameters, multiple linear regression models for analyzing the relationship between EGT and other parameters were established by Song et al. [5]. Their results showed that there were strong linear correlations between different gas path performance parameters and that the low turbine outlet pressure, high rotational speed, high pressure compressor outlet temperature, low rotational speed, and high pressure compressor outlet pressure could all be reflected through the change of EGT. Yilmaz [6] also found similar results by analyzing the relationship between EGT and other engine operational parameters at two different power settings, maximum continuous and take-off, in the CFM56-7B turbofan engine. Hence, EGT is often used as an important parameter to evaluate the health state of gas turbine and determine the maintenance policy [7].
In the past half century, different methods have been developed to monitor and diagnose the EGT of gas turbine. Wang and Yang [8] analyzed many faults of the PG6551B industrial gas turbine, such as turbine ablation and failures of combustion components and the fuel system. They found that the uniformity of EGT could effectively reflect the features of these faults. Chen et al. [9] proposed a general regression neural network (GRNN) approach to construct an autodetection network for EGT sensors; on this basis, they also studied the optimized design of the network and error control and developed a threshold method for sensor detection. Based on the advantages of artificial neural networks (ANN), Muthuraman et al. [10] developed an autoassociative neural network approach to detect combustor-related damage by monitoring EGT. Błachnio and Pawlak [11] established a nonlinear observer and chose EGT as the important parameter to evaluate the health state of turbine blades. Korczewski [12-14] analyzed in detail the change rules of EGT for a naval gas turbine engine under steady and unsteady operating conditions and proposed an effective approach for detecting and evaluating failures of the flow section and supply system of the gas turbine by using EGT. Kenyon et al. [15] developed an intelligent system for detection of EGT anomalies in gas turbines by using the strong nonlinear mapping ability of ANN. Considering the characteristics of gas turbine operation control based on thermocouple measured exhaust temperatures, Xia et al. [16] discussed the application of fiber-Bragg-grating-based sensing technology in the EGT measurement of gas turbine. Their results demonstrated that the fiber sensing method was more valuable for the monitoring and fault diagnosis of gas turbine because it could well reflect changes in EGT. In order to increase the operational availability of industrial gas turbines, Yang et al. [17] presented a generalization of multidimensional linear regression to facilitate multisensor fault detection and signal reconstruction through the use of analytical optimization. Gülen et al. [18] discussed the relationship between EGT and other gas turbine performance parameters, on the basis of which an important diagnostic parameter named profile factor, the ratio of the maximum exhaust thermocouple reading to the average of all exhaust temperature thermocouples, was used to evaluate the performance of the combustor and the whole gas turbine.
From the review above, it is noted that the average EGT is used to evaluate the health state of gas turbine in most studies. However, it may be more important and valuable to extract the relationship among different EGT sensors in order to realize condition monitoring and fault diagnosis of gas turbine effectively. Although many studies have presented the effects of EGT distribution characteristics on the health state of the gas turbine or its hot sections, and many analyses were discussed in detail, there is still a lack of systematic research in the area of online automatic identification and fault diagnosis for gas turbine EGT. Besides, EGT can be affected by many uncertain factors in practical applications, which makes it difficult to realize fault diagnosis quickly by using a traditional model-driven approach. Therefore, it is very useful to develop a data-driven approach based on artificial intelligence technology in order to improve fault diagnosis accuracy.
Fault diagnosis of gas turbine EGT based on a data-driven approach is essentially the clustering and classification of fault information. In the concrete implementation process, the training samples including normal and fault information need to be obtained first. Then the fault diagnosis model based on an artificial intelligence algorithm can be established and trained by using the training samples. Finally, the feature information collected from sensors is input to the well trained fault diagnosis model to obtain the diagnostic results. Obviously, the establishment of the fault information feature space and the design of the artificial intelligence algorithm are two key steps to realize accurate fault diagnosis of gas turbine EGT. As mentioned above, many specific state patterns or fault types of gas turbine EGT cannot be determined directly based on experience in practical applications due to the effects of many uncertain factors. Besides, enough prior knowledge including specific fault types is indispensable for supervised artificial intelligence algorithms (such as ANN [19] and SVM [20]).
Considering the distribution characteristics of gas turbine EGT and the deficiencies of the present literature, a fusion approach based on FCM clustering algorithm and SVM classification model (FCM-SVM) is proposed in this paper. Firstly, FCM clustering algorithm is used to realize clustering analysis and obtain the state patterns of EGT, which means that the preclassification of EGT is completed. Then, SVM multiclassification model is designed to carry out the state pattern recognition and fault diagnosis of EGT. Finally, as an example, the historical monitoring data of EGT from an industrial gas turbine is analyzed to verify the effectiveness of the FCM-SVM approach.
The rest of this paper is organized as follows. In Section 2, the distribution characteristics of gas turbine EGT are described briefly. Section 3 introduces the basic theory of FCM clustering algorithm and SVM classification model in detail. The fusion fault diagnosis approach which combines FCM clustering with SVM is discussed in Section 4. Application examples and discussion are included in Section 5. Finally, Section 6 presents some conclusions.

Signal Feature of Gas Turbine EGT
As mentioned above, it is very important to choose the appropriate measured parameters to monitor and diagnose the health state of gas turbine. Gas path and vibration parameters are the two main types in practical applications [2,21]. Theoretically, as the most important gas path performance parameter for gas turbine, the outlet temperature of the combustor chamber not only affects the overall performance of the engine but also directly determines the ultimate strength of the turbine blades. For example, the creep life of hot channel components can be reduced by an order of magnitude when the outlet temperature of the combustor chamber increases by 50 °C [4], which may cause major faults and incur great maintenance costs. However, the outlet temperature of the combustor chamber is usually so high that it cannot be measured directly by using conventional sensors. According to the well-defined Brayton thermodynamic cycle, there is a consistent relationship between the outlet temperature of the combustor chamber and EGT. Therefore, EGT, as a measured parameter, is often used for gas engine control, condition monitoring, fault diagnosis, and maintenance decisions.
Compared with the average EGT, the EGT profile can contain more information about the health state of gas turbine. Figures 1 and 2 show two EGT profiles with the same average EGT of an industrial gas turbine that has 12 EGT sensors. From Figures 1 and 2, it is easy to see that the EGT profiles of the normal and fault conditions are different although the average EGT is the same. This means that the fault information is incorrect or incomplete if only the average EGT is used to monitor the health state of gas turbine. Besides, Figures 1 and 2 also show that all the sensors should give similar outputs when the gas turbine operates in normal condition. If a component of the gas turbine fails, different temperatures will be observed. Therefore, the uniformity of EGT can more effectively reflect the health state of gas turbine, especially for steady state condition.
In order to quantitatively describe the uniformity of EGT, Mao [22] presented three indexes, $S_1$, $S_2$, and $S_3$, which are calculated from the sensor outputs. Assume that there are $n$ sensors and their original outputs are $T_i$, $i = 1, \ldots, n$, respectively. According to these indexes, the values of $S_1$, $S_2$, and $S_3$ are all small when the gas turbine operates in normal condition. If a sensor fails, it usually causes $S_1$ or all three indexes to increase, whereas a hot section failure can cause $S_1$ and $S_2$ or all three indexes to increase. However, it is worth pointing out that the above indexes only consider 4 EGT readings and the others are ignored. Besides, it is difficult to diagnose the specific cause of failure when a sensor is faulty because the above indexes ignore the adjacent information between different sensors. Therefore, there are some limitations in evaluating the uniformity of EGT using only the above three indexes. In order to solve this problem, all the measured EGT are used to realize clustering and fault diagnosis in this paper.
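The expressions for the three uniformity indexes are not reproduced in this text. As a minimal sketch only, the code below assumes a spread-style definition: the highest reading minus the lowest, second-lowest, and third-lowest readings. That assumption is consistent with the remark that only 4 EGT readings enter the indexes, but it is not confirmed by the source. The function name `spread_indexes` and both 12-sensor profiles are hypothetical.

```python
def spread_indexes(temps):
    """Return three spread indexes under the ASSUMED definition:
    highest reading minus the lowest, second-lowest, and third-lowest."""
    if len(temps) < 4:
        raise ValueError("need at least four sensor readings")
    ordered = sorted(temps)
    t_max = ordered[-1]
    return tuple(t_max - ordered[k] for k in range(3))


# Hypothetical 12-sensor profiles (deg C); both average exactly 505.
normal = [505, 507, 503, 506, 504, 505, 508, 502, 505, 506, 504, 505]
faulty = [505, 507, 503, 506, 504, 505, 548, 462, 505, 506, 504, 505]

print(sum(normal) / 12, spread_indexes(normal))
print(sum(faulty) / 12, spread_indexes(faulty))
```

The two made-up profiles share the same mean, yet the indexes separate them clearly, which illustrates why the average EGT alone can hide a fault. The sketch also shows the limitation noted above: only four of the twelve readings influence the result.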

Fuzzy C-Means Clustering Algorithm.
As an unsupervised machine learning method, FCM clustering algorithm was improved by Bezdek [23] in 1981 in order to solve the hard clustering problem by using fuzzy set theory.In the FCM clustering algorithm, membership degree function is used to indicate the extent to which each data point belongs to each cluster, and this information is also used to update the values of cluster centers [24].Based on the concept of fuzzy C-partition, FCM clustering algorithm has been applied successfully in a wide variety of applications, such as image segmentation [25], data mining [26], thermal system monitoring [27], and fault diagnosis [28].
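For the sample set $X = \{x_1, x_2, \ldots, x_n\}$, the objective of the FCM clustering algorithm is to divide the sample set into $c$ groups and obtain the cluster centers by minimizing the following dissimilarity function [29]:

$$J(U, V) = \sum_{i=1}^{n} \sum_{j=1}^{c} u_{ij}^{m} d_{ij}^{2}$$

where $m$ is the fuzzy weighting parameter varying in the range $[1, \infty)$. The bigger $m$ is, the fuzzier the final cluster result. $U$ is the fuzzy partition matrix, $V$ is the cluster center matrix, and $n$ and $c$ are the number of samples and cluster centers, respectively. $u_{ij}$ is the fuzzy membership degree of the $i$th sample in the $j$th cluster, and it should meet the following three constraints [30]:

$$u_{ij} \in [0, 1], \qquad \sum_{j=1}^{c} u_{ij} = 1 \;\; \forall i, \qquad 0 < \sum_{i=1}^{n} u_{ij} < n \;\; \forall j$$

The distance $d_{ij}$ between the $i$th sample $x_i$ and the centre of the $j$th cluster $v_j$ can be calculated by using the Euclidean distance as follows:

$$d_{ij} = \| x_i - v_j \|$$

The cluster center $v_j$ can be calculated by utilizing the following formulation:

$$v_j = \frac{\sum_{i=1}^{n} u_{ij}^{m} x_i}{\sum_{i=1}^{n} u_{ij}^{m}}$$

In essence, fuzzy clustering is performed through an iterative optimization by updating the fuzzy membership degree [29]:

$$u_{ij}^{(k+1)} = \left[ \sum_{l=1}^{c} \left( \frac{d_{ij}^{(k)}}{d_{il}^{(k)}} \right)^{2/(m-1)} \right]^{-1}$$

where $k$ is the iterative step.
When the requirement below is met, the iteration stops and the cluster result is obtained:

$$\| U^{(k+1)} - U^{(k)} \| \le \varepsilon$$

where $\varepsilon$ is the iterative threshold in the range $[0, 1]$.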
Based on the above method, the clustering process of the FCM algorithm essentially determines the fuzzy membership degrees and cluster centers through continuous iteration, as shown in Figure 3.
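The iteration described above (update the centers from the memberships, update the memberships from the Euclidean distances, stop when the memberships settle) can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: the function name `fcm`, the default values `m=2.0` and `eps=1e-5`, the random initialization, the largest absolute membership change used as the stopping norm, and the toy data are all assumptions.

```python
import math
import random


def fcm(samples, c, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Fuzzy C-means sketch. Returns (memberships, centers).

    The returned centers are the ones the final memberships were computed from.
    """
    rng = random.Random(seed)
    n, dim = len(samples), len(samples[0])
    # Random initial memberships, normalized so that every row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        total = sum(row)
        u.append([value / total for value in row])
    centers = []
    power = 2.0 / (m - 1.0)  # requires m > 1
    for _ in range(max_iter):
        # Centers: membership-weighted means of the samples.
        centers = []
        for j in range(c):
            weights = [u[i][j] ** m for i in range(n)]
            denom = sum(weights)
            centers.append([
                sum(w * x[d] for w, x in zip(weights, samples)) / denom
                for d in range(dim)
            ])
        # Memberships: inverse relative distances to all centers.
        new_u = []
        for x in samples:
            dists = [math.dist(x, v) for v in centers]
            if min(dists) == 0.0:
                # The sample sits exactly on a center: assign it fully there.
                hit = dists.index(0.0)
                row = [1.0 if j == hit else 0.0 for j in range(c)]
            else:
                row = [1.0 / sum((dj / dl) ** power for dl in dists)
                       for dj in dists]
            new_u.append(row)
        change = max(abs(a - b)
                     for old, new in zip(u, new_u)
                     for a, b in zip(old, new))
        u = new_u
        if change <= eps:
            break
    return u, centers


# Toy 2-D data: two well-separated groups of three points each.
data = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
        [5.0, 5.1], [5.2, 4.9], [4.9, 5.0]]
u, centers = fcm(data, c=2)
labels = [row.index(max(row)) for row in u]
print(labels)
```

Taking the largest membership in each row converts the fuzzy partition into hard labels, which is one simple way to obtain the preclassification used later for supervised training.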

Support Vector Machine Classification Model.
Compared with the conventional classifiers, support vector machine (SVM), developed by Vapnik [31], can effectively solve the classification problem by implementing structural risk minimization based on statistical learning theory. Nowadays, SVM has been widely and successfully applied to detection and diagnosis of machine conditions due to its high accuracy and good generalization for a smaller number of samples [32,33]. SVM is initially used to deal with binary classification problems. Its core idea is to transform the sample data from the original space to a higher dimensional feature space through some nonlinear mapping functions and then find the optimal separating hyperplane in this feature space to realize linear classification. Figure 4 shows the classification principle based on SVM for the nonlinear classification problem.
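For a nonlinear training sample data set including two classes $\{x_i, y_i\}$, $i = 1, 2, \ldots, n$, with $x_i \in \mathbb{R}^d$ and $y_i \in \{-1, +1\}$, $n$ is the number of samples. The nonlinear mapping function $\varphi(x)$ is used to transform the sample data from the original space to a higher dimensional feature space, and the optimal separating hyperplane can be constructed to maximize the margin between the two classes by the following linear function:

$$f(x) = w \cdot \varphi(x) + b$$

where $w$ is the normal vector of the optimal separating hyperplane and $b$ is a scalar. In essence, the solution of the optimal separating hyperplane is the corresponding constrained optimization problem:

$$\min_{w, b, \xi} \; \frac{1}{2} \| w \|^2 + C \sum_{i=1}^{n} \xi_i \quad \text{s.t.} \quad y_i \left( w \cdot \varphi(x_i) + b \right) \ge 1 - \xi_i, \;\; \xi_i \ge 0, \;\; i = 1, \ldots, n$$

where $C$ is the penalty factor that realizes the trade-off between empirical risk and confidence interval and $\xi_i$ is the slack factor.
Using the method of Lagrange multipliers, the above convex optimization problem can be simplified into the dual quadratic optimization problem:

$$\max_{\alpha} \; \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} \alpha_i \alpha_j y_i y_j \, \varphi(x_i) \cdot \varphi(x_j) \quad \text{s.t.} \quad \sum_{i=1}^{n} \alpha_i y_i = 0, \;\; 0 \le \alpha_i \le C$$

where $\alpha_i$ is the Lagrangian multiplier. Then, the nonlinear decision function is described as

$$f(x) = \operatorname{sign} \left( \sum_{i=1}^{n} \alpha_i y_i \, \varphi(x_i) \cdot \varphi(x) + b \right)$$

In order to calculate the value of $\varphi(x_i) \cdot \varphi(x_j)$, the kernel function $K(x_i, x_j)$ is used and the above function can be expressed as

$$f(x) = \operatorname{sign} \left( \sum_{i=1}^{n} \alpha_i y_i K(x_i, x) + b \right)$$

For the SVM, there are many kinds of kernel function, such as linear kernel, polynomial kernel, and radial basis function (RBF) kernel. Compared with other kernel functions, the RBF kernel can obtain higher classification accuracy in many practical applications [34]. Therefore, the RBF kernel is used in this study.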
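To make the kernel form of the decision function concrete, the sketch below evaluates it for a tiny, hand-picked model. It assumes the common RBF parameterization $K(x, z) = \exp(-\gamma \|x - z\|^2)$; the value of `gamma`, the two support vectors, the multipliers, and the bias are hypothetical and were not obtained by training.

```python
import math


def rbf_kernel(x, z, gamma=0.5):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)


def decision(x, support_vectors, labels, alphas, b, gamma=0.5):
    """Sign of sum_i alpha_i * y_i * K(x_i, x) + b."""
    score = b + sum(
        alpha * y * rbf_kernel(sv, x, gamma)
        for sv, y, alpha in zip(support_vectors, labels, alphas)
    )
    return 1 if score >= 0 else -1


# Hypothetical model: one support vector per class, equal multipliers.
support_vectors = [[0.0, 0.0], [2.0, 2.0]]
labels = [-1, +1]
alphas = [1.0, 1.0]
bias = 0.0

print(decision([0.1, 0.2], support_vectors, labels, alphas, bias))
print(decision([1.9, 2.1], support_vectors, labels, alphas, bias))
```

Only kernel values between the query point and the support vectors are needed, so the mapping to the higher dimensional feature space is never computed explicitly.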
As previously mentioned, SVM is initially designed for binary classification. However, there are often many fault types in practical applications, which means that it is necessary to develop a method to deal with the multiclassification problem. Currently, different methods have been developed for multiclassification based on SVM, such as "one-against-one," "one-against-all," and directed acyclic graph (DAG). According to the comparison results obtained by Hsu and Lin [35], the "one-against-one" method is more suitable for practical use than the other methods. For a sample set including $k$ classes, $k(k - 1)/2$ SVM classifiers are constructed by using the "one-against-one" method, and each SVM classifier is trained on the samples of one pair of classes.
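The pairing and voting logic of the "one-against-one" scheme can be sketched as follows. To keep the example self-contained, the binary classifiers are stand-in nearest-prototype rules on a single number rather than trained SVMs; the class labels, prototype values, and helper names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def pairwise_tasks(classes):
    """All unordered class pairs: k * (k - 1) / 2 binary problems."""
    return list(combinations(classes, 2))


def vote(sample, binary_classifiers):
    """Each pairwise classifier casts one vote; the most voted class wins."""
    ballots = Counter(clf(sample) for clf in binary_classifiers.values())
    return ballots.most_common(1)[0][0]


def make_rule(a, b, prototypes):
    """Stand-in binary rule (NOT an SVM): pick the closer class prototype."""
    return lambda x: a if abs(x - prototypes[a]) <= abs(x - prototypes[b]) else b


classes = ["F1", "F2", "F3", "F4"]
prototypes = {"F1": 0.0, "F2": 1.0, "F3": 2.0, "F4": 3.0}  # made-up values
rules = {pair: make_rule(pair[0], pair[1], prototypes)
         for pair in pairwise_tasks(classes)}

print(len(rules))        # number of pairwise classifiers for four classes
print(vote(2.1, rules))  # class with the most votes
```

With four classes the scheme needs six binary classifiers. Ties between classes are possible in general and need an explicit rule; this sketch simply keeps whichever tied class was counted first.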

Fusion Fault Diagnosis Modeling of EGT Based on FCM-SVM Approach
As a supervised artificial intelligence method, the SVM classification model requires enough prior knowledge, including specific fault types. However, the fault types of gas turbine EGT cannot be determined directly based on experience in practical applications due to the effects of many uncertain factors. In order to achieve automatic monitoring and diagnosis of EGT effectively, a fusion approach based on FCM clustering algorithm and SVM classification model (FCM-SVM) is proposed. Firstly, FCM clustering algorithm is used to realize clustering analysis and obtain the state patterns, which means that the preclassification of EGT is completed. Then, SVM multiclassification model is designed and used to carry out the online state pattern recognition and fault diagnosis of gas turbine EGT.
Figure 5 shows the fusion fault diagnosis framework of gas turbine EGT based on FCM-SVM approach. The detailed modeling processes are as follows.
Step 1. Sufficient EGT samples are generated from the historical database, and the essential preprocessing of the EGT data, such as filling in missing data and eliminating noise and outliers, is carried out before data analysis.
Step 2. According to the cluster process which is shown in Figure 3, FCM clustering algorithm is used to obtain the initial clustering results of gas turbine EGT.
Step 3. Cluster validity index $L(c)$ is used to evaluate the validity of clustering and determine the number of clusters. The number of clusters $c$ is optimal when $L(c)$ reaches its maximum value.
Step 4. After obtaining the optimal clustering results, the fault diagnosis sample set including specific fault types can be established.
Step 5. SVM multiclassification model will be designed based on "one-against-one" method and trained by using fault diagnosis sample set.
Step 6. The measured EGT obtained from the real gas turbine are preprocessed and input to the well trained SVM multiclassification model. Then we can get the final diagnostic results.
Step 7. The measured EGT are also stored in the historical database and used for later analysis.
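Step 3 relies on a cluster validity index whose expression is not reproduced in this text. As an illustration only, the sketch below assumes one common form: the ratio of membership-weighted between-cluster scatter to within-cluster scatter, each normalized by its degrees of freedom, so that larger values indicate a better partition. The function name `validity_index`, the exponent `m=2.0`, and the toy partitions are assumptions, and the index actually used in the paper may differ.

```python
def validity_index(samples, u, centers, m=2.0):
    """ASSUMED index: fuzzy between-cluster / within-cluster scatter ratio."""
    n, c, dim = len(samples), len(centers), len(samples[0])
    if not 1 < c < n:
        raise ValueError("need 1 < c < n")
    mean = [sum(x[d] for x in samples) / n for d in range(dim)]

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    between = sum(u[i][j] ** m * sq_dist(centers[j], mean)
                  for i in range(n) for j in range(c))
    within = sum(u[i][j] ** m * sq_dist(samples[i], centers[j])
                 for i in range(n) for j in range(c))
    # Note: a partition with zero within-cluster scatter is not handled here.
    return (between / (c - 1)) / (within / (n - c))


# Toy data and two hand-made partitions into two clusters.
samples = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [5.0, 6.0]]
good_u = [[1, 0], [1, 0], [0, 1], [0, 1]]
good_centers = [[0.0, 0.5], [5.0, 5.5]]
poor_u = [[1, 0], [0, 1], [1, 0], [0, 1]]
poor_centers = [[2.5, 2.5], [2.5, 3.5]]

good = validity_index(samples, good_u, good_centers)
poor = validity_index(samples, poor_u, poor_centers)
print(good, poor)
```

In the workflow above, an index of this kind would be evaluated once for every candidate number of clusters, and the count with the largest value would be kept before the labeled sample set is built in Step 4.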

Case Study and Discussions
In order to demonstrate the effectiveness of FCM-SVM approach introduced in this paper, the historical monitoring data of EGT from one industrial single shaft gas turbine will be analyzed as a case study in this section.

Sample Data.
The Taurus 70 is an industrial single shaft gas turbine made by Solar Turbines Incorporated and used for power generation. 12 thermocouple temperature sensors are used to measure the EGT, and the average EGT is about 505 °C when the gas turbine operates in a normal state. Figure 6 shows the curves of the 12 EGT readings over time under the normal running state of the gas turbine, and the EGT profiles can be seen in Figure 7. From Figures 6 and 7, it is clear that there is a significant difference between the measured outputs of different thermocouple temperature sensors at the same time even when the gas turbine is running in a normal state. Therefore, if only the average EGT is used to evaluate and analyse the health state of EGT, much feature information will be ignored, which can decrease the fault diagnosis accuracy. Considering the operating conditions of the gas turbine, 490 groups of data including 4 classes are taken to establish the original sample set. 470 samples are selected randomly as training samples and the remaining 20 samples are used as testing samples.

Optimal Clustering of EGT Based on FCM Clustering Algorithm.
For the FCM clustering algorithm, it is very important to determine an appropriate number of clusters, which is called the cluster validity problem. In this study, the cluster number is decided automatically by using the introduced cluster validity index which is shown in (13). Considering the computational complexity and accuracy, the number of clusters $c$ is commonly searched in the range $[2, \sqrt{n}]$ in practice, where $n$ is the number of samples. For the 470 training samples shown in Table 1, the FCM clustering algorithm therefore iterates stepwise from 2 to 21 clusters. Figure 8 shows the trend of the cluster validity index $L(c)$ as a function of the number of clusters. Based on the result shown in Figure 8, it is clear that $L(c)$ increases at first and then drops as the number of clusters increases. $L(c)$ reaches its maximum value when the number of clusters is 4, which is in agreement with the number of classes in the real samples. Therefore, it is concluded that the FCM clustering algorithm is suitable for optimal clustering of gas turbine EGT.
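The search range quoted above follows from simple arithmetic: the integer part of the square root of 470 is 21. A minimal sketch of the candidate range and of the selection rule (keep the count with the largest index value) is given below; the helper names and the example scores are hypothetical and are not results from the paper.

```python
import math


def candidate_cluster_counts(n_samples):
    """Candidate numbers of clusters: 2 up to floor(sqrt(n_samples))."""
    return list(range(2, math.isqrt(n_samples) + 1))


def select_cluster_count(scores):
    """Return the cluster count whose validity index value is largest."""
    return max(scores, key=scores.get)


counts = candidate_cluster_counts(470)
print(counts[0], counts[-1], len(counts))

# Made-up index values for illustration only.
example_scores = {2: 1.0, 3: 2.0, 4: 3.5, 5: 2.2}
print(select_cluster_count(example_scores))
```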
Considering the high dimension characteristics of samples, it is difficult to realize graphical analysis directly.In this paper, three temperature uniformity indexes described by Mao [22] are used to further analyze and evaluate the cluster results of gas turbine EGT by using graphic approach.

EGT Fault Diagnosis Based on SVM Classification Model.
According to the optimal clustering results, the fault diagnosis training sample set including fault types can be established, which is shown in Table 2. On this basis, 6 SVM classifiers are developed based on the "one-against-one" method. Table 3 shows the fault diagnosis performance of the SVM multiclassification model for the training samples. From Table 3, it is clear that the fault diagnosis accuracy rate […]. Table 4 shows the comparison between the actual results and the fault diagnosis results obtained by using the well trained SVM multiclassification model for the testing samples. Based on the results shown in Table 4, it is demonstrated that the well trained SVM multiclassification model can effectively diagnose the faults of gas turbine EGT with a 95% accuracy rate for the testing samples. The misclassification arises because the sample data were obtained when the turbine blade wear or corrosion was not yet severe. In order to compare with other models, a backpropagation (BP) neural network model is also employed to make the same fault diagnosis, and the results are also listed in Table 4. The comparative analysis shows that the SVM classification model can improve the fault diagnosis accuracy of gas turbine EGT significantly compared with the BP neural network model. All these results indicate that SVM is more suitable for fault diagnosis of gas turbine EGT.

Conclusions
Considering the distribution characteristics of gas turbine EGT and its effect on the health state of gas turbine, a fusion approach based on FCM clustering algorithm and SVM classification model (FCM-SVM) is proposed and successfully applied to an industrial gas turbine in this paper. The analysis presented in this study demonstrates that the FCM-SVM based approach can make full use of the unsupervised feature extraction ability of FCM clustering algorithm and the sample classification generalization properties of SVM multiclassification model, which offers an effective way to realize the online condition recognition and fault diagnosis of gas turbine EGT. In the concrete implementation process, the introduced FCM clustering algorithm is a good alternative to achieve automatic identification of the fault types of gas turbine EGT. In other words, it is effective in overcoming the influence of experience-based judgment on fault types. Besides, the introduction of the SVM multiclassification model has great potential to improve the fault diagnosis performance of gas turbine EGT. It is worth noting that this study focuses only on the artificial intelligence approach for the condition recognition and fault diagnosis of gas turbine EGT and ignores the effects of many other parameters, such as the inlet temperature of the gas turbine. Therefore, further studies and improvements in the application of this approach are needed.

Figure 1: EGT profile of gas turbine with normal condition.

Figure 4: Classification of two classes using SVM.

Figure 5: Fusion fault diagnosis framework of gas turbine EGT based on FCM-SVM approach.

Figure 6: Real-time measured EGT curve of gas turbine with normal condition.

Figure 8: The effect of the number of clusters on cluster validity index $L(c)$.

Figures 9-11 show the cluster results of gas turbine EGT based on FCM clustering algorithm. It may be clearly observed in Figures 9-11 that all three temperature uniformity indexes of the F1 class are relatively small (0 °C ≤ $S_1$ ≤ 30 °C, 0 °C ≤ $S_2$ ≤ 30 °C, and 0 °C ≤ $S_3$ ≤ 30 °C). This means that the 12 thermocouple temperature sensors give almost the same outputs,

Table 1: The original sample set of gas turbine EGT.

Table 2: Fault diagnosis training sample set of EGT.

Table 3: Fault diagnosis performance of SVM multiclassification model for training samples.

Table 4: The comparison results of different fault diagnosis models for testing samples.