Detection of Emerging Faults on Industrial Gas Turbines Using Extended Gaussian Mixture Models

This paper extends traditional Gaussian mixture model (GMM) techniques to provide recognition of operational states and detection of emerging faults for industrial systems. A variational Bayesian method allows a GMM to determine the number of its mixture components automatically during clustering, which facilitates the extraction of steady-state operational behaviour; this is recognised as being a primary factor in reducing the susceptibility of alternative prognostic/diagnostic techniques to false alarms resulting from control set-point and load changes. Furthermore, a GMM with an outlier component is discussed and applied for direct novelty/fault detection. An advantage of the variational Bayesian method over traditional predefined thresholds is the extraction of steady-state data during both full- and part-load cases, and a primary advantage of the GMM with an outlier component is its applicability for novelty detection when there is a lack of prior knowledge of fault patterns. Results obtained from real-time measurements on operational industrial gas turbines have shown that the proposed technique provides an integrated preprocessing, benchmarking, and novelty/fault detection methodology.


Introduction
Industrial gas turbines (IGTs) are utilised globally with units generally acting as prime-movers for either pumps, generators, or compressors, for both on- and off-shore systems. Various root-cause factors responsible for failures on such systems include vibration, shock, noise, heat, cold, dust, corrosion, humidity, rain, oil debris, flow, pressure, and excessive operating speed [1]. A key challenge of condition monitoring in order to provide an "early warning" of faults is to distinguish between sensor-based failure, component failure, and the normal operational transient behaviour of the system, for example, due to control or load changes. With advances in instrumentation, communications hardware, and computational capability [2], there has been an increased ability to realize such prognostic and diagnostic methods and provide remedial action and informed flexible maintenance scheduling prior to encountering unplanned downtime. This is especially pertinent as a result of the increasing operational speeds of IGTs [3].
Two popular categories of monitoring techniques have emerged over recent years, namely, model-based and signal processing-based approaches [4]. Model-based approaches construct models (or virtual sensors) to estimate physical variables from which residuals are calculated and used as indicators of emerging failure modes [5]. However, to build an accurate dynamic model that can accommodate the full operating envelope of IGTs is, in general, computationally demanding. In such circumstances, direct signal processing and data fusion methods often provide for more practical and effective monitoring solutions [6]. It is this latter category of techniques that is considered here.
Traditional signal processing-based methodologies use techniques such as principal component analysis (PCA) [7,8], artificial neural networks (ANNs) [9], and data filter methods [10]. However, monitoring systems based on these algorithms are generally only applicable during steady-state operating conditions, since measurement transients caused by changes of loading or control action can generate "false alarms." Many techniques are therefore only effective under very constrained operating regimes. For instance, [11] employed ANNs for fault detection on gas turbines during the engine start-up phase, whilst [12] only considers solutions during steady-state operation. Typically, such techniques do not attempt to address the issue by incorporating implicit methods that discriminate between steady-state and transient behaviour as part of the fault detection system, and it is this aspect that is initially considered here [13].
By recognizing that steady-state measurement data is often superimposed by noise having a characteristic Gaussian distribution [14], that is, containing practical "measurement transients" that are not due to operational variations, the signals can be modelled through use of Gaussian Mixture Models (GMMs) [15]. This characterization of signals has previously been reported [16,17] and used in a related family of clustering methods based on GMMs, regarded as "soft clustering" techniques, with the benefits for condition monitoring and fault detection explored in [16-18].
Here then, it is initially shown that GMMs provide a convenient mechanism to effectively discriminate between data relating to steady-state operation and data relating to operation with transients, and they are specifically regarded in this instance as a preprocessing tool for subsequent operational pattern discrimination [13]. An essential part of developing the GMM methodology is parameter fitting, which is often carried out using an expectation-maximization (EM) method [19]. However, this requires the number of the mixture components (MCs) in the model to be fixed a priori [18]. Consequently, Bayesian-based frameworks are commonly used to provide a probabilistic inference on the data [20], and Variational Bayesian GMM techniques (VBGMMs) have been proposed [21] to provide improved performance [22,23] by automatically selecting the number of MCs in the GMMs. VBGMMs are able to classify steady-state operation that can occur under full- or part-load conditions (e.g., 50% load). In addition to identifying steady-state operation, the remaining data, including that associated with start-ups, shutdowns, and load-changing conditions, is also naturally separated, so the method can also serve as a preanalysis tool for alternative dynamic scenarios, as reported in [24,25], for instance.
An additional property of GMMs that facilitates novelty/fault detection is the ability to filter novel class samples when the machine learning system has no a priori knowledge [26]. Moreover, GMMs have been previously reported for the detection of "novel" vibration signatures with experimental results showing good fault detection properties with known classifications [27]. However, the technique was sensitive to vibration characteristics that are not associated with the detection of wear or damage and so remained susceptible to initiating false alarms. As a consequence of these features, the outlier components are now included as background "noise" for the GMM [28,29], and the resulting technique is considered as a GMM with an outlier component (GMMOC). When using GMMOC, measurements remain characterized by the GMM, but novel characteristics, that is, measurements that have a very low probability of being clustered into existing distributions, are considered as outliers. It is the identification of such outliers that provides a robust mechanism for IGT fault detection considered here to provide an "early warning" fault detection tool.
Key contributions of this paper are summarized as follows:
(1) An extension to the use of GMMs, including VBGMM and GMMOC, is proposed for novelty/fault detection on industrial systems.
(2) Steady-state operation is discriminated from transient operation using VBGMM.
(3) GMMOC is used to indicate the presence of outliers and hence the emergence of faults.
(4) The efficacy of the proposed techniques is demonstrated from case studies (CSs) on IGTs, where CS1 is considered as a feasibility study of VBGMM and GMMOC for novelty detection and CS2 demonstrates a real bearing fault case study.

Methodology
The stages in the proposed methodology are depicted in Table 1. To provide an application focus for the development of the algorithms, measurements taken from bearing vibration probes on IGTs during commissioning are used as an illustrative example. Operational pattern separation is achieved through the use of VBGMM, where the steady-state data are distinguished for further analysis in this paper. Datasets from the identified transient operation can also be used for fault detection through start-up analysis and shutdown analysis and during load changes [6,24,25], which is not included in the current paper. The most relevant features are then extracted from the steady-state data and a statistical "fingerprint" for the extracted features is obtained through the application of GMMOC.

Underlying Principles of GMM.
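The empirical probability distribution of sampled data can be estimated by a GMM using a linear combination of Gaussian distributions $\mathcal{N}(x \mid \mu, \sigma)$ [15], for example, as a sum of $K$ Gaussian distributions with mean $\mu_k$ and standard deviation $\sigma_k$. The GMM with $K$ MCs is expressed as

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \sigma_k), \tag{1}$$

where $x$ is a multidimensional variable and $\pi_k$ are the mixing coefficients that need to be chosen. Let $X \equiv \{x_n\}$ represent $N$ data samples, $n = 1, \ldots, N$, where each sample consists of a multidimensional variable $x_n$. Provided that $x_n$ are statistically independent, the probability function of $X$ can be expressed as

$$p(X) = \prod_{n=1}^{N} p(x_n). \tag{2}$$

The mixture density is then

$$p(X) = \prod_{n=1}^{N} \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \sigma_k), \tag{3}$$

where $k$ indicates the MC index and $n$ indicates the data sample index. The conditional probability is calculated as

$$\gamma_{nk} = \frac{\pi_k \, \mathcal{N}(x_n \mid \mu_k, \sigma_k)}{\sum_{j=1}^{K} \pi_j \, \mathcal{N}(x_n \mid \mu_j, \sigma_j)}$$

for a selected component $k$ on the given data sample $x_n$. Evaluation of (3) typically necessitates an EM optimization procedure to maximize the log-likelihood function [18] from the maximization step (termed the M-step):

$$\ln p(X \mid \pi, \mu, \sigma) = \sum_{n=1}^{N} \ln \left\{ \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x_n \mid \mu_k, \sigma_k) \right\}.$$

The unknowns in (3) are solved in the expectation step (E-step), from (6), as follows:
(1) Choose an initialization of $\pi_k^{(0)}$, $\mu_k^{(0)}$, and $\sigma_k^{(0)}$.
(2) Iteratively update $\gamma_{nk}$, $\pi_k$, $\mu_k$, and $\sigma_k$ until convergence to a desired tolerance, using the update equations (7), where $D$ is the dimensionality of the data. For the special case $K = 1$, the whole dataset belongs to only 1 cluster, and the problem reduces to that of a Gaussian distribution fit.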
(3) Iteratively update until the predefined tolerance is met.
After calculating $q(\Psi)$, $\mathcal{L}(q)$ can be obtained. Maximization of $\mathcal{L}(q)$, which minimizes $\mathrm{KL}(q \,\|\, p)$, is again achieved by using the EM procedure. The E-step can be derived from (7), and the optimized distributions can be obtained through the M-step as mentioned above. For brevity, the reader is directed to [30] for a more in-depth discussion of the VB framework on GMMs.
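The EM iteration described for the standard GMM can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes one-dimensional data, uses the textbook responsibility and re-estimation formulas, and the names `fit_gmm_em` and `demo_load_samples`, the synthetic load values, and the variance floor are all hypothetical choices made for illustration.

```python
import math
import random


def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def fit_gmm_em(data, mus, sigmas, pis, tol=1e-8, max_iter=500):
    """Fit a 1-D GMM by EM, starting from the supplied initial parameters."""
    mus, sigmas, pis = list(mus), list(sigmas), list(pis)
    n, k_total = len(data), len(mus)
    prev_ll = -math.inf
    ll = prev_ll
    for _ in range(max_iter):
        # Evaluate the responsibilities gamma[n][k] and the log-likelihood.
        gammas, ll = [], 0.0
        for x in data:
            weighted = [pis[k] * gaussian_pdf(x, mus[k], sigmas[k])
                        for k in range(k_total)]
            total = sum(weighted)
            ll += math.log(total)
            gammas.append([w / total for w in weighted])
        # Re-estimate means, standard deviations and mixing coefficients.
        for k in range(k_total):
            n_k = sum(g[k] for g in gammas)
            mus[k] = sum(g[k] * x for g, x in zip(gammas, data)) / n_k
            var = sum(g[k] * (x - mus[k]) ** 2 for g, x in zip(gammas, data)) / n_k
            sigmas[k] = max(math.sqrt(var), 1e-6)  # floor avoids a collapsed component
            pis[k] = n_k / n
        # Stop once the log-likelihood no longer improves noticeably.
        if abs(ll - prev_ll) < tol:
            break
        prev_ll = ll
    return mus, sigmas, pis, ll


def demo_load_samples(seed=0, n_per_level=200):
    """Hypothetical load readings (% of rated power) at part load and full load."""
    rng = random.Random(seed)
    part = [rng.gauss(50.0, 1.0) for _ in range(n_per_level)]
    full = [rng.gauss(100.0, 2.0) for _ in range(n_per_level)]
    return part + full
```

Calling `fit_gmm_em(demo_load_samples(), mus=[40.0, 110.0], sigmas=[10.0, 10.0], pis=[0.5, 0.5])` should recover one component near 50% load and one near 100% load. Note that the number of components is fixed by the length of the initial parameter lists, which is the limitation the variational Bayesian treatment is intended to remove.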

Principles of GMMOC.
GMMOC extends the original GMM approach [28,29] by adding an outlier component that is modelled by a uniform distribution. The hybrid GMMOC model is then written as

$$p(x) = \sum_{k=1}^{K} \pi_k \, \mathcal{N}(x \mid \mu_k, \sigma_k) + \pi_{K+1} \, U(x). \tag{13}$$

Since $U(x)$ is uniformly distributed, the EM procedure described previously can still be employed to solve for the required parameters. The mixing coefficient of the outlier component is normally assumed to be small initially (e.g., 0.01, and therefore the initialization of the mixing coefficients in GMM satisfies $\sum_{k=1}^{K} \pi_k = 0.99$). In this case, there are $(K+1)$ clusters, including $K$ Gaussian distributions and one uniform distribution, the outlier component.
Parameters $\pi_k$, $\mu_k$, and $\sigma_k$ can be estimated from (7), and if the probability of a data sample belonging to any of the $K$ Gaussian MCs is smaller than a predefined threshold, it is clustered to the outlier component and therefore indicates a warning of an emerging fault, or facilitates novelty detection.
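The outlier mechanism can be sketched for the simplest case of one Gaussian component plus a uniform component. This is an illustrative sketch under stated assumptions, not the paper's implementation: the data are one-dimensional, the uniform density is taken over the observed data range, the 0.5 membership threshold is arbitrary, and the function names (`fit_gmmoc`, `flag_outliers`, `gaussian_share`) are hypothetical.

```python
import math


def gaussian_pdf(x, mu, sigma):
    """Density of a 1-D Gaussian with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def gaussian_share(x, mu, sigma, pi_gauss, uniform_density):
    """Posterior probability that x belongs to the Gaussian, not the outlier component."""
    g = pi_gauss * gaussian_pdf(x, mu, sigma)
    u = (1.0 - pi_gauss) * uniform_density
    return g / (g + u)


def fit_gmmoc(data, mu, sigma, outlier_weight=0.01, n_iter=100):
    """EM for one Gaussian (K = 1) plus a uniform outlier component."""
    # Assumption: the uniform component spans the observed data range.
    uniform_density = 1.0 / (max(data) - min(data))
    pi_gauss = 1.0 - outlier_weight  # outlier weight starts small, e.g. 0.01
    for _ in range(n_iter):
        resp = [gaussian_share(x, mu, sigma, pi_gauss, uniform_density) for x in data]
        n_g = sum(resp)
        mu = sum(r * x for r, x in zip(resp, data)) / n_g
        var = sum(r * (x - mu) ** 2 for r, x in zip(resp, data)) / n_g
        sigma = max(math.sqrt(var), 1e-6)
        pi_gauss = min(n_g / len(data), 1.0 - 1e-9)  # keep the outlier weight positive
    return mu, sigma, pi_gauss, uniform_density


def flag_outliers(data, mu, sigma, pi_gauss, uniform_density, threshold=0.5):
    """Indices of samples whose Gaussian membership probability is below threshold."""
    return [i for i, x in enumerate(data)
            if gaussian_share(x, mu, sigma, pi_gauss, uniform_density) < threshold]
```

As a usage example with made-up daily vibration means, `params = fit_gmmoc(daily, mu=30.0, sigma=1.0)` followed by `flag_outliers(daily, *params)` returns the positions of the samples assigned to the outlier component. Because the uniform component absorbs the extreme samples, they do not inflate the fitted Gaussian, which is the practical benefit of the hybrid model.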

Feature Extraction.
Feature extraction provides an essential tool for reducing the dimensionality of raw data whilst keeping informative features [25]. Many feature extraction techniques have been reported and successfully applied, including the use of the Fast Fourier Transform (FFT) and Discrete Wavelet Transform (DWT), all involving elaborate time-frequency transforms [31,32]. However, the data used in the following studies are taken from IGT units in the field with sampling intervals of the order of minutes, which excludes the use of frequency domain based methods as they do not satisfy traditional sampling rate criteria for the measured variables. In this case, statistical features of the data in the time domain are used, for example, the peak value, root mean square, crest factor, kurtosis, clearance factor, impulse factor, shape factor, and skewness, and the most informative features can be identified through optimization methods [33]. For this study, in order not to divert the focus from the use of extended GMMs, only the most basic statistical attributes are employed; the mean (which carries information about measurement equilibrium) and standard deviation (which carries information about signal power) are used here as a proof of concept and also for practical reasons since it has been observed empirically that these features are sufficient in most cases.
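The time-domain features named above are straightforward to compute. The sketch below is illustrative only: it assumes the population form of the standard deviation, non-excess kurtosis, and 1440 samples per day (one-minute sampling); the function names are hypothetical.

```python
import math
import statistics


def time_domain_features(samples):
    """Return basic time-domain statistics for one block of measurements."""
    mean = statistics.fmean(samples)
    std = statistics.pstdev(samples)  # population form (assumption)
    rms = math.sqrt(statistics.fmean(x * x for x in samples))
    peak = max(abs(x) for x in samples)
    features = {
        "mean": mean,
        "std": std,
        "rms": rms,
        "peak": peak,
        "crest_factor": peak / rms if rms > 0 else float("nan"),
    }
    if std > 0:
        m3 = statistics.fmean((x - mean) ** 3 for x in samples)
        m4 = statistics.fmean((x - mean) ** 4 for x in samples)
        features["skewness"] = m3 / std ** 3
        features["kurtosis"] = m4 / std ** 4  # non-excess: 3 for a Gaussian
    return features


def daily_mean_std(samples, samples_per_day=1440):
    """Split a series into whole days and return one (mean, std) pair per day."""
    n_days = len(samples) // samples_per_day
    pairs = []
    for d in range(n_days):
        block = samples[d * samples_per_day:(d + 1) * samples_per_day]
        pairs.append((statistics.fmean(block), statistics.pstdev(block)))
    return pairs
```

Each day then contributes a single point in the (mean, standard deviation) plane, which is the two-dimensional feature space in which the fingerprint of normal operation is built.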

Application Case Studies
The proposed methods are applied to monitor the vibration characteristics of fluid-film inlet- and output-bearings which typically support the compressor rotors of sub-15MW IGTs (Figure 1). The thrust bearings and journal bearings have operating speeds in excess of ∼10,000 rpm. Radial and axial positions are monitored using noncontact probes. Two experimental case studies are now considered, both of which adopt the procedure in Table 1. CS1 uses measurements taken over a relatively short time period (1-month) and demonstrates the effectiveness of VBGMM for operational state discrimination (steady-state/transient behaviour), feature extraction, and the initial setup of a benchmarking ellipse using GMMOC, whilst CS2 considers the analysis of longer periods of measurement data (12-month) and aims to show the efficacy of GMMs for identifying longer-term emerging faults.

Case Study 1: Operation State Discrimination, Feature Extraction, and Novelty Detection.
VBGMM is used to cluster measurements of output power (in terms of loading percentage) from a sub-15MW IGT to classify the unit's operational behaviour (Figure 2). One month (31 days) of daily data, each containing 1440 sample measurements, is used (i.e., sampling period = 1 minute). The resulting clusters from the power/load measurements are also shown in Figure 2 after applying VBGMM to each individual set of daily data.
In line with the algorithm description, it can be seen that when the unit is considered to be operating normally, in steady-state, a classification label of 1 is assigned and that classification labels > 1 are assigned for all other detected cases (cluster result given in Figure 3). A classification label of 0 indicates constant null readings and is precluded from further pattern analysis as the unit is considered to be shut down. Corresponding inlet bearing vibration measurements taken for the same 31-day period are shown in Figure 4. Having discriminated between transient and steady-state operation using VBGMM, the steady-state data (days 1-12 and days 29-31) shown in Figure 5 can be used for subsequent novelty detection.
Having identified appropriate datasets, feature extraction is used to capture important characteristics present in the data. The mean and the standard deviation of each day's data for months 1-3 are calculated and given in Figure 6, to present a benchmarking envelope representing normal operation. Having effectively obtained an operational fingerprint of behaviour, measurements taken over subsequent months are then used and compared to the fingerprint.
Considering only the magnitude of vibration amplitudes, levels up to ∼50 μm are typically considered normal, with warnings at ∼70 μm and unit shutdown occurring at ∼90 μm. Having obtained a fingerprint representing normal operation, GMMOC is applied to subsequent periods of data on a daily basis (in this case in month 4, Figure 7). It is well known that gradual bearing wear leading to failure is often preceded by gradual changes in vibration characteristics. Figure 8 provides an ellipse boundary drawn according to the 1-cluster GMM model (see (13)). In general, the confidence level to identify outliers will be set according to application. For instance, in this case, a 99.99% confidence level does not discriminate the outliers (here, it indicates outliers in normal operation which may be caused by sensor malfunctioning); however, a 95% confidence level is clearly seen to be appropriate in this instance. The results from GMMOC are compared with the envelopes drawn from the original GMM, as shown in Figure 8, and the advantage of GMMOC over the original GMM is evident, since the outliers, days 9 and 10, are clearly identified even with a 99.99% confidence level of GMMOC in this case.
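One common way to draw a confidence ellipse around a single two-dimensional Gaussian cluster is to threshold the squared Mahalanobis distance at a chi-square quantile. The sketch below assumes that construction, which may differ from the one used to produce the figures; for two dimensions the quantile has the closed form $-2\ln(1-c)$ for confidence level $c$. The function names and the sample feature pairs in the usage note are hypothetical.

```python
import math


def fit_fingerprint(points):
    """Mean vector and 2x2 covariance (sxx, sxy, syy) of (mean, std) feature pairs."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    sxx = sum((p[0] - mx) ** 2 for p in points) / n
    syy = sum((p[1] - my) ** 2 for p in points) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / n
    return (mx, my), (sxx, sxy, syy)


def mahalanobis_sq(point, centre, cov):
    """Squared Mahalanobis distance of a 2-D point from the cluster centre."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - centre[0], point[1] - centre[1]
    return (syy * dx * dx - 2.0 * sxy * dx * dy + sxx * dy * dy) / det


def ellipse_threshold(confidence):
    """Chi-square quantile with 2 degrees of freedom: -2 ln(1 - confidence)."""
    return -2.0 * math.log(1.0 - confidence)


def inside_ellipse(point, centre, cov, confidence=0.95):
    """True if the point lies within the confidence ellipse of the fingerprint."""
    return mahalanobis_sq(point, centre, cov) <= ellipse_threshold(confidence)
```

For example, after `centre, cov = fit_fingerprint(baseline_pairs)`, a new day's `(mean, std)` pair is tested with `inside_ellipse(pair, centre, cov, confidence=0.95)`. Raising the confidence level enlarges the ellipse, which is why a very high level may fail to separate outliers when the cluster itself has been inflated by them.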
By considering the following month of measurement data, it is notable that the measurements have correctly been identified as outliers (day 9 and day 10 in this case), whereas measurements from days 1-4 are correctly considered to correspond to normal behaviour. Although it is not known at that stage if the increase in vibration in days 9 and 10 is related to a component fault, the measurements are considered as anomalies. Although CS1 is used as a proof of concept application of VBGMM and GMMOC for novelty detection, the bearing considered in the study did fail around 3 months after the anomaly was initially identified using the extended GMM.

Case Study 2.
CS2 uses measurements taken over a longer period of 12 months using a lower sampling rate; specifically 1 data sample taken every 9 minutes, as shown in Figure 9.
Again the procedure depicted in Table 1 is applied. Measurements from months 1-4 are deemed to describe normal operational characteristics. VBGMM is then applied to identify what is considered to be steady-state operational data for further analysis (Figure 10). Important characteristics are then determined; once again it is known from empirical studies that for this application case the mean and standard deviation are effective measures of underlying behaviour. Through the application of GMMOC and by clustering the extracted features into a single cluster, a fingerprint of normal operation is obtained, as shown in Figure 11, where the 95% confidence envelope is considered as an early warning boundary in this instance and the 99% confidence envelope as a fault detection boundary. Using measurements from month 5 as testing data, it is shown in Figure 11 that, between days 20 and 25 of the 5th month, operation falls outside the "normal fingerprint" ellipsoid and is therefore identified as an outlier, thereby providing an early warning of expected failure, which was evidenced during the field service.
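The two-boundary scheme (95% for early warning, 99% for fault detection) can be expressed as a simple mapping from a day's distance to the fingerprint onto an alarm level. The sketch assumes that each day has already been reduced to a squared Mahalanobis distance from the single-cluster fingerprint in the two-dimensional (mean, standard deviation) feature space, so that the boundaries are chi-square quantiles with 2 degrees of freedom; the function names and level labels are hypothetical.

```python
import math


def chi2_quantile_2dof(confidence):
    """Chi-square quantile for 2 degrees of freedom: -2 ln(1 - confidence)."""
    return -2.0 * math.log(1.0 - confidence)


def alarm_levels(d_squared_by_day, warn_conf=0.95, fault_conf=0.99):
    """Map each day's squared Mahalanobis distance to an alarm level."""
    warn = chi2_quantile_2dof(warn_conf)
    fault = chi2_quantile_2dof(fault_conf)
    levels = []
    for d2 in d_squared_by_day:
        if d2 > fault:
            levels.append("fault")          # outside the 99% envelope
        elif d2 > warn:
            levels.append("early warning")  # between the 95% and 99% envelopes
        else:
            levels.append("normal")         # inside the 95% envelope
    return levels
```

A run of consecutive days reported as "early warning" or "fault" would then correspond to an excursion outside the normal fingerprint of the kind described for month 5.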
Referring again to measurements shown in Figure 9, on day 140 of operation (day 20 of month 5) a transient in the vibration level is evident, and although the mean level ultimately returns to normal after month 5, the variance continued to indicate evidence of emerging failure during and after month 6 (Figure 12). Considering months 6-12 as a period of emerging fault, the number of fault patterns can be discriminated using VBGMM (2 clusters for faults in this case) and the fault pattern locations identified using GMMOC, as shown in Figure 12. It can be seen that Fault Pattern 2 overlaps with the original fingerprint from months 1-4, indicating in this case that ideally other feature extraction indices (e.g., those involved with loading conditions) could be used to further isolate this type of fault characteristic from that of normal operation. Between the 2 sets of fault patterns (Figure 12), some anomalies are apparent which are due to the increasing levels of vibration as the bearing deteriorates, such as the 20th to 25th days of the 5th month's operation.
In this instance the unit was shut down for maintenance in order to prevent a catastrophic failure. The bearing was known to be undamaged in month 1 as it was assessed in the previous service check (see Figure 13). Through subsequent decommissioning of the unit and investigation, vibration damage is evident on the inlet bearing, with excessive wear on the tilt pad shown in Figure 14. It can be clearly seen that there has been abnormal wear from the markings shown on both the shaft (Figure 15) and bearing pads (Figure 14). Through root-cause analysis, the damage was attributed to an incorrectly specified lubricant oil cooler causing high temperatures in the lubricant oil.

Conclusion
The paper has developed and demonstrated extensions of GMMs to provide a highly practical preprocessing and novelty/fault detection tool. The main contributions of the paper are (1) an automatic clustering method for VBGMM which identifies steady-state operational behaviour from transient operation, allowing the extraction of steady-state measurement segments for subsequent condition monitoring and (2) a GMMOC method that has been proposed and shown to provide a valuable tool for use as an early warning system of emerging failure through novelty detection. The presented techniques are currently being utilised in an industrial environment to monitor the operational status of a global fleet of IGTs. Although the experimental trials have focused on IGTs and bearing vibration measurements in this instance, the proposed methods are much more widely applicable to other industrial components and systems for pattern analysis, benchmarking, and novelty/fault detection.

Nomenclature
IGT: Industrial gas turbine
GMM: Gaussian Mixture Model
VBGMM: Variational Bayesian Gaussian Mixture Model
GMMOC: Gaussian Mixture Model with an outlier component
PCA: Principal component analysis
ANN: Artificial neural network
MC: Mixture component
EM: Expectation-maximization
M-step: Maximization step
E-step: Expectation step
KL: Kullback-Leibler
CS: Case study.

Table 1: Stages of the proposed methodology.
VBGMM.
A Variational Bayesian (VB) method can be used to determine the required number of MCs. Specifically, $N \times K$ binary latent variables $z_{nk} \in \{0, 1\}$ are used to indicate which MCs the data sample clusters into. When forming a GMM using classical methods, $K$ is selected a priori. However, when using a VB method, $K$ is resolved from the solutions of $\{z_{nk}\}$, termed $Z$.
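For orientation, the standard variational decomposition on which VB treatments of mixture models are usually built is given here. It is stated in textbook notation, which is an assumption and may differ from the notation of the paper: $\Psi$ collects all latent variables and parameters (including $Z$), and $q(\Psi)$ is the approximating distribution.

```latex
\ln p(X) = \mathcal{L}(q) + \mathrm{KL}(q \,\|\, p),
\qquad
\mathcal{L}(q) = \int q(\Psi)\,\ln\frac{p(X,\Psi)}{q(\Psi)}\,d\Psi,
\qquad
\mathrm{KL}(q \,\|\, p) = -\int q(\Psi)\,\ln\frac{p(\Psi \mid X)}{q(\Psi)}\,d\Psi .
```

Since $\ln p(X)$ does not depend on $q$ and the Kullback-Leibler term is non-negative, maximizing the lower bound $\mathcal{L}(q)$ is equivalent to minimizing $\mathrm{KL}(q \,\|\, p)$. In this treatment, components that the data do not support receive negligible mixing weight, which is how the effective number of MCs is resolved.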