
The paper illustrates the design and implementation of a Fault Detection and Isolation (FDI) system for a rotary machine, namely a multishaft centrifugal compressor. A model-free approach, Principal Component Analysis (PCA), is employed to solve the fault detection problem. For fault isolation, structured residuals are adopted, and an adaptive threshold is designed to detect and isolate the faults. To assess the proposed FDI system, historical data from a nitrogen centrifugal compressor operating in a refinery plant are considered. Test results show that detection and isolation of single as well as multiple faults are successfully achieved.

In oil and gas plants many processes such as refinery, natural gas extraction/compression, energy production, and gasification, involve the use of rotary machines like centrifugal pumps, turbines, and centrifugal compressors. Given the importance and the crucial role played by these machines, in the literature several approaches for their control and supervision have been proposed. For control purposes, the compressor regulation distant from the surge point has been discussed in [

In the present work a multivariable data-driven and model-free approach, that is, Principal Component Analysis (PCA) [

In the literature, other authors suggested using a PCA technique for industrial plant applications: in [

The contribution of the paper is the adoption of the adaptive thresholds approach, suitably implemented, to compute the gains associated with the inputs of the thresholds.

The paper is organized as follows. In Section

A brief overview of the Principal Components Analysis and its use in fault detection and isolation is given here. The main steps of the procedure useful for the understanding of the proposed FDI system are reported. Further detailed descriptions of the method are available in published articles (see among others [

The Principal Component Analysis can be considered a subspace decomposition technique by which the process measurement space is divided into two orthogonal subspaces, that is, the principal components (PC) subspace and residual subspace. The PC subspace contains the components that account for a maximal amount of total variance in the observed variables. In practice, given a data matrix

Consider the transformation of the data matrix

Matrix

Once the number of the

Equation (

When approaching a PCA problem, the number of the Principal Components retained in the model is an essential parameter that strongly determines its performance. When too few components are retained, the model will not capture all of the information in the data and a poor representation of the process will result. On the other hand, if too many components are chosen, the model will be overparameterized and it will include noise. Several approaches for selecting the optimal number of PCs have been developed. The approach followed in this paper is based on the ANalysis Of Variance, ANOVA test (see [

The stored dataset is processed offline by applying the ANOVA-based test so that the principal components are determined; the loading matrix

Then the signal of the original variables

When developing a diagnostic system, the interest is generally not only in detecting faults but also in specifying the kind of fault that has occurred, thus realizing the primary function of isolation. In this regard, various methods have been developed for the generation of structured residuals. The authors have chosen to apply the so-called

To achieve isolation the reconstructed variables

This property can be used to identify which components of

Finally, considering the large dimensions of the problem, instead of checking all the components of the computed residual vectors separately, it may be convenient to group them into a single index called the squared prediction error (SPE), calculated as the squared Euclidean norm of the computed residuals:

Fault isolation is based on the threshold violation of one or a few SPE variables.
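The detection and isolation indices described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: it assumes standardized data, a loading matrix obtained by SVD, and a reconstruction-based form of structured residuals in which variable `i` is re-estimated from the remaining ones. The function names (`fit_pca`, `spe`, `spe_reconstructed`) are hypothetical, and the number of retained components `n_pc` is taken as given rather than selected by the ANOVA test.

```python
import numpy as np

def fit_pca(X, n_pc):
    """Fit a PCA model on fault-free data X (rows: samples, columns: variables)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std
    # Right singular vectors of the scaled data are the PCA loading vectors.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    P = Vt[:n_pc].T                         # loadings, shape (m, n_pc)
    C_res = np.eye(X.shape[1]) - P @ P.T    # projector onto the residual subspace
    return mean, std, C_res

def spe(X, mean, std, C_res):
    """Squared prediction error: squared Euclidean norm of the residual, per sample."""
    Z = (X - mean) / std
    R = Z @ C_res                           # C_res is symmetric, so rows are residuals
    return np.sum(R ** 2, axis=1)

def spe_reconstructed(X, mean, std, C_res, i):
    """SPE after variable i is reconstructed from the others (assumed isolation scheme)."""
    Z = (X - mean) / std
    # Estimated deviation along direction i that minimizes the residual norm.
    f = (Z @ C_res[:, i]) / C_res[i, i]
    Z_rec = Z.copy()
    Z_rec[:, i] -= f
    R = Z_rec @ C_res
    return np.sum(R ** 2, axis=1)
```

Under these assumptions, a fault on a single variable raises the plain SPE, while the SPE computed after reconstructing the faulty variable returns to its fault-free level; the SPEs for the other reconstruction directions stay high.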

To perform fault detection and isolation, the SPE values must be compared with threshold values, which can be fixed or adaptive. Values for a static threshold are typically computed by analyzing the SPE signals under normal operating conditions. This solution works well if the process stays in a steady state or if the operating point does not change. However, when operating with real process data, it is quite likely that, owing to noisy data and/or the influence of external conditions, the generated residuals exceed the fixed threshold even without faults. A possible solution is to enlarge the threshold values. In setting the tolerances, a compromise has to be made between missed detections caused by thresholds that are too large and false alarms caused by normal fluctuations of the system.

To improve isolation, the authors have chosen to employ an adaptive threshold approach: the process variability is such that a fixed threshold would produce many false alarms.

In the literature, different approaches to the design of adaptive thresholds have been introduced. Isermann in [

Following Clark’s approach, the adaptive threshold scheme proposed by the authors consists of a term proportional to the amplitude of the input signals and a constant term for tight tuning (see Figure

The scheme employed by the authors to realize the adaptive threshold.
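A threshold of this kind, a constant term plus a term proportional to the input amplitudes, might be sketched as follows. The gains are taken as given here, since the way the authors compute them is not reproduced; the helper `tune_offset`, which picks the smallest constant term that avoids false alarms on fault-free data, is a hypothetical tuning rule added only for illustration.

```python
import numpy as np

def adaptive_threshold(U, gains, offset):
    """Threshold per sample: offset + sum_j gains[j] * |U[k, j]|.

    U has one row per sample and one column per input signal
    (a 1-D array is treated as a single sample).
    """
    U = np.atleast_2d(np.asarray(U, dtype=float))
    gains = np.asarray(gains, dtype=float)
    return offset + np.abs(U) @ gains

def detect(spe, threshold):
    """Boolean alarm signal: True where the SPE exceeds its threshold."""
    return np.asarray(spe) > np.asarray(threshold)

def tune_offset(spe_normal, U_normal, gains, margin=0.0):
    """Hypothetical rule: smallest offset giving no alarms on fault-free data, plus a margin."""
    proportional = adaptive_threshold(U_normal, gains, 0.0)
    return float(np.max(np.asarray(spe_normal) - proportional)) + margin
```

With this structure the threshold rises when the inputs are large, so that operating-point changes do not trigger alarms, while the constant term sets the sensitivity near steady state.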

For the actual computation of the gains no clear suggestions are given in [

In the present paper, a multishaft centrifugal compressor is considered. Centrifugal machines are critical equipment; their essential characteristics are the large pressure rises and flow rates involved. Given the importance of and the crucial role played by compressors, increasing attention has been given in recent years to the prevention of malfunctions and potential faults that may cause downtime of the compressor or even its complete breakdown.

The machine, called BLNC (base load nitrogen compressor), is located in the air separation unit (ASU) of a refinery plant and is employed for nitrogen compression in the dilution of a particular gas, the Syngas, which is forwarded to a gas turbine. It is a complex machine consisting of two sections: the first section includes two compression stages, while the second comprises three compression stages.

To bring the nitrogen temperature at the exit of each compression stage back to its value at the stage inlet, a heat exchanger is positioned at the end of each stage. In these conditions, the compression is nearly equivalent to an isothermal process, which requires less mechanical work.

Variables considered in the construction of the data matrix are the nitrogen (N_{2}) flow through the multishaft compressor measured at different points, the commanded actuator signals, and their actual values used to regulate the position of the IGVs. All the variables included in the PCA data matrix

Considered process variables.

Tag name | Description |
---|---|
FYI_89735 | N_{2} mass flow through the BLNC first section |
FV_89700 | Positioner of the Inlet Guide Vanes (IGV) relative to the first and second stages of BLNC |
ZT_89700 | Feedback of IGV position relative to the BLNC first and second stages |
PV_89750 | Measurement of the vent position at the entrance of first section of BLNC |
FYA_89710A | N_{2} mass flow through the second section of BLNC |
FV_89704 | Positioner of the IGV relative to the third stage of BLNC |
ZT_89704 | Feedback of IGV position relative to the third stage of BLNC |
FV_89751A | Throttle valve position relative to inlet high pressure nitrogen gas (HNG) |
FYA_89751 | N_{2} mass flow at the head of the high pressure column |

The first step of the PCA method concerns the selection of the principal components retained in the model. In this regard, for the computation of the matrix

Accordingly, with the PCA model, the measurement space has been partitioned into two orthogonal spaces: the principal component subspace, which includes data variations according to the principal component model, and the residual subspace, which includes data variation not explained by the model. Applying the ANOVA procedure developed by the authors (see [_{2} mass flow at the head of the high pressure column) where slight mismatching can be noticed, all the reconstructions are in good agreement with measurements.

Figures (a)–(i) show the trends of the real variables (blue) and their reconstruction with PCA (red).

After having trained the model on system data in the absence of faults, that is, after having chosen the PCs that make up the

Faults that may possibly occur in the centrifugal compressor concern errors in the sensor readings and/or in the actuators. By inspection of historical data of the compressor at issue, the most common faults were found to be faults of the actuators and mass flow sensors as specified in Table

Most common compressor faults.

Tag name | Description | Time dependency |
---|---|---|
FYI_89735 | Fault in the first stage N_{2} mass flow sensor | Abrupt fault; |
ZT_89700 | Fault in the first stage IGV positioner | Abrupt fault; |
FYA_89710 | Fault in the third stage N_{2} mass flow sensor | Abrupt fault; |
ZT_89704 | Fault in the third stage IGV positioner | Abrupt fault; |
FYA_89751 | Fault in the N_{2} mass flow sensor relative to N_{2} flow at the head of the high pressure column | Abrupt fault; |

The diagnoser has been tested on the detection and isolation of faults of the Inlet Guide Vanes (IGV) of the third stage of the compressor. An abrupt failure of the actuator, which caused its complete breakdown, is documented in the available historical data; this failure was correctly detected by the Fault Diagnosis module. Since detection in this case is quite trivial, we have chosen to test the performance of the diagnoser on an intermittent fault, obtained by modifying the real data through the addition of step or ramp variations. Faults of this kind are likely to be caused by a temporary malfunction of the leverage used for IGV handling.

A drift on the IGV positioner has been simulated with the addition of a ramp signal up to 10% of the variable amplitude starting from the 50th sample as shown in Figure

Third section IGV position (blue) and ramp additive fault (red).

SPE relative to the N_{2} mass flow sensor and its threshold.

Differences between the SPE relative to N_{2} mass flow sensor and its threshold.

Zoom on threshold overcoming.

This situation allows the correct isolation of the fault. Figure

Difference between the SPE and its relative threshold: only the SPE relative to the 7th variable (f) stays under the threshold.

These results show that the fault is detected at the 12th sample after its onset; given a sampling time of five minutes, the detection time is on the order of 60 minutes. The promptness of the diagnoser is linked to the large sampling time adopted; given that the actual process dynamics are rather slow, the result can be considered fully satisfactory. Because the aim was to investigate the performance of the proposed system on an operational dataset covering a long period (about two years), the five-minute sampling time was imposed by the available dataset. The proposed FDI system, when implemented online, can process data at a faster sampling rate (typically one minute). No significant limitations are imposed by the computational load of the proposed approach; the lower limit is determined at the I/O acquisition level: the DCS employed for controlling the machinery under study does not handle sampling periods shorter than 0.1 seconds; moreover, for many of the considered variables in Table

To check the validity of the proposed system in the detection of multiple faults, faults on the two IGV positioners of the first and third sections, respectively, have been simulated.

The simulated faults are supposed to be simultaneous, and they have been constructed with the following characteristics:

Positioner ZT_89700: a trend of the signal is simulated at the time instants where the IGV position is opened around 60% of its total value;

Positioner ZT_89704: a bias of 10% of its average value is simulated.
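Additive faults of the kind used in these tests (a ramp reaching 10% of the variable amplitude, or a bias equal to 10% of the average value) could be injected into a recorded signal as sketched below. The function names are hypothetical, "amplitude" is assumed here to mean the peak-to-peak range of the signal, and `start` is a zero-based sample index.

```python
import numpy as np

def add_ramp_fault(x, start, fraction=0.10):
    """Add a ramp that begins at index `start` and reaches
    fraction * (peak-to-peak range of x) at the last sample (assumed definition)."""
    x = np.asarray(x, dtype=float)
    amplitude = x.max() - x.min()
    ramp = np.zeros_like(x)
    ramp[start:] = np.linspace(0.0, fraction * amplitude, len(x) - start)
    return x + ramp

def add_bias_fault(x, start, fraction=0.10):
    """Add a constant bias equal to fraction * mean(x) from index `start` onward."""
    x = np.asarray(x, dtype=float)
    bias = np.zeros_like(x)
    bias[start:] = fraction * x.mean()
    return x + bias
```

For instance, `add_ramp_fault(signal, start=49)` would start the drift at the 50th sample of `signal`, and `add_bias_fault(signal, start=0)` would apply the bias to the whole record.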

In Figure

PCA detection results when considering single directions: all the SPEs exceed their thresholds.

The system correctly isolates the faults that occurred, indicating the presence of a multiple fault on the 3rd and 7th variables. Computed residuals concerning some of the variable pairs are reported in Figures

Some PCA detection results when considering direction pairs: the SPEs exceed their thresholds.

SPE calculated from the faulty variables pair (variable 3 and variable 7) remains under threshold.
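The search over direction pairs can be illustrated with a joint reconstruction along several variable directions at once. The sketch below is a minimal version under stated assumptions: the data `Z` are already standardized, `C_res` is the projector onto the residual subspace of the PCA model, and candidate pairs are simply ranked by their mean SPE rather than compared with adaptive thresholds as in the paper. The function names are hypothetical.

```python
import numpy as np
from itertools import combinations

def spe_after_reconstruction(Z, C_res, idx):
    """SPE per sample after jointly reconstructing the variables in `idx`."""
    idx = list(idx)
    R = Z @ C_res                    # residuals, one row per sample
    D = C_res[:, idx]                # fault directions projected on the residual subspace
    # Least-squares estimate of the deviations along the chosen directions.
    F, *_ = np.linalg.lstsq(D, R.T, rcond=None)
    R_rec = R - (D @ F).T
    return np.sum(R_rec ** 2, axis=1)

def rank_fault_sets(Z, C_res, size=2):
    """Rank all variable subsets of the given size by mean SPE after reconstruction."""
    m = Z.shape[1]
    scores = {
        idx: float(spe_after_reconstruction(Z, C_res, idx).mean())
        for idx in combinations(range(m), size)
    }
    return sorted(scores.items(), key=lambda kv: kv[1])
```

The first entry returned by `rank_fault_sets` is the pair whose reconstruction removes most of the residual; with simultaneous faults on two variables, that pair is expected to be the faulty one, which mirrors the behavior reported for variables 3 and 7.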

A Fault Diagnosis system for the detection and isolation of expected faults of a rotary machine, based on the Principal Component Analysis technique, has been developed. The considered machine is a multishaft centrifugal compressor located in an integrated gasification and combined cycle of a refinery plant. The adoption of a model-free technique is justified by the fact that, in the process industry, rich process data are available, whereas the development of a physical model is a demanding task that may not assure suitable results.

For the practical implementation of PCA, the number of principal components to be retained in the model has been chosen with an approach based on the ANalysis Of Variance (ANOVA) test. As far as detection and isolation are concerned, a structured residual approach has been applied, and an adaptive threshold, well suited to dealing with the process variability, has been adopted for the analysis of the residuals.

The present method has been successfully applied in the development of a Fault Detection and Isolation (FDI) system for a multishaft centrifugal compressor located in a refinery plant; malfunctions of the compressor sensors and actuators have been examined, and the isolation of single and multiple faults as well as process faults has been successful.