Subspace Method Aided Data-Driven Fault Detection Based on Principal Component Analysis

The model-based fault detection technique, which requires an identified system model, has been well established. The objective of this paper is to develop an alternative procedure that avoids identifying the complete system model. In this paper, subspace method aided data-driven fault detection based on principal component analysis (PCA) is proposed. The basic idea is to use PCA to identify the system observability matrices from input and output data and to construct residual generators. The advantage of the proposed method is that only the parameterized matrices related to the residuals need to be identified, rather than the system model itself, which reduces the number of computational steps. The proposed approach is illustrated by a simulation study on the Tennessee Eastman process.


Introduction
The safety, stability, and efficiency of dynamic systems have always been matters of great concern in the field of complex industrial processes. Fault detection plays an extremely important role in improving process safety and attracts more and more attention in the field of process monitoring.
During the past three decades, model-based fault detection techniques for linear time invariant (LTI) systems have been well established [1][2][3][4]. In complex systems and large-scale industries, however, it is difficult to obtain accurate mathematical models. With the development of the computer and information industries, Big Data has attracted more and more attention. Industrial processes generate large amounts of operating data which contain abundant information about the system. Subspace model identification (SMI), which uses process data to identify system models, came into being against this background. There are several subspace algorithms, such as MOESP [5], N4SID [6], and CVA [7]. In a typical SMI, the identification procedure comprises two steps. First, the extended observability matrix Γ_f and a block triangular Toeplitz matrix H_f are identified from input and output data. Then, the system matrices A, B, C, and D are calculated from the identified observability matrix and the Toeplitz matrix. Fault detection using this route usually comprises two steps: (1) system identification and (2) model-based fault detection.
Residual generation and residual evaluation are essential for model-based fault detection. Fault detection can be achieved once residuals are obtained. The parity space approach (PSA) [8] is the simplest and most widely used among model-based fault detection methods. Comparing PSA and SMI, we notice that there is a close bond between the two methods: the residual can be calculated from parameterized matrices related to Γ_f and H_f. It is therefore feasible to use only the first step of SMI to identify these parameterized matrices instead of the system model, which reduces the calculation steps. To the best of our knowledge, Ding and his coworkers first associated SMI with fault detection and proposed a subspace aided approach [9][10][11]. However, most of these SMI methods are biased under the errors-in-variables (EIV) situation. Wang and Qin [12] proposed a subspace identification approach based on PCA that gives consistent model estimates under the EIV situation. Inspired by their work, we propose subspace method aided data-driven fault detection based on PCA in this paper. Figure 1 shows the difference between the proposed approach and the classic model-based approach.

In this paper, the main contribution of the proposed approach lies in the direct identification of the parameterized matrices by PCA. As long as these matrices related to the residuals are identified, we can construct residual generators. The proposed approach is illustrated by a simulation study on the Tennessee Eastman process.
The rest of the paper is organized as follows. Section 2 gives the preliminaries and problem formulation. The identification of the parameterized matrices is presented in Section 3. Section 4 illustrates a simulation study on the Tennessee Eastman process. Finally, the conclusions are presented in Section 5.

Preliminaries and Problem Formulation
2.1. Process Descriptions. The state space representation of a discrete time LTI system is given by

x(k + 1) = A x(k) + B u*(k) + w(k),
y*(k) = C x(k) + D u*(k),     (1)

where x(k) ∈ R^n, u*(k) ∈ R^l, y*(k) ∈ R^m, and w(k) ∈ R^n denote the vectors of the state variables, noise-free inputs, noise-free outputs, and process noise, respectively. The available input measurements u(k) and output measurements y(k) are described by [13]

u(k) = u*(k) + τ(k),
y(k) = y*(k) + v(k),     (2)

where τ(k) ∈ R^l and v(k) ∈ R^m represent the input and output noise. The following assumptions are introduced:

(i) The system is observable and controllable.
(ii) τ(k), v(k), and w(k) are assumed to be zero-mean, normally distributed white noise sequences.

(iii) All the noise is assumed to be independent of the past noise-free input u*(k) and the initial state x(1).

(iv) The system matrices A, B, C, and D and the system order n are unknown.
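As a concrete illustration of such a process, the following sketch simulates a small LTI system under the above assumptions; the system matrices, noise levels, and sample count are arbitrary choices for illustration only (the paper treats them as unknown):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical 2nd-order system; A, B, C, D and noise levels are
# illustrative choices only, not values from the paper.
A = np.array([[0.8, 0.1],
              [0.0, 0.5]])
B = np.array([[1.0],
              [0.5]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

T = 2000
u_star = rng.standard_normal((T, 1))          # noise-free input u*(k)
w = 0.05 * rng.standard_normal((T, 2))        # process noise w(k)

x = np.zeros(2)
y_star = np.zeros((T, 1))
for k in range(T):
    y_star[k] = C @ x + D @ u_star[k]         # y*(k) = C x(k) + D u*(k)
    x = A @ x + B @ u_star[k] + w[k]          # x(k+1) = A x(k) + B u*(k) + w(k)

# Errors-in-variables measurements: both channels carry noise.
u = u_star + 0.05 * rng.standard_normal((T, 1))   # u(k) = u*(k) + tau(k)
y = y_star + 0.05 * rng.standard_normal((T, 1))   # y(k) = y*(k) + v(k)
```

Only the measured pair (u, y) is available to the identification step; u*, w, and the system matrices remain hidden.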
We define the following stacked vectors and block Hankel matrices:

y_f(k) = [y(k); y(k + 1); … ; y(k + f − 1)] ∈ R^{mf},
Y_f = [y_f(k), y_f(k + 1), … , y_f(k + N − 1)] ∈ R^{mf×N},

and y_p(k), u_p(k), u_f(k), Y_p, U_p, and U_f analogously. The subscripts p and f represent the past and the future horizons, where p ⩾ f ⩾ n.
By iterating (1) and (2), we can get

y_f(k) = Γ_f x(k) + H_f u_f(k) + G_f w_f(k) + v_f(k) − H_f τ_f(k),     (6)

where Γ_f = [C; CA; … ; CA^{f−1}] ∈ R^{mf×n} is the extended observability matrix, H_f ∈ R^{mf×lf} is the block triangular Toeplitz matrix of the Markov parameters D, CB, CAB, …, and G_f is the corresponding Toeplitz matrix with respect to the process noise. The vectors v_f(k) ∈ R^{mf}, τ_f(k) ∈ R^{lf}, u_f(k) ∈ R^{lf}, and w_f(k) ∈ R^{nf} have structures similar to y_f(k). If we use Hankel matrices instead of the vectors to describe the system, (6) can be written as

Y_f = Γ_f X_k + H_f U_f + G_f W_f + V_f − H_f T_f,     (7)

where X_k = [x(k), … , x(k + N − 1)] and the matrices V_f, T_f, and W_f are defined similarly to Y_f.
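The block Hankel matrices above can be assembled with a short helper; the function name `hankel_blocks` and the random toy data below are illustrative only, with the horizons f = p = 10 that are also used later in the TE study:

```python
import numpy as np

def hankel_blocks(z, s, N):
    """Stack s block rows of z: row block i holds z(k + i), k = 0..N-1.

    z : (T, d) data matrix -> (s*d, N) block Hankel matrix."""
    return np.vstack([z[i:i + N].T for i in range(s)])

# Toy single-input single-output data (random, for shapes only).
rng = np.random.default_rng(1)
T = 500
y = rng.standard_normal((T, 1))
u = rng.standard_normal((T, 1))
f = p = 10
N = T - f - p + 1                      # number of Hankel columns
U_p = hankel_blocks(u, p, N)           # past inputs
Y_p = hankel_blocks(y, p, N)           # past outputs
U_f = hankel_blocks(u[p:], f, N)       # future inputs
Y_f = hankel_blocks(y[p:], f, N)       # future outputs
print(Y_f.shape)                       # (10, 481)
```

Each column of Y_f is one stacked vector y_f(k); sliding k by one sample gives the next column.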

2.2. Residual Generation and Evaluation.
Residual generation and residual evaluation are essential to the design of a fault detection system. The main idea of this paper is to generate residuals directly from input and output data. How to generate and evaluate the residuals is introduced in the following.
Moving H_f U_f to the left of the equation, (7) can be rewritten as

Y_f − H_f U_f = Γ_f X_k + G_f W_f + V_f − H_f T_f.     (8)

Taking Γ_f^⊥ ∈ R^{mf×(mf−n)} as the orthogonal complement of Γ_f, we can obtain

(Γ_f^⊥)^T Γ_f = 0.     (9)

Since x(k) is unknown, we multiply (Γ_f^⊥)^T on both sides of the equation to eliminate the effect of the unknown state [14]. Then, we can get

(Γ_f^⊥)^T (Y_f − H_f U_f) = (Γ_f^⊥)^T (G_f W_f + V_f − H_f T_f).     (10)

So, the residual signal is

r(k) = (Γ_f^⊥)^T y_f(k) − (Γ_f^⊥)^T H_f u_f(k).     (11)

We notice that the residual is closely related to the extended observability matrix Γ_f and the block triangular Toeplitz matrix H_f. Therefore, if we want to obtain the residual, the key step is to identify (Γ_f^⊥)^T and −(Γ_f^⊥)^T H_f. How to identify these parameterized matrices is introduced in detail in Section 3.
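When Γ_f and H_f are known, the residual (11) can be computed as below. This is only a sketch of the parity relation: in the proposed approach Γ_f and H_f are never formed explicitly, and the first-order system in the usage example is hypothetical.

```python
import numpy as np

def residual_generator(Gamma_f, H_f, y_f, u_f):
    """r(k) = (Gamma_f^perp)^T (y_f(k) - H_f u_f(k)).

    Gamma_f^perp spans the left null space of Gamma_f, so the term
    Gamma_f x(k) with the unknown state x(k) is eliminated."""
    mf, n = Gamma_f.shape
    U, _, _ = np.linalg.svd(Gamma_f)      # left singular vectors
    Gamma_perp = U[:, n:]                 # (mf, mf - n): orthogonal complement
    return Gamma_perp.T @ (y_f - H_f @ u_f)

# Noise-free check with a hypothetical first-order system
# (A = 0.5, B = 1, C = 1, D = 0, f = 3):
Gamma_f = np.array([[1.0], [0.5], [0.25]])                 # [C; CA; CA^2]
H_f = np.array([[0.0, 0.0, 0.0],
                [1.0, 0.0, 0.0],
                [0.5, 1.0, 0.0]])                          # Toeplitz of D, CB, CAB
y_f = np.array([[1.0], [1.5], [2.75]])    # response to x(0) = 1, u = (1, 2, 0)
u_f = np.array([[1.0], [2.0], [0.0]])
r = residual_generator(Gamma_f, H_f, y_f, u_f)
print(np.allclose(r, 0.0))                # True: fault-free, noise-free
```

A fault or noise entering y_f drives r away from zero, which is what the evaluation stage then tests.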
Once these parameterized matrices are identified, we can obtain the residual signal. Then, fault detection can be completed by residual evaluation. Let r(k) be the residual vector at the instant k, and define the testing statistic [15]

J = r(k)^T Σ_res^{−1} r(k),     (12)

where Σ_res is the covariance matrix of the residual in the fault-free case. Note that when there is no fault, J ∼ χ²(h), where χ²(h) is the chi-squared distribution with h = mf − n degrees of freedom. We set J_th = χ²_α(h) as the threshold and define the detection logic as

J ⩽ J_th ⟹ fault-free,
J > J_th ⟹ faulty.     (13)

Besides the chi-squared distribution, kernel density estimation is also utilized to determine the threshold [16]. It is assumed that the measurements follow a Gaussian distribution. The threshold J_th can be calculated from P(J < J_th) = α, where α is the confidence level, using the estimated density

p(J) = (1/(Nh)) Σ_{i=1}^{N} K((J − J_i)/h),     (14)
where J_i is the ith sample of J, N is the number of samples, h is a smoothing parameter, and K(⋅) is a kernel function.
Given a proper confidence level, we can obtain the threshold J_th. In this paper, we set α = 0.95. After the statistic and threshold are deduced, fault detection can be realized according to the above detection logic.
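The KDE-based threshold can be sketched as follows with a Gaussian kernel; the bandwidth rule (Silverman's rule of thumb) and the grid-based inversion of the estimated CDF are assumptions of this sketch:

```python
import numpy as np

def kde_threshold(J_train, alpha=0.95, grid=1000):
    """Threshold J_th with P(J < J_th) = alpha under a Gaussian-kernel
    density estimate of fault-free training statistics J_train."""
    J_train = np.asarray(J_train, dtype=float)
    N = J_train.size
    # Bandwidth: Silverman's rule of thumb (an assumption of this sketch).
    h = 1.06 * J_train.std() * N ** (-0.2)
    x = np.linspace(J_train.min() - 3 * h, J_train.max() + 3 * h, grid)
    # p(x) = 1/(N h) * sum_i K((x - J_i) / h) with a Gaussian kernel K
    K = np.exp(-0.5 * ((x[:, None] - J_train[None, :]) / h) ** 2)
    p = K.sum(axis=1) / (N * h * np.sqrt(2.0 * np.pi))
    cdf = np.cumsum(p) * (x[1] - x[0])        # numeric CDF on the grid
    return x[np.searchsorted(cdf, alpha)]

# Fault-free statistics drawn from chi2(10) for illustration:
rng = np.random.default_rng(2)
J_ff = rng.chisquare(10, size=3000)
print(kde_threshold(J_ff))   # close to the chi2 quantile chi2_0.95(10)
```

Unlike the chi-squared threshold, this estimate adapts to the empirical distribution of J and needs no distributional assumption beyond the kernel.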

The Identification of Parameterized Matrices
Based on the knowledge introduced in the previous sections, we know that the identification of the matrices related to the residual is the key step of this approach. In this section, we discuss how to identify these matrices. Wang and Qin [12] proposed a subspace identification approach based on PCA, which we briefly review. When the process is corrupted by process noise and measurement noise simultaneously, the past data z_p(k) = [u_p(k); y_p(k)] are used as instrumental variables to remove the noise, because the past data Z_p are uncorrelated with the future noise. According to these assumptions, we can get

lim_{N→∞} (1/N) W_f Z_p^T = 0,  lim_{N→∞} (1/N) V_f Z_p^T = 0,  lim_{N→∞} (1/N) T_f Z_p^T = 0.     (16)

If we utilize z_f(k) = [u_f(k); y_f(k)] to represent the residual instead of y_f(k) and u_f(k), (11) can be expressed as

r(k) = [−(Γ_f^⊥)^T H_f  (Γ_f^⊥)^T] z_f(k).     (17)

Multiplying (1/N) Z_p^T on both sides of the Hankel matrix form of the equation, we can get

(1/N) [−(Γ_f^⊥)^T H_f  (Γ_f^⊥)^T] Z_f Z_p^T = (1/N) (Γ_f^⊥)^T (G_f W_f + V_f − H_f T_f) Z_p^T.     (18)

From (16), we know

lim_{N→∞} (1/N) [−(Γ_f^⊥)^T H_f  (Γ_f^⊥)^T] Z_f Z_p^T = 0.     (19)

Equation (19) indicates that (1/N) Z_f Z_p^T has zero scores along the directions spanned by [−(Γ_f^⊥)^T H_f  (Γ_f^⊥)^T]^T. From the rank condition

rank( lim_{N→∞} (1/N) Z_f Z_p^T ) = lf + n,     (20)

we know that when N → ∞, (1/N) Z_f Z_p^T has mf − n zero scores. Performing PCA on the process data,

(1/N) Z_f Z_p^T = P̂ T̂^T + P̃ T̃^T ≈ P̂ T̂^T,     (21)

the residual loading matrix P̃ = [P̃_u; P̃_y] satisfies

−(Γ_f^⊥)^T H_f = M P̃_u^T,  (Γ_f^⊥)^T = M P̃_y^T,     (22)

where M ∈ R^{(mf−n)×(mf−n)} is a nonsingular matrix, P̃_u is the first lf rows of P̃, and P̃_y is the last mf rows of P̃.
The nonsingular matrix M is taken as the identity matrix in this paper. The matrices (Γ_f^⊥)^T and −(Γ_f^⊥)^T H_f can be obtained according to (22), and then the residual can be generated. The algorithm for realizing fault detection from data can be summarized as the steps shown in Algorithm 1.
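The identification step can be sketched end to end as follows (Algorithm 1 itself is not reproduced here). The helper uses an SVD of (1/N) Z_f Z_p^T to extract the residual loadings; note that PCA determines the parity space only up to a nonsingular M, so the recovered matrices equal M(Γ_f^⊥)^T and −M(Γ_f^⊥)^T H_f for some M, which does not affect the residual space. The test system in the usage example is hypothetical:

```python
import numpy as np

def identify_residual_matrices(u, y, f, p, n):
    """Identify W_y ~ M (Gamma_f^perp)^T and W_u ~ -M (Gamma_f^perp)^T H_f
    by PCA on (1/N) Z_f Z_p^T, with the past data Z_p used as
    instrumental variables to remove the noise."""
    l, m = u.shape[1], y.shape[1]
    N = u.shape[0] - f - p + 1

    def blocks(z, s, off):                # block Hankel matrix builder
        return np.vstack([z[off + i:off + i + N].T for i in range(s)])

    Z_p = np.vstack([blocks(u, p, 0), blocks(y, p, 0)])   # past data
    Z_f = np.vstack([blocks(u, f, p), blocks(y, f, p)])   # z_f = [u_f; y_f]
    G = (Z_f @ Z_p.T) / N
    # Residual loadings: the last m*f - n left singular vectors of G.
    U_svd, _, _ = np.linalg.svd(G)
    P_res = U_svd[:, -(m * f - n):]
    W_u = P_res[:l * f].T                 # ~ -(Gamma_f^perp)^T H_f
    W_y = P_res[l * f:].T                 # ~  (Gamma_f^perp)^T
    return W_u, W_y

# Sanity check: noise-free data from a hypothetical first-order system
# (A = 0.5, B = 1, C = 1, D = 0) driven by a colored, persistently
# exciting input; the identified residual must then vanish.
rng = np.random.default_rng(0)
T = 400
u = np.zeros((T, 1))
for k in range(1, T):
    u[k, 0] = 0.8 * u[k - 1, 0] + rng.standard_normal()
y = np.zeros((T, 1))
x = 0.0
for k in range(T):
    y[k, 0] = x                           # C = 1, D = 0
    x = 0.5 * x + u[k, 0]                 # A = 0.5, B = 1
W_u, W_y = identify_residual_matrices(u, y, f=3, p=3, n=1)
N = T - 3 - 3 + 1
u_f = np.vstack([u[3 + i:3 + i + N].T for i in range(3)])
y_f = np.vstack([y[3 + i:3 + i + N].T for i in range(3)])
print(np.max(np.abs(W_u @ u_f + W_y @ y_f)) < 1e-6)   # True
```

A colored input is used in the check because a white future input is uncorrelated with the past data, in which case the rank condition (20) can fail.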

Simulation Study on the Tennessee Eastman Process
In this section, we apply the proposed approach to the Tennessee Eastman (TE) process. Three indices are used to demonstrate the efficiency of the proposed approach: the fault detection rate (FDR), the false alarm rate (FAR) [17], and the fault detection time (FDT). The fault detection time is the first time instant at which the testing statistic J exceeds the threshold J_th.

4.1. TE Process. The TE process model is a realistic simulation program of a chemical process that is widely accepted as a benchmark for control and monitoring studies [18]. The TE model that we used was downloaded from http://depts.washington.edu/control/LARRY/TE. The flow diagram of the process is shown in Figure 2. The process contains five major units: reactor, condenser, compressor, separator, and stripper. There are 53 measurements, of which 12 are manipulated variables and 41 are process variables. The 12 manipulated variables are listed in Table 1. In this paper, the 9 manipulated variables other than XMV (5), XMV (9), and XMV (12) are taken as inputs. The dimension of the TE dataset is very large. To overcome the curse of dimensionality, the 41 process variables are divided into eight blocks [19]. We choose the input feed block as outputs, which is shown in Table 2. Table 3 shows the 20 process faults of the TE process.
In our simulation, the data are acquired in Mode 1. The operating time is set to 36 h, so the total number of samples is N = 3601. The faults are introduced at 8 h, which means that each fault occurs after 800 samples. The horizons p and f are both set to 10.

4.2. Simulation Results.
The process is simulated with 20 different faults. For the sake of simplicity, we show only some typical faults. The first fault is a step change in the A/C feed ratio (IDV (1)). Figure 3 shows the testing statistic J based on the residual vectors for the input feed block. The blue line is the testing statistic J, and the threshold J_th is shown by a red line. The second fault is a random variation in the A, B, and C feed composition (IDV (8)). The simulated result is shown in Figure 4. Figure 5 shows the testing statistic for the third fault, a sticking reactor cooling water valve (IDV (14)). The last fault (IDV (17)), shown in Figure 6, is an unknown fault; its type and the affected process variables are not known. Based on the fault detection method, if the testing statistic exceeds the threshold, the fault is detected. We can see from Figures 3-6 that all the testing statistics J exceed the threshold J_th after 800 samples. All four faults have been detected.
In addition to the simulation results, the FDR, FAR, and FDT shown in Table 4 are also obtained to demonstrate the efficiency. Besides the comparison between SMI-PCA and SMI-PCA KDE, we also compare them with classic PCA in order to better reflect the advantages of the proposed method. According to the study of Mahadevan and Shah [20], the FDRs of the T² and Q statistics for PCA are listed in the table. The maximum fault detection rate is highlighted in boldface. We can see that for most faults the detection rate of our proposed method is higher than that of PCA. The fault detection rate obtained by using kernel density estimation to calculate the threshold is slightly higher than that obtained with the chi-squared distribution. Most of the faults are detected well, with high FDRs and low FARs. But for some faults, namely, IDV (3), (5), (9), and (15)-(16), the fault detection performance is very poor. This may be caused by very small changes in the variables, which is a common problem in TE process fault detection. It is meaningless to calculate the fault detection time if a fault cannot be detected, so such cases are indicated with "*" in the table. The fault detection times of SMI-PCA and SMI-PCA KDE are the same because of their similar detection rates.
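The three indices can be computed from the sequence of testing statistics as follows; the definitions below (alarm fractions over the faulty and fault-free segments, first alarm after the fault start) follow the usual conventions, and the example numbers are arbitrary, not TE results:

```python
import numpy as np

def fdr_far_fdt(J, J_th, fault_start):
    """FDR: fraction of faulty samples (k >= fault_start) with J > J_th.
    FAR: fraction of fault-free samples (k < fault_start) with J > J_th.
    FDT: first faulty sample index with J > J_th, or None if undetected."""
    alarms = np.asarray(J) > J_th
    fdr = alarms[fault_start:].mean()
    far = alarms[:fault_start].mean()
    hits = np.flatnonzero(alarms[fault_start:])
    fdt = int(fault_start + hits[0]) if hits.size else None
    return fdr, far, fdt

# Tiny illustration: fault introduced at sample 3.
J = np.array([1.0, 1.0, 5.0, 1.0, 6.0, 7.0, 8.0])
fdr, far, fdt = fdr_far_fdt(J, J_th=4.0, fault_start=3)
print(fdr, far, fdt)   # 0.75 0.3333333333333333 4
```

In the TE study, fault_start would be 800 and J would hold one statistic per sample of a run.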

Conclusions
In this paper, subspace method aided data-driven fault detection based on PCA has been presented. The method identifies the parameterized matrices using PCA and constructs residual generators from the input and output data. A simulation study on the TE process demonstrates the effectiveness of this method. It indicates that the proposed method outperforms PCA in terms of fault detection rate and is suitable for linear systems that are observable and controllable. Moreover, the problem of threshold selection is also discussed in this paper. The fault detection rate obtained using kernel density estimation is slightly superior to that obtained with the chi-squared distribution.

Table 3: Process faults of the TE process (IDV (9)-(20)).
IDV (9): D feed temperature (random variation)
IDV (10): C feed temperature (random variation)
IDV (11): Reactor cooling water inlet temperature (random variation)
IDV (12): Condenser cooling water inlet temperature (random variation)
IDV (13): Reaction kinetics (random variation)
IDV (14): Reactor cooling water valve (sticking)
IDV (15): Condenser cooling water valve (sticking)
IDV (16)-(20): Unknown

Figure 1 :
Figure 1: Model-based approach and the proposed approach.

Table 2 :
Input feed block process measurements.