Fault Diagnosis of Complex Industrial Process Using KICA and Sparse SVM

New approaches are proposed for complex industrial process monitoring and fault diagnosis based on kernel independent component analysis (KICA) and sparse support vector machine (SVM). The KICA method is a two-phase algorithm: whitening by kernel principal component analysis (KPCA) followed by ICA. The data are first mapped into a high-dimensional feature space. Then, the ICA algorithm seeks the projection directions in the KPCA-whitened space. Performance monitoring is implemented by constructing statistical indices and control limits in the feature space. If a statistical index exceeds its predefined control limit, a fault may have occurred. The nonlinear score vectors are then calculated and fed into the sparse SVM to identify the fault. The proposed method is applied to a simulation of the Tennessee Eastman (TE) chemical process. The simulation results show that the proposed method can identify various types of faults accurately and rapidly.


Introduction
The demand for safe operation in industry has propelled research into fault detection and diagnosis methods. Statistical methods based on principal component analysis (PCA), independent component analysis (ICA), partial least squares (PLS), and canonical variate analysis (CVA) can represent high-dimensional process data in a reduced dimension that retains most of the variance of the original data. These methods have been used in various applications [1][2][3][4].
ICA is a statistical approach capable of blind source separation (BSS) without prior information about the mixing, under the assumption that the source signals are statistically independent [5]. However, ICA fails to separate nonlinearly mixed sources because of its intrinsic linearity. A nonlinear ICA technique called KICA has been developed to tackle this problem [6]. KICA can efficiently compute ICs in high-dimensional feature spaces using the kernel matrix. However, it is difficult to diagnose faults with the KICA algorithm alone, since the mapping function is unknown.
Sparse SVM is a learning strategy based on the Newton method and the conjugate gradient method. It is proposed in this paper to diagnose faults, and it accelerates both the model selection process and the training.
This paper is organized as follows. In Section 2, the kernel ICA algorithm is introduced. In Section 3, sparse SVM is proposed. The experimental results of the proposed method are presented in Section 4. Finally, our work is summarized in the last section.

KICA Algorithm
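Let $x_1, x_2, \ldots, x_n \in \mathbb{R}^m$ be the $n$ training samples for kernel PCA learning. By the nonlinear mapping $\Phi: x \in \mathbb{R}^m \to \Phi(x) \in \mathcal{F}$, the measured inputs are mapped into the high-dimensional feature space $\mathcal{F}$. The mapping of $x_i$ is simply denoted as $\Phi(x_i)$. The sample covariance in the feature space is given by

$$S^{\Phi} = \frac{1}{n} \sum_{i=1}^{n} \Phi(x_i)\, \Phi(x_i)^T.$$

Let $Q = [\Phi(x_1), \ldots, \Phi(x_n)]$; then $S^{\Phi}$ can be expressed as $S^{\Phi} = (1/n)\, Q Q^T$. Let us form the Gram matrix $K = Q^T Q$. It is an $n \times n$ matrix, and its elements are as follows:

$$[K]_{ij} = K_{ij} = \langle \Phi(x_i), \Phi(x_j) \rangle = k(x_i, x_j). \tag{2}$$

We can center the Gram matrix $K$ using the following formula:

$$\tilde{K} = K - 1_n K - K 1_n + 1_n K 1_n,$$

where $1_n$ is the $n \times n$ matrix whose entries are all $1/n$. In what follows, $Q$ and $k$ refer to the centered quantities. Calculate the orthonormal eigenvectors $\gamma_1, \gamma_2, \ldots, \gamma_p$ of $\tilde{K}$ corresponding to its $p$ largest positive eigenvalues $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_p$. Then, the $p$ largest positive eigenvalues of $S^{\Phi}$ are $\lambda_1/n, \lambda_2/n, \ldots, \lambda_p/n$, and the associated orthonormal eigenvectors $V_1, V_2, \ldots, V_p$ can be expressed by

$$V_j = \frac{1}{\sqrt{\lambda_j}}\, Q \gamma_j, \quad j = 1, \ldots, p.$$

Denote $H = [\gamma_1, \gamma_2, \ldots, \gamma_p]$ and $\Lambda = \mathrm{diag}(\lambda_1, \lambda_2, \ldots, \lambda_p)$; the matrix form is $V = [V_1, \ldots, V_p] = Q H \Lambda^{-1/2}$. The mapped data in feature space can be whitened by the following transformation:

$$z = P^T \Phi(x).$$

Specifically,

$$z = \sqrt{n}\, \Lambda^{-1} H^T Q^T \Phi(x) = \sqrt{n}\, \Lambda^{-1} H^T \left[k(x_1, x), \ldots, k(x_n, x)\right]^T,$$

where $P = V \left(\tfrac{1}{n}\Lambda\right)^{-1/2} = \sqrt{n}\, Q H \Lambda^{-1}$ is the whitening matrix.

Centering and whitening are carried out in the KPCA feature space. We designate the data after preprocessing as $z_{\mathrm{normal}}$; the demixing matrix $W$, as well as $\hat{s}_{\mathrm{normal}} = W z_{\mathrm{normal}}$, is obtained from the FastICA algorithm in the feature space. The detailed FastICA algorithm can be found in the literature [7]. The data dimension can be reduced by selecting a few rows of $W$, based on the assumption that the rows with the largest sum of squared coefficients have the largest effect on the variation of $\hat{s}$. The selected $d$ rows of $W$ constitute a matrix $W_d$ (the dominant part of $W$), and the remaining rows of $W$ constitute a matrix $W_e$. For new sample data $z_{\mathrm{new}}(k)$, the independent vectors corresponding to the principal component space and the residual space are $\hat{s}_{\mathrm{new}d}(k) = W_d\, z_{\mathrm{new}}(k)$ and $\hat{s}_{\mathrm{new}e}(k) = W_e\, z_{\mathrm{new}}(k)$, respectively [8].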
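The KPCA whitening step described above can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: the Gaussian kernel, its width `sigma`, the function names, and the random data are all assumptions made for the example. The sketch centers the Gram matrix, keeps the `p` leading eigenpairs, and computes the whitened scores as $\sqrt{n}\,\Lambda^{-1} H^T$ applied to the centered kernel vector.

```python
import numpy as np

def rbf_kernel(A, B, sigma=1.0):
    """Gaussian kernel matrix (an assumed kernel choice for this sketch)."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def kpca_whiten_fit(X, p, sigma=1.0):
    """Center the Gram matrix and keep its p leading eigenpairs."""
    n = X.shape[0]
    K = rbf_kernel(X, X, sigma)
    one_n = np.full((n, n), 1.0 / n)
    K_c = K - one_n @ K - K @ one_n + one_n @ K @ one_n
    lam, H = np.linalg.eigh(K_c)              # eigenvalues in ascending order
    top = np.argsort(lam)[::-1][:p]           # indices of the p largest
    return {"X": X, "K": K, "lam": lam[top], "H": H[:, top], "sigma": sigma}

def kpca_whiten_transform(model, X_new):
    """Whitened scores z = sqrt(n) * Lambda^{-1} H^T k_c(x) for each row of X_new."""
    X, K = model["X"], model["K"]
    n = X.shape[0]
    k = rbf_kernel(X_new, X, model["sigma"])  # shape (n_new, n)
    one_n = np.full((n, n), 1.0 / n)
    one_t = np.full((X_new.shape[0], n), 1.0 / n)
    k_c = k - one_t @ K - k @ one_n + one_t @ K @ one_n
    return np.sqrt(n) * (k_c @ model["H"]) / model["lam"]

# Hypothetical data: 50 samples of 3 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
model = kpca_whiten_fit(X, p=4)
Z = kpca_whiten_transform(model, X)
cov = Z.T @ Z / X.shape[0]                    # should be close to the identity
```

On the training data the whitened scores have (approximately) zero mean and identity covariance, which is the property the subsequent ICA step relies on.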
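We define three process monitoring statistics [8]:

$$I^2(k) = \hat{s}_{\mathrm{new}d}(k)^T\, \hat{s}_{\mathrm{new}d}(k), \qquad I_e^2(k) = \hat{s}_{\mathrm{new}e}(k)^T\, \hat{s}_{\mathrm{new}e}(k), \qquad \mathrm{SPE}(k) = e(k)^T e(k),$$

where $e(k)$ is the residual of $z_{\mathrm{new}}(k)$ that is not reconstructed from the dominant independent components. Their confidence limits can be obtained by kernel density estimation; more details can be found in the literature [9]. The specific implementation steps are shown in Figure 1.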
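As a rough illustration of how such a statistic and its control limit might be computed, the following sketch uses only the Python standard library. Everything concrete in it is an assumption for the example: the 3 x 3 matrix `W`, the simulated normal data, the Gaussian kernel in the density estimate, the rule-of-thumb bandwidth, and the 99% confidence level. The control limit is taken as the point where the CDF of the kernel density estimate reaches `alpha`, found by bisection.

```python
import math
import random

def split_rows(W, d):
    """Split W into (W_d, W_e): the d rows with the largest sum of squares, and the rest."""
    order = sorted(range(len(W)), key=lambda i: sum(w * w for w in W[i]), reverse=True)
    dominant, residual = sorted(order[:d]), sorted(order[d:])
    return [W[i] for i in dominant], [W[i] for i in residual]

def squared_norm_statistic(W_part, z):
    """Compute s^T s with s = W_part z (the form shared by I^2 and I_e^2)."""
    s = [sum(w * v for w, v in zip(row, z)) for row in W_part]
    return sum(v * v for v in s)

def rule_of_thumb_bandwidth(samples):
    """Bandwidth 1.06 * std * n^(-1/5); an assumed choice, not taken from the paper."""
    n = len(samples)
    mean = sum(samples) / n
    std = math.sqrt(sum((s - mean) ** 2 for s in samples) / (n - 1))
    return 1.06 * std * n ** (-0.2)

def kde_cdf(x, samples, h):
    """CDF of a Gaussian kernel density estimate with bandwidth h."""
    return sum(0.5 * (1.0 + math.erf((x - s) / (h * math.sqrt(2.0))))
               for s in samples) / len(samples)

def kde_control_limit(samples, alpha=0.99):
    """Find the value whose KDE-based CDF equals alpha, by bisection."""
    h = rule_of_thumb_bandwidth(samples)
    lo, hi = min(samples) - 10 * h, max(samples) + 10 * h
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if kde_cdf(mid, samples, h) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical demixing matrix and simulated normal operating data.
W = [[2.0, 0.0, 0.0], [0.0, 0.1, 0.0], [0.0, 0.0, 1.0]]
W_d, W_e = split_rows(W, d=2)
random.seed(0)
z_normal = [[random.gauss(0, 1) for _ in range(3)] for _ in range(500)]
i2_normal = [squared_norm_statistic(W_d, z) for z in z_normal]
limit = kde_control_limit(i2_normal, alpha=0.99)
alarm = squared_norm_statistic(W_d, [5.0, 0.0, 5.0]) > limit  # a clearly abnormal sample
```

The same `kde_control_limit` call would apply unchanged to the $I_e^2$ and SPE values collected under normal operation.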

Sparse SVM
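Given training samples $(x_i, y_i)$, $i = 1, \ldots, n$, with labels $y_i \in \{-1, +1\}$ and a loss function $L$, the SVM optimization problem can be written as an unconstrained one [10,11]:

$$\min_{w,\, b}\; \|w\|^2 + C \sum_{i=1}^{n} L(y_i,\, w \cdot x_i + b). \tag{11}$$

Let us now consider nonlinear SVMs with a kernel function $k$ and an associated reproducing kernel Hilbert space $\mathcal{H}$. The optimization problem (11) becomes

$$\min_{f \in \mathcal{H}}\; \lambda \|f\|_{\mathcal{H}}^2 + \sum_{i=1}^{n} L(y_i,\, f(x_i)). \tag{12}$$

We have made a change of variable by introducing the regularization parameter $\lambda = 1/C$. We have also dropped the offset $b$ for the sake of simplicity.
Suppose now that the loss function $L$ is differentiable with respect to its second argument. The minimizer then has the form $f(x) = \sum_{i=1}^{n} \beta_i k(x_i, x)$, and, using the reproducing property, (12) becomes

$$\Omega(\beta) = \lambda\, \beta^T K \beta + \sum_{i=1}^{n} L(y_i,\, K_i^T \beta), \tag{13}$$

where $K$ is the kernel matrix with $K_{ij} = k(x_i, x_j)$ and $K_i$ is its $i$th column. For a given value of the vector $\beta$, we say that a point $x_i$ is a support vector if $y_i f(x_i) < 1$, that is, if the loss on this point is nonzero. Let us reorder the training points such that the first $n_{sv}$ points are support vectors. Finally, let $I_{sv}$ be the $n \times n$ diagonal matrix with the first $n_{sv}$ entries being 1 and the others 0, and let $Y$ be the vector of labels. For the quadratic loss $L(y_i, f(x_i)) = \max(0,\, 1 - y_i f(x_i))^2$, the gradient of (13) with respect to $\beta$ is

$$\nabla = 2\left(\lambda K \beta + K I_{sv} (K \beta - Y)\right), \tag{14}$$

and the Hessian is $H = 2(\lambda K + K I_{sv} K)$. Each Newton step consists of the following update: $\beta \leftarrow \beta - \gamma H^{-1} \nabla$, where the step size $\gamma$ is found by a one-dimensional search method.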
Algorithm 1: SVM based on the modified Newton optimization.
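Combining (14) and $H$ as $\nabla = H\beta - 2 K I_{sv} Y$, we find that after the update (with unit step size $\gamma = 1$):

$$\beta = (\lambda K + K I_{sv} K)^{-1} K I_{sv} Y = (\lambda I_n + I_{sv} K)^{-1} I_{sv} Y. \tag{15}$$

Using the fact that the lower left block of $\lambda I_n + I_{sv} K$ is 0, the update (15) turns out to be $\beta_{sv} = (\lambda I_{n_{sv}} + K_{sv})^{-1} Y_{sv}$, where $K_{sv}$ and $Y_{sv}$ are the kernel matrix and label vector restricted to the support vectors, and the remaining components of $\beta$ are 0.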
Therefore, the algorithm of SVM based on the modified Newton optimization is performed as shown in Algorithm 1.
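Since the body of Algorithm 1 is not reproduced here, the following is only a minimal sketch of a Newton-type primal SVM iteration. It assumes the quadratic loss $\max(0, 1 - y f(x))^2$, no offset $b$, and a fixed unit step size in place of the one-dimensional search; the function name, the toy data, and the linear kernel are all hypothetical. Each pass solves the reduced system on the current support vectors and then recomputes which points violate the margin.

```python
import numpy as np

def primal_svm_newton(K, y, lam, max_iter=50):
    """Newton iterations with unit step: beta_sv = (lam*I + K_sv)^{-1} y_sv, other entries 0."""
    n = len(y)
    beta = np.zeros(n)
    sv = np.ones(n, dtype=bool)            # start by treating every point as a support vector
    for _ in range(max_iter):
        idx = np.flatnonzero(sv)
        if idx.size == 0:
            break
        beta = np.zeros(n)
        A = lam * np.eye(idx.size) + K[np.ix_(idx, idx)]
        beta[idx] = np.linalg.solve(A, y[idx])
        new_sv = y * (K @ beta) < 1.0      # points with nonzero loss
        if np.array_equal(new_sv, sv):     # support vector set is stable: converged
            break
        sv = new_sv
    return beta, sv

# Toy 1-D problem with a linear kernel (illustrative data only).
x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([-1.0, -1.0, 1.0, 1.0])
K = np.outer(x, x)
beta, sv = primal_svm_newton(K, y, lam=1.0)
# beta is approximately [0, -1/3, 1/3, 0]; only the two inner points are support vectors
```

On this toy problem the support vector set stabilizes after two passes. A full implementation would add the line search mentioned in the text to guard against oscillation of the support vector set.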
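We now consider the case in which the optimization parameter is $[b, \beta^T]^T$, that is, the offset $b$ is included. The augmented Hessian is

$$2 \begin{pmatrix} \mathbf{1}^T I_{sv} \mathbf{1} & \mathbf{1}^T I_{sv} K \\ K I_{sv} \mathbf{1} & \lambda K + K I_{sv} K \end{pmatrix},$$

where $\mathbf{1}$ should be understood as a vector of all 1.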
We still use the former selection method to choose the right "basis functions" and thereby build the sparse SVM. Assume that the current collection of chosen basis functions is $J$; then selecting the next basis function $j$ is equivalent to keeping $\beta_J$ constant and optimizing the objective with respect to $\beta_j$ alone. By introducing the Cholesky decomposition and the corresponding rank 1 update strategy, the SpSVM based on unconstrained optimization is implemented with the following steps:
(1) update the matrix $\lambda I_{n_{sv}} + K_{sv}$ using the Cholesky decomposition;
(2) optimize $\beta_J$ using Newton's method (Algorithm 1); if the support vector set changes, update the matrix $\lambda I_{n_{sv}} + K_{sv}$ using the rank 1 update strategy;
(3) if the size of the basis function collection $J$ equals the preset value, end the program; otherwise, turn to step 4;
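The rank 1 update strategy mentioned in the steps above avoids refactorizing a matrix from scratch when it changes by a single outer product. The sketch below shows the generic textbook update of a Cholesky factor in $O(n^2)$ operations; it is not tied to the specific matrix used by the authors, and the function name and test matrix are assumptions made for the example.

```python
import numpy as np

def chol_rank1_update(L, v):
    """Return lower-triangular L_new with L_new @ L_new.T == L @ L.T + outer(v, v)."""
    L = L.copy()
    v = np.asarray(v, dtype=float).copy()
    n = v.size
    for k in range(n):
        r = np.hypot(L[k, k], v[k])
        c, s = r / L[k, k], v[k] / L[k, k]
        L[k, k] = r
        if k + 1 < n:
            # update the rest of column k, then carry the remainder of v forward
            L[k + 1:, k] = (L[k + 1:, k] + s * v[k + 1:]) / c
            v[k + 1:] = c * v[k + 1:] - s * L[k + 1:, k]
    return L

# Hypothetical symmetric positive definite matrix and update vector.
rng = np.random.default_rng(1)
B = rng.normal(size=(5, 5))
A = B @ B.T + 5.0 * np.eye(5)
v = rng.normal(size=5)
L = np.linalg.cholesky(A)
L_new = chol_rank1_update(L, v)
```

The result agrees with a fresh `np.linalg.cholesky(A + np.outer(v, v))`, at lower cost than the $O(n^3)$ refactorization.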

Experimental Results
The proposed method is applied to a simulation of the Tennessee Eastman chemical process for fault detection and fault diagnosis. The QP and QCQP problems involved in SVM training can be solved with the interior-point-based optimization package MOSEK, and the SVM training data are preprocessed. The process has 12 manipulated variables, 22 continuous process measurements, and 19 composition measurements that are sampled less frequently. The process contains Gaussian noise. The TE process includes 20 predefined fault modes, representing step changes, random variations, slow drift, and other fault types; for its process flow diagram and a detailed description, see [12]. We use the control strategy in [13] to conduct closed-loop control. The simulation data form a 960 × 52 matrix, where 52 is the number of process variables and 960 is the number of sampling points for each variable. The simulation runs for 48 hours. The fault is introduced at the 8th hour.
To verify the effectiveness of the KICA monitoring algorithm, KPCA and KICA were applied to detect fault 5. The monitoring charts of KPCA for fault 5 are shown in Figure 2. They indicate that the $T^2$ and SPE statistics can detect the occurrence of the fault, but with false positives and false negatives at some sampling points. The monitoring charts of KICA for fault 5 are shown in Figure 3. In contrast to KPCA, the $I^2$, $I_e^2$, and SPE statistics all respond strongly to the fault.
If the statistics exceed the control limits, a fault has occurred. We select the 52 variables as the condition attributes and 15 groups as the decision attribute, representing 15 different types of faults. The training set consists of 600 samples, 40 per category. The test set consists of 1200 samples, 80 per category. We use the 1-a.-r-mix strategy to realize multiclass SVM classification. The factor $C$ is then set to 1000. The size of the candidate set in SpSVM is 20. The specific diagnostic results are shown in Table 1.

Conclusions
In this paper, KICA and sparse SVM are proposed for complex industrial process monitoring and fault diagnosis. KICA can efficiently compute ICs in high-dimensional feature spaces by means of integral operations. The sparse SVM can reduce the number of support vectors without reducing the classification accuracy. The application results show the effectiveness of the proposed method.

Table 1: Results of fault identification.