Monitoring of Nonlinear Time-Delay Processes Based on Adaptive Method and Moving Window

A new adaptive kernel principal component analysis (KPCA) algorithm for monitoring nonlinear time-delay processes is proposed. The main contribution of the proposed algorithm is to combine adaptive KPCA with the moving window principal component analysis (MWPCA) algorithm and the exponentially weighted principal component analysis (EWPCA) algorithm. The new algorithm prejudges each newly available sample with the moving window KPCA (MWKPCA) method to decide whether the model should be updated, and then updates the KPCA model with the exponentially weighted KPCA (EWKPCA) method. It also extends MWPCA and EWPCA effectively from the linear data space to the nonlinear data space. Monitoring experiments are performed with the proposed algorithm, and the simulation results show that the proposed method is effective.


Introduction
Fault detection and diagnosis are very important aspects of modern industrial processes because they concern the execution of planned operations and process productivity. To meet the needs of production, data-based methods have been developed extensively, such as principal component analysis (PCA), partial least squares (PLS), and independent component analysis (ICA) [1][2][3][4][5].
PCA, as a multivariable statistical method, is widely used for damage detection and diagnosis of structures in industrial systems [6][7][8]. Because PCA is an orthogonal transformation of the coordinate system, it is a linear monitoring method. However, most industrial processes have strongly nonlinear characteristics [9,10]. Reference [11] shows that applying linear monitoring approaches to nonlinear processes may lead to unreliable process monitoring, because a linear method cannot extract the nonlinearities among the process variables. To solve the nonlinear problem, several nonlinear methods have been proposed in the past decades [12][13][14][15][16][17].
Kernel principal component analysis (KPCA), a nonlinear version of PCA developed by many researchers, is better suited to nonlinear process monitoring [18][19][20]. KPCA efficiently computes principal components (PCs) through nonlinear kernel functions in a high-dimensional feature space. Its main advantage is that it only solves an eigenvalue problem and does not involve nonlinear optimization [21]. However, a KPCA monitoring model requires a kernel matrix whose dimension is given by the number of reference samples. In addition, a KPCA model that is kept fixed may produce large errors, because the process parameters change gradually during operation [22][23][24][25][26]. Because old samples are no longer representative of the current process status, adaptation of the KPCA model is necessary.
To date, the MWPCA and EWPCA algorithms are the two representative adaptive methods [27][28][29][30][31]. The moving window approach proposed by Hoegaerts et al. overcomes the problem of a growing kernel matrix and yields a constant kernel-matrix size and a fixed speed of adaptation [32,33]. Choosing a proper weighting factor is an important issue, since it determines the influence that older data have on the model [31]. Reference [34] proposed a moving window KPCA formulation with several advantages, such as using the Gram matrix instead of the kernel matrix and incorporating an adaptation method for the eigendecomposition of the Gram matrix.
In this paper, a new algorithm combining these two representative methods is proposed. The proposed algorithm maps the sample set into the feature space and prejudges each newly available sample with the MWKPCA method to decide whether the model should be updated; the KPCA model is then updated with the EWKPCA method. The proposed algorithm reduces the negative impact of outliers on the model, and updating only after prejudgment reduces the computational complexity. The remaining sections of this paper are organized as follows. The KPCA method based on a loss function and the online monitoring strategy are introduced in Section 2. The iterative KPCA algorithm with a penalty factor is presented in Section 3. The proposed adaptive KPCA method combining the two representative methods is described in Section 4. The simulation results are presented in Section 5. Finally, the conclusion is given in Section 6.

Kernel Principal Component Analysis Based on Loss Function
A data set X = {x_1, x_2, ..., x_N} ∈ R^m is mapped into a potentially much higher dimensional feature space F, giving the mapped data set Φ(X) = {Φ(x_1), Φ(x_2), ..., Φ(x_N)}. The centered set is obtained by subtracting the corresponding mean from Φ(X), so that ∑_{i=1}^{N} Φ(x_i) = 0. A principal component V can be computed by solving the eigenvalue problem

λV = CV,  C = (1/N) ∑_{i=1}^{N} Φ(x_i)Φ(x_i)^T,

where C is the sample covariance matrix of Φ(X) and λ > 0 is the eigenvalue corresponding to V.
By defining the N × N kernel matrix K with entries K_ij = Φ(x_i)^T Φ(x_j), the eigenvalue problem can be put in the form

Nλα = Kα,

where V = ∑_{i=1}^{N} α_i Φ(x_i). For the kernel function, the radial basis function (RBF)

k(x, y) = exp(−‖x − y‖² / σ)

is chosen in this paper, where σ is the width of the Gaussian kernel; it can be very small (σ < 1) or quite large.
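As an illustration, the construction and centering of the RBF kernel matrix can be sketched as follows (a minimal Python sketch assuming numpy; the function names are illustrative, not from the paper):

```python
import numpy as np

def rbf_kernel_matrix(X, sigma):
    """Gram matrix K[i, j] = exp(-||x_i - x_j||^2 / sigma) for the rows of X."""
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T   # pairwise squared distances
    return np.exp(-np.maximum(d2, 0.0) / sigma)      # clamp tiny negatives

def center_kernel(K):
    """Double-center K so the mapped data have zero mean in feature space."""
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    return K - one @ K - K @ one + one @ K @ one
```

The eigenvectors α of the centered matrix (scaled so that Nλ‖α‖² = 1) then give the principal directions V = ∑ α_i Φ(x_i).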
Let λ_1 ≥ λ_2 ≥ ⋯ ≥ λ_N and α_1, α_2, ..., α_N be the eigenvalues and the corresponding eigenvectors of the kernel matrix K, respectively. Since V_k = ∑_{i=1}^{N} α_{ki} Φ(x_i), the score of a sample x on the kth eigenvector V_k can be calculated as

t_k = V_k^T Φ(x) = ∑_{i=1}^{N} α_{ki} k(x_i, x),

where W is the transformation matrix whose columns are the retained eigenvectors, with ‖W‖ = 1.
Between the original sample data set and the reconstructed data set there exists the reconstruction error e(Φ(x_i)) = Φ(x_i) − WW^T Φ(x_i). Define the loss function in the feature space as

J_1(W) = ∑_{i=1}^{N} ‖Φ(x_i) − WW^T Φ(x_i)‖² = ∑_{i=1}^{N} ‖Φ(x_i)‖² − ∑_{i=1}^{N} ‖W^T Φ(x_i)‖²,

where ∑_{i=1}^{N} ‖Φ(x_i)‖² is a constant. Thus, when the transformation matrix W maximizes ∑_{i=1}^{N} ‖W^T Φ(x_i)‖², the loss function J_1(W) reaches its minimum. Similarly, the score on the kth PC W_k is calculated as t_k = W_k^T Φ(x).

Online Monitoring Strategy of KPCA.
A measure of the variation within the KPCA model is given by Hotelling's T² statistic. The measure of goodness of fit of a sample to the model is the squared prediction error (SPE), also known as the Q statistic. T² is given by

T² = [t_1, ..., t_p] Δ^{-1} [t_1, ..., t_p]^T,

where p is the number of PCs, N is the number of samples, t_k is the kth nonlinear principal component score calculated with (11), and Δ^{-1} is the inverse of the diagonal matrix of principal eigenvalues. The control limit of Hotelling's T² is

T²_lim = [p(N² − 1) / (N(N − p))] F_{p, N−p, α}.

Once a new sample Φ(x_new) is available, T²_new can be calculated from its score vector in the same way. The SPE and the corresponding control limit are

SPE = ‖Φ(x) − WW^T Φ(x)‖²,  SPE_lim = g χ²_{h, α},

where g and h are two relevant parameters estimated from the mean and variance of the training SPE values. For a new sample Φ(x_new), the SPE is evaluated from the kernel vector k_new, whose ith element k_new,i is k(x_i, x_new).
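The two monitoring statistics and their control limits can be sketched as follows (a minimal Python sketch assuming numpy and scipy; the functions use the standard F-distribution and weighted chi-square approximations named above, and the function names are illustrative):

```python
import numpy as np
from scipy.stats import chi2, f

def t2_statistic(t, eigvals):
    """Hotelling's T^2 = t diag(lambda)^-1 t' for a score vector t."""
    return float(np.sum(np.asarray(t)**2 / np.asarray(eigvals)))

def t2_limit(p, n, alpha=0.99):
    """F-distribution control limit: p(n^2 - 1)/(n(n - p)) * F_{p, n-p, alpha}."""
    return p * (n**2 - 1) / (n * (n - p)) * f.ppf(alpha, p, n - p)

def spe_limit(spe_train, alpha=0.99):
    """Weighted chi-square approximation SPE_lim = g * chi2_{h, alpha}, with
    g = var/(2*mean) and h = 2*mean^2/var estimated from training SPE values."""
    m, v = np.mean(spe_train), np.var(spe_train)
    g, h = v / (2 * m), 2 * m**2 / v
    return g * chi2.ppf(alpha, h)
```

A new sample is flagged when its T² or SPE value exceeds the corresponding limit.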

Iterative Kernel Principal Component Analysis
The KPCA algorithm based on eigenvalue decomposition is a batch learning method. It is not suitable for online monitoring, since all sample data must be known before modeling. In addition, no outliers are allowed in the modeling process, which requires all sample data to be normal; this requirement is hard to meet in actual production, and outliers exist in the sample set even after mapping into the feature space. Therefore, an iterative KPCA algorithm in the least-squared-error sense is proposed in this paper, and a penalty factor is added to handle the outlier problem.
A stochastic gradient descent algorithm is used to solve the problem posed in (10). The iterative formula is

W_{k+1} = W_k + η_k [Φ(x_k)Φ(x_k)^T W_k − (W_k^T Φ(x_k)Φ(x_k)^T W_k) W_k],

where η_k is the iteration step, 0 < η_k < 1, and W_k converges to the first nonlinear principal component.
Because the nonlinear principal components are mutually orthogonal, the Gram-Schmidt orthogonalization method is used to calculate the remaining ones:

W_j ← W_j − ∑_{i=1}^{j−1} (W_i^T W_j) W_i,

where W_j is the jth nonlinear principal component (j = 2, 3, ..., p) and p is the number of PCs.
To solve the outlier problem, a penalty factor is added to (10):

J_2(W) = ∑_{i=1}^{N} [ζ_i ‖Φ(x_i) − WW^T Φ(x_i)‖² + (1 − ζ_i) C],

where C > 0 is a predefined threshold and ζ_i satisfies ζ_i = 1 if ‖Φ(x_i) − WW^T Φ(x_i)‖² ≤ C and ζ_i = 0 otherwise. As shown above, when ‖Φ(x_i) − WW^T Φ(x_i)‖² > C, Φ(x_i) is regarded as an outlier, and its contribution to J_2(W) is capped at C to reduce its impact on the KPCA model. Note that ζ_i in the penalty factor is discrete; thus the continuous sigmoid function is used to approximate ζ_i. The iteration formula with the penalty factor is

W_{k+1} = W_k + η_k · [1 / (1 + e^{‖e_k(Φ(X))‖² − C})] · [Φ(x_k)Φ(x_k)^T W_k − (W_k^T Φ(x_k)Φ(x_k)^T W_k) W_k],

where η_k is the iteration step, 0 < η_k < 1, and 1/(1 + e^{‖e_k(Φ(X))‖² − C}) is the continuous sigmoid function. As (16) shows, the number of outliers is determined by the threshold C in the penalty factor: the smaller the threshold, the more sample points are treated as outliers. Therefore, the value of C can be determined by the expected proportion of outliers. Sort the sample reconstruction errors in descending order, and set the percentage of outliers to a chosen value.
The largest such percentage of reconstruction errors is selected as outliers; thus the value of C can be set to the smallest reconstruction error among all the outliers. The other nonlinear principal components are obtained in the same way by combining the penalized iteration with the orthogonalization step. The iterative kernel principal component analysis algorithm with a penalty factor is summarized as follows.
(1) Given an initial standardized block of sample data Φ(X), a maximum iteration number n, and a convergence threshold ε, set the initial iteration step k = 1 and the principal component number j = 1.
(2) Construct the kernel matrix and scale it to obtain K_0.
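The penalty-weighted stochastic iteration described above can be sketched as follows. This is an illustrative Python sketch only: for clarity it is shown with the identity feature map Φ(x) = x (i.e., in the input space rather than via the kernel trick), the update is an Oja-style gradient step consistent with minimizing the loss function, and the function name and defaults are assumptions rather than the paper's implementation:

```python
import numpy as np

def iterative_pc_with_penalty(X, C, eta=0.05, epochs=50, seed=0):
    """Stochastic-gradient estimate of the first principal direction.
    Samples whose reconstruction error exceeds the threshold C are
    down-weighted by the sigmoid factor 1/(1 + exp(||e||^2 - C)),
    softening the hard outlier cut described in the text."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])
    w /= np.linalg.norm(w)
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            e = x - w * (w @ x)                       # reconstruction error
            z = np.clip(e @ e - C, -50.0, 50.0)       # avoid overflow in exp
            weight = 1.0 / (1.0 + np.exp(z))          # continuous penalty factor
            w += eta * weight * ((w @ x) * x - (w @ x) ** 2 * w)  # Oja-style step
            w /= np.linalg.norm(w)                    # keep ||w|| = 1
    return w
```

Subsequent components would be obtained by repeating the iteration with the Gram-Schmidt deflation of Step (5).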

Adaptive Kernel Principal Component Analysis
In practice, the size of the sample data set gradually increases during a dynamic process, and using a fixed KPCA model for online monitoring may cause large errors. Therefore, to improve the ability to adapt to new samples, dynamic behavior is added to the KPCA algorithm.
In this paper, an adaptive kernel PCA algorithm combining the two representative methods is proposed. The kernel matrix and the corresponding control limits of the two statistics can be updated in real time.
In MWKPCA, once a new sample is available, a data window of fixed length moves in real time to update the KPCA model.
In the EWKPCA algorithm proposed by Li et al., the covariance matrix is updated as

C_t = λ_t C_{t−1} + (1 − λ_t) x_t x_t^T,

where x_t is the new sample and λ_t is the corresponding weighting factor, 0 < λ_t < 1. As shown above, the weight on the latest observation increases as λ_t decreases, so the latest observation contributes most to the model. This idea is introduced into the KPCA algorithm. Once a new sample is available, a new kernel vector k_t can be calculated, and in this paper k_t is used to update the kernel matrix:

K_rec^t = λ_t K_rec^{t−1} + (1 − λ_t) k_t k_t^T,

where K_rec^{t−1} is the kernel matrix calculated at time t − 1, λ_t is the weighting factor, 0 < λ_t < 1, and k_t is the standardized kernel vector at time t. As t increases, old samples affect the model less and less, until their effect becomes negligible. Therefore, old samples are discounted automatically instead of being removed manually.
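The exponentially weighted update can be sketched as follows (a minimal Python sketch assuming numpy; the function name is illustrative, and the same recursion applies whether the matrix being updated is a covariance matrix or a kernel matrix):

```python
import numpy as np

def ew_update(M_prev, v_new, lam):
    """Exponentially weighted rank-one update:
    M_t = lam * M_{t-1} + (1 - lam) * v v^T.
    A smaller lam gives the newest sample more weight; the contribution of a
    sample seen k steps ago decays geometrically as (1 - lam) * lam**k, so old
    samples fade out automatically and never need manual removal."""
    v = np.asarray(v_new, dtype=float)
    return lam * M_prev + (1.0 - lam) * np.outer(v, v)
```

Iterating the update with a constant vector drives the matrix toward that vector's outer product, which shows the geometric forgetting at work.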
An important step of the EWKPCA algorithm is the determination of the weighting factor λ_t. A fixed forgetting factor is not applicable, because an industrial process does not change at a constant rate. In the algorithm proposed by Choi et al., two forgetting factors are used to update the sample mean and the covariance matrix, respectively. The forgetting factor used to update the covariance matrix is a function of ‖ΔR‖ that varies between λ_min and λ_max, where λ_max and λ_min are the maximum and minimum values of the forgetting factor, ‖ΔR‖ is the Euclidean norm of the difference between two consecutive correlation matrices, and ‖ΔR_nor‖ is the average ‖ΔR‖ obtained from historical data; two further parameters, which control the sensitivity, need to be determined. Let X_{t−1} = {x_t, x_{t+1}, ..., x_{t+L−1}} ∈ R^m be the sample data set, K_rec^{t−1} the kernel matrix with K_rec^0 = K_0, W_{t−1} the transformation matrix, Δ_{t−1} the principal eigenvalue diagonal matrix, and T²_{lim,t−1} and SPE_{lim,t−1} the control limits at time t − 1, respectively.
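A variable forgetting factor of this kind can be sketched as follows. The exact functional form used by Choi et al. is not reproduced here; this is an illustrative monotone map with the properties described above (a large change between consecutive correlation matrices pushes λ toward λ_min for fast adaptation, a small change pushes it toward λ_max), with assumed parameter names a and b for the two sensitivity parameters:

```python
import numpy as np

def forgetting_factor(dR, dR_nor, lam_min, lam_max, a, b):
    """Map the relative change ||dR|| / ||dR_nor|| between consecutive
    correlation matrices to a forgetting factor in [lam_min, lam_max].
    a controls the steepness of the switch, b its location."""
    ratio = np.linalg.norm(dR) / np.linalg.norm(dR_nor)
    s = 1.0 / (1.0 + np.exp(np.clip(a * (ratio - b), -50.0, 50.0)))
    return lam_min + (lam_max - lam_min) * s
```

With this choice, a quiet process keeps λ near λ_max (slow forgetting), while an abrupt process change drops λ toward λ_min so the model adapts quickly.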
The adaptive kernel principal component analysis algorithm can be summarized as follows.
(1) At time t, a new sample is available. Φ(x_new) is obtained after scaling it with the mean and covariance obtained at time t − 1.

Electrofused Magnesium Furnace Process
The purpose of this section is to test the performance of the proposed algorithm. The electrofused magnesium furnace (EFMF) process is chosen as the monitoring object, because the working conditions of the EFMF are complex and change frequently, with strong nonlinearity [35]. The shell of the EFMF is round and slightly tapered, which facilitates the melting process. There are rings on the furnace wall and a trolley under the furnace; when the melting process is complete, the trolley is moved out for cooling. The control objective is to ensure that the temperature of the EFMF meets the set value. The average duration of the whole EFMF multimode process is 10 h. The current and voltage of the three phases and the furnace temperature can all be measured online, which provides abundant process information. The "healthy" process data are used for modeling, and the "faulty" process data are used for monitoring.
In this experiment, the training data set has 600 × 6 sample points and the testing data set has 1000 × 6 sample points; each contains the current and voltage values. The two data sets are plotted in Figures 1 and 2. The size of the moving window is 50, and the step length is 1. Following the steps given in Section 4, Hotelling's T² and SPE statistics are shown in Figures 3 and 4. The results indicate that the proposed algorithm follows the variation of the monitored object well. The continuous annealing process considered next is an important part of cold rolling in steel works. As there are furnace zones, stable operation is necessary for product quality and for the continuous processing of the upstream and downstream lines. Process monitoring and fault diagnosis have always been a primary concern [36].

Continuous Annealing
The strip in the continuous annealing line is heated in order to rearrange its internal crystal structure. The material for annealing is a cold-rolled strip coil, which is put on a payoff reel on the entry side of the line. The head end of the coil is then pulled out and welded to the tail end of the preceding coil, and the strip runs through the process at a certain line speed. On the delivery side, the strip is cut to product length by a shear machine and coiled again by a tension reel. The schematic diagram of the continuous annealing process is shown in Figure 5, with some abbreviations listed in Table 1.
Strip breakage is a frequent fault in the continuous annealing process [37], so a strip-break fault is considered here. Four current data sets with faults are measured from SF3R-SF6R; every data set contains 800 sample points, as shown in Figure 6. We choose 300 normal data points (Figure 7) to train the algorithm. To simplify the calculation, the data are normalized beforehand. The size of the moving window is 60, and the step length is 1. Hotelling's T² and SPE statistics are shown in Figures 8 and 9.

Conclusion
This paper has studied an algorithm for monitoring nonlinear time-delay processes. We propose an adaptive KPCA algorithm that combines MWPCA and EWPCA. The proposed algorithm maps the sample set into the feature space and prejudges each newly available sample with the MWKPCA method to decide whether the model should be updated; the KPCA model is then updated with the EWKPCA method. The algorithm reduces the negative impact of outliers on the model, and updating only after prejudgment reduces the computational complexity. The experimental results show that the proposed algorithm is effective.

(4) If |W_{k+1} − W_k| > ε and k < n, set k = k + 1 and go back to Step (3) for the next iteration; otherwise, W_j is obtained.
(5) Set j = j + 1 and go back to Step (3) to calculate the next PC.

(5) Update the KPCA model to obtain the transformation matrix W_t. The control limits of the Hotelling's T² and SPE statistics are updated at the same time. Then go back to Step (1) for the next adaptation.

Figure 1: Sample points for training the proposed algorithm.

Figure 2: Sample points for testing the proposed algorithm.

(6) Model updating is not conducted if the control limits are exceeded. According to the number of consecutive limit-exceeding samples, they are judged to be outliers or a fault. Let d be the number of consecutive limit-exceeding samples, initially d = 0. The value of d increases by 1 until the next normal sample is obtained. Limit-exceeding samples are judged to be outliers if d < 3; otherwise, they are judged to indicate a fault. After the judgment, go back to Step (1) for the next sample.
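The outlier-versus-fault decision rule above can be sketched as follows (a minimal Python sketch; the function name and the run-length representation are illustrative):

```python
def classify_runs(flags, fault_len=3):
    """Label each maximal run of consecutive limit violations in a boolean
    sequence: runs shorter than fault_len are outliers, runs of fault_len or
    more indicate a fault.  Returns (start_index, run_length, label) tuples."""
    runs, i, n = [], 0, len(flags)
    while i < n:
        if flags[i]:
            j = i
            while j < n and flags[j]:      # extend the run of violations
                j += 1
            label = 'outlier' if j - i < fault_len else 'fault'
            runs.append((i, j - i, label))
            i = j
        else:
            i += 1
    return runs
```

For example, two isolated violations followed by four consecutive ones would yield one outlier run and one fault run.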

Figure 4: SPE statistics of testing data.

Figure 6: Data sample points for monitoring.

Figure 7: Data samples for training the algorithm.

Figure 9: SPE statistics for the algorithm.

Table 1: Meanings of some abbreviations in Figure 5.