Quality-Related Process Monitoring Based on Total Kernel PLS Model and Its Industrial Application

The projection to latent structures (PLS) model has been widely used in quality-related process monitoring, as it can establish a mapping between process variables and quality index variables. To extend PLS to nonlinear processes, kernel PLS (KPLS) has been proposed as an advanced version. In this paper, we discuss a new total kernel PLS (T-KPLS) for nonlinear quality-related process monitoring. The new model divides the input space into four parts instead of the two parts in KPLS, where an individual subspace is responsible for predicting the quality output, and two parts are utilized for monitoring the quality-related variations. In addition, a fault detection policy is developed based on the T-KPLS model, which is better suited to nonlinear quality-related process monitoring. In the case study, a nonlinear numerical example, the typical Tennessee Eastman process (TEP), and a real industrial hot strip mill process (HSMP) are employed to assess the utility of the present scheme.


Introduction
Multivariate statistical process monitoring (MSPM) is effective for detecting and diagnosing abnormal operating situations in many industrial processes, which helps improve product quality. In MSPM, the projection to latent structures (PLS) model pays more attention to quality-related faults, while principal component analysis (PCA) considers all faults in a process [1][2][3][4][5][6][7]. The major advantage of PLS is its ability to capture the relations between a large number of highly correlated process variables and a few quality variables. By building a PLS model on process variables and quality variables, the process data can be projected onto two low-dimensional subspaces [1,8]. Statistics can then be calculated in these subspaces separately. It should be noted that PLS is a linear algorithm; thus, it performs well on linear or approximately linear data. However, when the process data are strongly nonlinear, PLS gives unsatisfactory results [8].
For many physical and chemical processes, the nonlinearity in the process data and quality data is too obvious to be neglected. To deal with this problem, many nonlinear PLS methods have been proposed [6,9]. Generally, PLS can be extended to nonlinear cases in two ways: by modifying the inner model or by modifying the outer model, which reflects the relation between process variables and quality variables. Kernel projection to latent structures (KPLS), proposed by Rosipal and Trejo, has been developed successfully as a nonlinear PLS model [10]. In the KPLS model, the original input data are transformed into a high-dimensional space via a nonlinear mapping, and then a linear PLS model is built between the feature data and the quality data [11][12][13]. KPLS has an advantage over other nonlinear PLS approaches in that it avoids nonlinear optimization [14,15]. In fact, it simply applies the linear PLS algorithm in the high-dimensional feature space.
In the aforementioned literature [16,17], Li et al. revealed the geometric properties of PLS for process monitoring and compared monitoring policies based on various PLS models, which indicates that the standard PLS model divides the measured space into two oblique subspaces. One contains the quality-related variations; the other contains the quality-unrelated variations. Two statistics are usually utilized for fault detection in these subspaces separately [3,18]. Although PLS-based methods work well in several cases, some problems remain. In regular PLS, many components are usually extracted from the process variables X for predicting the quality variables Y. As a result, the PLS model is complex to interpret [16,[19][20][21]. These PLS components still include variations orthogonal to Y, which make no contribution to predicting Y. On the other hand, the X-residuals from the PLS model are not necessarily small in covariance. This makes the use of the $Q$ statistic on the X-residuals inappropriate. The space decomposition of the KPLS model is similar to that of the PLS model and shares the above-mentioned defects.
In order to improve the KPLS model, a new total kernel PLS (T-KPLS) is proposed for nonlinear quality-related process monitoring in this paper. First, we review and summarize the existing KPLS model and the corresponding process monitoring techniques. Then T-KPLS is developed, and the properties of the new model and the process monitoring strategies are discussed. The T-KPLS model can describe the nonlinear process effectively with respect to the quality data and also gives a further decomposition of the feature spaces in KPLS. Besides linearity, traditional MSPM approaches also assume that the processes operate under a Gaussian distribution and in a single mode, and an increasing number of studies can be found in this area. However, these issues are beyond the scope of this paper and will be considered in subsequent research [14,15,[22][23][24][25].
This paper is organized as follows. The KPLS algorithm and the related process monitoring methods are introduced in Section 2. Section 3 proposes the T-KPLS algorithm, discusses its properties, and constructs the T-KPLS-based process monitoring policy. Section 4 provides a numerical simulation example and the TEP benchmark to illustrate the feasibility of the T-KPLS-based approaches. Furthermore, the new method is also applied to a real industrial hot strip mill process in Section 5. Finally, the paper is concluded in Section 6.
Notation. The notation adopted in this paper is fairly standard. All vectors and matrices are set in bold and written in vector-matrix style. The symbols for scalars and functions are set in regular type throughout this paper.

KPLS Model for Process Monitoring
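2.1. KPLS Model. For a nonlinear process, the input matrix can be defined as $\mathbf{X} = [\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n]^T \in \mathbb{R}^{n \times m}$, which consists of $n$ samples with $m$ process variables, and the output matrix with $l$ quality variables can be denoted by $\mathbf{Y} = [\mathbf{y}_1, \mathbf{y}_2, \ldots, \mathbf{y}_n]^T \in \mathbb{R}^{n \times l}$. Define $\phi$ as a nonlinear map which maps the input vectors from the original space into the feature space $\mathcal{F}$, in which they are approximately linearly related. After the nonlinear map, the original input matrix $\mathbf{X}$ becomes $\Phi = [\phi(\mathbf{x}_1), \phi(\mathbf{x}_2), \ldots, \phi(\mathbf{x}_n)]^T \in \mathbb{R}^{n \times f}$, where $f$ is the dimensionality of the feature space $\mathcal{F}$. Note that $f$ can be very large and even infinite. Define $\mathbf{K} \in \mathbb{R}^{n \times n}$ as the kernel matrix representing $\Phi\Phi^T$, where $K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j) = \langle \phi(\mathbf{x}_i), \phi(\mathbf{x}_j) \rangle$, $i, j = 1, 2, \ldots, n$, and the kernel function $k(\cdot,\cdot)$ evaluates the inner product in the feature space. With the kernel trick, one can avoid performing the explicit nonlinear mapping [10]. Similar to PLS, the KPLS algorithm sequentially extracts the latent vectors $\mathbf{t}$, $\mathbf{u}$ and the weight vectors $\mathbf{w}$, $\mathbf{q}$ from the $\Phi$ and $\mathbf{Y}$ matrices [12]. To eliminate the mean effect, mean centering in the high-dimensional space is performed. In order to center the feature data to zero mean, the following preprocessing of the normal training data is necessary [10,12,13]:
$$\Phi = \Phi_{\mathrm{raw}} - \mathbf{1}_n \bar{\Phi}_{\mathrm{raw}}^T = \left(\mathbf{I}_n - \tfrac{1}{n}\mathbf{1}_n\mathbf{1}_n^T\right)\Phi_{\mathrm{raw}}, \tag{1}$$
where $\Phi_{\mathrm{raw}}$ is the directly mapped matrix, $\bar{\Phi}_{\mathrm{raw}}$ denotes the mean of $\Phi_{\mathrm{raw}}$, and $\mathbf{1}_n$ represents the $n$-dimensional column vector whose elements are all one. So the centered $\mathbf{K}$ can be calculated as follows:
$$\mathbf{K} = \Phi\Phi^T = \left(\mathbf{I}_n - \tfrac{1}{n}\mathbf{1}_n\mathbf{1}_n^T\right)\mathbf{K}_{\mathrm{raw}}\left(\mathbf{I}_n - \tfrac{1}{n}\mathbf{1}_n\mathbf{1}_n^T\right), \qquad \mathbf{K}_{\mathrm{raw}} = \Phi_{\mathrm{raw}}\Phi_{\mathrm{raw}}^T. \tag{2}$$
For a test sample $\mathbf{x}_{\mathrm{new}} \in \mathbb{R}^m$, the directly mapped feature vector is $\phi(\mathbf{x}_{\mathrm{new}})_{\mathrm{raw}} \in \mathbb{R}^f$; then the inner product is calculated by
$$\mathbf{K}_{\mathrm{new}} = \Phi\,\phi(\mathbf{x}_{\mathrm{new}}) = \left(\mathbf{I}_n - \tfrac{1}{n}\mathbf{1}_n\mathbf{1}_n^T\right)\left(\mathbf{K}_{\mathrm{new,raw}} - \tfrac{1}{n}\mathbf{K}_{\mathrm{raw}}\mathbf{1}_n\right), \qquad \mathbf{K}_{\mathrm{new,raw}} = \Phi_{\mathrm{raw}}\,\phi(\mathbf{x}_{\mathrm{new}})_{\mathrm{raw}}. \tag{3}$$
The algorithm of KPLS modeling is given in Appendix A. After that, $\Phi$ and $\mathbf{Y}$ can be represented as
$$\Phi = \mathbf{T}\mathbf{P}^T + \Phi_r, \qquad \mathbf{Y} = \mathbf{T}\mathbf{Q}^T + \mathbf{Y}_r, \qquad \mathbf{T} = \Phi\mathbf{R}, \tag{4}$$
where $\mathbf{T}$ is the score matrix, $\mathbf{P}$ and $\mathbf{Q}$ are the loading matrices, $\Phi_r$ and $\mathbf{Y}_r$ are the residuals, and $\mathbf{R}$ is the weight matrix. The derivation of (4) is presented in Appendix B.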
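The centering step can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function names `center_train_kernel` and `center_test_kernel` are hypothetical and not part of the original algorithm listing. A linear kernel is used only because its feature map is explicit, so the kernel-side centering can be compared with centering the features directly.

```python
import numpy as np

def center_train_kernel(K_raw):
    """Center the n x n training kernel matrix in feature space."""
    n = K_raw.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n   # centering matrix I - (1/n) 1 1^T
    return J @ K_raw @ J

def center_test_kernel(K_raw, k_new_raw):
    """Center the kernel vector between the training set and one new sample."""
    n = K_raw.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    return J @ (k_new_raw - K_raw @ np.ones(n) / n)

# Comparison with explicit centering, using a linear kernel (assumption made
# for illustration only) so that phi(x) = x can be written out.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
x_new = rng.normal(size=3)

K = center_train_kernel(X @ X.T)
k_new = center_test_kernel(X @ X.T, X @ x_new)

Xc = X - X.mean(axis=0)                   # explicitly centered features
print(np.allclose(K, Xc @ Xc.T))
print(np.allclose(k_new, Xc @ (x_new - X.mean(axis=0))))
```

Both checks compare the kernel-only computation with the explicit one; for a Gaussian kernel only the kernel-side route is available, because the feature map is not known explicitly.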
The choice of the kernel function $k(\cdot,\cdot)$ is very important. According to Mercer's theorem, there exists a mapping into a space where a kernel function acts as a dot product if the kernel function is a continuous kernel of a positive integral operator. Hence, the kernel function is required to satisfy Mercer's condition [10,27]. A specific choice of kernel function implicitly determines the mapping $\phi$ and the feature space $\mathcal{F}$. The most widely used kernel functions include the Gaussian, polynomial, and sigmoid functions. In this study, the Gaussian kernel function is considered:
$$k(\mathbf{x}_i, \mathbf{x}_j) = \exp\left(-\frac{\|\mathbf{x}_i - \mathbf{x}_j\|^2}{c}\right),$$
where the parameter $c$ is the width of the Gaussian function. It plays a crucial role in process monitoring. In general, when $c$ becomes large, the robustness of the model increases whereas its sensitivity decreases; namely, false alarms decrease while missed alarms increase [28].
The residual of $\phi(\mathbf{x}_{\mathrm{new}})$ is represented as $\phi_r(\mathbf{x}_{\mathrm{new}}) = \phi(\mathbf{x}_{\mathrm{new}}) - \mathbf{P}\mathbf{t}_{\mathrm{new}}$, which cannot be calculated directly. Further, the two statistics $T^2$ and $Q$ can be calculated [3,19] as follows:
$$T^2 = \mathbf{t}_{\mathrm{new}}^T \Lambda^{-1} \mathbf{t}_{\mathrm{new}}, \qquad Q = \|\phi_r(\mathbf{x}_{\mathrm{new}})\|^2,$$
where $\Lambda = \frac{1}{n-1}\mathbf{T}^T\mathbf{T}$.
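The effect of the kernel width can be illustrated with a small sketch. It assumes the Gaussian kernel has the form $\exp(-\|\mathbf{x} - \mathbf{z}\|^2 / c)$; the exact scaling of $c$ is an assumption here, and the function name `gaussian_kernel` is hypothetical.

```python
import math

def gaussian_kernel(x, z, c):
    """k(x, z) = exp(-||x - z||^2 / c), with c > 0 the kernel width (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / c)

x = [0.0, 0.0]
z = [3.0, 4.0]                             # squared distance to x is 25

narrow = gaussian_kernel(x, z, c=10.0)     # exp(-2.5): the two points look dissimilar
wide = gaussian_kernel(x, z, c=1000.0)     # exp(-0.025): the two points look almost identical
print(narrow, wide)
```

With a large width, distinct samples map to nearly the same kernel value, which is consistent with the loss of sensitivity described above.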

T-KPLS Model for Nonlinear Data
KPLS divides the feature space $\mathcal{F}$ into two subspaces. One is the principal space, monitored by $T^2$, which reflects the major variation related to $\mathbf{Y}$. The other is the residual space, monitored by $Q$, which reflects the variation unrelated to $\mathbf{Y}$. However, the principal part $\hat{\Phi}$ contains variations that do not affect the output $\mathbf{Y}$ and are useless for predicting $\mathbf{Y}$. As for the residual part $\Phi_r$, because the objective of KPLS is to maximize the covariance between $\Phi$ and $\mathbf{Y}$, the variance of $\Phi$ is not extracted in descending order, so a later KPLS score may capture more variance in $\Phi$ than an earlier one. After the score vectors have been extracted, $\mathbf{Y}$ is best predicted, but the residual of $\Phi$ may still contain large variability. Therefore, it is not suitable to use the $Q$ statistic to monitor the residual part in KPLS. In this section, a T-KPLS model is proposed to improve the original KPLS model. Following that, the T-KPLS-based process monitoring strategy is established.

T-KPLS Model.
The T-KPLS model is a further decomposition of the KPLS model. It can be thought of as a postprocessing method that further decomposes $\hat{\Phi}$ and $\Phi_r$ in KPLS. The detailed algorithm for T-KPLS can be found in Algorithm 1.
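In step (4) of Algorithm 1, the loading matrix $\mathbf{P}_o$ consists of the eigenvectors, corresponding to the $A_o$ largest eigenvalues, of the covariance matrix of the part of $\hat{\Phi}$ that is orthogonal to $\mathbf{Y}$. In step (5), the loading matrix $\mathbf{P}_r$ consists of the eigenvectors of the covariance matrix of $\Phi_r$ corresponding to its $A_r$ largest eigenvalues [27]. As $\phi(\cdot)$ is unknown, Algorithm 1 cannot be implemented directly; the calculable steps are shown in Algorithm 2. In the T-KPLS model, $\Phi$ and $\mathbf{Y}$ are modeled as follows:
$$\Phi = \mathbf{T}_y\mathbf{P}_y^T + \mathbf{T}_o\mathbf{P}_o^T + \mathbf{T}_r\mathbf{P}_r^T + \tilde{\Phi}_r, \qquad \mathbf{Y} = \mathbf{T}_y\mathbf{Q}_y^T + \mathbf{Y}_r.$$
The meanings of the different sections of $\Phi$ are listed in Table 1. Compared with KPLS, T-KPLS describes $\Phi$ more clearly and is more suitable for monitoring the different parts of $\phi(\mathbf{x})$. T-KPLS does not change the prediction ability for $\mathbf{Y}$, but it decomposes $\Phi$ thoroughly under the supervision of $\mathbf{Y}$. $\mathbf{T}_y$ is the score of $\Phi_y$ and is completely related to $\mathbf{Y}$ in the original $\mathbf{T}$, whereas $\mathbf{T}_o$ is the score of $\Phi_o$ and is orthogonal to $\mathbf{Y}$ in the original $\mathbf{T}$. $\mathbf{T}_r$ is the main part of $\Phi_r$. $\tilde{\Phi}_r$ represents the residual of $\Phi$ and the noise. Note that in the T-KPLS model, all the scores $\mathbf{T}_y$, $\mathbf{T}_o$, and $\mathbf{T}_r$ have definite values. However, the loadings $\mathbf{P}_y$, $\mathbf{P}_o$, and $\mathbf{P}_r$ are unknown because the map function $\phi$ is unknown.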
In T-KPLS, the orthogonality among all score vectors holds. Meanwhile, $\mathbf{T}_o$ is orthogonal to the output $\mathbf{Y}$. The proof is omitted; one can refer to Zhou et al. [19].
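The orthogonality properties can be checked numerically with a small sketch. This is not the authors' Algorithm 2; it is an illustrative derivation that assumes T-KPLS applies the T-PLS postprocessing [19] to the KPLS scores through Gram matrices, with a single quality variable (so that $A_y = 1$ and the inner KPLS iteration collapses to one step). All function names, the data, and the kernel width are hypothetical.

```python
import numpy as np

def centered_gaussian_kernel(X, c):
    """Centered Gaussian kernel matrix of the training data (assumed 1/c scaling)."""
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T) / c)
    n = len(X)
    J = np.eye(n) - np.ones((n, n)) / n
    return J @ K @ J

def kpls_scores_single_output(K, y, A):
    """Unit-norm KPLS scores for one quality variable."""
    n = len(y)
    Kd, yd = K.copy(), y.astype(float).copy()
    T = np.zeros((n, A))
    for a in range(A):
        t = Kd @ yd
        t /= np.linalg.norm(t)
        T[:, a] = t
        D = np.eye(n) - np.outer(t, t)     # deflation projector
        Kd, yd = D @ Kd @ D, D @ yd
    return T

def top_scores(G, k):
    """PCA scores of a data block computed from its Gram matrix G."""
    vals, vecs = np.linalg.eigh(G)
    idx = np.argsort(vals)[::-1][:k]
    return vecs[:, idx] * np.sqrt(np.clip(vals[idx], 0.0, None))

def tkpls_scores(K, y, T, A_r):
    """Split the KPLS scores into y-related, y-orthogonal, and residual scores."""
    n, A = T.shape
    I = np.eye(n)
    T_y = (T @ (T.T @ y))[:, None]         # single output: T_y is the predicted y
    M = I - T_y @ np.linalg.pinv(T_y)      # projector that removes the T_y direction
    H_o = M @ T @ T.T                      # y-orthogonal part of the principal space
    T_o = top_scores(H_o @ K @ H_o.T, A - 1)
    H_r = I - T @ T.T                      # KPLS residual space
    T_r = top_scores(H_r @ K @ H_r, A_r)
    return T_y, T_o, T_r

rng = np.random.default_rng(1)
X = rng.normal(size=(40, 3))
y = np.sin(X[:, 0]) + X[:, 1] ** 2 + 0.05 * rng.normal(size=40)
y = y - y.mean()

K = centered_gaussian_kernel(X, c=10.0)
T = kpls_scores_single_output(K, y, A=3)
T_y, T_o, T_r = tkpls_scores(K, y, T, A_r=2)

# Cross-products between the score blocks, and between T_o and y.
print(np.abs(T_y.T @ T_o).max(), np.abs(T_y.T @ T_r).max(),
      np.abs(T_o.T @ T_r).max(), np.abs(T_o.T @ y).max())
```

All four printed values should be at round-off level, which matches the stated orthogonality among the score blocks and between $\mathbf{T}_o$ and the output.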

T-KPLS-Based Quality-Related Process Monitoring.
In multivariate statistical process monitoring, two types of statistics are widely used for fault detection. One is the $T^2$ statistic, which calculates the Mahalanobis distance between the new scores and the normal scores. The other is the $Q$ statistic, which represents the squared prediction error of the sample. For T-KPLS, similar statistics are constructed. After the T-KPLS model is built from normal historical data, the new scores and residuals are calculated from the new sample. Then, the statistics are constructed with the corresponding control limits for fault detection. According to the T-KPLS model, three score vectors, $\mathbf{t}_{y,\mathrm{new}}$, $\mathbf{t}_{o,\mathrm{new}}$, and $\mathbf{t}_{r,\mathrm{new}}$, are calculated for the new sample, as given in (12). Motivated by total PLS (T-PLS) based methods [19], four fault detection indices are constructed in Table 2. The expression of $Q_r$ can be calculated as follows:
$$Q_r = \|\tilde{\phi}_r(\mathbf{x}_{\mathrm{new}})\|^2 = \|\phi_r(\mathbf{x}_{\mathrm{new}}) - \mathbf{P}_r\mathbf{t}_{r,\mathrm{new}}\|^2.$$
The detailed expressions of (12) and $Q_r$ for calculation are shown in Appendix C.

The online procedure for a testing sample is sketched in Figure 2. The whole procedure involves four steps: the acquisition of the online measurement, the calculation of all scores for the new sample, the computation of the four detection indices, and the decision on quality-related fault detection.

Case Study on Simulation Examples
In this section, two detailed simulation examples are carried out to demonstrate the advantage of T-KPLS.

Simulation on a Numerical Nonlinear Example.
Firstly, a synthetic nonlinear numerical process without feedback is considered, as given in (14), where the $i$-th noise term follows $\mathcal{N}(0, 0.01^2)$ ($i = 1, 2, 3$), $v \sim \mathcal{N}(0, 0.05^2)$, and $\mathcal{N}(\mu, \sigma^2)$ denotes the normal distribution with mean $\mu$ and variance $\sigma^2$. From (14), it is obvious that an abnormal variation in $x_1$ causes disturbances in $x_3$ and $x_4$, while $x_2$ influences only $x_5$. As the quality variable $y$ relates merely to $x_1$, $x_3$, and $x_4$, a fault in $x_1$ will affect $y$, while a fault in $x_2$ will not.
We used 200 samples generated from the above process as a training dataset. A faulty dataset with 400 samples was also generated according to four predefined faults (Faults 1 to 4), each of which is a step bias or a ramp change; the fault parameter is the magnitude for a step bias and the slope for a ramp change, and the ramp grows with the sample number. The faulty measurements of variables $x_3$, $x_4$, and $x_5$ are then generated by (14).
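The two fault shapes can be sketched as follows. The function `inject_fault`, the onset index, and the convention that the ramp starts counting at the onset sample are assumptions made for illustration; the exact fault definitions of this case study are not reproduced here.

```python
def inject_fault(signal, kind, f, onset):
    """Return a copy of `signal` with a fault added from index `onset` onward.

    kind="step": constant bias of magnitude f.
    kind="ramp": drift with slope f per sample, counted from the onset (assumption).
    """
    if kind not in ("step", "ramp"):
        raise ValueError("kind must be 'step' or 'ramp'")
    faulty = list(signal)
    for k in range(onset, len(faulty)):
        if kind == "step":
            faulty[k] += f
        else:
            faulty[k] += f * (k - onset + 1)
    return faulty

normal = [0.0] * 6
step = inject_fault(normal, "step", 1.0, onset=3)
ramp = inject_fault(normal, "ramp", 0.005, onset=3)
print(step)
print(ramp)
```

A step fault is visible immediately, whereas a ramp fault with a small slope needs many samples before it exceeds the normal variation, which is why the two types stress a monitoring scheme differently.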
The training samples are used to build a KPLS model on $(\mathbf{X}, \mathbf{y})$. The width of the Gaussian kernel is kept at $c = 100$ for this simulation. The number of components $A = 2$ is determined by cross validation, which provides a good prediction of $\mathbf{y}$. Then the T-KPLS model is constructed based on KPLS, where $A_y = 1$ for the single output, and $A_r = 1$ is chosen as the number of principal components unrelated to $\mathbf{y}$.
According to the descriptions of Faults 1 and 2, they are quality-unrelated faults. With a fault magnitude of 1, the monitoring results of the KPLS model ($T^2$ and $Q$) for Fault 1 are plotted in Figure 3. Fault 1 causes significant alarms in both detection indices of KPLS. However, the alarms in the $T^2$ chart are false alarms, since they indicate a $\mathbf{y}$-related fault. Thus, KPLS-based monitoring causes false alarms for this disturbance. T-KPLS-based monitoring for Fault 1 is depicted in Figure 4. Among the four detection indices, $T_y^2$ stays under the control limit, which is the correct result, and one further $T^2$-type index and $Q_r$ raise only slight alarms. Compared with KPLS, T-KPLS provides lower false alarm rates for Fault 1. Similarly, the detection results for Fault 2 with a fault parameter of 0.005 using KPLS and T-KPLS are shown in Figures 5 and 6, respectively. The results for Fault 2 are similar to those for Fault 1. Table 3 lists the false alarm rates under different fault magnitudes. All simulations are repeated 100 times, and the mean values are reported.
From Table 3, it is clear that the T-KPLS-based method gives lower false alarm rates. The predefined Faults 3 and 4 are quality-related. For Fault 3 with a fault parameter of 0.6, the KPLS-based method can detect the fault, as shown in Figure 7. The T-KPLS-based method responds sensitively in two of its $T^2$-type indices and in $Q_r$, as shown in Figure 8. That is to say, the alarms in $T^2$ of KPLS are captured solely by $T_y^2$ of T-KPLS. Thus, for this kind of fault, when the step magnitude is small enough, T-KPLS will work better than KPLS. For the quality-related Fault 4 with a fault parameter of 0.005, the KPLS-based method cannot detect the quality-related fault by $T^2$, as shown in Figure 9, while the T-KPLS-based $Q_r$ statistic detects the fault sensitively in Figure 10. This means that the variations driving $\mathbf{y}$ to abnormality occur in the residual space. The simulation results for Faults 3 and 4 show that the T-KPLS-based policy can improve the detection rates. Moreover, Table 4 lists the detection results, which show that quality-related faults are detected better by T-KPLS using $T_y^2$ and $Q_r$.
Monitoring results based on independent component analysis (ICA) and PCA have been reported for TEP [26,30,33]. Also, PLS-based monitoring policies have been utilized for quality-related fault detection [30]. In [31], Chiang et al. compared fault detection and diagnosis methods such as PCA, PLS, and Fisher discriminant analysis (FDA) in a case study of TEP.
The TEP contains two blocks of variables: 12 manipulated variables and 41 measured variables. Process measurements are sampled with an interval of 3 min, while nineteen composition measurements are sampled with time delays that vary from 6 min to 15 min. The time delay has a potentially critical impact on product quality control in this process, because the closed-loop control works only when the next sample of the quality variable is available [21]. Thus, during this interval, the products are produced with uncontrolled quality. It also implies that the effect of a fault on product quality cannot be detected until the next measurement is sampled. PLS-based and KPLS-based monitoring methods can detect faults correlated to $\mathbf{Y}$ and have therefore found wide application in industrial cases. There are 21 predefined faults in TEP, of which 15 are known, denoted by IDV(1-15). IDV(1-7) are step changes in a process variable, for example, in the cooling water inlet temperature. IDV(8-12) are associated with an increase in the variability of some process variables. IDV(13) is a slow drift in the reaction kinetics. IDV(14-15) are associated with sticking valves [19,20].

T-KPLS-Based Quality-Related Detection for TEP.
In this case study, the component G in stream 9, that is, the 35th measured variable, is chosen as the output quality variable $\mathbf{y}$. The process variables $\mathbf{X}$ consist of measured variables 1-22 and manipulated variables 1-11. The detailed $\mathbf{X}$ and $\mathbf{y}$ are summarized by Li et al. [20]. We use 480 normal samples to build the KPLS and T-KPLS models. The selection of the kernel parameter $c$ affects the detection results for this process significantly. According to the simulation results, the larger $c$ is, the lower the false alarm rates and the higher the missed alarm rates will be. In this simulation, $c = 5000$ is chosen for the KPLS model. Eight principal components are kept according to cross validation. For T-KPLS, $A_y$ is set to 1 because of the single quality variable, and $A_o = A - A_y = 7$ and $A_r = 6$ are determined according to the KPCA-based method. TEP provides 21 faulty sample datasets, each of which consists of 960 samples. Here, we apply 13 known fault sets to perform our simulation. First of all, these known faults are divided into two groups, the quality-related faults and the quality-unrelated faults, using the criteria proposed by Zhou et al. [19]. Here, IDV(1, 2, 5, 6, 8, 12, 13) are related to the quality variable $\mathbf{y}$; the others are not. For comparison, the normal dataset is also included in this simulation. As illustrated in Figures 11 and 12, the proposed approach with $T_y^2$ and $Q_r$ can detect Fault 1 effectively, while showing few false alarms for the quality-unrelated Fault 3.
An alarm for a quality-related fault is considered an effective alarm, while a detection for a quality-unrelated fault is considered a false alarm. Tables 5 and 6 list the fault detection rates and false alarm rates of KPLS and T-KPLS. The detection results of T-PLS [19] are also cited in these two tables for comparison. From the detection results, it is observed that the T-KPLS-based method gives a higher detection rate and a lower false alarm rate than the KPLS-based method. Compared with linear T-PLS, T-KPLS performs better in most cases. In Table 5, T-KPLS has higher detection rates in most cases. Meanwhile, T-KPLS gives lower false alarm rates in most cases, as shown in Table 6. To sum up, T-KPLS is an improvement on KPLS, and it is effective in detecting quality-related faults in nonlinear processes.
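Both rates can be computed with the same counting rule, sketched below. The function `alarm_rate` and the toy numbers are hypothetical; for the TEP test sets, where the faults are introduced from the 160th sample, the onset would be index 159 under zero-based indexing (an assumption about indexing, not a statement from the tables).

```python
def alarm_rate(statistic, limit, onset):
    """Percentage of samples from index `onset` onward whose statistic exceeds `limit`.

    For a quality-related fault this is read as a detection rate; for a
    quality-unrelated fault the same number is read as a false alarm rate.
    """
    window = statistic[onset:]
    if not window:
        raise ValueError("no samples after the fault onset")
    alarms = sum(1 for s in window if s > limit)
    return 100.0 * alarms / len(window)

# Toy monitoring index with a fault starting at index 2 (illustrative values).
index_values = [0.5, 0.8, 1.2, 3.0, 0.9, 4.1, 5.0, 2.2]
rate = alarm_rate(index_values, limit=2.0, onset=2)
print(round(rate, 2))
```

Here four of the six samples after the onset exceed the limit, so the rate is about 66.67%.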

Application in Real Industrial Hot Strip Mill
The hot strip mill process (HSMP) is an extremely complex process in the iron and steel industry. A schematic layout of the hot strip mill is illustrated in Figure 13, corresponding to the real industrial hot strip mill. According to Figure 13, the process generally consists of the following units: reheating furnaces, roughing mill, transfer table, crop shear, finishing mill, run-out table cooling, and coiler. The finishing mill has the most significant influence on the final thickness of the steel strip; its controlled variables include the average gap of the 7 finishing mill stands and the work roll bending (WRB) force of the last 6 stands (the WRB force of the first stand is not measured). The temperature and thickness of the strip after finishing rolling are around 850-950 °C and 1.5-12.7 mm, respectively. As is well known from materials science, the kinetics of metallurgical transformations and the flow stress of the rolled steel strip are dominantly controlled by the temperature, which is mainly determined by the finishing temperature control (FTC).
The demand for dimensional precision, especially the thickness precision of the hot strip mill, has become stricter in recent years, which makes the improvement of thickness precision a hot topic. In general, the thickness at the exit of the finishing mill is closely related to the gap and the rolling force and has little connection with the bending force. In this paper, two classes of strip manufacturing processes with different thickness targets (one of which is 3.95 mm) are taken for this test; the process and quality variables are listed in Table 7. In this case study, three kinds of frequently occurring faults are studied, which are listed in Table 8; all faults have the same duration of 10 s and are terminated manually. In real circumstances, faults may occur in some driving units or in sensors measuring force, temperature, and gaps. Furthermore, a malfunction of the control loop in a single stand may also occur occasionally. To summarize, the three kinds of faults defined in control systems can all be found in the finishing mill process, and one typical fault of each type is chosen for this study. Among these faults, Fault 1 is only slightly quality-related; the others are directly quality-related. The Gaussian kernel parameter $c$ affects the detection results significantly. In this study, the T-KPLS model is built with $A = 8$, determined by cross validation, $A_y = 1$ because of the single output, and $A_r = 10$, obtained by the KPCA-based method. In the model, $c_{\min} = 0$ and $c_{\max} = 10000$ are chosen, which yield an optimum $c = 7500$.
The results of thickness quality-related process monitoring are given in Table 9. As shown in Table 9, compared with PLS, KPLS, and T-PLS, the T-KPLS-based method gives only a low false alarm rate for the quality-unrelated Fault 1, while for the quality-related Faults 2 and 3 it presents higher detection rates, especially for Fault 3. In conclusion, T-KPLS is an appropriate enhancement of the typical KPLS model, and it is effective in dealing with quality-related disturbances in real industrial processes.
Regarding HSMP, the following should be noted.
Remark 1. The data for the finishing mill process were acquired from a real steel plant, namely, Ansteel Corporation, China. The faults occurred occasionally and were eliminated manually.
Remark 2. In this implementation, only thickness has been considered as the quality variable, although the T-KPLS model can handle multioutput cases.

Conclusion
In this paper, the T-KPLS algorithm is proposed by further decomposing KPLS. The purpose of T-KPLS is to perform a further decomposition of the high-dimensional space induced by KPLS, which is more suitable for quality-related process monitoring. Process monitoring methods based on T-KPLS are developed to monitor the operating performance. Both the theoretical analysis and the simulation results show that T-KPLS performs better than KPLS. T-KPLS-based methods give lower false alarm rates and missed alarm rates than KPLS-based methods in most simulated cases. However, some problems still need to be considered in modeling with T-KPLS, such as how to select an appropriate kernel function for given process data and how to establish a framework for precisely choosing the kernel parameters. These issues are beyond the scope of this paper and will be addressed in future studies.

A. KPLS Algorithm
The nonlinear iterative KPLS algorithm is shown in Algorithm 3.
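A minimal sketch of a nonlinear iterative KPLS routine is given below. It follows the commonly used kernel NIPALS formulation attributed to Rosipal and Trejo [10] with unit-norm scores and is not a transcription of Algorithm 3; the function name `kpls`, the data, and the kernel width are hypothetical. The last lines check numerically that the scores can be rebuilt from kernel quantities as $\mathbf{T} = \mathbf{K}\mathbf{U}(\mathbf{T}^T\mathbf{K}\mathbf{U})^{-1}$, which is the kernel form of $\mathbf{T} = \Phi\mathbf{R}$ with $\mathbf{R} = \Phi^T\mathbf{U}(\mathbf{T}^T\mathbf{K}\mathbf{U})^{-1}$.

```python
import numpy as np

def kpls(K, Y, A, max_iter=500, tol=1e-12):
    """Iterative KPLS on a centered kernel matrix K (n x n) and centered Y (n x l)."""
    n = K.shape[0]
    Kd, Yd = K.copy(), Y.astype(float).copy()
    T = np.zeros((n, A))
    U = np.zeros((n, A))
    for a in range(A):
        u = Yd[:, 0] / np.linalg.norm(Yd[:, 0])    # start from a column of Y
        for _ in range(max_iter):
            t = Kd @ u
            t /= np.linalg.norm(t)
            q = Yd.T @ t                           # Y-loading for the current score
            u_new = Yd @ q
            u_new /= np.linalg.norm(u_new)
            converged = np.linalg.norm(u_new - u) < tol
            u = u_new
            if converged:
                break
        t = Kd @ u                                 # score consistent with the final u
        t /= np.linalg.norm(t)
        T[:, a], U[:, a] = t, u
        D = np.eye(n) - np.outer(t, t)             # deflate K and Y by the new score
        Kd, Yd = D @ Kd @ D, D @ Yd
    return T, U

rng = np.random.default_rng(0)
n = 30
X = rng.normal(size=(n, 4))
Y = np.column_stack([np.sin(X[:, 0]) + X[:, 1] ** 2, X[:, 2] * X[:, 3]])
Y = Y - Y.mean(axis=0)

sq = np.sum(X ** 2, axis=1)
K_raw = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T) / 20.0)
J = np.eye(n) - np.ones((n, n)) / n
K = J @ K_raw @ J                                  # centered Gaussian kernel

T, U = kpls(K, Y, A=3)
T_from_kernel = K @ U @ np.linalg.inv(T.T @ K @ U)
print(np.allclose(T.T @ T, np.eye(3)), np.allclose(T_from_kernel, T))
```

Because the scores are unit-norm and mutually orthogonal, the loadings reduce to $\mathbf{P} = \Phi^T\mathbf{T}$ and $\mathbf{Q} = \mathbf{Y}^T\mathbf{T}$ under this formulation.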

B. The Proof of T = ΦR
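According to the KPLS algorithm in Algorithm 3, the following equations hold: [...] To sum up, [...] Then, [...]
The $Q_r$ statistic for T-KPLS is as follows: [...]
Algorithm 3, step (4): $\mathbf{u}_i = \mathbf{Y}_i\mathbf{q}_i$, where $\mathbf{q}_i = \mathbf{Y}_i^T\mathbf{t}_i$.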

3.3. Model Implementation. Implementation of the T-KPLS-based quality-related detection scheme involves an offline training model and an online testing model. As sketched in Figure 1, the training model uses historical process and quality data to obtain the parameters of the T-KPLS model that are needed by the testing model. When all parameters are available, the testing model processes each new sample as sketched in Figure 2.

Figure 2: Flowchart of testing model for T-KPLS-based monitoring.

Figure 13: Schematic layout of the hot strip mill.
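Usually, the $T^2$ and $Q$ statistics are used in KPLS-based monitoring, where $T^2$ is for quality-related faults and $Q$ for quality-unrelated faults. Given a new sample, the score $\mathbf{t}_{\mathrm{new}}$ of $\phi(\mathbf{x}_{\mathrm{new}})$ can be calculated as [13]
$$\mathbf{t}_{\mathrm{new}} = \mathbf{R}^T\phi(\mathbf{x}_{\mathrm{new}}).$$
Two kinds of control limits are given, respectively: $\frac{A(n^2-1)}{n(n-A)}F_{A,n-A,\alpha}$ for $T^2$ and $g\chi^2_{h,\alpha}$ for $Q$. Here $F_{A,n-A}$ is the $F$-distribution with $A$ and $n-A$ degrees of freedom, and $g\chi^2_h$ is the $\chi^2$-distribution with scaling factor $g$ and $h$ degrees of freedom [13]. Although $\phi(\mathbf{x}_{\mathrm{new}})$ is unavailable, $Q$ can be calculated by the kernel trick as follows:
$$Q = \phi_r^T(\mathbf{x}_{\mathrm{new}})\phi_r(\mathbf{x}_{\mathrm{new}}) = \phi^T(\mathbf{x}_{\mathrm{new}})\phi(\mathbf{x}_{\mathrm{new}}) - 2\mathbf{K}_{\mathrm{new}}^T\mathbf{T}\mathbf{t}_{\mathrm{new}} + \mathbf{t}_{\mathrm{new}}^T\mathbf{T}^T\mathbf{K}\mathbf{T}\mathbf{t}_{\mathrm{new}}. \tag{8}$$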
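The scaled chi-square limit for a $Q$-type statistic can be sketched with the standard library alone. Two assumptions are made here and are not taken from the text: $g$ and $h$ are obtained by matching the mean and variance of the training $Q$ values ($g = s/(2\mu)$, $h = 2\mu^2/s$), and the chi-square quantile is approximated by the Wilson-Hilferty formula instead of an exact routine. The function names and the sample values are hypothetical.

```python
from statistics import NormalDist, mean, variance

def chi2_quantile_wh(h, alpha):
    """Wilson-Hilferty approximation of the alpha-quantile of chi-square with h dof."""
    z = NormalDist().inv_cdf(alpha)
    return h * (1.0 - 2.0 / (9.0 * h) + z * (2.0 / (9.0 * h)) ** 0.5) ** 3

def q_control_limit(q_train, alpha=0.99):
    """Control limit g * chi2_{h, alpha} with g and h from moment matching (assumption)."""
    mu, s = mean(q_train), variance(q_train)
    g = s / (2.0 * mu)            # scaling factor
    h = 2.0 * mu ** 2 / s         # degrees of freedom, so that g * h equals the mean
    return g * chi2_quantile_wh(h, alpha), g, h

limit, g, h = q_control_limit([1.0, 2.0, 3.0, 4.0, 5.0])
print(limit, g, h)
```

A new sample would then raise an alarm when its $Q$ value exceeds `limit`; the $T^2$ limit additionally needs an $F$-distribution quantile, which is not covered by this sketch.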

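Table 1: Meaning of different sections of $\Phi$.
$\Phi_y$: the $\mathbf{Y}$-related part of $\Phi$, which is responsible for predicting $\mathbf{Y}$.
$\Phi_o$: the part of $\Phi$ that is orthogonal to $\mathbf{Y}$ in the original $\mathbf{T}$ of KPLS.
$\mathbf{T}_r\mathbf{P}_r^T$: the principal part of $\Phi_r$, which represents a large variation in $\Phi_r$.
$\tilde{\Phi}_r$: the residual part, which is not excited in $\Phi$.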

Table 2: Monitoring statistics and control limits.

Table 3: False alarm rates of faults unrelated to y (%).

Table 4: Fault detection rates of faults related to y (%).
4.2. Simulation on Tennessee Eastman Process
4.2.1. Tennessee Eastman Process. The Tennessee Eastman (TE) process, provided by Eastman Chemical Company, is a realistic industrial process for evaluating different methods, and its datasets can be openly downloaded. The faults in the test dataset are introduced from the 160th sample. The TE process has been used as a benchmark process for evaluating process monitoring methods. Kano et al. applied a PCA-based method for monitoring this process [32]. Russell et al. compared canonical variate analysis (CVA) and PCA-based technologies, while Lee et al. reviewed the results using both independent component analysis (ICA) and PCA.

Table 7: Process and quality variables in finishing mill.

Table 8: Typical faults in finishing mill.

Table 9: Detection rate or false alarm rate for hot strip mill (%).
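$$\|\tilde{\phi}_r(\mathbf{x}_{\mathrm{new}})\|^2 = \|\phi_r(\mathbf{x}_{\mathrm{new}}) - \mathbf{P}_r\mathbf{t}_{r,\mathrm{new}}\|^2 = \phi_r^T(\mathbf{x}_{\mathrm{new}})\phi_r(\mathbf{x}_{\mathrm{new}}) - 2\phi_r^T(\mathbf{x}_{\mathrm{new}})\mathbf{P}_r\mathbf{t}_{r,\mathrm{new}} + \mathbf{t}_{r,\mathrm{new}}^T\mathbf{P}_r^T\mathbf{P}_r\mathbf{t}_{r,\mathrm{new}}.$$
The first part of $Q_r$ is detailed in (8). The second part is
$$\begin{aligned}
\phi_r^T(\mathbf{x}_{\mathrm{new}})\mathbf{P}_r\mathbf{t}_{r,\mathrm{new}} &= (\phi(\mathbf{x}_{\mathrm{new}}) - \mathbf{P}\mathbf{t}_{\mathrm{new}})^T\Phi_r^T\mathbf{W}_r\mathbf{t}_{r,\mathrm{new}} \\
&= \phi^T(\mathbf{x}_{\mathrm{new}})\Phi_r^T\mathbf{W}_r\mathbf{t}_{r,\mathrm{new}} - \mathbf{t}_{\mathrm{new}}^T\mathbf{P}^T\Phi_r^T\mathbf{W}_r\mathbf{t}_{r,\mathrm{new}} \\
&= \phi^T(\mathbf{x}_{\mathrm{new}})\Phi^T(\mathbf{I} - \mathbf{T}\mathbf{T}^T)\mathbf{W}_r\mathbf{t}_{r,\mathrm{new}} - \mathbf{t}_{\mathrm{new}}^T\mathbf{T}^T\Phi\Phi^T(\mathbf{I} - \mathbf{T}\mathbf{T}^T)\mathbf{W}_r\mathbf{t}_{r,\mathrm{new}} \\
&= \mathbf{K}_{\mathrm{new}}^T(\mathbf{I} - \mathbf{T}\mathbf{T}^T)\mathbf{W}_r\mathbf{t}_{r,\mathrm{new}} - \mathbf{t}_{\mathrm{new}}^T\mathbf{T}^T\mathbf{K}(\mathbf{I} - \mathbf{T}\mathbf{T}^T)\mathbf{W}_r\mathbf{t}_{r,\mathrm{new}}.
\end{aligned}$$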