Focusing on quality-related performance monitoring of complex industrial processes, this paper proposes a novel multimode process monitoring method. First, clustering is carried out in the principal component space under the guidance of quality variables; by extracting the model tags, the clustering information of the original training data is obtained. Second, in view of the multimode characteristics of process data, a monitoring model that integrates the Gaussian mixture model with total projection to latent structures is built on a covariance description form. The resulting multimode total projection to latent structures (MTPLS) model is the foundation for quality-related monitoring of multimode processes. Then, a comprehensive statistical index is defined, based on the Bayesian posterior probability that a monitored sample belongs to each Gaussian component. After that, a combined index is constructed for process monitoring. Finally, motivated by the traditional contribution plot used in fault diagnosis, a gradient contribution rate is applied to analyze how the variable contribution rates vary along samples. The method enables online fault monitoring and diagnosis for multimode processes. The performance of the whole scheme is verified on a real industrial hot strip mill process (HSMP) and compared with some existing methods.
As modern industrial processes grow increasingly large and complex, preventive monitoring and fault diagnosis have become key to ensuring safe operation, improving product quality, and gaining economic benefits. Because complex industrial systems have complicated operating mechanisms, sheer size, complex conditions, chaotic environments, and vague boundary conditions, effective process monitoring is difficult to implement. As a result, data-driven process monitoring has become a research hotspot in the field of fault diagnosis. Its core idea is to build a data model from historical data, mining useful information and extracting the features of normal and faulty operating modes, so as to realize process monitoring. In the last decades, basic multivariate statistical monitoring techniques, such as principal component analysis (PCA) and partial least squares (PLS), have been established and successfully applied in practice [
However, PCA and PLS models are built on the basic assumption that the data follow a single, stable Gaussian mode. Owing to fluctuations in raw materials, changes in product specifications, and differences among batches, process data in actual industrial processes, especially batch processes, exhibit multimode characteristics. For such multimode processes, traditional fault detection methods and their improved variants are difficult to apply directly; if they are applied anyway, the monitoring performance of the data model degrades.
Many researchers have studied these problems and made some progress [
Despite its unique advantages in dealing with non-Gaussian data, the Gaussian mixture model (GMM) was not explored in multimode process monitoring until recently. Choi et al. integrated PCA and DA with GMM to detect and isolate the faults in a process with nonlinearity, multistates, or dynamics [
The main contribution of this paper is summarized as follows.
The remainder of this paper is organized as follows. In Section
Principal component analysis model is one of the most basic projection models in multivariate statistical analysis. Let
The PCA loading matrix
Based on the projection model, monitoring statistics indexes
When the residual error is subject to normal distribution, Jackson and Mudholkar pointed out that the control limit can be calculated as follows:
Similarly, in order to apply the sample covariance information into the monitoring index, the principal component covariance matrix can be expressed as
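As an aside on the residual control limit attributed above to Jackson and Mudholkar, the following is a minimal sketch of the commonly used form of that approximation, not the authors' implementation. The function names `spe` and `spe_control_limit`, the default confidence level of 0.99, and the example eigenvalues are assumptions introduced only for illustration.

```python
from statistics import NormalDist


def spe(residual):
    """Squared prediction error (Q statistic): squared norm of the residual."""
    return sum(e * e for e in residual)


def spe_control_limit(residual_eigenvalues, alpha=0.99):
    """Jackson-Mudholkar approximation of the SPE control limit.

    residual_eigenvalues: eigenvalues of the sample covariance matrix that
        were NOT retained in the principal component subspace.
    alpha: confidence level (0.99 is an assumed, illustrative default).
    """
    theta1 = sum(residual_eigenvalues)
    theta2 = sum(lam ** 2 for lam in residual_eigenvalues)
    theta3 = sum(lam ** 3 for lam in residual_eigenvalues)
    h0 = 1.0 - 2.0 * theta1 * theta3 / (3.0 * theta2 ** 2)
    # Standard normal quantile at the chosen confidence level.
    c_alpha = NormalDist().inv_cdf(alpha)
    term = (
        c_alpha * (2.0 * theta2 * h0 ** 2) ** 0.5 / theta1
        + 1.0
        + theta2 * h0 * (h0 - 1.0) / theta1 ** 2
    )
    return theta1 * term ** (1.0 / h0)


if __name__ == "__main__":
    discarded = [0.5, 0.3, 0.2]            # hypothetical residual eigenvalues
    limit = spe_control_limit(discarded)
    sample_residual = [0.4, -0.3, 0.1]     # hypothetical residual vector
    print(spe(sample_residual) > limit)    # True would indicate an alarm
```

A sample is flagged when its SPE exceeds the limit; raising `alpha` raises the limit and therefore lowers the alarm rate under normal operation.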
In the actual industrial production, the changes of quality variables
PLS decomposition of
Parameter matrix
According to the iterative process of the PLS model, Peng et al. proposed a model construction method using data covariance information [
Different from the PCA projection model, the decomposition structure of space
Similar to PCA model monitoring, the monitoring sample statistics can be constructed by using the covariance matrix of the above formula as follows:
The control limit of the residual statistic can be calculated as follows:
PLS algorithm uses two variable spaces to describe process change. However, the main component of samples contains the part which is orthogonal to
By further decomposition, we can model
At the same time, based on the structure of PLS projection, Li et al. also performed a detailed analysis of the space structure of TPLS and drew a good conclusion [
For a new measurement of sample
Compared with PLS, the TPLS model is easier to interpret and better suited to process monitoring. Similar to PLS in monitoring strategy, TPLS uses two statistic indexes
The four spaces in TPLS can get a more detailed description of the different relationships between
Combining with the covariance description form of PCA and PLS model, we can do space decomposition in the following form.
In PCA decomposition of
Similarly, in PCA decomposition of
According to the score and the residual structure model of new measurement samples, let
It can be easily proved that this form is equivalent to the standard one.
The following part shows the calculation process of TPLS model using covariance information.
(1) Use GMM-PLS algorithm, and obtain parameter matrix:
(2) Calculate PCA decomposition of
(3) Calculate PCA decomposition of
(4) Calculate PCA decomposition of
Because industrial process data exhibit multimode characteristics, we first need to determine a mixture model from historical data and then design a monitoring framework. Since the statistical model requires covariance information, the multimode modeling data can be processed by a GMM, which assumes that the data are generated by a mixture of different Gaussian distributions. That is, for any sample data
According to the rule of Bayes inference, the posterior probability of
However, owing to factors such as production flow, batch, and specification, the quality variables of the final products differ to some degree in real production processes. This may be the root cause of the multimode and multistage features of process data. Therefore, since the PLS algorithm performs its space decomposition under the guidance of quality variables, this paper first performs mode division with principal component space
Based on advantages of GMM in processing multimode problems, we deal with principal components matrix
After mode division, a GMM-based model of the principal component space is established, in which each Gaussian component corresponds to a different mode. For training samples,
Taking process variables
Assuming that variable
Divide
As shown above, mode classification is carried out under the guidance of quality variables. Moreover, because the number of principal components is far smaller than the number of process variables, far fewer parameters need to be estimated. In addition, once the original training data have been divided into modes, multimode information such as covariance matrices can be calculated directly, which reduces the amount of computation and improves calculation accuracy.
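The Bayes-rule mode division described in this section can be illustrated with a small sketch. It assumes, purely for brevity, a one-dimensional principal component score and mixture parameters that have already been estimated (for example by EM); the function names and the numerical values are hypothetical and are not taken from the paper.

```python
import math


def posterior_probabilities(t, weights, means, variances):
    """Posterior P(C_k | t) of each Gaussian component for a scalar score t.

    Implements Bayes' rule, w_k * g(t | C_k) / sum_j w_j * g(t | C_j),
    evaluated in the log domain so that distant samples do not underflow.
    """
    log_joint = [
        math.log(w) - 0.5 * math.log(2.0 * math.pi * v) - 0.5 * (t - m) ** 2 / v
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(log_joint)
    unnorm = [math.exp(lj - peak) for lj in log_joint]
    total = sum(unnorm)
    return [u / total for u in unnorm]


def assign_modes(scores, weights, means, variances):
    """Label each score with the index of its most probable component."""
    labels = []
    for t in scores:
        post = posterior_probabilities(t, weights, means, variances)
        labels.append(max(range(len(post)), key=post.__getitem__))
    return labels


if __name__ == "__main__":
    # Hypothetical two-mode mixture in the score space.
    weights, means, variances = [0.5, 0.5], [-2.0, 3.0], [1.0, 1.0]
    scores = [-2.1, 2.9, -1.5]
    print(assign_modes(scores, weights, means, variances))
```

The labels obtained in the score space can then be used to split the original training samples by mode, so that per-mode means and covariance matrices are computed directly from each subset.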
According to the principle of building PLS and TPLS, the essence is to use data information, variance, and covariance to represent process characteristics. As far as PLS is concerned, the modeling process is to maximize the covariance of linear combinations of process variables and quality variables, so the modeling process can be converted into a covariance form through the initial data
Based on the above analysis, the training data can be divided rationally to obtain the multimode information used in fault monitoring. When a new sample is collected for monitoring, it is assigned to the corresponding models probabilistically, using Bayesian classification after data pretreatment. Then, we can calculate the monitoring statistic of the sample to judge which mode it belongs to. We treat the posterior probability that the monitored sample belongs to each Gaussian component as its membership degree in the corresponding model.
By using data information of probability
For a new monitoring sample
According to PCA decomposition of
Available by
In the same way, the covariance matrices of principal components in spaces
In order to realize the multimode fault monitoring, the monitoring index based on the MTPLS model is obtained by using the probability information and Bayesian inference:
Similarly,
The threshold can be inferred by the setting in standard TPLS.
In summary, MTPLS relies mainly on covariance information for its calculations and thereby achieves process monitoring. Compared with standard TPLS, the covariance model is better suited to monitoring multimode processes and makes fuller use of data information during model construction and fault monitoring. Because it avoids direct classification of the data, the covariance model reduces the effect of classification on the final monitoring performance.
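One common way to fold posterior probabilities into a single monitoring index is a posterior-weighted sum of per-mode statistics. The sketch below shows that idea only as an assumption; the paper's exact MTPLS index may be defined differently. The helper names, the diagonal-covariance simplification, and all numbers are hypothetical.

```python
def local_t2(score, mean, variances):
    """T^2-type statistic of a score vector for one mode.

    Assumes a diagonal score covariance (a simplification for illustration),
    so the statistic is a sum of squared, variance-scaled deviations.
    """
    return sum((s - m) ** 2 / v for s, m, v in zip(score, mean, variances))


def posterior_weighted_index(local_stats, posteriors):
    """Combine per-mode statistics using posterior probabilities as weights."""
    if len(local_stats) != len(posteriors):
        raise ValueError("one posterior probability is needed per mode")
    return sum(p * s for p, s in zip(posteriors, local_stats))


if __name__ == "__main__":
    score = [1.0, 2.0]                                  # hypothetical sample scores
    modes = [
        {"mean": [0.0, 0.0], "var": [1.0, 4.0]},        # hypothetical mode 1
        {"mean": [5.0, 5.0], "var": [1.0, 1.0]},        # hypothetical mode 2
    ]
    posteriors = [0.9, 0.1]                             # e.g. from Bayes' rule
    stats = [local_t2(score, m["mean"], m["var"]) for m in modes]
    index = posterior_weighted_index(stats, posteriors)
    limit = 9.21                                        # assumed threshold value
    print(index, index > limit)
```

Because every mode contributes in proportion to its membership degree, the sample never has to be hard-assigned to a single mode before the index is evaluated.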
In TPLS based process monitoring, space
Scale factor
After a fault is detected, the faulty variables need to be isolated. As a common fault isolation method, the contribution plot assumes that the variables contributing most to the monitoring statistics are the most likely to be faulty. According to the framework of complete decomposition of contributions proposed by Alcala and Qin, the contribution to the combined index can be described in the following form [
Traditional contribution plot method is used for analyzing a specific sample when the fault is detected, which shows the contribution value of each variable to one monitoring index in bar chart. After that, the variables with greater contribution will be selected as the possible cause of fault. Westerhuis et al. put forward a generalized contribution to statistics form and a method of obtaining the control limits for variable contributions [
Fault diagnosis based on the traditional contribution plot examines a single sample after the fault has occurred, so it cannot describe well the fault source or the changes that the source induces in other variables. To combine the analysis of the contribution rate of faulty variables along the time axis with the change of the variables themselves, while reducing the impact of variable magnitude on the contribution rate, we adopt the gradient contribution rate for faulty-variable analysis.
First, we introduce a mathematical symbol
It can be seen from the first-order Taylor series expansion of
Based on the above conclusion, the contribution rate may be defined as follows.
For a monitoring sample
As described above, the contribution rate represents the gradient of each variable to detection index under the same abnormal changes. Variables which are with great contribution will be considered with great influence to index
For a new monitoring sample
As a result, the gradient contribution rate based on comprehensive monitoring index
Owing to the diffusion effect of faults, setting absolute control limits on the absolute value of variable contributions is not effective for fault diagnosis. Therefore, we use the relative contribution rate; namely,
As described above, in index
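To make the idea of a gradient-based relative contribution concrete, the following sketch assumes a quadratic monitoring index D(x) = xᵀMx with a symmetric matrix M, whose gradient is 2Mx, and normalizes the absolute gradient components so that they sum to one. This normalization, the function names, and the example matrix are assumptions made for illustration; the paper's own definition may differ in detail.

```python
def index_gradient(M, x):
    """Gradient of D(x) = x^T M x for a symmetric matrix M, i.e. 2 M x."""
    n = len(x)
    return [2.0 * sum(M[i][j] * x[j] for j in range(n)) for i in range(n)]


def relative_contribution(M, x):
    """Share of each variable in the total absolute gradient of the index."""
    grad = index_gradient(M, x)
    total = sum(abs(g) for g in grad)
    if total == 0.0:
        # Zero gradient (e.g. x at the origin): no variable stands out.
        return [0.0] * len(grad)
    return [abs(g) / total for g in grad]


def contribution_along_samples(M, samples):
    """Relative contribution rates for consecutive samples (one row each)."""
    return [relative_contribution(M, x) for x in samples]


if __name__ == "__main__":
    M = [[1.0, 0.0], [0.0, 1.0]]                 # hypothetical index matrix
    samples = [[0.1, 0.1], [1.0, 0.1], [3.0, 1.0]]
    for rates in contribution_along_samples(M, samples):
        print(rates)
```

Plotting the rows of `contribution_along_samples` against the sample number shows how the dominant variable shifts over time, which is the kind of view a single-sample bar chart cannot provide.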
The schematic diagram of the proposed process monitoring and diagnosis is shown in Figure
(1) Collect a set of historical training data under all possible operating modes and determine the number of modes.
(2) Use EM algorithm to learn the Gaussian mixture model of principal component space and estimate the model parameter set
(3) Do multimode division and multimode information acquisition of process data according to
(4) Calculate local monitoring statistics for the monitored sample
(5) Integrate the quality-related monitoring statistics into a quality-related combined index
(6) Specify a confidence level
(7) Detect the abnormal operating condition at the monitored samples satisfying
(8) Calculate the relative contribution rate of variables to the combined index
Schematic diagram of the proposed PLS-MTPLS and Bayesian-based process monitoring and diagnosis method.
The hot strip mill process (HSMP) is an extremely complex industrial production process, in which improving product quality brings higher economic and social benefits for the factory. A typical hot strip mill production line consists mainly of a reheating furnace, roughing mill, transfer table, crop shear, finishing mill group, run-out table cooling, and coiler. Figure
The schematic of HSMP.
As shown in Figure
The structure of mill stand.
In the whole FMP, the stands do not work independently but are coupled with each other through different control schemes. The thickness at the exit of the last stand is the key factor that directly affects product quality. The whole finishing mill process is controlled by an automatic thickness control system, whose control of the exit thickness has an obvious lag: the thickness control system starts only after an abnormal exit thickness, caused by some fault in the front stands, has been detected. Therefore, it is very meaningful to establish the relationship between the process variables and the exit thickness in real time and then to monitor the thickness through real-time measurement of the process variables [
Production specifications in HSMP are determined by the thickness of the steel strip, which must meet different industrial demands. We select steel plate data of two specifications for modeling: one with a thickness of 2.70 mm and the other with a thickness of 3.95 mm. The sampling interval is 0.01 s, and 4000 samples are used for training.
In the actual finishing mill process, we can collect data including roll gap, milling force between working rolls, and bending force in every stand. Generally speaking, the exit thickness is more strongly related to roll gap and milling force than to bending force. Using data collected under normal operating conditions, GMM iterative learning is performed in the principal component space obtained under the guidance of quality variables. With the mode division result, the multimode parameters of the original data are then calculated. Figure
Principal components clustering distribution where
Among the different types of faults that may occur in FMP, we select three encountered faults as detection cases in this section, which are listed in Table
Faults that occurred in FMP.
Fault number | Fault description | Duration | Type |
---|---|---|---|
1 | Malfunction of gap control loop in | 20 s–30 s | Quality-related |
2 | Fault of roll bending force measuring sensor in | 10 s–20 s | Quality-unrelated |
3 | 10% stiction of the cooling valve between | 10 s–20 s | Quality-related |
According to the exit thickness value of the strip steel under three types of fault condition, it is obvious that fault 1 and fault 3 are quality-related, while fault 2 is quality-unrelated. As Figures
To examine the advantages of the proposed approach, a comparative study is carried out using two evaluation indexes: the fault detection rate (FDR) and the false alarm rate (FAR).
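For reference, the two evaluation indexes can be computed from per-sample alarm flags as in the sketch below. It uses the usual definitions (FDR as the fraction of faulty samples that raise an alarm, FAR as the fraction of samples that should not be alarmed but are); the function name and the example flags are hypothetical, and how the paper counts samples for each fault type is not restated here.

```python
def detection_rates(alarms, should_alarm):
    """Return (FDR, FAR) from per-sample boolean flags.

    alarms[i]       : True if the monitoring index exceeded its limit.
    should_alarm[i] : True if sample i ought to be flagged (ground truth).
    A rate is reported as NaN when its denominator is zero.
    """
    if len(alarms) != len(should_alarm):
        raise ValueError("flag sequences must have the same length")
    n_pos = sum(1 for s in should_alarm if s)
    n_neg = len(should_alarm) - n_pos
    detected = sum(1 for a, s in zip(alarms, should_alarm) if a and s)
    false_alarms = sum(1 for a, s in zip(alarms, should_alarm) if a and not s)
    fdr = detected / n_pos if n_pos else float("nan")
    far = false_alarms / n_neg if n_neg else float("nan")
    return fdr, far


if __name__ == "__main__":
    # Hypothetical flags for six monitored samples.
    alarms = [False, True, True, True, False, True]
    truth = [False, False, True, True, True, True]
    print(detection_rates(alarms, truth))
```

A good quality-related monitoring scheme pushes FDR toward 1 for quality-related faults while keeping FAR close to 0.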
Fault detection rates and false alarm rates are counted for the three types of fault, and the statistical results are shown in Table
Detection performance comparison.
Fault number | Type | MPLS ( | TPLS ( | MTPLS ( | MTPLS ( | PLS-MTPLS ( | PLS-MTPLS ( |
---|---|---|---|---|---|---|---|
1 | FDR | 0.7968 | 0.8977 | 0.8620 | 0.9995 | 0.9435 | 0.9995 |
2 | FAR | 0.3357 | 0.4335 | 0.1335 | 0.2610 | 0.0080 | 0.2605 |
3 | FDR | 0.7780 | 0.8510 | 0.8848 | 0.9729 | 0.9092 | 0.9735 |
Fault 1 represents the failure of hydraulic roll gap control structure. Fault occurs at about 20 s, namely, the 2000th monitoring sample. The values of roll gap
Curve of fault source
Detection result of fault 1.
Figure
Diagnosis result of fault 1.
Fault 2 represents the fault of sampling value of bending force in
Detection result of fault 2.
Detection result of fault 3.
Fault 3 is a kind of fault in cooling valve between
Diagnosis result of fault 3.
In this section, we focus on the research of exit thickness of the strip. Twenty variables among measured variables in FMP were selected for building PLS-MTPLS model. Based on it, a kind of comprehensive monitoring index and a kind of relative contribution rate were established for fault monitoring and diagnosis, respectively, for three common faults. Results of monitoring and diagnosis verified that PLS-MTPLS has higher FDR and lower FAR than traditional multivariate statistics methods shown in Table
In this paper, a new PLS-MTPLS method is proposed on the basis of covariance descriptions of PCA and PLS algorithm for multimode process monitoring. After mode division of quality-related principal components, multimode information is embedded into the monitoring model by integrating GMM with TPLS, which avoids the direct use of process training data for modeling. Based on the quality-related multimode monitoring model PLS-MTPLS, a kind of comprehensive monitoring index is applied to execute real-time online monitoring. Then, a combined index is constructed for improving monitoring efficiency and extended to fault diagnosis by relative gradient contribution rate calculation.
The efficiency and superiority of PLS-MTPLS are demonstrated through application to the monitoring of HSMP. As can be seen from the comparison and analysis, the proposed approach can reduce computational complexity and be more suitable for multimode processes.
The authors declare that there are no competing interests regarding the publication of this paper.
This work is supported by the National Natural Science Foundation of China (NSFC) under Grants 61473033 and 61673032 and by Beijing Natural Science Foundation (4142035), China.