Process Monitoring and Fault Diagnosis for Piercing Production of Seamless Tube

With the development of modernization, seamless tubes have come into widespread use. As the first step in seamless tube production, piercing is vital for tube quality: it transforms the solid round billet into a hollow shell, and defects in the hollow shell cannot be removed in subsequent processes. A monitoring model for the quality of the hollow shell is therefore important. However, the piercing process is very complicated, and it is difficult to build a mechanism model relating the quality of the hollow shell to the measured variables, so an intelligent model is needed. We established two monitoring and fault diagnosis models for the piercing process, based on the multiway principal component analysis (MPCA) model and the multistage MPCA model, respectively, and compared the two. For the multistage model, we divided the process into periods in three ways: based on the process, by K-means clustering, and by a genetic algorithm (GA). Simulation experiments show that the multistage MPCA method has an advantage over the MPCA method and that the model based on GA division can monitor the process effectively and detect the faults.


Introduction
Seamless steel pipe is used in every aspect of life, for instance, in bicycles, aerospace, and boilers. At the same time, higher pipe quality is demanded. Piercing is the first step of the production process, and the quality of the hollow shell directly affects the quality of the pipe. Piercing can be divided into three categories: pressure piercing, punch piercing, and cross (oblique) piercing. Pressure piercing relies on a press; punch piercing depends on a push bar machine; cross piercing is a piercing technology invented by the German Mannesmann brothers in 1885, in which a solid round steel billet is pierced into a seamless steel pipe in a single procedure, and it occupies an important position in seamless tube production.
We now introduce the piercing process, taking the Diescher mill of the Baosteel Tube Branch as an example. As shown in Figure 1, the Diescher skew rolling piercer is composed primarily of two rollers, two guide wheels, and a plug. During piercing, the roller is the main driving outer tool, and its body is usually divided into three sections: the entrance cone, the exit cone, and the rolling zone. The entrance cone drags the billet into the piercing zone and pierces it. The exit cone reduces the wall of the capillary (the hollow shell), flattens the capillary surface, evens out the shell wall thickness, and rounds the capillary. The rolling zone is the transition between the entrance cone and the exit cone. The guide wheels are the fixed outer tools: they guide the billet and the capillary, stabilize the rolling line, close the groove ring, and limit the lateral deformation of the capillary. The plug is the inner piercing tool; it is held at a fixed axial position in the piercing zone by the support of the top rod. Experiments have shown that, during piercing, the external diameter of the workpiece changes little, while the inner diameter is controlled mainly by the plug, which enlarges it from zero to the required value. It can be concluded that the piercing roller, guide wheel, and plug play important roles in the deformation during cross piercing.
Some problems still exist in the piercing process, such as folding, cracks, chain belts, and unequal thickness. These defects are very difficult to eliminate later in tube rolling and may even result in more serious faults. Therefore, we need to establish a monitoring and fault diagnosis model for the piercing process.
Mathematical Problems in Engineering
The piercing process is a typical batch process with multiperiod and dynamic multivariable features, so the monitoring and diagnosis model can be established with batch process statistical methods. Models based on measurement data are mainly established by principal component analysis (PCA), partial least squares (PLS), and other projection methods. Nomikos and MacGregor proposed batch process monitoring and quality prediction algorithms [1][2][3] based on multiway principal component analysis (MPCA) and multiway partial least squares (MPLS), which were significant for data-based statistical modeling, online monitoring, fault diagnosis, and quality prediction and control of batch processes. However, this traditional modeling method has inherent limitations for batch processes. Because a batch process is multiperiod, it can be divided into several periods, and a statistical analysis model can then be established on each subperiod for process monitoring and fault diagnosis. Methods for dividing the process into periods mainly include the knowledge-based method, which is applicable when the mechanism of the practical operation is well understood; the characteristic analysis method, which extracts signal features; and the automatic identification method, which identifies periods automatically by an algorithm, such as a clustering algorithm or the MP algorithm. The two-period MPCA algorithm [4] proposed by Kosanovich et al. and the multistage algorithm proposed by Dong and McAvoy [5] were the first steps in exploring the concept of periods. Subsequently, the team of the British professors Martin and Morris [6,7] carried out statistical analysis based on the concept of "group," and Lennox et al. [8] studied local modeling methods. Duchesne and MacGregor [9] proposed a pathway multiblock PLS algorithm that introduces intermediate quality measurements to build a multiblock PLS model for analyzing the local effects of the process trajectory on quality. In actual processes, however, intermediate quality measurements are difficult to obtain, which hinders wide application of the algorithm. The idea of local modeling by "group" or "block" is in fact a version of the concept of period. Ündey and Çinar [10] proposed the concept of periods directly and established a process monitoring and quality prediction model based on subperiods.
The previous work applied the subperiod approach and showed that each period has different underlying characteristics. However, although subperiods rather than the whole process were studied, the traditional statistical analysis methods were applied directly, so the defects of the traditional modeling method were essentially not resolved. To overcome these disadvantages and make the best use of period characteristics, Lu et al. [11], recognizing that MPCA/MPLS had laid an important foundation for multivariate statistical analysis, online monitoring, fault diagnosis, and even quality control of batch processes, proposed a period division algorithm for batch processes and further developed process monitoring and quality prediction algorithms [11][12][13][14] based on subperiod PCA/PLS models. Zhao [15] divided the process into several stages using a soft measurement method to establish a multistage MPCA model for online monitoring, and the model was successfully applied to the injection molding process.
GA [16] is a biological evolution model that simulates Darwinian genetic selection and natural elimination through selection, crossover, and mutation. It is a global optimization algorithm that has been widely used in batch processes; for example, Leardi [17] used GA and an improved algorithm to optimize the feed rate of a penicillin fermentation process and achieved good results.
In this paper, we establish two different models. First, we establish a capillary quality monitoring and fault diagnosis model based on the MPCA model. Then, we build a multistage MPCA monitoring model that exploits the multiperiod nature of the batch process, dividing the periods by three methods: process-based division, K-means clustering, and GA optimization. The simulation results show that the multistage MPCA performs better and that the division based on the GA optimization algorithm detects the faults more clearly.

Monitoring and Fault Diagnosis of Multivariate Statistical Process
2.1. Data Characteristics of Batch Process. Batch processes are repetitive production processes. The corresponding data sets have one more dimension than continuous production process data sets. We can use a three-dimensional data matrix $X(I \times J \times K)$, as shown in Figure 2, where the three dimensions $I$, $J$, and $K$, respectively, represent the number of batches, the number of process variables, and the number of measuring points in each operation. The three-dimensional data matrix must be unfolded into a two-dimensional matrix before modeling. There are six ways of unfolding, as shown in Figure 3.
The methods for unfolding three-dimensional data and a comparison of the corresponding process monitoring methods are discussed in [18][19][20]. The expansion method used by the MPCA model preserves the information of the first dimension and puts the vertical slices $X_k(I \times J)$ side by side to the right, resulting in $X(I \times KJ)$. Here, we unfold the $X(I \times J \times K)$ matrix by this expansion method and standardize $X(I \times KJ)$ according to formula (1).
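The unfolding and standardization steps can be illustrated with a minimal NumPy sketch. It rests on assumptions: synthetic random data stand in for the measured batches, the function names are hypothetical, and column-wise scaling to zero mean and unit variance across batches is assumed to be what formula (1) expresses.

```python
import numpy as np

def unfold_batchwise(X3):
    """Unfold X(I x J x K) into X(I x KJ): K time slices (I x J) side by side."""
    I, J, K = X3.shape
    # Put time on the middle axis so each block of J columns is one time slice.
    return X3.transpose(0, 2, 1).reshape(I, K * J)

def standardize(X2):
    """Scale every column to zero mean and unit variance across batches (assumed form)."""
    mean = X2.mean(axis=0)
    std = X2.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant columns
    return (X2 - mean) / std, mean, std

# Synthetic stand-in for the measured data (I = 20, J = 24, K = 500).
rng = np.random.default_rng(0)
X3 = rng.normal(size=(20, 24, 500))
X2 = unfold_batchwise(X3)
Xs, mean, std = standardize(X2)
```

With this layout, columns `k*J` to `(k+1)*J - 1` of the unfolded matrix hold the time slice at sampling point `k`, which makes later per-slice modeling a simple column selection.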

Model of MPCA Algorithm. After $X(I \times J \times K)$ is expanded into $X(I \times KJ)$, normalized, and cut into $K$ vertical slices $X_k(I \times J)$, each slice is decomposed by singular value decomposition, giving $K$ load matrixes $P_k$, score matrixes $T_k$, and eigenvalue vectors $\mathrm{Egen}_k$. The load matrix and the eigenvalues are defined by formula (2), and the number of principal components $A$ is determined from the cumulative contribution rate according to formula (3). The load matrix of the principal component space, of size $J \times A$, consists of the first $A$ columns of the load matrix $P_k$.
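The per-slice decomposition can be sketched as follows. This is an illustrative sketch rather than the authors' implementation: the function name `slice_pca` and the 85% cumulative contribution threshold are assumptions, and the slice is assumed to be column-centered already.

```python
import numpy as np

def slice_pca(Xk, cum_rate=0.85):
    """PCA of one centered time slice Xk (I x J) by singular value decomposition.

    Returns the loadings P (J x A), the scores T (I x A), and the first A
    eigenvalues, where A is the smallest number of components whose
    cumulative contribution rate reaches cum_rate (assumed threshold).
    """
    I = Xk.shape[0]
    U, s, Vt = np.linalg.svd(Xk, full_matrices=False)
    eigvals = s**2 / (I - 1)                  # eigenvalues of the sample covariance
    ratio = np.cumsum(eigvals) / eigvals.sum()
    A = min(int(np.searchsorted(ratio, cum_rate)) + 1, len(s))
    P = Vt[:A].T                              # first A columns of the load matrix
    T = Xk @ P                                # scores
    return P, T, eigvals[:A]

# Synthetic centered slice with I = 20 batches and J = 24 variables.
rng = np.random.default_rng(1)
Xk = rng.normal(size=(20, 24))
Xk -= Xk.mean(axis=0)
P, T, eig = slice_pca(Xk)
```

Running `slice_pca` once per time slice gives the $K$ load matrixes used by the monitoring model.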
The principal component analysis model for process monitoring is shown in formula (4). In current monitoring and fault diagnosis, faults are usually detected by judging whether the statistics $T^2$ and SPE exceed their control limits. $T^2$ and SPE are defined by formula (5). The control limit of $T^2$ approximately obeys the $F$ distribution, where $I$ is the number of samples used to build the model, $A$ is the number of principal components, and $\alpha$ is the significance level.
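The two statistics can be computed for a single standardized sample as sketched below. The sketch assumes the usual definitions, $T^2 = \sum_a t_a^2/\lambda_a$ and SPE as the squared norm of the residual; since the formulas themselves are not reproduced in this text, both the definitions and the function name `t2_spe` should be read as assumptions.

```python
import numpy as np

def t2_spe(x, P, eigvals):
    """T^2 and SPE of one standardized sample x (length J), loadings P (J x A)."""
    t = P.T @ x                          # scores in the principal component space
    t2 = float(np.sum(t**2 / eigvals))   # variance-weighted distance of the scores
    residual = x - P @ t                 # part of x the model does not explain
    spe = float(residual @ residual)     # squared prediction error
    return t2, spe

# Tiny hand-checkable case: two components aligned with the first two axes.
P_demo = np.eye(3)[:, :2]
t2_demo, spe_demo = t2_spe(np.array([2.0, 1.0, 3.0]), P_demo, np.array([4.0, 1.0]))
```

In the demo, the scores are (2, 1), so $T^2 = 4/4 + 1/1 = 2$, and the residual is (0, 0, 3), so SPE = 9.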
In the residual subspace, $\mathrm{SPE}_k$ of the MPCA model approximately obeys a weighted $\chi^2$ distribution at time $k$, $g\chi^2_h$, where $g$ is a constant and $h$ is the degrees of freedom of the $\chi^2$ distribution; both are determined from the mean and variance of the squared prediction error at time $k$. However, because the MPCA algorithm regards a whole batch of data as a single entity, it is insensitive to small faults. The batch process clearly has multiperiod characteristics: the piercing process has three periods, namely, the first unstable piercing period, the stable piercing period, and the second unstable piercing period. The characteristic variables of different subperiods are not the same. Therefore, we propose a subperiod MPCA for fault monitoring and fault diagnosis of the batch process.
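For reference, the control limits of the two statistics are often written in the following form in the batch monitoring literature. The expressions are given as an assumption, since the corresponding formulas are not reproduced in this text; $\mu_k$ and $\sigma_k^2$ are notation introduced here for the sample mean and variance of the squared prediction error at time $k$ over the normal batches.

```latex
T^2_{\alpha} = \frac{A\,(I^2 - 1)}{I\,(I - A)}\; F_{\alpha}(A,\; I - A),
\qquad
\mathrm{SPE}_{k,\alpha} = g_k\, \chi^2_{\alpha}(h_k),
\qquad
g_k = \frac{\sigma_k^2}{2\mu_k},
\qquad
h_k = \frac{2\mu_k^2}{\sigma_k^2}
```

Matching the first two moments in this way means that $g_k$ and $h_k$ can be estimated directly from the normal batches at each sampling time.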

Model of Subperiod MPCA Algorithm.
Dividing the process into periods is the first problem for the subperiod MPCA model. We adopt three methods: division based on the process, K-means clustering, and GA optimization.
(1) The piercing process includes three periods: the first unstable period, in which the metal at the front of the billet gradually fills the piercing zone (i.e., from the billet touching the roller until the metal emerges from the piercing zone) and which includes the first bite and the second bite; the stable period, which is the main stage of the piercing process, from the front of the billet filling the piercing zone until the rear of the billet starts to leave it; and the second unstable period, in which the rear metal gradually leaves the piercing zone until the end. The monitoring model is built in accordance with these three periods.
(2) The idea of the K-means clustering algorithm is that similarity within a cluster is large and similarity between different clusters is small. The algorithm is summarized as follows:
(a) Select the initial cluster centers from the data objects.
(b) Calculate the distance between each object and each center, and assign each object to the nearest center.
(c) Calculate the mean value of each cluster and take it as the new center.
(d) Repeat (b) and (c) until the clusters no longer change.
(3) GA is a global optimization algorithm that simulates the genetic mechanism of nature. It regards a binary or decimal chromosome as a candidate solution of the problem. The basic genetic operations are selection, crossover, and mutation. The chromosome population is initialized first and converges toward the optimal solution through the basic genetic operations. In general, several issues must be settled before using GA, such as the "chromosome" coding method [21], the parameter settings of the genetic algorithm, and the choice of fitness function.
Here, we need to optimize the periods of the piercing process; that is, we want to determine the optimal lengths of periods 1, 2, and 3, which is an integer optimization problem with constraints; thus, we choose decimal coding.
For the parameter settings of the GA [22], we note the following. The population size is generally chosen as 30∼160, because a good solution is hard to obtain with a small population and convergence is slow with a large one. The crossover probability may be too small for the search to move forward or so large that it breaks up structures with high fitness values; a usual choice is 0.25∼0.75. Similarly, the mutation rate is commonly chosen as 0.01∼0.2: too small a rate fails to produce new genes, while too large a rate turns the genetic algorithm into a random search.
The selection of the fitness function is also very important. We select the false positive rate as the fitness function and search for its minimum value.
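The GA-based search for the period boundaries can be sketched as below. This is a minimal illustration, not the authors' implementation: the function `ga_divide`, the tournament selection, the crossover and mutation operators, and the elitism step are all assumptions, and the toy fitness function is only a stand-in for the false positive rate, which in practice would be evaluated by building the multistage model for each candidate division.

```python
import random

def ga_divide(fitness, K, pop_size=30, generations=100, pc=0.5, pm=0.1, seed=0):
    """Search two integer cut points 1 <= c1 < c2 < K that minimize fitness."""
    rng = random.Random(seed)

    def random_individual():
        return tuple(sorted(rng.sample(range(1, K), 2)))

    def repair(c1, c2):
        # Enforce the constraint c1 < c2 with both cut points inside [1, K-1].
        c1 = min(max(c1, 1), K - 1)
        c2 = min(max(c2, 1), K - 1)
        if c1 == c2:
            return random_individual()
        return (min(c1, c2), max(c1, c2))

    pop = [random_individual() for _ in range(pop_size)]
    best = min(pop, key=fitness)
    for _ in range(generations):
        # Selection: binary tournament, lower fitness wins.
        parents = [min(rng.sample(pop, 2), key=fitness) for _ in range(pop_size)]
        children = []
        for a, b in zip(parents[::2], parents[1::2]):
            if rng.random() < pc:            # crossover: exchange the second cut point
                a, b = (a[0], b[1]), (b[0], a[1])
            children += [a, b]
        pop = []
        for c1, c2 in children:              # mutation: small random shift of a cut point
            if rng.random() < pm:
                c1 += rng.randint(-10, 10)
            if rng.random() < pm:
                c2 += rng.randint(-10, 10)
            pop.append(repair(c1, c2))
        pop[0] = best                        # elitism: keep the best division so far
        best = min(pop, key=fitness)
    return best

# Toy stand-in fitness with a known minimum at (100, 400); NOT a false positive rate.
toy_fitness = lambda c: abs(c[0] - 100) + abs(c[1] - 400)
best_cut = ga_divide(toy_fitness, K=500)
```

Two cut points fully determine three period lengths, so the chromosome only needs two decimal genes. The default crossover and mutation probabilities are picked from inside the ranges quoted above.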

Fault Diagnosis.
When the $T^2$ and SPE indexes exceed their control limits, a fault has occurred in the production process, but its origin is unknown. The fault information can be obtained from the contribution plot and used to diagnose the origin of the fault.
Contribution Plot of $T^2$. According to formula (8), the contribution of the $a$th principal component to $T^2$ is calculated first. Then, according to formula (10), the contribution of the $m$th variable to the $a$th principal component is calculated from the loading $p_{m,a}$.
Contribution Plot of SPE. This is much simpler: the contribution of the $m$th variable to SPE is calculated directly. In the contribution chart, the process variables with prominent contributions provide valuable information for fault analysis when there is a fault.
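The contribution calculations can be illustrated with the sketch below. It uses definitions that are common in the contribution plot literature: $t_a^2/\lambda_a$ for a principal component, $(t_a/\lambda_a)\,p_{m,a}\,x_m$ for a variable within a component, and the squared residual for SPE. These are assumptions here, because the formulas are not reproduced in this text, and the function names are hypothetical.

```python
import numpy as np

def t2_pc_contributions(x, P, eigvals):
    """Contribution of each principal component to T^2: t_a^2 / lambda_a."""
    t = P.T @ x
    return t**2 / eigvals

def t2_variable_contributions(x, P, eigvals, a):
    """Contribution of each variable to component a: (t_a / lambda_a) * p_{m,a} * x_m."""
    t_a = P[:, a] @ x
    return (t_a / eigvals[a]) * P[:, a] * x

def spe_contributions(x, P):
    """Contribution of each variable to SPE: its squared residual."""
    residual = x - P @ (P.T @ x)
    return residual**2

# Hand-checkable case: two components aligned with the first two axes.
P_demo = np.eye(3)[:, :2]
lam_demo = np.array([4.0, 1.0])
x_demo = np.array([2.0, 1.0, 3.0])
pc_c = t2_pc_contributions(x_demo, P_demo, lam_demo)
var_c = t2_variable_contributions(x_demo, P_demo, lam_demo, a=0)
spe_c = spe_contributions(x_demo, P_demo)
```

With these definitions, the variable contributions to a component sum to that component's contribution to $T^2$, and the SPE contributions sum to SPE, so a bar chart of either vector shows directly which variable dominates.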

The MPCA Model and Multistage MPCA Model for Piercing Process

Data Acquisition
(1) We collect 20 batches of data under normal production conditions by using the ibaAnalyzer software and get a three-dimensional data matrix $X(I \times J \times K)$. The three dimensions, respectively, represent the number of batches, the number of process variables, and the number of measuring points in each operation ($I = 20$, $J = 24$, $K = 500$).
(2) We cut the three-dimensional matrix along the third dimension and get $K$ vertical slices $X_k(I \times J)$. By decomposing the $K$ two-dimensional time-slice matrices, we get $K$ load matrixes $P_k$ ($k = 1, 2, 3, \ldots, K$). The principal component analysis model widely used in the process monitoring field is the one given in formula (4).

The On-Monitoring Model
In current monitoring and fault diagnosis, faults are usually detected by judging whether the statistics $T^2$ and SPE exceed their limits. The control limit of $T^2$ approximately obeys the $F$ distribution, as shown in formula (5). In the residual subspace, $\mathrm{SPE}_k$ of the MPCA model approximately obeys the $\chi^2$ distribution at time $k$, as shown in formula (6).
We calculate $T^2$ and SPE and judge whether each index exceeds its limit. We collect a normal batch of data and obtain the monitoring plot shown in Figure 4.
Even under normal operating conditions, the SPE exceeds its limit. Of the three stages of the piercing process (the first unstable stage, the stable stage, and the second unstable stage), the two unstable stages generate large deviations because of abrupt changes in current and speed. A single MPCA model therefore produces large monitoring errors, so it is necessary to build a subperiod MPCA model for piercing.

Model Based on Process Division.
The piercing process includes three periods: the first unstable period, the stable period, and the second unstable period. The division follows these three periods. The monitoring plot is shown in Figure 5, which indicates that the result is still not good enough.
From Figure 5, it can be seen that the monitoring performance is greatly improved compared with the single MPCA model, but there are still false positives. Thus, we cannot simply divide by process stage. Because the feature information varies across operation periods in a batch process, the division method needs to be improved, so we next try the clustering algorithm.

Model Based on 𝐾-Means Division.
The idea of the K-means clustering algorithm is that the similarity within a cluster is much greater than that between different clusters. We obtain three stages by the K-means clustering algorithm and get the monitoring plot shown in Figure 6.
It can be seen from Figure 6 that the result of K-means clustering is still not ideal. The K-means clustering algorithm is easy to implement and efficient, but its parameters, such as the number of clusters, the initial cluster centers, the similarity measure, and the distance matrix, are all hard to determine. There is no consistent standard for selecting them, so a globally optimal division cannot be guaranteed, and another optimization algorithm is needed.
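The clustering step can be sketched with a plain NumPy implementation of K-means. The sketch is illustrative only: the function name, the random choice of initial centers, and the use of Euclidean distance are assumptions, and the rows of `data` stand for whatever feature vector is extracted at each sampling time, which this text does not specify.

```python
import numpy as np

def kmeans(data, n_clusters, n_iter=100, seed=0):
    """Plain K-means on the rows of data (n x d); returns labels and centers."""
    rng = np.random.default_rng(seed)
    # Pick the initial centers from the data objects.
    centers = data[rng.choice(len(data), n_clusters, replace=False)]
    labels = None
    for _ in range(n_iter):
        # Assign each object to the nearest center (Euclidean distance).
        dist = np.linalg.norm(data[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dist.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break                          # clusters no longer change
        labels = new_labels
        # Recompute each center as the mean of its cluster.
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = data[labels == c].mean(axis=0)
    return labels, centers

# Synthetic stand-in features: three groups of ten two-dimensional points.
rng = np.random.default_rng(1)
demo = np.vstack([rng.normal(m, 0.1, size=(10, 2)) for m in (0.0, 5.0, 10.0)])
demo_labels, demo_centers = kmeans(demo, 3)
```

Because the result depends on the randomly chosen initial centers, different seeds can give different divisions, which is one of the limitations noted above.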
The genetic algorithm (GA) is known to have advantages in nonlinear optimization and to work well in batch processes [17], so we choose GA to optimize the division.

Division by Applying GA Optimization. The genetic algorithm (GA) simulates the evolution of an artificial population, which converges toward the optimal state over generations of selection, crossover, and mutation. In this paper, we choose the false positive rate as the objective function, use decimal coding, and set the population size to 30, the number of iterations to 100, and the chromosome length to 2. With the GA-based division, as shown in Figure 7, the monitoring performance is greatly improved.
It turns out that the segmentation obtained by the GA optimization algorithm is better. From the four monitoring models we can conclude that (1) the traditional MPCA model cannot adapt to the batch process for monitoring, so a division into periods is necessary; (2) a simple division by process cannot capture the changes in characteristics across the batch periods; (3) the K-means clustering division has limitations and cannot reach the optimal solution; and (4) the GA optimization algorithm, used to optimize the segmentation, succeeds.

Model for Monitoring and Fault Diagnosis Based on Multiperiod MPCA
Two kinds of faults are introduced:
Fault 1: roller speed fault; from the 250th to the 271st sampling time, the roller speed is 0.
Fault 2: guide plate current fault; from the 250th to the 271st sampling time, the current is 0.
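The two simulated faults can be reproduced on a test batch as sketched below. The sketch assumes that a batch is stored as a $K \times J$ array, that roller speed and guide plate current are the 1st and 23rd variables, as stated in the diagnosis sections of this paper, and that synthetic random data stand in for a real normal batch; the function name is hypothetical.

```python
import numpy as np

def inject_zero_fault(batch, variable, start, end):
    """Copy of batch (K x J) with one variable set to 0 from sampling time
    start to end (both 1-based and inclusive)."""
    faulty = batch.copy()
    faulty[start - 1:end, variable - 1] = 0.0
    return faulty

# Synthetic stand-in for one normal batch (K = 500 samples, J = 24 variables).
rng = np.random.default_rng(0)
normal_batch = rng.normal(loc=5.0, size=(500, 24))
fault1 = inject_zero_fault(normal_batch, variable=1, start=250, end=271)   # roller speed
fault2 = inject_zero_fault(normal_batch, variable=23, start=250, end=271)  # guide plate current
```

Each fault affects 22 consecutive samples of a single variable and leaves the rest of the batch unchanged, so any alarm can be attributed to that variable.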

Fault Diagnosis for the 1st Variable (Roller Speed). The monitoring plot is shown in Figure 8.
From Figure 8, we can see that the SPE index rises at the 250th sampling time and falls at the 272nd sampling time; that is, the SPE monitoring plot shows an obvious alarm, which the $T^2$ monitoring plot does not. The multistage MPCA model can thus detect the fault quickly and accurately. For comparison, the $T^2$ contribution plots are still drawn together with the SPE contribution plots. To diagnose the cause of the fault, we draw the principal component contribution plots, $T^2$ contribution plots, and SPE contribution plots at the 160th, 265th, and 460th sampling times.
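Reading the alarm interval off a monitoring sequence can be automated with a small helper such as the one below. It is a hypothetical utility, not part of the original method; `stat` and `limit` are assumed to be equal-length sequences holding the statistic and its control limit at each sampling time.

```python
def alarm_intervals(stat, limit):
    """Return (first, last) 1-based sampling times of each run where stat > limit."""
    intervals, start = [], None
    for k, (s, lim) in enumerate(zip(stat, limit), start=1):
        if s > lim and start is None:
            start = k                          # alarm begins
        elif s <= lim and start is not None:
            intervals.append((start, k - 1))   # alarm ended at the previous sample
            start = None
    if start is not None:                      # alarm still active at the end
        intervals.append((start, len(stat)))
    return intervals

# Synthetic SPE sequence that exceeds a constant limit from sample 250 to 271.
demo_spe = [1.0] * 500
for k in range(250, 272):
    demo_spe[k - 1] = 10.0
demo_alarms = alarm_intervals(demo_spe, [5.0] * 500)
```

For this synthetic sequence, the helper reports one interval covering samples 250 to 271, which matches a statistic that rises at the 250th sample and falls back at the 272nd.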
From Figure 9, we see that the contribution rate of each principal component varies at different moments, and we analyze the contribution rate of each variable to the largest principal component shown in Figure 9. The contribution rates of the variables to the largest principal component and to SPE are plotted in Figures 10 and 11.
From Figures 10 and 11, it can be seen that the first variable (roller speed) has a much larger contribution rate to the abnormal principal component vector at the fault moment than the other variables. The statistical analysis thus leads to the diagnosis that the abrupt change of the upper roller speed is the main cause of the fault. The experimental results also show that, although the variable contribution rate to the abnormal principal component vector can diagnose the fault, its monitoring effect is not obvious. Therefore, the $T^2$ contribution plots can serve as a reference, while the fault can be clearly monitored and diagnosed by the SPE contribution.

Fault Diagnosis for the 23rd Variable (Guide Plate Current). The monitoring plot is shown in Figure 12.
Similarly, the multistage MPCA model can quickly and accurately detect the fault. The principal component contribution plot is shown in Figure 13.
The $T^2$ contribution plots and SPE contribution plots at the 160th, 265th, and 460th sampling times are shown in Figures 14 and 15, respectively.
From Figures 14 and 15, it can be seen that the 23rd variable (guide plate current) has a much larger contribution rate to the abnormal principal component vector at the fault moment. The statistical analysis thus leads to the diagnosis that the abrupt change of the guide plate current is the main cause of the fault.
The above fault diagnosis results verify the effectiveness of the proposed methods.

Conclusions
With the increasing demand for steel pipe, the quality of steel tube production becomes more and more important. In this paper, we build effective models for monitoring and fault diagnosis of capillary quality, which can reveal faults in the capillary production process in time and thus avoid greater failures and losses. The first step is to establish a monitoring and fault diagnosis model based on the traditional MPCA; evaluation of its results shows that a division into periods is needed. We then discuss methods for dividing the piercing process and build multistage models with three different methods: division based on process, the K-means algorithm, and the GA optimization algorithm. Finally, we show that the GA optimization gives good results.

Figure 4: $T^2$ and SPE plots for MPCA monitoring results of the normal piercing process.

Figure 5: $T^2$ and SPE plots for three-stage MPCA monitoring results of the normal process based on process division.

Figure 6: $T^2$ and SPE plots for three-stage MPCA monitoring results of the normal process based on K-means division.

Figure 7: $T^2$ and SPE plots for three-stage MPCA monitoring results of the normal process based on GA division.