With the continuous development of the manufacturing industry, requirements for strip steel quality keep rising in automobile manufacturing, mechanical processing, and the electronic and electrical industries. Precise control of strip quality depends to a certain extent on its accurate prediction. However, the data collected by the many sensors on a complex strip production line and generated by the computer control system are high-dimensional, highly coupled, and nonlinear, which makes strip quality difficult to predict. The continuous generation of massive data on the production line also forces steel enterprises to seek new data mining methods that exploit the relationships among sensor data to predict and control strip quality. To solve these problems, this paper proposes a GBDBN-ELM model, which is more efficient and more accurate than the other algorithms compared. In this model, the RBM in the DBN is replaced with a GBRBM, so that the network no longer depends on the binary distribution, can handle continuous values, and retains more data features. To address the long training time of the DBN, the BP network in the DBN is replaced with an ELM regression model. The ELM predicts strip quality from the extracted abstract features, thereby improving prediction accuracy and shortening training time. In this paper, the GBDBN-ELM model is compared with the BP neural network, ELM, and DBN, and root mean square error,
In the industrial field, the steel industry is one of the national basic industries. Most of the raw materials, resources, and equipment of other industries are provided by the steel industry. The development of the steel industry has also led to the progress of construction, machinery, transportation, and other industries. Although the current international steel production is increasing year by year, the technology for rolling high-quality steel still needs to be improved. With the rapid development of industry and technology, many industries have higher and higher requirements for the quality of strip steel, such as infrastructure engineering, automobile manufacturing, mechanical processing, and electronic and electrical industries. Therefore, the improvement of strip quality has become one of the main tasks of the hot rolling production process. The strip quality can be estimated in advance through prediction, and then, the process parameters can be adjusted in time through computer calculations to achieve closed-loop control of the system, which can maximize the strip quality. Therefore, the strip quality prediction method has gradually become a hot spot in the steel industry.
The traditional rolling mill control relies on manual operation; the strip quality at the exit is controlled by simple electric pressing or manual pressing, without the participation of many sensors. The steel industry has bid farewell to traditional production modes with the extensive application and development of modern automatic control theory in the industrial field. The combination of modern equipment and advanced technology has made the strip production process increasingly complex [
The quality of the strip at the exit of the hot continuous rolling line mainly depends on the finishing mill. Changes in strip width and thickness are caused by the rolling forces of the vertical and horizontal stands in finishing rolling. The main factors affecting strip quality include rolling force, reduction position, inlet temperature, roll bending force, roll gap width, and stand speed. Moreover, factors such as water flow, motor current, oil film compensation, and lubrication also have a certain impact on the surface quality of the strip [
Moreover, with the development of technology, production has extended from physical space to virtual space, and the degree of digitalization has gradually deepened. More sensors, data acquisition equipment, and computer network control systems are involved in the production process, and a large amount of raw data is produced on the strip production line every day. How to use these data reasonably and mine more knowledge for strip quality prediction and control is also a problem that needs to be studied.
In order to achieve closed-loop control of strip quality and advance adjustment of process parameters, and to address the poor prediction accuracy caused by high-dimensional, highly coupled, nonlinear data, this paper makes the following main contributions.
This paper proposes a strip quality prediction model combining DBN and ELM. In the combined model, DBN is used to extract features from high-dimensional and high coupling input data, and ELM predicts strip quality according to the extracted data features.
Based on the DBN-ELM combined model, the RBM in the DBN model is replaced by GBRBM to remove the dependence of the visible and hidden layers of the RBM on the binary distribution. The resulting GBDBN-ELM model suits continuous value regression problems.
The feasibility of the model is analyzed from the aspects of prediction accuracy and model performance, and the prediction effect of the model is compared with that of BP, ELM, and DBN. The results show that the GBDBN-ELM model can improve the prediction accuracy while shortening the model training time.
The rest of this article is organized as follows: the second section reviews the research progress of strip quality prediction technology; the third section introduces the principle, network structure, and training method of the strip quality prediction model; the fourth section validates the model on data from the production line of a steel company; the last section summarizes this article.
Quality prediction and quality control problems often use two types of methods, mathematical model methods and data mining techniques.
In traditional quality control, the mathematical model is used to predict the quality parameters, and the variables such as temperature, pressure, element, and their relationship are described by mathematical formulas [
On the other hand, the growing mass of data has drawn more attention to data mining methods centered on machine learning and deep learning [
Data mining techniques can be divided into two parts: machine learning and deep learning. Kotkunde et al. used artificial neural networks (ANN) and support vector machines (SVM) to evaluate the thickness distribution of alloy sheets at various temperatures and blank diameters [
The above studies rely purely on conventional machine learning, which handles high-dimensional inputs poorly. Before applying a machine learning method, such studies usually need to select data features to reduce the dimension of the input parameters [
Liu et al. used DBN to process a large amount of real-time quality data collected by sensors and constructed a real-time quality monitoring and diagnosis plan for the manufacturing process [
A large number of studies show that, because deep belief networks were originally designed for classification, they are now mainly applied to classification tasks in manufacturing quality problems, such as defect identification and quality classification, and less often to regression tasks such as quality parameter prediction. However, because the deep belief network has strong high-dimensional feature extraction capabilities and good generalization, it can be adapted to continuous value prediction while maintaining its feature extraction capabilities. For example, in existing research, DBN is combined with Particle Swarm Optimization (PSO) [
Considering the complexity of the DBN network structure, this paper chooses the combination of DBN and ELM to simplify the DBN training method and shorten the training time while improving the prediction accuracy.
The deep belief network (DBN) is one of the core algorithms in deep learning. It is composed of several restricted Boltzmann machines (RBMs) and a BP neural network and handles high-dimensional, highly coupled data well. However, the DBN is poorly suited to continuous values and takes a long time to train. In this paper, the DBN is improved to make it more suitable for quality prediction in the strip finishing process.
The deep belief network is composed of multiple series-connected RBMs and a BP neural network. It has a powerful feature learning ability. The structure of the deep belief network for strip steel quality prediction is shown in Figure
The first visible layer
Deep belief network structure.
It can be seen from the figure that the training process of DBN is divided into two stages, namely, the forward pretraining stage and the reverse fine-tuning stage. DBN uses a greedy unsupervised learning mechanism to complete layer-by-layer forward training from bottom to top and extracts the abstract features of the bottom-level data as the high-level input, until the features are sent to the top-level regression unit. Then, it calculates the error between the regression result and the real result and uses the back propagation algorithm of the BP network to complete the reverse fine-tuning of the parameters, further reducing the model error and improving the training accuracy of the system.
DBN combines the advantages of the RBM and the BP neural network: the multilayer RBMs extract abstract features from high-dimensional data while retaining as much important information as possible, the BP network performs the regression, and the BP algorithm fine-tunes the parameters of every layer toward an optimal state.
Although the traditional DBN has good feature extraction capabilities, an analysis of the model reveals the following shortcomings:
The visible layer and hidden layer of the traditional RBM obey the binary distribution and extract feature signals well from discrete data. In strip quality prediction, the continuous input signals need to be digitized, which leads to a loss of information and reduces the accuracy of the model.
In the process of DBN training, an important parameter that needs to be adjusted is the number of neurons in each hidden layer, which directly affects the prediction accuracy and training time of the model. In strip quality prediction, the dimension of the input data is relatively high, so it is more difficult to select the number of neurons.
Since the fine-tuning process of DBN is based on the gradient descent algorithm, the convergence speed of the BP network is relatively slow. In addition, the BP algorithm is a local search algorithm, which may cause the network to fall into a local optimum due to improper selection of the initial network weights and may lead to network training failures.
In order to solve the above problems, this paper replaces the RBM in the traditional DBN with the Gaussian-Bernoulli RBM to preserve the information in continuous input data, uses particle swarm optimization to determine the number of neurons in each hidden layer during parameter adjustment, and introduces the extreme learning machine to shorten the training time of the model, improve its generalization ability, and avoid falling into local optima.
The restricted Boltzmann machine (RBM) is a shallow stochastic generative network proposed by Hinton et al. It is an energy-based model for unsupervised learning. It divides all neurons into a visible layer and a hidden layer. Data are input through the visible layer, which represents the data features. The hidden layer extracts features that express the relationships between input variables, so it is also called a feature extractor. The two layers of neurons are fully connected to each other, and there is no connection between neurons in the same layer.
Suppose
When the state of (
The visible layer and hidden layer of the traditional RBM are limited by the binary distribution [
The Gaussian-Bernoulli RBM (GBRBM) is a restricted Boltzmann machine for nonbinary data proposed by Krizhevsky and Hinton. GBRBM introduces a Gaussian function between the visible and hidden units to process continuous values between 0 and 1. The energy function expression of GBRBM is as follows:
The lower the energy of the system is, the more stable the system is and the smaller the error of quality parameter prediction results is. In equation (
In equation (
Since there is no connection between neurons in the same layer of RBM, the activation states between the visible layer and the hidden layer unit are independent of each other, so when the
The purpose of RBM model training is to calculate the optimal value of parameter
In order to derive the update equation of each parameter, we use the contrastive divergence (CD) algorithm proposed by Hinton to train the model and add the adjustment of
In equation (
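The GBRBM training described in this subsection can be illustrated with a minimal sketch of one-step contrastive divergence (CD-1). It is not the authors' implementation: it assumes NumPy, fixes the visible-unit standard deviation at 1 instead of adapting it, reconstructs the visible layer by its Gaussian mean, and uses synthetic data. The class name `GBRBM`, its method names, and the learning rate are hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class GBRBM:
    """Gaussian-Bernoulli RBM sketch with unit-variance visible units (assumption)."""

    def __init__(self, n_visible, n_hidden, lr=0.05, seed=0):
        self.rng = np.random.default_rng(seed)
        self.W = 0.01 * self.rng.standard_normal((n_visible, n_hidden))
        self.a = np.zeros(n_visible)  # visible biases
        self.b = np.zeros(n_hidden)   # hidden biases
        self.lr = lr

    def hidden_prob(self, v):
        # p(h_j = 1 | v) when sigma = 1
        return sigmoid(v @ self.W + self.b)

    def visible_mean(self, h):
        # Mean of the Gaussian p(v | h) when sigma = 1
        return h @ self.W.T + self.a

    def cd1_step(self, v0):
        """Apply one CD-1 update on a batch and return the mean squared reconstruction error."""
        h0 = self.hidden_prob(v0)
        h0_sample = (self.rng.random(h0.shape) < h0).astype(float)
        v1 = self.visible_mean(h0_sample)  # mean-field reconstruction
        h1 = self.hidden_prob(v1)
        n = v0.shape[0]
        # Positive phase minus negative phase, averaged over the batch
        self.W += self.lr * (v0.T @ h0 - v1.T @ h1) / n
        self.a += self.lr * (v0 - v1).mean(axis=0)
        self.b += self.lr * (h0 - h1).mean(axis=0)
        return float(np.mean((v0 - v1) ** 2))


# Synthetic continuous inputs in [0.1, 0.9]; illustrative only, not production data.
rng = np.random.default_rng(1)
X = 0.5 + 0.2 * rng.uniform(-1, 1, (256, 2)) @ rng.uniform(-1, 1, (2, 6))
rbm = GBRBM(n_visible=6, n_hidden=4)
errors = [rbm.cd1_step(X) for _ in range(200)]
features = rbm.hidden_prob(X)  # hidden activations passed on to the next layer
```

In a stacked GBDBN, the `features` of one trained layer would serve as the input of the next layer, following the greedy layer-by-layer pretraining described earlier.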
The BP neural network is used as the top layer of the DBN. Although the BP neural network adapts well, it is trained with the gradient descent algorithm. When the output of a neuron is close to 0 or 1, the gradient becomes small and convergence slows down, resulting in a longer training time for the model. Moreover, the BP algorithm may fall into a local optimum for complex nonlinear problems such as strip quality prediction. In order to solve these problems, this paper introduces the extreme learning machine model.
Extreme learning machine (ELM) is a single hidden layer feedforward neural network proposed by Huang Guangbin in 2004, including the input layer, hidden layer, and output layer. The structure is shown in Figure
Structure of ELM.
Suppose there are
In equation (
Expressed as a matrix:
In equation (
In equation (
In this article, the RBM in the traditional DBN is replaced with GBRBM to form GBDBN, and then, the GBDBN model and the ELM model are combined, as shown in Figure
GBDBN-ELM network structure for strip quality prediction.
In this model, the multilayer GBRBM extracts a low-dimensional feature representation from the strip quality input data while preserving the features of the original input data set as far as possible. The extracted features are then fed into the ELM for regression to obtain the predicted strip quality.
For an
According to the ELM algorithm, the output matrix of the
The GBDBN-ELM model combines the unsupervised learning characteristics of DBN with high learning efficiency and strong generalization ability of ELM. It can improve the training speed and prediction accuracy.
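The ELM regression stage can be sketched as follows. This is an illustrative example rather than the authors' code: it assumes NumPy, a sigmoid hidden layer with input weights and biases drawn uniformly from [-1, 1], and output weights obtained from the Moore-Penrose pseudoinverse of the hidden-layer output matrix. The class name `ELMRegressor` and the synthetic feature and target arrays are hypothetical; 60 hidden nodes are used only because that is the value listed in the parameter table.

```python
import numpy as np


class ELMRegressor:
    """Single-hidden-layer ELM sketch: random hidden layer, least-squares output weights."""

    def __init__(self, n_hidden=60, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activations of the randomly weighted hidden layer
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        n_features = X.shape[1]
        # Input weights and biases are drawn once and never trained
        self.W = self.rng.uniform(-1.0, 1.0, (n_features, self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, self.n_hidden)
        H = self._hidden(X)
        # Output weights: minimum-norm least-squares solution of H @ beta = y
        self.beta = np.linalg.pinv(H) @ y
        return self

    def predict(self, X):
        return self._hidden(X) @ self.beta


# Stand-in for features extracted by the GBDBN and a smooth synthetic target.
rng = np.random.default_rng(0)
features = rng.random((300, 5))
thickness = features[:, 0] + 0.5 * features[:, 1] ** 2
elm = ELMRegressor(n_hidden=60).fit(features, thickness)
train_rmse = float(np.sqrt(np.mean((elm.predict(features) - thickness) ** 2)))
```

Because the output weights come from a single linear solve instead of iterative gradient descent, training involves no learning rate or epochs, which is the property the text relies on to shorten training time.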
The indexes to measure the quality of strip steel mainly include the thickness, width, and surface temperature, among which the thickness is the most important index to evaluate whether the steel is up to the standard [
The experimental data in this paper come from a 1580 mm hot strip finishing line of a steel company. The production line consists of 7 units. After 5~7 passes of rough rolling, an intermediate billet 25~60 mm thick is obtained, which is sent to the finishing mill after passing the hot coil box, flying shear, and descaling box. Strip thickness is mainly controlled in the finishing mill, which delivers finished strip with a thickness of 1.2~12.7 mm. The finishing train consists of seven stands, F1~F7. All seven stands are equipped with work roll bending devices, and F2~F4 are PC rolling mills with pair-crossed rolls. Looper rolls are installed between adjacent stands to balance the rolling tension and prevent the strip from piling up. The threading speed, acceleration, reduction, and bending force of each of the F1~F7 stands are calculated and set by a computer control system according to the grade and specification of the rolled strip and can be adjusted dynamically. The exit of the F7 stand is equipped with in-line instruments that measure strip thickness, width, temperature, and crown, so that quality can be monitored in real time and the process parameters modified to improve the quality of rolled products.
In this experiment, the process parameters set by the computer control system for the seven finishing stands and the strip quality parameters measured by the sensors at the F7 exit were collected over 8 days. The sampling interval is 90 seconds, and a total of 3350 sets of production data were collected. Each set includes, for the seven finishing stands, the reduction position, rolling force, stand speed, oil film compensation, eccentric compensation, and other process parameters, as well as their confidence and number of points, for a total of 234 columns of data.
As there are 234 process parameters collected, if all these data are used to predict the strip thickness, the deep learning network will be very complex and the training time will be very long. However, some of the data are not highly correlated with the final strip exit thickness. In this paper, the importance of each element is sorted by the gradient boosting decision tree method, as shown in Figure
The importance of each element to the thickness of the strip (the first 20).
Due to the complex production environment, water vapor, and other interference factors, and the instability of the computer system and sensor itself, the data collected on-site has certain errors, missing data, and abnormal values. For the problem of missing data, this paper uses the mean method to supplement the missing value. For outliers, first, calculate the Euclidean distance between samples by the
According to the holdout validation method, 3000 sets of data are randomly selected as the training set, and the remaining 350 sets are used as the validation set for the trained model.
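The mean imputation and holdout split described above can be sketched in plain Python. This is a minimal illustration, not the authors' preprocessing code: the function names `impute_column_means` and `holdout_split` are hypothetical, missing values are assumed to be encoded as NaN, every column is assumed to contain at least one observed value, and the rows below are synthetic placeholders.

```python
import math
import random


def impute_column_means(rows):
    """Replace NaN entries with the mean of the observed values in the same column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not math.isnan(row[j])]
        means.append(sum(observed) / len(observed))
    return [
        [means[j] if math.isnan(row[j]) else row[j] for j in range(n_cols)]
        for row in rows
    ]


def holdout_split(rows, n_train, seed=0):
    """Randomly assign n_train rows to the training set and the rest to the validation set."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)
    train = [rows[i] for i in indices[:n_train]]
    valid = [rows[i] for i in indices[n_train:]]
    return train, valid


# Small example of mean imputation.
filled = impute_column_means([[1.0, float("nan")], [3.0, 4.0], [float("nan"), 8.0]])

# Synthetic stand-in for the 3350 collected samples, split 3000 / 350.
samples = [[float(i)] for i in range(3350)]
train_set, valid_set = holdout_split(samples, n_train=3000)
```

A common variant computes the column means on the training rows only and reuses them for the validation rows, which keeps validation data from influencing preprocessing.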
Before training and prediction, some relevant parameters need to be set in advance. These parameters cannot be updated in the training process but given in advance through the parameter setting method. These parameters have a great impact on the learning ability of the model and need to be adjusted continuously to maximize the advantages of the model.
Analysis of the structure and principle of the network model shows that the following hyperparameters of the GBDBN-ELM model need to be set in advance: the number of GBRBM layers in the DBN, the number of hidden layer nodes in the DBN and ELM, the number of visible layer nodes in the first RBM layer, the number of ELM output layer nodes, the size of the data blocks in the network training phase, the number of training rounds, the learning rate, and the momentum term.
Since 69 input parameters are selected to predict the strip thickness, the number of visible layer nodes is 69 and the number of output layer nodes is 1. This paper uses different methods to set and tune different parameters.
Grid search uses prior knowledge to specify a value range for each parameter, enumerates candidate values within that range, and selects the value that gives the smallest prediction error in the experiments.
Take the number of GBRBM layers as an example, which is one of the important parameters of the DBN network structure and directly affects the prediction effect of the model. When there are too few RBM layers, the model cannot take advantage of deep learning, and the prediction effect is poor. Too many layers, however, lengthen training or cause overfitting. According to prior knowledge, the number of layers is varied between 1 and 10. The prediction effect of the model is shown in Figure
Variation of residual sum of squares with RBM layers.
According to the comparison results, when the number of RBM layers is 4, the model error is the smallest, so this paper uses a 4-layer RBM network structure.
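The grid search procedure can be sketched as an exhaustive loop over candidate values. The example below is illustrative only: `grid_search` and `toy_error` are hypothetical names, and `toy_error` is an arbitrary stand-in for training the model and measuring its validation error, not a reproduction of the reported results.

```python
import itertools


def grid_search(evaluate, grid):
    """Evaluate every combination in `grid` and return the one with the lowest error."""
    names = list(grid)
    best_params, best_score = None, float("inf")
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        score = evaluate(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_error(params):
    # Placeholder objective with an arbitrary minimum; a real run would
    # train the model with `params` and return its validation error.
    return (params["n_layers"] - 4) ** 2 + abs(params["learning_rate"] - 0.01)


grid = {
    "n_layers": list(range(1, 11)),        # 1 to 10 layers, as in the text
    "learning_rate": [0.001, 0.01, 0.1],   # hypothetical candidate values
}
best_params, best_score = grid_search(toy_error, grid)
```

The cost grows with the product of the candidate counts, which is why the text turns to a different method for the per-layer node counts.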
Using the same method, after multiple comparison experiments, the number of hidden layer nodes in ELM, data block size, training rounds, learning rate, and momentum can be obtained. The optimal parameters of the network are shown in Table
Parameters of GBDBN-ELM model.
Parameter | Value |
---|---|
Number of RBM layers | 4 |
Number of hidden layer nodes in ELM | 60 |
Data block size | 150 |
Training rounds | 20 |
Learning rate | 0.01 |
Momentum | 0.5 |
Another main parameter of the DBN model structure is the number of nodes in each hidden layer. Because the hidden layers in the 4-layer RBM are related to each other, the number of nodes varies widely, and there are many node combinations; it is difficult to use grid search to enumerate one by one to find the optimal combination of the number of nodes. In this paper, particle swarm optimization (PSO) is used to automatically calculate the number of hidden layer nodes in each layer.
The particle swarm algorithm treats each candidate solution of the objective function as a particle in the search space. Each particle has two parameters, position and velocity, and its fitness is calculated from the objective function. By comparing the fitness of the particle at the current time with that at the previous time, the individual optimal position
In equations (
Set the population size of PSO to 10 and the number of iterations to 10, and finally, find the number of hidden layer nodes of 4-layer DBN as
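The PSO search over hidden layer sizes can be sketched as below. This is a generic global-best PSO written for illustration, not the authors' implementation: the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the search bounds, and the `toy_fitness` objective with its `target` sizes are all assumptions. Only the population size of 10 and the 10 iterations come from the text. In a real run the fitness would be the validation error of a model trained with the candidate layer sizes.

```python
import random


def to_sizes(position):
    """Round a continuous particle position to integer layer sizes."""
    return [int(round(x)) for x in position]


def pso_layer_sizes(fitness, n_layers=4, bounds=(10, 100),
                    n_particles=10, n_iters=10,
                    w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(n_layers)] for _ in range(n_particles)]
    vel = [[0.0] * n_layers for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(to_sizes(p)) for p in pos]
    g = min(range(n_particles), key=pbest_fit.__getitem__)
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    history = [gbest_fit]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_layers):
                r1, r2 = rng.random(), rng.random()
                # Inertia + pull toward personal best + pull toward global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search bounds
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            fit = fitness(to_sizes(pos[i]))
            if fit < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit < gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
        history.append(gbest_fit)
    return to_sizes(gbest), gbest_fit, history


def toy_fitness(sizes, target=(50, 40, 30, 20)):
    # Arbitrary placeholder objective; `target` is not taken from the paper.
    return sum((s - t) ** 2 for s, t in zip(sizes, target))


sizes, best_fit, history = pso_layer_sizes(toy_fitness)
```

Rounding positions only when the fitness is evaluated keeps the velocity update continuous while still returning integer node counts.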
The training of the GBDBN-ELM combined model is divided into two parts:
Module training yields the trained GBDBN and ELM models. The test data set is preprocessed to obtain the high-dimensional samples to be predicted. The trained GBDBN model then extracts features from these samples, and the ELM module outputs the predicted strip thickness. The overall process is shown in Figure
Strip steel quality prediction process based on GBDBN-ELM.
In this paper, five indexes are used to evaluate the prediction effect of the model, including the sum of squares of residuals (SSR), root mean square error (RMSE),
The smaller the SSR, RMSE, and MAE, the better the prediction effect.
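The error indexes named above can be computed as in the following sketch, which uses the standard definitions of the residual sum of squares, root mean square error, and mean absolute error. The function name `regression_metrics` is hypothetical, and the indexes whose names are not legible in the text are deliberately left out.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return SSR, RMSE, and MAE for paired true and predicted values."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    n = len(residuals)
    ssr = sum(r * r for r in residuals)           # sum of squared residuals
    rmse = math.sqrt(ssr / n)                     # root mean square error
    mae = sum(abs(r) for r in residuals) / n      # mean absolute error
    return {"SSR": ssr, "RMSE": rmse, "MAE": mae}


metrics = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Under these definitions RMSE equals the square root of SSR divided by the number of samples. With 350 validation samples, an SSR of 8.5856 corresponds to an RMSE of about 0.157, in line with the values reported for GBDBN-ELM in the comparison table.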
350 sets of data were used in the test set to evaluate the performance of the model. In this paper, the simulation results of the prediction model are assessed by analyzing the curve of the predicted value and the true value of the strip thickness, the curve of the prediction error, and the curve of the prediction relative error.
It can be seen from Figure
The curve of the predicted value and the true value of the strip thickness.
The curve of the prediction error.
The curve of the prediction relative error.
In order to comprehensively analyze the prediction performance of the model for strip steel quality, this paper compares it with the BP neural network, ELM, traditional DBN network, and DBN-ELM model and evaluates the above models according to SSR, RMSE, and
Comparison of prediction results of different models.
Index\model | BP | ELM | DBN | DBN-ELM | GBDBN-ELM |
---|---|---|---|---|---|
SSR | 59.3560 | 78.2718 | 14.3091 | 22.398 | 8.5856 |
RMSE | 0.4118 | 0.4728 | 0.2022 | 0.2530 | 0.1565 |
 | 0.9872 | 0.9607 | 0.9926 | 0.98854 | 0.9956 |
 | 3.1548 | 0.5127 | 96.9802 | 32.0120 | 33.8380 |
Comparison of relative errors of BP and GBDBN-ELM (part).
Comparison of relative errors of ELM and GBDBN-ELM (part).
Comparison of relative errors of DBN and GBDBN-ELM (part).
Comparison of relative errors of DBN-ELM and GBDBN-ELM (part).
Comparing the prediction results of the BP neural network and ELM in Table
It can be seen from Table
By comparing the results in Table
From Figures
Especially for data with large prediction errors of DBN and DBN-ELM, GBDBN-ELM can significantly reduce the error and achieve better prediction results. The advantages of the improved model are also reflected here to a large extent.
The analysis of Table
This paper proposes an improved DBN method for strip quality prediction. It addresses the low prediction accuracy that arises because many sensors are involved in the strip production process and most process parameters are coupled with each other and strongly nonlinear. The RBM in the DBN is replaced with GBRBM to eliminate the dependence on the binary distribution and to extract features from high-dimensional, highly coupled input data. The BP network in the DBN is replaced with ELM, so that the features extracted by GBDBN are input into ELM for strip quality prediction. The GBDBN-ELM model is verified on data from the steel finishing line and used to predict the strip thickness. We can draw the following conclusions.
The simple BP neural network and ELM models cannot cope with the high-dimensional, highly coupled, nonlinear data produced by the complex production process. Because of their simple network structure, they cannot fully extract the data features or mine the knowledge contained in the data, so their strip thickness prediction accuracy is insufficient.
The GBDBN model proposed in this paper can solve the problem of low prediction accuracy caused by complex input data. The GBDBN network can retain as many abstract features of input data as possible, so that ELM can obtain higher prediction accuracy.
The comparison with the DBN network also shows that using the ELM algorithm for training and prediction in the GBDBN network greatly shortens the training time and alleviates the excessive training time caused by the complexity of the DBN network.
The raw/processed data required to reproduce the experiments in this article cannot be shared due to corporate confidentiality.
The authors declare that there is no conflict of interest.
This work was supported by the National Science and Technology Innovation 2030 of China Next-Generation Artificial Intelligence Major Project, Data-Driven Tripartite Collaborative Decision-Making and Optimization, under Grant 2018AAA0101801.