Silicon content ([Si] for short) of the molten metal is an important index reflecting the product quality and thermal status of the blast furnace (BF) ironmaking process. Since online detection of [Si] is difficult and the offline assay procedure involves a large time delay, quality modeling is required to achieve online estimation of [Si]. Focusing on this problem, a data-driven dynamic modeling method is proposed using an improved extreme learning machine (ELM) with the help of principal component analysis (PCA). First, data-driven PCA is introduced to pick out the most pivotal variables from the many candidate factors to serve as the secondary variables for modeling. Second, a novel data-driven ELM modeling technique with good generalization performance and nonlinear mapping capability is presented by applying a self-feedback structure to the traditional ELM. The fed-back outputs at previous times, together with the input variables at different times, constitute a dynamic ELM structure that has a memory for data at different times and overcomes the static-modeling limitation of the traditional ELM. Finally, industrial experiments demonstrate that the proposed method achieves better modeling and estimation accuracy as well as faster learning speed than the other modeling methods and model structures compared.
Blast furnace (BF) is a giant countercurrent reactor and heat exchanger in metallurgical industry and is the first step towards the production of steel [
Undoubtedly, the most crucial obstacle to closed-loop control of the BF is that current conventional instruments cannot meet the need for online measurement of molten iron quality, such as the silicon content ([Si]) in the final hot metal. Over the past decades, a great number of models and algorithms have been developed to tackle the modeling problem of silicon content prediction. These existing methods include linear model based methods like ARX and ARMAX models [
The BF ironmaking process is a complicated dynamic system with many influential factors and a large time lag. To capture the system dynamics, the time series and time delays of the relevant input and output variables should be taken into account during process modeling. This also means that the existing static prediction models cannot capture the nonlinear process dynamics well and thus do not provide accurate estimation. A self-feedback structure, which turns the model into a dynamic system, is therefore especially important for a BF system with strongly nonlinear dynamics and a large time lag. Moreover, most of the existing prediction models are trained by gradient-based algorithms such as the back propagation (BP) algorithm and its variants. The learning speed of such intelligent models is insufficient, since a large number of training data may be required. In addition, BP-like algorithms usually suffer from high computational burden, poor generalization ability, and local optima and overweighting problems [
On the other hand, a new machine learning approach termed the extreme learning machine (ELM) has recently been proposed by Huang et al. in [
Based on the work of ELM proposed by Huang et al. [
BF ironmaking is a continuous production process conducted in a closed vertical furnace, where iron ore is continuously reduced to molten iron using coke and gas in a high-temperature, high-pressure environment. Owing to advantages such as simple technology, high productivity, and high production efficiency, BF smelting is, and will long remain, the most important way of ironmaking. Because of the large production volume, even small improvements to the process can yield considerable profit. The ironmaking BF is thus regarded as a significant factor in the economic development of any country.
Figure
Schematic diagram of a typical BF ironmaking process.
For many countries, such as China, the steel industry plays an important role in the national economy, and there is thus extensive interest in operational control and optimization of the ironmaking BF for saving energy and reducing cost. Generally speaking, control of the BF system means keeping the hot metal temperature and composition, such as the silicon, sulfur, and phosphorus contents of the hot metal, within acceptable bounds, among which the silicon content is the most important one [
For a practical BF production process, silicon content ([Si]) is an important index indicating the chemical heat of the molten iron. High silicon content means a large quantity of slag, which makes it easier to remove phosphorus and sulphur from the hot metal. However, excessive silicon content makes cast iron stiff and brittle and can even lead to lower metal yield and easier splashing. In addition, high silicon content results in a corresponding increase of SiO_{2} in the slag, thereby influencing the slagging speed of calclime, extending converting time, and intensifying corrosion of the furnace lining. From an energy point of view, it is desirable to operate the BF process at low molten metal silicon content while still avoiding the risk of cooling the hearth, which may result in a chilled hearth. Generally, the silicon content should be controlled within 0.5%~0.7%.
Closed-loop control of molten iron quality in the ironmaking BF remains an unsolved problem. The main bottleneck is that direct online measurement of this quality parameter is difficult to realize with existing conventional measuring means. Moreover, the offline assaying process for this index involves a long lag, usually more than 1 hour. Therefore, molten iron quality modeling based on online prediction must be established. Effective online prediction or estimation of silicon content not only offers useful information for operators to judge the inner smelting state and operational condition, but also plays a key role in realizing closed-loop control and operational optimization as well as saving energy and reducing cost.
A data-driven black-box model is a kind of input-output model. It relies on the development of novel nonlinear signal processing and data analysis technologies along with computer hardware and software technologies and does not require any prior information about the process. The main idea of a data-driven model is to approximate the input-output relationships using the strong nonlinear approximation power of mathematical tools or artificial intelligence technologies, like artificial neural networks, fuzzy logic, and support vector machines [
The proposed datadriven modeling strategy for silicon content prediction is shown in Figure
Strategy diagram of nonlinear intelligent modeling for silicon content prediction.
As shown in Figure
Note that, in the learning period of the proposed ELM based dynamic estimation model using the training databases,
The proposed modeling strategy has two advantages.
The dynamic properties of time series and time delays are accounted for by introducing the output and inputs at previous times through a self-feedback structure. This self-feedback connection enables ELM to overcome the static mapping limitation of its feedforward network structure. The improved version of ELM can thus capture the nonlinear process dynamics by remembering prior input and output states and using both the prior and current states to calculate the new output value.
Unlike BP-like modeling algorithms, which usually suffer from high computational burden, poor generalization ability, and local optima and overweighting problems, ELM based modeling benefits from much faster learning speed, higher generalization performance, and ease of implementation and use (no extra parameters need to be tuned except the predefined network architecture).
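To make the self-feedback idea concrete, the following is a minimal sketch of how past outputs and current and past inputs might be stacked into one regressor vector per time step. The function name `build_lagged_regressors` and the lag orders `n_u` and `n_y` are illustrative assumptions and are not taken from the paper. The sketch uses measured past outputs as the fed-back values; in online use, previous model estimates could take their place.

```python
import numpy as np

def build_lagged_regressors(u, y, n_u=2, n_y=2):
    """Stack current/past inputs and past outputs into one regressor per time step.

    u   : (T, m) array of input variables (a 1-D array is treated as m = 1).
    y   : (T,) array of output values, e.g. silicon content.
    n_u : number of past input samples used besides the current one (assumed).
    n_y : number of past output samples fed back (assumed).
    Returns X with shape (T - p, m * (n_u + 1) + n_y) and targets with
    shape (T - p,), where p = max(n_u, n_y).
    """
    u = np.asarray(u, dtype=float)
    y = np.asarray(y, dtype=float)
    if u.ndim == 1:
        u = u[:, None]
    p = max(n_u, n_y)
    rows = []
    for t in range(p, len(y)):
        lagged_inputs = u[t - n_u:t + 1][::-1].ravel()  # u(t), u(t-1), ..., u(t-n_u)
        lagged_outputs = y[t - n_y:t][::-1]             # y(t-1), ..., y(t-n_y)
        rows.append(np.concatenate([lagged_inputs, lagged_outputs]))
    return np.array(rows), y[p:]
```

Any static regressor, including a standard ELM, can then be trained on the returned matrix, which is what gives the overall model its memory of earlier inputs and outputs.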
PCA is a statistical method that seeks the dominant part of the variation in a data set and identifies the main influencing factors among many variables in order to simplify complex problems. The principal components produced by PCA are combinations of column vectors of the input matrix picked by varimax. Since correlations and noise are always present in practical industrial data, principal components with small variance usually carry mostly noise. Discarding them does not cause a crucial loss of information and can even achieve denoising to some extent.
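The variance-based selection described above can be sketched as follows. This is a generic PCA on standardized data, not necessarily the exact procedure used in the paper; the function names and the 85% cumulative-contribution threshold are assumptions chosen for illustration.

```python
import numpy as np

def pca_variance_contribution(X):
    """Eigen-decompose the correlation matrix of X (rows = samples).

    Returns eigenvalues (descending), matching eigenvectors (as columns),
    the variance contribution rate of each component, and the cumulative rate.
    """
    X = np.asarray(X, dtype=float)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)  # zero mean, unit variance
    R = np.cov(Xs, rowvar=False)                       # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(R)               # ascending order for symmetric R
    order = np.argsort(eigvals)[::-1]                  # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    contribution = eigvals / eigvals.sum()
    return eigvals, eigvecs, contribution, np.cumsum(contribution)

def n_components_for(cumulative, threshold=0.85):
    """Smallest number of components whose cumulative contribution reaches threshold."""
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative))
```

Components beyond the returned count have small eigenvalues and, following the argument above, are treated as mostly noise.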
Data set as shown in the following equation is considered here:
Equation (
After data dimension reduction and noise filtering through PCA, the data measurements are represented as
A problem with PCA-based dimension reduction is that the resulting principal components are composite representations of the original higher-dimensional physical variables. However, by computing the
Extreme learning machine (ELM) is an algorithm for single hidden layer feedforward networks (SLFNs) with additive or radial basis function (RBF) hidden nodes whose learning speed can be thousands of times faster than conventional feedforward network learning algorithms such as the BP algorithm while reaching better approximation performance. In real applications, the network is typically trained on a finite data set. Huang and Babri prove that a SLFN with at most
The procedure of the algorithm used here can be summarized as follows: for
For additive hidden node,
For RBF hidden node,
For the prediction modeling problem considered in this paper,
In supervised batch learning, the learning algorithms use a finite number of input-output samples for training. For
The purpose of ELM is to train the network to find a least-squares solution
For simplicity, the prediction modeling process based on ELM with additive hidden nodes is summarized as follows: given a training set
The hidden node number
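As an illustration of the ELM training procedure with additive hidden nodes, the sketch below follows the standard ELM recipe: hidden-layer weights and biases are drawn at random and left fixed, and the output weights are obtained as the least-squares solution via the Moore-Penrose pseudoinverse of the hidden-layer output matrix. The class name `ELMRegressor`, the sigmoid activation, the uniform sampling range, and the default of 20 hidden nodes are assumptions made for this example and are not values taken from the paper.

```python
import numpy as np

class ELMRegressor:
    """Minimal single-hidden-layer ELM with additive sigmoid nodes (illustrative)."""

    def __init__(self, n_hidden=20, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden-layer output matrix H: sigmoid of (X W + b)
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, T):
        X = np.asarray(X, dtype=float)
        T = np.asarray(T, dtype=float)
        # Step 1: assign random input weights and biases; they are never tuned.
        self.W = self.rng.uniform(-1.0, 1.0, size=(X.shape[1], self.n_hidden))
        self.b = self.rng.uniform(-1.0, 1.0, size=self.n_hidden)
        # Step 2: compute the hidden-layer output matrix.
        H = self._hidden(X)
        # Step 3: output weights as the minimum-norm least-squares solution.
        self.beta = np.linalg.pinv(H) @ T
        return self

    def predict(self, X):
        return self._hidden(np.asarray(X, dtype=float)) @ self.beta
```

Because training reduces to one pseudoinverse computation, there is no iterative gradient descent, which is the source of the speed advantage over BP-style training. The number of hidden nodes is the only structural choice left to the user.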
In this section, a medium-sized blast furnace (as shown in Figure
Direct detecting parameters and their instrumentations.
Variable (unit)  Notation  Instrumentation (notation)
Flow rate of cold air (m^{3}/min)    HHWLB differential pressure flowmeter (FT)
Flow rate of rich oxygen (m^{3}/h)    A+K balance flowmeter (FT)
Blast pressure (kPa)    DPharp EJA high accuracy pressure transmitter (PT)
Furnace top pressure (kPa)    DPharp EJA high accuracy pressure transmitter (PT)
Blast temperature (°C)    Hongguang SBW temperature transmitter (TT)
Blast humidity (g/m^{3})    Air humidity sensor (HT)
Indirect detecting parameters and their calculation formulas.
Variable (unit)  Calculation formula
Oxygen enrichment percentage (%)
Gas permeability (m^{3}/min·kPa)
Gas volume of bosh (GVB) (m^{3}/min)
Bosh gas index (m^{3}/(min·m^{2}))  GVB/78.5398125
Blast kinetic energy (kJ/s)
Feed blast ratio (wt%)
Resistance coefficient
Theoretical burning temperature (°C)
Actual wind speed (AWS) (m/s)
The 2^{#} BF of Liuzhou Iron & Steel Group Co.
Schematic diagram of blast furnace system.
Considering the strong correlation among the 16 selected input variables, PCA is used to determine the key input variables that most strongly influence the molten iron silicon content. According to (
Eigenvalue and variance contribution rate of each component.
Figure
Data sets for prediction modeling.
Modeling results with the proposed method.
The developed ELM based prediction model has been tested on 2^{#} blast furnace in Liuzhou Steel of China for quite a long time. Figure
Estimation results of molten iron silicon content with different models.
It is well known that the estimation error of a good model should have an autocorrelation function close to that of white noise. We therefore plot the autocorrelation function of the estimation error of the different models, as shown in Figure
Autocorrelation function of estimating error of different models.
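The white-noise check on the estimation error can be sketched as below. The sample autocorrelation estimator and the approximate 95% bound of 1.96/sqrt(N) are standard choices; the function names and the default of 20 lags are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def error_autocorrelation(errors, max_lag=20):
    """Sample autocorrelation of the estimation error for lags 0..max_lag."""
    e = np.asarray(errors, dtype=float)
    e = e - e.mean()                      # remove the mean before correlating
    denom = np.dot(e, e)                  # lag-0 term, so that acf[0] == 1
    return np.array([np.dot(e[:len(e) - k], e[k:]) / denom
                     for k in range(max_lag + 1)])

def white_noise_bound(n_samples):
    """Approximate 95% confidence bound for the autocorrelation of white noise."""
    return 1.96 / np.sqrt(n_samples)
```

If most values at nonzero lags fall inside the band of plus or minus `white_noise_bound(len(errors))`, the residual is close to white noise, which suggests the model has captured most of the process dynamics.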
The estimation and generalization performance of the developed models can be further evaluated quantitatively by calculating the validation accuracy on the testing data set using the standard statistical measures, such as the
Table
Some data statistics of each algorithm.
Algorithm  RMSE (training)  RMSE (testing)  Training time (seconds)  Testing time (seconds)
BP NN without SFB  0.0827  0.1293  0.849955  0.013285
BP NN with SFB  0.0804  0.1272  0.926292  0.013828
ELM without SFB  0.0770  0.1187  0.000678  0.000072
Moreover, the results of practical application indicate that the performance of the developed model is superior to that of the other models and that it largely avoids overfitting. The gap between training and testing errors is small in every case, which supports the reliability of the proposed method. The method also avoids the blind selection of predefined parameters required by conventional algorithms, which makes it convenient for operators.
This paper proposed a data-driven modeling method for predicting molten iron silicon content using PCA and ELM with a self-feedback structure. Compared with conventional algorithms used for silicon content prediction, the proposed method predicts silicon content more accurately and much faster, which meets the need for real-time control. Apart from the number of hidden nodes, no other control parameter has to be chosen, which makes the method convenient for operators. Moreover, the modified ELM with self-feedback structure overcomes the static mapping limitation of the traditional ELM and can therefore cope well with dynamic time-series prediction problems. The performance of the proposed modified ELM based prediction model was compared with the BP algorithm and with different model structures on practical industrial data obtained from the 2^{#} BF of Liuzhou Steel Company of China. The accuracy can basically meet the requirements of actual operation.
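The training and testing comparison discussed here relies on the root mean square error (RMSE). A minimal sketch of that measure and of the training-testing gap is given below; the helper names `rmse` and `generalization_gap` are hypothetical and introduced only for this example.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error between measured and estimated values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def generalization_gap(rmse_train, rmse_test):
    """Difference between testing and training RMSE; a small gap suggests little overfitting."""
    return rmse_test - rmse_train
```

For example, the values listed for ELM without SFB (0.0770 and 0.1187) give a gap of about 0.0417.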
The authors have declared that they have no conflict of interests regarding the publication of this paper.
This work was supported by the National Natural Science Foundation of China (61104084, 614730646, 61290323, and 61333007), the Fundamental Research Funds for the Central Universities (N130508002 and N130108001), the IAPI Fundamental Research Funds (2013ZCX0209), and the 111 Project (B08015). The authors would like to thank the anonymous reviewers for their constructive comments and the editors for their efforts in editing and polishing the paper. The authors would also like to thank the Ironmaking Plant of Liuzhou Iron & Steel Group Co. in China for providing a lot of experimental support.