Extreme learning machine (ELM), as an emerging technology, has recently attracted considerable research interest due to its fast learning speed and excellent generalization ability. Meanwhile, the incremental extreme learning machine (I-ELM), based on an incremental learning algorithm, was proposed and outperforms many popular learning algorithms. However, incremental ELM algorithms do not recalculate the output weights of all the existing nodes when a new node is added and therefore cannot obtain the least-squares solution of the output weight vector. In this paper, we propose the orthogonal convex incremental extreme learning machine (OCI-ELM), which combines the Gram-Schmidt orthogonalization method with Barron's convex optimization learning method to solve the nonconvex optimization and least-squares solution problems, and we give rigorous theoretical proofs. Moreover, we propose a deep architecture based on stacked OCI-ELM autoencoders, following the stacked generalization philosophy, for solving large and complex data problems. Experimental results on both UCI datasets and large datasets demonstrate that the deep network based on stacked OCI-ELM autoencoders (DOC-IELM-AEs) outperforms the other methods considered in this paper on both regression and classification problems.
Extreme learning machine (ELM) was proposed by Huang et al. [ ]. However, redundant nodes can be generated in I-ELM; such nodes have only a minor effect on the outputs of the network, and their existence eventually increases the complexity of the network. In addition, the convergence rate of I-ELM is slower than that of ELM, and the number of hidden nodes required by I-ELM sometimes exceeds the dimension of the training samples.
In this paper, we propose a method called orthogonal convex incremental extreme learning machine (OCI-ELM) to further address the aforementioned problems of I-ELM. With rigorous theoretical proofs, we show that OCI-ELM can obtain the least-squares solution of the output weights.
Recently, deep learning has attracted considerable research interest owing to its remarkable success in many applications [ ].
To show the effectiveness of DOC-IELM-AEs, we apply it both to ordinary real-world problems, using UCI datasets, and to large-scale problems, using the MNIST, OCR Letters, NORB, and USPS datasets. The simulations show that the proposed deep model achieves better testing accuracy and a more compact network architecture than the aforementioned improved I-ELM variants and other deep models, without incurring out-of-memory problems.
The remainder of this paper is organized as follows. We first review the main concepts of I-ELM; we then propose the OCI-ELM algorithm and evaluate it on UCI benchmark datasets; subsequently, the deep architecture DOC-IELM-AEs is presented and evaluated on both ordinary and large-scale datasets; finally, the method is applied to strip-elongation prediction, and conclusions are drawn.
In this section, the main concepts and theory of I-ELM [ ] are briefly reviewed.
The I-ELM proposed by Huang et al. differs from the conventional ELM algorithm: I-ELM is an automatic algorithm that randomly adds hidden nodes to the network one by one and freezes the weights of all existing hidden nodes whenever a new hidden node is added, until the expected learning accuracy is obtained or the maximum number of hidden nodes is reached. The I-ELM algorithm can be summarized as follows.
Given a training dataset $\{(\mathbf{x}_i, t_i)\}_{i=1}^{N}$, an activation function $g$, a maximum number of hidden nodes $L_{\max}$, and an expected learning accuracy $\epsilon$: initialize the number of hidden nodes $L = 0$ and the residual error $E = \mathbf{t}$, where $\mathbf{t} = [t_1, \ldots, t_N]^{T}$.

While $L < L_{\max}$ and $\|E\| > \epsilon$:
increase the number of hidden nodes by one, $L = L + 1$; assign random input weight $\mathbf{a}_L$ and bias $b_L$ to the new hidden node and compute its output vector $\mathbf{h}_L = [g(\mathbf{a}_L \cdot \mathbf{x}_1 + b_L), \ldots, g(\mathbf{a}_L \cdot \mathbf{x}_N + b_L)]^{T}$; calculate the output weight of the new hidden node, $\beta_L = \dfrac{E \cdot \mathbf{h}_L}{\mathbf{h}_L \cdot \mathbf{h}_L}$; calculate the residual error $E = E - \beta_L \mathbf{h}_L$.

Endwhile.
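For concreteness, the loop above can be sketched in a few lines of numpy. This is a minimal illustration assuming sigmoid additive nodes and single-output regression; the function names and default parameters (`L_max`, `eps`, `seed`) are ours, not the paper's.

```python
import numpy as np

def ielm_train(X, t, L_max=200, eps=1e-3, seed=0):
    """Minimal I-ELM sketch: add random sigmoid nodes one by one,
    freezing the weights of all previously added nodes."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    E = t.astype(float).copy()        # residual error, initialized to the target vector
    params, betas = [], []
    L = 0
    while L < L_max and np.linalg.norm(E) > eps:
        L += 1
        a = rng.standard_normal(d)    # random input weights of the new node
        b = rng.standard_normal()     # random bias of the new node
        h = 1.0 / (1.0 + np.exp(-(X @ a + b)))  # new node's output vector
        beta = (E @ h) / (h @ h)      # closed-form output weight; old weights stay frozen
        E = E - beta * h              # update the residual error
        params.append((a, b))
        betas.append(beta)
    return params, np.array(betas)

def ielm_predict(X, params, betas):
    H = np.column_stack([1.0 / (1.0 + np.exp(-(X @ a + b))) for a, b in params])
    return H @ betas
```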
The motivation for the work in this section comes from the following important properties of basic ELM: (1) the special solution $\hat{\beta} = H^{\dagger}T$ is one of the least-squares solutions of the linear system $H\beta = T$, where $H^{\dagger}$ is the Moore-Penrose generalized inverse of the hidden layer output matrix $H$; (2) the smallest norm of weights: the special solution $\hat{\beta} = H^{\dagger}T$ has the smallest norm among all the least-squares solutions of $H\beta = T$; (3) the minimum norm least-squares solution of $H\beta = T$ is unique, and it is $\hat{\beta} = H^{\dagger}T$.
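These properties can be illustrated directly with numpy's pseudoinverse; the toy matrices below are ours and stand in for the hidden layer output matrix and target matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
H = rng.standard_normal((100, 20))   # toy hidden layer output matrix
T = rng.standard_normal((100, 1))    # toy target matrix

beta_hat = np.linalg.pinv(H) @ T     # minimum norm least-squares solution of H @ beta = T
residual = np.linalg.norm(H @ beta_hat - T)  # smallest achievable residual norm
```

Among all $\beta$ that minimize $\|H\beta - T\|$, the vector `beta_hat` computed this way is the one with the smallest norm.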
In this section, we propose an improved I-ELM algorithm (OCI-ELM) based on the Gram-Schmidt orthogonalization method combined with Barron's convex optimization learning method, and we prove in theory that the OCI-ELM algorithm can obtain the least-squares solution of $H\beta = T$.
The Gram-Schmidt orthogonalization process converts a set of linearly independent vectors into a set of orthogonal vectors [ ]. Given a linearly independent vector set $\{v_1, v_2, \ldots, v_n\}$, an orthogonal vector set $\{u_1, u_2, \ldots, u_n\}$ spanning the same subspace can be constructed recursively as
$$u_1 = v_1, \qquad u_k = v_k - \sum_{j=1}^{k-1} \frac{\langle v_k, u_j\rangle}{\langle u_j, u_j\rangle}\, u_j, \quad k = 2, \ldots, n.$$
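The recursion translates directly into code; a minimal sketch of classical Gram-Schmidt (function name ours):

```python
import numpy as np

def gram_schmidt(V):
    """Classical Gram-Schmidt: the rows of V (assumed linearly independent)
    are converted into mutually orthogonal rows of U spanning the same space."""
    V = np.asarray(V, dtype=float)
    U = np.zeros_like(V)
    for k in range(V.shape[0]):
        u = V[k].copy()
        for j in range(k):
            u -= (V[k] @ U[j]) / (U[j] @ U[j]) * U[j]  # remove the projection onto u_j
        U[k] = u
    return U

# Example: the rows of U are pairwise orthogonal.
U = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
```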
CI-ELM was originally proposed by Huang and Chen [ ]; it incorporates Barron's convex optimization concept into I-ELM, so that when a new hidden node is added, the output weights of the already existing hidden nodes are recalculated rather than kept frozen.
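Concretely, one common way to write this convex update (our notation, following the convex combination idea; the exact formulas in the cited work may differ in detail) is: when the $n$th node $g_n$ is added, the network output is updated as
$$f_n(x) = (1 - \beta_n)\, f_{n-1}(x) + \beta_n\, g_n(x),$$
so every existing output weight $\beta_i$ ($i < n$) is rescaled to $(1-\beta_n)\beta_i$, and the network output remains a convex combination of the hidden nodes' responses.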
The OCI-ELM algorithm can be summarized as follows. Given a training dataset $\{(\mathbf{x}_i, t_i)\}_{i=1}^{N}$, a maximum number of hidden nodes $L_{\max}$, and an expected learning accuracy $\epsilon$: initialize $L = 0$ and the residual error $E = \mathbf{t}$.

While $L < L_{\max}$ and $\|E\| > \epsilon$:
increase the number of hidden nodes by one, $L = L + 1$; randomly assign the parameters of the new hidden node, orthogonalize its output vector against the existing hidden layer output matrix by the Gram-Schmidt process, and update the hidden layer output matrix; calculate the output weight of the new hidden node; recalculate the output weight vectors of all existing hidden nodes if the residual error is reduced; calculate the residual error after adding the new hidden node.

Endwhile.
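The following sketch illustrates our reading of this loop in numpy: each new random node's output vector is Gram-Schmidt-orthogonalized against the accepted ones before its weight is computed, and "recalculating the output weights of all existing nodes" is realized at the end by a least-squares solve on the original node outputs. All names and tolerances are illustrative, not the paper's exact implementation.

```python
import numpy as np

def ocielm_train(X, t, L_max=100, eps=1e-3, seed=0):
    """Illustrative OCI-ELM-style training: orthogonalized node additions
    followed by a least-squares recomputation of all output weights."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    E = t.astype(float).copy()
    ortho_cols, params = [], []
    L = 0
    while L < L_max and np.linalg.norm(E) > eps:
        L += 1
        a, b = rng.standard_normal(d), rng.standard_normal()
        h = 1.0 / (1.0 + np.exp(-(X @ a + b)))
        for u in ortho_cols:                  # Gram-Schmidt orthogonalization
            h = h - (h @ u) / (u @ u) * u
        if np.linalg.norm(h) < 1e-10:         # nearly dependent node: discard it
            continue
        beta = (E @ h) / (h @ h)              # optimal weight on the orthogonal column
        E = E - beta * h                      # exact residual update
        ortho_cols.append(h)
        params.append((a, b))
    # Recalculate the output weights of all existing nodes: least-squares
    # solution on the original (non-orthogonalized) hidden node outputs.
    H = np.column_stack([1.0 / (1.0 + np.exp(-(X @ a + b))) for a, b in params])
    beta_all = np.linalg.pinv(H) @ t
    return params, beta_all
```

Because the orthogonalized columns span the same space as the original node outputs, the residual reached by the incremental updates coincides (up to numerical error) with the least-squares residual of the final solve.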
A rigorous proof of the claim that OCI-ELM can obtain the least-squares solution of $H\beta = T$ for any given training dataset is discussed in detail below. The proof consists of two steps: first, we establish the result for the orthogonalized hidden-node output vectors; then, we further prove that the final solution is the least-squares solution of $H\beta = T$, considering four cases of the output weight in turn.
(a) According to the condition given above, the result follows directly.
In this section, we compare the generalization performance of the proposed OCI-ELM with that of other similar learning algorithms on ten UCI real-world datasets, comprising five regression and five classification problems, as shown in the following table.
Specification of 10 benchmark problems.
Datasets | Training samples | Testing samples | Attributes | Type |
---|---|---|---|---|
Auto MPG | 182 | 165 | 8 | Regression |
California Housing | 8,800 | 8,260 | 8 | Regression |
Servo | 92 | 60 | 4 | Regression |
Concrete Compressive Strength | 975 | 830 | 9 | Regression |
Parkinsons | 4,780 | 5,120 | 26 | Regression |
Delta Ailerons | 5,080 | 4,600 | 6 | Classification |
Waveform II | 3,300 | 2,800 | 40 | Classification |
Abalone | 3,900 | 3,670 | 8 | Classification |
Breast Cancer | 539 | 490 | 10 | Classification |
Energy Efficiency | 740 | 600 | 8 | Classification |
The experimental results comparing OCI-ELM with some other ELM algorithms on the regression and classification problems are given in the following two tables. The compared algorithms are: (1) convex incremental extreme learning machine (CI-ELM) [ ]; (2) parallel chaos search based incremental extreme learning machine (PC-ELM) [ ]; (3) leave-one-out incremental extreme learning machine (LOO-IELM) [ ]; (4) sparse Bayesian extreme learning machine (SB-ELM) [ ]; (5) improved incremental regularized extreme learning machine (II-RELM) [ ]; (6) enhanced incremental regularized extreme learning machine (EIR-ELM) [ ].
The comparisons of training and testing on the regression cases.
Datasets | Approaches | Nodes (fixed) | Training RMSE | Testing RMSE | # nodes | Time (s) |
---|---|---|---|---|---|---|
Auto MPG (0.08) | CI-ELM | 20 | 0.1043 | 0.1035 | 66.29 | 0.1485 |
 | PC-ELM | 20 | 0.1014 | 0.1012 | 34.07 | 0.2783 |
 | LOO-IELM | 20 | 0.1106 | 0.1104 | 49.91 | 0.3183 |
 | SB-ELM | 20 | 0.1376 | 0.2307 | ≈130 | |
 | II-RELM | 20 | 0.0998 | 0.1005 | 44.17 | 0.3483 |
 | EIR-ELM | 20 | 0.0893 | 0.1005 | 31.05 | 0.3283 |
 | OCI-ELM | 20 | | | | 0.2204 |
California Housing (0.12) | CI-ELM | 150 | 0.1601 | 0.1583 | 330.09 | 1.0051 |
 | PC-ELM | 150 | 0.1389 | 0.1377 | 199.34 | 0.9810 |
 | LOO-IELM | 150 | 0.1376 | 0.1374 | 217.08 | 0.9766 |
 | SB-ELM | 150 | 0.1363 | 0.1369 | — | — |
 | II-RELM | 150 | 0.1341 | 0.1339 | 192.33 | 0.9713 |
 | EIR-ELM | 150 | 0.1274 | 0.1268 | 184.67 | 1.0017 |
 | OCI-ELM | 150 | | | | |
Servo (0.115) | CI-ELM | 100 | 0.1428 | 0.1419 | 182.63 | 0.0806 |
 | PC-ELM | 100 | 0.1373 | 0.1364 | 160.82 | 0.0701 |
 | LOO-IELM | 100 | 0.1371 | 0.1368 | 155.72 | 0.0765 |
 | SB-ELM | 100 | 0.1257 | 0.1254 | | |
 | II-RELM | 100 | 0.1303 | 0.1307 | 157.12 | 0.0886 |
 | EIR-ELM | 100 | 0.1265 | 0.1264 | 147.80 | 0.0794 |
 | OCI-ELM | 100 | | | | 0.0828 |
CCS (0.035) | CI-ELM | 150 | 0.0611 | 0.0602 | 229.86 | 0.5893 |
 | PC-ELM | 150 | 0.0381 | 0.0365 | 162.79 | 1.1236 |
 | LOO-IELM | 150 | 0.0372 | 0.0369 | 159.04 | 0.9427 |
 | SB-ELM | 150 | 0.0366 | 0.0368 | ≈170 | 0.0872 |
 | II-RELM | 150 | 0.0361 | 0.0363 | 163.82 | 0.8341 |
 | EIR-ELM | 150 | 0.0348 | 0.0351 | 145.78 | 0.6305 |
 | OCI-ELM | 150 | | | | |
Parkinsons (0.14) | CI-ELM | 250 | 0.0913 | 0.0906 | 170.02 | |
 | PC-ELM | 250 | 0.0471 | 0.0463 | 63.55 | 4.7503 |
 | LOO-IELM | 250 | 0.0453 | 0.0462 | 59.78 | 4.4452 |
 | SB-ELM | 250 | 0.0388 | 0.0391 | — | — |
 | II-RELM | 250 | 0.0389 | 0.0383 | 77.19 | 4.4189 |
 | EIR-ELM | 250 | 0.0344 | 0.0347 | 48.92 | 3.9836 |
 | OCI-ELM | 250 | | | | 3.0819 |
The comparisons of training and testing on the classification cases.
Datasets | Approaches | Nodes (fixed) | Mean (%) | Std. | # nodes | Time (s) |
---|---|---|---|---|---|---|
Delta Ailerons (0.035) | CI-ELM | 250 | 83.29 | 0.0036 | 369.32 | 1.3505 |
 | PC-ELM | 250 | 90.02 | 0.0016 | 35.19 | 0.6829 |
 | LOO-IELM | 250 | 91.17 | 0.0027 | 41.12 | 0.7761 |
 | SB-ELM | 250 | 91.66 | 0.0071 | ≈220 | |
 | II-RELM | 250 | 91.18 | 0.0042 | 51.16 | 0.7425 |
 | EIR-ELM | 250 | 92.03 | 0.0019 | 34.29 | 1.1304 |
 | OCI-ELM | 250 | | | | 0.7021 |
Waveform II (0.04) | CI-ELM | 100 | 84.47 | 0.0182 | 200.11 | 3.0977 |
 | PC-ELM | 100 | 89.81 | 0.0104 | 47.63 | 3.0954 |
 | LOO-IELM | 250 | 88.93 | 0.0097 | 46.44 | 3.3437 |
 | SB-ELM | 250 | 80.69 | 0.0181 | — | — |
 | II-RELM | 250 | 90.64 | 0.0112 | 44.33 | 3.6603 |
 | EIR-ELM | 250 | 91.15 | 0.0096 | 38.91 | 3.2267 |
 | OCI-ELM | 100 | | | | |
Abalone (0.05) | CI-ELM | 150 | 82.72 | 0.0022 | 150.37 | 0.4930 |
 | PC-ELM | 150 | 93.57 | 0.0016 | 24.62 | 0.6177 |
 | LOO-IELM | 250 | 90.51 | 0.0033 | 38.43 | 0.7102 |
 | SB-ELM | 250 | 86.03 | 0.0107 | ≈180 | |
 | II-RELM | 250 | 92.10 | 0.0034 | 35.74 | 0.6924 |
 | EIR-ELM | 250 | 93.91 | 0.0018 | 24.16 | 0.8533 |
 | OCI-ELM | 150 | | | | 0.6802 |
Breast Cancer (0.07) | CI-ELM | 200 | 90.06 | 0.0145 | 88.30 | 0.0804 |
 | PC-ELM | 200 | 93.23 | 0.0082 | 34.79 | 0.0992 |
 | LOO-IELM | 250 | 93.07 | 0.0104 | 40.82 | 0.1102 |
 | SB-ELM | 250 | 94.41 | 0.0075 | ≈150 | |
 | II-RELM | 250 | 92.58 | 0.0095 | 55.27 | 0.1032 |
 | EIR-ELM | 250 | 94.76 | 0.0078 | 31.18 | 0.1174 |
 | OCI-ELM | 200 | | | | 0.1061 |
Energy Efficiency (0.055) | CI-ELM | 150 | 91.78 | 0.0013 | 61.09 | 0.2966 |
 | PC-ELM | 150 | 96.53 | | 41.08 | 0.3617 |
 | LOO-IELM | 250 | 95.18 | 0.0024 | 27.94 | 0.3517 |
 | SB-ELM | 250 | 92.16 | 0.0033 | ≈150 | |
 | II-RELM | 250 | 95.29 | 0.0011 | 46.83 | 0.3826 |
 | EIR-ELM | 250 | 96.67 | 0.0012 | 22.42 | 0.4011 |
 | OCI-ELM | 150 | | | | 0.3979 |
In this section, experiments are conducted on the Auto MPG, California Housing, Servo, CCS (Concrete Compressive Strength), and Parkinsons datasets for the regression problems; the results are summarized in the regression table above.
In this section, experiments are conducted on the Delta Ailerons, Waveform II, Abalone, Breast Cancer, and Energy Efficiency datasets for the classification problems; the results are summarized in the classification table above.
In short, OCI-ELM generally achieves better performance on these regression and classification problems in terms of training and testing RMSE for regression and testing accuracy for classification. Moreover, the compactness of the network and the convergence rate also demonstrate the good performance of the OCI-ELM algorithm.
As an artificial neural network model, the autoencoder is frequently applied in deep architecture approaches. An autoencoder is an unsupervised neural network whose target output is equal to its input. Kasun et al. [ ] proposed the ELM autoencoder (ELM-AE), in which the output weights $\beta$ learn the transformation from the feature space back to the input data; three cases are distinguished: for sparse ELM-AE representations (feature dimension larger than input dimension), the output weights are $\beta = (I/C + H^{T}H)^{-1}H^{T}X$; for compressed ELM-AE representations (feature dimension smaller than input dimension), the output weights take the same regularized form; for equal dimension ELM-AE representations, the output weights satisfy $\beta = H^{-1}X$ with $\beta^{T}\beta = I$.
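A minimal numpy sketch of an ELM-AE in the compressed/sparse case follows; the regularization constant `C`, the sigmoid activation, and the plain (non-orthogonalized) random weights are illustrative simplifications of the cited method.

```python
import numpy as np

def elm_ae(X, n_hidden, C=1e3, seed=0):
    """Minimal ELM-AE sketch after Kasun et al.: random hidden parameters
    (orthogonalized in the original work) and ridge-regularized output
    weights beta that map the hidden representation back to the input X."""
    rng = np.random.default_rng(seed)
    N, d = X.shape
    W = rng.standard_normal((d, n_hidden))   # random input weights
    b = rng.standard_normal(n_hidden)        # random biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))   # hidden representation
    # Output weights for the compressed/sparse cases:
    beta = np.linalg.solve(np.eye(n_hidden) / C + H.T @ H, H.T @ X)
    return W, b, beta  # beta has shape (n_hidden, d); X @ beta.T yields the learned features
```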
In this section, we present OCI-ELM-AE, which uses OCI-ELM (I-ELM incorporating Barron's convex optimization learning method and the Gram-Schmidt orthogonalization method to achieve the optimal least-squares solution) as the training algorithm for an autoencoder, instead of a conventional autoencoder trained with the backpropagation (BP) algorithm to approximate the identity function or with ordinary ELM. Because an incremental algorithm is adopted, there is no need to set the number of hidden nodes based on experience: with the initialization of the maximum number of hidden nodes $L_{\max}$, the network grows automatically until the expected learning accuracy is reached.
The network structure of OCI-ELM-AE is shown in the following figure.
Network diagram of OCI-ELM-AE.
In 2006, Hinton et al. [ ] showed that deep neural networks can be trained effectively by greedy layer-wise unsupervised pretraining followed by fine-tuning; this idea underlies the stacked architecture adopted here.
The DOC-IELM-AEs training procedure can be summarized as follows. Given a training dataset $\{(\mathbf{x}_i, \mathbf{t}_i)\}_{i=1}^{N}$, the number of stacked layers, a maximum number of hidden nodes $L_{\max}$, and an expected learning accuracy $\epsilon$: initialize each OCI-ELM-AE layer with $L = 0$ and its residual error set to its reconstruction target.

While $L < L_{\max}$ and $\|E\| > \epsilon$, for each OCI-ELM-AE layer in turn (unsupervised pretraining): increase the number of hidden nodes by one; randomly assign the hidden node parameters and orthogonalize the new node's output vector within the hidden layer output matrix; calculate the output weight of the new hidden node; recalculate the output weight vectors of all existing hidden nodes if the residual error is reduced; calculate the residual error after adding the new hidden node.

Then, for the final supervised layer: calculate the output weight of the new hidden node; recalculate the output weight vectors of all existing hidden nodes if the residual error is reduced; calculate the residual error after adding the new hidden node.

Endwhile.
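A compact sketch of this stacked pipeline is given below. It reuses the `elm_ae` sketch from the previous section and substitutes a plain least-squares readout for the full OCI-ELM machinery; layer sizes, activation, and regularization are illustrative assumptions.

```python
import numpy as np

def sigmoid(Z):
    return 1.0 / (1.0 + np.exp(-Z))

def stacked_ae_train(X, T, layer_sizes, C=1e3):
    """Hedged sketch of the stacked-autoencoder pipeline: each layer is
    pretrained as an autoencoder on the previous layer's features, and the
    last hidden representation feeds a least-squares output layer."""
    betas, A = [], X
    for i, n_hidden in enumerate(layer_sizes):
        _, _, beta = elm_ae(A, n_hidden, C=C, seed=i)  # unsupervised layer-wise pretraining
        A = sigmoid(A @ beta.T)                        # features passed to the next layer
        betas.append(beta)
    W_out = np.linalg.pinv(A) @ T                      # supervised least-squares output layer
    return betas, W_out

def stacked_ae_predict(X, betas, W_out):
    A = X
    for beta in betas:
        A = sigmoid(A @ beta.T)
    return A @ W_out
```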
The DOC-IELM-AEs algorithm inherits the advantages of the incremental constructive feedforward network model and of deep learning algorithms in capturing higher-level abstractions and characterizing data representations. The use of autoencoders for unsupervised pretraining of the data yields excellent performance on regression and classification problems. The improved method uses OCI-ELM-AE as the building block of each layer to construct the whole deep architecture, as shown in the following figure.
The model structure of DOC-IELM-AEs.
In this section, we mainly test the regression performance of the proposed OCI-ELM and DOC-IELM-AEs on three UCI real-world datasets, Parkinsons, California Housing, and CCS (Concrete Compressive Strength), and on two large datasets, BlogFeedback and Online News Popularity. The simulations are conducted in the MATLAB 2013a environment running on a Windows 7 machine with 128 GB of memory and an Intel Xeon E5-2620V2 (2.1 GHz) processor.
The regression performance of the proposed OCI-ELM and DOC-IELM-AEs is compared with the baseline methods, including SVM [ ], single ELM, ML-ELM, AE-S-ELMs, DBN, ErrCor, and PC-ELM. (1) OCI-ELM compared with SVM, ELM, ErrCor, and PC-ELM: we perform regression testing on the datasets described in the specification table below. (2) DOC-IELM-AEs compared with DBN, ML-ELM, and AE-S-ELMs: the testing accuracy on the UCI datasets shows that DOC-IELM-AEs outperforms OCI-ELM. Given the aforementioned comparisons between OCI-ELM and the other algorithms (SVM, ELM, ErrCor, and PC-ELM), it follows by the same token that DOC-IELM-AEs achieves better testing accuracy than SVM, ELM, ErrCor, and PC-ELM; this can also be seen in the comparison table below.
Specification of 10 benchmark problems.
Datasets | Training samples | Testing samples | Attributes | Type |
---|---|---|---|---|
Parkinsons | 4,780 | 5,120 | 26 | Regression |
California Housing | 8,800 | 8,260 | 8 | Regression |
Concrete Compressive Strength | 975 | 830 | 9 | Regression |
BlogFeedback | 58,000 | 23,000 | 281 | Regression |
Online News Popularity | 36,800 | 10,000 | 61 | Regression |
Delta Ailerons | 5,080 | 4,600 | 6 | Classification |
Waveform II | 3,300 | 2,800 | 40 | Classification |
MNIST | 35,000 | 6,000 | 784 | Classification |
OCR Letters | 40,000 | 12,000 | 128 | Classification |
NORB | 20,000 | 5,000 | 2,048 | Classification |
The comparisons of training and testing on the regression cases.
Datasets | Algorithms | Training accuracy (%) | Training time (s) | Testing accuracy (%) | Testing deviation (%) |
---|---|---|---|---|---|
Parkinsons | SVM | 95.42 | 282.26 | 95.43 | 0.01 |
 | ELM | 95.75 | | 95.73 | 0.29 |
 | ML-ELM | 97.93 | 482.7 | 97.96 | 0.08 |
 | AE-S-ELMs | 98.07 | 492.21 | 98.05 | 0.11 |
 | DBN | 97.51 | 5013 | 97.53 | 0.04 |
 | ErrCor | 96.25 | 344.82 | 96.19 | 0.32 |
 | PC-ELM | 97.16 | 362.05 | 97.18 | 0.09 |
 | OCI-ELM | 97.59 | 327.2 | 97.62 | 0.05 |
 | DOC-IELM-AEs | | 440.39 | | 0.04 |
California Housing | SVM | 96.56 | 327.17 | 96.56 | 0 |
 | ELM | 96.79 | | 96.72 | 0.16 |
 | ML-ELM | 98.12 | 462.31 | 98.14 | 0.12 |
 | AE-S-ELMs | 98.26 | 402.49 | 98.23 | 0.09 |
 | DBN | 98.03 | 3989 | 98.04 | 0.08 |
 | ErrCor | 97.01 | 401.66 | 97.11 | 0.25 |
 | PC-ELM | 97.34 | 396.52 | 97.31 | 0.28 |
 | OCI-ELM | 97.92 | 386.63 | 97.90 | 0.11 |
 | DOC-IELM-AEs | | 447.2 | | 0.07 |
Concrete Compressive Strength | SVM | 95.72 | 32.32 | 95.78 | 0.04 |
 | ELM | 95.91 | | 96.02 | 0.06 |
 | ML-ELM | 97.26 | 53.4 | 97.33 | 0.18 |
 | AE-S-ELMs | 97.49 | 50.66 | 97.41 | 0.12 |
 | DBN | 96.64 | 320.04 | 96.58 | 0.05 |
 | ErrCor | 96.32 | 47.45 | 96.35 | 0.25 |
 | PC-ELM | 96.55 | 43.97 | 96.67 | 0.09 |
 | OCI-ELM | 97.04 | 44.33 | 97.15 | 0.06 |
 | DOC-IELM-AEs | | 51.58 | | 0.03 |
BlogFeedback | SVM | 89.75 | 3906 | 89.82 | 0.03 |
 | ELM | 90.12 | | 90.14 | 0.22 |
 | ML-ELM | 91.86 | 5175 | 91.82 | 0.12 |
 | AE-S-ELMs | 91.83 | 5247 | 91.79 | 0.13 |
 | DBN | 90.51 | 19766 | 90.60 | 0.09 |
 | ErrCor | 90.39 | 4889 | 90.44 | 0.09 |
 | PC-ELM | 90.54 | 4308 | 90.58 | 0.12 |
 | OCI-ELM | 91.76 | 4223 | 91.82 | 0.09 |
 | DOC-IELM-AEs | | 5271 | | 0.07 |
Online News Popularity | SVM | 91.75 | 311.83 | 91.72 | 0.04 |
 | ELM | 91.68 | | 91.69 | 0.34 |
 | ML-ELM | 92.62 | 684.98 | 92.71 | 0.17 |
 | AE-S-ELMs | 92.59 | 685.62 | 95.72 | 0.12 |
 | DBN | 92.23 | 7062 | 92.26 | 0.05 |
 | ErrCor | 91.61 | 541.14 | 91.77 | 0.29 |
 | PC-ELM | 92.29 | 576.01 | 92.35 | 0.15 |
 | OCI-ELM | 92.52 | 521.24 | 92.54 | 0.12 |
 | DOC-IELM-AEs | | 634.09 | | 0.11 |
The classification performance of the proposed OCI-ELM and DOC-IELM-AEs is compared with the baseline methods, including SVM, single ELM, ML-ELM, AE-S-ELMs, DBN, ErrCor, and PC-ELM, in the following table. (1) OCI-ELM compared with SVM, ELM, ErrCor, and PC-ELM: the simulation results are obtained by averaging 50 trials on the datasets in the specification table above. (2) DOC-IELM-AEs compared with DBN, ML-ELM, and AE-S-ELMs: to test these anticipated effects, we used the UCI datasets and the large-scale datasets. From the experimental results, we can see that the classification accuracies of DOC-IELM-AEs are clearly better than those of the other methods. Focusing on NORB, the network structure used by DOC-IELM-AEs is 2048-800-800-3000-5, and DOC-IELM-AEs obtained the best accuracy of 93.16%.
The comparisons of training and testing on the classification cases.
Datasets | Algorithms | Training accuracy (%) | Training time (s) | Testing accuracy (%) | Testing deviation (%) |
---|---|---|---|---|---|
Delta Ailerons | SVM | 95.41 | 219.54 | 95.42 | 0 |
 | ELM | 95.78 | | 95.81 | 0.22 |
 | ML-ELM | 97.32 | 317.83 | 97.34 | 0.19 |
 | AE-S-ELMs | 97.75 | 279.32 | 97.71 | 0.27 |
 | DBN | 98.34 | 4432 | 97.35 | 0.05 |
 | ErrCor | 96.42 | 234.27 | 96.48 | 0.12 |
 | PC-ELM | 96.56 | 262.69 | 96.61 | 0.08 |
 | OCI-ELM | 96.73 | 249.89 | 96.69 | 0.07 |
 | DOC-IELM-AEs | | 350.17 | | 0.05 |
Waveform II | SVM | 95.45 | 247.11 | 95.44 | 0.01 |
 | ELM | 95.71 | | 95.73 | 0.16 |
 | ML-ELM | 97.83 | 304.28 | 97.86 | 0.11 |
 | AE-S-ELMs | 98.09 | 365.83 | 98.11 | 0.15 |
 | DBN | 97.59 | 4101 | 97.52 | 0.04 |
 | ErrCor | 96.54 | 261.44 | 96.58 | 0.06 |
 | PC-ELM | 96.46 | 293.85 | 96.47 | 0.05 |
 | OCI-ELM | 97.33 | 202.08 | 97.36 | 0.08 |
 | DOC-IELM-AEs | | 332.58 | | 0.05 |
MNIST | SVM | 95.01 | 2108 | 95.03 | 0.03 |
 | ELM | 95.33 | | 95.32 | 0.31 |
 | ML-ELM | 96.78 | 3793 | 96.82 | 0.17 |
 | AE-S-ELMs | 96.93 | 3772 | 96.94 | 0.21 |
 | DBN | 96.67 | 14117 | 96.72 | 0.03 |
 | ErrCor | 96.17 | 4065 | 96.26 | 0.13 |
 | PC-ELM | 96.21 | 3049 | 96.24 | 0.09 |
 | OCI-ELM | 96.38 | 3092 | 96.39 | 0.09 |
 | DOC-IELM-AEs | | 3985 | | 0.06 |
OCR Letters | SVM | 88.27 | 770.24 | 88.41 | 0.03 |
 | ELM | 89.02 | | 89.03 | 0.22 |
 | ML-ELM | 89.19 | 923.23 | 89.17 | 0.14 |
 | AE-S-ELMs | 89.44 | 960.06 | 89.46 | 0.19 |
 | DBN | 88.98 | 15212 | 89.02 | 0.04 |
 | ErrCor | 88.81 | 1014 | 88.82 | 0.10 |
 | PC-ELM | 89.17 | 979.62 | 89.16 | 0.05 |
 | OCI-ELM | 89.36 | 992.08 | 89.39 | 0.08 |
 | DOC-IELM-AEs | | 1175 | | 0.06 |
NORB | SVM | 91.13 | 723.81 | 91.14 | 0.04 |
 | ELM | 91.42 | | 91.45 | 0.14 |
 | ML-ELM | 92.67 | 1107 | 92.70 | 0.17 |
 | AE-S-ELMs | 92.81 | 698.21 | 92.82 | 0.12 |
 | DBN | 92.58 | 44502 | 92.59 | 0.03 |
 | ErrCor | 91.77 | 1287 | 91.81 | 0.08 |
 | PC-ELM | 91.80 | 896.63 | 91.83 | 0.06 |
 | OCI-ELM | 92.61 | 839.48 | 92.65 | 0.05 |
 | DOC-IELM-AEs | | 1480 | | 0.03 |
In this section, all of the experimental results for strip-elongation prediction are presented. The annealing treatment is considered the most important process for cold-rolled strips: it eliminates the work hardening and internal stress of the strips, reduces their hardness, and improves their capacity for plastic deformation, stamping, and mechanical processing. The following figure shows the general scheme of strip steel in the annealing process.
General scheme of strip steel in annealing process.
We collected 16 months of historical records of the variables that can affect the position of the welding seam, including temperature data from 5 sections, tension data from 3 sections, and speed data from 11 sections. We use the first 10 months of data for training and the remaining 6 months for testing. The comparison results of strip-elongation prediction are shown in the following figure.
The comparisons of prediction on 6-month strip-elongation data.
For further investigation of the prediction capability of DOC-IELM-AEs, the performance of the algorithms is evaluated in terms of four criteria, including the mean absolute percentage error (MAPE). The different prediction errors of the 5 models on the 6-month data are shown in the following figure.
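Of the four criteria, only MAPE is named explicitly here; for reference, a standard definition follows, together with the RMSE used in the regression tables above (function names ours).

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes y_true contains no zeros."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * float(np.mean(np.abs((y_true - y_pred) / y_true)))

def rmse(y_true, y_pred):
    """Root mean square error, the criterion used for the regression benchmarks above."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))
```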
Different prediction errors of 5 models on 6-month data.
In order to demonstrate the effectiveness of the proposed algorithm in practical engineering, we selected 12 consecutive months of data from the whole 16-month dataset for comparison and obtained the prediction accuracies for each month and for the whole year. The comparisons of prediction accuracy are shown in the following figure.
The comparisons of prediction accuracy of annual statistical data.
In this paper, we proposed a stacked deep representation learning architecture based on the OCI-ELM algorithm, in which an OCI-ELM autoencoder is added to each layer, called DOC-IELM-AEs. The experimental results strongly demonstrate that DOC-IELM-AEs is suitable for solving regression and classification problems. The simulations showed that (1) compared with CI-ELM, EI-ELM, ECI-ELM, PC-ELM, and OCI-ELM, DOC-IELM-AEs achieves the best testing accuracy with the same network size or even fewer hidden nodes, and its learning speed is also faster than that of the other algorithms; moreover, DOC-IELM-AEs performs better than the OCI-ELM algorithm alone; (2) compared with SVM, ELM, ML-ELM, AE-S-ELMs, DBN, ErrCor, PC-ELM, and OCI-ELM, DOC-IELM-AEs also obtains the best testing accuracy on the large datasets, at the cost of moderately more training time; (3) applied to the case of strip-elongation prediction, DOC-IELM-AEs enhances prediction performance compared with SVM, ELM, ML-ELM, AE-S-ELMs, DBN, ErrCor, PC-ELM, and OCI-ELM; as demonstrated with the production data, the prediction accuracy of the proposed algorithm outperforms that of the other algorithms. For these reasons, OCI-ELM and DOC-IELM-AEs can be further implemented in practical engineering and have the potential, with further study, to solve more complicated big data problems.
The authors declare that they have no competing interests.
This work was supported by the National Natural Science Foundation of China (61102124) and Liaoning Key Industry Programme (JH2/101).