Wind signal forecasting has become increasingly important in structural health monitoring and wind engineering. It remains a challenging subject owing to the complicated volatility of wind signals, and the robustness and generalization of a predictor matter as much as its precision. In this paper, an adaptive residual convolutional neural network (CNN) is developed to achieve not only high precision but also high adaptivity for wind signals of varying complexity. Reinforced forecasting is then adopted to enhance the robustness of the preliminary forecasts: the preliminary results of the adaptive residual CNN are integrated with historical observed signals as a new input to reconstruct a new forecasting mapping. Meanwhile, a simplified-boost strategy is applied for more generalized results. The results of multistep forecasting for five kinds of nonstationary non-Gaussian wind signals demonstrate the superior adaptivity and robustness of the developed two-stage model compared with single models.

During the past decades, high-rise buildings, long-span roofs, and bridges have mushroomed. As the main design load for tall and flexible structures, wind load has attracted considerable attention from researchers [

Abundant approaches have been developed for time series forecasting, which can be divided into four categories: physical methods [

Summary of time series forecasting methods.

| Category | Model | Advantage | Disadvantage | Reference papers |
|---|---|---|---|---|
| Physical methods | Numerical weather prediction | Superior performance in long-term forecasting | High computational cost | [ |
| Statistical methods | Time series models | Good performance on linear data; require less time to build | Difficult to handle nonlinear, nonstationary data owing to their linear nature | [ |
| Artificial intelligence methods | Artificial neural network (ANN) | Handles nonlinear data | Falls into local optima; influenced by initial parameters | [ |
| | Support vector machine (SVM) | Strong generalization; suitable for small and medium datasets | Performance may be influenced by the kernel function and parameters | [ |
| | Extreme learning machine (ELM) | Fast learning speed and good generalization performance | Instability | [ |
| | Decision tree (DT) | Simple; little data preprocessing | Instability; sensitive to the dataset | [ |
| | Random forest (RF) | Stability; strong generalization | Overfitting due to data noise | [ |
| | Fuzzy logic models | Robustness; strong fault tolerance | Low accuracy | [ |
| | Recurrent neural network (RNN) | Suitable for time series | Computational complexity; time-consuming; overfitting | [ |
| | Convolutional neural network (CNN) | Strong representational ability and flexibility | Overfitting | [ |
| Hybrid methods | Decomposition-based methods | High accuracy | Mode aliasing | [ |
| | Parameter optimization-based methods | High accuracy; stability | Time-consuming; computationally complex | [ |
| | Weight-based forecasting methods | Robustness | Multicollinearity | [ |
| | Error correction-based methods | High accuracy | Influenced by the choice of error correction model | [ |

Meanwhile, temporal and spatial dependencies often coexist in engineering applications. Hence, spatiotemporal pattern-based forecasting/detection methods have become a hot topic recently [

Deep learning is another branch of machine learning broadly applied in various fields [

Due to the complex and changeable nature of nonstationary wind, common forecasting methods are generally developed for a certain kind of wind signal or wind field and lack universality [

Presently, many experts have focused on error correction methods to improve forecasting accuracy. Most error correction methods build an additional model to forecast the error components [

Wang et al. [

On the whole, an innovative hybrid model based on multichannel convert, adaptive residual CNN, a reinforced forecasting strategy, and a simplified-boost technique is proposed for multistep wind signal forecasting, aiming to promote the robustness and generalization of a predictor as well as its precision.

The innovative contributions are expressed as follows: (1) Multichannel convert is first developed for preprocessing the input signals, leading to superior feature representation. (2) The developed adaptive residual CNN can be adjusted to signals of varying complexity, greatly improving the adaptability of the model. (3) In pursuit of high precision and robustness, reinforced forecasting is developed to enhance the forecasting ability; meanwhile, the simplified-boost technique is adopted to achieve more generalized results. (4) Multistep forecasting results for five kinds of complex wind signals show the advantages of the developed model. Both the long-term trend and short-term precision can be guaranteed by the proposed two-stage framework, confirming that the hybrid model is the most suitable for wind signal forecasting in comparison with single models. (5) A scientific and comprehensive evaluation is conducted to verify the effectiveness of the developed model.

The remainder of this article is organized as follows.

Given a set of observed historical wind signals with time span

The whole framework of the proposed model is shown in Figure

Stage 1 (preprocessing): Divide the original data into training and testing parts before normalization. Then phase space reconstruction and multichannel convert are implemented for the network input.

Stage 2: Adaptive residual CNN is adopted for preliminary forecasting by building a mapping relationship between historical observed signals and signals to be forecasted.

Stage 3: Reinforced forecasting is implemented by reconstructing the mapping relationship. Specifically, the historical observed signals are integrated with fresh information (i.e., preliminary forecasting results) as new input.

Stage 4: Notice that the embedding dimension is

Stage 5: Evaluate the forecasting errors via the forecasting datasets.

The flowchart of the proposed model.

Time series are usually transformed into phase space to better reflect the system's dynamic information. In this paper, the embedding dimension is determined by minimizing the root mean square error (RMSE) of the forecast results on the training set. Take dataset A in
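This selection procedure can be sketched as follows. A simple linear least-squares predictor stands in for the network (the paper uses the adaptive residual CNN itself); the candidate range and helper names are illustrative, not the paper's notation.

```python
import numpy as np

def embed(series, m):
    """Phase space reconstruction: each row is [x_t, ..., x_{t+m-1}],
    and the one-step target is x_{t+m}."""
    X = np.array([series[i:i + m] for i in range(len(series) - m)])
    y = series[m:]
    return X, y

def rmse_for_dimension(train, m):
    """Fit a linear least-squares predictor (a stand-in for the CNN)
    and return its one-step training RMSE for embedding dimension m."""
    X, y = embed(train, m)
    A = np.hstack([X, np.ones((len(X), 1))])       # add a bias column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    pred = A @ coef
    return float(np.sqrt(np.mean((pred - y) ** 2)))

def select_embedding_dimension(train, candidates=range(2, 16)):
    """Pick the embedding dimension with the smallest training RMSE."""
    errors = {m: rmse_for_dimension(train, m) for m in candidates}
    return min(errors, key=errors.get), errors
```

In practice the candidate dimensions would be evaluated with the actual forecasting model; the linear stand-in only illustrates the minimum-RMSE selection rule.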

Effect of embedding dimension on RMSE.

Subsequently, multichannel convert is developed to provide additional information by presenting temporal patterns at different time scales.

The original input is a one-dimensional vector with size of

The convert strategy provides additional views of the input in terms of temporal resolution, so the network can automatically capture temporal patterns at different time scales. Furthermore, different time series may require feature representations at different time scales. The multichannel input allows the network to adaptively adjust the feature representation to a suitable time scale.
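The exact convert operation is not reproduced in this extract; the sketch below assumes each extra channel is a moving average of the raw window, one common way of presenting the same input at coarser temporal resolution. The scale choices are illustrative.

```python
import numpy as np

def multichannel_convert(window, scales=(1, 2, 4)):
    """Convert a 1-D input window into a multichannel array, one channel
    per temporal scale. Scale 1 keeps the raw signal; larger scales are
    moving averages that show the same window at coarser resolution."""
    channels = []
    for s in scales:
        if s == 1:
            channels.append(window.astype(float))
        else:
            kernel = np.ones(s) / s
            channels.append(np.convolve(window, kernel, mode="same"))
    return np.stack(channels)      # shape: (n_channels, window_length)
```

The stacked channels then feed the convolutional network in place of the original one-dimensional vector.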

Compared with fully connected neural networks, the main advance of the CNN is the introduction of convolution and pooling layers. The convolution layer extracts features by translating filters over the original time series. The pooling layer (maximum or average pooling) reduces the number of learning parameters and the network complexity through sparse parameter representation after features are aggregated. Sparse connectivity, parameter sharing, and equivariant representation are three important characteristics of the convolution operation.

CNN mainly performs implicit feature extraction, with the convolutional filters acting as detectors of complex patterns. Deeper hidden layers can dig out more complex and abstract information, and the number of hidden layers in a CNN depends on the complexity of the target problem. On the one hand, more convolutional layers can dig out more potential features for wind signals with large variation and complicated fluctuation; on the other hand, excessive convolutional layers may fail for slightly fluctuating signals due to overfitting. Therefore, an adaptive residual CNN is proposed for adaptive wind signal forecasting, inspired by the residual neural network [

As the original input of the CNN, a sequence of wind signals is transmitted forward layer by layer, and data features are extracted and enhanced by each convolution layer progressively. In the following numerical examples, three convolutional layers are adopted in the adaptive residual CNN to guarantee the expressive ability of the output features. Meanwhile, in order to map weakly nonlinear input-output relations, an additional connection is introduced from the first convolutional layer to the output layer, with an additional dense layer as shown in Figure

From another point of view, traditional CNNs only use the last convolutional layer for the regression task, ignoring the information contained in earlier convolutional layers. In fact, different convolutional layers contain feature information at different scales. The proposed adaptive residual technique exploits the information across convolutional layers and makes full use of the hierarchical features in earlier layers. Note that the additional connection is not necessarily limited to starting from the first convolutional layer.
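As an illustration of this design, a minimal NumPy forward-pass sketch with the extra dense connection from the first convolutional layer is given below. All weights are random and single-channel; the layer sizes, names, and initialization are illustrative, not the paper's architecture details.

```python
import numpy as np

def conv1d(x, w, b):
    """Valid 1-D convolution of a single-channel signal x with kernel w,
    followed by ReLU activation."""
    k = len(w)
    out = np.array([x[i:i + k] @ w + b for i in range(len(x) - k + 1)])
    return np.maximum(out, 0.0)

def adaptive_residual_cnn_forward(x, params):
    """Three stacked conv layers produce the deep features; an extra
    residual-style connection carries the FIRST conv layer's features
    through its own dense layer and is added to the output, so shallow
    features can dominate for weakly nonlinear signals."""
    h1 = conv1d(x, params["w1"], params["b1"])
    h2 = conv1d(h1, params["w2"], params["b2"])
    h3 = conv1d(h2, params["w3"], params["b3"])
    deep = h3 @ params["dense_deep"]          # regression head on deep features
    shallow = h1 @ params["dense_shallow"]    # additional connection from layer 1
    return deep + shallow

def init_params(n_input, k=3, seed=0):
    """Randomly initialised weights sized for an input of length n_input."""
    rng = np.random.default_rng(seed)
    n1 = n_input - k + 1          # length after conv layer 1
    n3 = n1 - 2 * (k - 1)         # length after conv layer 3
    return {
        "w1": rng.standard_normal(k), "b1": 0.0,
        "w2": rng.standard_normal(k), "b2": 0.0,
        "w3": rng.standard_normal(k), "b3": 0.0,
        "dense_deep": rng.standard_normal(n3),
        "dense_shallow": rng.standard_normal(n1),
    }
```

In a real implementation the same structure would be built in a deep learning framework and trained end to end; the sketch only shows how the shallow and deep paths are summed at the output.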

In this section, a reinforced forecasting algorithm is utilized to optimize the adaptive residual CNN. The preliminary forecast values, considered as additional fresh information, are integrated with the historical observed components as input variables so as to reconstruct a completely new forecasting mapping. The mapping formulation can be expressed as follows:

The reinforced strategy makes full use of the intrinsic connections between the current preliminary forecast values, the historical information, and the current actual values to be forecasted, exploiting the superior nonlinear fitting capability of the reinforced model. Reinforced forecasting is a key step in promoting not only accuracy but also robustness. The predictor used for reinforced forecasting can differ from that of preliminary forecasting; many predictors are options, e.g., RNN, long short-term memory neural network (LSTM), and SVM. In the following numerical examples, the adaptive residual CNN is still selected as the reinforced forecasting model, which achieves the best results by trial and error.

The two-stage model includes preliminary forecasting and reinforced forecasting, which work as two temporal dependency capturers and can be regarded as an encoder and a decoder, respectively. It is worth mentioning that the two stages have complementary forecasting abilities, which enhances the robustness of the algorithm.
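A minimal sketch of the stage-2 input reconstruction follows, using a linear least-squares regressor as a stand-in for the reinforced predictor (the paper reuses the adaptive residual CNN). Variable names are illustrative, not the paper's notation.

```python
import numpy as np

def build_reinforced_inputs(history, preliminary, m):
    """Each training row concatenates the m most recent historical
    observations with the corresponding preliminary forecast; the
    target is the actual value at that step."""
    X, y = [], []
    for t in range(m, len(history)):
        X.append(np.r_[history[t - m:t], preliminary[t]])
        y.append(history[t])
    return np.asarray(X), np.asarray(y)

def fit_reinforced_linear(X, y):
    """Linear least-squares stand-in for the reinforced predictor."""
    A = np.hstack([X, np.ones((len(X), 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return lambda Xn: np.hstack([Xn, np.ones((len(Xn), 1))]) @ coef
```

The reconstructed mapping is then trained exactly like the preliminary one, but with the enlarged input vector.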

However, it is difficult to decide exactly how many historical observed components (i.e., the value of

RMSE with different number of historical observed signals.

In order to solve the problem, a suitable solution is to combine advantages of models with different

(1) Train reinforced forecasting model based on the subtraining set, build a regression model

(2) Calculate a loss for each training sample

(3) Calculate the estimator weight:

Normalize the estimator weight:

The simplified-boost algorithm can balance the effect of input with diverse historical observed signals. More importantly, the generalization ability has been greatly improved so as to forecast various kinds of wind signals adaptively.
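The paper's exact weight formula is elided in this extract; the sketch below uses one plausible realisation of the steps above, weighting each estimator by the inverse of its mean validation loss and normalising the weights to sum to one. The names and the inverse-loss choice are assumptions.

```python
import numpy as np

def simplified_boost_combine(predictions, actual):
    """Combine forecasts from models trained with different numbers of
    historical inputs. Lower validation loss yields a larger weight;
    weights are normalised before forming the ensemble forecast."""
    predictions = np.asarray(predictions)     # shape: (n_models, n_samples)
    losses = np.mean(np.abs(predictions - actual), axis=1)
    raw_weights = 1.0 / (losses + 1e-12)      # inverse-loss weighting
    weights = raw_weights / raw_weights.sum() # normalised estimator weights
    return weights, weights @ predictions     # weighted ensemble forecast
```

With this scheme, estimators whose input length suits the signal dominate the combination automatically, which is the balancing effect described above.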

In this section, multistep forecasting for five kinds of nonstationary non-Gaussian wind signals is implemented to verify the high accuracy and robustness of the proposed model. Six models are adopted for comparison, including DT, RF, back propagation neural network (BPNN), RNN [

Six evaluation indicators, consisting of RMSE, mean absolute error (MAE), mean absolute percent error (MAPE), Pearson correlation coefficient (R), symmetric mean absolute percentage error (SMAPE), and mean absolute scaled error (MASE), are chosen to evaluate the forecasting performance. The value of

Without loss of generality, five kinds of datasets are adopted for experiments. Namely, wind speed on a super high-rise building roof near the coast of the East China Sea in Xiamen at 06:37-07:37 on 8th August 2015 before typhoon Soudelor made landfall; the sampling frequency is 2 Hz, denoted by A. Wind pressure on the surface of a 28-storey building located on the west coast of Qingdao, Shandong province; the sampling frequency is 1 Hz, denoted by B. Wind pressure of the Leqing sports center, a long-span membrane structure whose maximum cantilever span is 57 m [th May 2002 to 15th July 2002 [th August to 06:00 on 25th August 2015, before and after Super Typhoon Goni passed through, at 10 m height of the measurement base located at 31°11′46.36″ N and 121°47′8.29″ E; the sampling frequency is 1 Hz, denoted by

In total, 500 samples are selected for each dataset as shown in Figure . The 1st-400th samples of the original datasets are taken as the training set, which is used to construct the forecasting models. The 401st-500th samples, marked in red, are used as the testing set to verify the effectiveness of the models. Statistics indicators are presented in Table

Original wind signals.

Statistics indicators of wind signal datasets A-E.

| Dataset | Mean | Std | Kurtosis | Skewness | Nonstationary |
|---|---|---|---|---|---|
| A | 12.2154 m/s | 5.4263 m/s | 2.8661 | 0.1211 | Yes |
| B | 232.3584 Pa | 37.3457 Pa | 2.8617 | 0.3615 | Yes |
| C | 86.7584 Pa | 11.0850 Pa | 2.5459 | 0.8090 | Yes |
| D | 9.2556 m/s | 2.2242 m/s | 3.4244 | 0.3695 | Yes |
| E | 2.9962 m/s | 1.0915 m/s | 2.9061 | 0.4937 | Yes |

The forecast results and error criteria of datasets A and B for multistep forecasting are presented in Figures

Results of multistep forecasting of dataset A.

Results of multistep forecasting of dataset B.

Multistep forecasting performance of dataset A.

| Model | RMSE (m/s) | | | MAE (m/s) | | | MAPE (%) | | |
|---|---|---|---|---|---|---|---|---|---|
| | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps |
| DT | 1.103 | 1.370 | 1.623 | 0.829 | 1.128 | 1.384 | 8.435 | 12.318 | 15.935 |
| RF | 0.904 | 1.307 | 1.604 | 0.684 | 1.079 | 1.282 | 7.210 | 11.770 | 14.423 |
| BPNN | 1.158 | 1.370 | 1.719 | 0.930 | 1.103 | 1.398 | 8.728 | 11.659 | 15.114 |
| RNN | 1.081 | 1.561 | 1.695 | 0.866 | 1.187 | 1.342 | 8.479 | 12.938 | 14.621 |
| GRU | 1.050 | 1.749 | 1.906 | 0.807 | 1.366 | 1.531 | 8.341 | 15.544 | 18.497 |
| LSTM | 0.955 | 1.597 | 1.793 | 0.755 | 1.272 | 1.444 | 7.931 | 14.556 | 16.377 |
| Proposed | 0.766 | 1.055 | 1.092 | 0.585 | 0.776 | 0.817 | 6.089 | 8.112 | 8.564 |

| Model | SMAPE (%) | | | R | | | MASE | | |
|---|---|---|---|---|---|---|---|---|---|
| | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps |
| DT | 8.506 | 12.385 | 14.882 | 0.966 | 0.946 | 0.921 | 1.175 | 0.794 | 0.852 |
| RF | 7.386 | 11.682 | 13.897 | 0.977 | 0.950 | 0.924 | 1.022 | 0.762 | 0.746 |
| BPNN | 9.351 | 11.631 | 14.755 | 0.974 | 0.945 | 0.912 | 1.337 | 0.759 | 0.823 |
| RNN | 8.867 | 13.610 | 14.418 | 0.968 | 0.934 | 0.915 | 1.241 | 0.835 | 0.816 |
| GRU | 8.964 | 16.202 | 17.159 | 0.971 | 0.915 | 0.894 | 1.181 | 0.980 | 0.966 |
| LSTM | 8.109 | 14.464 | 15.430 | 0.974 | 0.925 | 0.904 | 1.107 | 0.911 | 0.898 |
| Proposed | 6.237 | 8.264 | 8.564 | 0.983 | 0.968 | 0.965 | 0.879 | 0.549 | 0.502 |

Multistep forecasting performance of dataset B.

| Model | RMSE (Pa) | | | MAE (Pa) | | | MAPE (%) | | |
|---|---|---|---|---|---|---|---|---|---|
| | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps |
| DT | 16.489 | 23.055 | 29.053 | 12.237 | 17.162 | 20.852 | 5.754 | 7.991 | 9.729 |
| RF | 13.073 | 21.800 | 29.502 | 9.072 | 15.656 | 22.119 | 4.295 | 7.368 | 10.248 |
| BPNN | 10.278 | 20.472 | 28.494 | 7.351 | 15.297 | 21.762 | 3.455 | 7.121 | 10.068 |
| RNN | 11.642 | 19.547 | 27.292 | 7.652 | 14.335 | 20.234 | 3.505 | 6.590 | 9.357 |
| GRU | 12.772 | 18.888 | 25.046 | 9.460 | 13.243 | 18.920 | 4.379 | 5.932 | 8.572 |
| LSTM | 10.783 | 20.887 | 26.942 | 7.849 | 15.276 | 19.791 | 3.666 | 7.162 | 9.020 |
| Proposed | 8.874 | 10.216 | 12.766 | 6.143 | 7.464 | 9.292 | 2.889 | 3.475 | 4.258 |

| Model | SMAPE (%) | | | R | | | MASE | | |
|---|---|---|---|---|---|---|---|---|---|
| | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps |
| DT | 5.807 | 7.799 | 9.413 | 0.900 | 0.775 | 0.617 | 1.145 | 0.509 | 0.433 |
| RF | 4.290 | 7.227 | 10.005 | 0.934 | 0.798 | 0.593 | 0.873 | 0.413 | 0.355 |
| BPNN | 3.440 | 7.010 | 9.831 | 0.960 | 0.825 | 0.620 | 0.758 | 0.340 | 0.288 |
| RNN | 3.564 | 6.507 | 9.045 | 0.950 | 0.842 | 0.664 | 0.764 | 0.349 | 0.287 |
| GRU | 4.258 | 5.952 | 8.583 | 0.948 | 0.854 | 0.723 | 0.885 | 0.388 | 0.330 |
| LSTM | 3.636 | 6.912 | 8.890 | 0.956 | 0.825 | 0.667 | 0.806 | 0.350 | 0.294 |
| Proposed | 2.868 | 3.433 | 4.260 | 0.970 | 0.962 | 0.936 | 0.651 | 0.291 | 0.239 |

Multistep forecasting performance of dataset C.

| Model | RMSE (Pa) | | | MAE (Pa) | | | MAPE (%) | | |
|---|---|---|---|---|---|---|---|---|---|
| | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps |
| DT | 0.806 | 0.789 | 0.939 | 0.660 | 0.614 | 0.700 | 0.843 | 0.786 | 0.895 |
| RF | 0.728 | 0.759 | 0.944 | 0.606 | 0.588 | 0.717 | 0.775 | 0.753 | 0.916 |
| BPNN | 0.644 | 0.722 | 0.878 | 0.516 | 0.557 | 0.672 | 0.661 | 0.713 | 0.859 |
| RNN | 1.209 | 0.872 | 0.924 | 0.998 | 0.673 | 0.738 | 1.279 | 0.860 | 0.944 |
| GRU | 0.759 | 0.943 | 1.078 | 0.621 | 0.736 | 0.847 | 0.795 | 0.944 | 1.080 |
| LSTM | 0.854 | 0.896 | 0.912 | 0.645 | 0.697 | 0.719 | 0.828 | 0.892 | 0.917 |
| Proposed | 0.636 | 0.696 | 0.734 | 0.522 | 0.543 | 0.550 | 0.668 | 0.697 | 0.704 |

| Model | SMAPE (%) | | | R | | | MASE | | |
|---|---|---|---|---|---|---|---|---|---|
| | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps |
| DT | 0.843 | 0.787 | 0.898 | 0.886 | 0.888 | 0.831 | 1.338 | 0.932 | 0.937 |
| RF | 0.775 | 0.754 | 0.918 | 0.903 | 0.892 | 0.829 | 1.236 | 0.906 | 0.935 |
| BPNN | 0.660 | 0.714 | 0.862 | 0.917 | 0.899 | 0.847 | 1.070 | 0.870 | 0.905 |
| RNN | 1.267 | 0.882 | 0.946 | 0.881 | 0.881 | 0.847 | 1.961 | 1.030 | 0.951 |
| GRU | 0.797 | 0.944 | 1.085 | 0.895 | 0.836 | 0.811 | 1.293 | 1.111 | 1.193 |
| LSTM | 0.831 | 0.892 | 0.919 | 0.872 | 0.870 | 0.842 | 1.315 | 1.061 | 0.980 |
| Proposed | 0.668 | 0.698 | 0.705 | 0.921 | 0.907 | 0.892 | 1.080 | 0.808 | 0.694 |

Multistep forecasting performance of dataset D.

| Model | RMSE (m/s) | | | MAE (m/s) | | | MAPE (%) | | |
|---|---|---|---|---|---|---|---|---|---|
| | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps |
| DT | 0.490 | 0.607 | 0.689 | 0.375 | 0.457 | 0.534 | 5.940 | 7.188 | 8.361 |
| RF | 0.468 | 0.661 | 0.766 | 0.350 | 0.466 | 0.569 | 5.511 | 7.306 | 8.911 |
| BPNN | 0.431 | 0.572 | 0.690 | 0.341 | 0.436 | 0.525 | 5.525 | 6.873 | 8.214 |
| RNN | 0.596 | 0.738 | 0.693 | 0.422 | 0.504 | 0.507 | 6.399 | 7.689 | 7.838 |
| GRU | 0.657 | 0.803 | 0.909 | 0.469 | 0.599 | 0.647 | 7.280 | 9.472 | 10.265 |
| LSTM | 0.554 | 0.938 | 0.876 | 0.383 | 0.667 | 0.670 | 5.964 | 10.608 | 10.604 |
| Proposed | 0.416 | 0.477 | 0.504 | 0.325 | 0.381 | 0.411 | 5.145 | 6.218 | 6.587 |

| Model | SMAPE (%) | | | R | | | MASE | | |
|---|---|---|---|---|---|---|---|---|---|
| | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps |
| DT | 5.913 | 7.223 | 8.413 | 0.836 | 0.718 | 0.615 | 1.247 | 0.996 | 1.025 |
| RF | 5.499 | 7.283 | 8.906 | 0.845 | 0.680 | 0.547 | 1.172 | 1.023 | 1.047 |
| BPNN | 5.421 | 6.902 | 8.286 | 0.877 | 0.752 | 0.620 | 1.193 | 0.984 | 0.994 |
| RNN | 6.666 | 7.953 | 7.992 | 0.764 | 0.603 | 0.620 | 1.419 | 1.099 | 0.970 |
| GRU | 7.310 | 9.269 | 10.173 | 0.725 | 0.607 | 0.372 | 1.543 | 1.311 | 1.242 |
| LSTM | 6.062 | 10.302 | 10.515 | 0.800 | 0.482 | 0.479 | 1.314 | 1.408 | 1.181 |
| Proposed | 5.142 | 6.074 | 6.531 | 0.878 | 0.854 | 0.815 | 1.103 | 0.789 | 0.757 |

Multistep forecasting performance of dataset E.

| Model | RMSE (m/s) | | | MAE (m/s) | | | MAPE (%) | | |
|---|---|---|---|---|---|---|---|---|---|
| | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps |
| DT | 0.867 | 0.968 | 1.057 | 0.654 | 0.781 | 0.866 | 22.620 | 28.222 | 32.017 |
| RF | 0.784 | 0.924 | 1.020 | 0.609 | 0.743 | 0.820 | 21.356 | 26.962 | 30.864 |
| BPNN | 0.716 | 0.886 | 0.993 | 0.570 | 0.718 | 0.800 | 18.607 | 26.227 | 30.301 |
| RNN | 0.913 | 0.929 | 0.995 | 0.722 | 0.745 | 0.783 | 21.835 | 24.968 | 29.807 |
| GRU | 1.003 | 1.194 | 1.079 | 0.772 | 0.961 | 0.843 | 25.742 | 34.070 | 29.340 |
| LSTM | 0.998 | 1.192 | 1.144 | 0.771 | 0.961 | 0.902 | 25.926 | 33.595 | 31.405 |
| Proposed | 0.707 | 0.729 | 0.794 | 0.570 | 0.588 | 0.643 | 19.080 | 19.764 | 21.880 |

| Model | SMAPE (%) | | | R | | | MASE | | |
|---|---|---|---|---|---|---|---|---|---|
| | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps |
| DT | 20.085 | 24.425 | 26.965 | 0.656 | 0.460 | 0.267 | 1.227 | 0.944 | 0.937 |
| RF | 19.088 | 23.243 | 25.544 | 0.694 | 0.515 | 0.345 | 1.160 | 0.890 | 0.892 |
| BPNN | 18.247 | 22.501 | 24.888 | 0.760 | 0.566 | 0.399 | 1.122 | 0.852 | 0.869 |
| RNN | 24.788 | 24.104 | 24.423 | 0.708 | 0.542 | 0.396 | 1.370 | 0.906 | 0.858 |
| GRU | 22.863 | 28.852 | 26.291 | 0.606 | 0.426 | 0.382 | 1.422 | 1.133 | 0.898 |
| LSTM | 25.351 | 29.580 | 26.859 | 0.606 | 0.366 | 0.383 | 1.452 | 1.164 | 0.986 |
| Proposed | 18.218 | 19.005 | 21.420 | 0.757 | 0.740 | 0.674 | 1.120 | 0.732 | 0.727 |

Based on the forecast results in Figures

The probability distributions with respect to forecast errors in Figures

According to the fitting curves of the observed and forecast values, the forecast values of the proposed model fit the observed values well. They are clearly closer to the observed values than those of the compared models, which verifies the effectiveness of the adaptive residual strategy and the reinforced forecasting strategy once again.

It is clear from the forecasting evaluation indicators in Tables

As clearly inferred from Tables

The proposed model outperforms the other compared models by a large margin. It can be speculated that no single forecasting model is suitable for every case. RNNs (including RNN, LSTM, and GRU) do not perform well compared to the other models. The main reason for the failure of these advanced models may be that they carry many adjustable parameters, which complicates training. Especially in the case of small-sample learning, the training datasets are insufficient for training such complex architectures, which easily leads to overfitting.

We can conclude that the proposed two-stage framework is efficient and achieves superior performance among the compared models.

In this section, comprehensive analysis from a variety of aspects is conducted for further verification of the performance of the developed model.

As a type of hypothesis testing approach, the DM test [

Under a given significance level

The DM test statistics can be described as
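As a sketch, the simplest textbook form of the DM statistic (the mean loss differential divided by its estimated standard error, without the autocovariance correction that multistep forecasts would need) can be computed as:

```python
import numpy as np

def dm_statistic(errors_a, errors_b, power=2):
    """Diebold-Mariano statistic in its simplest form. errors_a and
    errors_b are the forecast errors of two competing models on the
    same test samples; power=2 corresponds to a squared-error loss."""
    d = np.abs(errors_a) ** power - np.abs(errors_b) ** power
    n = len(d)
    return np.mean(d) / np.sqrt(np.var(d, ddof=1) / n)
```

A large positive statistic indicates that the first model's losses are significantly larger than the second model's, matching the sign convention of the comparisons reported below.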

Take datasets A and B as examples. Table

DM test between the proposed model and compared models.

| Model | Dataset A | | | Dataset B | | |
|---|---|---|---|---|---|---|
| | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps |
| DT | 3.078^{∗} | 2.432^{∗∗} | 4.446^{∗} | 4.642^{∗} | 4.108^{∗} | 4.069^{∗} |
| RF | 1.863^{∗∗∗} | 2.146^{∗∗} | 4.011^{∗} | 3.533^{∗} | 3.646^{∗} | 4.283^{∗} |
| BPNN | 4.380^{∗} | 2.663^{∗} | 4.452^{∗} | 2.357^{∗∗} | 3.741^{∗} | 4.426^{∗} |
| RNN | 4.389^{∗} | 3.364^{∗} | 4.249^{∗} | 1.811^{∗∗∗} | 4.061^{∗} | 4.549^{∗} |
| GRU | 3.324^{∗} | 3.925^{∗} | 4.890^{∗} | 3.501^{∗} | 3.218^{∗} | 4.166^{∗} |
| LSTM | 3.670^{∗} | 3.541^{∗} | 4.845^{∗} | 2.126^{∗∗} | 4.339^{∗} | 4.168^{∗} |

^{∗}The 1% significance level; ^{∗∗}the 5% significance level; and ^{∗∗∗} the 10% significance level.

Variance of the forecast errors

Variance of forecast errors for compared models.

| Model | Dataset A | | | Dataset B | | |
|---|---|---|---|---|---|---|
| | 1 step | 3 steps | 5 steps | 1 step | 3 steps | 5 steps |
| DT | 1.2163 | 1.8473 | 2.6153 | 270.9387 | 526.4214 | 824.9468 |
| RF | 0.8174 | 1.7049 | 2.5465 | 170.9140 | 474.3029 | 865.8920 |
| BPNN | 0.8894 | 1.8663 | 2.9253 | 105.4442 | 418.7244 | 807.7960 |
| RNN | 1.1152 | 2.2864 | 2.8394 | 127.9360 | 381.9665 | 737.4361 |
| GRU | 1.0079 | 2.9900 | 3.6188 | 135.7117 | 356.7373 | 625.0089 |
| LSTM | 0.9101 | 2.5201 | 3.2030 | 113.0190 | 420.1230 | 725.8046 |
| Proposed | 0.5867 | 1.1134 | 1.1924 | 76.9971 | 98.5339 | 162.1974 |

The comparison results between adaptive residual CNN and traditional CNN for 1-step forecasting of datasets A, B, and

RMSE based on different models over 40 repeated experiments.

According to Figure

Model capacity refers to a model's ability to fit various functions. If the capacity is insufficient, underfitting occurs; conversely, if the capacity is too high, overfitting occurs easily. The proposed adaptive residual CNN is equipped with adjustable model capacity, which can be tuned adaptively according to the complexity of the task. The various inherent fluctuation characteristics of wind signals can thus be captured by the adaptive residual strategy, improving the generalization ability of the model.

The reinforced forecasting performances for 1-step forecasting of datasets A and B are indicated in Figure

Box-plot of reinforced forecasting performance based on different input variables.

It can be seen that (1) not all reinforced models take effect; for instance, the reinforced model of dataset A performs poorly when

To see this more clearly, the performances of simplified-boost reinforced forecasting and preliminary forecasting over 20 repeated experiments are shown in Figure

RMSE of preliminary forecasting and simplified-boost reinforced forecasting over 20 repeated experiments. (a) Dataset A and (b) dataset B.

The accuracy and robustness of models are essential for wind signal forecasting. However, the intermittency, uncertainty, and diversity of complex wind signals pose enormous challenges for forecasting with a generalized model.

The multichannel convert provides additional views of the same input in terms of temporal resolution, presenting additional information at different time scales. The proposed adaptive residual strategy allows forecasting multiple kinds of wind signals with varying complexity by the same neural network, which not only reduces parameter redundancy and computational complexity but also promotes the generalization capability of the model. Subsequently, the preliminary forecasting is enhanced by the simplified-boost reinforced forecasting, improving the forecasting accuracy and robustness to a great extent. The results of multistep forecasting for five kinds of nonstationary non-Gaussian wind signals verify the efficiency of the proposed model.

Though preliminary forecasting and reinforced forecasting are two seemingly independent stages of the ensemble model, they actually complement each other. The deficiency of reinforced forecasting is that if the preliminary results are poor, faulty information or noise may be introduced into the reinforced model, resulting in unsatisfactory behavior. Ideal preliminary results bring effective information, so satisfactory results can be obtained by the secondary forecasting, creating a virtuous cycle, and vice versa. This phenomenon is particularly evident in multistep forecasting, where more fresh information from the preliminary stage is introduced into the reinforced model and has a more significant impact on the enhancement. In conclusion, reinforced forecasting is the icing on the cake for preliminary forecasting; their complementary nature makes them interdependent, so the accuracy of preliminary forecasting remains important.

The developed hybrid model exhibits strong forecasting ability. It can be considered for application in long-term forecasting and other fields, such as traffic forecasting, economic forecasting, and house price trend forecasting. In long-term forecasting, the correlation between data points is relatively weak due to the long sampling interval compared with short-term data; thus, the proposed model can be combined with data decomposition techniques as a forecasting module for long-term data with high volatility. Further study will focus on seeking a reinforced forecasting model that differs from, and is more efficient than, the preliminary model, so as to better compensate for its inherent defects. Moreover, more effective boosting strategies will be studied in the future.

The data used to support the findings of this study are available from the corresponding author upon request.

The authors declare that there are no conflicts of interest regarding the publication of this paper.

This study was supported by the National Natural Science Foundation of China (Grant no. 51778354).