A Multistep Prediction of Hydropower Station Inflow Based on Bagging-LSTM Model

Inflow forecasting is one of the most important technologies for modern hydropower stations. Under the joint influence of soil, upstream inflow, and precipitation, the inflow is often characterized by time lag, nonlinearity, and uncertainty, which makes accurate multistep prediction of inflow difficult. To address the coupling relationship between inflow and the related factors, this paper proposes a long short-term memory deep learning model based on the Bagging algorithm (Bagging-LSTM) to predict the inflows of the future 3 h, 12 h, and 24 h, respectively. To validate the proposed model, the inflow and related weather data come from a hydropower station in southern China. The results show that the proposed model outperforms classical time series models on different accuracy metrics, especially in the multistep prediction scenario.


Introduction
For hydropower stations, power generation is the main source of economic benefits, and water is the raw material of production. An accurate inflow prediction is conducive to avoiding flood disasters, reasonably arranging flood control and power generation schedules, and improving the economic benefits of power generation. In the past decades, a large number of studies have aimed at effectively improving prediction accuracy. With the progress of science and technology, machine learning technologies have been applied, further improving prediction accuracy [1,2].
In the following, we briefly review the techniques in the existing literature for inflow forecasting. Researchers have employed traditional time series analysis methods in this field, such as autoregressive (AR) [3], moving average (MA), autoregressive moving average (ARMA), and autoregressive integrated moving average (ARIMA) [4] models. The application of and comparison between the above methods can be found in References [5][6][7]. However, these traditional methods share the problem that the predicted trend is roughly correct while the prediction of fluctuations is not ideal, which is mainly reflected in amplitude differences and phase offsets of the fluctuations. Moreover, some researchers have shown that those methods fail to further improve forecasting accuracy due to their stationarity and linearity assumptions [8]. In comparison with many statistical methods, artificial neural networks (ANNs) have proved to be more accurate in time series forecasting due to their ability to deal with nonlinearity and nonstationarity. A three-layered artificial neural network was used to forecast reservoir inflow for a 7-day lead time [9]; it demonstrated that the ANN model has great generalization ability over 23 dams in the U.S. with varying hydrological characteristics. Reference [7] presents a comparison of the prediction performance of ARMA, ARIMA, and autoregressive artificial neural network models: for reservoir inflow over the past 12 months, the ARIMA model had less error than the ARMA model, while the autoregressive artificial neural network could forecast inflow well from the past 60 months of data. Besides, a validation framework for ANNs was introduced to effectively evaluate replicative and structural validity [10].
In the past few years, many machine learning algorithms have been successfully applied to reservoir inflow forecasting problems, such as support vector regression (SVR) [11,12] and deep belief networks (DBN) [13,14], as well as some hybrid models [15][16][17]; these models are frequently applied to inflow prediction. More recently, the rise of deep learning technologies has provided a new solution for inflow prediction. When inflow forecasting is considered, the use of multilayer perceptrons (MLP), recurrent neural networks (RNN), and convolutional neural networks (CNN) is widely observed [18]. Reference [19] presents a comparison between CNN, MLP, and support vector machine (SVM) models in which the superiority of the CNN is revealed. Some variants of RNN, e.g., long short-term memory (LSTM) [20][21][22] and gated recurrent unit (GRU) [23,24], have been proposed and found to perform better than the traditional RNN network.
In this study, we propose an LSTM deep learning model based on the Bagging ensemble learning algorithm for the inflow prediction of hydropower stations. The contributions of this paper can be summarized as follows. (1) This paper is devoted to addressing the problem of multistep forecasting. By explicitly considering the influence of rainfall, temperature, wind direction, and wind speed on inflow, a fusion matrix is constructed from these features. (2) Our method adopts the Bagging strategy for model integration in the inflow prediction scenario. We use the strategy to generate multiple base learners on different resampling subsets. This also reduces the model error by utilizing the independence between base learners. The remainder of the paper is as follows. Section 2 introduces the application context and an abstract description of the problem. Section 3 describes the overall framework of the proposed model. Section 4 demonstrates the exploratory data analysis and the experimental results. Section 5 gives a conclusion.

Problem Description
In recent years, with the increasing accumulation of hydrometeorological data, it has become possible for researchers to further explore and grasp the basic patterns of reservoir inflow variation.
Through effective data analysis methods, we can make a precise forecast of future reservoir stream flow from large amounts of historical data and accessible monitoring data from observation stations in different basins. This will bring significant safety and economic value to hydropower station production. For all that, owing to the absence of forecasts of the time-lag flow upstream and downstream of the cascade and of interval runoff, as well as the impact of many other factors, the predicted values are often greater than the actual values. This often causes the hydropower station's operation plan to suffer frequent revisions and is not conducive to the safe, efficient, and stable operation of the power station. To solve the aforesaid problem, this paper proposes a short-term inflow forecasting method driven by actual production demand. It predicts the inflow in the next 24 hours to provide a reference for irrigation, hydropower generation, domestic and industrial consumption, and flood control measures at the station. The schematic diagram of hydropower station inflow is shown in Figure 1.
Forecasting the inflow of a hydropower station is essentially a time series forecasting problem. To make a scientific forecast of the future trend, it is essential to find the variation pattern of the historical data and to explore the relationship between the forecasted variable and the various related factors.
Therefore, from the perspective of time series analysis, the basic principle of the forecasting problem is depicted in Figure 2. The variable t indicates the time scale, which can be minutes, hours, days, and so on, depending on the specific problem. Given the collected historical sample data y_t, t ∈ [1, T], where the time period T is the last period of known data, the task is to forecast y_t for the future time period T + P ≤ t ≤ T + Q based on the historical development pattern of the forecast object, where P ≤ Q and T, P, and Q are the corresponding time scale constants. The prediction model can be written as

y_t = f(y_1, y_2, . . . , y_T; X; S), T + P ≤ t ≤ T + Q. (1)

In equation (1), t is the time series number; y is the variable to be predicted; X = (x_1, x_2, . . . , x_m) is a vector of m relevant factors; and S is the parameter vector of this prediction model.
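As a minimal illustration of this formulation, the historical series and related factors can be arranged into supervised learning samples with a sliding window. The following sketch assumes a fused matrix whose first column is the inflow y and whose remaining columns are the related factors; the window length and column layout are illustrative, not prescribed by the paper:

```python
import numpy as np

def make_supervised(fused, n_steps):
    """Slice a fused feature matrix (rows = time, column 0 = inflow y,
    remaining columns = related factors X) into supervised samples:
    each input is a window of n_steps past rows, each target is the
    inflow at the next time step."""
    X, y = [], []
    for i in range(len(fused) - n_steps):
        X.append(fused[i:i + n_steps])      # past n_steps observations
        y.append(fused[i + n_steps, 0])     # next-step inflow
    return np.array(X), np.array(y)

# toy example: 100 time steps, inflow plus 4 related factors
fused = np.random.rand(100, 5)
X, y = make_supervised(fused, n_steps=8)
# X has shape (92, 8, 5) and y has shape (92,)
```

The same windowing applies for multistep targets by taking a slice of future rows instead of a single value.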

Model Framework.
The flow chart of the model framework is shown in Figure 3. Generally, a model ensemble has stronger generalization ability than a single base learner. To make the model more reliable, the Bagging ensemble learning algorithm is introduced to enhance the independence between the base models, thereby reducing the model error. From the figure, we can see that the prediction model consists of the following sections.
(1) Data Preprocessing. Firstly, the missing values and abnormal values of the data are processed, and then the data are normalized to improve the convergence speed of the model. Due to the different temporal resolutions of the datasets, the environmental data and the rainfall data of the telemetry station are resampled to keep the temporal resolution consistent with the inflow data, and then all the features are fused. (2) Model Training. The fused dataset is divided into a training set and a test set. For the training set, Bagging uses bootstrap sampling (sampling with replacement) to obtain a different training subset for each base learner. In this model, each base learner uses a three-layer neural network. The first layer is the input layer. The second layer is the LSTM layer, with 50 neurons, tanh as the activation function, and Adam as the optimizer. The third layer is the output layer, with the activation function defaulting to sigmoid.
(3) Model Integration and Prediction. Bagging integrates the base learners using the classical weighted averaging method: it performs a weighted average of the predictions of each base learner to obtain the final prediction of the model.
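The weighted-averaging integration step can be sketched as follows. The constant-output callables here are hypothetical stand-ins for trained base learners; with equal weights the scheme reduces to a plain average:

```python
import numpy as np

def ensemble_predict(models, X, weights=None):
    """Combine base-learner predictions by weighted averaging.
    `models` is a list of callables mapping inputs to predictions."""
    preds = np.stack([m(X) for m in models])        # shape (T, n_samples)
    if weights is None:                             # default: equal weights
        weights = np.full(len(models), 1.0 / len(models))
    return np.tensordot(weights, preds, axes=1)     # weighted average

# toy base learners that always predict the constants 1, 2, and 3
models = [lambda X, c=c: np.full(len(X), c) for c in (1.0, 2.0, 3.0)]
X = np.zeros((4, 8, 5))
print(ensemble_predict(models, X))   # equal weights -> [2. 2. 2. 2.]
```

Unequal weights (for example, proportional to each base learner's validation accuracy) plug into the same call via the `weights` argument.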

LSTM Neural Network Cell.
RNN is a common neural network structure used to process time series data. RNN has a special network structure that reflects the impact of past moments on current predictions. Meanwhile, it shares the weight matrix across time steps, which decreases the number of parameters and thereby improves training efficiency. Besides, it has the advantage of handling time series data of arbitrary length. The recurrent cell of the standard RNN is shown in Figure 4(a). The network has the same structure at each time t. X_t is the input data at time t. H_t is the hidden state at time t, which is the memory state of the network; it captures information from all previous time steps. H_{t−1} is the hidden state at time t−1 and is usually initialized to zero. H_t is calculated from H_{t−1} and X_t. The activation function is nonlinear, such as tanh or ReLU. O_t is the output data at time t.
However, in practical applications, the training of a standard recurrent neural network suffers from the long-term dependence problem: as the network structure deepens, the network becomes unable to learn from past information. The LSTM network model was proposed to effectively solve the problem of long-term dependence on information and to avoid gradient vanishing or explosion. The LSTM unit is described in Figure 4(b). Compared with the traditional RNN, the uniqueness of LSTM in structure is its cleverly designed gate mechanism. LSTM uses two gates to control the content of the cell state C. One is the forget gate, which determines how much of the previous cell state C_{t−1} is retained in the current state C_t. The other is the input gate, which decides how much of the current input X_t is saved to the cell state C_t. As for the output gate, it controls how much of the cell state C_t is output to the current output value H_t of the LSTM.
f_t = σ(W_f · [h_{t−1}, x_t] + b_f),
i_t = σ(W_i · [h_{t−1}, x_t] + b_i),
o_t = σ(W_o · [h_{t−1}, x_t] + b_o),
c̃_t = tanh(W_c · [h_{t−1}, x_t] + b_c),
c_t = f_t ⊙ c_{t−1} + i_t ⊙ c̃_t,
h_t = o_t ⊙ tanh(c_t). (2)

In equation (2), f_t, i_t, and o_t denote the forget gate, input gate, and output gate, respectively; c_t denotes the cell state; h_t denotes the current output value; x_t denotes the current input value; W denotes the weight matrices; b denotes the bias vectors; [·, ·] represents splicing the two matrices by rows; σ(·) is the sigmoid function; ⊙ denotes the element-wise product; and tanh is the hyperbolic tangent function. In order to obtain the optimal model parameters, the model parameters are updated during training using the backpropagation through time (BPTT) algorithm.
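Equation (2) corresponds directly to one forward step of an LSTM cell. A NumPy sketch follows; the dimensions and random initialization are chosen purely for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM forward step following equation (2).
    p holds weight matrices W* of shape (hidden, hidden + input)
    and bias vectors b* of shape (hidden,)."""
    z = np.concatenate([h_prev, x_t])          # [h_{t-1}, x_t] spliced by rows
    f = sigmoid(p["Wf"] @ z + p["bf"])         # forget gate
    i = sigmoid(p["Wi"] @ z + p["bi"])         # input gate
    o = sigmoid(p["Wo"] @ z + p["bo"])         # output gate
    c_cand = np.tanh(p["Wc"] @ z + p["bc"])    # candidate cell state
    c_t = f * c_prev + i * c_cand              # new cell state
    h_t = o * np.tanh(c_t)                     # new hidden state / output
    return h_t, c_t

hidden, n_in = 4, 3
rng = np.random.default_rng(0)
p = {f"W{g}": rng.standard_normal((hidden, hidden + n_in)) * 0.1 for g in "fioc"}
p.update({f"b{g}": np.zeros(hidden) for g in "fioc"})
h, c = lstm_step(rng.standard_normal(n_in), np.zeros(hidden), np.zeros(hidden), p)
```

Because the output gate lies in (0, 1) and tanh is bounded by 1, the hidden state h_t always stays inside (−1, 1).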

The Flow Chart of Model Training.
The training process of the model is shown in Figure 5. Specifically, the fused feature dataset D is first used as input, the LSTM is selected as the base learner, the number of base learners T is set, and the model set is initialized. Besides, the loop count variable q is set and initialized to 1. Next, q is compared with T, and while q does not exceed T, the training loop is entered. In the first step, m samples are randomly selected from the fused feature set D to form the subset Sub; the samples are selected with replacement, which causes some samples to be selected multiple times while others may not be selected at all. In the second step, the subset Sub is divided into a training set Sub_Train and a validation set Sub_Valid, and the base learner is trained on the training set to obtain the trained base model M_q. In the third step, the validation set is fed into the model M_q to verify whether its prediction accuracy meets the requirements; if not, the model is retrained, and if it does, the trained model M_q is added to the model set Models. In the fourth step, the loop variable q is incremented by 1 to train the next base model, until the training of all T models is completed. Finally, the model set Models containing T base learning models is obtained.
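The training loop above can be sketched as follows. The `train_fn` callback is a hypothetical stand-in for fitting and validating one LSTM base learner, and the accuracy re-check loop is simplified away for brevity:

```python
import numpy as np

def bagging_train(D, T, train_fn, valid_frac=0.2, seed=0):
    """Train T base learners on bootstrap resamples of dataset D.
    Sampling is with replacement, so some samples repeat within a
    subset while others are left out of it entirely."""
    rng = np.random.default_rng(seed)
    models = []
    n = len(D)
    for q in range(T):
        idx = rng.integers(0, n, size=n)               # bootstrap indices
        sub = D[idx]                                   # subset Sub
        split = int(n * (1 - valid_frac))
        sub_train, sub_valid = sub[:split], sub[split:]  # Sub_Train / Sub_Valid
        models.append(train_fn(sub_train, sub_valid))  # trained base model M_q
    return models

# toy base "learner": it just remembers the training-subset mean
D = np.arange(100, dtype=float).reshape(50, 2)
models = bagging_train(D, T=5, train_fn=lambda tr, va: tr.mean())
```

In the full model each `train_fn` call would fit the three-layer LSTM network described in Section 3 and validate it on Sub_Valid before returning.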

Evaluation Metrics.
In order to comprehensively evaluate the proposed model, the NSE and R² indicators are used to measure the accuracy of the model:

NSE = 1 − Σ_{t=1}^{T} (y_t − ŷ_t)² / Σ_{t=1}^{T} (y_t − ȳ)², (3)

R² = [Σ_{i=1}^{n} (y_i − ȳ)(ŷ_i − ŷ̄)]² / [Σ_{i=1}^{n} (y_i − ȳ)² · Σ_{i=1}^{n} (ŷ_i − ŷ̄)²]. (4)

In the NSE index, y_t represents the real value at time t, ŷ_t represents the predicted value at time t, ȳ represents the average of the real values, and T represents the total time. In the R² metric, n represents the number of samples, y_i represents the real value of sample i, ŷ_i represents the predicted value of sample i, ȳ represents the average of the real values, and ŷ̄ represents the average of the predicted values.

The mean absolute error (MAE), root mean square error (RMSE), and normalized root mean square error (NRMSE) were selected to calculate the model error:

MAE = (1/N) Σ_{i=1}^{N} |y_i − ŷ_i|,
RMSE = sqrt((1/N) Σ_{i=1}^{N} (y_i − ŷ_i)²),
NRMSE = RMSE / (y_max − y_min).

In these metrics, N represents the number of samples, y_i represents the real value of sample i, ŷ_i represents the predicted value of sample i, and y_max and y_min represent the maximum and minimum values of the test set, respectively.
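The five metrics can be written down directly from the definitions above (a sketch consistent with those definitions):

```python
import numpy as np

def metrics(y, yhat):
    """NSE, R^2, MAE, RMSE, and NRMSE for a pair of true/predicted series."""
    y, yhat = np.asarray(y, float), np.asarray(yhat, float)
    resid = y - yhat
    nse = 1.0 - np.sum(resid**2) / np.sum((y - y.mean())**2)
    cov = np.sum((y - y.mean()) * (yhat - yhat.mean()))
    r2 = cov**2 / (np.sum((y - y.mean())**2) * np.sum((yhat - yhat.mean())**2))
    mae = np.mean(np.abs(resid))
    rmse = np.sqrt(np.mean(resid**2))
    nrmse = rmse / (y.max() - y.min())   # normalized by the test-set range
    return nse, r2, mae, rmse, nrmse

nse, r2, mae, rmse, nrmse = metrics([1.0, 2.0, 3.0, 4.0], [1.1, 1.9, 3.2, 3.8])
```

For a perfect prediction NSE and R² both equal 1 while MAE, RMSE, and NRMSE are 0; for the small example above NSE works out to 0.98.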

Experimental Result and Analysis
To verify the prediction performance of the proposed model, real data from a hydropower station in southern China were selected for the experiments. The inflow data recorded by the station from 2015 to 2017 were used, with a time resolution of 3 h; the historical inflow data are shown in Figure 6. Given the comprehensive influence of precipitation, evaporation, soil (which directly affects surface and underground runoff), upstream inflow, and many other factors, the rainfall data of local telemetry stations and environmental observation data (temperature, wind direction, and wind speed) were added as input factors. In view of security concerns, all data used have been desensitized. The training set and test set are divided in proportions of 80% and 20%, and the reservoir inflow of the next 3 h, 12 h, and 24 h is predicted, respectively. All relevant datasets can be retrieved from https://github.com/HYNU-WLL/The-Inflow-Prediction.

Data Exploration.
Data exploration is an important part of data analysis and can provide references for the experiments. Usually, inflow is closely related to meteorological and seasonal changes, with a certain periodicity and continuity. Temporal correlation is one of the most important pieces of information for inflow prediction. Rainwater reaching the reservoir through infiltration and underground runoff often has a certain delay. The measured inflow data of the hydropower station over a certain period were selected for the correlation analysis of the time series, including the autocorrelation function (ACF) and the partial autocorrelation function (PACF). The analysis results are shown in Figure 7. The ACF and PACF measure the dependence of present samples on past samples of the same series and can be calculated by the following equations:

ρ_k = Σ_{t=1}^{N−k} (y_t − ȳ)(y_{t+k} − ȳ) / Σ_{t=1}^{N} (y_t − ȳ)², (5)

φ_{k,k} = (ρ_k − Σ_{j=1}^{k−1} φ_{k−1,j} ρ_{k−j}) / (1 − Σ_{j=1}^{k−1} φ_{k−1,j} ρ_j). (6)

In equations (5) and (6), N is the series length, y_t is the value of the series at moment t, ȳ is the mean of the series, ρ_k is the autocorrelation at lag k, φ_{1,1} = ρ_1, and φ_{k,j} = φ_{k−1,j} − φ_{k,k} φ_{k−1,k−j}, j = 1, 2, . . . , k − 1. From Figure 7, it can be found that there is a specific regularity and periodicity between adjacent time nodes as time changes. However, the correlations become weaker as the time interval becomes longer. Therefore, in actual forecasting work, the selection of historical data within a reasonable time range determines the model's prediction performance: a proper setting of this parameter can improve the prediction accuracy of the model and also reduce unnecessary computational burden. Based on the graphical analysis results, 8 time steps of historical data are selected as the input of the model in this paper. The inflow of hydropower stations mainly comes from rainfall-runoff, watershed confluence, and river confluence, while water vapor evaporation and surface transpiration reduce the amount of water.
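Equations (5) and (6) can be implemented as follows (a sketch; in practice library routines such as those in statsmodels would normally be used instead):

```python
import numpy as np

def acf(y, k):
    """Sample autocorrelation of series y at lag k (equation (5))."""
    y = np.asarray(y, float)
    ybar, N = y.mean(), len(y)
    return np.sum((y[:N - k] - ybar) * (y[k:] - ybar)) / np.sum((y - ybar)**2)

def pacf(y, kmax):
    """Partial autocorrelations for lags 1..kmax via the
    Durbin-Levinson recursion (equation (6))."""
    rho = [acf(y, k) for k in range(kmax + 1)]
    phi = np.zeros((kmax + 1, kmax + 1))
    phi[1, 1] = rho[1]
    for k in range(2, kmax + 1):
        num = rho[k] - sum(phi[k - 1, j] * rho[k - j] for j in range(1, k))
        den = 1.0 - sum(phi[k - 1, j] * rho[j] for j in range(1, k))
        phi[k, k] = num / den
        for j in range(1, k):
            phi[k, j] = phi[k - 1, j] - phi[k, k] * phi[k - 1, k - j]
    return [phi[k, k] for k in range(1, kmax + 1)]

# toy periodic series standing in for 3-hourly inflow (period 24 samples)
t = np.arange(200)
y = np.sin(2 * np.pi * t / 24) + 0.1 * np.random.default_rng(0).standard_normal(200)
```

By construction the ACF at lag 0 is 1, and the lag-1 PACF equals the lag-1 ACF.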
At the same time, human needs and behavior also affect the change of reservoir flow, which often makes the inflow difficult to predict; many of these factors are difficult to observe accurately, so effective supporting data cannot be obtained, and it is difficult to further improve the prediction accuracy. Considering the influence of rainfall, temperature, wind speed, and wind direction on inflow, the correlation between inflow and the other factors is shown in Figure 8. It reveals that, in the hydropower station basin, the correlation between inflow and temperature is the highest, followed by the correlation with rainfall; in contrast, the correlations of wind speed and wind direction with inflow are relatively low. In this model, the inflow is used as the prediction feature, and the influence of rainfall, temperature, wind speed, and wind direction on the inflow is considered to improve the prediction accuracy.
The prediction results of the compared models are plotted in Figure 9. The proposed model's prediction is closest to the true value, and it shows better stability and less volatility in its prediction results. However, the prediction curve of the SVR model in Figure 9 is highly oscillatory and far from the true value, which is consistent with the model's error metrics. The predicted values of the DBN, RFR, and GBRT models fit the true value better, but these models do not predict the extreme values well enough to capture their changes. Meanwhile, the scatter plots of the predicted and true values of each model are depicted in Figure 10. The quality of each model's predictions can be judged from the degree of aggregation of the scatter points. Compared with the plain LSTM model, the Bagging-LSTM model integrates multiple models to make predictions; its predicted values lie closer to the true values, and its prediction results are better than those of the LSTM.
However, after comparing the calculations, it is found that the proposed model has only a slight advantage over the other models in terms of prediction errors.

As the prediction step increases, the advantage of the proposed model becomes more significant. The proposed model significantly outperforms the other models on all indicators for the 12-hour ahead and 24-hour ahead forecasts. The evaluation indexes of the 12-hour ahead and 24-hour ahead prediction results are shown in Tables 2 and 3, respectively, from which it can be seen that the NSE and R² indexes of the SVR model decrease rapidly and the error of the model becomes larger. This is due to the fact that the SVR model lacks long-term memory capability and therefore cannot solve the multistep prediction problem well. Specifically, as the prediction horizon grows, it uses the prediction result of the previous step as the output of the current step, resulting in a single prediction value that appears as a straight prediction line in Figures 11 and 12. Similarly, the prediction distribution of the scatter plots in Figures 13 and 14 shows that the prediction results of the DBN and RFR models are limited to a single maximum value, which appears as a truncation phenomenon in the plots.
This reveals the limitation of these models in capturing extreme values in multistep forecasting. As a whole, the longer the prediction horizon, the lower the prediction accuracy and the larger the prediction error. In conclusion, the model proposed in this paper has higher prediction accuracy, lower error, and better robustness compared with the traditional models.

Conclusion
The real-time and effective prediction of hydropower station inflows can provide a favorable reference for irrigation, hydroelectric power generation, domestic and industrial consumption, and flood control. In this work, a Bagging-LSTM deep learning model has been constructed for forecasting reservoir inflows and evaluated on the historical data of a real hydropower station in southern China. Extensive experiments showed that the proposed model yields higher accuracy and lower error compared with the LSTM, SVR, DBN, RFR, and GBRT approaches. In particular, the advantage of the proposed model becomes more significant as the prediction step becomes longer. Besides, for long-term forecasting and extreme-event analysis, the Bagging-LSTM model shows clear superiority over conventional methods. However, there is still room for improvement in the Bagging-LSTM stream-flow forecasting model: the results show that errors in peak volume forecasts cannot be ignored.

Data Availability
The data that support the findings of this study are openly available in GitHub at https://github.com/HYNU-WLL/The-Inflow-Prediction or from the corresponding author upon request.

Conflicts of Interest
The authors declare that they have no conflicts of interest or personal relationships that could have appeared to influence the work reported in this paper.