Short-Term Power Generation Energy Forecasting Model for Small Hydropower Stations Using GA-SVM

Accurate and reliable forecasting of the power generation energy of small hydropower (SHP) is essential for hydropower management and scheduling. Because SHP plants have long been operated without on-site supervision, historical power generation records are scarce, which makes a forecasting model difficult to develop. In this paper, the support vector machine (SVM) is chosen for short-term power generation energy prediction because it has distinct advantages for small-sample, nonlinear, and high-dimensional pattern recognition problems. A genetic algorithm (GA) is used to identify appropriate parameters for the SVM prediction model. The GA-SVM prediction model is tested on short-term observations of power generation energy in Yunlong County and Maguan County in Yunnan province. Comparison with the ARMA model shows that the GA-SVM model is a promising candidate for predicting the short-term power generation energy of SHP.


Introduction
Small hydropower (SHP) is a globally recognized source of clean, renewable energy. It attracts worldwide attention because of its significance for managing medium and small rivers, strengthening rural water conservancy infrastructure, meeting rural energy demand, improving the rural energy structure, reducing environmental pollution, responding to climate change, and promoting local economic development [1][2][3][4][5]. Over the past two decades, the installed capacity of SHP has increased by more than 2.5 GW per year because it offers many advantages, such as small scale, mature technology, short construction time, low investment, and near-zero pollution emissions, and it generally causes no resettlement or land submersion.
By the end of 2012, the installed capacity of SHP in China had exceeded 65 GW and its annual generation had exceeded 200 TWh, which account for about 30% of hydropower installed capacity and power generation, respectively, and both rank first in the world [6]. Unlike in other countries, SHP plays an important role in China's rural electricity supply because it is widely distributed across more than 1600 mountainous counties; approximately half of the territory, one-third of the counties, and a quarter of the total population depend on SHP for rural electricity supply [7,8]. However, with the fast development of SHP and its large-scale connection to the power grid, its influence on the grid is becoming more and more evident, especially in southwest China, which is rich in SHP resources. SHP has become a major factor affecting the safe operation and development of the power grid. Most SHP plants are run-of-river plants without regulation ability, so their power output is intermittent and seasonal because of the uncertainty of rainfall. In the flood season in particular, rainfall is heavy and concentrated, so SHP plants may generate much more power than in other periods. At the same time, large hydropower plants also generate more power. Under the current transmission capacity, this can lead to wasted water resources and curtailed electricity. It is therefore necessary to know the short-term power generation energy (STPGE) of SHP in advance so that the regulation ability of large hydropower plants can be used to avoid this situation.

Mathematical Problems in Engineering
However, SHP plants are generally located in small, remote river basins with few hydrologic stations, and their management is weak because they have long been left unattended, so forecasting the STPGE of SHP is very difficult for lack of the necessary runoff data. Many studies of short-term forecasting models for hydropower stations have been carried out, focusing on forecasting reservoir inflow [9][10][11][12], stream flow [13][14][15], or precipitation [16]. But few works address forecasting the STPGE of SHP stations [17]. Since its parameters greatly affect the performance of an SVM, some studies have attempted to determine proper parameter values for their problems [18,19]. However, for large-scale or real-time practical applications, the considerable search time is unacceptable. Heuristic algorithms have been successfully used in many complex problems [20][21][22].
This paper presents a novel short-term forecasting model (named GA-SVM) for the power generation energy of SHP stations. In this study, a support vector machine (SVM), which is based on the structural risk minimization principle, is used to model power generation energy [18][23][24][25][26], and its parameters are optimized by a genetic algorithm (GA) to obtain the optimal model structure [27,28]. Because SHP plants or hydro units are put into operation dynamically, the power generation energy of SHP is not comparable across time, so the installed capacity utilization hours of SHP are selected as the input and output values of the proposed forecasting model. The method is applied to forecast the STPGE of the small hydropower stations in Yunlong County and Maguan County, Yunnan province, China. Compared with the conventional method, the proposed GA-SVM model exhibits superior performance, demonstrating its effectiveness as an approach to forecasting the STPGE of SHP.
The paper is organized as follows. The next section, "Brief Introduction to SVM and GA," briefly introduces the SVM and GA algorithms. The proposed GA-SVM forecasting method is then described in the following section. After that, the method is applied to Yunnan province and the results are compared with those of the conventional method. The final section concludes the paper.

Support Vector Machines (SVM).
The SVM, developed by Vapnik [29], is based on statistical learning theory and implements the structural risk minimization principle rather than the empirical risk minimization principle implemented by most traditional ANN models. It seeks to minimize an upper bound on the generalization error instead of minimizing the training error and can thereby achieve an optimal network structure. Many researchers have used SVM to build forecasting models in various fields, with much of the work focusing on rainfall forecasting. Dibike et al. demonstrated the capability of the SVM in hydrological prediction, such as modeling the rainfall-runoff process [30]. Other scholars have used the SVM for rainfall forecasting with lead times ranging from 1-2 days to 1 h [31]. In this paper, the SVM model is used to forecast the STPGE of SHP. The radial basis function (RBF) is employed as the kernel function because it is more compact than other kernels, shortens the computational training process, and improves generalization performance [30]. The RBF is also computationally simpler than a polynomial kernel, which has more parameters [32]. The RBF kernel is a Gaussian function of the distance between two input vectors.
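For reference, one common way to write the Gaussian RBF kernel is given below. The input-vector symbols $x_i$ and $x_j$ are notation assumed here, and $\sigma$ is taken to be the kernel parameter that the GA tunes later in the paper; other texts use the equivalent form with $\gamma = 1/(2\sigma^{2})$.

```latex
K(x_i, x_j) = \exp\!\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}}\right)
```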

Genetic Algorithm (GA).
GA is a global optimization algorithm based on the "survival of the fittest" principle in Darwin's theory of evolution and provides an efficient and robust search method for complex spaces. It is a probabilistic search algorithm well suited to global optimization. GA operates iteratively on a population of structures, each of which represents a candidate solution to the problem encoded as a string of symbols (a chromosome), and uses guided random search to explore the coded parameter space effectively. GA uses an encoding to map the solution space of the problem into a chromosome space, so that the decision variables become the structure of individual chromosomes. During each iteration, the population of individuals produces the next generation through selection, crossover, and mutation according to the rules set by the fitness function. Traits that increase fitness are inherited, while traits that reduce fitness are eliminated over successive iterations of crossover and mutation. After continued evolution, the fittest individuals survive and provide an approximately optimal solution to the problem.
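The selection, crossover, and mutation cycle described above can be illustrated with a minimal sketch. It is not the implementation used in this study: the function name `run_ga`, the real-valued encoding, tournament selection, blend crossover, and the default rates are all assumptions chosen for brevity.

```python
import random


def run_ga(fitness, bounds, pop_size=30, generations=50,
           crossover_rate=0.8, mutation_rate=0.1, seed=0):
    """Maximize fitness(genes) over box bounds with a small real-coded GA."""
    rng = random.Random(seed)

    def tournament(scored):
        # Selection: draw two individuals and keep the fitter one.
        a, b = rng.sample(scored, 2)
        return a[1] if a[0] >= b[0] else b[1]

    population = [[rng.uniform(lo, hi) for lo, hi in bounds]
                  for _ in range(pop_size)]
    scored = [(fitness(ind), ind) for ind in population]
    best_score, best = max(scored, key=lambda s: s[0])

    for _ in range(generations):
        children = []
        while len(children) < pop_size:
            p1, p2 = tournament(scored), tournament(scored)
            if rng.random() < crossover_rate:
                # Crossover: blend the two parents with a random weight.
                w = rng.random()
                child = [w * x + (1.0 - w) * y for x, y in zip(p1, p2)]
            else:
                child = list(p1)
            for i, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    # Mutation: resample this gene uniformly within its bounds.
                    child[i] = rng.uniform(lo, hi)
                child[i] = min(max(child[i], lo), hi)  # guard against rounding
            children.append(child)
        scored = [(fitness(ind), ind) for ind in children]
        gen_score, gen_best = max(scored, key=lambda s: s[0])
        if gen_score > best_score:
            # Keep the best individual seen in any generation.
            best_score, best = gen_score, gen_best
    return best, best_score
```

In a GA-SVM setting, `fitness` would typically return the negative forecasting error of an SVM trained with the parameters encoded in `genes`, so that maximizing fitness minimizes the error.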

Short-Term Forecasting Model for Power Generation Energy Using GA-SVM
3.1. Forecasting Object. Generally, the daily power generation energy is directly selected as the forecasting object for the STPGE of SHP. However, because small hydropower plants or hydro units are put into operation dynamically in some regions, the installed capacity of SHP differs from one day to another. Since the power output of an SHP plant is close to its installed capacity in the flood season, the power generation energy also changes markedly as the installed capacity of SHP increases. The prediction performance of the model will suffer if only the power generation energy of SHP is used as the input and output values of the model. Therefore, the installed capacity utilization hour is used to represent the power generation energy of SHP in a region. This not only accurately reflects the characteristics of small hydropower plants without regulation ability but also alleviates short-term fluctuations in the power generation curve. The installed capacity utilization hour is
$$\mathrm{Hour}_t = \frac{\mathrm{Energy}_t}{\mathrm{Capacity}_t},$$
where $\mathrm{Hour}_t$ is the installed capacity utilization hour in the region on day $t$; $\mathrm{Energy}_t$ is the power generation energy in the region on day $t$; and $\mathrm{Capacity}_t$ is the installed capacity of all small hydropower plants in the region on day $t$.
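A short sketch shows why utilization hours are less sensitive to capacity changes than raw energy. The function name, the units (MWh and MW), and the sample numbers are illustrative assumptions, not data from the study.

```python
def utilization_hours(energy_mwh, capacity_mw):
    """Daily installed-capacity utilization hours: energy divided by capacity."""
    if len(energy_mwh) != len(capacity_mw):
        raise ValueError("energy and capacity series must have the same length")
    return [e / c for e, c in zip(energy_mwh, capacity_mw)]


# Hypothetical two-day example: a new unit raises capacity from 100 MW to 125 MW.
# Energy jumps by 25%, but the utilization hours stay the same.
hours = utilization_hours([1200.0, 1500.0], [100.0, 125.0])
```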

Short-Term Forecasting Model of SHP Using GA-SVM.
To apply the SVM model to forecast the STPGE of SHP plants in a region, three vital parameters of the SVM with RBF kernel must be determined: $C$, $\varepsilon$, and $\sigma$, which respectively denote the positive constant, the parameter of the insensitive loss function, and the standard deviation of the Gaussian noise level. Different values of $C$, $\varepsilon$, and $\sigma$ can lead to large differences in the forecasting result. The parameters $C$, $\varepsilon$, and $\sigma$ control the complexity of the model and the approximation error, and thus affect both the difficulty of training and the forecasting accuracy. To improve the forecasting accuracy, the three parameters must be chosen carefully. In recent years, several methods such as the genetic algorithm [33,34] and the shuffled complex evolution algorithm [35][36][37] have been developed for model parameter calibration. In this paper, GA is used to optimize the parameters of the SVM. This approach requires no a priori knowledge and offers high stability and accuracy. Figure 1 illustrates the flow chart for optimizing the three parameters of the SVM model by GA. The GA searches for a better combination of the three parameters so that a higher forecasting accuracy is obtained in each iteration. In this study, the input and output variables are normalized to the range from 0 to 1 by (3). This limits the range of errors and keeps the model data on a common scale in order to improve prediction accuracy. After training and testing, the forecast values are converted back to power generation energy by (4).

Model Performance Estimation.
Many goodness-of-fit measures have been applied to evaluate model performance. Appropriate evaluation criteria should be chosen when multiple criteria are used to validate model performance [38].
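In this paper, the following two statistical measures, which are commonly used in other studies, are chosen as evaluation criteria for model performance:
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\left(\mathrm{Energy}^{*}_t - \mathrm{Energy}_t\right)^{2}}, \qquad \mathrm{MAPE} = \frac{1}{n}\sum_{t=1}^{n}\left|\frac{\mathrm{Energy}^{*}_t - \mathrm{Energy}_t}{\mathrm{Energy}^{*}_t}\right| \times 100,$$
where $n$ is the total number of observed data and $\mathrm{Energy}^{*}_t$ and $\mathrm{Energy}_t$ are the observed and forecasted values on day $t$, respectively.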
The root mean squared error (RMSE) is nonnegative, and a value close to zero indicates good performance. The mean absolute percentage error (MAPE) is a relative measure of absolute model error and expresses accuracy as a percentage [39,40]. The smaller the MAPE, the better the model performs.
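The two measures can be computed as in the following sketch, which uses the standard definitions of RMSE and MAPE. The function names and the sample values are illustrative assumptions, and MAPE is undefined when an observed value is zero.

```python
import math


def rmse(observed, forecast):
    """Root mean squared error between observed and forecasted values."""
    n = len(observed)
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, forecast)) / n)


def mape(observed, forecast):
    """Mean absolute percentage error, in percent; observed values must be nonzero."""
    n = len(observed)
    return 100.0 * sum(abs((o - f) / o) for o, f in zip(observed, forecast)) / n


# Hypothetical observed and forecasted daily energies.
observed = [100.0, 200.0]
forecast = [110.0, 180.0]
error_rmse = rmse(observed, forecast)  # sqrt((10^2 + 20^2) / 2) = sqrt(250)
error_mape = mape(observed, forecast)  # (10% + 10%) / 2 = 10%
```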

Study Areas and Data.
Yunnan province has extremely rich hydropower resources, and its potential capacity ranks third in China. The hydropower resources are very unevenly distributed across regions: they are concentrated mainly in the west and north, followed by the east and south. By the end of October 2012, there were 1587 SHP plants in Yunnan, with 3417 units and 8453.05 MW of installed capacity, which accounts for more than 27% of the hydropower capacity in Yunnan province and more than 12% of the SHP capacity in China [41]. Two typical counties, Yunlong County in the Dali region and Maguan County in the Wenshan region of Yunnan province, are selected as the study areas in this paper. Their locations are shown in Figure 2.
Yunlong County is located in the west of Yunnan province and has a total area of 4400.95 km². Its annual average temperature and annual average rainfall are 15.9 °C and 729.5 mm, respectively. By the end of 2013, it had 10 small hydropower plants with an installed capacity of 111.5 MW. Maguan County is located in the southeast of Yunnan province and has a total area of 2676 km². Its annual average temperature and annual average rainfall are 16.9 °C and 1345 mm, respectively. Table 1 reports the mean, standard deviation, skewness coefficient, minimum, and maximum of the data set for each county. The table indicates that the range of the training data fully covers the validation data. In addition, the power generation energy of both counties varies over a wide range and is concentrated in the flood season, when it is much larger than in other seasons. The data from September to October in the flood season are therefore selected for model testing, and the remaining data are used for model training. Moreover, the dispatching personnel of the power grid are most concerned about the power generation energy of SHP in the flood season.

Results and Discussion
In this study, the GA is employed as the parameter search scheme. To obtain better SVM parameters, the maximum number of GA iterations is set to 50, and population sizes of 30, 50, 80, 100, 120, and 150 are tried in turn. The search ranges of the three parameters ($C$, $\varepsilon$, and $\sigma$) of the SVM model are $[2^{-5}, 2^{5}]$, $[0, 2]$, and $[2^{-13}, 2^{-1}]$, respectively. The performance statistics of the SVM models are given in Tables 2 and 3 for the two counties. The results in Table 2 clearly indicate that the SVM model obtained with population size (ii), with optimal parameters $(C, \varepsilon, \sigma) = (5.5762, 0.2275, 0.0073)$, can be selected as the forecast model for Yunlong County.
For Maguan County, Table 3 shows that the two statistical measures for population size (i) are clearly better than the others in the calibration stage, while in the validation stage they are only slightly better or worse. The optimal parameters $(C, \varepsilon, \sigma) = (2.3792, 0.6749, 0.0058)$ were therefore selected through comprehensive comparison.
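One way a GA chromosome could be mapped onto the stated search ranges is sketched below. The paper does not describe its encoding, so the unit-interval genes, the log2 scaling for $C$ and $\sigma$, and all names here are assumptions; only the ranges, the population sizes, and the iteration limit come from the text.

```python
# Search settings taken from the text.
C_EXPONENTS = (-5.0, 5.0)        # C in [2^-5, 2^5]
EPSILON_RANGE = (0.0, 2.0)       # epsilon in [0, 2]
SIGMA_EXPONENTS = (-13.0, -1.0)  # sigma in [2^-13, 2^-1]
POPULATION_SIZES = (30, 50, 80, 100, 120, 150)
MAX_GENERATIONS = 50


def decode(genes):
    """Map three genes in [0, 1] to (C, epsilon, sigma).

    Assumption: C and sigma are searched on a log2 scale, epsilon linearly.
    """
    g_c, g_eps, g_sigma = genes
    c = 2.0 ** (C_EXPONENTS[0] + g_c * (C_EXPONENTS[1] - C_EXPONENTS[0]))
    eps = EPSILON_RANGE[0] + g_eps * (EPSILON_RANGE[1] - EPSILON_RANGE[0])
    sigma = 2.0 ** (SIGMA_EXPONENTS[0]
                    + g_sigma * (SIGMA_EXPONENTS[1] - SIGMA_EXPONENTS[0]))
    return c, eps, sigma
```

A log2 scale spreads candidates evenly across orders of magnitude, which is a common choice when a range such as $[2^{-13}, 2^{-1}]$ spans several decades.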
To better assess the performance of the GA-SVM model, the ARMA model was employed for comparison. The basic components of an ARMA model are autoregression (AR) and moving average (MA). To obtain a suitable ARMA($p$, $q$) model, two integers have to be determined: $p$, the number of autoregressive orders, and $q$, the number of moving-average orders. In this paper, the AIC (Akaike information criterion) value of the ARMA models is calculated for $p$ and $q$ ranging from 1 to 13.
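Order selection by AIC can be sketched as follows, using the standard definition AIC = 2k - 2 ln L. The model fitting itself is left out: the log-likelihoods are assumed to come from whatever ARMA fitting routine is available, the parameter count k = p + q + 1 is one common convention (an assumption here), and the sample values are made up for illustration.

```python
def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (smaller is better)."""
    return 2 * n_params - 2 * log_likelihood


def select_order(log_likelihoods):
    """Pick the (p, q) with the smallest AIC.

    log_likelihoods maps (p, q) to the maximized log-likelihood of the
    fitted ARMA(p, q) model. Assumption: k = p + q + 1 parameters.
    """
    scores = {order: aic(ll, sum(order) + 1)
              for order, ll in log_likelihoods.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Hypothetical log-likelihoods for three candidate orders.
candidates = {(1, 1): -120.0, (2, 1): -118.5, (2, 2): -118.4}
best_order, best_aic = select_order(candidates)
```

Here ARMA(2, 2) has the highest likelihood, but ARMA(2, 1) is selected because the extra parameter is not worth its AIC penalty.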
In this study, the same training and verification sets are used for the two models so that they are compared on the same basis. For Yunlong County, the RMSE and MAPE statistics of the models for the calibration and validation periods are summarized in Table 6, which allows a direct comparison. The results reveal that the GA-SVM model outperformed ARMA on both measures in the calibration period, improving on the ARMA model by about 0.24 in RMSE and 0.41 in MAPE. In the validation period, the GA-SVM obtains a better RMSE value than the ARMA, while the MAPE values of the two models are nearly equal. Figure 3 compares the forecasted and observed power generation energy using the GA-SVM and ARMA models for Yunlong County. The residuals show that the GA-SVM model performs better than ARMA. Furthermore, it can be concluded from Table 4 and Figure 3 that the GA-SVM model obtains slightly better forecast precision than ARMA.
For Maguan County, the RMSE and MAPE statistics of the models for the calibration and validation periods are summarized in Table 7. Table 7 demonstrates that the GA-SVM model is clearly superior to ARMA on both measures in the calibration and validation periods. In the validation period, the GA-SVM model improved on the ARMA model by about 7.89 and 0.41 in RMSE and MAPE values, respectively. For the comparison between the GA-SVM and SVM models in the validation period, the GA-SVM model obtains a slightly better MAPE value and a worse RMSE value than the ARMA. Figure 4 compares the forecasted and observed power generation energy using the GA-SVM and ARMA models for Maguan County. As the residuals show, the GA-SVM model performs better than ARMA except for a few peaks. Furthermore, it can be concluded from Table 5 and Figure 4 that the GA-SVM model performs better overall than the ARMA model.

Conclusion
In the present study, a GA-SVM prediction model combining the support vector machine with the genetic algorithm has been developed for forecasting the short-term power generation energy of small hydropower in a region. To better assess the performance of the GA-SVM model, the ARMA model was employed for comparison. The two models were constructed and their performances were compared directly. The results indicate that the GA-SVM model gives slightly better prediction performance than the ARMA model. Given the limited data available for small hydropower in a region, the GA-SVM model proposed in this paper is an effective method for improving short-term forecasting accuracy. This is useful for making full use of small hydropower resources and for avoiding wasted water resources and curtailed electricity in the flood season.

Figure 1: The flow chart of optimizing SVM by GA.
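$$\mathrm{Hour}'_t = \frac{\mathrm{Hour}_t - \mathrm{Hour}_{\min}}{\mathrm{Hour}_{\max} - \mathrm{Hour}_{\min}}, \qquad (3)$$
where $\mathrm{Hour}'_t$ is the normalized value on day $t$; $\mathrm{Hour}_t$ is the original value on day $t$; and $\mathrm{Hour}_{\max}$ and $\mathrm{Hour}_{\min}$ are the maximum and minimum of the sample data sets, respectively. After training and testing the GA-SVM model, the forecast value of power generation energy is calculated by
$$\mathrm{Hour}_t = \mathrm{Hour}'_t \times \left(\mathrm{Hour}_{\max} - \mathrm{Hour}_{\min}\right) + \mathrm{Hour}_{\min}, \qquad \mathrm{Energy}_t = \mathrm{Hour}_t \times \mathrm{Capacity}_t. \qquad (4)$$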
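The normalization in (3) and the back-transformation in (4) can be checked with a small round-trip sketch. The function names and the sample values are illustrative assumptions.

```python
def normalize(hours, h_min, h_max):
    """Equation (3): scale utilization hours to the range [0, 1]."""
    return [(h - h_min) / (h_max - h_min) for h in hours]


def denormalize(scaled, h_min, h_max):
    """First part of equation (4): map scaled values back to hours."""
    return [s * (h_max - h_min) + h_min for s in scaled]


def to_energy(hours, capacity):
    """Second part of equation (4): energy = hours x installed capacity."""
    return [h * c for h, c in zip(hours, capacity)]


# Hypothetical sample: three days of utilization hours and capacities (MW).
hours = [2.0, 6.0, 10.0]
scaled = normalize(hours, min(hours), max(hours))       # values in [0, 1]
restored = denormalize(scaled, min(hours), max(hours))  # back to hours
energy = to_energy(restored, [100.0, 100.0, 125.0])     # MWh per day
```

The minimum and maximum should be taken from the training sample and reused for the forecasts, so that both are mapped with the same scale.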

Figure 2: Location of the study area.

Figure 3: Comparison of forecasted versus observed power generation energy using GA-SVM and ARMA model for Yunlong County.

Figure 4: Comparison of forecasted versus observed power generation energy using GA-SVM and ARMA model for Maguan County.

Table 1: The mean, standard deviation, skewness coefficient, minimum, and maximum of the data set of Yunlong County and Maguan County.

Table 2: The performance statistics of GA-SVM models for Yunlong County.

Table 3: The performance statistics of GA-SVM models for Maguan County.

Table 4: AIC value and performance indices of alternative ARMA models for Yunlong County.

Table 5: AIC value and performance indices of alternative ARMA models for Maguan County.

Table 6: Model statistics of the calibration and validation period for Yunlong County.

Table 7: Model statistics of the calibration and validation period for Maguan County.
The historical observed data from Yunlong County and Maguan County in Yunnan province, China, were employed to investigate the modeling potential of GA-SVM. Data from May 1, 2011, to August 31, 2013, and from September 1, 2013, to October 31, 2013, are used for training and validation, respectively, in short-term power generation energy prediction. Because small hydropower operation data are scarce, SVM is chosen as the forecasting model for its ability to handle small samples. The three parameters of the SVM model are not known a priori and are optimized by GA in order to obtain appropriate values that improve forecasting accuracy.
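The date-based split stated above can be sketched as follows. The record layout (a list of `(date, value)` pairs), the function name, and the placeholder values are assumptions; only the four boundary dates come from the text.

```python
from datetime import date, timedelta

# Period boundaries taken from the text (inclusive).
TRAIN_START, TRAIN_END = date(2011, 5, 1), date(2013, 8, 31)
VALID_START, VALID_END = date(2013, 9, 1), date(2013, 10, 31)


def split_by_date(records):
    """Split (date, value) pairs into training and validation sets."""
    train = [(d, v) for d, v in records if TRAIN_START <= d <= TRAIN_END]
    valid = [(d, v) for d, v in records if VALID_START <= d <= VALID_END]
    return train, valid


# Build one placeholder record per day over the whole study period.
n_days = (VALID_END - TRAIN_START).days + 1
records = [(TRAIN_START + timedelta(days=k), 0.0) for k in range(n_days)]
train, valid = split_by_date(records)
```

With daily records, this split leaves far fewer validation samples than training samples, all of them in the flood-season months of September and October.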