Research and Application of a New Hybrid Forecasting Model Based on Genetic Algorithm Optimization: A Case Study of Shandong Wind Farm in China

With the increasing depletion of fossil fuels and serious environmental damage, wind power, a clean and renewable resource, is increasingly connected to the power system and plays a crucial role in the power dispatch of hybrid systems. It is therefore necessary to forecast wind speed accurately for the operation of wind farms in hybrid systems. In this paper, we propose a hybrid model called EEMD-GA-FAC/SAC to forecast wind speed. First, ensemble empirical mode decomposition (EEMD) is applied to eliminate the noise of the original data. After data preprocessing, the first-order adaptive coefficient forecasting method (FAC) or the second-order adaptive coefficient forecasting method (SAC) is employed to produce the forecast. Because an effective model depends on well-chosen parameters, a genetic algorithm (GA) is used to determine the parameter of the hybrid model. To verify the validity of the proposed model, ten-minute wind speed data are collected from three observation sites in the Shandong Peninsula of China and evaluated with several error criteria. Compared with the traditional BP, ARIMA, FAC, and SAC models, the experimental results show that the proposed hybrid model EEMD-GA-FAC/SAC has the best forecasting performance.


Introduction
Wind, as an environmentally friendly, economically competitive, and socially beneficial energy source, has become the most widely used renewable energy resource all over the world. Particularly in China, the majority of energy sources are fossil fuels such as coal, oil, and natural gas, but rapid economic growth and the decrease of fossil fuel reserves compel China to find alternatives. Wind energy, with advantages including low cost of power generation, a high degree of industrial maturity, and good physical and social environmental impact, has become the first choice among renewable energy sources in China [1]. Owing to the volatility and chaotic characteristics of wind speed, grid interconnection of wind farms has been a difficult and challenging task. To ensure the safe operation of the grid, the accuracy of wind speed forecasts plays a vital role in calculating the spinning reserve capacity for grid security forewarning management in the wind grid. Therefore, it is necessary to forecast wind speed accurately. Generally speaking, there are two kinds of wind speed forecasts in terms of time span. One is the long-term wind speed forecast, which is crucial for the siting and sizing of wind power applications [2,3] and is helpful for wind risk evaluation. The other is the short-term wind speed forecast, which is significant for improving the efficiency of wind power generation systems [4,5]. The time scale of short-term forecasting ranges from seconds to minutes, hours, or several days. It supports the daily and intraday spot market, system management, and maintenance scheduling [6].
Many researchers have made efforts to develop good wind speed forecasting approaches, including statistical methods, physical methods, physical-statistical models, artificial intelligence methods, and some new hybrid methods.

Mathematical Problems in Engineering
Statistical methods include the autoregressive integrated moving average (ARIMA) model and the generalized autoregressive conditional heteroskedasticity (GARCH) model. Kavasseri and Seetharaman [7] used an f-ARIMA model to forecast wind speed at four sites in North Dakota on day-ahead and two-day-ahead horizons. Liu et al. [8] found that ARMA-GARCH(-M) models could improve the modelling sufficiency of mean wind speed as the height increases. Physical methods, like the weather research and forecasting (WRF) model and mesoscale models, combine multiple physical considerations and provide good forecasting accuracy. The WRF model was evaluated by González-Mingueza and Muñoz-Gutiérrez [9] with different parameterization options in Peru to minimize the uncertainty of the wind speed forecast. Janjai et al. [10] evaluated the wind energy potential of Thailand using an atmospheric mesoscale model and a geographic information system (GIS) approach; the results showed that areas in the south had high wind energy potential. For physical-statistical methods, WRF results are usually taken as input variables and combined with observed historical data to train a system based on statistical theories [11]. Recently, some hybrid approaches based on artificial intelligence techniques have been proposed to forecast wind speed and have achieved good forecasting results. Guo et al. [12] developed a hybrid model for wind speed prediction based on empirical mode decomposition (EMD) and a feedforward neural network (FNN). Wang et al. [13] applied a combined epsilon-SVR forecast model, based on historical data series with the seasonal variation eliminated, to make short-term predictions. Pourmousavi Kani and Ardehali [14] combined an artificial neural network (ANN) and a Markov chain (MC) into a new hybrid ANN-MC model for short-term wind speed forecasting. Guo et al. [15] proposed a hybrid seasonal autoregressive integrated moving average and least squares support vector machine (SARIMA-LSSVM) model to predict the mean monthly wind speed in the Hexi Corridor. Wang et al. [16] used another combined forecasting method, comprising the seasonal ARIMA forecasting model, the seasonal exponential smoothing model, and weighted support vector machines, to make short-term predictions.
Empirical mode decomposition (EMD), widely adopted in many different fields [17][18][19], is a data-adaptive method for analyzing nonlinear and nonstationary data. However, EMD may not function properly if the data does not meet certain conditions. Therefore, noise is introduced during the decomposition process to prevent mode mixing from contaminating the information embedded in the IMFs [20]. This noise-assisted EMD method is named ensemble empirical mode decomposition (EEMD) [18]. Wind is an intermittent energy source, which means that it varies greatly with temperature, humidity, pressure, and weather conditions [21]. Based on the above features, this work first employs the EEMD method to eliminate the high frequency fluctuating parts of ten-minute wind speed. To improve forecasting precision, optimized hybrid models need to be developed. A hybrid model including the first-order adaptive coefficient forecasting method (FAC) and the second-order adaptive coefficient forecasting method (SAC) is proposed in this paper. In addition, the genetic algorithm (GA) is a highly parallel, stochastic, and adaptive optimization technique based on biological genetic evolutionary mechanisms. It applies operations analogous to natural selection, crossover, and mutation and obtains the final optimization result after repeated iterations [22].
The structure of this paper is as follows. Section 2 states the contributions of the paper. Section 3 introduces the related methods, including ensemble empirical mode decomposition (EEMD), the genetic algorithm (GA), the first-order adaptive coefficient forecasting method (FAC), and the second-order adaptive coefficient forecasting method (SAC). In Section 4, the experimental simulation and the evaluation of forecasting performance are described in detail. Finally, Section 5 concludes this work.

Our Contributions
We propose an intelligent optimized hybrid model, EEMD-GA-FAC/SAC, based on the EEMD, GA, and FAC/SAC models, which has several advantages. First, as an intermittent energy source, wind is vulnerable to the impact of temperature, humidity, pressure, and weather conditions, which makes wind speed nonstationary and rich in high frequency components. It is therefore necessary to develop methods for eliminating this interference information that are practical in application. Second, ten-minute wind speed data is nonlinear and nonstationary; the EEMD-GA-FAC/SAC model can effectively eliminate high frequency interference signals and determine the optimal weight parameters of the FAC and SAC models. Third, in order to select the most suitable weight coefficients for the hybrid model, GA is applied to determine the weight parameters of FAC or SAC. Fourth, the experimental simulation results illustrate that the proposed hybrid model EEMD-GA-FAC/SAC is suitable for the current research. Finally, from the case study it can be concluded that the MAPE, MSE, and MAE of the proposed EEMD-GA-FAC/SAC model are smaller than those of the EEMD-FAC/SAC, FAC/SAC, BP, and ARIMA models. To sum up, the hybrid model EEMD-GA-FAC/SAC has good forecasting quality and high forecasting accuracy, and it is more effective and adaptive in improving forecasting accuracy than the traditional BP, ARIMA, FAC, and SAC models.

Ensemble Empirical Mode Decomposition (EEMD).
Empirical mode decomposition (EMD) is an adaptive and efficient approach for decomposing nonlinear and nonstationary signals into a series of meaningful intrinsic mode functions (IMFs) and one residual trend, ordered from high frequency to low frequency [23,24]. However, the most important shortcoming of EMD is the mode mixing problem, in which either a single IMF consists of signals of dramatically disparate scales or a signal of the same scale appears in different IMF components; it is usually associated with intermittency of the analyzed signal [25]. In order to eliminate the mode mixing phenomenon and obtain the actual time-frequency distribution of the signal, a new approach called ensemble empirical mode decomposition (EEMD) was proposed [26,27]. The aim of EEMD is to add white noise, which is distributed uniformly over the whole time-frequency space, to the data, so that the parts of the signal at different scales are automatically projected onto the proper scales of reference established by the white noise [25]. The detailed description of the EEMD algorithm is as follows [24].
Step 1. Initialize the number of ensemble trials $M$ and the amplitude of the added white noise, with $m = 1$.
Step 2. Perform the $m$th trial on the signal with added white noise: (a) add a white noise series with the given amplitude to the investigated signal,
$$x_m(t) = x(t) + n_m(t), \tag{1}$$
where $n_m(t)$ represents the $m$th added white noise series and $x_m(t)$ indicates the noise-added signal of the $m$th trial; (b) decompose the noise-added signal $x_m(t)$ into $I$ IMFs $a_{i,m}$ ($i = 1, 2, \ldots, I$) by the EMD method, where $a_{i,m}$ denotes the $i$th IMF of the $m$th trial and $I$ is the number of IMFs; (c) if $m < M$, then go to step (a) with $m = m + 1$. Steps (a) and (b) are repeated with a different white noise series each time.
Step 3. Calculate the ensemble mean $a_i$ of the $M$ trials for each IMF:
$$a_i = \frac{1}{M} \sum_{m=1}^{M} a_{i,m}, \quad i = 1, 2, \ldots, I. \tag{2}$$
Step 4. Report the mean $a_i$ ($i = 1, 2, \ldots, I$) of each of the IMFs as the final IMFs.
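The ensemble loop in Steps 1-4 can be illustrated with a minimal Python sketch. It is not the authors' implementation: the EMD step is passed in as a callable, and `toy_decompose` (a moving-average split into a fast and a slow component) is a hypothetical stand-in used only so that the example runs without an EMD library. The function names, the absolute noise amplitude, and the fixed number of components per trial are all assumptions of this sketch.

```python
import random

def toy_decompose(signal, window=5):
    """Stand-in for EMD (NOT real EMD): split a series into a fast component
    and a slow component with a centered moving average."""
    n, half = len(signal), window // 2
    slow = []
    for t in range(n):
        lo, hi = max(0, t - half), min(n, t + half + 1)
        slow.append(sum(signal[lo:hi]) / (hi - lo))
    fast = [x - s for x, s in zip(signal, slow)]
    return [fast, slow]  # ordered from high frequency to low frequency

def ensemble_decompose(signal, decompose, n_trials=50, noise_amp=0.2, seed=0):
    """Average the components returned by `decompose` over noisy copies.

    Assumes `decompose` returns the same number of components in every trial.
    """
    rng = random.Random(seed)
    sums = None
    for _ in range(n_trials):  # Step 2(c): repeat for every trial
        # Step 2(a): add a fresh white noise series to the signal
        noisy = [x + noise_amp * rng.gauss(0.0, 1.0) for x in signal]
        comps = decompose(noisy)  # Step 2(b): decompose the noisy signal
        if sums is None:
            sums = [[0.0] * len(signal) for _ in comps]
        for i, comp in enumerate(comps):
            for t, value in enumerate(comp):
                sums[i][t] += value
    # Step 3: ensemble mean of each component over all trials
    return [[v / n_trials for v in row] for row in sums]

speeds = [5.0 + (t % 6) * 0.4 for t in range(48)]  # made-up series
components = ensemble_decompose(speeds, toy_decompose, n_trials=20)
```

With a real EMD routine supplied as `decompose`, the returned rows would play the role of the final IMFs reported in Step 4.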

Genetic Algorithm (GA).
The genetic algorithm (GA), a well-known metaheuristic algorithm, imitates natural evolution processes. The GA starts by defining the optimization variables, objective functions, and control parameters [28]. Generally speaking, the GA begins by creating a random population composed of a certain number of individual solutions represented by "chromosomes"; each chromosome contains all the genes (i.e., variables) of one possible solution. The chromosomes are evaluated with the "objective function," which expresses the objective of the problem [29]. The procedure is briefly as follows [30,31].
Step 1. Randomly generating the initial population.
Step 2. Computing and saving the fitness function for each individual in the population. The individual fitness function can be defined as $F(x) = \max f - f(x)$, where $f$ is the objective function.
Step 3. Selection operation: individuals in the population take part in this operation with probabilities based on their fitness values. Define the selection probability of each individual in proportion to its fitness. In the selection operation, the members of the population with better fitness values can participate several times, while the members with worse values may be removed, so that the average fitness increases. Next, we can generate offspring.
Step 4. Crossover operation: it allows an exchange of the design characteristics between two mating parents.This operation is done by selecting two mating parents in which two random places are selected on each chromosome string and the strings between these two places among the mates are exchanged.
Step 5. Mutation operation: the aim is to search the minimum solution and keep population diversity and avoid the premature convergence phenomenon.It is invoked with a low probability at a randomly selected site on the chromosomal string of the randomly chosen design.The operation consists of a switching of a 0 to 1 or vice versa.
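The five steps above can be sketched as a small binary-coded GA in Python. This is an illustrative sketch under stated assumptions, not the configuration used in the paper: the chromosome length, population size, probabilities, and the helper names `decode` and `genetic_minimize` are all chosen here, and the elitism step (copying the best chromosome into the next generation) is an addition that the text does not mention.

```python
import random

def decode(bits, lo, hi):
    """Map a list of 0/1 genes to a real value in [lo, hi]."""
    value = int("".join(str(b) for b in bits), 2)
    return lo + (hi - lo) * value / (2 ** len(bits) - 1)

def genetic_minimize(objective, lo=0.0, hi=1.0, n_bits=16, pop_size=30,
                     generations=60, p_cross=0.8, p_mut=0.02, seed=0):
    rng = random.Random(seed)
    cost = lambda chrom: objective(decode(chrom, lo, hi))
    # Step 1: random initial population of binary chromosomes
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = min(pop, key=cost)[:]
    for _ in range(generations):
        # Step 2: fitness = (largest objective in the population) - objective,
        # so a lower objective value means a fitter individual
        costs = [cost(c) for c in pop]
        worst = max(costs)
        fitness = [worst - c + 1e-12 for c in costs]  # epsilon avoids all-zero weights
        # Step 3: fitness-proportional (roulette-wheel) selection
        parents = rng.choices(pop, weights=fitness, k=pop_size)
        children = []
        for i in range(0, pop_size, 2):
            a, b = parents[i][:], parents[(i + 1) % pop_size][:]
            # Step 4: two-point crossover swaps the segment between two cuts
            if rng.random() < p_cross:
                c1, c2 = sorted(rng.sample(range(n_bits + 1), 2))
                a[c1:c2], b[c1:c2] = b[c1:c2], a[c1:c2]
            children += [a, b]
        children = children[:pop_size]
        # Step 5: bit-flip mutation with a low per-gene probability
        for chrom in children:
            for j in range(n_bits):
                if rng.random() < p_mut:
                    chrom[j] = 1 - chrom[j]
        children[0] = best[:]  # elitism: an assumption of this sketch
        pop = children
        best = min(pop, key=cost)[:]
    return decode(best, lo, hi)

# Example: search [0, 1] for the minimizer of a simple quadratic.
x_best = genetic_minimize(lambda x: (x - 0.3) ** 2)
```

In the hybrid model, the objective would be a forecasting error computed for a candidate weight parameter rather than the toy quadratic used here.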

First-Order Adaptive Coefficient Forecasting Method (FAC).
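The aim of the first-order adaptive coefficient forecasting method (FAC) [32,33] is to correct the coefficient value constantly on the basis of changes in the data, thus making the forecasting results as good as possible. The forecasting equation is as follows:
$$\hat{x}_{t+1} = \alpha_t x_t + (1 - \alpha_t)\hat{x}_t, \tag{3}$$
where $x_t$ is the actual value at time $t$, $\hat{x}_t$ is the forecasting value at time $t$, and $\alpha_t$ is the adaptive coefficient.
The solution of $\alpha_t$ is given below. If there is a system error during a particular forecasting period (that is, all the $e_t$ values are positive or all are negative), $\alpha_t$ should be larger. When the forecasting value $\hat{x}_t$ is too small, that is, $e_t = x_t - \hat{x}_t > 0$, we can make $\alpha_t$ larger so as to increase $\hat{x}_{t+1}$ according to (3). When the forecasting value $\hat{x}_t$ is too large, that is, $e_t = x_t - \hat{x}_t < 0$, we can also make $\alpha_t$ larger in order to decrease $\hat{x}_{t+1}$. In other words, the larger the system error is, the larger the $\alpha_t$ value is. If there is no system error, that is, the values of $e_t$ are alternately positive and negative and their absolute values are relatively small, $\alpha_t$ can remain unchanged.
The system error can be measured in the following way. Supposing that $\beta$ is a constant ($0 < \beta < 1$), we form the exponentially weighted average of the forecasting errors $e_i$ ($i = 1, 2, \ldots, t$) up to time $t$:
$$E_t = \beta e_t + \beta(1-\beta)e_{t-1} + \beta(1-\beta)^2 e_{t-2} + \cdots + \beta(1-\beta)^{t-1} e_1. \tag{4}$$
$|E_t|$ reflects the forecasting system error. Replacing $t$ with $t - 1$ gives
$$E_{t-1} = \beta e_{t-1} + \beta(1-\beta)e_{t-2} + \cdots + \beta(1-\beta)^{t-2} e_1. \tag{5}$$
Thus, based on (4) and (5), the recursive formula of $E_t$ is obtained as follows:
$$E_t = \beta e_t + (1-\beta)E_{t-1}. \tag{6}$$
In order to make $0 < \alpha_t < 1$, let
$$M_t = \beta|e_t| + \beta(1-\beta)|e_{t-1}| + \cdots + \beta(1-\beta)^{t-1}|e_1|; \tag{7}$$
replacing $t$ with $t - 1$ gives
$$M_{t-1} = \beta|e_{t-1}| + \beta(1-\beta)|e_{t-2}| + \cdots + \beta(1-\beta)^{t-2}|e_1|. \tag{8}$$
In a similar way, the recursive formula of $M_t$ can be obtained:
$$M_t = \beta|e_t| + (1-\beta)M_{t-1}. \tag{9}$$
In order to satisfy the above requirements, let $\alpha_t = |E_t|/M_t$. Figure 1 shows the calculation procedure of the first-order adaptive coefficient forecasting method.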
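The recursive calculation described above can be sketched in Python as follows. This is a minimal illustration rather than the authors' code: `beta`, `E`, and `M` are names chosen here for the smoothing constant, the smoothed signed error, and the smoothed absolute error, and the initial values (first forecast equal to the first observation, both averages starting at zero, and a fallback coefficient `alpha0` while no error has been observed) are assumptions that the text does not specify.

```python
def fac_forecast(series, beta=0.2, alpha0=0.2):
    """One-step-ahead forecasts with an adaptive smoothing coefficient.

    Returns (forecasts, alphas). forecasts[t] is the forecast of series[t];
    the final entry is the forecast of the next, not yet observed, value.
    """
    forecasts = [series[0]]  # assumption: first forecast = first observation
    alphas = []
    E = M = 0.0              # assumption: both error averages start at zero
    alpha = alpha0           # used only while M is still zero
    for t, x in enumerate(series):
        e = x - forecasts[t]                  # forecasting error at time t
        E = beta * e + (1.0 - beta) * E       # smoothed signed error
        M = beta * abs(e) + (1.0 - beta) * M  # smoothed absolute error
        if M > 0.0:
            alpha = abs(E) / M                # coefficient stays within [0, 1]
        alphas.append(alpha)
        # forecast for t + 1: weighted mix of the observation and old forecast
        forecasts.append(alpha * x + (1.0 - alpha) * forecasts[t])
    return forecasts, alphas

forecasts, alphas = fac_forecast([1.0, 2.0, 3.0, 4.0, 5.0])
```

For a steadily rising series such as the one above, every error has the same sign, so the coefficient moves to 1 and each forecast equals the latest observation, which matches the argument that a persistent system error should enlarge the coefficient.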

Second-Order Adaptive Coefficient Forecasting Method (SAC).
The principle and calculation formulas of the second-order adaptive coefficient forecasting method (SAC) are the same as those of the first-order adaptive coefficient forecasting method (FAC), so the calculation process is given directly in Figure 1 [32,34].

Hybrid EEMD-GA-FAC/SAC Model.
As exponential smoothing methods, the first-order and second-order adaptive coefficient forecasting methods have been commonly used for short-term and medium-term time-series trend forecasting. Exponential smoothing combines the advantages of the entire-period average and the moving average: it uses all of the historical data but gradually reduces the weight given to older observations, and the weight converges to zero for data far in the past. Under normal circumstances, the smoothing parameter of the first-order and second-order adaptive coefficient forecasting methods is set to 0.2 or 0.1 on the basis of experience. However, neither of them is a universal model that is appropriate for all circumstances; therefore, the parameter should be adjusted to the situation. Moreover, if the high frequency interference information is eliminated from the original data before exponential smoothing is applied, the data become better suited to prediction. Wind speed, the topic of this paper, is nonstationary and contains high frequency interference information, and it can be forecast by an exponential smoothing method whose smoothing parameter is optimized. In view of the above two points, it is necessary to develop a hybrid methodology that makes full use of the advantages of the respective methods. The combined methodology consists of four steps.
Step 1. The EEMD method is used to preprocess the original data before the forecasting models are applied.
Step 2. FAC and SAC are employed to forecast from the data preprocessed by the EEMD method.
Step 3. The genetic algorithm is introduced to determine the optimal weight parameter in place of an empirical value.
Step 4. The forecasting performance and effects of the models are evaluated.
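The idea behind Step 3, choosing the smoothing parameter by its forecasting error instead of fixing it at 0.1 or 0.2, can be illustrated with the short Python sketch below. Two simplifications are assumed and should not be read as the paper's method: a plain grid search stands in for the genetic algorithm, and fixed-coefficient first-order exponential smoothing stands in for FAC/SAC. The function names and the sample series are hypothetical.

```python
def smooth_forecasts(series, alpha):
    """One-step-ahead forecasts of fixed-coefficient exponential smoothing.

    forecasts[t] is the forecast of series[t]; the first forecast is taken
    to be the first observation (an assumption of this sketch).
    """
    forecasts = [series[0]]
    for x in series[:-1]:
        forecasts.append(alpha * x + (1.0 - alpha) * forecasts[-1])
    return forecasts

def mape(actual, predicted):
    """Mean absolute percentage error in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)

def pick_parameter(series, candidates):
    """Return the candidate with the lowest in-sample MAPE (grid search, NOT a GA)."""
    return min(candidates, key=lambda a: mape(series, smooth_forecasts(series, a)))

speeds = [5.0, 5.5, 6.1, 6.0, 6.8, 7.2, 7.1, 7.9]  # made-up wind speeds (m/s)
grid = [k / 10 for k in range(1, 10)]               # 0.1, 0.2, ..., 0.9
best_alpha = pick_parameter(speeds, grid)
```

A genetic algorithm would explore the same objective, the forecasting error as a function of the parameter, but without being restricted to a fixed grid of candidate values.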

Experimental Simulation and Results Analysis
4.1. Data Description.
Shandong Peninsula (shown in Figure 2) is located in the northeast of Shandong Province in China. As Shandong is an economically developed province with large energy consumption, its wind power industry can greatly mitigate the high pressure of energy conservation and emissions reduction. Therefore, wind energy evaluation and estimation in Shandong Province is a difficult but quite important task, which contributes to the grid interconnection of wind farms, saving energy, and reducing pollution. The forecasting method, the forecasting horizon, and the location-specific properties of wind speed all have an impact on wind speed forecasts. On the whole, a shorter forecasting horizon generally leaves less room for wind speed to change, thus yielding smaller forecasting errors than medium- or long-term forecasts [33]. In this paper, ten-minute wind speed data are selected from Yantai, Weihai, and Qingdao in Shandong Peninsula. The locations of the three selected observation stations are shown in Figure 2. The collected data cover June 1, 2011, to June 6, 2011, from 00:00 to 23:50 each day at ten-minute intervals, and are used for model construction and model testing. Figure 3 shows the variation trend of wind speed. It can be seen that wind speed has large random fluctuations and is nonlinear and nonstationary. Therefore, it is essential to employ the EEMD method to eliminate the high frequency information of ten-minute wind speed before constructing the model, which not only reveals the inner nature of the data but also improves forecasting accuracy.

Data Processing.
In financial econometrics, the noise signal is dominant in high frequency data, so low frequency data are preferred over finely sampled data to obtain more stable estimates [35]. If data with nonstationary signals and high frequency interference information are used to establish the forecasting model, the forecasting results will be worse. Thus, ensemble empirical mode decomposition (EEMD) is employed to eliminate the high frequency interference information in the original ten-minute wind speed series. Following the EEMD algorithm, the collected wind speed series with added noise is decomposed into nine IMFs, the first IMF is removed, and the remaining IMFs are summed. After this processing, the curve of the denoised data is smoother than the original wind speed time series, as shown in Figure 4 (taking the first training set from 1 June 2011 at 0:00 to 6 June 2011 at 23:50 as an example).
Furthermore, the processed data are used to establish the models and to assess their forecasting quality and effects.
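The denoising rule described above (discard the first IMF and sum the rest) amounts to a few lines of Python. The sketch assumes that the IMFs are already available as equal-length lists ordered from high to low frequency; the function name and the three toy components are hypothetical and only illustrate the arithmetic.

```python
def denoise_by_dropping_first(components):
    """Sum all components except the first (highest-frequency) one."""
    if len(components) < 2:
        raise ValueError("need at least two components")
    # zip(*...) walks through the remaining components time step by time step
    return [sum(values) for values in zip(*components[1:])]

# Toy example with three components instead of nine IMFs.
toy_components = [
    [1.0, -1.0, 1.0],  # fastest oscillation, treated as noise and dropped
    [2.0, 2.0, 2.0],
    [0.0, 1.0, 2.0],   # slowest component (trend)
]
denoised = denoise_by_dropping_first(toy_components)
```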

Statistical Measurements of Forecasting Quality.
Yokum and Armstrong conducted an expert opinion survey about the evaluation measurements used to select forecasting techniques and concluded that the accuracy criterion was more important than cost savings generated from improved methods and execution issues [36].
To evaluate the forecasting quality and effects of the models quantitatively, we utilize multiple statistical measurements, including the mean absolute percentage error (MAPE), mean square error (MSE), and mean absolute error (MAE). When these three forecasting errors decrease, the accuracy of the forecasting results increases [37,38].
In this section, the traditional forecasting models, namely the FAC, SAC, BP, and ARIMA models, are compared. Tables 1 and 2 show the forecasting results and the forecasting accuracies of the four traditional models. As displayed in Table 2, at observation site 1, the MAPEs of BP and ARIMA are above 6% and the MAPEs of FAC and SAC are below 6%. The MAEs of FAC and SAC are considerably lower than those of the BP and ARIMA models. Therefore, the FAC and SAC models perform better than the BP and ARIMA models in terms of MAPE and MAE. For observation site 3, judged by MAPE, MSE, and MAE, FAC and SAC again have higher forecasting accuracy than BP and ARIMA. However, for observation site 2, BP and ARIMA show better forecasting performance. Consequently, the forecasting abilities of the four traditional models differ from site to site. In addition, the forecasting qualities of the FAC and SAC models are similar.
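The three error measures named in this subsection can be computed as in the following minimal Python sketch. It assumes the usual textbook definitions (MAPE as the mean of absolute errors relative to the actual values, in percent; MSE as the mean squared error; MAE as the mean absolute error), which may differ in detail from the formulas used to produce the tables. The function name and the sample numbers are hypothetical.

```python
def forecast_errors(actual, predicted):
    """Return (MAPE in percent, MSE, MAE) for two equal-length sequences.

    MAPE divides by the actual values, so it is undefined when an actual
    value is zero (for example, a calm period with zero wind speed).
    """
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(actual)
    errors = [x - x_hat for x, x_hat in zip(actual, predicted)]
    mape = 100.0 * sum(abs(e / x) for e, x in zip(errors, actual)) / n
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    return mape, mse, mae

mape_pct, mse, mae = forecast_errors([2.0, 4.0], [1.0, 5.0])
```

In this small example the errors are +1 and -1, so MSE and MAE are both 1, while MAPE weights the first error more heavily because the first actual value is smaller.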

Results and Discussion of Different Hybrid Models.
In this section, the hybrid models EEMD-FAC/SAC and EEMD-GA-FAC/SAC are compared. The original wind speed data are preprocessed by the EEMD method, which aims to eliminate the high frequency nonstationary information. After data preprocessing, twelve training sets are grouped in the same manner as presented in Section 4.4. The corresponding forecasting results of EEMD-FAC/SAC, which cover 00:00 to 22:00 on 6 June at two-hour intervals, are given in Table 3. The genetic algorithm is then utilized to determine the optimal weight parameter for prediction.
The forecasting results of the proposed hybrid EEMD-GA-FAC/SAC model based on the processed data, together with the values of the weight parameter, for the three observation sites are shown in Tables 4-6. The detailed procedures of the hybrid EEMD-FAC/SAC model and the proposed hybrid model EEMD-GA-FAC/SAC are as follows.
Step 1. Remove the nonstationary high frequency information from the original wind speed series.
Step 2. Produce the forecast results of the FAC and SAC models using the data processed by the EEMD method; this combination is denoted EEMD-FAC/SAC.
Step 3. Optimize the forecasting results of the hybrid EEMD-FAC/SAC model by selecting the best weight parameter with the genetic algorithm.
Step 4. Compare the forecasting performance of the hybrid EEMD-FAC/SAC model and the proposed hybrid EEMD-GA-FAC/SAC model.

Conclusions
Wind speed forecasting is becoming increasingly important for wind farm management and for the conversion of wind power in the power dispatch of hybrid systems. Therefore, this paper proposes an intelligent optimized hybrid model based on EEMD, FAC or SAC, and GA to forecast wind speed in the Shandong Peninsula of China. This hybrid model uses the EEMD method to decompose the original wind speed data and eliminate its noise and high frequency interference information, and applies the GA, an artificial intelligence optimization algorithm, to determine the optimal parameter of the FAC and SAC models. As a case study, ten-minute wind speed data from 1 June to 6 June 2011 at three observation sites in the Shandong Peninsula are collected, and multiple error evaluation criteria (MAPE, MSE, and MAE) are chosen to validate the forecasting performance of the hybrid model. The experimental results show that the hybrid model has the best forecasting performance in comparison with the BP, ARIMA, FAC, SAC, EEMD-FAC, and EEMD-SAC models, from which it can be concluded that the proposed hybrid model EEMD-GA-FAC/SAC can effectively, adaptively, and reliably improve the forecasting performance in large wind farms of China.

Figure 1: Flow charts of FAC and SAC.

Figure 2: The geographical location of the three observation sites in Shandong Peninsula.


Table 1: The forecasting results of four traditional models for original data.

Table 2: Error comparisons of four traditional models.

Table 3: The forecasting results of EEMD-FAC/SAC model in three sites.

Table 4: The forecasting results and optimized parameters of the proposed hybrid model EEMD-GA-FAC/SAC in site 1.

Table 5: The forecasting results and optimized parameters of the proposed hybrid model EEMD-GA-FAC/SAC in site 2.

Table 6: The forecasting results and optimized parameters of the proposed hybrid model EEMD-GA-FAC/SAC in site 3.
Here $e_t = x_t - \hat{x}_t$, $N$ corresponds to the sample size, $x_t$ represents the actual value at time $t$, and $\hat{x}_t$ represents the forecasting value at time $t$.

Table 7: Error comparisons of all forecasting models in site 1.

Table 8: Error comparisons of all forecasting models in site 2.

FAC model in observation sites 1 and 3. It is illustrated that, for different types of data, EEMD-GA-FAC model and EEMD-GA-SAC model present different forecasting quality. However, no matter which is the best forecasting model, it is concluded that our proposed hybrid model EEMD-GA-FAC/SAC outperforms other traditional models and hybrid

Table 9: Error comparisons of all forecasting models in site 3.

To sum up, the proposed hybrid model EEMD-GA-FAC/SAC is suitable to forecast wind speed with a certain time interval of 2 hours in Shandong Peninsula of China.