Short-Term Wind Power Interval Forecasting Based on an EEMD-RT-RVM Model

Accurate short-term wind power forecasting is important for improving the security and economic success of power grids. Existing wind power forecasting methods are mostly deterministic point forecasting methods, which are vulnerable to forecasting errors and cannot effectively deal with the random nature of wind power. To solve these problems, we propose a short-term wind power interval forecasting model based on ensemble empirical mode decomposition (EEMD), runs test (RT), and relevance vector machine (RVM). First, to reduce the complexity of the data, the original wind power sequence is decomposed into several intrinsic mode function (IMF) components and a residual (RES) component by using EEMD. Next, we use the RT method to reconstruct the components and obtain three new components ordered from fine to coarse. Finally, we obtain the overall forecasting results (with preestablished confidence levels) by superimposing the forecasting results of each new component. Our results show that, compared with existing methods, our proposed short-term interval forecasting method has smaller forecasting errors, narrower interval widths, and larger interval coverage percentages. Ultimately, our forecasting model is well suited to engineering applications and can serve as a reference for forecasting other new energy sources.


Introduction
Industrialization practices are rapidly depleting fossil fuel reserves. Moreover, widespread use of fossil fuels produces large amounts of greenhouse gases and dust particles, both of which have significant negative effects on human society and the environment [1][2][3]. In order to address the energy crisis and alleviate environmental pressures, many countries are researching and utilizing forms of renewable energy [4][5][6]. Wind power has become especially prominent in the field of renewable clean energy because it is pollution-free, reserve-rich, and readily renewable [5]. Continuous improvements in wind power technology have led to an increase in the number of wind-powered grids. However, wind power is also random and volatile, and any serious power disturbances can affect the safety and stability of wind-powered grids. As such, accurate wind power forecasting is necessary for creating reasonable generation plans and system backup arrangements [7][8][9]. Ultimately, the key to increasing the number of wind-powered grids is to improve the wind power penetration limit of power grids.
The stochastic volatility of natural wind and its effects on wind-powered grids cannot be ignored. Interval forecasting can effectively reflect the uncertainties in the forecasting results. Deterministic point forecasting methods have some deficiencies in characterizing the randomness of actual wind power [26]. Therefore, it is necessary to establish a forecasting method that is capable of efficiently providing accurate information. If we can establish a forecasting method capable of providing accurate interval forecasting, we will better

Empirical Mode Decomposition (EMD).
EMD is an efficient signal decomposition method that does not rely on any predefined basis function. The EMD reflects the dynamics of signals more accurately than other models. The modes extracted by the EMD, named the intrinsic mode functions (IMF), are defined by the following criteria: (1) the number of extrema and zero crossings must be equal or differ by no more than one and (2) the local mean of the envelope defined by the local maxima and local minima must be zero [40,41]. These two criteria ensure that each IMF has a physically meaningful phase definition; however, the time invariant frequency does not necessarily have a meaningful phase definition.
Given a signal $x(t)$, the EMD algorithm can be summarized as follows.
Step 3. Find all the local minima and maxima of the current signal $x_k(t)$ (where $k$ indexes the sifting iteration), and interpolate between the local maxima and between the local minima, respectively, in order to get an upper envelope $u_k(t)$ and a lower envelope $v_k(t)$. The mean value of these envelopes is described as
$$m_k(t) = \frac{u_k(t) + v_k(t)}{2}.$$
Next, compute the difference between the current data and the envelope mean value as
$$h_k(t) = x_k(t) - m_k(t).$$
Step 4. Check whether $h_k(t)$ satisfies the two criteria for an IMF (as defined above). If it is not satisfied, set $k = k + 1$ and $x_k(t) = h_{k-1}(t)$, and repeat Step 3. If it is satisfied, the first IMF can be given as
$$c_1(t) = h_k(t).$$
The residual can be computed by
$$r_1(t) = x(t) - c_1(t).$$
Step 5. Treat $r_1(t)$ as a new signal and repeat Steps 1-4 (in order to find more IMFs) until the residual is a constant or a monotonic function. Finally, the given $x(t)$ can be decomposed into $n$ IMFs and a final residual $r_n(t)$ as follows:
$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t).$$
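To make the sifting step concrete, the following is a minimal sketch of a single sifting pass and of IMF criterion (1). It is not the authors' implementation: the function names are illustrative, and piecewise-linear envelopes (held flat at both ends) are assumed here in place of the cubic-spline envelopes usually used in EMD, so that the example needs only the Python standard library.

```python
from bisect import bisect_right


def local_extrema(x):
    """Return (maxima, minima) index lists of strict interior local extrema."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i] > x[i - 1] and x[i] > x[i + 1]:
            maxima.append(i)
        elif x[i] < x[i - 1] and x[i] < x[i + 1]:
            minima.append(i)
    return maxima, minima


def linear_envelope(x, idx):
    """Piecewise-linear envelope through (i, x[i]) for i in idx, flat at both ends.

    Assumption: linear interpolation stands in for the usual cubic spline.
    """
    env = []
    for t in range(len(x)):
        if t <= idx[0]:
            env.append(x[idx[0]])
        elif t >= idx[-1]:
            env.append(x[idx[-1]])
        else:
            k = bisect_right(idx, t) - 1  # idx[k] <= t < idx[k + 1]
            left, right = idx[k], idx[k + 1]
            frac = (t - left) / (right - left)
            env.append(x[left] + frac * (x[right] - x[left]))
    return env


def sift_once(x):
    """One sifting pass: return (h, m), with m the envelope mean and h = x - m."""
    maxima, minima = local_extrema(x)
    if not maxima or not minima:
        raise ValueError("signal has no interior maxima or minima to sift")
    upper = linear_envelope(x, maxima)
    lower = linear_envelope(x, minima)
    m = [(u + v) / 2.0 for u, v in zip(upper, lower)]
    h = [xi - mi for xi, mi in zip(x, m)]
    return h, m


def satisfies_count_criterion(h):
    """IMF criterion (1): extrema and zero-crossing counts differ by at most one."""
    maxima, minima = local_extrema(h)
    crossings = sum(1 for a, b in zip(h, h[1:]) if a * b < 0)
    return abs(len(maxima) + len(minima) - crossings) <= 1
```

In a full EMD, `sift_once` would be applied repeatedly to its own output until both IMF criteria hold, and the extracted IMF would then be subtracted from the signal before the next round.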

Ensemble Empirical Mode Decomposition (EEMD).
Mode mixing is the most significant drawback of EMD. Mode mixing implies either a single IMF consisting of signals of dramatically disparate scales or a signal of the same scale appearing in different IMF components. This causes intermittency when analyzing signals.
In order to solve the problem of mode mixing in EMD, Wu and Huang proposed a new noise-assisted data analysis method called ensemble empirical mode decomposition (EEMD) [42]. The EEMD method utilizes recent studies on white noise which showed that the EMD method is an effective self-adaptive dyadic filter bank when applied to white noise. The results demonstrate that noise can help data analysis in the EMD method [22,43].
Two important parameters used in the EEMD method are (1) the amplitude $a$ of white noise and (2) the total repeat number $M$ of the EMD. At present, the determination of $a$ and $M$ is based on the structural characteristics of the data. Generally, $M$ is taken as 100, and $a$ is chosen from the range 0.05∼0.5. Based on previous tests, we set $M = 100$ and $a = 0.2$ in this paper.
The specific steps of the EEMD can be described as follows:
(1) Set the value of the amplitude $a$ and the total repeat number $M$.
(2) Add a white noise series to the signal.
(3) Decompose the signal with the added white noise into IMFs by using EMD.
(4) Repeat steps (2) and (3) $M$ times, using a different white noise series each time, and obtain the corresponding IMF components of each decomposition. Calculate the mean of all the corresponding IMF components and take the mean as the final result for each IMF. Calculate the mean of all the residual (RES) components and take the mean as the final result for the RES component:
$$c_j(t) = \frac{1}{M}\sum_{i=1}^{M} c_{ij}(t), \qquad r(t) = \frac{1}{M}\sum_{i=1}^{M} r_i(t),$$
where $c_{ij}(t)$ and $r_i(t)$ are the $j$th IMF and the residual obtained in the $i$th repetition.
(5) Take the $c_j(t)$ ($j = 1, \ldots, n$) and $r(t)$ as the IMF components and RES component, respectively.
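The ensemble-averaging logic of steps (1)-(5) can be sketched as below. This is only an illustration under stated assumptions: `ensemble_decompose` and `moving_average_split` are hypothetical names; the stand-in decomposer is a simple moving-average split and is not EMD; the noise amplitude is interpreted as a fraction of the signal's standard deviation; and every trial is assumed to return the same number of components, which a real EMD does not guarantee.

```python
import random


def moving_average_split(x, window=5):
    """Stand-in decomposer (NOT EMD): returns [detail, smooth], detail + smooth == x."""
    half = window // 2
    smooth = []
    for t in range(len(x)):
        seg = x[max(0, t - half): t + half + 1]
        smooth.append(sum(seg) / len(seg))
    detail = [xi - si for xi, si in zip(x, smooth)]
    return [detail, smooth]


def ensemble_decompose(signal, decompose, amplitude=0.2, trials=100, seed=0):
    """Average the components of `decompose` over `trials` noise-perturbed copies.

    amplitude: noise standard deviation as a fraction of the signal's standard
               deviation (an assumed convention).
    """
    rng = random.Random(seed)
    mean = sum(signal) / len(signal)
    std = (sum((v - mean) ** 2 for v in signal) / len(signal)) ** 0.5
    sums = None
    for _ in range(trials):
        # Step (2): add a fresh white noise series to the signal.
        noisy = [v + rng.gauss(0.0, amplitude * std) for v in signal]
        # Step (3): decompose the noisy copy.
        comps = decompose(noisy)
        if sums is None:
            sums = [[0.0] * len(signal) for _ in comps]
        for acc, comp in zip(sums, comps):
            for t, v in enumerate(comp):
                acc[t] += v
    # Steps (4)-(5): the ensemble mean of each component is the final result.
    return [[v / trials for v in acc] for acc in sums]
```

Because the added noise averages toward zero over the trials, the averaged components sum back to the original signal up to a small residual that shrinks as the number of trials grows.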

Runs Test (RT).
The runs test method [44] is defined as follows. Denote the time series corresponding to IMF$_j$ or RES as $\{x_j(t)\}_{t=1}^{N}$, where $j$ is the label of the IMF, $t$ is the label of the samples, and $N$ is the total number of samples. The mean value of the samples is defined as
$$\bar{x}_j = \frac{1}{N}\sum_{t=1}^{N} x_j(t).$$
Then, the timing symbol $s_j(t)$ can be defined as
$$s_j(t) = \begin{cases} 1, & x_j(t) \ge \bar{x}_j, \\ 0, & x_j(t) < \bar{x}_j, \end{cases}$$
where $s_j$ consists of a series of statistically independent, randomly arranged sequences of 0 and 1. Each maximal sequence of successive identical symbols (0 or 1) is defined as a run. The total number of runs of each $s_j$ can be used to detect the fluctuation of each component obtained by the EEMD.
Next, the high and low runs test thresholds can be set according to the runs numbers, and the components decomposed by the EEMD are reconstructed into three new components (with typical characteristics, ordered from fine to coarse) [44]. This preserves the decomposition effect and significantly reduces the run time of the model. Moreover, reconstructing similar components together strengthens the inherent regularities of the data and thereby improves the prediction accuracy.
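As a small illustration of the definition, the runs number of a series can be counted as follows. The function name is illustrative, and assigning samples equal to the mean to symbol 1 is an assumed tie-breaking convention.

```python
def count_runs(series):
    """Number of runs in the 0/1 sequence obtained by thresholding at the mean."""
    if not series:
        raise ValueError("series must not be empty")
    mean = sum(series) / len(series)
    # Timing symbols: 1 at or above the mean, 0 below (tie handling is assumed).
    symbols = [1 if v >= mean else 0 for v in series]
    runs = 1
    for prev, cur in zip(symbols, symbols[1:]):
        if cur != prev:  # a symbol change starts a new run
            runs += 1
    return runs
```

A slowly varying series such as `[1, 2, 3, 4, 5, 6]` crosses its mean once and has 2 runs, while the rapidly alternating `[1, 5, 1, 5, 1, 5]` has 6 runs, which is why a larger runs number indicates stronger fluctuation.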

Relevance Vector Machine (RVM).
Compared with other forecasting algorithms, the RVM not only has high sparsity, fewer parameters to optimize, flexible kernels, and strong generalization abilities, but also directly implements interval forecasting [45,46]. Therefore, in this study, the RVM is used to establish the interval forecasting model for the new components reconstructed by RT.
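For a given set of input training samples $\{x_i\}_{i=1}^{N}$ and the corresponding output set $\{t_i\}_{i=1}^{N}$, the relevance vector machine regression model can be defined as follows:
$$t_i = \sum_{j=1}^{N} w_j K(x_i, x_j) + w_0 + \varepsilon_i,$$
where $\varepsilon_i$ is the error of the independent sample (which follows the Gaussian distribution with the variance $\sigma^2$), $w_j$ are the model weights, $K(x, x_j)$ is a nonlinear kernel function, $x_j$ is a relevance vector, and $N$ is the length of the data.
In the RVM, a prior probability distribution for each model weight is given as
$$p(\mathbf{w} \mid \boldsymbol{\alpha}) = \prod_{j=0}^{N} \mathcal{N}\left(w_j \mid 0, \alpha_j^{-1}\right),$$
where $\alpha_j$ is the hyperparameter of the prior distribution of model weight $w_j$. Given a training sample set $\{x_i, t_i\}_{i=1}^{N}$, assume the target values $t_i$ are independent and the noise in the data follows the Gaussian distribution with the variance $\sigma^2$. Then, the likelihood function of the training sample set can be represented by
$$p(\mathbf{t} \mid \mathbf{w}, \sigma^2) = (2\pi\sigma^2)^{-N/2} \exp\left(-\frac{\lVert \mathbf{t} - \Phi\mathbf{w} \rVert^2}{2\sigma^2}\right),$$
where $\mathbf{t}$ is the target vector, $\mathbf{w}$ is the weight vector, and $\Phi$ is the design matrix. Based on the prior distribution and the likelihood, the posterior distribution over the weights follows from Bayes' rule and can be written as
$$p(\mathbf{w} \mid \mathbf{t}, \boldsymbol{\alpha}, \sigma^2) = \frac{p(\mathbf{t} \mid \mathbf{w}, \sigma^2)\, p(\mathbf{w} \mid \boldsymbol{\alpha})}{p(\mathbf{t} \mid \boldsymbol{\alpha}, \sigma^2)} = \mathcal{N}(\mathbf{w} \mid \boldsymbol{\mu}, \Sigma),$$
where $\Sigma = (\sigma^{-2}\Phi^{T}\Phi + A)^{-1}$; $\boldsymbol{\mu} = \sigma^{-2}\Sigma\Phi^{T}\mathbf{t}$; $A = \mathrm{diag}(\alpha_0, \alpha_1, \ldots, \alpha_N)$.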
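For fixed hyperparameters, the posterior mean and covariance above reduce to a few matrix operations. The sketch below evaluates them in plain Python for small problems; the helper names are illustrative, the Gauss-Jordan inverse is chosen only to keep the example dependency-free, and the hyperparameter re-estimation that a full RVM performs is not shown.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def inverse(m):
    """Gauss-Jordan inverse with partial pivoting (adequate for small matrices)."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def rvm_posterior(phi, t, alphas, sigma2):
    """Return (mu, cov) with cov = (Phi^T Phi / sigma2 + A)^-1 and
    mu = cov Phi^T t / sigma2, where A = diag(alphas)."""
    phi_t = transpose(phi)
    gram = matmul(phi_t, phi)
    n = len(alphas)
    precision = [[gram[i][j] / sigma2 + (alphas[i] if i == j else 0.0)
                  for j in range(n)] for i in range(n)]
    cov = inverse(precision)
    phi_t_t = [sum(p * ti for p, ti in zip(row, t)) for row in phi_t]
    mu = [sum(c * v for c, v in zip(row, phi_t_t)) / sigma2 for row in cov]
    return mu, cov
```

With nearly flat priors (very small `alphas`) the posterior mean approaches the ordinary least-squares solution; large values of an individual hyperparameter instead push the corresponding weight toward zero, which is the source of the sparsity of the RVM.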
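The marginal likelihood distribution of the hyperparameters can be obtained by
$$p(\mathbf{t} \mid \boldsymbol{\alpha}, \sigma^2) = (2\pi)^{-N/2} \lvert \Omega \rvert^{-1/2} \exp\left(-\frac{1}{2}\mathbf{t}^{T}\Omega^{-1}\mathbf{t}\right),$$
where $\Omega = \sigma^2 I + \Phi A^{-1} \Phi^{T}$. Finally, the hyperparameters $\boldsymbol{\alpha}$ and the variance $\sigma^2$ can be estimated by using the maximum likelihood algorithm.
If the input value is $x_*$, then the corresponding output probability distribution obeys the Gaussian distribution, and the corresponding forecasting value can be derived by
$$y_* = \boldsymbol{\mu}^{T}\boldsymbol{\phi}(x_*), \qquad \sigma_*^2 = \sigma^2 + \boldsymbol{\phi}(x_*)^{T}\,\Sigma\,\boldsymbol{\phi}(x_*),$$
where $\boldsymbol{\phi}(x_*) = (1, K(x_*, x_1), \ldots, K(x_*, x_N))^{T}$. The RVM model can give both the mean value and the variance. As such, this model reflects the uncertainty of forecasting results and provides interval forecasts (within the range of certain confidence levels).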
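Because the predictive distribution is Gaussian, a central forecasting interval at a chosen confidence level follows directly from the predictive mean and variance. A minimal sketch, with an illustrative function name and made-up input values:

```python
from statistics import NormalDist


def forecast_interval(mean, variance, confidence=0.90):
    """Central Gaussian interval (lower, upper) at the given confidence level."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    # Two-sided interval: put (1 - confidence) / 2 of the mass in each tail.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    half_width = z * variance ** 0.5
    return mean - half_width, mean + half_width


# Hypothetical predictive mean (MW) and variance (MW^2), for illustration only.
lb90, ub90 = forecast_interval(10.0, 4.0, confidence=0.90)
lb60, ub60 = forecast_interval(10.0, 4.0, confidence=0.60)
```

For the same mean and variance, the 60% interval is always narrower than the 90% interval, since the quantile multiplier shrinks as the confidence level decreases.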
Under the confidence level of $1 - \alpha$ (where $\alpha$ here denotes the significance level rather than a weight hyperparameter), the interval forecasting results are as follows:
$$[\mathrm{Lb}, \mathrm{Ub}] = \left[\, y_* - z_{\alpha/2}\,\sigma_*,\; y_* + z_{\alpha/2}\,\sigma_* \,\right],$$
where $y_*$ and $\sigma_*^2$ are the predictive mean and variance, $z_{\alpha/2}$ is the upper $\alpha/2$ quantile of the standard normal distribution, Lb denotes the lower bound of the forecasting value, and Ub denotes the upper bound of the forecasting value.
As is shown in Figure 1(a), the actual wind power is random and volatile. In order to improve the forecasting effect, it is necessary to reduce the complexity of the data. Compared with other decomposition algorithms, the EEMD exhibits better noise robustness and decomposing effects. In this study, we use the EEMD to decompose actual wind power into specific components, in which the periodicity, randomness, and trends of the actual wind power can be clearly seen. The decomposition results of EEMD are shown in Figure 1.

Using RT to Reconstruct the New Components.
By the definition of RT, the greater the RT value, the stronger the volatility of the time series; the closer the RT values of two series, the more similar their overall trends. The RT values of each component (in Figure 1) are calculated and shown in Table 1.
From Figure 1 and Table 1, the RT values of IMF1 and IMF2 are large and relatively close, while the RT values of IMF6-IMF8 and RES are very small and very close. The RT values of IMF3-IMF5 lie between the two groups. This shows that IMF1 and IMF2 fluctuate strongly, while IMF6-IMF8 and RES share a similar general trend, and the fluctuation of IMF3-IMF5 is intermediate. Based on this analysis, we set the high runs test threshold to 100 and the low runs test threshold to 10. The composition of the three new components is shown in Table 2.
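The thresholding described above can be sketched as follows. The function names are illustrative, the runs numbers in the example are invented for demonstration (they are not the values of Table 1), and the treatment of values exactly equal to a threshold is an assumption.

```python
def group_by_runs(run_counts, high=100, low=10):
    """Assign each component to 'random', 'detail', or 'trend' by its runs number."""
    groups = {"random": [], "detail": [], "trend": []}
    for name, runs in run_counts.items():
        if runs > high:          # strongly fluctuating components
            groups["random"].append(name)
        elif runs < low:         # slowly varying components
            groups["trend"].append(name)
        else:                    # intermediate components (boundaries included)
            groups["detail"].append(name)
    return groups


def superimpose(components, names):
    """Sum the named component series sample by sample into one new component."""
    length = len(components[names[0]])
    return [sum(components[n][t] for n in names) for t in range(length)]


# Hypothetical runs numbers, chosen only to mimic the grouping described in the text.
example_runs = {"IMF1": 320, "IMF2": 180, "IMF3": 75, "IMF4": 40, "IMF5": 18,
                "IMF6": 8, "IMF7": 4, "IMF8": 2, "RES": 1}
example_groups = group_by_runs(example_runs)
```

With these invented values, IMF1 and IMF2 form the random component, IMF3-IMF5 the detailed component, and IMF6-IMF8 together with RES the trend component.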
The trend graph of the new components after reconstruction is shown in Figure 2. It is evident from Figure 2  to be explicitly described.All three of the components meet the composition standard of actual wind power.
In order to further simplify the calculation and narrow the forecasting interval, we applied point forecasting to the trend component and interval forecasting to the detailed component and the random component. We obtained the overall optimal interval forecasting results (under a certain confidence level) by superimposing the forecasting results of each component.

Sample Data Normalization.
Since poor and missing data affect forecasting accuracy, it is necessary to pretreat the load data obtained from measurements. In this study, we primarily used transverse and longitudinal comparison methods for data pretreatment. In addition, we normalized the data in order to simplify the calculation and to standardize the loads, prices, and weather data (a necessary measure since the input variables have different units and value ranges). By doing so, the values of the data are limited to [0, 1]. The specific calculation formula is
$$\hat{x}(t) = \frac{x(t) - x_{\min}}{x_{\max} - x_{\min}},$$
where $\hat{x}(t)$ is the normalized value of the data and $x_{\max}$ and $x_{\min}$ represent the maximum value and the minimum value of the data, respectively.
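A minimal sketch of this min-max normalization and its inverse is given below; the function names are illustrative. The inverse is included because forecasts made on normalized data have to be mapped back to physical units before they are evaluated.

```python
def min_max_normalize(values):
    """Scale values linearly to [0, 1]; also return (min, max) for later inversion."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot normalize a constant series")
    return [(v - lo) / (hi - lo) for v in values], (lo, hi)


def denormalize(scaled, bounds):
    """Map normalized values back to the original scale."""
    lo, hi = bounds
    return [s * (hi - lo) + lo for s in scaled]
```

For example, `min_max_normalize([10.0, 20.0, 30.0])` yields the scaled values `[0.0, 0.5, 1.0]` together with the bounds `(10.0, 30.0)`.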

Kernel Function Determination of RVM.
RVM is a pattern recognition and regression forecasting method based on kernel functions. The kernels implement nonlinear transformations among feature spaces. The basic idea of a hybrid kernel is to combine several kernels with different characteristics (in a certain proportion) so that the combined kernel function has better performance. Importantly, RVM places few restrictions on kernel function selection. Moreover, RBF kernels are well suited to capturing local fluctuations, while polynomial kernels are well suited to dealing with global fluctuations. A combination of a typical local kernel (RBF) and a global kernel (polynomial) is therefore used for improved short-term wind power interval forecasting. The hybrid kernel is shown as follows:
$$K(x, x_i) = \lambda\, G(x, x_i) + (1 - \lambda)\, P(x, x_i),$$
where $G(x, x_i)$ is the RBF kernel; $P(x, x_i)$ is the binomial kernel function; $\lambda$ is the weight of the kernel function; $\delta$ is the kernel width of the RBF kernel; and $\lambda$ and $\delta$ are the parameters to be optimized. We employed the grid search method in order to obtain the optimal values of $\lambda$ and $\delta$.
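A hybrid kernel of this kind can be sketched as a convex combination of an RBF kernel and a second-degree polynomial kernel. The exact kernel forms below (a Gaussian with `2 * width**2` in the denominator and a polynomial of the form `(x . z + 1)**2`), the choice of which kernel receives the weight, and all function names are assumptions made for illustration.

```python
import math


def rbf_kernel(x, z, width):
    """Gaussian RBF kernel (local): exp(-||x - z||^2 / (2 * width^2)). Assumed form."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * width ** 2))


def poly_kernel(x, z, degree=2):
    """Polynomial kernel (global): (x . z + 1)^degree. Assumed form."""
    return (sum(a * b for a, b in zip(x, z)) + 1.0) ** degree


def hybrid_kernel(x, z, weight, width):
    """weight * RBF + (1 - weight) * polynomial, with 0 <= weight <= 1."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * rbf_kernel(x, z, width) + (1.0 - weight) * poly_kernel(x, z)
```

A grid search over `weight` and `width` would evaluate the forecasting error for each pair on held-out data and keep the best-performing pair.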

Evaluation Indexes of the Model.
There are many indexes used to evaluate the errors of point forecasting results, such as APE (absolute percentage error), MAPE (mean absolute percentage error), and RMSE (root mean square error) [47][48][49]. The smaller the error, the higher the forecasting accuracy. The assessment methods of interval forecasting differ from those of point forecasting (except when using the MAPE index). Other indexes used to evaluate the effects of interval forecasting results are FICP (forecasting interval coverage percentage) [50] and FIAW (forecasting interval average width). The definitions of these indexes are as follows.
(1) MAPE
$$\mathrm{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{y_i^{*} - y_i}{y_i}\right| \times 100\%,$$
where $y_i^{*}$ represents the forecasting result of the $i$th forecasting sample; $y_i$ represents the true value of the $i$th forecasting sample; and $N$ represents the number of samples. MAPE is used to evaluate the error between the expected forecasting value and the actual value. The smaller the value, the higher the forecasting accuracy.
(2) FICP
$$\mathrm{FICP}^{(1-\alpha)} = \frac{\xi^{(1-\alpha)}}{N} \times 100\%,$$
where $\mathrm{FICP}^{(1-\alpha)}$ represents the interval coverage and $\xi^{(1-\alpha)}$ is the number of actual values falling within the confidence interval (at the confidence level $1 - \alpha$). The FICP index evaluates the credibility of the interval. The greater its value, the higher its credibility.
(3) FIAW
$$\mathrm{FIAW}^{(1-\alpha)} = \frac{1}{N}\sum_{i=1}^{N}\frac{U(x_i) - L(x_i)}{y_i} \times 100\%,$$
where $\mathrm{FIAW}^{(1-\alpha)}$ represents the average width of the confidence interval under the level $1 - \alpha$; $U(x_i)$ and $L(x_i)$ are, respectively, the upper and lower bounds of the $i$th forecasting sample; and $y_i$ refers to the actual value of the $i$th forecasting sample. The FIAW index evaluates the degree of uncertainty of the forecasting results.
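The three indexes can be computed as in the sketch below. The function names are illustrative, and normalizing each interval width by the actual value in FIAW is an assumption inferred from the variables listed in its definition.

```python
def mape(forecasts, actuals):
    """Mean absolute percentage error, in percent."""
    n = len(actuals)
    return 100.0 / n * sum(abs((f - a) / a) for f, a in zip(forecasts, actuals))


def ficp(lower, upper, actuals):
    """Percentage of actual values that fall inside [lower, upper]."""
    covered = sum(1 for lb, ub, a in zip(lower, upper, actuals) if lb <= a <= ub)
    return 100.0 * covered / len(actuals)


def fiaw(lower, upper, actuals):
    """Average interval width relative to the actual value, in percent (assumed form)."""
    n = len(actuals)
    return 100.0 / n * sum((ub - lb) / a for lb, ub, a in zip(lower, upper, actuals))
```

A good interval forecast combines a high FICP with a low FIAW: widening every interval raises coverage trivially, so the two indexes have to be read together.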

Overall Procedures of the EEMD-RT-RVM Model.
In this study, we propose a short-term wind power interval forecasting method based on the EEMD-RT-RVM model. The flow chart of our proposed forecasting model is shown in Figure 3.

Analysis Results of the Demonstration
In order to verify the interval forecasting effects of the EEMD-RT-RVM model under different confidence levels, we chose to use confidence levels of 90% and 60% in our example. The interval forecasting results are shown in Figures 4 and 5. The indexes of MAPE, FICP, and FIAW are used to assess the effects of the interval forecasting. Table 3 shows portions of the forecasting results and the indicator analysis results.
Advances in Meteorology
To prove the superiority of our model, we used the same wind power data to obtain short-term interval forecasts from the RVM model, the EMD-RVM model, and the EEMD-RVM model. We used the indexes of MAPE, FICP, and FIAW, together with the running times, to assess the interval forecasting performance of the models. Table 4 shows the comparison results of the models (under the 90% confidence level).
Moreover, in order to further evaluate the adaptability of the proposed model, wind power data from the actual wind farm on other days in different seasons were examined. For example, February 12, July 22, October 15, and December 17, 2009, were chosen randomly. Based on the time scale of the original wind power data, the 15 min ahead short-term wind power interval forecasting results under the 90% confidence level for these days are shown in Figure 6. The indicator analysis results for MAPE, FICP, and FIAW are organized in Table 5. In Figure 6, the interval width is narrower in July than in October, which means that the data fluctuate more strongly in October than in July. Figure 6 and Table 5 show that the MAPE indicator reflects the effectiveness of the proposed method: the smaller the MAPE, the better the forecasting accuracy, that is, the closer the expected forecasting value is to the actual result. Further, the MAPE values of the different days are all within 6.5% and meet actual project requirements. A smaller FIAW value corresponds to a narrower interval, which accompanies better MAPE results and lower model uncertainty. Meanwhile, the credibility of the forecasting results decreases as the FICP value decreases.
Based on the above analysis, we drew the following five conclusions:
(1) The forecasting results of our proposed model effectively follow the actual wind power value, and the fluctuations are consistent with changes in actual wind power.
(2) Most of the actual wind power values fall within the forecasting intervals at confidence levels of 90% and 60%; however, the number of actual values falling outside the forecasting interval of the 60% confidence level is significantly larger than the number falling outside the interval of the 90% confidence level. This accurately depicts the characteristics of the actual situation and reflects the effectiveness of our interval forecasting results.
(3) The interval width at the 90% confidence level is significantly greater than that at the 60% confidence level. Decreasing the confidence level decreases both the interval width and the interval coverage.
(4) Overall, our proposed model had smaller forecasting errors, narrower interval widths, and higher interval coverage than the other models.
(5) The EEMD has a better theoretical foundation and better noise robustness than the EMD, and it overcomes the mode mixing phenomenon of EMD. Moreover, the use of runs tests uncovers the correlation among the components and reduces the complexity of our model (which contributes to improved forecasting effects and enhanced running efficiency).
In summary, our EEMD-RT-RVM model performed better than the compared models in short-term wind power interval forecasting. Furthermore, our model is applicable to other practical engineering applications.

Conclusions
Owing to the volatility and randomness of natural wind, deterministic wind power forecasting is a difficult and complex task. In this study, we proposed an EEMD-RT-RVM model to achieve more accurate short-term wind power interval forecasting. The EEMD was used to decompose wind power sequences into IMF components and a RES component, which reduced the inherent volatility of the wind power sequences. We then used runs tests to reconstruct new components. Overall, our methods improved the forecasting performance and enhanced running efficiency. Actual wind power data from November 20, 2009, to November 25, 2009 (one point every 15 min), were used to verify the effectiveness and superiority of the proposed EEMD-RT-RVM model, and a quantitative evaluation was conducted based on comprehensive error evaluation criteria and interval evaluation criteria. Simulation results and analysis demonstrate that the volatility and randomness of wind power are reduced by the EEMD method, and the proposed EEMD-RT-RVM model performs better than conventional single models and other combined models. Our proposed EEMD-RT-RVM model not only improves forecasting accuracy, but also significantly reduces the width of the forecasting interval (under the premise of guaranteed interval coverage). Ultimately, our model is suitable for numerous practical applications, and it also serves as a useful reference for the output forecasting of other new energy sources.

Figure 3: Flow chart of the short-term wind power interval forecasting model using EEMD-RT-RVM.

Figure 4: Interval forecasting results with a 90% confidence level.

Figure 5: Interval forecasting results with a 60% confidence level.

Figure 6: Interval forecasting results of four days for the wind farm.

Table 1: The runs value of each component.

Table 2: The composition of each new component.

In the RVM likelihood, $\mathbf{t} = (t_1, \ldots, t_N)^{T}$, $\mathbf{w} = (w_0, w_1, \ldots, w_N)^{T}$, and $\Phi$ is the design matrix given by
$$\Phi = \begin{bmatrix} 1 & K(x_1, x_1) & K(x_1, x_2) & \cdots & K(x_1, x_N) \\ 1 & K(x_2, x_1) & K(x_2, x_2) & \cdots & K(x_2, x_N) \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & K(x_N, x_1) & K(x_N, x_2) & \cdots & K(x_N, x_N) \end{bmatrix}.$$

3.1. Using EEMD to Decompose an Original Wind Power Sequence.
In order to verify the effectiveness of the forecasting model, the whole-year wind power sequence (96 points per day) obtained from a wind farm in Jiangsu province is used as the research object. The installed capacity of this wind farm is 49.5 MW, and it contains 33 wind turbines. In this study, the actual wind power data of the preceding 5 days are taken as the training sample. We then establish the wind power interval forecasting model for the next day, with forecasts made 15 minutes ahead.

Table 3: Interval forecasting results of the EEMD-RT-RVM model.

Table 4: Comparison of the forecasting effects among the six models.

Table 5: Comparison of index results among four different days.