A Hybrid Approach for Short-Term Forecasting of Wind Speed

We propose a hybrid method for forecasting the wind speed. The wind speed data is first decomposed into intrinsic mode functions (IMFs) with empirical mode decomposition. Based on the partial autocorrelation factor of the individual IMFs, adaptive methods are then employed for the prediction of IMFs. Least squares-support vector machines are employed for IMFs with weak correlation factor, and autoregressive model with Kalman filter is employed for IMFs with high correlation factor. Multistep prediction with the proposed hybrid method resulted in improved forecasting. Results with wind speed data show that the proposed method provides better forecasting compared to the existing methods.


Introduction
The exponential increase in global energy consumption is leading to rapid depletion of existing fossil fuel resources [1]. This impending scarcity has led the power industry to explore renewable energy sources such as wind, solar, and tidal energy [2,3]. Renewable energy resources attract more attention owing to their pollution-free energy generation capabilities. Wind as a potential source for generating electricity on a large scale has been receiving much attention recently. In China alone, the growth rate of wind farms was reported as 114% in 2009 with a total wind generation capacity of 25805.3 MW [4]. However, stable production of electricity from wind power is quite an arduous task due to the uncertainty and intermittency of wind speed. Because wind power output varies with wind speed, the increasing importance of wind energy necessitates accurate forecasting of wind speed.
In the recent past, a significant amount of research has focused on forecasting wind speed. However, owing to properties of wind speed such as nonstationarity, high fluctuation, and irregularity, accurate forecasting remains a challenge. Generally, forecasting of wind speed is classified into two types: (1) short-term forecasting and (2) long-term forecasting. Short-term forecasting of wind speed affects grid reliability and market-based ancillary service costs [5,6], whereas long-term forecasting provides an idea about a particular site location [7]. The prediction models proposed in the recent past for wind speed prediction are categorized as physical models, time series statistical models, and knowledge-based methods. Each model has its own advantages and disadvantages. Physical models such as Markov models [8] require information regarding the temperature and climatic conditions to build the models. In time series statistical modeling, various techniques such as autoregressive moving average (ARMA) [9,10], autoregressive integrated moving average (ARIMA) [11], Kalman filter [12], model-based approaches [13][14][15], and particle swarm optimization [16] are employed for prediction. Knowledge-based methods have been widely adopted for wind speed forecasting, especially artificial neural networks (ANN) [17,18], radial basis functions [19], fuzzy logic [20], and support vector machines [7].
Recently, hybrid methods based on the divide-and-conquer principle have been proposed for accurate forecasting of wind speed [21][22][23]. In these methods, wind speed data is decomposed into independent components. Each component is then predicted with adaptive algorithms such as ARMA, SVM, or ANN. In [23], empirical mode decomposition (EMD) was employed to decompose the signal into intrinsic mode functions (IMFs), and then an ARMA model with fixed coefficients was employed to forecast the IMFs individually (EMD-ARMA). Further, SVM and ANN were employed to predict the IMFs in [22] and [21], respectively, to improve the forecasting performance. The IMFs obtained from EMD possess different frequency bands and characteristics. For instance, IMF-n, in the lowest frequency band, represents the central tendency of the data, whereas IMF-1, in the highest frequency band, mainly contains noise. Although regression models are effective for time series prediction, they are not effective for the highly nonstationary high-frequency IMFs. On the other hand, the performance of machine learning techniques (SVM, ANN) on low-frequency components may be hampered by overfitting.
To overcome these limitations, in this paper, we propose a new hybrid approach for multistep prediction of the IMFs. Instead of employing a single adaptive algorithm for predicting all the IMFs, we employ a combination of LS-SVM and AR. The prediction algorithm for each IMF is selected based on its partial autocorrelation factor (PACF) and frequency characteristics. In the proposed hybrid approach, LS-SVM is employed for high-frequency IMFs (weak correlation factor), and the AR model with Kalman filter is employed for low-frequency IMFs (high correlation factor). Results show that the proposed hybrid approach provides better forecasting performance compared to the existing methods.
The paper is organized as follows. Section 2 briefly describes EMD, the AR model with Kalman filter, LS-SVM, and the proposed hybrid approach. Section 3 presents the wind speed data collection procedure, the obtained results, and their implications. Section 4 concludes the paper.

Methodology
In this section, we first discuss the formulation of all the methods (EMD, AR model with Kalman filter, and LS-SVM), followed by the proposed hybrid approach.

Empirical Mode Decomposition (EMD)
EMD [24] is a widely accepted method for decomposition of nonlinear and nonstationary signals. The basic idea of EMD is to identify the steady-state intrinsic oscillatory modes by employing the Hilbert-Huang transform. The detailed procedure for the EMD technique is well documented; for details see [24,25]. The flowchart of the EMD process is shown in Figure 1.
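As a rough illustration of how EMD extracts oscillatory modes, the sifting idea can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation used in the paper: the function names `sift_once` and `emd` are hypothetical, the envelopes are pinned to the first and last samples to avoid spline extrapolation, and a fixed number of sifting passes (`n_sift`) stands in for an explicit check of the IMF conditions.

```python
import numpy as np
from scipy.interpolate import CubicSpline
from scipy.signal import argrelextrema


def sift_once(x):
    """One sifting pass: subtract the mean of the cubic-spline envelopes."""
    t = np.arange(len(x))
    max_idx = argrelextrema(x, np.greater)[0]
    min_idx = argrelextrema(x, np.less)[0]
    if len(max_idx) < 2 or len(min_idx) < 2:
        return None  # too few extrema: treat x as the final residue
    # Assumption: pin both envelopes at the end points of the record.
    max_idx = np.r_[0, max_idx, len(x) - 1]
    min_idx = np.r_[0, min_idx, len(x) - 1]
    upper = CubicSpline(max_idx, x[max_idx])(t)
    lower = CubicSpline(min_idx, x[min_idx])(t)
    return x - 0.5 * (upper + lower)


def emd(x, max_imfs=10, n_sift=10):
    """Decompose x into IMFs plus a residue (fixed sifting count assumed)."""
    residue = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        h = sift_once(residue)
        if h is None:
            break
        for _ in range(n_sift - 1):
            h_next = sift_once(h)
            if h_next is None:
                break
            h = h_next
        imfs.append(h)
        residue = residue - h  # the residue becomes the new signal
    return np.array(imfs), residue
```

By construction, the IMFs and the residue returned by `emd` sum back to the input signal, which is what allows the per-IMF forecasts to be aggregated later.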
The process of EMD to decompose the signal $x(t)$ is as follows.
(i) Step 1: initially, all extrema of $x(t)$ are identified.
(ii) Step 2: the local maxima and the local minima are each connected by a cubic spline to form the upper envelope $u(t)$ and the lower envelope $l(t)$, and the mean of the envelopes, $m(t) = (u(t) + l(t))/2$, is subtracted from the signal to obtain $h_1(t) = x(t) - m(t)$.
(iii) $h_1(t)$ can be an IMF if it obeys the following conditions.
(a) Over the whole data series, the number of extrema and the number of zero crossings must either be equal or differ at most by one.
(b) At any point, the mean value of the envelope defined by the local maxima and the envelope defined by the local minima is zero.
(iv) Step 3: if $h_1(t)$ does not obey the above conditions, then $h_1(t)$ is considered as the new signal and the same procedure from Step 1 is followed.
(v) Step 4: if $h_1(t)$ is an IMF, then the residue $r_1(t)$ is calculated as $r_1(t) = x(t) - h_1(t)$. $r_1(t)$ is then considered as the new signal and the same procedure from Step 1 is employed.

Autoregressive Model (AR) with Kalman Filter
The AR model [26] is a type of random process that is popular for prediction of various natural phenomena. It is also one of the linear prediction methods, designed to predict the output of a system based on the previous outputs and the regression coefficients (weights). In this paper, a Kalman filter is combined with the AR model to update the weights and enhance the prediction quality.
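The idea of an AR model whose weights are updated by a Kalman filter can be sketched as follows. This is a minimal sketch, assuming a random-walk model for the weights; the function name `ar_kalman_predict` and the noise settings `q` and `r` are illustrative assumptions, not values reported in the paper.

```python
import numpy as np


def ar_kalman_predict(y, order=2, q=1e-5, r=1e-2):
    """One-step-ahead AR predictions with Kalman-updated weights."""
    y = np.asarray(y, dtype=float)
    w = np.zeros(order)        # AR weights (the Kalman state)
    P = np.eye(order)          # error covariance of the weights
    Q = q * np.eye(order)      # state-noise covariance (assumed value)
    preds = np.full(len(y), np.nan)
    for k in range(order, len(y)):
        u = y[k - order:k][::-1]      # delayed inputs [y[k-1], ..., y[k-order]]
        preds[k] = w @ u              # prediction made before observing y[k]
        err = y[k] - preds[k]         # prediction error
        P_prior = P + Q               # random-walk state: covariance grows by Q
        K = P_prior @ u / (u @ P_prior @ u + r)   # Kalman gain vector
        w = w + K * err               # weight update
        P = P_prior - np.outer(K, u) @ P_prior    # covariance update
    return preds, w
```

For example, a noise-free sinusoid obeys a second-order recursion exactly, so the weights returned by `ar_kalman_predict(y, order=2)` on such a signal should approach the coefficients of that recursion. A larger `q` lets the weights track faster changes at the cost of noisier estimates.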
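The state-space model of an AR model of order $M$ can be given by
Measurement equation:
$$y_k = \mathbf{w}_k^{T}\mathbf{u}_k + v_k,$$
State equation:
$$\mathbf{w}_{k+1} = \mathbf{w}_k + \boldsymbol{\eta}_k,$$
where $\mathbf{w}_k$ is the vector of AR weights, $\mathbf{u}_k = [y_{k-1}, \ldots, y_{k-M}]^{T}$ represents delayed inputs, and $\boldsymbol{\eta}_k$ and $v_k$ are independent white noise processes with Gaussian distribution and variance $\sigma^2$.
The Kalman filter update equations are given by [26]
$$\varepsilon_k = y_k - \hat{\mathbf{w}}_{k-1}^{T}\mathbf{u}_k,$$
$$\mathbf{K}_k = \frac{(\mathbf{P}_{k-1} + \mathbf{Q})\,\mathbf{u}_k}{\mathbf{u}_k^{T}(\mathbf{P}_{k-1} + \mathbf{Q})\,\mathbf{u}_k + R},$$
$$\hat{\mathbf{w}}_k = \hat{\mathbf{w}}_{k-1} + \mathbf{K}_k\,\varepsilon_k,$$
$$\mathbf{P}_k = (\mathbf{I} - \mathbf{K}_k\mathbf{u}_k^{T})(\mathbf{P}_{k-1} + \mathbf{Q}),$$
where $\varepsilon_k$ represents prediction error, $\mathbf{K}$ represents Kalman gain vector, $\mathbf{P}$ represents error covariance matrix, $\mathbf{Q}$ represents covariance of state noise, and $R$ represents covariance of measurement noise.

Least Squares-Support Vector Machines (LS-SVM)
LS-SVM [27] is the least squares version of the support vector machine (SVM). In LS-SVM, the regression approximation addresses the problem of estimating a function based on given training data $\{\mathbf{s}_i, y_i\}_{i=1}^{N}$ with $\mathbf{s}_i$ as a $d$-dimensional input vector and $y_i$ as the corresponding output. A brief formulation for LS-SVM is provided here; for more information see [27]. The regression model for LS-SVM can be given in the form
$$y = \mathbf{w}^{T}\varphi(\mathbf{s}) + b,$$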
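where $\mathbf{w}$ is the weight vector and $b$ is the bias. The optimization problem for the function estimation with LS-SVM is defined as follows:
$$\min_{\mathbf{w},\,b,\,e}\; J(\mathbf{w}, e) = \frac{1}{2}\mathbf{w}^{T}\mathbf{w} + \frac{\gamma}{2}\sum_{i=1}^{N} e_i^{2},$$
subject to the constraints $y_i = \mathbf{w}^{T}\varphi(\mathbf{s}_i) + b + e_i$; $i = 1, 2, \ldots, N$, where $\gamma$ is a regularization constant and $e_i$ is the estimation error.
The Lagrangian function for the optimization problem can be given as
$$L(\mathbf{w}, b, e; \alpha) = J(\mathbf{w}, e) - \sum_{i=1}^{N}\alpha_i\left\{\mathbf{w}^{T}\varphi(\mathbf{s}_i) + b + e_i - y_i\right\},$$
where $\alpha_i$ are the Lagrange multipliers. Solving the optimality conditions of the Lagrangian gives the LS-SVM estimate
$$\hat{y}(\mathbf{s}) = \sum_{i=1}^{N}\alpha_i K(\mathbf{s}, \mathbf{s}_i) + b,$$
where $K(\cdot, \cdot)$ represents the kernel function. The RBF kernel employed in this paper is $K(\mathbf{s}, \mathbf{s}_i) = \exp\{-\|\mathbf{s} - \mathbf{s}_i\|^{2}/\sigma^{2}\}$.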
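In the standard LS-SVM formulation, eliminating the weight vector and the errors from the optimality conditions leaves a single linear system in the bias and the Lagrange multipliers. A minimal sketch of that system with the RBF kernel above is given below; the function names `rbf_kernel`, `lssvm_fit`, and `lssvm_predict` and the default values of `gamma` and `sigma` are assumptions made for illustration, not the authors' code or settings.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """K(s, s') = exp(-||s - s'||^2 / sigma^2) for all row pairs of A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma**2)


def lssvm_fit(S, y, gamma=100.0, sigma=1.0):
    """Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]."""
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf_kernel(S, S, sigma) + np.eye(n) / gamma
    rhs = np.concatenate(([0.0], y))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]  # alpha (multipliers), b (bias)


def lssvm_predict(S_train, alpha, b, S_new, sigma=1.0):
    """Evaluate sum_i alpha_i K(s, s_i) + b at the rows of S_new."""
    return rbf_kernel(S_new, S_train, sigma) @ alpha + b
```

For time series use, each row of `S` would hold a window of past samples of an IMF and `y` the sample that follows it. Because training reduces to one dense linear solve, the cost grows roughly cubically with the number of training samples.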

The Hybrid EMD-LSSVM-AR Model.
The procedure for forecasting the wind speed with the proposed hybrid method comprises three stages, as shown in Figure 2. In the first stage, owing to the nonstationary and stochastic characteristics of the wind speed time series, the signal is decomposed into meaningful local time scales by employing EMD [28].
In the second stage, each decomposed component is predicted individually with either LS-SVM or the AR model. The selection of the adaptive algorithm for an IMF is based on the PACF and the frequency content of that IMF. LS-SVM is employed for weakly correlated IMFs and the AR model for highly correlated IMFs. In the final stage, the predictions are aggregated to obtain the final forecasting result.
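The second-stage selection can be illustrated with a short sketch. The text gives no numerical criterion for "weak" versus "high" correlation, so the threshold of 0.5 on the largest absolute PACF value is purely a hypothetical rule; the function names `pacf` and `choose_model` and the use of the Durbin-Levinson recursion are likewise assumptions.

```python
import numpy as np


def pacf(x, max_lag=10):
    """Partial autocorrelations at lags 1..max_lag (Durbin-Levinson)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    # Biased sample autocorrelations r[0..max_lag], with r[0] = 1.
    r = np.array([x[:n - k] @ x[k:] for k in range(max_lag + 1)]) / (x @ x)
    phi = np.zeros((max_lag + 1, max_lag + 1))
    out = np.zeros(max_lag)
    phi[1, 1] = out[0] = r[1]
    for k in range(2, max_lag + 1):
        prev = phi[k - 1, 1:k]
        num = r[k] - prev @ r[1:k][::-1]
        den = 1.0 - prev @ r[1:k]
        phi[k, k] = out[k - 1] = num / den
        phi[k, 1:k] = prev - phi[k, k] * prev[::-1]
    return out


def choose_model(imf, threshold=0.5, max_lag=10):
    """Hypothetical rule: strong partial autocorrelation selects AR-Kalman."""
    strength = np.max(np.abs(pacf(imf, max_lag)))
    return "AR-Kalman" if strength >= threshold else "LS-SVM"
```

Under this rule, a noise-like component would be routed to LS-SVM and a smooth, strongly autocorrelated component to the AR model with Kalman filter. In practice the threshold would need to be chosen by inspecting the PACF plots of the IMFs.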

Results and Discussion
In this section, the data employed for prediction are described first. Following that, the indices employed to evaluate the forecasting performance are presented. Finally, a performance analysis against the existing methods is discussed.

Wind Speed Data.
Wind data collected at Beloit, Kansas, with a 20-meter anemometer as part of the Western Area Power Administration anemometer loan program are employed for analysis in this paper. The data contain the average wind speed and direction in Beloit for the period 2003-2004. The data were originally made available by Wind Powering America, a DOE Office of Energy Efficiency & Renewable Energy (EERE) program. For illustration, the wind speed profile (in hours) of Beloit is shown in Figure 3. Two data sets are used in this paper for analysis.
(i) Mins data: in this data set, wind speed is recorded every 10 min.
(ii) Hours data: in this data set, wind speed is recorded every hour.
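In this paper, single-step prediction and six-step ahead prediction were performed on the two data sets separately (six-step ahead prediction on mins data covers the same duration as single-step prediction on hours data). Comparative analysis is performed with the existing methods on the same data sets to highlight the advantage of the proposed hybrid approach.

Evaluation of Forecast
The forecasting performance is evaluated with the mean absolute error (MAE) and the mean absolute percentage error (MAPE). The indices are defined as follows:
$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|x(t) - \hat{x}(t)\right|,$$
$$\mathrm{MAPE} = \frac{100\%}{n}\sum_{t=1}^{n}\left|\frac{x(t) - \hat{x}(t)}{x(t)}\right|,$$
where $x(t)$ is the actual observation value for a time period $t$, $\hat{x}(t)$ is the forecast value for the same time period, and $n$ is the number of forecast points. The MAE reveals the average absolute deviation between the true value and the forecast value, whereas MAPE has a good sensitivity to small changes in data.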
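For reference, the two indices can be computed as in the sketch below. It uses the standard definitions of MAE and MAPE, with MAPE expressed in percent as an assumption; the function names are illustrative.

```python
import numpy as np


def mae(actual, forecast):
    """Mean absolute error between observations and forecasts."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return np.mean(np.abs(actual - forecast))


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual must be nonzero)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs((actual - forecast) / actual))
```

For instance, with observations `[2, 4]` and forecasts `[1, 6]`, the absolute errors are 1 and 2, so the MAE is 1.5; both relative errors are 0.5, so the MAPE is 50%. Note that MAPE is undefined when an observed wind speed is exactly zero, so calm periods need separate handling.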

Performance Analysis.
In this subsection, forecasting of both mins data and hours data is performed with the proposed hybrid model. Further, a comparative analysis is performed with EMD-AR and EMD-LSSVM. In Stage 1 of the proposed hybrid approach, the wind speed data (mins data) is decomposed into nine IMFs with EMD. The PACF of each IMF is then computed independently. The IMFs and the corresponding PACFs are shown in Figure 4. Based on the obtained PACF, in Stage 2, LS-SVM is selected for prediction of the first two IMFs (IMF-1 and IMF-2), and the AR model with Kalman filter is selected for the remaining seven IMFs. Using the trial-and-error method, the parameters of LS-SVM are initialized as = 100, = 50, and = 1000. Based on the PACF, second order was identified for the AR model. In Stage 3, the predictions of all the IMFs are aggregated to obtain the final forecasting results.
For hourly forecasting with mins data, six-step ahead prediction is performed. Results obtained for six-step prediction and single-step prediction with the proposed method, along with the existing methods, are tabulated in Table 1. Single-step prediction is performed on hours data for one-hour-ahead forecasting. The procedure employed for predicting the IMFs is similar to the six-step prediction procedure. With the proposed method for single-step prediction, an MAE of 0.016 was obtained. With EMD-AR and EMD-LSSVM, the MAE obtained was 5.8 and 4.8, respectively. For illustration, the forecasting results of all three methods for six-step ahead prediction are shown in Figure 5.
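The text does not state how the six-step ahead forecasts are generated. One common option, shown here purely as an assumption, is the recursive strategy: a one-step predictor is applied repeatedly and each prediction is fed back as an input. The sketch uses a fixed AR weight vector for simplicity; the function name `recursive_forecast` is hypothetical.

```python
import numpy as np


def recursive_forecast(history, weights, steps=6):
    """Iterate a fixed AR model `steps` times, feeding predictions back in."""
    order = len(weights)
    window = [float(v) for v in history[-order:]]  # oldest ... newest
    out = []
    for _ in range(steps):
        u = window[::-1]                     # most recent value first
        y_next = float(np.dot(weights, u))   # one-step-ahead prediction
        out.append(y_next)
        window = window[1:] + [y_next]       # slide the window forward
    return np.array(out)
```

With 10-min data, `steps=6` corresponds to a one-hour horizon. Applied per IMF, the step-wise forecasts of all IMFs would then be summed to give the wind speed forecast at each horizon.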
To highlight the robustness of the proposed method, six-step ahead prediction with hours data is also performed. Results obtained are tabulated in Table 1. Results show that the proposed method provides better forecasting compared to the existing methods. For illustration, six-step ahead forecasting results with hours data for all three methods are shown in Figure 6.

Conclusions
In this paper, a hybrid approach that combines EMD, LS-SVM, and the AR model with Kalman filter is developed for wind speed forecasting. The data was first decomposed into IMFs with EMD; then, based on the PACF, multistep prediction was performed with LS-SVM for some IMFs and the AR model with Kalman filter for the rest of the IMFs. With the proposed method, six-step ahead forecasting for both mins data and hours data was performed. A comparative analysis with the existing methods EMD-AR and EMD-LSSVM highlights the advantages of the proposed method. Results show that the proposed hybrid approach provides better forecasting compared to the existing approaches.

Nomenclature
$u(t)$: Upper envelope
$m(t)$: Mean of the envelope
$\boldsymbol{\eta}$ and $v$: State noise and measurement noise
$\mathbf{K}$: Kalman gain vector
$\mathbf{P}$: Error covariance matrix
$R$: Variance of measurement noise
$N$: Number of training samples
$\varphi(\cdot)$: Nonlinear mapping
$\gamma$: Regularization constant
$e$: Estimation error with LS-SVM
$\hat{y}$: Predicted output with LS-SVM.