Multistep Wind Speed Forecasting Based on Wavelet and Gaussian Processes

Accurate wind speed forecasts are necessary for the safe and economical utilization of renewable energy. Wind speed forecasts can be obtained from statistical models based on historical data. In this paper, a novel W-GP model (a wavelet decomposition based Gaussian process learning paradigm) is proposed for short-term wind speed forecasting. The nonstationary and nonlinear original wind speed series is first decomposed into a set of better-behaved constitutive sub-series by wavelet decomposition. These sub-series are then forecasted separately by the GP method, and the forecast results are summed to form an ensemble forecast for the original wind speed series. The process that produces this forecast is named the W-GP model. Finally, the proposed model is applied to short-term forecasting of the mean hourly and mean daily wind speed for a wind farm located in southern China. The prediction results indicate that the proposed W-GP model, which achieves a mean 13.34% improvement in RMSE (Root Mean Square Error) over the persistence method for mean hourly data and a mean 7.71% improvement for mean daily wind speed data, shows the best forecasting accuracy among the forecasting models compared.


Introduction
Wind power is one of the fastest-developing renewable energy sources; its total installed capacity worldwide was approximately 282 gigawatts (GW) at the end of 2012, with a growth rate of around 20% [1]. However, the variable and uncontrollable characteristics of wind pose several operational challenges. Wind power forecasting is therefore essential for scheduling wind farm unit maintenance and energy reserves [2,3].
Since wind power is a function of wind speed, wind power forecasts depend essentially on wind speed forecasts. Short-term wind speed forecasting, with a prediction horizon from 1 hour to 3 days, is critical for minimizing scheduling errors, which affect grid reliability and market-based ancillary service costs. Broadly speaking, there are two statistical approaches to short-term wind speed prediction: regression models that take Numerical Weather Prediction (NWP) data as inputs, and time series based methods that use only historical data to obtain prediction results. The former downscale information from a global meteorological model to the wind farm location and feed it to a regression model that estimates the future wind; they have advantages in multihour (from several hours to dozens of hours) ahead prediction [4][5][6]. The latter use only historical wind speed data to build models and perform better in multistep ahead prediction (usually 1-4 hours ahead for mean hourly wind data or several days ahead for mean daily data). Since incorporating NWP data requires long computation time (usually several hours) and shows no advantage in multistep ahead forecasts, models built only on historical data are preferred for this prediction horizon. Time series based models for multistep wind forecasting have been investigated in depth and developed using different methodologies, such as autoregressive integrated moving average [7,8], Kalman filter [9,10], artificial neural network [11,12], and support vector machine methods [13].
Recently, the wavelet decomposition method has been applied to establish different hybrid wind speed forecasting models. The main contribution of the wavelet transform is to decompose and reconstruct a wind speed series into a set of better-behaved constitutive series. Each sub-series can then be predicted separately by a model suited to its features; hence, the hybrid model improves the forecasting accuracy. Wavelet combined methods can be found in [14][15][16]. Liu et al. proposed a hybrid method based on wavelet and classical time series analyses to predict short-term wind speed and wind power, which gave more accurate simulation results than the classical time series method and the BP network method, especially for multistep forecasting and jumping data [14]. Catalão et al. presented a novel forecasting method combining artificial neural networks with the wavelet transform. Results from a real-world case study, which used wind speed data from Portugal, show the efficiency of this model [15]. An et al. proposed a prediction model for wind farm power forecasting by combining the wavelet transform, chaotic time series, and the GM(1,1) method [16].
As an effective statistical method, Gaussian Process (GP) has been applied broadly in many domains, including wind energy prediction. Jiang et al. focused on very short-term (<30 min) wind speed prediction using GPs [17]. They evaluated their model on real-world datasets and found that the GP performs better than ARMA (a simpler variant of ARIMA) and Mycielski algorithms [18].
In this paper, a novel hybrid forecasting approach based on the wavelet method and Gaussian Processes (GPs) is proposed for multiple steps ahead wind speed prediction. Compared with earlier work, this paper makes the following contributions. First, the novel combination of the wavelet method and GP (the W-GP model) improves forecasting accuracy, especially as the forecast step grows. Second, extensive simulations were carried out to determine the best W-GP model by comparing the forecasting errors of W-GP models with different decomposition levels. Third, both mean hourly and mean daily wind speed are forecasted by the proposed model.
One year of actual wind speed data from a wind farm in southern China is used to examine the proposed W-GP model. The proposed model is compared with the persistence, MLP (Multilayer Perceptron) neural network, and original GP approaches to demonstrate its effectiveness in terms of forecast accuracy. The forecasting results are given and discussed below. This paper is organized as follows. Section 2 presents the proposed W-GP approach to forecasting wind speed. Section 3 first presents a case study of the detailed forecast process based on the W-GP model and then introduces the three criteria used to evaluate forecasting accuracy, on the basis of which the most suitable model for our database is chosen. Section 4 presents the large-scale simulation results for a real wind speed data set from a wind farm in southern China. Finally, Section 5 outlines the conclusions.

The Proposed W-GP Model
The proposed W-GP approach to short-term wind speed forecasting is a hybrid of GP and the wavelet method. The wavelet method is used to decompose the original wind speed series into a set of sub-series that can be analyzed more easily. The GP method is then used to forecast the future values of each sub-series. Finally, through the inverse wavelet decomposition, the wind speed forecast is obtained by aggregating the forecast values of the sub-series.

Wavelet Method.
The wavelet method is used here to decompose a wind speed series into a set of sub-series. Owing to the filtering effect of the wavelet decomposition, these sub-series behave better than the original wind speed series and can therefore be analyzed more clearly and predicted more accurately.
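Wavelet methods can be divided into two categories: the continuous wavelet transform (CWT) and the discrete wavelet transform (DWT) [19]. In the CWT, a wavelet is defined as a function $\psi(t)$ with zero mean:

$$\int_{-\infty}^{+\infty} \psi(t)\,dt = 0.$$

A signal can be decomposed into a family of wavelets with different scales $a$ and translations $b$:

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{t-b}{a}\right).$$

Thus, the wavelet transform of a signal $f(t)$ at translation $b$ and scale $a$ is defined by

$$W(b,a) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} f(t)\,\psi^{*}\!\left(\frac{t-b}{a}\right) dt.$$

The original signal $f(t)$ can be reconstructed by the inverse wavelet transform:

$$f(t) = \frac{1}{C_{\psi}} \int_{0}^{+\infty}\!\!\int_{-\infty}^{+\infty} W(b,a)\,\psi_{a,b}(t)\,\frac{db\,da}{a^{2}},$$

where $C_{\psi}$ is the admissibility constant of $\psi$. In contrast to the CWT, when the mother wavelet is scaled and translated using only certain scales and positions, the transform is known as the DWT, which is more efficient and just as accurate as the CWT. The definition is as follows:

$$W(m,n) = 2^{-m/2} \sum_{t=0}^{T-1} f(t)\,\psi\!\left(\frac{t - n \cdot 2^{m}}{2^{m}}\right),$$

where $T$ is the length of the signal $f(t)$. The integer variables $m$ and $n$ determine the scaling and translation parameters ($a = 2^{m}$, $b = n \cdot 2^{m}$), and $t$ is the discrete time index.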
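As a minimal illustration of discrete wavelet analysis and synthesis, the sketch below performs one level of the DWT with the Haar wavelet. The Haar wavelet is an assumption made here only because its filters are two taps long; it is not the Db4 wavelet used in this paper. The function names `haar_dwt` and `haar_idwt` and the sample values are hypothetical.

```python
import math

def haar_dwt(signal):
    """One level of the Haar DWT for an even-length sequence.

    Returns (approximation, detail) coefficient lists, each half as long
    as the input. Haar is an illustrative stand-in, not Db4.
    """
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Invert haar_dwt and rebuild the original sequence."""
    s = math.sqrt(2.0)
    restored = []
    for a, d in zip(approx, detail):
        restored.append((a + d) / s)
        restored.append((a - d) / s)
    return restored

# Made-up hourly wind speeds in m/s, for illustration only.
wind = [5.0, 7.0, 6.0, 2.0, 3.0, 3.0, 8.0, 4.0]
approx, detail = haar_dwt(wind)
restored = haar_idwt(approx, detail)
```

The approximation coefficients carry the low-frequency content and the detail coefficients the high-frequency content; applying `haar_dwt` again to `approx` would give the next decomposition level.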

Gaussian Process.
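Recently, there has been much activity concerning the application of Gaussian processes to machine learning tasks. A systematic and detailed explanation of Gaussian process regression can be found in Rasmussen's book [20]; here we only give a brief description of GP regression. A Gaussian process $f(\mathbf{x})$ is completely specified by its mean function and covariance function, written as $f(\mathbf{x}) \sim \mathcal{GP}(m(\mathbf{x}), k(\mathbf{x}, \mathbf{x}'))$, where the mean function $m(\mathbf{x})$ and covariance function $k(\mathbf{x}, \mathbf{x}')$ are defined as follows:

$$m(\mathbf{x}) = E[f(\mathbf{x})], \qquad k(\mathbf{x}, \mathbf{x}') = E[(f(\mathbf{x}) - m(\mathbf{x}))(f(\mathbf{x}') - m(\mathbf{x}'))].$$

Usually, the mean function is assumed to be zero, and the target variables are normalised to have zero mean.
Consider GP in a classic regression problem. Let $\mathbf{x}$ be a feature vector of the input space with dimension $d$ and $y$ the output value to be estimated, based on the training set $D = \{(\mathbf{x}_i, y_i),\ i = 1, \ldots, n\}$. The key point of GP regression is to model the relationship between inputs and targets, that is, to build a function $f$ satisfying

$$y_i = f(\mathbf{x}_i) + \varepsilon_i.$$

The observed value $y$ is assumed to differ from the function value $f(\mathbf{x})$ by additive noise $\varepsilon$, which is assumed to be independent and identically distributed Gaussian with zero mean and variance $\sigma_n^2$, that is, $\varepsilon \sim N(0, \sigma_n^2)$. Note that $y$ is a linear combination of Gaussian variables and, hence, is itself Gaussian. The prior on $\mathbf{y}$ becomes

$$E[\mathbf{y}] = \mathbf{0}, \qquad \operatorname{cov}[\mathbf{y}] = K + \sigma_n^2 I,$$

where $K$ is a matrix with elements $K_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$, which is also known as the kernel function.
Given a training set $D = (X, \mathbf{y})$, our goal is to make predictions of the target variable $y_*$ for a new input $\mathbf{x}_*$. Since we already have $p(\mathbf{y} \mid X, k) = N(\mathbf{0}, K + \sigma_n^2 I)$, the joint distribution including the new input can be written as

$$\begin{bmatrix} \mathbf{y} \\ y_* \end{bmatrix} \sim N\!\left(\mathbf{0}, \begin{bmatrix} K + \sigma_n^2 I & \mathbf{k}_* \\ \mathbf{k}_*^{T} & k(\mathbf{x}_*, \mathbf{x}_*) + \sigma_n^2 \end{bmatrix}\right),$$

where $k(X, \mathbf{x}_*) = [k(\mathbf{x}_1, \mathbf{x}_*), \ldots, k(\mathbf{x}_n, \mathbf{x}_*)]^{T}$, which is shortened to $\mathbf{k}_*$. Then, by the conditioning property of the joint Gaussian distribution, the prediction of the target is given by

$$\hat{y}_* = \mathbf{k}_*^{T} (K + \sigma_n^2 I)^{-1} \mathbf{y}, \qquad \sigma_{y_*}^2 = k(\mathbf{x}_*, \mathbf{x}_*) + \sigma_n^2 - \mathbf{k}_*^{T} (K + \sigma_n^2 I)^{-1} \mathbf{k}_*.$$

This completes the regression model based on the Gaussian process.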
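The predictive mean and variance of GP regression described above can be sketched in a few lines. This is a minimal sketch under stated assumptions: the squared-exponential kernel, its hyperparameters, and the noise variance are illustrative choices (the text does not name the kernel used), the hyperparameters are fixed rather than learned, and a plain Gaussian elimination replaces the Cholesky factorization that a practical implementation would normally use. All function names are hypothetical.

```python
import math

def sq_exp_kernel(a, b, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance (an assumed kernel choice)."""
    dist2 = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return signal_var * math.exp(-0.5 * dist2 / length_scale ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def gp_predict(X, y, x_star, noise_var=1e-2):
    """Return the predictive mean and variance of the noisy target at x_star."""
    n = len(X)
    # K + sigma_n^2 I
    K_noisy = [[sq_exp_kernel(X[i], X[j]) + (noise_var if i == j else 0.0)
                for j in range(n)] for i in range(n)]
    k_star = [sq_exp_kernel(xi, x_star) for xi in X]
    alpha = solve(K_noisy, y)        # (K + sigma_n^2 I)^{-1} y
    v = solve(K_noisy, k_star)       # (K + sigma_n^2 I)^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = (sq_exp_kernel(x_star, x_star) + noise_var
           - sum(k * vi for k, vi in zip(k_star, v)))
    return mean, var

# Toy one-dimensional data, for illustration only.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 0.8, 0.9, 0.1]
mean_at_train, var_at_train = gp_predict(X, y, [1.0])
mean_far, var_far = gp_predict(X, y, [50.0])
```

Near a training input the predictive variance shrinks towards the noise level, while far from the data the prediction falls back to the zero prior mean with the full prior variance plus noise.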
In the GP model for wind speed forecasting, we define the wind speed series as $\{v_t,\ t = 1, \ldots\}$; the input vector is then constructed as $\{v_t, v_{t+1}, \ldots, v_{t+d-1}\}$, and the corresponding output value is $v_{t+d}$. Therefore, the GP model can estimate the wind speed value of the next time period based on historical data.
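The construction of input vectors and targets from the series can be sketched as a sliding window. The function name `make_lagged_samples`, the window length `d=3`, and the sample values are hypothetical and chosen only for illustration.

```python
def make_lagged_samples(series, d):
    """Build (input, target) pairs: d consecutive values predict the next one."""
    inputs, targets = [], []
    for t in range(len(series) - d):
        inputs.append(series[t:t + d])   # window of d past values
        targets.append(series[t + d])    # the value that follows the window
    return inputs, targets

# Made-up wind speeds in m/s.
speeds = [4.1, 4.6, 5.0, 5.3, 4.9, 4.4]
X, y = make_lagged_samples(speeds, d=3)
```

With six observations and a window of three, this yields three training pairs; the last window `[5.0, 5.3, 4.9]` is paired with the target `4.4`.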

Process of W-GP Modelling.
Most studies use symmetric wavelet transforms (WTs) such as Symlet or Morlet for decomposition. However, this type of WT is not suitable for forecasting problems, because a symmetric wavelet needs future information as well as past information [19]. In this paper, a Daubechies wavelet of order 4 (abbreviated as Db4), which is an asymmetric WT, is used to avoid this problem. As shown in Figure 1, the modelling steps of the proposed W-GP method are as follows.
(1) Use the wavelet method to decompose the original wind speed series into a number of sub-series (the number depends on the decomposition level), which can be analyzed and predicted separately. Denote these sub-series as $s_1, \ldots, s_k$.
(2) Build the prediction models for each sub-series based on GP method and estimate the multistep forecasting results.
(3) Through the inverse wavelet decomposition, attain final forecasting results for original wind speed series by aggregating the forecasting value of sub-series.
(4) Calculate the root mean square error (RMSE), the mean absolute error (MAE), and the mean absolute percentage error (MAPE) of forecasting results.
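Steps (1) to (3) can be sketched as a decompose, forecast, and sum routine. The sketch makes two simplifying assumptions that differ from the paper: the decomposition is a two-component pair-mean split (equivalent to the reconstructed level-1 Haar approximation and detail, not a Db4 decomposition), and each sub-series is forecasted by the mean of its last two values rather than by a GP model. All function names are hypothetical.

```python
def split_smooth_detail(series):
    """Split an even-length series into two components that sum to it.

    The smooth part repeats each pair mean; the detail part is the remainder.
    """
    smooth = []
    for i in range(0, len(series), 2):
        pair_mean = (series[i] + series[i + 1]) / 2.0
        smooth.extend([pair_mean, pair_mean])
    detail = [x - s for x, s in zip(series, smooth)]
    return [smooth, detail]

def mean_of_last(sub_series, window=2):
    """Stand-in one-step forecaster (a GP model would be used in W-GP)."""
    recent = sub_series[-window:]
    return sum(recent) / len(recent)

def decompose_forecast_sum(series, decompose, forecast_one_step):
    """Forecast each sub-series separately and aggregate by summation."""
    sub_series = decompose(series)
    sub_forecasts = [forecast_one_step(s) for s in sub_series]
    return sum(sub_forecasts), sub_forecasts

# Made-up wind speeds in m/s.
history = [5.0, 7.0, 6.0, 2.0, 3.0, 3.0, 8.0, 4.0]
forecast, parts = decompose_forecast_sum(history, split_smooth_detail, mean_of_last)
```

Because the components add up to the original series, summing the sub-series forecasts gives a forecast on the scale of the original wind speed; swapping in a different decomposition or forecaster only requires passing different callables.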

A Case Study of W-GP Model
A real-world dataset from a wind farm is used in this paper to evaluate our approach. The dataset comes from a wind farm in Fujian province, a coastal area in southern China where the wind resource is abundant and variable and where the integration of wind farms is important. As shown in Table 1, the turbine height is 80 m. Wind speed is measured continuously at different heights (70 m, 35 m, and 10 m) by a wind tower located in the centre of the wind farm; the data from the 70 m sensor are chosen for calculation, since this sensor is nearest to the turbine height. The original measured data have an interval of 10 min and were averaged into mean hourly and mean daily data for the different forecast targets.
To illustrate the modelling process and analyze the performance of the W-GP model, in this section we use the first 400 hours of data of each month as the training set to build the model and the remaining hours as the test set, and we obtain multihour ahead forecasting results.

Wavelet Decomposition.
According to Section 2, if the Daubechies-4 wavelet is used for an $L$-level discrete wavelet decomposition, the original wind speed series is decomposed into $L + 1$ sub-series: one low-frequency series and $L$ high-frequency series. To facilitate the subsequent modelling, the low-frequency series is denoted $\{s_1\}$; correspondingly, the high-frequency sub-series are denoted $\{s_2\}, \ldots, \{s_{L+1}\}$. Here, we decompose the actual wind speed data of February 2012 to the 3rd level as an example. The original wind speed series is shown in Figure 2, and the sub-series ($\{s_1\}, \ldots, \{s_4\}$) are plotted in Figure 3.
As shown in these figures, the original wind speed series is decomposed into a set of better-behaved constitutive series. Therefore, the sub-series are easier to forecast well, and the final results are more accurate.

Wind Speed Forecasting.
According to Section 2.3, after the sub-series have been obtained by the wavelet method, a prediction model is built for each sub-series based on the GP method. The final forecast value of the original wind speed series is then obtained by aggregating the forecasting results of the sub-series.
Meanwhile, the basic GP model is applied to the same original wind speed data to obtain 60 forecast values. The absolute forecasting error is calculated as the absolute difference between each forecast and the actual wind speed value. The forecasting errors of the GP model and the W-GP model at each hour are compared in Figure 4(b).
As shown in Figure 4, the W-GP model forecasts wind speed more accurately than the basic GP model most of the time.
Since the one-hour-ahead forecast is now available, it can be used to obtain forecasts several hours ahead: the GP model is rebuilt on the series $\{s_1(2), \ldots, s_1(400), \hat{s}_1(1)\}$ and the one-hour-ahead forecast is computed again, which gives the 2-hour-ahead forecast for $s_1$. Forecasts for further steps and for the other sub-series are obtained likewise.
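The recursive scheme described above, in which each forecast is appended to the series and the oldest value is dropped before the next one-step forecast, can be sketched as follows. The callable `fit_and_predict` stands for rebuilding the model on the current series and returning its one-step forecast; here it is replaced by a hypothetical last-difference extrapolation, not a GP model, so that the sketch stays self-contained.

```python
def recursive_forecast(series, fit_and_predict, steps):
    """Multistep forecasting by feeding each forecast back into the series."""
    working = list(series)
    forecasts = []
    for _ in range(steps):
        next_value = fit_and_predict(working)   # one-step-ahead forecast
        forecasts.append(next_value)
        working = working[1:] + [next_value]    # drop oldest, append forecast
    return forecasts

def last_difference_model(series):
    """Stand-in model: extrapolate the most recent change (not a GP)."""
    return series[-1] + (series[-1] - series[-2])

# Made-up sub-series values.
sub_series = [3.0, 3.5, 4.0, 4.5]
forecasts = recursive_forecast(sub_series, last_difference_model, steps=3)
```

Since later steps are computed from earlier forecasts rather than observations, any one-step error propagates forward, which is consistent with accuracy decreasing as the forecast step grows.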
The performance of multihour forecasting by the proposed W-GP model can be observed in Figure 5, the upper part of which displays the 2-hour-ahead forecasting curve, while the lower part presents the corresponding absolute forecast error.
At most of the forecast points, the W-GP model reduces the forecast error of the GP model, and the reduction is more pronounced than for the 1-hour-ahead prediction shown previously. It is therefore reasonable to say that the hybrid W-GP model is effective and applicable to the wind speed forecasting problem and has a greater advantage as the forecast step grows.

Forecasting Accuracy Evaluation.
Accuracy is the most important criterion for comparing alternative forecasting approaches. Therefore, different criteria are used here to evaluate the accuracy of the proposed approach. Accuracy is computed as a function of the actual wind speed that occurred.
Three forecast error measures were employed for model evaluation and model comparison: the root mean square error (RMSE), the mean absolute error (MAE), and the mean absolute percentage error (MAPE). They are defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{t=1}^{N} (y_t - \hat{y}_t)^2}, \qquad \mathrm{MAE} = \frac{1}{N} \sum_{t=1}^{N} |y_t - \hat{y}_t|, \qquad \mathrm{MAPE} = \frac{100\%}{N} \sum_{t=1}^{N} \left| \frac{y_t - \hat{y}_t}{y_t} \right|,$$

where $y_t$ represents the actual observation value at hour $t$, $\hat{y}_t$ represents the forecast value for the same period, and $N$ is the number of forecasted hours.
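The three error measures can be computed as in the following sketch, which assumes their standard definitions. The function names and the sample values are hypothetical; note that MAPE is undefined when an actual value is zero, which can occur in calm periods.

```python
import math

def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def mae(actual, forecast):
    """Mean absolute error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)

# Made-up actual and forecast wind speeds in m/s.
actual = [4.0, 5.0, 8.0, 10.0]
forecast = [5.0, 4.0, 6.0, 10.0]
errors = (rmse(actual, forecast), mae(actual, forecast), mape(actual, forecast))
```

For these values the squared errors are 1, 1, 4, and 0, so the RMSE is the square root of 1.5, the MAE is 1.0 m/s, and the MAPE is 17.5%.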

Comparison of Different Decomposition Levels.
There is currently no specific principle for determining the wavelet decomposition level of the W-GP model, although different decomposition levels may lead to different forecasting performance. Therefore, the most adequate decomposition level for our database should be identified by analyzing simulation results.
As shown in Figure 6, the forecasting error depends strongly on the decomposition level. Furthermore, the error is at its minimum when the discrete wavelet decomposition level of the Daubechies-4 wavelet is 3. Therefore, in this paper, the 3-level decomposed W-GP model is applied throughout the whole simulation process.

Wind Speed Forecasting Results
The actual measured wind speed data from a wind farm in southern China over a whole year have been used in the simulation. Since the wind speed data may sometimes be invalid because of anemometer faults, a preprocessing step replaced invalid speed data by interpolation, after which the whole dataset contains 2 sub-sets: D1, 8700 points of mean hourly data, and D2, 362 points of mean daily data. In order to test the performance of the proposed model reasonably, the sub-set D1 was divided into 12 parts; each part was modeled separately, using the first 400 data points as the training set and the rest as the test set. Similarly, the sub-set D2 was divided into 2 parts, using the first 160 data points of each part as the training set and the rest as the test set. The short-term forecasting results for both mean hourly and mean daily wind speed are presented and analyzed in the following.
Besides the basic GP method, the other models used for comparison in this paper are the persistence and MLP methods. The persistence method is a very simple method that uses the current value as the forecast; it is remarkably effective for short-term prediction and is therefore considered the most classical benchmark in wind forecasting. The MLP network is a very popular machine learning method and has been widely applied in wind power forecasting. We established an MLP model based on the same wind speed dataset and obtained simulation results to compare with the proposed W-GP model.
According to Section 3.3, the evaluation indices applied in this paper are the root mean square error (RMSE), the mean absolute error (MAE), and the mean absolute percentage error (MAPE). The forecasting accuracy of the persistence method, MLP, basic GP, and W-GP models is compared in Tables 2 and 3. Table 2 shows the forecasting errors of the persistence method, MLP, basic GP, and proposed W-GP models for mean hourly wind speed data.
It can be observed that, although forecast performance gets worse as the prediction time grows, the basic GP method normally shows better forecast accuracy than the MLP model, and, compared with the GP method, the proposed W-GP model presents positive error improvements over the entire forecast horizon. The improvement is computed as

$$\Delta_{\mathrm{RMSE}} = \frac{\mathrm{RMSE}_{\mathrm{ref}} - \mathrm{RMSE}_{\text{W-GP}}}{\mathrm{RMSE}_{\mathrm{ref}}} \times 100\%, \qquad (13)$$

where $\mathrm{RMSE}_{\mathrm{ref}}$ is the RMSE of the reference method. With respect to the basic persistence method, the improvement of the W-GP model in RMSE, computed by formula (13), ranges from a minimum of 4.22% to a maximum of 17.02%, with a mean value of 13.34%.
Table 3 shows the forecasting errors of the same models for mean daily wind speed data. Here, the MLP model performs worst, which is understandable given the small training set. Gaussian process based models, in contrast, maintain stable performance without needing a large-scale dataset to train appropriate parameters, which is why the proposed model still shows an obvious advantage over the other forecasts. Compared with the basic persistence method, the improvement of the W-GP model in RMSE, computed by formula (13), ranges from a minimum of −0.26% to a maximum of 11.16%, with a mean value of 7.71%. Although the W-GP model is slightly worse than the persistence method for the 1-day-ahead forecast, the results for larger forecast steps strongly indicate the efficiency of the proposed W-GP model.
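The persistence benchmark and the relative RMSE improvement reported above can be sketched as follows. The sketch assumes that the improvement is the reduction in RMSE relative to the reference method, expressed in percent, so that a negative value means the model is worse than the reference. The function names, the sample series, and the model forecasts are hypothetical.

```python
import math

def persistence_forecast(series, horizon):
    """Persistence: the forecast for time t + horizon is the value at time t.

    The returned list is aligned with the actual values series[horizon:].
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    return series[:-horizon]

def rmse(actual, forecast):
    """Root mean square error."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))

def improvement_percent(rmse_reference, rmse_model):
    """Relative RMSE reduction with respect to the reference, in percent."""
    return (rmse_reference - rmse_model) / rmse_reference * 100.0

# Made-up wind speeds in m/s and made-up model forecasts.
series = [4.0, 6.0, 5.0, 7.0, 6.0]
horizon = 1
actual = series[horizon:]
baseline = persistence_forecast(series, horizon)
model_forecast = [5.0, 5.5, 6.0, 6.5]   # hypothetical output of some model

gain = improvement_percent(rmse(actual, baseline), rmse(actual, model_forecast))
```

In this toy case every model error is half the corresponding persistence error, so the improvement is 50%.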
Since the wavelet method decomposes the original wind speed series into a set of better-behaved constitutive series, the proposed W-GP model achieves a higher level of accuracy in short-term wind speed forecasting. Moreover, the improvement brought by wavelet decomposition becomes more pronounced as the forecast step grows. However, as the prediction time grows, the forecast accuracy of every model decreases severely. It is natural that models based only on historical data behave this way, considering the uncontrollable and unstable nature of wind.
Although the forecast performance of all the models listed in the tables degrades as the prediction time grows, owing to the variable nature of wind, the proposed W-GP model consistently shows better forecasting accuracy and, eventually, a clear advantage.

Conclusion
In this paper, first, a novel forecast model using only historical data is proposed, based on the wavelet method and the Gaussian Process (GP) method, to predict wind speed multiple steps ahead. Second, based on an analysis of the W-GP model's forecasting errors, an appropriate level of wavelet decomposition is chosen to obtain the most accurate model. Finally, the proposed model is applied to a real-world dataset to validate its efficiency.
The simulation results demonstrate the effectiveness and accuracy of the proposed model for short-term wind speed forecasting: it achieves a mean 13.34% improvement in RMSE over the persistence method for mean hourly data and a mean 7.71% improvement for mean daily wind speed data.
Considering the instability of wind, any single methodology based on historical data must have a limit. Future methods are likely to tend towards combined models, and how to combine methods effectively deserves further research.