Short-Term Wind Speed Forecasting Using the Data Processing Approach and the Support Vector Machine Model Optimized by the Improved Cuckoo Search Parameter Estimation Algorithm

Power systems are put at risk when a power-grid collapse occurs. As a clean and renewable resource, wind energy plays an increasingly vital role in reducing air pollution, and wind power generation has become an important way to produce electrical power. Accurate wind power and wind speed forecasting are therefore needed. In this research, a novel short-term wind speed forecasting portfolio is proposed using the following three procedures: (I) data preprocessing: apart from the regular normalization preprocessing, the data are preprocessed through empirical mode decomposition (EMD), which reduces the effect of noise on the wind speed data; (II) artificially intelligent parameter optimization introduction: the unknown parameters in the support vector machine (SVM) model are optimized by the cuckoo search (CS) algorithm; (III) parameter optimization approach modification: an improved parameter optimization approach, called the SDCS model, based on the CS algorithm and the steepest descent (SD) method is proposed. The comparison results show that the simple and effective portfolio EMD-SDCS-SVM produces promising predictions and performs better than the individual forecasting components, with very small root mean squared errors and mean absolute percentage errors.


Introduction
The demand for clean and renewable energy resources has increased significantly, since the acid emissions and air pollution caused by burning fossil fuels have heavily polluted the world environment. As a clean and renewable resource, wind energy plays an increasingly vital role in energy supply, and wind power generation has become an important way to generate electrical power. However, the stochastic fluctuation of wind makes it difficult to forecast [1][2][3]. Efforts to improve the accuracy of wind speed forecasting therefore continue, so as to lower the likelihood of a power-grid collapse.
Wind speed forecasting is an important foundation and prerequisite for the prediction of wind power generation. More accurate wind speed forecasts can reduce rotating-equipment and operating costs and raise the limit on wind power penetration. At the same time, precise wind speed prediction helps the dispatching department adjust its schedule in a timely manner, which reduces the impact of wind power on the grid, effectively avoids the adverse effects of wind farms on the power system, and enhances the competitiveness of wind power in the electricity market.
In the literature, statistically based and neural network-based methods are the two classes of models most widely used to forecast wind speed [4][5][6][7]. With the development of artificial intelligence techniques, several artificial intelligence methods have been presented, such as artificial neural networks, fuzzy logic methods, and the support vector machine. Guo et al. [8] presented a wind speed forecasting strategy based on the chaotic time series modeling technique and the Apriori algorithm. Barbounis et al. [9] employed three different types of neural network (NN) models to forecast the hourly wind speed (up to 3 days ahead) in a wind park located on the Greek island of Crete. However, there are several unknown parameters in the NN model. Thus, many researchers have indicated the need to optimize the parameters in the NN model to improve wind speed forecasting accuracy. Wang and Hu [10] improved the performance of the back propagation (BP) NN model in the wind speed forecasting field by optimizing the parameters in the BP model. Both the statistical and the NN-based models have been used by Azad et al. [11] to solve the long-term wind speed forecasting problem for two stations in Malaysia. However, wind speed forecasting results obtained by neural network models are not always superior to those obtained by other models. Chen and Yu [12] developed a new model by integrating the unscented Kalman filter with the support vector regression-based state-space model. Comparison results indicated that the newly proposed model outperforms the NN model. Apart from the NN models, the parameter optimization strategy has also been applied to other wind speed forecasting models. Gani et al. [13] combined the firefly algorithm with the SVM algorithm for short-term wind speed forecasting, where the firefly algorithm is used to optimize the parameters of the SVM, and obtained accurate forecasting results.
Compared with artificial intelligence models, statistical approaches are less expensive and intrusive and, hence, more practical in forecasting wind power generation. Statistical models are widely used for short-term wind forecasting, predicting wind conditions several hours in advance, which is particularly useful for wind power generation [14]. However, their performance on nonlinear wind speed time series is often unsatisfactory, especially in multistep prediction, where the error increases significantly as the prediction horizon is extended. The new paradigm of big data stream mobile computing is quickly gaining momentum [15], while wind speed forecasting results have been applied to many different areas [16].
The existing wind speed forecasting models have the following disadvantages. (1) Some of the existing models take no account of the randomness, instability, and large fluctuation of the wind speed data, which may lead to a high forecasting error. Therefore, in this research, a model based on the empirical mode decomposition (EMD) technique is utilized to adaptively decompose the original wind speed data into a finite number of intrinsic mode functions with similar properties, which facilitates modeling. (2) The existing traditional parameter estimation methods, such as moment estimation or likelihood estimation, are not dynamic and require solving equations with a great deal of calculation. Therefore, the artificial intelligent parameter estimation method named the cuckoo search (CS) algorithm is used in this paper to estimate the unknown parameters in the forecasting model. (3) Although some researchers have applied artificial intelligent approaches to parameter estimation, they adopted the original approach without considering its deficiencies. Thus, in this paper, the steepest descent (SD) method is used to improve the CS algorithm so as to enhance the convergence rate.
Based on the above motivations, this research proposes a new short-term wind speed forecasting portfolio that not only maintains the characteristics of the wind speed data but also automatically estimates the unknown parameters in the forecasting model with a considerable convergence rate. The portfolio consists of the following three procedures: (I) data preprocessing: apart from regular normalization preprocessing, the data are preprocessed through the EMD model, which reduces the effect of noise on the wind speed data; (II) artificially intelligent parameter optimization introduction: the unknown parameters in the support vector machine (SVM) model are optimized by the cuckoo search (CS) algorithm; (III) parameter optimization approach modification: although the original CS algorithm is simple and efficient, it has disadvantages such as insufficient search vigor and slow search speed during the latter part of the search. Therefore, this paper proposes an improved parameter optimization approach based on the CS algorithm and the steepest descent (SD) method, which is abbreviated as the SDCS model. The performance of the developed EMD-SDCS-SVM model is compared with that of the individual forecasting components using two error evaluation criteria: the root mean squared error and the mean absolute percentage error.
The paper is organized as follows: Section 2 introduces related methodologies, Section 3 presents the simulation examples and discussions, and the last section presents concluding remarks.

Related Methodologies
2.1. Data Preprocessing Approach. Data preprocessing is a common way to improve forecasting accuracy, especially for data with high noise and different scales. This paper focuses on handling these two problems by using the EMD model and the normalization preprocessing approach, respectively.

Empirical Mode Decomposition Model.
The EMD model is an adaptive decomposition approach proposed by Baccarelli et al. [15]. It is used in a wide range of applications, especially in dealing with nonlinear time series. The EMD model decomposes the original time series into several sequences with different scales, called intrinsic mode functions (IMFs), as well as a residual sequence. All IMFs must satisfy two requirements:
(a) The number of extreme points (all maximum and minimum points are included) must be equal to the number of zero crossings or differ from it by no more than one.
(b) In all cases, the average of the envelopes defined by the local maxima and minima must be zero.
With the above two requirements, a signal sequence $x(t)$ can be decomposed with the assistance of the EMD method [16] through the following steps.
Step 1. Calculate all the local extrema (including all the minimum and maximum values).
Step 2. Connect the local maxima by a cubic spline to generate the upper envelope, and similarly produce the lower envelope by connecting all the local minima with a cubic spline interpolation. The two envelopes are represented by $e_{\text{upper}}(t)$ and $e_{\text{lower}}(t)$, respectively.
Step 3. Calculate the average value of the two envelopes, $m_1(t)$, by
$$m_1(t) = \frac{e_{\text{upper}}(t) + e_{\text{lower}}(t)}{2}.$$
Step 4. Calculate the difference $h_1(t)$ between the data and $m_1(t)$ by
$$h_1(t) = x(t) - m_1(t).$$
Step 5. Judge whether $h_1(t)$ satisfies the two requirements of the IMFs. If not, regard $h_1(t)$ as the original signal sequence; then $h_{11}(t) = h_1(t) - m_{11}(t)$. Repeat this process $k$ times until $h_{1k}(t)$, which is calculated by $h_{1k}(t) = h_{1(k-1)}(t) - m_{1k}(t)$, is an IMF. The first IMF sequence is obtained by
$$c_1(t) = h_{1k}(t).$$
Step 6. Calculate the first residual sequence according to
$$r_1(t) = x(t) - c_1(t).$$
Step 7. Regard $r_1(t)$ as the raw data and return to Step 1, repeating this procedure until the final residue $r_n(t)$ turns into either a monotonic function or a function from which no more IMF sequences can be extracted. Finally, the original signal sequence is decomposed into
$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t).$$
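Step 1 and requirement (a) lend themselves to a small illustration. The following is a minimal sketch, not the authors' implementation: it finds strict interior extrema, counts zero crossings as sign changes between consecutive samples, and checks requirement (a). The function names and the test signal are hypothetical, and requirement (b) and the spline envelopes of Steps 2 and 3 are deliberately left out.

```python
import math


def local_extrema(x):
    """Step 1: indices of strict interior local maxima and minima."""
    maxima = [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1]]
    minima = [i for i in range(1, len(x) - 1) if x[i - 1] > x[i] < x[i + 1]]
    return maxima, minima


def zero_crossings(x):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(x, x[1:]) if a * b < 0)


def satisfies_requirement_a(x):
    """Requirement (a): extrema and zero-crossing counts differ by at most one."""
    maxima, minima = local_extrema(x)
    return abs(len(maxima) + len(minima) - zero_crossings(x)) <= 1


# Hypothetical test signal: five periods of a sine wave, phase-shifted so that
# no sample falls exactly on a zero or on a flat peak.
wave = [math.sin(2 * math.pi * (i + 0.3) / 20) for i in range(100)]
shifted = [v + 2.0 for v in wave]  # same shape, but it never crosses zero

print(satisfies_requirement_a(wave))     # the sine wave meets requirement (a)
print(satisfies_requirement_a(shifted))  # the offset copy does not
```

The offset copy fails because it keeps all ten extrema but loses every zero crossing, which is the situation the sifting in Steps 3 to 5 corrects by subtracting the envelope mean.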

Normalization Preprocessing.
To improve the training efficiency and the generalization ability of the SVM model, normalization preprocessing is applied to the IMF sequences obtained by the EMD model. Normalization preprocessing is defined as follows:
$$x_{\text{processed}} = \frac{x - x_{\min}}{x_{\max} - x_{\min}},$$
where $x$ and $x_{\text{processed}}$ represent the original data sequence and the preprocessed data sequence, respectively, and $x_{\min}$ and $x_{\max}$ denote the minimum and the maximum values in the original data sequence, respectively.
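As an illustration of the min-max normalization just described, the sketch below scales a sequence to the interval [0, 1] and maps it back. The sample values are hypothetical. Reusing the training minimum and maximum for the test inputs is one reading of normalizing both sets "with the same setting"; it is an assumption, not something the text spells out.

```python
def min_max_normalize(values, lo=None, hi=None):
    """Scale values to [0, 1]; lo and hi default to the min and max of values."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    if hi == lo:
        raise ValueError("a constant sequence cannot be min-max normalized")
    return [(v - lo) / (hi - lo) for v in values]


def denormalize(scaled, lo, hi):
    """Map normalized values back to the original scale."""
    return [s * (hi - lo) + lo for s in scaled]


# Hypothetical wind speeds in m/s.
train = [4.0, 6.0, 10.0, 8.0]
test = [5.0, 9.0]

lo, hi = min(train), max(train)
train_scaled = min_max_normalize(train)
test_scaled = min_max_normalize(test, lo, hi)  # assumed: reuse training range
restored = denormalize(train_scaled, lo, hi)

print(train_scaled)
print(test_scaled)
```

Forecasts produced on the normalized scale can be passed through `denormalize` before error measures are computed in the original units.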

Support Vector Machine Model.
The SVM model is the core of statistical machine learning theories. It can surmount difficulties that appear in traditional machine learning methods, such as the curse of dimensionality, a tendency to fall into local optima, and overlearning. In addition, it has great generalization ability [17]. Therefore, the SVM model has long been an attractive tool with powerful capabilities in solving classification and regression problems. In this paper, we mainly focus on the SVM model for regression. Suppose that there are $N$ in-sample data points (or training samples) $(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)$, where $x_i \in R^n$ denotes the input vector and $y_i \in R$ is the targeted output corresponding to the input vector $x_i$. The main purpose of the SVM for regression is to find a function $f(x)$ which satisfies the following two requirements: (a) the deviation between $f(x_i)$ and $y_i$ is no greater than a given positive real number $\varepsilon$ for all $i = 1, 2, \ldots, N$, and (b) $f(x)$ is as flat as possible. In the SVM algorithm, $f$ is defined by the formula
$$f(x) = w^{T}\varphi(x) + b,$$
where $\varphi$ maps the input vector into a feature space, and $w$ and $b$ are obtained from
$$\min_{w,\,b,\,\xi,\,\xi^{*}} \; \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{N}\left(\xi_i + \xi_i^{*}\right)$$
subject to
$$y_i - w^{T}\varphi(x_i) - b \le \varepsilon + \xi_i, \qquad w^{T}\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^{*}, \qquad \xi_i,\ \xi_i^{*} \ge 0, \qquad i = 1, 2, \ldots, N.$$
This calculation results in
$$f(x) = \sum_{i=1}^{N}\left(\alpha_i - \alpha_i^{*}\right)K(x_i, x) + b,$$
where $K(\cdot, \cdot)$ is called the kernel function. The following four types of kernel functions are commonly used [18,19]: the linear kernel, the polynomial kernel, the radial basis kernel, and the sigmoid kernel.

Cuckoo Search Algorithm.
The cuckoo search (CS) algorithm is a metaheuristic optimization algorithm proposed in [20] in 2007. It is derived from the behavior of cuckoos, which lay their eggs in the nests of other birds so that those birds hatch the eggs for them. However, once the host birds discover the cuckoo eggs, these eggs are thrown away, or the host birds abandon their nests and build a new nest elsewhere. The CS algorithm is constructed based on three assumptions: (a) only one egg is laid by each cuckoo in a randomly selected nest; (b) the best nests are carried over to the following generations; and (c) the number of available host nests is a constant, and the probability that an egg laid by a cuckoo is discovered by the host bird is $p_a \in [0, 1]$.
In the CS algorithm, each nest represents a solution. The pseudo code of the CS technique [21] presented in Algorithm 1 can aid in understanding the CS process.
The Lévy flight mentioned in the pseudo code of Algorithm 1 is generated according to
$$x_i^{(t+1)} = x_i^{(t)} + \alpha \oplus \text{Lévy}(\lambda), \qquad (13)$$
where $\alpha > 0$ is the step size, which should be related to the scale of the problem of interest. The product $\oplus$ denotes entry-wise multiplication. A Lévy flight is considered when the step-lengths are distributed according to the following probability distribution:
$$\text{Lévy} \sim u = t^{-\lambda}, \qquad 1 < \lambda \le 3,$$
which has an infinite variance. Here, the consecutive steps of a cuckoo search essentially form a random walk process that obeys a power-law step-length distribution with a heavy tail.
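To make the search loop concrete, the following is a simplified sketch of a cuckoo search built on the three assumptions above; it is not a transcription of Algorithm 1. Several choices are assumptions made for illustration: heavy-tailed steps are drawn with Mantegna's algorithm (a common sampler that the text does not name), the step size is scaled by the width of each search interval, and `toy_error` is a hypothetical stand-in for an SVM validation error over two parameters. The defaults of 25 nests and a discovery probability of 0.25 follow the settings quoted in this paper.

```python
import math
import random


def levy_step(beta, rng):
    """One heavy-tailed step via Mantegna's algorithm (assumed sampler)."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def cuckoo_search(objective, bounds, n_nests=25, pa=0.25, alpha=0.01,
                  beta=1.5, iterations=200, seed=0):
    """Minimize objective over the box given by bounds (simplified CS)."""
    rng = random.Random(seed)

    def clip(x):
        return [min(max(xi, lo), hi) for xi, (lo, hi) in zip(x, bounds)]

    def random_nest():
        return [rng.uniform(lo, hi) for lo, hi in bounds]

    nests = [random_nest() for _ in range(n_nests)]
    fitness = [objective(x) for x in nests]
    for _ in range(iterations):
        for i in range(n_nests):
            # Levy-flight move: x_new = x + alpha * Levy(lambda), entry-wise.
            candidate = clip([xi + alpha * (hi - lo) * levy_step(beta, rng)
                              for xi, (lo, hi) in zip(nests[i], bounds)])
            f_candidate = objective(candidate)
            j = rng.randrange(n_nests)  # the egg lands in a random nest
            if f_candidate < fitness[j]:
                nests[j], fitness[j] = candidate, f_candidate
        # A fraction pa of nests is discovered and rebuilt; the best is kept.
        best = min(range(n_nests), key=fitness.__getitem__)
        for i in range(n_nests):
            if i != best and rng.random() < pa:
                nests[i] = random_nest()
                fitness[i] = objective(nests[i])
    best = min(range(n_nests), key=fitness.__getitem__)
    return nests[best], fitness[best]


# Hypothetical stand-in for a validation error over two SVM parameters.
def toy_error(params):
    c, sigma = params
    return (math.log10(c) - 1.0) ** 2 + (sigma - 0.5) ** 2


search_bounds = [(0.1, 1000.0), (0.01, 10.0)]
best_params, best_error = cuckoo_search(toy_error, search_bounds)
print(best_params, best_error)
```

Because the best nest is never abandoned and a nest is only overwritten by a better candidate, the best error in this sketch cannot increase from one generation to the next.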

Modified Cuckoo Search Method.
Similar to other metaheuristic algorithms, the original CS algorithm is simple and efficient; however, it has disadvantages such as insufficient search vigor and slow search speed during the latter part of the search. As one of the oldest optimization algorithms, the steepest descent (SD) method [22] is simple and intuitive, and many effective optimization algorithms have been established on the basis of it. To overcome the slow convergence rate of the CS algorithm, the SD method is used to modify the CS algorithm, and the modified model is abbreviated as the SDCS model. In the SDCS model, the update in (13) is replaced by a modified update in which the step size $\alpha$ and the step-length distribution are adjusted using the SD search direction $p_k = -\nabla f(x_k)$. The SDCS process can be expressed by the following procedures.
The step size and step-length distribution function of the CS algorithm can be improved by using steepest descent due to its simplicity and flexibility.The final optimal solution can be obtained by modifying the step size and step-length distribution function constantly.
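The steepest descent building block named in the Figure 2 flowchart (select an initial point, test the gradient, take the direction $p_k = -\nabla f(x_k)$, stop at the optimum) can be sketched as follows. This is only the SD component under stated assumptions: a fixed step length, a Euclidean gradient-norm stopping test, and a hypothetical quadratic test function. How the SD direction is coupled to the Lévy-flight step inside SDCS is not reproduced here.

```python
import math


def steepest_descent(grad, x0, step=0.1, eps=1e-6, max_iter=10_000):
    """Fixed-step steepest descent; step, eps, and max_iter are assumed values."""
    x = list(x0)                                         # Step 1: initial point
    for _ in range(max_iter):
        g = grad(x)
        if math.sqrt(sum(gi * gi for gi in g)) < eps:    # Step 2: gradient test
            break
        p = [-gi for gi in g]                            # Step 3: p_k = -grad f(x_k)
        x = [xi + step * pi for xi, pi in zip(x, p)]
    return x                                             # Step 4: approximate optimum


# Hypothetical test function f(x, y) = (x - 1)^2 + 2 (y + 3)^2 and its gradient.
def grad_f(x):
    return [2.0 * (x[0] - 1.0), 4.0 * (x[1] + 3.0)]


x_min = steepest_descent(grad_f, [0.0, 0.0])
print(x_min)
```

For this quadratic the iterates contract toward the minimizer (1, -3) by a constant factor per step, which is the fast local convergence that motivates combining SD with the global but slower CS search.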

Proposed Novel Model.
Based on the above methodologies, we propose a novel short-term wind speed forecasting portfolio with three steps (Figure 1): (I) data preprocessing: both the regular normalization preprocessing model and the EMD approach are used for data preprocessing, which reduces the effect of noise and different scales on the wind speed data; (II) artificially intelligent parameter optimization introduction: the unknown parameters in the SVM model are optimized by the CS algorithm; (III) parameter optimization approach modification: although the original CS algorithm is simple and efficient, it has disadvantages such as insufficient search vigor and slow search speed during the latter part of the search. Therefore, this paper proposes an improved parameter optimization approach based on the CS algorithm and the steepest descent (SD) method, which we call the SDCS model. The final forecasting model is called the EMD-SDCS-SVM model.
The performance of the SVM depends on a good set of parameters, including the penalty parameter and the parameter of the kernel function. The adjustment and selection of these parameters remains a difficult issue in the research field, because the generalization performance of the support vector machine is closely related to the specific parameter values chosen. The penalty coefficient $C$ and the kernel parameter $\sigma$ must be selected by the user. In practical applications, controlling the forecasting complexity is made more difficult by the fact that $C$ and $\sigma$ must be adjusted simultaneously.
(1) The Penalty Coefficient $C$. The penalty coefficient $C$ balances the model between complexity and training error, so that the model generalizes better. Furthermore, the parameter $C$ controls the robustness of the forecasting model. Different training groups have different optimal values. For forecasting problems, if the parameter $C$ is too small, the punishment for misfitted samples in the sample data is small. As a result, the training error becomes larger, and the system's generalization ability is poorer. When new data are forecasted by the model, the fitting error will be very high, and the phenomenon of "less learning" will appear. On the contrary, if the parameter $C$ is too large, the weight of $(1/2)\|w\|^{2}$ will be smaller. Although the fitting error on the available data is very low, the fitting error on new data is very high. This is the so-called "overlearning" phenomenon, and the generalization ability of the model is again very poor. Each sample data group has at least one suitable $C$ for which the SVM generalization performance is the best. Therefore, the correct choice of the parameter $C$ can improve the prediction accuracy of the model.
(2) The Kernel Parameter $\sigma$. For the kernel function of the SVM, the linear kernel function, polynomial kernel function, radial basis kernel function, and sigmoid function are the most commonly used. The width $\sigma$ of the radial basis function is the same for all kernel functions, and $\sigma$ reflects the corresponding width of the inner product kernel for the input. If $\sigma$ is too small, it will lead to overfitting, that is, memorization of the training group. If $\sigma$ is too large, it will make the SVM discriminant function too flat. The kernel width $\sigma$ and the penalty coefficient $C$ affect the shape of the prediction curve of the support vector machine from different angles. In practical applications, a penalty coefficient $C$ or kernel width $\sigma$ that is too large or too small will make the generalization performance of the support vector machine worse.
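The effect of the kernel width can be seen numerically. The sketch below assumes the common Gaussian form $K(x, z) = \exp(-\|x - z\|^{2} / (2\sigma^{2}))$; the exact parameterization used in the experiments is not recoverable from the text, so this form and the sample points are assumptions for illustration.

```python
import math


def rbf_kernel(x, z, sigma):
    """Gaussian kernel exp(-||x - z||^2 / (2 sigma^2)) (assumed form)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


# Two hypothetical input vectors at squared distance 2.
x, z = [0.0, 0.0], [1.0, 1.0]

for sigma in (0.1, 1.0, 10.0):
    print(sigma, rbf_kernel(x, z, sigma))
```

With a very small width the kernel value between distinct points is practically zero, so each training sample influences only its immediate neighborhood (the memorization case); with a very large width the value approaches one for all pairs, so the fitted function becomes nearly flat.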
Based on the analysis of the influence of each parameter on the performance of the SVM, we put forward a time series forecasting model that uses the modified cuckoo search (SDCS) algorithm to optimize the SVM parameters. It not only maintains the characteristics of the time series but also selects the parameters of the SVM automatically, which eliminates the blindness and randomness caused by manual selection. The main procedures of the EMD-SDCS-SVM model are as follows.
Procedure 1. Collect wind speed time series data. Use the EMD to preprocess the wind speed data and reconstitute the new wind speed time series, which will be treated as the training sample of the SVM model.
Procedure 2. Determine the range of $C$ and $\sigma$, the maximum step ($\text{step}_{\max}$), the minimum step ($\text{step}_{\min}$), and the maximum number of iterations. Set the probability that an egg laid by a cuckoo is discovered by the host bird to $p_a = 0.25$, and initialize the number of host nests as $n = 25$. Each nest corresponds to a two-dimensional vector $(C, \sigma)$.
Procedure 3. Search for the optimum value of the two-dimensional vector $(C, \sigma)$ according to the SDCS algorithm. The detailed steps that need to be implemented in this procedure are shown in Figure 2.

Data Division and Parameter Initialization.
Wind speed data recorded by four wind turbines (numbered #1, #2, #3, and #4) during the period from Jan 2, 2011, to Jan 6, 2011, with a time resolution of 10 minutes are used to verify the effectiveness of the newly proposed hybrid model. The data from Jan 2 to Jan 5 are adopted as the in-sample data (i.e., training data), while those on Jan 6 are used as the out-of-sample data (i.e., testing data).
Step 1. The original wind speed series are decomposed into a high-frequency component and a low-frequency component, which represent the noise signal and the main features of the wind speed series, respectively (see Figures 3(a)-3(c)).
Step 2 (data splitting and normalization). The available wind speed series after noise reduction are split into the training set and the test set. The training set includes the input sets and output sets used for training the parameters of the SVM, and the test set consists of the inputs and outputs used for testing the model's forecasting effectiveness. To establish the model, the training datasets and the input test sets are normalized with the same setting (see Figure 3(d)).
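The split into input and output sets can be sketched as a sliding-window construction. This is a minimal illustration under assumptions: the number of lagged inputs (`n_lags=6`) is hypothetical because the text does not state it, the placeholder series stands in for real measurements, and five complete days of 10-minute data without gaps are assumed.

```python
def make_samples(series, n_lags, horizon):
    """Build (inputs, targets): n_lags past values -> value horizon steps ahead."""
    inputs, targets = [], []
    for t in range(n_lags, len(series) - horizon + 1):
        inputs.append(series[t - n_lags:t])      # the n_lags most recent values
        targets.append(series[t + horizon - 1])  # horizon=1 is the next value
    return inputs, targets


POINTS_PER_DAY = 24 * 60 // 10  # 10-minute resolution

# Placeholder series for Jan 2 to Jan 6 (five days); not real measurements.
series = [float(i) for i in range(5 * POINTS_PER_DAY)]

train = series[:4 * POINTS_PER_DAY]   # Jan 2 to Jan 5: in-sample data
test = series[4 * POINTS_PER_DAY:]    # Jan 6: out-of-sample data

X_train, y_train = make_samples(train, n_lags=6, horizon=1)
print(len(train), len(test), len(X_train))
```

Changing `horizon` to 2, 4, or 6 yields the targets for the longer forecasting horizons while the input windows keep the same form.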
Step 3 (initialization: an SVM with two parameters). The penalty coefficient $C$ and the kernel parameter $\sigma$ are shown in Figure 3(e). The number of SVM parameters to be optimized is the dimension of each solution in the SDCS algorithm.
Step 4 (optimization). The SDCS algorithm searches for the parameter values that optimize its objective function.
Step 5 (SVM construction). The best solution obtained by the SDCS algorithm is taken as the final parameter set for SVM training and construction. The terminal condition of training is reaching the maximum number of iterations or observing no further improvement (see Figures 3(d)-3(e)).
Step 6 (EMD-SDCS-SVM construction for the test dataset). The forecasting data of the output test sets are generated by feeding the input test sets into the established optimal SVM (see Figure 3(e)).
Step 7 (evaluation). The quality of the EMD-SDCS-SVM model is assessed by error indices on the test set. With the aim of comprehensive evaluation, MAPE is calculated as well.
To employ the methodologies introduced in Section 2 of this paper, the parameters contained in the models are initialized as follows: in the CS algorithm, the number of host nests is initialized as $n = 25$, and the probability that an egg laid by a cuckoo is discovered by the host bird is given as $p_a = 0.25$. The Gaussian kernel function is chosen for the SVM method. In the GA, the maximum number of iterations is initialized as 50, and the population size is 100. The crossover probability is 0.3 and the mutation probability is 0.1. When the CS algorithm and GA are adapted for SVM

Data Preprocessing Results.
Wind speed data are first preprocessed by the EMD method. Figure 4 shows the IMFs and residue results obtained by the EMD method for the four wind turbines. As indicated in Figure 4, for the #2 and #3 wind turbines, 7 IMF sequences are extracted from the original wind speed training dataset, while 6 IMF sequences are extracted for the other two wind turbines. According to the principle of denoising, eliminating the high-frequency sequence from the IMF sequences helps to obtain a cleaner data sequence, that is, a data sequence with lower noise. In this paper, the first IMF sequence obtained by the EMD method is eliminated from the original data sequence to improve the accuracy of wind speed forecasting. The denoising of the four wind turbines' data by the EMD method is visualized in Figure 5. The final results after denoising with the EMD method and the normalization operation are also presented in Figure 5.

Forecasting Results.
To validate the effectiveness of the EMD-SDCS-SVM model in wind speed forecasting, the model is used to forecast wind speed with four horizons: 1-step-ahead, 2-step-ahead, 4-step-ahead, and 6-step-ahead. The forecasting results obtained by this model are compared with those obtained by the nonparameterization method EMD-SVM, the unmodified parameterization method EMD-CS-SVM, and another parameterization method, EMD-GA-SVM, where GA is the abbreviation for the Genetic Algorithm [23]. Figure 6 presents the forecasting results of the four EMD-based models. In this figure, the wind speed at the center of the circular rings is the smallest, with the value of 0, and the larger the radius, the larger the wind speed value. The difference in radius between two adjacent circular rings is 5. As shown in Figure 6, the forecasting results obtained by these EMD-based models fit the actual wind speed data best when the forecasting horizon is 1-step-ahead, while the fit is the worst in the 6-step-ahead situation; that is, the deviation between the wind speed data forecast by the models and the actual wind speed data becomes larger as the forecasting horizon increases. In addition, the EMD-SVM and the EMD-GA-SVM methods deviate much more significantly from the actual data than the other models.
In addition, the forecast results obtained by these models are analyzed according to the Quantile-Quantile (Q-Q) plot. The $f$ quantile corresponding to a datum $q(f)$ means that approximately a decimal fraction $f$ of the data can be found below the datum. The $f$ quantile is calculated in the following manner: sort the data in a sequence $\{x_i\}_{i=1,2,\ldots,n}$ in ascending order. The sorted data $\{x_{\langle i \rangle}\}_{i=1,2,\ldots,n}$ have rank $i = 1, 2, \ldots, n$. Then, the quantile value $f_i$ for the datum $x_{\langle i \rangle}$ is computed by
$$f_i = \frac{i - 0.5}{n}.$$
The 0.25, 0.5, and 0.75 quantiles are called the lower quartile, the median, and the upper quartile, respectively. The Q-Q plot is used to compare the quantiles of two samples. If the two samples come from the same type of distribution, the plot will be a straight line. A straight reference line that passes through the lower quartile and the upper quartile is helpful for assessing the Q-Q plot. The greater the distance from this reference line, the more likely it is that the two samples come from populations with different distributions.
The vertical and the horizontal axes of the Q-Q plot are the estimated quantiles from the two samples, respectively. If the sizes of these two samples are the same, the Q-Q plot is just a plot of the sorted data in the first sample against the sorted data in the second sample. As an example, Figure 7 provides an empirical Q-Q plot of the quantiles of the actual wind speed sequence versus the quantiles of the forecast data for the #4 wind turbine, where the actual wind speed data sequence is compared with sequences 1, 2, 3, and 4, which denote the wind speed data forecasted by the EMD-SVM model, the EMD-CS-SVM model, the EMD-GA-SVM model, and the EMD-SDCS-SVM model, respectively. The straight line shown in each subplot is the extrapolated line that joins the lower and the upper quartiles, and the vertical axis and the horizontal axis in each subplot are the estimated quantiles from the corresponding forecast data sequence and the actual data sequence. As observed from Figure 7, the forecast values are sometimes larger than the actual values (corresponding to the plus symbols located above the straight line) and sometimes smaller than the actual values (corresponding to the plus symbols located below the straight line). Figure 7 also reveals that the EMD-SDCS-SVM model fits the actual wind speed data best when compared to the other three models.
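For two samples of equal size, the Q-Q construction described above reduces to sorting both samples and pairing them by rank. The sketch below shows this together with the quantile fractions; the plotting position $(i - 0.5)/n$ is a common convention and is assumed here, and the sample values are hypothetical.

```python
def quantile_fractions(n):
    """Fractions f_i = (i - 0.5) / n for ranks i = 1..n (assumed convention)."""
    return [(i - 0.5) / n for i in range(1, n + 1)]


def qq_pairs(actual, forecast):
    """Points of an empirical Q-Q plot for two samples of equal size."""
    if len(actual) != len(forecast):
        raise ValueError("this sketch only handles samples of equal size")
    return list(zip(sorted(actual), sorted(forecast)))


# Hypothetical actual and forecast wind speeds.
actual = [3.0, 1.0, 2.0]
forecast = [2.5, 0.5, 1.5]

print(quantile_fractions(4))
print(qq_pairs(actual, forecast))
```

Each pair gives the horizontal and vertical coordinates of one plus symbol in a plot such as Figure 7; pairs lying on a straight line indicate that the two samples share the same type of distribution.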

Forecasting Error Comparison.
Results presented in Section 3.2 provide graphical visualization of the performance of the different forecasting models. In this section, the superior performance of the EMD-SDCS-SVM model is shown quantitatively. To do this, two error evaluation criteria, the root mean squared error (RMSE) and the mean absolute percentage error (MAPE), are adopted and defined as follows:
$$\text{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \hat{x}_i\right)^{2}}, \qquad \text{MAPE} = \frac{1}{N}\sum_{i=1}^{N}\left|\frac{x_i - \hat{x}_i}{x_i}\right| \times 100\%,$$
where $N$ is the number of data points in the out-of-sample data and $x_i$ and $\hat{x}_i$ are the actual value and the forecasted value, respectively.
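The two criteria can be computed directly from paired actual and forecast values. The following is a minimal sketch with hypothetical numbers; it assumes no actual value is zero, since MAPE is undefined in that case.

```python
import math


def rmse(actual, forecast):
    """Root mean squared error over paired values."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mape(actual, forecast):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


# Hypothetical actual and forecast wind speeds in m/s.
actual = [5.0, 10.0, 8.0]
forecast = [4.0, 10.0, 10.0]

print(round(rmse(actual, forecast), 4))
print(round(mape(actual, forecast), 4))
```

RMSE is expressed in the units of the data and weights large errors more heavily, while MAPE is scale-free, which is why the two are reported together.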
From Table 1 and Figure 8, it can be seen that, compared to the ANN forecasting models, the SVM model achieves favorable forecasting accuracy; in particular, in the four- and six-step-ahead forecasting results, the SVM is superior to the ANN models BPNN, Elman NN, and WNN.
The forecasting error results with different forecasting horizons of these 4 models are given in Table 2 and Figure 9.As observed from Table 2 and Figure 9, the forecasting error values become larger as the forecasting horizon increases.For the #1, #2, and #4 wind turbines, the EMD-SDCS-SVM model always obtains more accurate wind speed forecasting results than the other three models.In addition, for the #3 wind turbine, the EMD-SDCS-SVM model is superior to both the EMD-SVM and the EMD-CS-SVM models, which means that the proposed novel model EMD-SDCS-SVM has made promising predictions and has better performance than its individual forecasting components.

Conclusions
Wind speed forecasting plays a significant part in the economy and security of wind farm operation. Accurate forecasting results have a significant influence on the economy. Recently, academia and industry have paid more attention to wind speed forecasting. More accurate forecasting could reduce costs and risks, improve the security of power systems, and help administrators develop an optimal action program, thereby enhancing the economic and social benefits of power-grid management. Therefore, it is highly desirable to develop techniques that improve wind speed forecasting accuracy. However, individual models do not always achieve a desirable performance. A properly selected hybrid model can reduce certain negative effects that are inherent to each of the individual models; moreover, the hybrid forecasting model can make full use of the advantages of each individual model and is less sensitive, in certain cases, to the factors that make the individual models perform in an undesirable manner.
In this paper, to enhance the forecasting capacity of the proposed combined model, three procedures were integrated: the data preprocessing procedure, the artificial intelligent parameter optimization introduction procedure, and the parameter optimization approach modification procedure. The SVM model used in this paper can handle data with nonlinear features, and the SD technique is adopted to enhance the convergence speed of the CS algorithm, which is utilized to optimize the parameters in the SVM model. The effectiveness and robustness of the proposed approach have been successfully tested on real wind speed data sampled at four wind turbines. Based on the Q-Q plot and the error comparison, the results show that the developed portfolio EMD-SDCS-SVM makes promising predictions and performs better than its individual forecasting components, with very small MAPE and MSE values. For instance, the average MAPE values of the combined model were 0.7138%, 1.0281%, 4.8394%, 0.9239%, and 7.3367%, which are lower than those of BPNN, WNN, and Elman NN.
By improving forecasting accuracy and stability in the wind farm, a large amount of money and energy could be saved. The hybrid model can be applied to forecast the wind speed used in wind power scheduling, which produces various benefits: savings in economic dispatching, lower production costs, and a smaller spinning reserve capacity of the electrical power system. This model is also useful for supporting wind farm decision making in practice. The combined forecasting model, which has high precision, is a promising model for future use. In addition, this hybrid model can be utilized in other forecasting fields, such as product sales forecasting, tourism demand forecasting, early warning and flood forecasting, and traffic-flow forecasting.
Mathematical Problems in Engineering
[Figure 2 panels. Cuckoo search: update by $\alpha \oplus \text{Lévy}(\lambda)$. Steepest descent: Step 1, select the initial point $x_0$; Step 2, check whether $\|\nabla f(x_k)\| < \varepsilon$; Step 3, take $p_k = -\nabla f(x_k)$; Step 4, get the optimal solution, otherwise go on to find the optimal parameter. Modified model SDCS: combine the two by constantly modifying the step size $\alpha$ and the step-length distribution Lévy($\lambda$).]
Figure 2: Process of objective model.

Procedure 4. Use the optimum parameter values obtained in Procedure 3 and the processed data obtained in Procedure 1 to construct the forecasting model and obtain the forecasting results.

Figure 3: Procedures of the proposed forecasting portfolio.

Figure 5: Denoised and normalized results of the original data.

Figure 7: Empirical Q-Q plot of the quantiles of the actual wind speed sequence versus the quantiles of the forecasted data for the #4 wind turbine.

Figure 8: Forecasting results of the SVM and the other ANN models.

Figure 9: MSE and MAPE visualization of the four wind turbines.

Table 1: Forecasting error results of the SVM and ANN forecasting models.

Table 2: Forecasting error results with different forecasting horizons.