New PSO-SVM Short-Term Wind Power Forecasting Algorithm Based on the CEEMDAN Model

Accurate wind power forecasting can help reduce disturbance to the grid during wind power integration. In this paper, a short-term power forecasting model is established by combining complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN) and the nonlinear fitting characteristics of support vector machines (SVM) to accurately predict wind power. First, the wind power data are preprocessed and decomposed into 6 stable power components using CEEMDAN, thus reducing the impact of excessive forecasting errors at oscillatory points at peaks and valleys. Then, particle swarm optimization (PSO) based on the improved empirical mode decomposition is designed to optimize the kernel function parameter and penalty factor of the SVM, establishing a new short-term CEEMDAN-combined power forecasting model. Finally, each stable component is processed by the power forecasting model, and the results are combined to obtain the final power forecasting value. Analysis of test results and comparative studies shows that the RMSE and MAPE of the new model are only about one-third of those of the traditional SVM algorithm. The forecasting accuracy and speed meet the requirements for safe operation of wind farms.


Introduction
Wind energy is one of the most valuable renewable energy sources for large-scale development and utilization [1]. Due to the volatility and instability of wind energy, large-scale grid connection of wind power may affect the security of the grid. At the same time, when the wind power system is connected to the power grid, it becomes vulnerable and often suffers attacks, such as advanced persistent threats (APTs). To address the weakness of the Internet of Things (IoT) systems in such wind power systems, researchers have proposed methods to improve security, such as constructing environments for threat detection and situational awareness [2][3][4]. Accurate wind power forecasting facilitates the adoption of effective control measures when wind power grid integration encounters disturbance [5]. Both the benefits and the risks of grid integration show why wind power forecasting is necessary. The main focus of our work is the wind power prediction algorithm, to which researchers have devoted great effort. According to the classification of prediction models, wind power prediction methods can be divided into physical methods, statistical methods, learning methods, and combinations of these models. In the physical method, the numerical weather prediction (NWP) model is used to predict wind speed according to the contour lines, roughness, obstacles, air pressure, temperature, and other information around the wind farm. The results are usually used as input for other statistical models or for power prediction of new wind farms.
This method is a relatively mature and accurate medium- and long-term wind power prediction method. Most wind farm prediction systems at home and abroad are based on physical methods. In the specific prediction process, sample analysis and learning methods are also combined with it to optimize the NWP data and improve the accuracy of prediction. Because the physical method does not need a large accumulation of historical data, it is mainly used in new wind farms at present. However, the calculation process is relatively complex and requires the assistance of large-scale computers, which limits its use in a wide range of practical applications.
Statistical and learning methods usually do not consider the physical process of wind speed change, but build a mapping relationship between historical statistics and wind farm output power for prediction. The prediction accuracy of this kind of method decreases as the prediction horizon increases, so it is mostly used for short-term prediction. The latest statistical and learning methods include the Kalman filter method, artificial neural network method, wavelet decomposition method, support vector machine method, probability prediction method, and chaos prediction method. The combination method makes use of the information provided by different models, exploits their respective advantages, and selects an appropriate weighted average form to obtain the combined prediction model. The combination forms usually include the combination of physical and statistical methods, the combination of short-term and medium-term forecasting models, and the combination of statistical models. Compared with single-model prediction, wind power prediction using the combination method can reduce the occurrence of large errors and improve accuracy. At present, the latest approach is to combine the advantages of multiple models to improve the accuracy of prediction.
To accurately forecast wind power, researchers first proposed statistical forecasting methods. Landberg proposed the Prediktor forecasting system [6] to analyze and process wind speed and wind direction data and determine the relationship between wind speed and output power, thus deriving a wind power curve. The statistical method has a simple principle, but low accuracy. Reference [7] used a mixed forecasting model whose forecasting effect was superior to that of a single forecasting model.
In recent years, forecasting methods using machine learning and artificial intelligence have been widely used. Reference [8] established a hybrid RT-ELM model embedded with CEEMDAN, variational mode decomposition (VMD), and AdaBoost to address the nonlinearity of wind speed time series. This model captures the nonlinear characteristics of wind speed time series well but lacks forecasting accuracy. Reference [9] proposed a wind speed forecasting model based on ensemble empirical mode decomposition (EEMD) and the autoregressive integrated moving average model. Reference [10] proposed a short-term wind speed forecasting method based on a hybrid particle swarm algorithm, which realized short-term wind speed forecasting under different quantiles, but when applied to the original power data the forecast showed an excessive gap between the errors at peaks and valleys. Reference [11] proposed a wind speed forecasting method based on CEEMDAN, FPA with chaotic local search (CLSFPA), and neural network classification tree (NNCT). The CEEMDAN-combined model had the advantages of the individual models, and the algorithm was effective in high-precision wind speed prediction. Reference [12] proposed a particle swarm optimization-based support vector machine for power forecasting, but it suffered from excessive errors. The Danish Risø laboratory perfected and applied Prediktor [13]. The IMM Research Institute developed the Wind Power Prediction Tool (WPPT) wind power forecasting system, which can forecast multiple wind farms and regional wind farms. The DTU School of Modeling and Mathematics developed the short-term wind power forecasting system Zephyr by using a combination of physical methods and statistical methods to improve forecasting accuracy [14]. By integrating multiple linear regression, neural networks, fuzzy logic clustering, and other statistical models, AWS Truewind designed the eWind forecasting system, which has been applied in wind power plants [14].
Reference [15] established an EEMD autoregressive (AR) wind speed prediction model and preprocessed the wind speed sequence data through EEMD. Reference [16] established an ultrashort-term wind speed prediction method based on CEEMDAN. CEEMDAN was used to decompose the wind speed time series into several subcomponents to reduce the nonstationary characteristics of the series; the sample entropy of each component was then calculated, and a combined prediction model based on nonnegative constraint theory was constructed for the components with higher sample entropy.
This paper proposes a new PSO-SVM short-term wind power forecasting algorithm based on the CEEMDAN model to solve the problem of low forecasting accuracy. Compared with some other power prediction algorithms, the algorithm is simpler and has higher power prediction accuracy. The approach is structured as follows. First, the original wind power sequence is preprocessed and decomposed to obtain 6 relatively stable power components; second, particle swarm optimization is used to optimize the kernel function parameter and penalty factor of the support vector machine to establish a short-term power forecasting model; finally, each decomposed stable component is processed by the forecasting algorithm, and the results are summed to obtain the final power forecasting value.

CEEMDAN Mode
The CEEMDAN [14,15] is developed on the basis of empirical mode decomposition (EMD) [17][18][19][20][21][22][23][24] and EEMD; it reduces the reconstruction error of EEMD and increases the completeness of signal decomposition. It adds adaptive white noise at each stage of the signal decomposition and computes a single, unique residual signal at that stage; the remaining modal components are then obtained by repeating the process on this residual. The specific decomposition process of the CEEMDAN algorithm is as follows:
Step 1. Add Gaussian white noise with a mean value of 0 to the signal $x(t)$ to be decomposed K times, constructing the sequences to be decomposed $x_i(t)$ $(i = 1, 2, \ldots, K)$ over a total of K experiments:
$$x_i(t) = x(t) + \varepsilon \delta_i(t)$$
where $\varepsilon$ is the weight coefficient of the Gaussian white noise and $\delta_i(t)$ is the Gaussian white noise generated in the i-th experiment.

Journal of Electrical and Computer Engineering
Step 2. Perform EMD decomposition on each sequence $x_i(t)$ to obtain its first intrinsic mode function (IMF), then take the average over the K experiments as the first IMF obtained by CEEMDAN decomposition:
$$\mathrm{IMF}_1(t) = \frac{1}{K}\sum_{i=1}^{K} \mathrm{IMF}_1^i(t), \qquad r_1(t) = x(t) - \mathrm{IMF}_1(t)$$
where $\mathrm{IMF}_1(t)$ represents the first modal component obtained by CEEMDAN decomposition; $\mathrm{IMF}_1^i(t)$ represents the first IMF obtained after EMD decomposition of $x_i(t)$; $r_1(t)$ represents the residual signal obtained after the first decomposition.
Step 3. Add the stage-specific noise to the residual signal obtained at stage $j-1$, and then proceed with EMD decomposition:
$$\mathrm{IMF}_j(t) = \frac{1}{K}\sum_{i=1}^{K} E_1\left(r_{j-1}(t) + \varepsilon_{j-1} E_{j-1}(\delta_i(t))\right), \qquad r_j(t) = r_{j-1}(t) - \mathrm{IMF}_j(t)$$
where $\mathrm{IMF}_j(t)$ represents the j-th modal component obtained by CEEMDAN decomposition; $E_{j-1}(\cdot)$ represents the $(j-1)$-th IMF component obtained after EMD decomposition of the sequence; $\varepsilon_{j-1}$ represents the weight coefficient of the noise added by CEEMDAN to the residual signal at stage $j-1$; $r_j(t)$ represents the residual signal at stage $j$.
Step 4. Iteration stops. If the EMD stop condition is met and the residual signal $r_n(t)$ of the n-th decomposition is monotonic, the iteration stops and the CEEMDAN decomposition ends.
The No. 23 wind turbine of Dabanliang Wind Farm is taken as the object of the simulation analysis. The sampling interval is 10 min, with a total of 200 sampling points. The first 160 sampling points are selected as the training set of the forecasting model, and the remaining 40 sampling points are used as the test set for rolling forecasting. The forecasting time is 20 minutes. The wind power curve after preprocessing of the historical power data is shown in Figures 1 and 2. Figure 1 shows the relationship between power and wind speed. The mean values are 709.69 and 7.44, and the variance values are 354.24 and 2.12, for power and wind speed, respectively.
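The data split described above can be sketched as follows. This is a minimal illustration, not the authors' code: the synthetic series, the helper name `make_samples`, and the choice of 3 lagged inputs are assumptions, and reading the 20-minute forecasting time as 2 steps ahead at 10-minute sampling is an interpretation of the text.

```python
def make_samples(series, n_lags, horizon):
    """Build (lagged inputs, target) pairs for horizon-step-ahead forecasting."""
    X, y = [], []
    for i in range(n_lags, len(series) - horizon + 1):
        X.append(series[i - n_lags:i])       # the n_lags most recent observations
        y.append(series[i + horizon - 1])    # value `horizon` steps after the last input
    return X, y

SAMPLING_MIN = 10   # sampling interval stated in the text
HORIZON_MIN = 20    # forecasting time stated in the text
horizon_steps = HORIZON_MIN // SAMPLING_MIN

# Synthetic stand-in for the 200 preprocessed power samples (assumption).
series = [float(v) for v in range(200)]
train, test = series[:160], series[160:]

# n_lags=3 is a hypothetical choice; the paper does not state the input window.
X_train, y_train = make_samples(train, n_lags=3, horizon=horizon_steps)
```

For rolling forecasting on the test set, the same windowing would be applied as each new observation arrives, so that every forecast uses only data available at that time.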
Using the CEEMDAN algorithm, the original power sequence is decomposed into 6 sets of output submodes, as shown in Figure 3.
By the CEEMDAN algorithm, the original wind power signal is decomposed into 5 IMF components with different frequencies and a residual component. From these 6 sets of data, it can be seen that the first and second sets of signals have the highest fluctuation frequency; they carry much of the detail of the original signal, so their forecasting error is comparatively large. The latter four groups of signals change relatively smoothly and represent the low-frequency part, that is, the part where the wind power changes slowly, so their forecasted values approach the true values.
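Step 1 of the decomposition procedure in this section, together with the averaging idea behind Step 2, can be illustrated with a short sketch. It only builds the K noisy realizations and averages them. It does not implement EMD sifting, and because EMD is nonlinear, the averaging of raw signals shown here is only an intuition for why zero-mean noise tends to cancel across the ensemble. The toy sine signal and the values K = 200 and eps = 0.2 are assumptions.

```python
import math
import random

def noisy_realizations(x, K=200, eps=0.2, seed=0):
    """Step 1: build K copies x_i(t) = x(t) + eps * delta_i(t)."""
    rng = random.Random(seed)
    # delta_i(t) is zero-mean, unit-variance Gaussian white noise.
    return [[v + eps * rng.gauss(0.0, 1.0) for v in x] for _ in range(K)]

def ensemble_mean(realizations):
    """Average the K realizations point by point."""
    K = len(realizations)
    return [sum(column) / K for column in zip(*realizations)]

# Toy signal standing in for the wind power sequence (assumption).
x = [math.sin(2 * math.pi * t / 25) for t in range(50)]
trials = noisy_realizations(x)
mean_signal = ensemble_mean(trials)

# Largest pointwise deviation of the ensemble mean from the clean signal.
max_dev = max(abs(m - v) for m, v in zip(mean_signal, x))
```

With K realizations the standard deviation of the averaged noise falls as 1/sqrt(K), which is why a larger ensemble leaves less residual noise in the averaged components.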

PSO-SVM Wind Power Forecasting Model Design

Support Vector Machine.
During support vector machine [25,26] training, the collected samples are often nonlinear, and no hyperplane can separate nonlinear data directly, so the data must first be converted into a linearly separable form. By means of a kernel function, SVM maps the nonlinear sample space to a high-dimensional linear space. Different types of samples can then be separated by finding the optimal separating hyperplane in that space. There are linear and nonlinear support vector machines. For a linearly separable training sample set $(x_i, y_i)$, $y_i \in \{-1, +1\}$, the classification equation is
$$w \cdot x + b = 0$$
and the corresponding decision function is
$$f(x) = w \cdot x + b$$
The decision function is scaled so that all $x_i$ satisfy $|f(x_i)| \ge 1$, with equality for the samples closest to the optimal classification plane. The minimum distance d between a sample and the plane is then
$$d = \frac{1}{\|w\|}$$
For the samples to be classified correctly by the optimal classification plane, the constraint is
$$y_i (w \cdot x_i + b) \ge 1, \quad i = 1, 2, \ldots, n$$
A support vector sample meets
$$y_i (w \cdot x_i + b) = 1$$
Maximizing d is equivalent to minimizing $\frac{1}{2}\|w\|^2$ under this constraint. Introduce the Lagrangian function to solve the optimization problem [27]:
$$L(w, b, a) = \frac{1}{2}\|w\|^2 - \sum_{i=1}^{n} a_i \left[ y_i (w \cdot x_i + b) - 1 \right]$$
In the formula, $a_i \ge 0$ are the Lagrangian coefficients. Taking the partial derivatives with respect to w and b and setting them to 0 gives
$$w = \sum_{i=1}^{n} a_i y_i x_i, \qquad \sum_{i=1}^{n} a_i y_i = 0$$
Substituting these conditions into the Lagrangian transforms the optimization problem into a dual problem. Its maximization function is
$$\max_{a} \; \sum_{i=1}^{n} a_i - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} a_i a_j y_i y_j (x_i \cdot x_j), \qquad a_i \ge 0, \quad \sum_{i=1}^{n} a_i y_i = 0$$
In the formula, a sample with nonzero $a_i$ is a support vector. Thus, there is a discriminant function f(x):
$$f(x) = \mathrm{sgn}\left( \sum_{i=1}^{n} a_i y_i (x_i \cdot x) + b \right)$$
When solving nonlinear problems, the essential idea is similar to that for linear problems. The best way to deal with such a problem is to establish a high-dimensional mapping: the samples are mapped to a high-dimensional space and classified there, which yields the classification of the original samples [28].
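The kernel-based discriminant function described above can be sketched as follows. The paper does not name the kernel; the radial basis function (RBF) kernel with width parameter g is assumed here because g is the kernel parameter tuned later in the text. The support vectors, labels, and coefficients are hand-picked for illustration and are not the result of solving the dual problem.

```python
import math

def rbf_kernel(x, z, g):
    """RBF kernel K(x, z) = exp(-g * ||x - z||^2) (assumed kernel choice)."""
    d2 = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-g * d2)

def decision_value(x, support_vectors, labels, alphas, b, g):
    """Kernelized decision value: sum_i a_i * y_i * K(x_i, x) + b."""
    return sum(a * y * rbf_kernel(sv, x, g)
               for sv, y, a in zip(support_vectors, labels, alphas)) + b

# Hypothetical support vectors; note that sum_i a_i * y_i = 0 holds.
svs = [[0.0, 0.0], [1.0, 1.0]]
labels = [-1, 1]
alphas = [1.0, 1.0]

score_near_pos = decision_value([0.9, 1.0], svs, labels, alphas, b=0.0, g=0.5)
score_near_neg = decision_value([0.1, 0.0], svs, labels, alphas, b=0.0, g=0.5)
```

The sign of the decision value gives the predicted class. A larger g makes the kernel more local, which is one reason the choice of g strongly affects the fitted model.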
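The regression function f(x) is
$$f(x) = w \cdot \varphi(x) + b$$
where w is the weight vector, b is a constant, and $\varphi(x)$ is the mapping to the high-dimensional feature space; w and b are estimated by minimizing
$$\min_{w, b, \xi, \xi^*} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{n} (\xi_i + \xi_i^*)$$
The constraints are
$$y_i - w \cdot \varphi(x_i) - b \le \varepsilon + \xi_i, \qquad w \cdot \varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \qquad \xi_i, \xi_i^* \ge 0$$
In the formula, C is the penalty factor, $\xi_i$ and $\xi_i^*$ are the relaxation (slack) factors, and $\varepsilon$ is the parameter of the insensitive loss function.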
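Due to the high dimensionality of the feature space, the Lagrangian multiplier method is generally adopted to solve the high-dimensional quadratic programming problem in practical applications:
$$\max_{a_i, b_i} \; -\frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} (a_i - b_i)(a_j - b_j) K(x_i, x_j) + \sum_{i=1}^{n} y_i (a_i - b_i) - \varepsilon \sum_{i=1}^{n} (a_i + b_i)$$
$$\text{s.t.} \quad \sum_{i=1}^{n} (a_i - b_i) = 0, \qquad 0 \le a_i, b_i \le C$$
where $x_i$ and $x_j$ are the input variables, $y_i$ is the output variable, $K(x_i, x_j)$ is the kernel function, and $a_i$ and $b_i$ are the Lagrangian multipliers.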
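The roles of the penalty factor C and the insensitive loss parameter in the regression formulation can be made concrete with a small sketch. It evaluates the primal objective for given predictions, using the fact that at the optimum the sum of the two slack variables of a sample equals its epsilon-insensitive loss. The function names and the numbers are illustrative assumptions.

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Zero inside the eps-tube around the target, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - eps)

def svr_objective(w, C, y_true, y_pred, eps):
    """0.5 * ||w||^2 + C * sum of eps-insensitive losses."""
    regularizer = 0.5 * sum(v * v for v in w)
    slack = sum(eps_insensitive_loss(t, p, eps) for t, p in zip(y_true, y_pred))
    return regularizer + C * slack

# First prediction lies inside the tube (no penalty), second lies outside it.
value = svr_objective(w=[3.0, 4.0], C=10.0,
                      y_true=[1.0, 2.0], y_pred=[1.05, 2.5], eps=0.1)
```

A larger C penalizes deviations outside the tube more heavily relative to the flatness term, which is the trade-off the parameter search described later has to balance.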

Particle Swarm Algorithm.
The particle swarm algorithm [29] is initialized as a group of random particles. In the D-dimensional solution space, the particles follow the current best particle, and the global optimal solution is found after repeated iterations. The PSO algorithm is simple and easy to implement, with few parameters to adjust, so it has the advantages of fast convergence and high accuracy. In each iteration, a particle updates its position by tracking two extreme values. The first is called the individual extreme value, which is the best solution found so far by the particle itself. The other is called the global extreme value, which is the best solution found so far by the entire population.
Suppose that a population contains n particles. In a D-dimensional search space, the velocity and position of the i-th particle in dimension d at iteration k are denoted $v_{id}^{k}$ and $x_{id}^{k}$. Each particle maintains its own best position $p_{id}^{k}$, the population maintains the global best position $p_{gd}^{k}$, and the update equations for the velocity and position of particle i at the k-th iteration are as follows [30]:
$$v_{id}^{k+1} = w v_{id}^{k} + c_1 r_1 \left(p_{id}^{k} - x_{id}^{k}\right) + c_2 r_2 \left(p_{gd}^{k} - x_{id}^{k}\right)$$
$$x_{id}^{k+1} = x_{id}^{k} + v_{id}^{k+1}$$
In the formula, $c_1$ and $c_2$ are positive constants known as acceleration factors, which adjust the maximum step length of the flight toward the individual best position and the global best position, respectively; w is the inertia weight. When the inertia weight is fixed in the range [0.9, 1.2], optimization can reach the best result so that the global optimal solution can be easily found; k is the number of iterations; $r_1$, $r_2$ are random numbers distributed in the interval [0, 1]; the search space dimension is $d = 1, 2, 3, \ldots, D$.
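The velocity and position updates described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `pso_minimize`, the sphere test function, and the swarm size are hypothetical, and the values w = 0.729 and c1 = c2 = 1.494 are commonly used settings chosen here for stable convergence without velocity clamping rather than the settings reported in the text. Positions are also clipped to the search bounds, which is an addition to the plain update rule.

```python
import random

def pso_minimize(objective, bounds, n_particles=20, n_iter=100,
                 w=0.729, c1=1.494, c2=1.494, seed=0):
    """Minimize `objective` over the box `bounds` with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                    # individual extreme values
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # global extreme value
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Inertia term plus pulls toward the individual and global bests.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

def sphere(p):
    """Toy objective with its minimum of 0 at the origin."""
    return sum(v * v for v in p)

best, best_val = pso_minimize(sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

Because the global best is only replaced by strictly better positions, the returned value never gets worse from one iteration to the next, whatever the particles themselves do.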

PSO-SVM Forecasting Model.
The forecasting performance of the support vector machine depends on its kernel function parameter and penalty coefficient. In forecasting, the traditional support vector machine selects these parameters using the cross-validation method, which often fails to achieve the desired effect. Moreover, affected by other objective factors such as wind speed, wind power uncertainty is high. In dealing with multiobjective optimization, the particle swarm optimization algorithm can find the global optimal solution to the problem with a higher probability. Moreover, compared with the traditional random method, it has high computational efficiency and robustness, and it can effectively adapt to sample sequences with high uncertainty. The main idea of PSO-SVM in wind power forecasting is to randomly generate C and g, use them as the initial positions of the particle swarm, and then search for the optimal SVM parameters using the particle swarm algorithm, thereby forecasting the wind power. The specific process is shown in Figure 4.
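The encoding of C and g as particle positions can be sketched as follows. Searching in base-2 logarithmic coordinates and the specific ranges are assumptions borrowed from common SVM tuning practice, not settings stated in the paper. `validation_error` is a hypothetical callable that would train an SVM with the given (C, g) and return its error on held-out data; a smooth toy surrogate stands in for it here.

```python
import math
import random

def init_swarm(n_particles, log2_c_range=(-5.0, 15.0),
               log2_g_range=(-15.0, 3.0), seed=0):
    """Randomly generate initial particles as (log2 C, log2 g) pairs."""
    rng = random.Random(seed)
    return [[rng.uniform(*log2_c_range), rng.uniform(*log2_g_range)]
            for _ in range(n_particles)]

def decode(position):
    """Map a particle position back to the SVM parameters (C, g)."""
    log2_c, log2_g = position
    return 2.0 ** log2_c, 2.0 ** log2_g

def fitness(position, validation_error):
    """Fitness of a particle: forecasting error of the SVM it encodes."""
    C, g = decode(position)
    return validation_error(C, g)

def toy_validation_error(C, g):
    """Hypothetical surrogate with its minimum at C = 8, g = 0.25."""
    return (math.log2(C) - 3.0) ** 2 + (math.log2(g) + 2.0) ** 2

swarm = init_swarm(10)
scores = [fitness(p, toy_validation_error) for p in swarm]
```

In a full PSO-SVM run, these fitness values would drive the individual and global best updates until the iteration limit or an error threshold is reached.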

CEEMDAN-PSO-SVM Forecasting Algorithm.
It can be seen from the 6 sets of decomposed signals in Figure 3 that, compared with the original wind power input signal, the decomposed intrinsic mode functions (IMFs) are relatively stable. Forecasts of stable signals are generally more accurate than forecasts of the original signal. Owing to the great fluctuation and instability of wind power, if the original power sequence is forecasted directly, the forecasting error will often be high, and the accuracy requirements for power forecasting will not be met. If the wind power is instead decomposed by CEEMDAN, each decomposed component is forecasted separately by PSO-SVM, and the forecasted values of all components are then combined, the resulting CEEMDAN-PSO-SVM combined forecasting model further improves the forecasting accuracy and achieves better results.
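The decompose, forecast, and sum structure of the combined model can be sketched as follows. The components are assumed to be already available from a CEEMDAN decomposition, and a persistence forecaster (repeat the last value) stands in for the PSO-SVM model of each component; both the stand-in and the toy numbers are assumptions made only to show how the component forecasts are combined.

```python
def persistence_forecaster(history, horizon):
    """Stand-in forecaster: repeat the last observed value `horizon` times."""
    return [history[-1]] * horizon

def forecast_by_components(components, forecaster, horizon):
    """Forecast each component separately, then sum the forecasts."""
    per_component = [forecaster(c, horizon) for c in components]
    return [sum(values) for values in zip(*per_component)]

# Toy components standing in for the IMFs and the residual (assumption).
components = [
    [1.0, 2.0, 3.0],
    [0.5, 0.0, -0.5],
    [10.0, 10.0, 10.0],
]
combined = forecast_by_components(components, persistence_forecaster, horizon=2)
```

In the combined model, a separately tuned PSO-SVM forecaster would replace the persistence stand-in for each component, so that fast and slow components each get parameters suited to their own behavior.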
The forecasting model steps are as follows:

Simulation Analysis
With the preprocessed wind power sequence as input, SVM, PSO-SVM, and CEEMDAN-PSO-SVM forecasting models are established for error analysis. The forecasted results of each model are compared with the actual values, as shown in Figure 6.
Comparing the actual values with the forecasted values of the three models shows that the forecasting curve of CEEMDAN-PSO-SVM is more consistent with the actual values than those of the SVM and PSO-SVM models, and that the forecasted values of the PSO-SVM model are closer to the true values than those of the SVM model. The mean absolute percentage error (MAPE) and root mean square error (RMSE) of the three models are compared in the error analysis in Table 1. As shown in the table, the best RMSE is 32.34 for CEEMDAN-PSO-SVM, and the worst is 118.97 for SVM. A smaller RMSE means higher prediction accuracy, which indicates that CEEMDAN-PSO-SVM has the highest prediction accuracy. In terms of MAPE, the best result is 4.28% for CEEMDAN-PSO-SVM and the worst is 12.33% for SVM, which also means that the prediction accuracy of CEEMDAN-PSO-SVM is the highest. On both measures, CEEMDAN-PSO-SVM is therefore superior in prediction accuracy to the similar models it is compared with.
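The two error measures used in this comparison can be computed as in the sketch below, which uses their standard definitions; the sample values are illustrative and are not data from the paper.

```python
import math

def rmse(actual, predicted):
    """Root mean square error."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be nonzero."""
    n = len(actual)
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / n

# Illustrative values only.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
example_rmse = rmse(actual, predicted)
example_mape = mape(actual, predicted)
```

Applied to the reported figures, the ratios are 32.34 / 118.97, about 0.27, for RMSE and 4.28 / 12.33, about 0.35, for MAPE, which is consistent with the statement that the errors of the new model are roughly one-third of those of the SVM model.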

Conclusion
This paper designed a new PSO-SVM short-term wind power forecasting algorithm based on the CEEMDAN model. The new CEEMDAN-combined algorithm works on stable components, which reduces the impact of excessive forecasting errors at oscillatory points at peaks and valleys. Compared with some similar traditional power prediction algorithms, it has higher prediction accuracy and faster speed. The RMSE and MAPE of the new model are only about one-third of those of the traditional SVM algorithm. The error analysis shows that the method is effective and feasible. After CEEMDAN decomposition of the original data, the predicted power is more stable and closer to the real value. The CEEMDAN-PSO-SVM prediction model is of great significance for predicting fluctuating and unstable wind power.

Data Availability
All data, models, and code generated or used in this study are included within the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.