Short-Term Wind Speed Forecasting Using Decomposition-Based Neural Networks Combining Abnormal Detection Method



Introduction
Renewable energy is considered to be the most promising alternative energy resource and plays a significant role in securing a long-term sustainable energy supply while reducing global atmospheric emissions [1]. Renewable energies are widely expected to develop as clean, alternative energy resources. As the most active member of this group, wind power demonstrates considerable benefits and good potential. Because the development of wind power generation requires accurate information regarding wind resources, especially regional wind speeds, wind-related forecasting techniques have become a focal point for many studies. The literature shows that the cost of prediction errors can reach as high as 10% of a wind farm's total annual income from selling energy [2]. On the other hand, advance estimates of wind speeds can provide useful information for the dispatching sector of a wind-power-connected system. Such a capacity may be able to influence the wind power connection and enhance grid stability. With these improvements, the penetration of wind power could be increased, and the existing energy structures would be greatly changed in the future.
In practice, the main obstacle to the development of the wind power industry is the variability of the turbines' output power, which seriously limits wind power penetration and threatens grid security. As one of the most important factors in estimating the output of wind turbines, wind speed is easily influenced by other meteorological factors, such as air temperature and air pressure, as well as by obstacles and terrain. The uncertainty of wind speed forecasting results not only from the forecasting models but also from measurement errors in the meteorological factors. These measurement errors cannot be eliminated because of systematic errors in the instruments and methods of observation. As a result, wind speed forecasting is not easy to address; moreover, wind speed modeling has become one of the most difficult problems to tackle [3,4]. Thus, this paper focuses on this meaningful area of research.
Researchers have put great effort into wind speed modeling and forecasting. Various wind speed forecasting methods have been proposed in the literature to predict wind speeds at different time horizons. Particularly for short-term wind speed forecasting, which is important for scheduling, controlling, and dispatching energy conversion systems [5], these methods can be classified into two categories: physical methods and statistical methods. Physical methods are often referred to as meteorological predictions of wind speed; they involve the numerical approximation of models that describe the state of the atmosphere [6] and include the WRF model. These models always use physical data, such as temperature, pressure, and topography information, to predict future wind speed [7,8]. Unlike physical models, statistical methods make forecasts by uncovering the relationships within the observed wind speed time series. They use historical wind speed data and sometimes other variables (e.g., wind direction or temperature) to build the statistical structures from which the forecasts are derived. The data used are recorded at the observation site or at other nearby locations where data are available. In the literature, many statistical methods have been applied to this topic, such as the autoregressive integrated moving average (ARIMA) model, Kalman filters, and the generalized autoregressive conditional heteroscedasticity (GARCH) model. These statistical models can be used at any stage in the modeling and often combine various methods into one. Currently, grey models (GM) [9,10] and some other new methods based on artificial intelligence (AI) techniques have been developed to address these problems. Examples include artificial neural networks (ANNs) such as multilayer perceptrons (MLP) [11,12], radial basis function (RBF) networks [13], and recurrent neural networks (RNNs) [14,15], as well as fuzzy logic [16,17]. In practice, the boundary between physical and statistical methods is often blurred, as most modern methods include both.
Because each category of the abovementioned methods has its own strengths and weaknesses, complex modeling and forecasting problems cannot be well addressed by any single one. The combination of different methods has been shown, by both theoretical and empirical findings, to be an efficient and effective way of improving model performance [18-21]. Some early works provide many useful hybrid forecasting models that combine strengths from different models to enhance their forecasting performance. When considering the wind speed forecasting problem, it is difficult to obtain a high accuracy level with a single forecasting technique, mainly due to the strong and random fluctuations of wind speed series. In particular, if the focus falls on short-term wind speeds, data series always show irregular variations on a short time scale, increasing the complexity of modeling short-term wind speeds. Currently, combining different models has become a popular trend in wind speed estimation, and this paper also adopts this concept to address short-term wind speed modeling and forecasting.
The main contribution of this paper is its development of two hybrid methods, consisting of abnormal value detection and modification, decomposition and reconstruction processes, and AI-based training and forecasting. The proposed approaches utilize a novel mixture of three technical tools: the medians of five-three-Hanning (53H) weighted average smoothing method, the ensemble empirical mode decomposition (EEMD) method, and nonlinear autoregressive (NAR) neural networks. The developed approaches are able to provide more accurate forecasting results than traditional techniques to address the tough but significant problem of short-term wind speed forecasting on wind farms.
It is well known that short-term wind speed series are difficult to model or predict, mainly due to their strong and random variation within a short time scale. For this reason, data processing is necessary to filter the abnormal values, which greatly impact the model's performance, and to extract valid information from the raw dataset for wind speed modeling and forecasting. In this paper, the 53H method is adopted as the first step of a hybrid wind speed modeling procedure and used to detect and remove abnormal values from the original raw series. Thus, a modified series can be obtained for use as the input data for the next modeling stage. After that, a decomposition and reconstruction process is carried out with the EEMD method to remove the noisy information contained in the data series, because noise can impose a number of pseudo-variation requirements on models and may affect the correct understanding of data variations [22]. As an adaptive decomposition method, EEMD is based on the local characteristic time scale of the data series. This method constitutes a powerful enhancement of the original empirical mode decomposition (EMD) by adding white noise along with the idea of an ensemble mean. This strategy is applicable to nonlinear and nonstationary processes, such as short-term wind speed series. The major advantage of EEMD may be that it automatically identifies the intrinsic time scales of the data without any assumptions regarding signal stationarity. After decomposition by the EEMD method, the local narrow band components, called intrinsic mode functions (IMFs), can be obtained. Each IMF has its own physical meaning and statistical characteristics. Thus, the denoising procedure can be easily accomplished by removing subcomponents with high frequencies and then reconstructing the data series.
In combination with the EEMD decomposition and reconstruction, a modeling and forecasting procedure based on neural networks is developed. As artificial intelligence becomes increasingly important in the technology industry, interest in neural networks has grown substantially. These systems have a strong ability to mimic natural intelligence and are able to learn from examples because they construct an input-output mapping without any explicit derivation of the model equation. Their greatest strength is that no knowledge of the internal system parameters is required [23] to offer an acceptable solution. Neural networks fall into different categories; the one selected for this paper is the NAR neural network, a type of RNN with a feedback arrangement based on a nonlinear autoregressive model.
This paper combines these three methodologies in two ways. Employing the 53H method in the first stage, a modified series with abnormal values removed is obtained and used for the decomposition process. After the decomposed IMFs are filtered by removing the subcomponents with high frequencies, the first hybrid method models and forecasts each IMF using a NAR neural network. Then, the forecast signal is reconstructed as the final result. The second method reconstructs the signal first by adding the noise-filtered IMFs and then models and forecasts the reconstructed signal with a NAR neural network to obtain the final forecasting result. Both approaches to model combination enhance the model's performance significantly with respect to short-term wind speed forecasting, although the second combination performs better in most cases. The major highlight of this paper is its novel mixture of abnormal detection, signal decomposition, and AI-based training and forecasting processes. In the first hybrid stage, the 53H method contributes to the detection and modification of the abnormal values in the raw datasets; this strategy is quite effective for greatly fluctuating series, such as short-term wind speed observations. In addition, the use of two combined approaches to decomposition and forecasting enables the proposed hybrid methods to capture the complex characteristics of the original wind speed series, thus promoting their applicability to different datasets.
The data sampled in this paper represent wind speed observations at ten-minute intervals that were collected from three stations in western China. For each station, observations in January, April, July, and October were chosen to represent winter, spring, summer, and autumn, respectively. The simulated results indicate that the developed hybrid models have a strong capacity for successful short-term wind speed modeling and forecasting, vastly outperforming the methods chosen for comparison.
The rest of this paper is organized as follows. Section 2 introduces the related methods and the two proposed hybrid models. A case study and model discussion are provided in detail in Section 3. Then, Section 4 presents the conclusions.

Methods
This section reviews the methodologies and works related to the proposed hybrid models.

Medians of Five-Three-Hanning Weighted Average Smoothing.
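The medians of five-three-Hanning weighted average smoothing, presented by Tukey [24,25], is a statistical method for removing abnormal values. It is also termed the 53H method, where "5" denotes median-of-five smoothing, "3" denotes median-of-three smoothing, and "H" denotes Hanning smoothing. The method rests on the principle that the median is a robust estimator of the mean, so that a smooth time sequence can be extracted from the raw signal.
Let the raw time sequence be $x_i$, $i = 1, 2, \ldots, n$, where $n$ is the total number of points in the signal. Then, the 53H method can be expressed as the following steps.
(i) Five-point smoothing: construct a sequence $y_i^{(1)}$ from the five original data points $x_{i-2}$ to $x_{i+2}$, $i = 3, 4, \ldots, n-3, n-2$. For a five-point smooth,
$$y_i^{(1)} = \frac{1}{9}\left(x_{i-2} + 2x_{i-1} + 3x_i + 2x_{i+1} + x_{i+2}\right). \tag{1}$$
(ii) Three-point smoothing: construct a sequence $y_i^{(2)}$ from the three points $y_{i-1}^{(1)}$ to $y_{i+1}^{(1)}$, $i = 4, 5, \ldots, n-4, n-3$. It can be defined as
$$y_i^{(2)} = \frac{1}{3}\left(y_{i-1}^{(1)} + y_i^{(1)} + y_{i+1}^{(1)}\right). \tag{2}$$
(iii) Hanning smoothing: apply the Hanning filter to obtain the final smoothed signal, as
$$y_i^{(3)} = \frac{1}{4}\left(y_{i-1}^{(2)} + 2y_i^{(2)} + y_{i+1}^{(2)}\right), \quad i = 5, 6, \ldots, n-5, n-4. \tag{3}$$
(iv) Construct the sequence $z_i = |x_i - y_i^{(3)}|$, $i = 5, 6, \ldots, n-5, n-4$, and reject the point if $z_i > k\sigma$, where $k$ is a predetermined threshold and $\sigma$ is the standard deviation of $x_i$; replace the abnormal values in the raw time sequence by $y_i^{(3)}$.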
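As an illustration only, the 53H procedure can be sketched in Python with NumPy. This sketch follows the smoothing equations as they are printed in this paper, that is, weighted averages with weights (1, 2, 3, 2, 1)/9, (1, 1, 1)/3, and (1, 2, 1)/4, rather than running medians. The function name `smooth_53h`, the default `k = 3.0`, and the choice to leave the first and last four points untouched are assumptions made for the example, not details taken from the paper.

```python
import numpy as np

def smooth_53h(x, k=3.0):
    """Detect and replace abnormal values, following Eqs. (1)-(3) as printed.

    Returns (cleaned, mask): the modified series and a boolean array that
    marks the replaced points. The first and last four points are left
    unchanged because the smoothed value is undefined there.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    if n < 9:
        raise ValueError("need at least 9 points")
    # Eq. (1): five-point weighted smooth; y1[j] is centred on x[j + 2].
    y1 = np.convolve(x, np.array([1, 2, 3, 2, 1]) / 9.0, mode="valid")
    # Eq. (2): three-point average; y2[j] is centred on x[j + 3].
    y2 = np.convolve(y1, np.ones(3) / 3.0, mode="valid")
    # Eq. (3): Hanning weights; y3[j] is centred on x[j + 4].
    y3 = np.convolve(y2, np.array([1, 2, 1]) / 4.0, mode="valid")

    core = slice(4, n - 4)            # 0-based version of i = 5, ..., n - 4
    z = np.abs(x[core] - y3)          # step (iv): deviation from the smooth
    sigma = x.std()                   # standard deviation of the raw series
    mask = np.zeros(n, dtype=bool)
    mask[core] = z > k * sigma
    cleaned = x.copy()
    cleaned[core] = np.where(mask[core], y3, x[core])
    return cleaned, mask
```

For example, calling `smooth_53h` on a constant series that contains a single large spike flags only the spike and replaces it with the smoothed value at that position.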

Empirical Mode Decomposition (EMD).
In the EMD process, the raw signal can be decomposed into locally narrow band components called intrinsic mode functions (IMFs), which can be regarded as hidden oscillation modes embedded in the original data series [26]. Given a raw data series $x(t)$, the procedure of the EMD method can be expressed as follows.
(i) Identify all the local extrema and determine the upper and lower envelopes. Their mean is denoted as $m_1(t)$, and the first component is defined as
$$h_1(t) = x(t) - m_1(t). \tag{4}$$
(ii) Ideally, $h_1(t)$ should satisfy the definition of an IMF. However, that is usually not the case. Therefore, it is necessary to repeat the above sifting [27]. Treat $h_1(t)$ as a proto-IMF and define
$$h_{11}(t) = h_1(t) - m_{11}(t). \tag{5}$$
(iii) The above sifting process is repeated $k$ times, as
$$h_{1k}(t) = h_{1(k-1)}(t) - m_{1k}(t), \tag{6}$$
until $h_{1k}(t)$ satisfies the stoppage criterion; then let $c_1(t) = h_{1k}(t)$. The stoppage criterion is determined by a Cauchy type of convergence test, defined as
$$SD_k = \sum_{t=0}^{T} \frac{\left|h_{1(k-1)}(t) - h_{1k}(t)\right|^2}{h_{1(k-1)}^2(t)}. \tag{7}$$
Separate $c_1(t)$ from the rest of the data by
$$r_1(t) = x(t) - c_1(t). \tag{8}$$
(iv) Repeat the above process with all the subsequences, as
$$r_2(t) = r_1(t) - c_2(t), \; \ldots, \; r_n(t) = r_{n-1}(t) - c_n(t). \tag{9}$$
The process should be stopped when the residue $r_n(t)$ becomes a monotonic function from which no more IMFs can be extracted.
(v) Reconstruct the original signal as
$$x(t) = \sum_{i=1}^{n} c_i(t) + r_n(t). \tag{10}$$
The EMD method does not require any a priori known basis [26,28]; this indicates that the EMD method is completely adaptive to the signal itself.
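The sifting loop and the residue recursion can be sketched roughly as follows. This is a simplified illustration, not the implementation used in the paper: the envelopes are built with linear interpolation (`np.interp`) instead of the cubic splines usually used for EMD, a small constant is added to the denominator of the stoppage measure to avoid division by zero, and the names `sift_imf` and `emd` and the parameters `sd_tol`, `max_sift`, and `max_imfs` are assumptions chosen for the example.

```python
import numpy as np

def local_extrema(h):
    """Indices of strict interior local maxima and minima of h."""
    d = np.diff(h)
    maxima = np.where((d[:-1] > 0) & (d[1:] < 0))[0] + 1
    minima = np.where((d[:-1] < 0) & (d[1:] > 0))[0] + 1
    return maxima, minima

def envelope_mean(h):
    """Mean of the upper and lower envelopes, or None if too few extrema.

    Assumption: linear interpolation stands in for cubic splines.
    """
    maxima, minima = local_extrema(h)
    if maxima.size < 2 or minima.size < 2:
        return None
    t = np.arange(h.size)
    upper = np.interp(t, maxima, h[maxima])
    lower = np.interp(t, minima, h[minima])
    return 0.5 * (upper + lower)

def sift_imf(x, sd_tol=0.2, max_sift=50):
    """Repeat h_k = h_(k-1) - m_k until the SD measure drops below sd_tol."""
    h_prev = np.asarray(x, dtype=float)
    for _ in range(max_sift):
        m = envelope_mean(h_prev)
        if m is None:
            break
        h = h_prev - m
        # Cauchy-type measure; 1e-12 guards against division by zero.
        sd = np.sum((h_prev - h) ** 2 / (h_prev ** 2 + 1e-12))
        h_prev = h
        if sd < sd_tol:
            break
    return h_prev

def emd(x, max_imfs=10):
    """Return (imfs, residue) such that sum(imfs) + residue equals x."""
    residue = np.asarray(x, dtype=float).copy()
    imfs = []
    for _ in range(max_imfs):
        if envelope_mean(residue) is None:   # no more oscillations to extract
            break
        c = sift_imf(residue)
        imfs.append(c)
        residue = residue - c                # r_i = r_(i-1) - c_i
    return imfs, residue
```

Because each component is subtracted from the running residue, summing the extracted components and the final residue reproduces the input series up to floating-point error, which mirrors the reconstruction step (v).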

Ensemble EMD (EEMD).
The EEMD method is a noise-assisted enhancement of the EMD method proposed by Huang et al. [26,29], motivated by several known difficulties of the EMD method. The first major weakness of the original EMD is the frequent occurrence of mode mixing, which is defined as a single IMF either consisting of signals of widely disparate scales or a signal of a similar scale residing in different IMF components [26,30]. Mode mixing is often a consequence of an intermittent signal, which can not only cause serious aliasing in the frequency distribution but can also make the physical meaning of individual IMFs unclear. To overcome the scale separation issue without introducing a subjective intermittence test, the ensemble EMD method was presented [30-33] as a powerful modification of the original EMD method with the idea of an ensemble mean. The EEMD method adds white noise to the original dataset, as
$$x_i(t) = x(t) + w_i(t), \tag{11}$$
where $w_i(t)$ is the $i$th realization of the white noise series. The noise is treated as possible random noise that would be encountered in the measurement process. Then, the dataset with added white noise is decomposed by the EMD procedure, and the ensemble means of the corresponding IMFs of the decompositions are taken as the final result.
The main effect of decomposition using EEMD is that the added white noise series cancel each other in the final mean of the corresponding IMFs. This means that the IMFs stay within the natural dyadic filter windows, which significantly reduces the chance of mode mixing and preserves the dyadic property.
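The ensemble-mean idea can be illustrated with a short sketch. To keep the example self-contained, a hypothetical stand-in decomposition (`toy_decompose`, a moving-average trend plus remainder) is used in place of EMD; a real EEMD would call an EMD routine and would need every trial to return the same number of IMFs. The names `ensemble_decompose` and `toy_decompose`, the noise amplitude of 0.2 times the standard deviation of the signal, and the ensemble size are assumptions for illustration.

```python
import numpy as np

def ensemble_decompose(x, decompose, n_trials=100, noise_std=0.2, seed=0):
    """Average the components of `decompose` over noise-added copies of x.

    decompose: callable returning an array of shape (n_components, len(x)).
    noise_std: noise amplitude as a fraction of the standard deviation of x.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    total = None
    for _ in range(n_trials):
        noisy = x + rng.normal(0.0, noise_std * x.std(), size=x.size)
        comps = np.asarray(decompose(noisy), dtype=float)
        total = comps if total is None else total + comps
    return total / n_trials          # ensemble mean of each component

def toy_decompose(x, width=11):
    """Hypothetical stand-in for EMD: fast remainder plus slow trend."""
    trend = np.convolve(x, np.ones(width) / width, mode="same")
    return np.stack([x - trend, trend])
```

With 200 trials, the sum of the averaged components stays close to the original series, because the independent noise realizations largely cancel in the mean; the residual noise shrinks roughly with the square root of the ensemble size.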

Nonlinear Autoregressive (NAR) Neural Network.
An artificial neural network (ANN) is considered an intelligent system that has a strong ability to recognize time series patterns and nonlinear characteristics [34,35]. An ANN combines artificial neurons to process information; simple neurons are connected by weighted links, and this sets up a network. Each input is multiplied by its weight, and the weighted inputs are combined by a mathematical function that determines the activation of the neuron. A second function, the activation function, then computes the output of the artificial neuron; it depends on a certain threshold.
The output of a neuron can be written as
$$y = f\left(\sum_{i} w_i x_i + b\right), \tag{12}$$
where $b$ is the bias for the neuron. The bias input to a neuron can be regarded as an offset value. It helps the signal to exceed the threshold of the activation function, which is denoted as $f$. $y$ represents the output; $x_i$ and $w_i$ are the inputs and weights, respectively. Neural networks can be classified into dynamic and static categories. Static networks have no feedback elements and contain no delays; the output is calculated directly from the input through feedforward connections. In dynamic networks, by contrast, the output depends not only on the current input to the network but also on previous inputs, outputs, or states of the network. Among them, the recurrent neural network (RNN) is an essential class in which connections between units form a directed cycle (Figure 1).
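A single neuron of the kind described above can be written in a few lines. The symbols `x` (inputs), `w` (weights), `b` (bias), and `f` (activation function) and the choice of `tanh` as the default activation are assumptions for illustration; the paper does not state which activation function is used.

```python
import numpy as np

def neuron_output(x, w, b, f=np.tanh):
    """Weighted sum of the inputs plus the bias, passed through f."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    return f(np.dot(w, x) + b)
```

The bias shifts the weighted sum before the activation is applied, which is why it can be read as an offset relative to the activation threshold.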
An RNN exhibits dynamic temporal behavior by creating an internal state of the network. The NAR neural network is a type of recurrent network with a feedback arrangement based on a nonlinear autoregressive model. In a NAR network, the true output rather than the estimated one is fed back to the input during training. This allows the network to be trained with static backpropagation only and also makes the feedforward architecture more accurate.
Currently, it is commonly used in multistep-ahead time series forecasting. The forecast value is determined by the following equation:
$$y(t) = f\left(y(t-1), y(t-2), \ldots, y(t-d)\right), \tag{13}$$
where $f$ is a nonlinear function. Specifically, the function value depends only on the $d$ previous values of the output signal.
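The autoregressive structure can be sketched independently of the network itself. In the example below, the nonlinear function is replaced by a linear least-squares fit purely so that the code stays short and runnable; this is a stand-in, not the NAR network of the paper. The helper names `make_lagged`, `fit_linear_f`, and `forecast_recursive` are hypothetical. Fitting uses the true lagged values, while multistep forecasting feeds each prediction back as an input.

```python
import numpy as np

def make_lagged(y, d):
    """Build rows [y(t-1), ..., y(t-d)] and targets y(t) for t = d, ..., n-1."""
    y = np.asarray(y, dtype=float)
    X = np.column_stack([y[d - j - 1 : len(y) - j - 1] for j in range(d)])
    return X, y[d:]

def fit_linear_f(X, target):
    """Stand-in for the nonlinear map f: linear least squares with intercept."""
    A = np.column_stack([X, np.ones(len(X))])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    return lambda lags: float(np.dot(coef[:-1], lags) + coef[-1])

def forecast_recursive(f, history, d, steps):
    """Multistep forecast: each prediction is fed back as the newest lag."""
    window = list(history[-d:])              # chronological order
    out = []
    for _ in range(steps):
        lags = np.array(window[::-1])        # y(t-1) first, y(t-d) last
        nxt = f(lags)
        out.append(nxt)
        window = window[1:] + [nxt]
    return np.array(out)
```

For instance, a pure sinusoid obeys an exact two-lag linear recursion, so with `d = 2` the fitted stand-in continues the sinusoid almost exactly over a ten-step horizon. Replacing `fit_linear_f` with a trained network would keep the lagging and feedback logic unchanged.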
Proposed Models.
This section develops two hybrid methods for short-term wind speed modeling and forecasting. The proposed methods consist of three stages: abnormal value removal and modification, signal decomposition and denoising, and signal reconstruction and NAR-based forecasting. Figure 2 introduces the workflow of the proposed approaches, briefly showing the three stages as follows.
Stage 1: Abnormal Value Removal and Modification. The accurate forecasting of short-term wind speeds depends heavily on the reliability of the observed data series. Thus, in the first stage, the 53H method is chosen as a detector to find and remove abnormal values from the raw observation series. The output of this stage is a 53H-modified series.
Stage 2: Signal Decomposition and Denoising. During the actual operation of wind farms and power systems, several uncertain factors may influence the data acquisition process, including measurement, recording, conversion, and transmission. Most of these are beyond control, and any of these factors can introduce noise and uncertainty into the wind speed series, leading to poor generalization and undesirable forecasting performance. Thus, wind speed modeling and forecasting is a difficult proposition. To address this problem, the EEMD-based signal filtering method is applied as a denoising process. This stage decomposes the 53H-modified signal into IMFs using the EEMD method and removes the subcomponents with high frequencies.

Stage 3: Signal Reconstruction and NAR-Based Prediction.
After the decomposed IMFs are filtered by removing the subcomponents with high frequencies, there are two modeling options. The first involves the modeling and forecasting of each IMF by a NAR neural network, after which the forecast signal is reconstructed as the final result. This model is named HEN1 (short for hybrid model 1 combining 53H detection, the EEMD filter, and NAR networks). The second involves an initial reconstruction of the signal by adding the noise-filtered IMFs, followed by the modeling and forecasting of the reconstructed signal with a NAR neural network to obtain the final forecast result. Similarly, this model is denoted as HEN2 (hybrid model 2 combining 53H detection, the EEMD filter, and NAR networks).
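The difference between the two combinations is only the order of summation and forecasting, which the following sketch makes explicit. The function names, the `n_drop` parameter, and the `forecaster(series, steps)` callable are hypothetical placeholders; in the paper the forecaster is a NAR neural network, whereas the usage note below relies on a trivial last-value forecaster.

```python
import numpy as np

def drop_high_frequency(imfs, residue, n_drop):
    """Discard the first n_drop (highest-frequency) IMFs; keep the rest."""
    return list(imfs[n_drop:]) + [residue]

def hen1_forecast(components, forecaster, steps):
    """HEN1: forecast every retained component, then sum the forecasts."""
    return sum(forecaster(c, steps) for c in components)

def hen2_forecast(components, forecaster, steps):
    """HEN2: sum the retained components first, then forecast once."""
    return forecaster(np.sum(components, axis=0), steps)
```

With a linear forecaster such as `lambda s, steps: np.full(steps, s[-1])` the two functions return the same values; they differ only when the forecaster is nonlinear, as a neural network is, because the per-component model errors in HEN1 then accumulate differently from the single model error in HEN2.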

Simulation Process of Hybrid 53H-EEMD-NAR Method.
As described in Section 3, this paper develops two hybrid models for short-term wind speed modeling and forecasting. The guiding concept of the hybrid approaches is to construct a mixed procedure that combines abnormal value detection and modification, a signal filter method, and NAR-based neural network forecasting. Both of the hybrid models are three-stage procedures, as introduced in Figure 2. This section aims to show the simulation process for each stage and then to provide the forecasting results.

Abnormal Value Removing and Modification.
In the first stage of the hybrid approaches, the 53H method is employed to remove and modify the abnormal values in the raw data series. This step necessarily precedes model construction and estimation because short-term wind speeds always show strong and random variations within a short time interval. Large fluctuations heavily impact the model performance, resulting in poor forecasting accuracy and even model invalidation. As a result, data detection and smoothing modification methods are adopted here to help the hybrid approaches overcome the interference from abnormal and strong variations.
Figure 4 provides the results obtained from the 53H method, using the wind speed series at Station 1 collected in July 2011 as an example. The figure shows not only the results for the whole dataset but also three enlargements, which present the detection and modification of the strong-variation data segments chosen from the whole set. It is apparent that the points modified by the 53H method are concentrated in several segments, which are marked by the dashed rectangular boxes. The three enlargements give a detailed view of these segments and show clearly that the raw observations in these segments always have strong variations. Furthermore, the modified data series effectively avoids the disordered fluctuations while retaining the major information on trends and changes. Figure 4 also illustrates the differences in the basic statistics between the raw data series and the 53H-modified series. The mean values of the 53H-modified series are found to be nearly equal to the mean values of the raw data, with all differences located within the narrow interval of ±0.01 m/s. As for the standard deviations of the two groups, the 53H method reduces them relative to the raw series for the eleven chosen samples, to varying degrees. All of these contribute to the enhanced performance of the hybrid models and to the improvement of hybrid forecasting accuracy. Consequently, the 53H-modified series is used as the input of the EEMD filter in the following stages of the hybrid approaches.

Decomposition and Noise Filter.
Moving on to the second stage of the proposed hybrid approaches, the EEMD method is applied as a signal filter for data denoising. Figure 5 displays this noise reduction process for the data collected at Station 1 in July 2011. Decomposition for the selected sample generates nine IMFs and one residual series. The extracted IMFs represent a range of frequencies, from high to low. The IMFs with higher frequencies represent the pattern of shorter periods, whereas the IMFs with lower frequencies represent the pattern of longer periods. That is, as IMF1 has the highest frequency, it most likely also carries the noise information; IMF9 can thus be regarded as the trend term. Generally, if an IMF represents a very short-term pattern, it should be discarded to achieve denoising for the original data series.
From Figure 5, it is clear that IMF1 and IMF2 are subsignals with high frequencies, and the patterns corresponding to these two IMFs are of a very short period. Considering the short-term wind speed forecasting problem studied in this paper, the first two IMFs can be taken to represent very short-term patterns and are removed as noise from the 53H-modified series introduced in Section 3.3.1. "Noise signals" in Figure 6 represent the sum of all of the removed subsignals, while "denoised data series" stands for the filtered series from which the noise information has been discarded. Clearly, the denoised series is much smoother and still contains the major information in the original data series. This step is significant within the proposed hybrid approaches because the EEMD method helps in overcoming the interference from a noisy signal. In the next stage, the denoised data series is treated as the input of the NAR-based procedures.

Reconstruction and Prediction.
As introduced above, two methods of combination were designed at this stage to mix the EEMD-filtered signal series and the NAR-based training and forecasting procedures.
The first method (denoted as HEN1) models and forecasts each IMF using its respective NAR neural network; the forecast signal is then reconstructed as the final result. The second method (denoted as HEN2) reconstructs the signal first by summing the noise-filtered IMFs and then models the reconstructed signal with a NAR neural network to obtain the final forecasting result. These approaches are employed for one-day-ahead predictions; for each adopted NAR neural network, the number of hidden neurons is determined by an optimal selection process using the training dataset. Network training was performed with the Levenberg-Marquardt algorithm, which is commonly used to solve nonlinear least squares problems [36]. Additionally, the performance was evaluated with the mean squared error (MSE) as the error criterion. Figure 7 shows the forecasting results of both proposed models for the data sampled from Station 1 in July 2011.
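The error criterion and a selection loop of this kind can be sketched as follows. The paper does not describe how the optimal selection of hidden neurons is carried out, so the hold-out scheme below, in which the last `n_valid` points of the training set are used for scoring, is an assumption; `select_by_mse` and the `fit_predict(train_part, n_valid, h)` callable are hypothetical names for illustration.

```python
import numpy as np

def mse(actual, predicted):
    """Mean squared error between two equally long sequences."""
    a = np.asarray(actual, dtype=float)
    p = np.asarray(predicted, dtype=float)
    return float(np.mean((a - p) ** 2))

def select_by_mse(candidates, fit_predict, train, n_valid):
    """Pick the candidate setting with the lowest MSE on a held-out tail.

    fit_predict(train_part, n_valid, h) must return n_valid forecasts made
    by a model with setting h (for example, h hidden neurons).
    """
    train = np.asarray(train, dtype=float)
    fit_part, valid_part = train[:-n_valid], train[-n_valid:]
    scores = {h: mse(valid_part, fit_predict(fit_part, n_valid, h))
              for h in candidates}
    best = min(scores, key=scores.get)
    return best, scores
```

In use, `fit_predict` would train a NAR network with `h` hidden neurons on `fit_part` and return its forecasts for the held-out points.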
Obviously, the forecasting results from the two proposed models perform similarly and approach the original data series.This can be observed in the correlation plots of the predicted wind speeds versus the actual wind speed, which are also displayed in Figure 7.In the correlation plots, the dashed straight line indicates that the predicted wind speed is equal to the actual observation; meanwhile, the farther the points get away from the straight line, the larger the forecasting error is introduced into the hybrid approaches.It is clear that most points are located in a narrow range near to the straight line.As indicated, the proposed hybrid approaches perform well when faced with short-term wind speed forecasting.The frequency distributions also suggest that the forecasting errors concentrate around the zero point; the distributions from the two hybrid approaches seem similar to each other.
The results obtained from the two proposed models were similar; however, each model also demonstrated distinct characteristics and benefits for hybrid forecasting procedures.The major benefit from the EEMD-based data decomposition model was its capacity to break down the original data series into several subsignals, called IMFs, to satisfy the two requirements introduced in [26].Therefore, while an IMF does represent a simple oscillatory mode as a counterpart to simple harmonic function, it is also much more general.Instead of the constant amplitude and frequency of a simple harmonic component, an IMF can have variable amplitude and frequency along the time axis.As a result, its modeling and forecasting for each IMF provides more satisfactory results and better accuracy.In another respect, the number of decomposed IMFs cannot be controlled; when the number is comparatively large, the HEN1 model may also generate a large forecasting error because the modeling and forecasting for each IMF introduces model error into the entire hybrid procedure.Just as in the problem discussed in this paper, wind speed series with ten-minute time intervals show a complex mixture of nonlinearity, volatility, and strong randomness.As shown in Figure 5, nine IMFs were extracted from the original data series using EEMD-based decomposition.Applying the HEN1 model, NAR-based forecasting introduces a large model error in stage 3, especially in several partitions.This is shown in Figure 7; as at several specific points, the HEN1 model has comparatively large forecasting errors.In this setting, the HEN2 model is always able to show its strength by summing the filtered signal series together as the input of a NAR neural network.By removing the noisy signals, the power of the HEN2 model is mainly derived from the smoother nature of the denoisy series, which retains the major information from the original data series.Thus, the following NAR-based forecasting procedure provides a final result.
Figure 8 provides a brief overview of the predicted results from the two hybrid approaches for all twelve samples used in this paper. The correlation plots indicate that the hybrid predictions lie close to the actual wind speeds for both of the proposed approaches. Wind speed series with large variations can also be found, such as the data collected in April at Station 1; in that month, both of the hybrid methods had comparatively large errors. Despite this, the hybrid approaches maintained the trend of variation, retaining the major information of the actual wind speed in this sample. In the corresponding correlation plot, the points were concentrated around the dashed straight line, which indicates that the methods are effective and performed stably for this strongly fluctuating data series. In the following section, the model comparison and performance analysis are introduced. It will be shown that the proposed hybrid methods perform much better than the other selected models, mainly because they combine the strengths of different approaches.

Analysis and Performance Comparison.
To evaluate the forecasting performance of the proposed hybrid approaches, several methods are chosen for model comparisons, including single forecasting methods, hybrid approaches, and commonly used benchmarks. Based on the components of the proposed methods, the single NAR-based neural network, the hybrid 53H-NAR method, and the hybrid EEMD-NAR method were compared. The widely used ARIMA time series model was also chosen, as well as the persistence model, which is commonly considered a benchmark for short-term forecasting problems.
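As a brief illustration of the benchmark, the persistence model simply carries the most recent observation forward as the forecast. The sketch below assumes a one-step horizon by default; the function name `persistence_pairs` and the sample values are hypothetical and are not taken from the paper's data.

```python
import numpy as np

def persistence_pairs(series, horizon=1):
    """Return (actual, forecast) arrays for the persistence benchmark.

    The forecast for time t is the observation at time t - horizon.
    """
    x = np.asarray(series, dtype=float)
    if horizon < 1 or horizon >= len(x):
        raise ValueError("horizon must be between 1 and len(series) - 1")
    actual = x[horizon:]
    forecast = x[:-horizon]
    return actual, forecast

# Illustrative wind speeds in m/s (made-up values).
actual, forecast = persistence_pairs([5.0, 5.5, 7.0, 6.0])
print(actual)    # observations being forecast
print(forecast)  # previous observations used as forecasts
```

Because the forecast is a shifted copy of the series, it necessarily lags any change in the observations, which matches the delayed trends described for this benchmark.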
Figure 9 displays the forecasting results of the different chosen methods, including the two proposed approaches and five selected methods, each applied to the data collected at Station 1 in July 2011. The proposed approaches perform much better than the others, particularly in the segments with complex variation and strong fluctuations. Three segments are selected for a more detailed explanation, as marked by the dashed rectangular boxes in Figure 9. Segment 1 is representative of data with complex variation; the wind speed in this period has a large variance. The single NAR-based neural network is clearly weaker here and produces an abnormal final result. This is mainly due to interference from both the abnormal values and the noisy signals within the raw data series. This result also indicates that it is difficult to obtain reliable and stable forecasting results using a single method (even the AI-based forecasting tool chosen here), especially for complex processes such as short-term wind speed series. Segment 2 follows a downward trend of the wind speed series immediately after a sharp increase; this situation is common in actual observations and is important for accurate forecasting of short-term wind speeds. The two proposed approaches hold up well and follow the changing trends of the actual series, while the single NAR-based neural network and 53H-NAR provide fluctuating results and the persistence and ARIMA models show clearly delayed trends. Segment 3 was selected because the actual series experienced a rapid decline before sharply trending upward. The NAR and 53H-NAR methods exaggerate the range of ups and downs within the actual series, mainly due to the abnormal values and noisy signals contained in the original series. The EEMD-NAR method performs well in the first half of the decreasing section, but then it moves far away from the trend of the actual observations. Compared with these results, the proposed HEN1 model performed much better in this segment, which makes the contribution of the 53H-based modification evident. Generally speaking, the two proposed hybrid approaches provide more reliable and stable results for short-term wind speed forecasting.
Table 1 and Figure 10 show the performance comparisons among different models for all twelve samples addressed in this paper. Three error criteria are considered: the RMSE, MAE, and MAPE, introduced in Section 3.2. Table 2 then provides the performance comparisons according to data collected from different seasons and from different stations. On the whole, the two proposed approaches provide the final results with minimal statistical error, and the HEN2 model performed best in most cases. At Station 1, the two proposed approaches perform similarly; the values of all three error criteria are quite close between the two hybrid models. At Station 3, by contrast, the HEN2 model shows a stronger ability to forecast short-term wind speeds than the HEN1 model. This may result from differences in the series information between the observations of these two stations; a model's performance may differ when different datasets are used as input. The average MAPE of the HEN2 model is 8.71%, which is the lowest MAPE among the considered approaches as calculated for the twelve simulated samples. This corresponds to reductions in MAPE of 38.25%, 36.57%, and 13.73% when compared with the NAR, 53H-NAR, and EEMD-NAR results, respectively. This indicates the strong capacity of the proposed HEN2 method to forecast short-term wind speeds, suggesting the benefits of combining these three statistical tools. More specifically, although the HEN2 model maintains the lowest MAPE, its forecasting errors range from 3.63% to 17.20% for different samples. This implies that the model's performance depends heavily on the original input dataset, mainly on its internal information and the statistical characteristics of the historical observations. Moreover, the persistence method is commonly taken as a benchmark in short-term forecasting problems, as in this paper. In Table 1, the forecasting errors from the NAR and ARIMA models are close to the errors from the benchmark, demonstrating that a single method cannot meet the need for accurate short-term wind speed forecasting.
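The reductions quoted above are most naturally read as relative reductions in MAPE with respect to each comparison model. That reading is an assumption, since the text does not state the formula. Under it, the computation can be sketched as follows; the function name and the numbers in the example are hypothetical and are not values from Table 1.

```python
def relative_reduction(baseline_error, proposed_error):
    """Percentage reduction of an error measure relative to a baseline model.

    Assumed definition: 100 * (baseline - proposed) / baseline.
    """
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return 100.0 * (baseline_error - proposed_error) / baseline_error

# Hypothetical example: a baseline MAPE of 10.0% and a proposed-model MAPE of 8.0%.
print(relative_reduction(10.0, 8.0))
```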
Table 2 provides a clear comparison among different seasons and stations. Each model performs similarly across different stations, with slightly higher errors at Station 2. However, there is a large gap in the forecasting errors when the results are considered according to different seasons, as shown in the table. The forecasting errors for January and April are significantly higher than those for July and October. This pattern is consistent across the chosen stations, with the exception of Station 1. Taking the data description in Figure 3 into account, input datasets with stronger variation and more complex changing trends are accompanied by higher statistical errors in the forecasting results.

Conclusions
Short-term wind speed forecasting, an essential support for regulatory actions and short-term load dispatching planning during the operation of wind farms, is currently regarded as one of the most difficult problems to be solved. This paper contributes to the goal of short-term wind speed forecasting by developing two three-stage hybrid approaches (named HEN1 and HEN2, respectively). Both combine an abnormal value detection and modification process based on the 53H method, a signal decomposition and noise filtering stage handled by the EEMD algorithm, and a training and forecasting stage handled by NAR-based neural networks. The chosen datasets were ten-minute wind speed observations from three stations in western China, comprising twelve samples. Both the simulation and the comparison indicate a strong capacity of the proposed methods to address short-term wind speed forecasting problems.
(i) The HEN2 model performs best in most cases. The average MAPE of the HEN2 model is 8.71%, which is the lowest MAPE of all the considered approaches, based on the twelve samples simulated. This corresponds to reductions of 38.25%, 36.57%, and 13.73% in MAPE compared with the NAR, 53H-NAR, and EEMD-NAR methods, respectively.
(ii) The HEN2 model performs better than the HEN1 model in most cases. This may result from both their different model structures and the particularity of the data at different stations. When the number of IMFs obtained from EEMD is relatively large, the HEN2 model performs better; the reason may be that, in the HEN1 process, each IMF can introduce extra uncertainty into the final forecast.
(iii) According to comparisons of different seasons and stations, each model was shown to perform similarly across different stations, although a large gap in forecasting errors was found when considering the results according to different seasons. This indicates that model performance is heavily reliant on the inner information of an input data series, even though the proposed hybrid approaches maintain the lowest forecasting error in all twelve samples.
In summary, the developed methods, especially the HEN2 method, can provide a significant enhancement of model performance in short-term wind speed forecasting. This is of great significance for active decision making and short-term load dispatching planning during the operation of wind farms.

Figure 1: The structure of NAR neural networks.

Figure 2: The flowchart of the proposed models.

Figure 5 also displays the removed signal information and the denoised data series, indicating that EEMD-based decomposition and signal filtering are quite effective in removing the interference from noisy signals. For the other examples used in this paper, Figure 6 provides a simple view of the EEMD-based decomposition and signal filter results. For each sample, the input of EEMD-based decomposition is the 53H-modified data series,

Figure 4 panels: (a) Detection and modification result by 53H method; (b) Segment; (c) Segment enlargement 2; (d) Segment enlargement 3; (e) Basic statistics between the raw series and 53H-modified series. Notice: "Variation" means the range and trend of variations compared with the raw datasets.

Figure 4: Data detection and modification by the 53H method, selected from July 2011 at Station 1.

Figure 7: The forecasting results from both proposed hybrid models, exemplified by data from July 2011 at Station 1.

Figure 8: (a) Results of hybrid approaches for Station 1. (b) Results of hybrid approaches for Station 2. (c) Results of hybrid approaches for Station 3.

Figure 9: Forecasting results from different models, exemplified by data from July 2011 at Station 1.
3.1. Collection of Data. The wind speed data used for model construction and performance testing in this paper consists of observations at ten-minute time intervals that were collected from three sites in the Hexi Corridor of western China. For each station, observations in January, April, July, and October were selected to represent winter, spring, summer, and autumn, respectively, in order to evaluate the applicability of the proposed models. The wind speeds from the different seasons have their own ranges of values and variation; generally, wind speeds in April have larger mean values. The datasets used are displayed in Figure 3; the first 755 points are used for model construction, and the following 144 points are used as a testing set. For each station, the figure consists of four parts, presenting the selected data series and the site descriptions. Basic statistics show that wind speed observations collected from the different sites have their own ranges of variation and statistical characteristics. Generally, the wind speeds from April, representing spring, have larger values, while winter has the lowest wind speeds.
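The split described above can be sketched as follows. At ten-minute intervals, the 144 test points cover 24 hours. The function name `split_series` and the placeholder series are assumptions for illustration; the paper's data are not reproduced here.

```python
import numpy as np

N_TRAIN = 755  # points used for model construction
N_TEST = 144   # points used for testing (144 x 10 min = 24 h)

def split_series(series, n_train=N_TRAIN, n_test=N_TEST):
    """Split a series chronologically into a training part and a testing part."""
    x = np.asarray(series, dtype=float)
    if len(x) < n_train + n_test:
        raise ValueError("series is shorter than n_train + n_test")
    return x[:n_train], x[n_train:n_train + n_test]

# Placeholder series of the right length (not real observations).
sample = np.linspace(2.0, 12.0, N_TRAIN + N_TEST)
train, test = split_series(sample)
print(len(train), len(test))
```

The split is chronological rather than random, so the testing points always follow the training points in time.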
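3.2. Criteria for Evaluating Forecasting Performance. The experiments in this paper adopt three error criteria, which are commonly used to evaluate the forecasting performances of different models. They are the root mean square error (RMSE), mean absolute error (MAE), and mean absolute percentage error (MAPE). Let $x_t$, $t = 1, 2, \ldots, N$, be the observation series with the corresponding forecast $\hat{x}_t$. These criteria are defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(x_t - \hat{x}_t\right)^2},$$

$$\mathrm{MAE} = \frac{1}{N}\sum_{t=1}^{N}\left|x_t - \hat{x}_t\right|,$$

$$\mathrm{MAPE} = \frac{1}{N}\sum_{t=1}^{N}\left|\frac{x_t - \hat{x}_t}{x_t}\right| \times 100\%.$$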
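The three criteria named in this section (RMSE, MAE, and MAPE) can be computed with a few lines of NumPy. This is a minimal sketch using their standard definitions; the function names are illustrative, and the MAPE function assumes that no observation equals zero.

```python
import numpy as np

def rmse(actual, forecast):
    """Root mean square error."""
    a, f = np.asarray(actual, dtype=float), np.asarray(forecast, dtype=float)
    return float(np.sqrt(np.mean((a - f) ** 2)))

def mae(actual, forecast):
    """Mean absolute error."""
    a, f = np.asarray(actual, dtype=float), np.asarray(forecast, dtype=float)
    return float(np.mean(np.abs(a - f)))

def mape(actual, forecast):
    """Mean absolute percentage error, in percent (assumes no zero observations)."""
    a, f = np.asarray(actual, dtype=float), np.asarray(forecast, dtype=float)
    if np.any(a == 0):
        raise ValueError("MAPE is undefined when an observation is zero")
    return float(np.mean(np.abs((a - f) / a)) * 100.0)

# Made-up values for illustration only.
observed = [2.0, 4.0]
predicted = [1.0, 5.0]
print(rmse(observed, predicted), mae(observed, predicted), mape(observed, predicted))
```

Because MAPE divides by the observation, it is sensitive to periods of very low wind speed, which is worth keeping in mind when comparing errors across seasons.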

Table 1: Forecasting errors of different models.

Table 2: Performance comparison according to data collected from different seasons and from different stations.