This paper presents a novel approach for accurately modeling and ultimately predicting wind speed for selected sites when incomplete data sets are available. The application of a seasonal simulation for the synthetic generation of wind speed data is achieved using the Markov chain Monte Carlo technique with only one month of data from each season. This limited data model was used to produce synthesized data that sufficiently captured the seasonal variations of wind characteristics. The model was validated by comparing wind characteristics obtained from time series wind tower data from two countries with Markov chain Monte Carlo simulations, demonstrating that one month of wind speed data from each season was sufficient to generate synthetic wind speed data for the related season.

One of the most challenging features of wind energy application is the uncertainty of the wind resource. Wind speed and hence wind energy potential has a key influence on the profitability of a wind farm and on the management of transmission and distribution networks by utility systems operators. Accurate and reliable data related to both long-term and short-term wind characteristics are essential for site selection and technology specification [

Learning-based techniques predominantly employ advanced analysis using neural- or fuzzy-based approaches or hybrid models. Guo et al. [

An emerging research area is the use of Markov chain Monte Carlo (MCMC) simulation techniques to evaluate and model wind speed characteristics. Several researchers used first-order MC models. Jones and Lorenz [

Some researchers have included wind direction in an attempt to improve the MC model. Ettoumi et al. [

Kaminsky et al. [

The following researchers used first-, second-, and third-order MC models for both wind speed and wind power data. Papaefthymiou and Klöckl [

Erto et al. [

Finally, Negra et al. [

This paper proposes the use of stochastic simulation techniques for synthetic wind data generation and the potential development of a tool for assessing the profitability of wind farm projects and for supporting accurate prediction for system operations and wind energy markets. The research summarized above suggests that MC models produce accurate results in terms of PDFs and the general statistical characteristics of the data, in particular the mean, median, standard deviation, variance, quartiles, minimum and maximum values, and Weibull distribution parameters. However, because of their short memory, they are unable to reproduce the persistence and/or periodic structure of wind data, which manifests itself in ACFs and PSDs. This paper addresses this issue and attempts to retain the integrity of the data while reducing complexity. Instead of using a full year of data as a single set, the data sets are divided into seasonal subsets (thereby keeping the individual seasonal information), and substantially less data is used to simulate each subset while preserving its main statistical characteristics.

This research employs two different data sets obtained from operational meteorological towers located in the USA and Turkey. The first tower is located at the National Wind Technology Center (NWTC), Colorado, USA, and it provided hourly average wind speed time series data at 10 m, 20 m, and 50 m for a period of 12 months (01.12.2005–01.12.2006). The second tower is located in the Süpürgelik region of Yalova in Turkey, and it provided hourly average wind speed time series at 30 m for a period of 9 months (01.12.2005–01.09.2006). The two countries are both located in the northern hemisphere but have different seasonal characteristics and provide one method of determining if the model can be applied to other data sets.

The objective of this research is to develop a model that can accurately reflect seasonal variations using limited data from an annual time series data set. To address this, both data sets were arbitrarily divided into categories identified as winter, spring, summer, and autumn. As both data sets have been obtained from locations in the northern hemisphere, seasons have been defined as winter: December, January, and February; spring: March, April, and May; summer: June, July, and August; and autumn: September, October, and November. The first month of each season was selected and used to generate synthetic data for the related season with the aim of using less data in the simulation process. First- and second-order transition matrices were produced from the measured data for the selected months and used to generate synthetic data. Figure

Histograms of seasonal wind speed time series: (a) TR at 30 m and (b) US at 20 m.
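The seasonal split described above can be sketched in a few lines. The snippet below is only an illustration under stated assumptions: the record layout (a list of `(timestamp, speed)` pairs), the function names, and the constant placeholder speeds are hypothetical and are not taken from the measured data sets.

```python
from datetime import datetime, timedelta

# Season definitions for the northern hemisphere, as stated in the text.
# The first month listed for each season is the one used for model fitting.
SEASON_MONTHS = {
    "winter": (12, 1, 2),
    "spring": (3, 4, 5),
    "summer": (6, 7, 8),
    "autumn": (9, 10, 11),
}


def season_of(timestamp):
    """Return the season name for a timestamp."""
    for season, months in SEASON_MONTHS.items():
        if timestamp.month in months:
            return season
    raise ValueError("month outside 1..12")


def split_by_season(records):
    """Group (timestamp, speed) records by season.

    Returns two dicts: the full seasonal subsets and the first-month
    subsets that would be used to build the transition matrices.
    """
    full = {season: [] for season in SEASON_MONTHS}
    first_month = {season: [] for season in SEASON_MONTHS}
    for timestamp, speed in records:
        season = season_of(timestamp)
        full[season].append(speed)
        if timestamp.month == SEASON_MONTHS[season][0]:
            first_month[season].append(speed)
    return full, first_month


# Hypothetical hourly records with placeholder speeds (not measured data).
start = datetime(2005, 12, 1)
records = [(start + timedelta(hours=h), 5.0) for h in range(365 * 24)]
full, first_month = split_by_season(records)
```

With one year of hourly records, each first-month subset holds roughly a third of the season it represents, which is the data reduction the approach relies on.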

A first-order MC determines the next state of a stochastic process from its current state only. In a second-order MC, the next state depends on both the present state and the most recent previous state. This generalizes to higher-order MCs, in which the probability of a future state depends only on the most recent states, up to the order of the chain, and not on any earlier state. The order of the chain is therefore the number of present and previous time steps taken into account when calculating the transition probability of the next state. These probabilities are stored in the cells of a matrix called a transition probability matrix. The size of the matrix depends on the number of states, which should be chosen by discretizing the random variable with its nature and the modeling purpose in mind. In order to calculate transition probabilities, a further assumption is made which depends on the definition of an MC.
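As a rough illustration of how transition probabilities of a first- or second-order chain can be estimated from a discretized series, the sketch below counts observed transitions and normalizes each row. The 1 m/s state width, the function names, and the short sample series are assumptions made for the example; the dictionary layout is a convenience and is not the matrix form used in the paper.

```python
from collections import Counter, defaultdict


def to_states(speeds, bin_width=1.0):
    """Map each wind speed to an integer state (0 covers [0, bin_width), ...)."""
    return [int(v // bin_width) for v in speeds]


def estimate_tpm(states, order=1):
    """Estimate transition probabilities of an MC of the given order.

    Returns {history: {next_state: probability}}, where `history` is a
    tuple holding the last `order` states. Histories that never occur in
    the data are absent from the result.
    """
    counts = defaultdict(Counter)
    for t in range(order, len(states)):
        history = tuple(states[t - order:t])
        counts[history][states[t]] += 1
    tpm = {}
    for history, row in counts.items():
        total = sum(row.values())
        tpm[history] = {nxt: c / total for nxt, c in row.items()}
    return tpm


# Hypothetical short series in m/s, used only to show the mechanics.
speeds = [2.3, 2.9, 3.4, 3.1, 2.2, 2.8, 3.6, 4.2, 3.9, 3.3]
states = to_states(speeds)            # [2, 2, 3, 3, 2, 2, 3, 4, 3, 3]
p1 = estimate_tpm(states, order=1)    # history is the current state
p2 = estimate_tpm(states, order=2)    # history is (previous, current)
```

In this toy series, state 2 is followed by state 2 or state 3 equally often, so `p1[(2,)]` assigns 0.5 to each. The second-order table conditions on pairs of states and therefore has many more rows to estimate from the same amount of data.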

A discrete-time stochastic process is an MC if, for all

Equation (

If one assumes that the conditional probability stated in (

The transition probability matrix (TPM) for a first-order MC with

For all

The formula in (

If the transition probability in the

The MCMC simulation procedure for synthetic generation of wind speed time series is therefore achieved using the following steps.

Define states of the associated MC and construct TPM.

Construct CPM.

Generate uniformly distributed random numbers between 0 and 1.

Select an initial state randomly, say,

Compare the value of the random number with the elements of

A transition from state

By repeating Steps
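The steps listed above can be sketched as follows for a first-order chain. This is a minimal illustration, not the authors' implementation: the 3-state matrix is invented, the function names are hypothetical, and converting a state back to a wind speed by drawing uniformly within the state interval is an assumption of this sketch.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def cumulative_matrix(tpm):
    """Build the cumulative probability matrix (CPM) as row-wise running sums."""
    return [list(accumulate(row)) for row in tpm]


def simulate(tpm, n_steps, bin_width=1.0, seed=None):
    """Generate a synthetic speed series from a first-order TPM.

    `tpm` is a square list of lists in which every row sums to 1.
    """
    rng = random.Random(seed)
    cpm = cumulative_matrix(tpm)
    state = rng.randrange(len(tpm))      # random initial state
    speeds = []
    for _ in range(n_steps):
        u = rng.random()                 # uniform random number in [0, 1)
        # Next state: first column whose cumulative probability exceeds u.
        # min() guards against a row that sums to slightly less than 1.
        state = min(bisect_right(cpm[state], u), len(tpm) - 1)
        # Assumption: the speed is drawn uniformly within the state interval.
        speeds.append((state + rng.random()) * bin_width)
    return speeds


# Hypothetical 3-state matrix with mass near the diagonal.
tpm = [
    [0.7, 0.3, 0.0],
    [0.2, 0.6, 0.2],
    [0.0, 0.4, 0.6],
]
synthetic = simulate(tpm, n_steps=1000, seed=42)
```

Comparing the random number against a cumulative row is equivalent to sampling the next state from the corresponding row of the TPM, so zero-probability transitions (such as state 0 to state 2 here) are not generated.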

In order to test the validity of the model, a combination of visual evaluation, comparison of general statistical parameters (descriptive statistics and Weibull distribution parameters), and goodness of fit tests was used. The transition probability matrices of both first- and second-order MC models have the probability mass concentrated on and around the diagonal elements. This result implies that the next wind speed will most likely be in the same state as the current wind speed and that transitions between distant states are infrequent. General statistical parameters of observed and generated wind speeds for both locations are presented in Tables

General statistical parameters in m/s for observed wind speed time series (Colorado, US, 20 m).

Actual data (m/s) | Winter | Spring | Summer | Autumn |
---|---|---|---|---|
Mean | 6.36 | 4.28 | 3.58 | 4.11 |
St. deviation | 4.60 | 3.23 | 1.74 | 3.01 |
Variance (m²/s²) | 21.13 | 10.41 | 3.02 | 9.07 |
Coef. of variation (%) | 72.29 | 75.44 | 48.60 | 73.23 |
Minimum | 0.31 | 0.31 | 0.56 | 0.48 |
First quartile | 2.84 | 2.25 | 2.35 | 2.09 |
Median | 5.25 | 3.33 | 3.17 | 3.09 |
Third quartile | 8.79 | 5.21 | 4.46 | 5.11 |
Maximum | 27.15 | 23.86 | 11.49 | 20.12 |
Weibull shape parameter (ML) | 1.437 | 1.493 | 2.188 | 1.507 |
Weibull shape parameter (WAsP) | 1.371 | 1.074 | 1.769 | 1.114 |
Weibull scale parameter (ML) | 7.021 | 4.785 | 4.053 | 4.600 |
Weibull scale parameter (WAsP) | 6.900 | 3.936 | 3.790 | 3.806 |

General statistical parameters in m/s for generated wind speed time series (Colorado, US, 20 m).

Synthetic data (m/s) | Winter | Spring | Summer | Autumn |
---|---|---|---|---|
Mean | 6.24 | 3.77 | 3.81 (3.74) | 3.86 (4.03) |
St. deviation | 4.63 | 3.22 | 1.85 (1.77) | 2.75 (2.96) |
Variance (m²/s²) | 21.44 | 10.36 | 3.41 (3.13) | 7.59 (8.79) |
Coef. of variation (%) | 74.21 | 85.34 | 48.46 (47.30) | 71.35 (73.49) |
Minimum | 0.25 | 0.25 | 0.75 (0.51) | 0.50 (0.50) |
First quartile | 2.69 | 1.81 | 2.43 (2.53) | 2.06 (2.11) |
Median | 5.07 | 2.90 | 3.34 (3.25) | 3.02 (3.20) |
Third quartile | 8.67 | 4.57 | 4.81 (4.64) | 4.89 (4.90) |
Maximum | 27.47 | 23.95 | 10.97 (10.99) | 19.76 (19.70) |
Weibull shape parameter (ML) | 1.418 | 1.336 | 2.203 (2.245) | 1.532 (1.499) |
Weibull shape parameter (WAsP) | 1.336 | 0.972 | 1.743 (1.699) | 1.180 (1.077) |
Weibull scale parameter (ML) | 6.888 | 4.142 | 4.318 (4.240) | 4.319 (4.504) |
Weibull scale parameter (WAsP) | 6.757 | 3.359 | 3.996 (3.863) | 3.731 (3.629) |

Note: values in parentheses are the simulation results for the 2nd order MC.
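The descriptive statistics compared in these tables can be computed along the following lines. This is a sketch under assumptions: the function name and sample values are hypothetical, the sample (n - 1) form of the variance is assumed, and the quartile convention is the default of Python's `statistics.quantiles`, which may differ from the one used for the tables.

```python
import statistics


def describe(speeds):
    """Return descriptive statistics for a wind speed series in m/s."""
    q1, median, q3 = statistics.quantiles(speeds, n=4)
    mean = statistics.fmean(speeds)
    std = statistics.stdev(speeds)       # sample standard deviation (assumed)
    return {
        "mean": mean,
        "st_deviation": std,
        "variance": std ** 2,
        "coef_of_variation": 100.0 * std / mean,   # in percent
        "minimum": min(speeds),
        "first_quartile": q1,
        "median": median,
        "third_quartile": q3,
        "maximum": max(speeds),
    }


# Hypothetical values in m/s, used only to exercise the function.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
summary = describe(sample)
```

The coefficient of variation is the standard deviation expressed as a percentage of the mean. For the observed winter series, the rounded table values 4.60 and 6.36 give about 72.3, in line with the reported 72.29.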

General statistical parameters in m/s for observed wind speed time series (Yalova, TR, 30 m).

Actual data (m/s) | Winter | Spring | Summer |
---|---|---|---|
Mean | 6.68 | 5.56 | 5.91 |
St. deviation | 4.01 | 2.97 | 3.25 |
Variance (m²/s²) | 16.09 | 8.80 | 10.58 |
Coef. of variation (%) | 60.04 | 53.37 | 55.06 |
Minimum | 0 | 0 | 0.01 |
First quartile | 3.78 | 3.37 | 3.32 |
Median | 6.19 | 5.17 | 5.50 |
Third quartile | 9.17 | 7.24 | 8.19 |
Maximum | 23.23 | 17.83 | 18.34 |
Weibull shape parameter (ML) | 1.596 | 1.945 | 1.868 |
Weibull shape parameter (WAsP) | 1.785 | 1.917 | 1.976 |
Weibull scale parameter (ML) | 7.378 | 6.263 | 6.645 |
Weibull scale parameter (WAsP) | 7.561 | 6.241 | 6.736 |

General statistical parameters in m/s for generated wind speed time series (Yalova, TR, 30 m).

Synthetic data (m/s) | Winter | Spring | Summer |
---|---|---|---|
Mean | 6.23 | 5.50 | 5.32 (5.57) |
St. deviation | 3.85 | 3.00 | 2.81 (2.99) |
Variance (m²/s²) | 14.83 | 9.03 | 7.91 (8.97) |
Coef. of variation (%) | 61.80 | 54.65 | 52.91 (53.76) |
Minimum | 0 | 0 | 0 (0.004) |
First quartile | 3.51 | 3.26 | 3.09 (3.17) |
Median | 6.01 | 5.27 | 5.16 (5.21) |
Third quartile | 8.50 | 7.26 | 7.38 (7.83) |
Maximum | 23.37 | 17.99 | 16.54 (15.89) |
Weibull shape parameter (ML) | 1.474 | 1.784 | 1.919 (1.923) |
Weibull shape parameter (WAsP) | 1.874 | 1.955 | 2.176 (1.939) |
Weibull scale parameter (ML) | 6.790 | 6.117 | 5.965 (6.271) |
Weibull scale parameter (WAsP) | 7.316 | 6.233 | 6.148 (6.249) |

Note: values in parentheses are the simulation results for the 2nd order MC.
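The maximum likelihood (ML) Weibull parameters reported in the tables can be approximated with a short numerical routine. The sketch below is one possible approach, not necessarily the one used by the authors: it solves the standard ML equation for the shape parameter by bisection, assumes the shape lies between 0.05 and 20, and drops zero speeds because their logarithm is undefined. The WAsP method is not reproduced here.

```python
import math
import random


def weibull_mle(speeds, lo=0.05, hi=20.0, tol=1e-9):
    """Estimate Weibull (shape, scale) by maximum likelihood using bisection."""
    x = [v for v in speeds if v > 0]         # assumption: zeros are dropped
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(x)

    def score(k):
        # ML condition for the shape k; increasing in k, zero at the estimate.
        powers = [v ** k for v in x]
        weighted = sum(p * l for p, l in zip(powers, logs))
        return weighted / sum(powers) - 1.0 / k - mean_log

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            hi = mid
        else:
            lo = mid
    shape = 0.5 * (lo + hi)
    scale = (sum(v ** shape for v in x) / len(x)) ** (1.0 / shape)
    return shape, scale


# Hypothetical check on simulated data with known parameters
# (random.weibullvariate takes the scale first, then the shape).
rng = random.Random(0)
sample = [rng.weibullvariate(7.0, 1.5) for _ in range(5000)]
shape, scale = weibull_mle(sample)
```

Fitting the observed and the synthetic series with the same routine makes the shape and scale parameters directly comparable, which is how they are used in the tables above.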

The frequency distributions of observed and generated wind speed time series for both locations are presented in Figures

Ansari-Bradley test results.

Actual data | Synthetic data | Equal medians | Ansari-Bradley test ( | Decision |
---|---|---|---|---|
Winter 20 m US | Winter 20 m (1st order MC) | Yes | 0.2493 | H_{1} rejected |
Spring 20 m US | Spring 20 m (1st order MC) | No | - | - |
Summer 20 m US | Summer 20 m (1st order MC) | No | - | - |
Summer 20 m US | Summer 20 m (2nd order MC) | Yes | 0.0143 | H_{1} rejected |
Autumn 20 m US | Autumn 20 m (1st order MC) | Yes | 0.7394 | H_{1} rejected |
Autumn 20 m US | Autumn 20 m (2nd order MC) | Yes | 0.0408 | H_{1} rejected |
Winter 30 m TR | Winter 30 m (1st order MC) | Yes | 0.1361 | H_{1} rejected |
Spring 30 m TR | Spring 30 m (1st order MC) | Yes | 0.099 | H_{1} rejected |
Summer 30 m TR | Summer 30 m (1st order MC) | No | - | - |
Summer 30 m TR | Summer 30 m (2nd order MC) | No | - | - |

Histograms of observed and generated wind speed time series (Colorado, US, 20 m): (a) winter, (b) spring, (c) summer, and (d) autumn.

Histograms of observed and generated wind speed time series (Yalova, TR, 30 m): (a) winter, (b) spring, and (c) summer.

The Ansari-Bradley test is a nonparametric test which requires that the random variables are mutually independent and come from continuous populations with equal medians, but it does not require the assumption of a normal distribution. The null hypothesis is that

Considering that the variability of the simulation output is affected by the stochastic nature of the input wind speed data as well as by the random nature of the MC model itself, the MCMC simulation technique is sufficient to preserve most of the statistical characteristics and stochastic behavior of wind speed time series. Also, it is possible to improve the accuracy by using a second-order MC model.
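For readers who wish to repeat the dispersion comparison, the Ansari-Bradley statistic and an approximate p-value can be computed as sketched below. This is an illustrative implementation under assumptions: it uses the large-sample normal approximation, assumes there are no tied values, and does not perform the preliminary equal-medians check listed in the table. The function name and the simulated samples are hypothetical.

```python
import math
import random


def ansari_bradley(x, y):
    """Return the Ansari-Bradley statistic for x and a two-sided p-value.

    Uses the normal approximation and assumes no ties in the pooled sample.
    """
    m, n = len(x), len(y)
    total = m + n
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    # The score of the i-th smallest value (1-based) is min(i, N + 1 - i),
    # so values far from the centre of the pooled sample get small scores.
    ab = sum(min(i, total + 1 - i)
             for i, (_, label) in enumerate(pooled, start=1) if label == 0)
    if total % 2:   # N odd
        mean = m * (total + 1) ** 2 / (4 * total)
        var = m * n * (total + 1) * (3 + total ** 2) / (48 * total ** 2)
    else:           # N even
        mean = m * (total + 2) / 4
        var = m * n * (total + 2) * (total - 2) / (48 * (total - 1))
    z = (ab - mean) / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return ab, p_value


# Hypothetical samples with equal medians but clearly different dispersion.
rng = random.Random(3)
wide = [rng.gauss(0.0, 3.0) for _ in range(200)]
narrow = [rng.gauss(0.0, 1.0) for _ in range(200)]
stat, p = ansari_bradley(wide, narrow)
```

A small p-value indicates that the two samples differ in dispersion. The threshold used to reach a decision is a choice of the analyst and should be stated alongside the results.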

The simulation outputs for the 10 m and 50 m heights of the Colorado data were also examined. The 10 m output fits better than the 20 m output, whereas the 50 m output is a poorer fit than the 20 m output. However, the agreement can be improved by increasing the number of states and by considering the shape of the PDF of the observed wind speed when determining the state intervals. These results indicate that one month of wind speed time series is sufficient to generate accurate synthetic wind speed time series for the related season.

The collection of long-term wind speed measurements is essential for the economic evaluation of any wind energy project and hence for its ability to obtain financial support. In some cases, prolonged and continuous data for resource assessment may not be available, and there is a need for tools that provide accurate and reliable simulations of wind speed data based on limited data sets. This creates an opportunity for the production of synthesized data sets based on the statistical parameters of a wind regime. In this paper, a limited data set was used to produce synthesized data. By using observed wind speed data from selected months, synthetic wind speed data were generated for the related seasons. The comparisons between the actual data and the simulations showed that the statistical characteristics were satisfactorily reproduced. Therefore, the most important result is that only one month of wind speed data was sufficient to reproduce most of the general statistical characteristics and the stochastic behavior of wind speed time series for the related season. This result implies that Markov chain models could be used to complete missing data. The study also showed that the models used in this approach are affected by the characteristics of the data set, which appear mainly in its probability distribution. Examining and considering these probabilistic characteristics when discretizing the states of a Markov chain model would provide a better representation of the actual wind pattern. Further study is needed to determine the sensitivity of the simulation outputs to different probability distributions. It is also expected that the application of a continuous-time Markov process may improve the accuracy, especially in terms of the reproduction of missing data.

The authors would like to thank Dr. Ahmet Duran Şahin for providing the wind speed data for Turkey. The authors gratefully acknowledge the financial support for this research which was provided by Agri-Futures Nova Scotia.