Assessment of Three Learning Machines for Long-Term Prediction of Wind Energy in Palestine

In this research, an approach for predicting wind energy in the long term has been developed. The aim of this prediction is to generate wind energy profiles for four cities in Palestine based on the wind energy profile of a fifth city. Thus, wind energy data for one city, namely, Nablus, are used to develop the model; meanwhile, wind energy data for Hebron, Jenin, Ramallah, and Jericho are predicted based on that. Three machine learning algorithms are used in this research, namely, cascade-forward neural network, random forests, and support vector machines. The developed models have two input variables, which are the daily average cubic wind speed and the standard deviation, while the target is the daily wind energy. The R-squared values for the developed cascade-forward neural network, random forests, and support vector machines models are found to be 0.9996, 0.9901, and 0.9991, respectively. Meanwhile, RMSE values for the developed models are found to be 41.1659 kWh, 68.4101 kWh, and 205.10 kWh, respectively.


Introduction
Renewable energy sources have become the focus of attention of countries and energy companies because of their sustainable and inexhaustible nature and their availability everywhere in the world, unlike fossil fuels and minerals, and, most importantly, because they do not pollute the environment [1]. Wind energy is the second largest source of renewable energy in the world after hydropower, and it is one of the fastest growing energy sources [2]. The main advantages of wind energy as an energy source are that it is relatively safe, environmentally friendly, quick to install, and scalable (a wind farm can consist of a couple of turbines or hundreds, depending on the needs they cover) and that it has a low carbon footprint throughout the lifecycle of the project [3,4]. However, many challenges face the exploitation of wind energy, such as the high initial cost (some remote areas suitable for wind energy are costly to connect to the electrical network), the impact on wildlife, noise, visual pollution, and the unpredictable amount of energy due to variation in wind speed and weather conditions. The most significant problem facing wind energy is that the wind does not blow all the time at most sites. Wind speed varies continuously; the wind may blow for four days in a row and then sit idle for two days or more [2]. Therefore, wind energy cannot be available all the time like fossil fuels. Because of that, wind speed analysis and forecasting must be done at the location where the wind farm is to be established to ensure that the site can produce energy in a feasible manner.
Wind energy forecasting is extremely important to select an appropriate site to install a wind farm capable of producing sufficient and profitable amounts of energy. Wind energy forecasting methods can be divided into two categories depending on the time range, namely, short-term forecasting methods and long-term forecasting methods. Short-term methods are used to predict wind energy over a short period ranging from an hour to several days, while long-term methods are used to predict wind energy over a long period ranging from several days to several years. In the literature, many methods that rely on statistics or machine learning have been used to predict wind energy in both short term and long term.
Numerous researchers have used machine learning to predict short-term wind energy. In [5], Li et al. proposed a new effective method for wind power forecasting based on the support vector machine model. In this method, the parameters of the support vector machine model were optimized using an improved dragonfly algorithm. Najeebullah et al. [6] used an approach that combines machine learning techniques for regression and feature selection before using enhanced particle swarm optimization and a hybrid neural network for wind power forecasting, using a dataset consisting of daily wind speed, relative humidity, temperature, and wind power. Jursa and Rohrig [7] developed a model based on evolutionary optimization algorithms with artificial neural networks and nearest neighbor search. This model was able to predict wind power on an hourly basis. Amjady et al. [8] used a hybrid machine learning model consisting of radial basis function, backpropagation neural networks, and an enhanced particle swarm algorithm for wind power forecasting. This model has achieved good results compared to some other methods. Abhinav et al. [9] proposed a robust and accurate model based on a wavelet neural network. This model is applicable to all seasons of the year and requires less historic data compared to other methods in the literature. Similarly, Chitsaz et al. [10] used a wavelet neural network trained by an improved clonal selection algorithm to optimize the free parameters of the wavelet neural network. Moreover, Du et al. [11] used a three-layer feedforward wavelet neural network for multistep prediction of wind power time series. This model presents excellent results compared to other methods in the literature. Pousinho et al. [12] proposed a hybrid approach for wind power forecasting in Portugal by combining particle swarm optimization and an adaptive-network-based fuzzy inference system. Significant improvements in prediction accuracy can be achieved using this approach, compared to results obtained from five other methods. A very short-term wind forecasting method was developed by Potter and Negnevitsky [13] using an adaptive neurofuzzy inference system to forecast a wind time series. This method requires both wind speed and wind direction as input for the adaptive neurofuzzy inference system. Lahouar and Ben Hadj Slama [14] used random forests for hour-ahead wind power forecasting in Tunisia. This method is immune to irrelevant inputs and does not require optimization. Wang et al. [15] proposed a hybrid model based on Bayesian model averaging and ensemble learning (BMA-EL) for daily wind forecasting. This model can forecast wind power under different meteorological conditions, with higher precision and reliability.
On the other hand, statistical models have also been used to predict wind power, and the most important of these models is the ARMA model. Chen and Folly [16] successfully used the ARMA model to predict wind energy and wind speed for an hour ahead. This model has less error in wind power and wind speed forecasting compared to artificial neural networks and adaptive neurofuzzy inference systems. Erdem and Shi [17] used four approaches based on autoregressive moving average (ARMA) to predict the wind speed and direction tuple. Results are compared using the mean absolute error (MAE) as a measure of prediction quality. The component model was found to be better in wind direction prediction than the traditional-linked ARMA model, while the opposite was observed for wind speed prediction. Han et al. [18] proposed two hybrid models based on autoregressive moving average (ARMA) by adopting nonparametric models. The results of this research show that nonparametric hybrid models are generally better than other models and have more robust prediction performance. Sfetsos [19] applied autoregressive integrated moving average (ARIMA) for multistep forecasting of 10-minute averaged data, with subsequent averaging to generate mean hourly predictions. This model outperforms the conventional methods that utilize past mean hourly wind speed as model input. Kavasseri and Seetharaman [20] used a fractional ARIMA model for wind power forecasting on day-ahead and two-day-ahead horizons. Model errors were computed and compared to the persistence models. The results show a significant improvement in prediction accuracy compared to the persistence method. Dupre et al. [21] used a downscaling model to predict hourly and subhourly wind speed at 100 m height using outputs from the European Centre for Medium-Range Weather Forecasts (ECMWF). This model outperforms ANN, ARMA, and persistence models.
In long-term wind forecasting using machine learning, Grassi and Vecchio [22] used a two-hidden-layer neural network trained by the backpropagation learning algorithm on wind data from three different wind farms. Their model was able to predict wind power on a monthly basis with high accuracy. Barbounis and Theocharis [23] suggested using a locally recurrent neural network optimized by the recursive prediction error (RPE) algorithm for long-term wind power and wind speed forecasting. This suggested model shows better performance and accuracy compared to atmospheric and time series forecasting models. Samadianfard et al. [24] used a multilayer perceptron (MLP) optimized by the whale optimization algorithm for long-term wind speed forecasting in Iran. The research shows that the whale optimization algorithm could improve the accuracy of the MLP model. Carolin Mabel and Fernandez [25] developed a model based on artificial neural networks using MATLAB. To predict wind power, this model requires three inputs: wind speed, relative humidity, and generation hours. Yan and Ouyang [26] used physical mechanisms and data mining algorithms to create a hybrid model for wind power forecasting on a monthly basis. Model results show better performance in terms of accuracy and cost analysis compared to traditional models. Khan et al. [27] used a robust method called Cartesian genetic programming to develop an artificial neural network for wind power forecasting. This model can forecast wind power from one hour up to a year ahead. Dumitru and Gligor [28] proposed a model based on a feedforward artificial neural network for long-term wind power forecasting in the south-eastern part of Europe.
In long-term wind forecasting using statistical models, Cadenas and Rivera [29] developed a hybrid model based on artificial neural network (ANN) and autoregressive integrated moving average (ARIMA) models for wind speed forecasting. This hybrid model shows better accuracy compared to other ANN or ARIMA based models. Kamal and Jafri [30] used the time series model ARMA for wind speed prediction. This model takes into account several features such as autocorrelation, non-Gaussian distribution, and diurnal nonstationarity. Liu et al. [31] proposed two hybrid models called ARIMA-Kalman and ARIMA-ANN for wind speed prediction. Results show that both models perform well and can be applied to nonstationary wind speed. De Alencar et al. [32] used a hybrid model combining artificial neural network, autoregressive integrated moving average (ARIMA), and wavelets for wind speed prediction. This model can be used for short-term, medium-term, and long-term wind speed prediction.
From the literature, it is clear that the most effective methods are artificial neural networks and support vector machines. Meanwhile, most of the utilized artificial neural networks are feedforward neural networks. It is assumed in this research that other neural network architectures, such as the cascade-forward neural network, may perform better than the conventional feedforward network. Moreover, the research examines random forests as a forecasting technique, since this technique has proved its ability in predicting other renewable energies such as solar energy. Thus, there are two main objectives in this research. The first is to assess three types of learning machines, namely, cascade-forward neural networks, random forests, and support vector machines, for long-term wind energy prediction. The second is to predict wind energy for four cities in Palestine, as such research has not been done before. The importance of the second objective is to have a model that is able to predict wind energy for any location in Palestine. Such a model will be important for any investment in the field of wind energy in this country.

Proposed Learning Machines
Machine learning is a subfield of artificial intelligence (AI) that has evolved from pattern recognition and is used to explore data structure and fit models that users can understand and use [33]. It answers the question of how to build a computer program using historical data, solve a specific problem, and automatically improve program efficiency through experience [33]. Machine learning is related to the field of mathematical statistics and mathematical optimization. It is divided into multiple methods such as supervised learning, unsupervised learning, semisupervised learning, and reinforcement learning; each method has specific use cases and its own algorithms.
Supervised learning is a model in which both the required input and output data are provided. Input and output data are categorized to provide the basis for learning to process data in the future. In this research, three supervised learning algorithms will be used, namely, cascade-forward neural network, random forests, and support vector machines.

Cascade-Forward Neural Network.
The cascade-forward neural network (CFNN) can be described as a "self-organizing" neural network. The CFNN generates hidden layers one after another, learning and evolving from its basic structure into a multilayer network. It is called cascade because each layer receives a connection from the input layer and from every previous layer. It consists of three types of layers: input layer, hidden layer, and output layer. The network does not change the neuron values in the input layer, but it distributes them to the hidden and output layers. A transfer function produces the final output value from the sum of the neuron output values, each multiplied by its weight [34].
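The cascade connection described above can be illustrated with a minimal forward pass. This is only a sketch under stated assumptions: a single hidden layer, a tanh hidden activation, a linear output, and hand-picked weights are all hypothetical choices made for illustration, not the configuration used in this research.

```python
import math

def cfnn_forward(x, w_ih, b_h, w_ho, w_io, b_o):
    """Forward pass of a one-hidden-layer cascade-forward network (scalar output).

    x    : list of input values
    w_ih : hidden-layer weights, one row of len(x) weights per hidden neuron
    b_h  : hidden-layer biases
    w_ho : hidden-to-output weights
    w_io : direct input-to-output weights (the cascade connection)
    b_o  : output bias
    Assumed for illustration: tanh hidden activation and a linear output.
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_ih, b_h)
    ]
    from_hidden = sum(w * h for w, h in zip(w_ho, hidden))
    from_input = sum(w * xi for w, xi in zip(w_io, x))  # skipped in a plain feedforward net
    return from_hidden + from_input + b_o

# Hypothetical weights: two inputs, two hidden neurons.
y = cfnn_forward(
    x=[0.5, -1.0],
    w_ih=[[0.2, -0.4], [0.7, 0.1]],
    b_h=[0.0, 0.1],
    w_ho=[1.5, -0.5],
    w_io=[0.3, 0.3],
    b_o=0.05,
)
print(y)
```

Setting `w_io` to zeros reduces the sketch to an ordinary feedforward network, which is the only structural difference between the two architectures here.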

Random Forest.
Random forest is a classification and regression machine learning technique, developed by [35], that works by assembling a multitude of decision trees. A random forest combines numerous decision trees, each generated from a sample of the training data chosen randomly using the bootstrapping technique. The random forests algorithm is based on the CART model approach [36]. However, there are some noteworthy differences. The first difference is that the training data are selected randomly, while the best split is computed at each split node of the random forest. In addition, all the trees in the random forest are grown to their maximum size without pruning. The input variables can be ranked based on their significance to the output by comparing the effect of each input on the model accuracy. This comparison is based on the out-of-bag error.
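The bootstrapping step and the out-of-bag set it produces can be sketched as follows. The function name `bootstrap_split` and the sample size are hypothetical; the sketch only shows how each tree's training sample and its out-of-bag observations are drawn, not how the trees themselves are grown.

```python
import random

def bootstrap_split(n_samples, rng):
    """Draw one bootstrap sample for a single tree.

    Returns (in_bag, out_of_bag): in_bag holds n_samples indices drawn with
    replacement; out_of_bag holds the indices that were never drawn and can
    therefore be used to estimate that tree's error on unseen data.
    """
    in_bag = [rng.randrange(n_samples) for _ in range(n_samples)]
    out_of_bag = sorted(set(range(n_samples)) - set(in_bag))
    return in_bag, out_of_bag

rng = random.Random(42)  # fixed seed so the sketch is repeatable
in_bag, oob = bootstrap_split(1000, rng)
print(len(oob) / 1000)   # fraction of observations left out of this tree
```

For large samples the out-of-bag fraction approaches 1/e, roughly 37%, so every tree leaves a substantial share of the data available for the out-of-bag error estimate.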

Support Vector Machines.
Support vector machines (SVMs) are one of the most popular classic machine learning techniques. One of the most important advantages of the SVM technique is that it is nonparametric and therefore does not assume any prior knowledge about the underlying distribution of the data. Another advantage is its resiliency and the fact that it can easily handle large datasets with unknown, complex, and high-dimensional dependency structures [37]. However, the most important feature of the SVM technique is its robustness. This feature is very significant because it allows the technique to keep working well even in the presence of outliers or extreme data, regardless of whether these are simple errors in the data, extreme observations, or data that come from extreme value or heavy-tailed distributions [37].

Utilized Wind Speed Data
In this research, wind speed data for five Palestinian cities are used, namely, Nablus, Ramallah, Hebron, Jenin, and Jericho. These data were measured at a height of 10 m and were obtained from the PVGIS database. Figure 1 shows these data.

Wind Speed and Installation Height.
The PVGIS wind speed data are measured at 10 m above ground level, which is lower than the 50 m hub height of the selected turbine. Therefore, the wind speed at the turbine hub height is estimated by using the power law equation with the measured data as reference input. The power law equation is given by
$$V = V_0 \left(\frac{H}{H_0}\right)^{a} \tag{1}$$

Mathematical Problems in Engineering
where $V$ is the wind speed at the hub height, $V_0$ is the measured wind speed, $H$ is the hub height, $H_0$ is the height where the speed was measured, and $a$ is a constant that varies with surface roughness and terrain condition. This constant ($a$) has a typical value of 0.14 for smooth, level, grass-covered terrain [38].
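As a small illustration of the power law, the sketch below scales a speed measured at 10 m to a 50 m hub height with $a = 0.14$. The function name and the sample speed are hypothetical; the heights and exponent are the ones stated in the text.

```python
def hub_height_speed(v0, h=50.0, h0=10.0, a=0.14):
    """Extrapolate a wind speed v0 measured at height h0 (m) to height h (m)
    using the power law V = V0 * (H / H0) ** a."""
    if h <= 0 or h0 <= 0:
        raise ValueError("heights must be positive")
    return v0 * (h / h0) ** a

v10 = 4.0  # hypothetical hourly wind speed at 10 m, in m/s
print(hub_height_speed(v10))
```

With these heights and this exponent the scaling factor is about 1.25, so hub-height speeds are roughly 25% higher than the 10 m values. Because energy depends on the cube of the speed, this correction has a large effect on the estimated energy.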

Wind Distribution.
The output power of wind turbines depends on the speed of the wind and air density, but they are not the only factors. Wind power also depends on wind speed distribution and wind speed frequency. The two-parameter Weibull distribution is the most used distribution in wind energy studies. It has been found that this distribution is a good fit with the measured wind speed data. The Weibull probability density function is given by
$$f(v) = \frac{k}{c}\left(\frac{v}{c}\right)^{k-1}\exp\left[-\left(\frac{v}{c}\right)^{k}\right] \tag{2}$$
and the Weibull cumulative distribution function is given by
$$F(v) = 1 - \exp\left[-\left(\frac{v}{c}\right)^{k}\right] \tag{3}$$
where $v$ is the wind speed, $k$ is the Weibull shape parameter, and $c$ is the Weibull scale parameter. Weibull parameters are calculated by fitting the wind speed data to (2) by using an iterative procedure to minimize the summation of the absolute difference between (2) and the real data. Table 1 shows Weibull scale parameters and Weibull shape parameters for the selected cities for wind energy forecasting in this research. Figure 2 shows the Weibull probability density function for these selected cities.
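The Weibull functions and the fitting criterion can be sketched as below. The iterative procedure used in this research is not specified, so a plain grid search stands in for it here; the grid ranges, step sizes, and the synthetic density values are assumptions made only for illustration.

```python
import math

def weibull_pdf(v, k, c):
    """Two-parameter Weibull probability density at wind speed v.
    The unbounded density at v = 0 for k < 1 is not handled in this sketch."""
    if v < 0:
        return 0.0
    return (k / c) * (v / c) ** (k - 1) * math.exp(-((v / c) ** k))

def weibull_cdf(v, k, c):
    """Two-parameter Weibull cumulative distribution at wind speed v."""
    if v < 0:
        return 0.0
    return 1.0 - math.exp(-((v / c) ** k))

def fit_weibull_grid(speeds, densities):
    """Grid search (assumed stand-in for the iterative procedure) for the k, c
    that minimize the sum of absolute differences between the Weibull density
    and observed densities at the given speeds."""
    best_err, best_k, best_c = float("inf"), None, None
    for i in range(61):                # k from 1.00 to 4.00 in steps of 0.05
        k = 1.0 + 0.05 * i
        for j in range(1, 301):        # c from 0.05 to 15.00 in steps of 0.05
            c = 0.05 * j
            err = sum(abs(weibull_pdf(v, k, c) - d) for v, d in zip(speeds, densities))
            if err < best_err:
                best_err, best_k, best_c = err, k, c
    return best_k, best_c

# Synthetic check: densities generated from known parameters are recovered.
centers = [0.5 + i for i in range(15)]             # hypothetical bin centers, m/s
observed = [weibull_pdf(v, 2.0, 6.0) for v in centers]
print(fit_weibull_grid(centers, observed))
```

In practice the observed densities would come from a normalized histogram of the hourly wind speeds for each city rather than from a known curve.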

Wind Energy.
Wind energy has been used for thousands of years in many ways such as windmills, sailing, and wind turbines. Modern large-scale wind turbines are machines used to convert kinetic wind energy into mechanical energy, which is converted into electrical energy using electric generators [39]. Wind energy is obtained by utilizing the kinetic energy of the flowing air. The available wind kinetic energy is directly proportional to the air mass and to the square of the airflow speed. However, it is easier to use air density instead of air mass to compute available wind power and actual wind turbine output power using (4) [38] and (5) [40], respectively:
$$P_w = \frac{1}{2}\rho A V^{3} \tag{4}$$
$$P_a = \frac{1}{2}\rho A V^{3} C_p \tag{5}$$
where $P_w$ is the available wind power, $P_a$ is the actual output power generated by the wind turbine, $\rho$ is the air density, $A$ is the swept area of the blades of the wind turbine, $V$ is wind speed, and $C_p$ is the power coefficient. The power coefficient ($C_p$) is an indicator of total wind turbine system efficiency, and it depends on many factors, such as tip angle, blade shape, and the correlation between wind speed and rotor speed. According to Carrillo [41], the maximum theoretical value of the power coefficient (Betz limit) for any turbine is 0.593. However, real turbines cannot achieve this value and their maximum value is usually around 0.5. The output power of a wind turbine is usually represented through its power curve, which relates wind speed to turbine output power, as shown below [41]:
$$P_{turbine} = \begin{cases} 0, & v < v_{ci} \\ P_a, & v_{ci} \le v < v_r \\ P_r, & v_r \le v \le v_{co} \\ 0, & v > v_{co} \end{cases} \tag{6}$$
where $v$ is hourly wind speed, $v_{ci}$ is turbine cut-in speed, $v_r$ is turbine rated speed, $v_{co}$ is turbine cut-out speed, $P_{turbine}$ is turbine output power, $P_r$ is turbine rated power, and $P_a$ is the nonlinear relationship between turbine power and wind speed given by (5). In this research, a 1 MW turbine will be used for wind power modeling. Figure 3 shows the power curve of this turbine, and the technical specification of this turbine is shown in Table 2.
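The power curve logic can be sketched as a single function. Only the 1 MW rated power comes from the text; the cut-in, rated, and cut-out speeds, the swept area, the air density, and the power coefficient below are hypothetical placeholders, since the specification table is not reproduced here.

```python
def turbine_power_kw(v, v_ci=3.0, v_r=12.0, v_co=25.0, p_r_kw=1000.0,
                     rho=1.225, area_m2=2290.0, cp=0.4):
    """Turbine output power in kW at hub-height wind speed v (m/s).

    Below cut-in or above cut-out the turbine produces nothing; between
    cut-in and rated speed the output follows 0.5 * rho * A * v^3 * Cp;
    from rated speed up to cut-out the output is the rated power.
    All default values except p_r_kw are hypothetical.
    """
    if v < v_ci or v > v_co:
        return 0.0
    if v >= v_r:
        return p_r_kw
    p_a_kw = 0.5 * rho * area_m2 * v ** 3 * cp / 1000.0  # W converted to kW
    return min(p_a_kw, p_r_kw)  # guard so the cubic branch never exceeds rated power

for speed in (2.0, 8.0, 15.0, 30.0):
    print(speed, turbine_power_kw(speed))
```

With a constant power coefficient the cubic branch will generally not meet the rated power exactly at the rated speed, so a real power curve taken from the manufacturer's data would be smoother than this sketch.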

Prediction of Wind Speed Data
In this research, training and testing datasets were generated using datasets obtained from the PVGIS database. Figure 1 shows this dataset, which contains hourly wind speed data at 10 m height for twelve years for the city of Nablus and one year for Hebron, Ramallah, Jenin, and Jericho. The training dataset for the machine learning models was generated based on Nablus hourly wind speed data, where the inputs in the training dataset are the daily average cubic wind speed and the standard deviation, while the target is the daily wind energy. Similarly, the testing dataset was generated based on the hourly wind speed data of the other cities. The training process for the machine learning models starts by calculating wind speed at hub height (50 m) using (1), hourly energy using (6), daily energy using (7), daily average cubic wind speed using (8), and daily standard deviation using (9). Then, the daily average cubic wind speed and standard deviation are used to train the models to predict daily wind energy. The testing process for the trained models starts by generating the testing dataset for Hebron, Ramallah, Jenin, and Jericho in the same way the training datasets were generated. Figure 4 shows the flowchart of the whole approach used for wind energy forecasting. The accuracy of the proposed model is evaluated based on the root mean square error (RMSE), mean bias error (MBE), and coefficient of determination ($R^2$).
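The construction of one training sample from a day of hourly data might look like the sketch below. The exact definitions are assumptions: the daily average cubic wind speed is taken as the mean of the cubed hourly speeds, the standard deviation is the population standard deviation of the hourly speeds, and `power_fn` is a hypothetical stand-in for the turbine power curve.

```python
import statistics

def daily_features(hourly_speeds, power_fn):
    """Build one (inputs, target) sample from 24 hourly hub-height speeds.

    Returns (mean_cubic_speed, speed_std, daily_energy_kwh).
    Assumptions: mean of v^3 over the day, population standard deviation of
    the hourly speeds, and energy as hourly power (kW) times one hour.
    """
    if len(hourly_speeds) != 24:
        raise ValueError("expected 24 hourly values")
    daily_energy_kwh = sum(power_fn(v) * 1.0 for v in hourly_speeds)  # kW * 1 h
    mean_cubic_speed = sum(v ** 3 for v in hourly_speeds) / 24
    speed_std = statistics.pstdev(hourly_speeds)
    return mean_cubic_speed, speed_std, daily_energy_kwh

def toy_power_kw(v):
    """Hypothetical simplified power curve: cubic growth capped at 1000 kW."""
    return min(0.5 * v ** 3, 1000.0)

day = [4.0 + 0.25 * h for h in range(24)]  # hypothetical hourly speeds, m/s
print(daily_features(day, toy_power_kw))
```

Using the cubed speed as an input reflects the cubic dependence of power on wind speed, while the standard deviation carries information about how unevenly that speed was spread over the day.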
MBE provides information about the long-term performance of the proposed method, and it shows the average deviation of the predicted energy values from the corresponding actual energy values. In the current prediction, a positive MBE represents overestimation and a negative MBE represents underestimation. MBE can be determined using the equation below:
$$MBE = \frac{1}{n}\sum_{k=1}^{n}\left(E_p(k) - E_a(k)\right)$$
RMSE is a measure of the spread of the energy values from the model around the values of the actual energy, and it provides information on the short-term performance. RMSE is calculated by
$$RMSE = \sqrt{\frac{1}{n}\sum_{k=1}^{n}\left(E_p(k) - E_a(k)\right)^{2}}$$
where $E_p(k)$ is the predicted energy, $E_a(k)$ is the actual energy value, and $n$ is the number of data points. Finally, $R^2$ is a coefficient that indicates how much of the variation in one variable can be explained by variation in another variable. $R^2$ can be calculated by
$$R^{2} = 1 - \frac{SS_{res}}{SS_{tot}}$$
where $SS_{res}$ is the sum of squares of residuals and $SS_{tot}$ is the total sum of squares.
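The three measures can be computed directly from paired predicted and actual values, as in the sketch below. The function names and the three sample values are hypothetical and serve only to show the sign convention of MBE.

```python
import math

def mbe(predicted, actual):
    """Mean bias error: positive means overestimation on average."""
    return sum(p - a for p, a in zip(predicted, actual)) / len(actual)

def rmse(predicted, actual):
    """Root mean square error between predicted and actual values."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def r_squared(predicted, actual):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("actual values are constant; R^2 is undefined")
    return 1.0 - ss_res / ss_tot

predicted = [110.0, 190.0, 305.0]  # hypothetical daily energies, kWh
actual = [100.0, 200.0, 300.0]
print(mbe(predicted, actual), rmse(predicted, actual), r_squared(predicted, actual))
```

In this sample the positive and negative errors partly cancel in the MBE but not in the RMSE, which is why the two measures are reported together.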

Results and Discussion
In this research, long-term wind energy has been predicted using three machine learning algorithms, namely, cascade-forward neural network (CFNN), random forests (RFs), and support vector machines (SVM). The MATLAB Statistics and Machine Learning Toolbox was used to train and test these models. Nablus wind speed data were used to train the models; then these models were tested on wind speed data of four other cities.
In general, all of the developed models have two inputs, which are the daily average cubic wind speed and the standard deviation, and one output, which is the daily wind energy. The developed CFNN model consisted of ten hidden layers and 32 neurons, and the network was trained using Levenberg-Marquardt optimization. The Nablus data were split into 70% for training, 15% for validation, and 15% for testing. Meanwhile, the data for the other four cities were not used in any part of the aforementioned process, so that the models are evaluated on unseen data. The proposed RF model consisted of 150 trees with a minimum leaf size of one. Finally, a quadratic kernel function was chosen for the SVM.
According to the results, the cascade-forward neural network predicted energy throughout the year with the highest accuracy, followed by SVM with slightly lower accuracy, and finally the RFs model. It was noted that the RFs model could not predict high energy values because the models were trained at relatively low wind speeds and a random forest, being an ensemble of regression trees, cannot predict values outside the range of its training data. Table 3 shows the RMSE, MBE, and $R^2$ for the three models.
In general, for the CFNN model, the average RMSE is 33.07 kWh, which means that the error in predicting a daily value of wind energy at these locations is typically of this order. The highest RMSE for the CFNN was recorded for Hebron city, where the wind energy potential is the highest; meanwhile, the model was trained on data for Nablus city, which has an average wind energy potential. This affects the accuracy of the model in predicting high wind energies, as such values lie outside the range of the training dataset. However, training the model on wind speed data for Hebron instead of Nablus would not help, as the model would be less accurate in predicting data for the other cities. Thus, the best option is to develop individual models for each city and nearby locations. However, in this research the aim was to develop a general model for all locations in Palestine. On the other hand, the MBE values for the proposed CFNN model ranged from 4.15 kWh (overestimation of wind energy) to −4.44 kWh (underestimation of wind energy). These values are small and show the ability of the proposed model to predict such data. Finally, a high $R^2$ value was noticed for this model and the other models, which is quite logical as the utilized inputs and outputs are highly correlated.
As for the SVM model, the measures show accuracy close to that of the proposed CFNN model, while the RF model showed the worst accuracy measures compared to the other two models. This is also expected, as RFs are better suited for classification than for prediction of nonlinear and highly uncertain data.
In addition to that, CFNN shows the strongest ability to predict wind energy for locations with high energy potential (Hebron city) as compared to other models. Based on that, CFNN was chosen in this research as the best model for prediction of wind energy.

Conclusion
In this research, three learning machines were developed to predict wind energy in the long term for any location in Palestine. These learning machines were trained using wind data for one city (Nablus), and the developed models were then utilized to predict wind energy for four other cities (Hebron, Ramallah, Jenin, and Jericho). According to the results, all of the proposed models were able to predict wind energy in several other cities with acceptable accuracy. Specifically, the proposed CFNN model was the most accurate model for wind energy forecasting at different locations. Meanwhile, the proposed SVM model accuracy was slightly less than that of the proposed CFNN model. On the other hand, the accuracy of the proposed RF model was quite far from that of the other proposed models (CFNN and SVM), as RF is usually used for classification more than for prediction. However, the accuracy of these models can be increased effectively by enlarging the training data with data from different locations that cover different wind speed ranges. The accuracy of the CFNN model and the other models was evaluated based on three statistical measures, which are RMSE, MBE, and $R^2$. For the CFNN model, the averages of these values were 30.07 kWh, 4.3 kWh, and 0.999, respectively. Such a method and analysis are useful for any investment or research in the field of wind energy in Palestine.

CFNN: Cascade-forward neural network
RF: Random forest
ANN: Artificial neural network
SVM: Support vector machines
BMA-EL: Bayesian model averaging and ensemble learning
ARMA: Autoregressive moving average
MAE: Mean absolute error
ECMWF: European Centre for Medium-Range Weather Forecasts
PSO: Particle swarm optimization
$P_w$: Available wind power
$P_a$: Actual output power
$P_{turbine}$: Turbine output power
$\rho$: Air density
$A$: Area of the blades
MBE: Mean bias error
$R^2$: Coefficient of determination
MAPE: Mean absolute percentage error
RPE: Recursive prediction error
CART: Classification and regression tree
AI: Artificial intelligence
PVGIS: Photovoltaic geographical information system
$V$: Wind speed
$V_0$: Measured wind speed
$H$: Hub height
$a$: Constant that varies with surface roughness and terrain condition
$H_0$: Height where the speed was measured
$P_r$: Turbine rated power
$C_p$: Power coefficient
$v_r$: Turbine rated speed
$v_{co}$: Turbine cut-out speed
RMSE: Root mean square error
$SS_{tot}$: Total sum of squares.

Data Availability
The utilized data with editable figures and tabulated files are available from the authors upon request.

Conflicts of Interest
The authors declare no conflicts of interest.