Short-Term Forecasting Models for Photovoltaic Plants: Analytical versus Soft-Computing Techniques

We present and compare two short-term statistical forecasting models for the hourly average electric power production of photovoltaic (PV) plants: the analytical PV power forecasting model (APVF) and the multilayer perceptron PV forecasting model (MPVF). Both models use forecasts from numerical weather prediction (NWP) tools at the location of the PV plant, as well as past recorded values of hourly PV electric power production. The APVF model consists of an original procedure that adjusts clear sky irradiation data by an irradiation attenuation index, combined with a PV power production attenuation index. The MPVF model is based on an artificial neural network (ANN) selected among a large set of ANNs optimized with genetic algorithms (GAs). The two models use forecasts from the same NWP tool as inputs. The APVF and MPVF models have been applied to a real-life case study of a grid-connected PV plant using the same data. Although the two models are quite different, they achieve very similar results, with forecast horizons covering all the daylight hours of the following day, which supports their applicability to PV electric production sale bids in electricity markets.


Introduction
Power plants based on renewable energy sources (RESs) have experienced an important expansion in recent years. Economic, political, and social reasons have propelled this expansion: rising fossil-fuel prices and carbon pricing, falling technology costs, governmental policies that promote their construction by means of subsidies, and a social preference for non-polluting power plants. By 2035, the production of power plants based on renewable energy is expected to account for almost one-third of total electricity output [1], with wind energy and solar photovoltaic (PV) energy as the main sources.
The variability and volatility of the renewable resource can make the integration of this kind of power plant into the power grid difficult. The supply and the load of electric power must be balanced at every instant: the system operator needs upcoming values of power production and of electric load in order to schedule reserves and to operate the power system economically [2][3][4]. Forthcoming power production in RES-based power plants is calculated on the basis of short-term forecasts. Additionally, in countries with a day-ahead electricity market, large RES-based power plants can act as any other electricity producer, providing power generation sale offers (bids) to the market. Producers of this kind, such as wind farms or large PV plants, use forecasts of hourly energy generation to prepare their electric energy sale bids. These forecasts carry a risk: in electricity markets, a power producer that does not follow the scheduled bid (the one offered to the Market Operator) is penalized with remuneration lower than that established in the market for the hours in which the electric energy actually produced deviates from that presented in the bid. So, a PV power producer needs good quality forecasting systems to reduce penalties in electricity markets and to optimize profits. These technical and economic reasons have driven the development of short-term power forecasting models for wind farms and for relatively large grid-connected PV plants.
In recent years, several short-term power forecasting models related to PV plants have been published. At an early stage, the models for PV plants were oriented to obtaining solar radiation predictions [5][6][7][8]. Other works present models specifically dedicated to hourly power generation forecasting in PV plants [9][10][11][12][13][14][15][16][17]. The most widely applied technique in these forecasting models is a specific soft-computing technique known as artificial neural networks (ANNs). In [9], power outputs of a PV plant with forecasting horizons of 1 and 2 h ahead were predicted with several forecasting models; the models based on ANNs optimized with a genetic algorithm (GA) achieved the best results. In [10], a model based on ANNs was used to provide power forecasts for a small-scale PV panel at different time horizons up to 5 hours. One-hour-ahead power output forecasts were obtained in [11] with a model based on ANNs and wavelet transformation. Hourly power production forecasts for PV plants were presented in [12], not only at a local scale but also at a regional scale. Hourly insolation and temperature forecasts for the next 24 h were presented in [13]; afterwards, these forecasts were used to estimate the hourly power production of PV plants. Predictions of PV plant production based on genetic programming of evolution of fuzzy rules were proposed in [14]. Hourly power generation of PV plants was obtained with support vector machines in [15,16]. Recently, a combination of ANNs and evolutionary algorithms has been applied in [17] to short-term PV plant production forecasting.
This paper presents an analytical model and a soft-computing model for short-term power forecasting of PV plants, and it compares them. The first model, named analytical PV power forecasting model (APVF), is an original approach based on analytical irradiance computation for clear sky, adjusted by an irradiation attenuation index and combined with a PV power production attenuation index; it uses hourly radiation forecasts from a NWP model as input. The second one, named multilayer perceptron PV forecasting model (MPVF), is an artificial neural network based model (selected as the best one among a large set of ANNs optimized with GAs) that uses the forecasts of weather variables obtained with a NWP model as inputs. The two models (APVF and MPVF) have been developed by two independent research groups. Both models have been applied to provide hourly PV power forecasts for a real-life grid-connected PV plant, using the same data variables. The forecasting horizons cover all the daylight hours of the following day (the PV power production is forecasted before dawn of the previous day). Although the two models are very different, their forecasting results in the case study are very similar and satisfactory, which enables their use in practical applications such as PV electric energy bids to electricity markets and PV plant maintenance tasks.
The paper is structured as follows: Section 2 presents the NWP model used to provide the forecasts of the weather variables used by the two compared short-term PV power forecasting models (APVF and MPVF); Section 3 describes the APVF model; Section 4 describes the MPVF model; Section 5 compares the results obtained with both models in the forecast of PV power production in a real-life grid-connected PV plant; lastly, Section 6 presents the conclusions.

Numerical Weather Prediction Models
Numerical weather prediction (NWP) models have been frequently used in recent years to carry out short-term forecasts of meteorological variables in many different applications. These models are usually classified according to the spatiotemporal characteristics of their weather predictions. Each NWP model forecasts atmospheric variables with a degree of quality that depends on the geographical extension and on the temporal resolution of its forecasts; thus, some models with a high spatial scale obtain predictions that are valid for a relatively short time, whereas longer forecast horizons are reached by NWP models of lower spatial scale.
In this paper, the Global Forecast System (GFS) model [18], with world coverage, provides fundamental meteorological information to another NWP model with a greater spatial resolution, the MM5 model [19], which is the one used in this paper. The MM5 model was initialized every day with weather predictions from the GFS model.
Afterwards, the NWP meteorological forecasts for the geographical location of the PV plant were used by the two PV generation forecasting models of this paper (analytical PV power forecasting model, APVF, and multilayer perceptron PV forecasting model, MPVF). The forecasts correspond to hourly average values of the meteorological variables for all the daylight hours of the following day. More specifically, the NWP model provides predictions of hourly irradiance as inputs of the APVF model, whereas the MPVF model uses hourly predictions of irradiance, temperature, and other variables related to atmospheric heat transfer.

Analytical PV Power Forecasting Model (APVF)
The APVF model corresponds to an analytical approach essentially based on a multiplicative decomposition. The hourly PV power production time series is decomposed into a deterministic component, the hourly PV power production for clear sky, and a power attenuation index component. The values of this last component (power attenuation index) are calculated from the forecasts of weather variables, as explained later in this paper. Section 3.1 details the parameterization of the APVF model. Afterwards, Section 3.2 describes the steps for the forecasting application of the APVF model.

APVF Model Parameterization.
The parameterization of the APVF model, presented in the following paragraphs, basically includes the computation of the actual hourly extraterrestrial irradiance (Section 3.1.1); the calculation of the actual hourly irradiance, that is, the clear sky hourly irradiance (Section 3.1.2); a regression model between real and clear sky hourly PV power productions (Section 3.1.3); and a PV production attenuation index (Section 3.1.4).

Computation of Actual Hourly Extraterrestrial Irradiance.
The first step of the model parameterization consists of calculating the value of the actual extraterrestrial hourly irradiance $G_0$ [20,21]. It is an analytical computation that depends on the hour of the day, the day of the year, and the geographic coordinates of the PV plant location. The irradiance $G_0$ is calculated according to (1), where $G_{SC}$ is the solar constant (1367 W/m$^2$); $d_n$ is the day of year number, counted from the beginning of the year; $\delta$ is the solar declination [20], in decimal degrees, calculated according to (2); $\varphi$ is the geographic latitude, in decimal degrees, corresponding to the location of the PV plant; and $\omega$ is the hour angle, in decimal degrees, calculated according to (3).

The hour angle $\omega$ in (3) depends on the solar time $h_s$, in hours, which is computed with (4). In (4), ET is the correction corresponding to the "equation of time" [21], in hours, obtained by (6); longitude is the geographic longitude, in decimal degrees, that corresponds to the location of the PV plant [20], calculated according to (8); and $t_{UTC}$ is the coordinated universal time, in hours, given by (5), where OT is the local standard time, in hours (the time shown by a clock), and SA is the time, in hours, by which clocks are set ahead of the local time zone. In the European Union, SA is usually one hour during winter and autumn and two hours during spring and summer.

The correction of the equation of time, ET, is given by (6), in which the auxiliary angle $B$, in decimal degrees, is given by (7). The geographic longitude is calculated according to (8), where LL is the local longitude, in decimal degrees (true longitude of the site for which the calculation is to be made), and LS is the reference (standard) longitude, in decimal degrees, of the local time zone (positive towards the west and negative towards the east of the Greenwich Meridian).
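A minimal sketch of this computation is given below. It assumes the commonly used textbook expressions (Cooper's approximation for the declination, a 0.033 eccentricity factor, and 15 degrees of hour angle per hour from solar noon), which may differ in detail from (1) to (3). The function names are illustrative, and the solar time is taken directly as an input instead of being derived through (4) to (8).

```python
import math

G_SC = 1367.0  # solar constant in W/m^2, as stated in the text


def declination_deg(day_of_year):
    """Solar declination in degrees (Cooper's approximation, an assumption)."""
    return 23.45 * math.sin(math.radians(360.0 * (284 + day_of_year) / 365.0))


def hour_angle_deg(solar_time_h):
    """Hour angle in degrees: 15 degrees per hour, zero at solar noon."""
    return 15.0 * (solar_time_h - 12.0)


def extraterrestrial_irradiance(day_of_year, latitude_deg, solar_time_h):
    """Extraterrestrial irradiance on a horizontal surface, in W/m^2."""
    delta = math.radians(declination_deg(day_of_year))
    phi = math.radians(latitude_deg)
    omega = math.radians(hour_angle_deg(solar_time_h))
    # Correction for the varying Sun-Earth distance along the year.
    eccentricity = 1.0 + 0.033 * math.cos(math.radians(360.0 * day_of_year / 365.0))
    cos_zenith = (math.sin(phi) * math.sin(delta)
                  + math.cos(phi) * math.cos(delta) * math.cos(omega))
    # Negative values mean the Sun is below the horizon.
    return max(0.0, G_SC * eccentricity * cos_zenith)
```

For a mid-latitude site, the function returns zero at night and a value somewhat below the solar constant around solar noon in summer.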

Calculation of the Clear Sky Hourly Irradiance.
The actual hourly irradiance $G_{cs}$ (clear sky hourly irradiance) is a reference irradiance used in the APVF forecasting model. It is computed as a function of the extraterrestrial irradiance $G_0$ and the solar time $h_s$. The following paragraphs describe an analytical procedure to obtain $G_{cs}$ at the location of the PV plant.
The scatter plots of Figure 1 show the relation between the forecasted hourly irradiance $G_f$ on a horizontal surface (vertical axis) and the extraterrestrial irradiance $G_0$ (horizontal axis) corresponding to the PV plant location. The forecasted hourly values of $G_f$ are obtained from the NWP model.
In Figure 1, for each value of the extraterrestrial irradiance $G_0$ (horizontal axis), there is a maximum value of the forecasted irradiance $G_f$ (vertical axis). The set of points that forms the envelope of these maximum values corresponds to forecasts for clear sky. For the specific PV plant location, a regression between the maximum values of $G_f$ and the $G_0$ values can be obtained, which represents the clear sky irradiance $G_{cs}$ as a function of $G_0$. As shown in Figure 1, two different regressions are needed, one for the hours before the solar noon, Figure 1(a), and the other for the hours after the solar noon, Figure 1(b).
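The envelope idea can be illustrated with a short sketch. It is only one possible realization under stated assumptions: the points are grouped into bins of extraterrestrial irradiance (the bin width is arbitrary), the largest forecasted irradiance of each bin is kept, and a straight line through the origin is fitted by least squares. The text does not specify the binning or the form of the regression, so `envelope_points` and `fit_slope_through_origin` are hypothetical helpers. In practice the fit would be run twice, once for the hours before the solar noon and once for the hours after it.

```python
def envelope_points(g0_values, gf_values, bin_width=100.0):
    """Keep, for each bin of extraterrestrial irradiance, the point with the
    largest forecasted irradiance (an approximation of the upper envelope)."""
    best = {}
    for g0, gf in zip(g0_values, gf_values):
        key = int(g0 // bin_width)
        if key not in best or gf > best[key][1]:
            best[key] = (g0, gf)
    return [best[key] for key in sorted(best)]


def fit_slope_through_origin(points):
    """Least-squares slope of y = slope * x (assumed linear form)."""
    sum_xy = sum(x * y for x, y in points)
    sum_xx = sum(x * x for x, _ in points)
    return sum_xy / sum_xx


def clear_sky_irradiance(g0, slope):
    """Clear sky irradiance estimated from the fitted envelope."""
    return slope * g0
```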

Regression between Real and Clear Sky Hourly PV Power Productions for the Clear Sky.
The scatter plots of Figure 2 show the relationship between the real (recorded) hourly PV power generation $P_r$ (vertical axis) and the clear sky hourly irradiance $G_{cs}$ (horizontal axis) corresponding to the PV plant location.
In Figure 2, for each value of the clear sky irradiance $G_{cs}$, there is a maximum value of real PV power generation, which is the clear sky PV power generation $P_{cs}$ for that irradiance value. This function, $P_{cs} = f(G_{cs})$, can be obtained by a polynomial regression. It is asymmetrical relative to the solar noon because of local shadowing, albedo, tracking system characteristics, and the orientation and inclination of the PV system. Therefore, two different regressions are needed, one for the hours before the solar noon and the other for the hours after the solar noon, as shown in Figures 2(a) and 2(b).

PV Power Production Attenuation Index.
The APVF forecasting model provides forecasts of hourly PV power production based on the relationship between the irradiance attenuation index, $k_G$, and the PV power production attenuation index, $k_P$, both of which are described in the following paragraphs. Each attenuation index is the quotient between the value for a specific forecasted meteorological condition and the corresponding value for the clear sky condition. Thus, the irradiance attenuation index $k_G$ is given by

$$k_G = \frac{G_f}{G_{cs}}, \tag{9}$$

where $G_f$ is the hourly irradiance forecasted by the NWP and $G_{cs}$ is the computed hourly irradiance for clear sky. Furthermore, the attenuation index for PV power production, $k_P$, is given by

$$k_P = \frac{P_r}{P_{cs}}, \tag{10}$$

where $P_r$ is the real (registered) hourly production of the PV plant and $P_{cs}$ is the corresponding hourly production for a clear sky condition.
The scatter plot of Figure 3 presents the real values of the PV power production attenuation index $k_P$ (vertical axis) against the irradiance attenuation index $k_G$ (horizontal axis). The relationship between $k_P$ and $k_G$, found by polynomial regression, is also shown in Figure 3.
A good correlation between the real PV power production $P_r$ and the irradiance $G_f$ forecasted by the NWP corresponds to a good correlation between the attenuation indexes $k_P$ and $k_G$. The points around the polynomial regression function presented in Figure 3 show high dispersion, reflecting a real-life case study of a PV plant with a relatively high meteorological forecast error.
Notice that the PV power production attenuation index $k_P$ as a function of the irradiance attenuation index $k_G$ can be modeled by a polynomial regression fixed at two points, ($k_G = 0$, $k_P = 0$) and ($k_G = 1$, $k_P = 1$), in the plot of Figure 3.

The hourly NWP forecasts are available for a 7-day horizon and are refreshed four times per day.
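A regression of the power production attenuation index on the irradiance attenuation index that is fixed at (0, 0) and (1, 1) can be sketched as follows. The degree of the polynomial is not stated in the text; a quadratic is assumed here purely for illustration. Writing it as k_P = a k_G^2 + (1 - a) k_G satisfies both fixed points for any value of the single free coefficient a, which is then obtained by least squares. The function names are hypothetical.

```python
def fit_constrained_quadratic(k_g_values, k_p_values):
    """Least-squares coefficient a of k_p = a*k_g**2 + (1 - a)*k_g.

    The curve passes through (0, 0) and (1, 1) by construction.
    """
    numerator = sum((kp - kg) * (kg * kg - kg)
                    for kg, kp in zip(k_g_values, k_p_values))
    denominator = sum((kg * kg - kg) ** 2 for kg in k_g_values)
    return numerator / denominator


def power_attenuation(k_g, a):
    """Power production attenuation index predicted by the fitted curve."""
    return a * k_g ** 2 + (1.0 - a) * k_g
```

Because only the deviation from the diagonal k_P = k_G is fitted, noisy data cannot pull the curve away from the two fixed points.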

Steps for the Forecasting Application of the APVF Model.
Step 2. Compute the value of the extraterrestrial irradiance $G_0$ for each hour of the forecasting horizon (different in each day of the year), for the location of the PV plant, according to Section 3.1.1.
Step 3. For each hour of the forecasting horizon, calculate the value of the irradiance for clear sky, $G_{cs}$, using the corresponding value of $G_0$ computed in the previous step, according to Section 3.1.2.
Step 4. For each hour of the forecasting horizon, compute the value of the irradiance attenuation index, $k_G$, using the value of the forecasted irradiance $G_f$ and the value of $G_{cs}$, since $k_G = G_f / G_{cs}$.
Step 5. For each hour of the forecasting horizon, obtain the value of the PV power production for clear sky, $P_{cs}$, using the value of $G_{cs}$ in the corresponding regression function, according to Section 3.1.3.
Step 6. For each hour of the forecasting horizon, determine the value of the power production attenuation index, $k_P$, using the value of $k_G$ in the corresponding polynomial regression, according to Section 3.1.4.
Step 7. Compute the value of the forecasted hourly PV power production as the product of $k_P$ and $P_{cs}$.
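The chain of Steps 3 to 7 for a single hour can be sketched as below. The three fitted regressions are passed in as plain functions, because their actual forms come from the parameterization of Section 3.1; the argument names, the handling of night hours, and the clipping of the irradiance attenuation index to the interval from 0 to 1 are assumptions made for this illustration. Separate regression functions would be supplied for hours before and after the solar noon.

```python
def apvf_forecast(g_f, g_0, clear_sky_irradiance, clear_sky_power, attenuation_curve):
    """Forecast the PV power for one hour from the NWP irradiance forecast.

    g_f: irradiance forecasted by the NWP model for that hour
    g_0: extraterrestrial irradiance for that hour (Step 2)
    The remaining arguments are the regressions fitted in Section 3.1.
    """
    g_cs = clear_sky_irradiance(g_0)            # Step 3
    if g_cs <= 0.0:
        return 0.0                              # no clear sky irradiance (assumed night hour)
    k_g = min(max(g_f / g_cs, 0.0), 1.0)        # Step 4 (clipping is an assumption)
    p_cs = clear_sky_power(g_cs)                # Step 5
    k_p = attenuation_curve(k_g)                # Step 6
    return k_p * p_cs                           # Step 7


# Usage with made-up regressions, for illustration only.
example = apvf_forecast(
    g_f=300.0,
    g_0=800.0,
    clear_sky_irradiance=lambda g0: 0.75 * g0,
    clear_sky_power=lambda g: 0.03 * g,
    attenuation_curve=lambda k: k,
)
```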

Multilayer Perceptron PV Forecasting Model (MPVF)
The MPVF model is based on a multilayer perceptron (MLP) neural network with two hidden layers, developed to provide hourly power production forecasts for a PV plant using the forecasts of weather variables obtained with a NWP model as inputs. The MPVF model was selected as the best one among a large set of potential models of different families (time series models, neural network models, and neurofuzzy models). The selection process is described in detail in [22], where the MPVF is named the MLP 2A model. The next paragraphs briefly describe the MPVF model. The data used for the development of the MPVF model were hourly PV power generation values obtained from the PV plant under study and forecasts of weather variables for the location of the PV plant for the next day. The weather variables selected were those related to radiation and atmospheric heat transfer at terrain surface level. These forecasts were obtained with the NWP model for horizons covering all the daylight hours of the following day.

Table 1 (time-related input variables): sine of the elapsed fraction of the year; cosine of the elapsed fraction of the year (input 9); sine of the elapsed fraction of the day (input 10); cosine of the elapsed fraction of the day (input 11).
The available input variables include seven forecasted weather variables, obtained with the NWP model, and four variables representing the moment of the year and the moment of the day corresponding to the forecasting horizon. Table 1 summarizes the available input variables. The model had a single output variable: the forecasted hourly PV power production.
Notice that the variable "surface downward shortwave radiation" (of Table 1) is the forecasted hourly irradiance $G_f$ mentioned above, in Section 3.
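The four time-related inputs of Table 1 can be computed as in the following sketch. It assumes that each elapsed fraction is mapped to an angle by multiplying it by 2π, which is the usual way to encode a cyclic quantity so that the end of a period is adjacent to its start; the text does not state the exact scaling. The function name is illustrative.

```python
import math
from datetime import datetime


def cyclic_time_features(ts):
    """Return (sin_year, cos_year, sin_day, cos_day) for a timestamp."""
    year_start = datetime(ts.year, 1, 1)
    next_year_start = datetime(ts.year + 1, 1, 1)
    # Dividing two timedeltas yields a float fraction (handles leap years).
    year_fraction = (ts - year_start) / (next_year_start - year_start)
    seconds_into_day = ts.hour * 3600 + ts.minute * 60 + ts.second
    day_fraction = seconds_into_day / 86400.0
    year_angle = 2.0 * math.pi * year_fraction
    day_angle = 2.0 * math.pi * day_fraction
    return (math.sin(year_angle), math.cos(year_angle),
            math.sin(day_angle), math.cos(day_angle))
```

With this encoding, 23:00 and 00:00 are close in feature space, which a raw hour number would not achieve.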
The data were divided into three sets (training, cross-validation, and testing data sets), as described in the case study of Section 5 of this paper. The cross-validation data set was used to avoid overtraining. The MPVF is the final MLP model obtained after an optimization process ruled by a genetic algorithm. The optimization process could choose the input variables (among those available), the number of neurons in the two hidden layers, and the parameters of the training algorithm used (back-propagation). The number of generations was 50, equal to the population size. The crossover and mutation rates were set to 0.9 and 0.02, respectively. The transfer function was the hyperbolic tangent for the neurons of the hidden layers and a linear function for the output neuron.
The MPVF model used the eleven available input variables and had 7 neurons in each of the two hidden layers. A sensitivity analysis on the MPVF model revealed that the most relevant input variables of the model were the forecasted surface downward shortwave radiation and surface temperature.
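The resulting architecture (eleven inputs, two hidden layers of 7 neurons with hyperbolic tangent transfer functions, and one linear output neuron) can be illustrated with a plain forward pass. The weights below are random placeholders, not the trained MPVF weights, and the helper names are hypothetical; training by back-propagation and the genetic-algorithm search are not shown.

```python
import math
import random


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b) for each neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def init_layer(n_in, n_out, rng):
    """Random placeholder weights and zero biases for a layer."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def mlp_forward(x, layers):
    """Two tanh hidden layers followed by a single linear output neuron."""
    hidden1 = dense(x, layers[0][0], layers[0][1], math.tanh)
    hidden2 = dense(hidden1, layers[1][0], layers[1][1], math.tanh)
    output = dense(hidden2, layers[2][0], layers[2][1], lambda v: v)
    return output[0]


rng = random.Random(0)
layers = [init_layer(11, 7, rng), init_layer(7, 7, rng), init_layer(7, 1, rng)]
prediction = mlp_forward([0.1] * 11, layers)
```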

Case Study
This section describes the numerical results achieved with the two forecasting models (APVF and MPVF), that is, their forecasts of the hourly PV power generation for a real-life grid-connected photovoltaic plant.

Characteristics of the PV Plant.
The rated power of the PV plant is 36 kWp, and it is composed of several photovoltaic panels of different technologies (single fixed panels and tracking systems with one and two axes). The PV plant is located in the region of La Rioja (Spain).

Characteristics of Data.
Hourly electric power generation data of the PV plant, corresponding to one year, were recorded with measurement equipment placed at its location. Furthermore, forecasts of hourly irradiance, temperature, and the other meteorological variables (mentioned in Section 4) were obtained from the NWP model.
In order to create the MPVF model, the available data were classified into three sets: the training set, with 60% of the data; the cross-validation set, with 20% of the data; and the testing set, with 20% of the data. The cross-validation data set was used to stop the training process when the error on this set began to increase (early stopping), and the testing set was used for comparing the two forecasting models of this paper.
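The 60/20/20 partition and the early stopping rule can be sketched as follows. The text does not say whether the partition was chronological or random, nor how many non-improving epochs were tolerated; a chronological split and a configurable `patience` are assumed here, and both function names are illustrative.

```python
import math


def split_indices(n_samples, train_fraction=0.6, validation_fraction=0.2):
    """Split sample indices into training, cross-validation and testing sets."""
    n_train = round(n_samples * train_fraction)
    n_val = round(n_samples * validation_fraction)
    indices = list(range(n_samples))
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])


def early_stopping_epoch(validation_errors, patience=1):
    """Return the epoch with the lowest validation error seen before
    the error failed to improve for `patience` consecutive epochs."""
    best_epoch, best_error, waited = 0, math.inf, 0
    for epoch, error in enumerate(validation_errors):
        if error < best_error:
            best_epoch, best_error, waited = epoch, error, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch
```

The weights kept for the final model would be those of the returned epoch, so that later epochs that only fit the training set better are discarded.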

Evaluation Criteria.
In this section, we define the error indexes used to assess the performance of the APVF and MPVF forecasting models.
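The RMS error (RMSE) in percent, defined by (11), is the root mean square value of the forecasting errors of a model, calculated with respect to the rated power of the PV plant:

$$\mathrm{RMSE} = \frac{100}{P_{\mathrm{rated}}} \sqrt{\frac{1}{N} \sum e(t+k \mid t)^2}, \tag{11}$$

where $P_{\mathrm{rated}}$ is the rated power of the PV plant, $e$ represents the forecasting error (difference between the real and the forecasted value of hourly power), $k$ is the forecasting horizon (in hours), $e(t+k \mid t)$ is the forecasting error for the horizon $k$ when the forecast process is carried out at instant $t$, and $N$ is the total number of samples in the testing set, over which the sum extends.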
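The MA error (MAE) in percent, given by (12), is the mean absolute error of the predictions of a forecasting model, calculated with respect to the rated power of the PV plant, $P_{\mathrm{rated}}$:

$$\mathrm{MAE} = \frac{100}{P_{\mathrm{rated}}} \cdot \frac{1}{N} \sum \left| e(t+k \mid t) \right|, \tag{12}$$

where $e(t+k \mid t)$ is the forecasting error and the sum extends over the $N$ samples of the testing set.

The DEMA error (DEMAE) in percent, defined by (13), is the daily energy forecast mean absolute error of the predictions of a forecasting model, with respect to the daily real recorded electric energy:

$$\mathrm{DEMAE} = \frac{100}{D} \sum_{d=1}^{D} \frac{\left| \mathrm{rde}_d - \mathrm{fde}_d \right|}{\mathrm{rde}_d}, \tag{13}$$

where rde corresponds to the real daily energy produced in the PV plant; fde is the forecasted one; and $D$ corresponds to the number of days in the testing set.

Lastly, the percentage error (PE) in percent, given by (14), is the difference between the real and forecasted values of hourly energy production (hourly average power), $P_r$ and $P_f$, with respect to the rated power of the PV plant:

$$\mathrm{PE} = 100 \cdot \frac{P_r - P_f}{P_{\mathrm{rated}}}. \tag{14}$$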
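The four error indexes can be computed as in the sketch below, which follows the verbal definitions above. The per-day normalization used for the DEMAE (each daily absolute error divided by the real energy of the same day, then averaged) is one reading of that definition and should be treated as an assumption; the function names are illustrative.

```python
import math


def rmse_percent(real, forecast, rated_power):
    """RMS forecasting error as a percentage of the rated power."""
    squared = [(r - f) ** 2 for r, f in zip(real, forecast)]
    return 100.0 / rated_power * math.sqrt(sum(squared) / len(squared))


def mae_percent(real, forecast, rated_power):
    """Mean absolute forecasting error as a percentage of the rated power."""
    absolute = [abs(r - f) for r, f in zip(real, forecast)]
    return 100.0 / rated_power * sum(absolute) / len(absolute)


def demae_percent(real_daily_energy, forecast_daily_energy):
    """Mean absolute daily energy error relative to the real daily energy."""
    ratios = [abs(r - f) / r
              for r, f in zip(real_daily_energy, forecast_daily_energy)]
    return 100.0 * sum(ratios) / len(ratios)


def percentage_errors(real, forecast, rated_power):
    """Signed hourly errors (real minus forecast) in percent of rated power."""
    return [100.0 * (r - f) / rated_power for r, f in zip(real, forecast)]
```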

Analysis of Computer Results.
This section analyzes and discusses the computer results obtained for the real-life case study presented above.
In order to carry out suitable comparisons, the computer results from a "reference model" were used. This reference model is a fully analytical model that does not use any kind of meteorological forecast. The reference model provides the hourly clear sky PV power production $P_{cs}$ adjusted to match the annual real recorded PV electric energy. The hourly power generation prediction value $P_{\mathrm{ref}}$ of the reference model is given by

$$P_{\mathrm{ref}} = P_{cs} \cdot \frac{E_r}{E_{cs}}, \tag{15}$$

where $P_{cs}$ is the hourly clear sky PV power production (obtained from Step 5 of Section 3.2 of this paper); $E_r$ is the annual real (recorded) PV electric energy production; and $E_{cs}$ is the annual PV electric energy production for clear sky, that is, the annual electric energy corresponding to the hourly clear sky power production $P_{cs}$ along one year. Table 2 shows the RMSE and MAE errors of the APVF and MPVF models, as well as those of the reference model, for the testing data of the case study.
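The reference model amounts to a single rescaling of the clear sky production series, as the following sketch shows. It assumes hourly values, so that summing the hourly clear sky power over the year gives the annual clear sky energy; the function name is illustrative.

```python
def reference_forecast(clear_sky_power_hourly, annual_real_energy):
    """Scale the hourly clear sky power so that its annual energy matches
    the annual real (recorded) energy of the plant."""
    # With hourly average power values, energy per hour equals power * 1 h.
    annual_clear_sky_energy = sum(clear_sky_power_hourly)
    scale = annual_real_energy / annual_clear_sky_energy
    return [p * scale for p in clear_sky_power_hourly]
```

The rescaled series has the same hourly shape as the clear sky production, which is why this model cannot react to cloudy days.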
Therefore, the forecasting models of this paper achieve significantly lower errors than the reference model. The improvement in RMSE, improveRMSE(%), and the improvement in MAE, improveMAE(%), can be calculated by (16) and (17):

$$\mathrm{improveRMSE}(\%) = 100 \cdot \frac{\mathrm{RMSE}_{\mathrm{refmo}} - \mathrm{RMSE}_{\mathrm{formo}}}{\mathrm{RMSE}_{\mathrm{refmo}}}, \tag{16}$$

$$\mathrm{improveMAE}(\%) = 100 \cdot \frac{\mathrm{MAE}_{\mathrm{refmo}} - \mathrm{MAE}_{\mathrm{formo}}}{\mathrm{MAE}_{\mathrm{refmo}}}, \tag{17}$$

where RMSErefmo and MAErefmo are the RMSE and the MAE of the reference model, and RMSEformo and MAEformo are the RMSE and the MAE of the forecasting (APVF or MPVF) model.

Figure 4 gives the histogram of percentage errors (PEs) of both forecasting models of this paper, only for the hours with electrical production of the PV plant (daylight hours). Both models present a well-centered error with an approximate variance of 10%. The variance of the MPVF model is slightly higher, because this model consists of an artificial neural network (ANN) trained by minimizing the squared errors, whereas the APVF model does not use such a minimization.
Figures 5 and 6 show the hourly electrical production values of the forecasts, for several consecutive days of different meteorological characteristics (Figure 5 for clear sky days, and Figure 6 for cloudy days), obtained by the two models (APVF and MPVF models), as well as the real (recorded) power generation in the PV plant.All the days represented in the figures correspond to the testing data set.
Figure 7 shows the values of the forecast RMS error (RMSE), with the test data set, for both forecasting models on an hourly basis. The values are similar for the two models and range from 0% to 23.7%, with some differences between them in the hours of higher PV power production.
Furthermore, Figure 8 shows the values of the forecast MA error (MAE) of the forecasting models on an hourly basis. They range from 0% to 19.64%, also showing some differences between the models in the hours of higher power production. The highest error values correspond to hours 13 and 14.
Figure 9 shows the values of the forecast RMSE for both forecasting models (vertical axis) against the irradiance attenuation index $k_G$ (horizontal axis): similar RMSE values of the two models can be observed, ranging from 0% to 20%.
Additionally, Figure 10 shows the values of the forecast MAE for the models (vertical axis) against the irradiance attenuation index $k_G$ (horizontal axis); the MAE values are similar for the two models and range from 0% to 15.5%. In Figures 9 and 10, the errors appear to decrease slightly for meteorological conditions of clearer sky days.
Table 3 shows the daily energy forecast error DEMAE.
The improvements in DEMAE, improveDEMAE(%), of the forecasting models with respect to the reference model, calculated by (18), are 23.26% and 19.64% for the APVF and MPVF models, respectively:

$$\mathrm{improveDEMAE}(\%) = 100 \cdot \frac{\mathrm{DEMAE}_{\mathrm{refmo}} - \mathrm{DEMAE}_{\mathrm{formo}}}{\mathrm{DEMAE}_{\mathrm{refmo}}}, \tag{18}$$

where DEMAErefmo is the DEMAE of the reference model and DEMAEformo is the DEMAE of the forecasting (APVF or MPVF) model. These satisfactory results indicate the quality of both forecasting models. They also show the potential of the models to provide suitable hourly PV power production forecasts for practical applications, such as PV electric energy production sale bids to day-ahead electricity markets and maintenance tasks of PV plants.
Obviously, the APVF and MPVF forecasting models can be easily developed for any other PV plant using historical data (past values of hourly power production and forecasts of weather variables) corresponding to such PV plant.
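The relative improvements over the reference model used in this section all share the same form, which the sketch below makes explicit. The function name is illustrative. The example values are the RMS errors quoted in the Conclusions (28.59% for the reference model, and 11.95% and 12.10% for the two forecasting models, without attributing either value to a specific model); they correspond to improvements of roughly 58%.

```python
def improvement_percent(reference_error, model_error):
    """Relative reduction of an error index with respect to the reference model."""
    return 100.0 * (reference_error - model_error) / reference_error


# RMS errors quoted in the Conclusions, in percent of rated power.
improvement_low_rmse = improvement_percent(28.59, 11.95)
improvement_high_rmse = improvement_percent(28.59, 12.10)
```

A positive value means that the forecasting model has a lower error than the reference model; a value of zero would mean no gain from using the NWP forecasts.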

Conclusions
This paper describes and compares two short-term statistical forecasting models (APVF and MPVF models) for hourly electrical production of any PV plant, which use meteorological forecasts from numerical weather prediction (NWP) models.Such meteorological forecasts obviously correspond to the geographical location of the PV plant.Furthermore, both PV forecasting models also utilize past recorded values of hourly PV electric power production.
The NWP model provides hourly meteorological predictions (irradiance), at the location of the PV plant, as input for the forecasting APVF model, and it also supplies a more complete set of hourly meteorological predictions (irradiance, temperature, and other meteorological variables), as inputs for the forecasting MPVF model.
The APVF model essentially consists of an original analytical approach in which clear sky irradiation data are adjusted by an irradiation attenuation index, combined with a PV power production attenuation index. On the other hand, the MPVF forecasting model uses the best artificial neural network selected among a large set of ANNs optimized by GAs. Thus, the two forecasting models of this paper are very different. These photovoltaic power forecasting models, developed by two independent research teams, have been applied to a real-life case study of a grid-connected PV plant using the same training, cross-validation, and testing data. Both models achieve very similar and satisfactory results, with RMS errors that vary from 11.95% to 12.10% for the case study, which are significantly better than the RMS error (28.59%) of the reference model also described in this paper.
Therefore, the application of these two new PV forecasting models is very advantageous with respect to the utilization of the reference model.
Lastly, both PV forecasting models can be used by PV plant owners for hourly electric energy sale bids (based on forecasted hourly PV production) to day-ahead electricity markets. Usually, a PV power producer is penalized more heavily when a larger deviation (error) exists between the real power production and the electric energy sale bids of the PV plant. Since the APVF and MPVF models achieve satisfactory forecasting errors, they can contribute to reducing the economic penalties applied to the PV plant owner's remuneration and, therefore, to increasing the owner's net profits. Furthermore, the APVF and MPVF models of this paper are also useful for scheduling maintenance tasks in PV plants.
Ongoing research aims to improve the APVF and MPVF models in order to achieve better PV power generation forecasts.

Figure 3: Relationship between the irradiance attenuation index and the PV power production attenuation index.

Figure 6: Hourly electrical production values of the forecasts for cloudy days.

Figure 8: MAE of the forecasting models on an hourly basis.

Figure 10: MAE of the forecasting models versus irradiance attenuation index.

Table 2: RMSE and MAE forecast errors.