The intermittent and fluctuating character of solar irradiance places severe limitations on most of its applications. Precise forecasting of solar irradiance is the critical factor in predicting the output power of a photovoltaic power generation system. In the present study, Model I-A and Model I-B, based on the traditional long short-term memory (LSTM) network, are discussed, and the effects of different parameters are investigated; meanwhile, Model II-AC, Model II-AD, Model II-BC, and Model II-BD, based on a novel LSTM-MLP structure with two-branch input, are proposed for hour-ahead solar irradiance prediction. Different lagging time parameters and different main-input and auxiliary-input parameters are discussed and analyzed. The proposed method is verified on real data covering 5 years. The experimental results demonstrate that Model II-BD shows the best performance because it considers the weather information of the next moment: the root mean square error (RMSE) is 62.1618 W/m2, the normalized root mean square error (nRMSE) is 32.2702%, and the forecast skill (FS) is 0.4477. The proposed algorithm is 19.19% more accurate than the backpropagation neural network (BPNN) in terms of RMSE.
Funding: National Natural Science Foundation of China (61875171, 61865015, 61705192); Natural Science Foundation of Yunnan Province (2017FD069).

1. Introduction
Along with the rapid increase of solar power generation, more and more solar power is connected to the grid, which has already shown a substantial economic impact. According to the statistics of the International Renewable Energy Agency (IRENA), the total installed PV capacity in China had reached 205.493 GW by the end of 2019 [1]. However, power generation from photovoltaic systems is highly variable due to its dependence on meteorological conditions, and this fluctuation poses a severe challenge to the security of the power grid. Therefore, an effective method of solar irradiance forecasting can mitigate intermittency, as it gives information about future trends and allows users to make decisions beforehand.
Solar forecasting is a timely topic, and several short-term solar irradiance forecasting approaches have been presented recently. Broadly, prediction methods can be divided into five categories [2]: (1) time series; (2) regression; (3) numerical weather prediction; (4) image-based forecasting; and (5) machine learning. A time series is a sequence of observations taken sequentially in time; time series forecasting models are divided into stationary and nonstationary ones. Autoregressive (AR), moving average (MA), and autoregressive moving average (ARMA) models are commonly used to forecast stationary trends; integrated moving average (IMA), autoregressive integrated moving average (ARIMA), seasonal autoregressive integrated moving average (SARIMA), and other models are used to forecast nonstationary trends [3–6]. Regression is a statistical process for estimating the relationships among variables; it is a handy tool to describe the relationship between solar irradiance and exogenous variables [7, 8]. Numerical weather prediction (NWP) models directly simulate the irradiance fluxes at multiple levels in the atmosphere, separately considering the shortwave and longwave parts of the solar spectrum [9, 10]. Image-based forecasting uses satellite cloud images and all-sky images as main or auxiliary data sources to forecast irradiance. This can effectively increase forecasting skill, as it provides warning of approaching clouds at a lead time of several minutes to hours [11–13]. The machine learning method, as a branch of artificial intelligence, can learn from datasets and construct a nonlinear mapping between input and output data. Nowadays, machine learning (ML) is perhaps the most popular approach in solar forecasting and load forecasting [2].
Although artificial neural networks (ANNs) and support vector machines (SVMs) are still the basis of machine learning methods in solar irradiance prediction, many other approaches have been used recently, such as k-nearest neighbors (kNN), random forest (RF), gradient boosted regression (GBR), hidden Markov models (HMMs), fuzzy logic (FL), wavelet networks (WNN), and long short-term memory networks (LSTM) [14–22]. Meanwhile, some hybrid algorithms are used to improve prediction accuracy. For example, metaheuristic algorithms, such as the cuckoo search (CS) algorithm, the krill herd (KH) algorithm, and the chaotic immune algorithm, have been combined with a support vector regression (SVR) model to predict electric load [19, 23–26]. Some signal preprocessing methods, such as the variational mode decomposition (VMD) method and empirical mode decomposition (EMD), are also used in hybrid models [24, 25]. The methods mentioned above are, of course, not an exhaustive list; many other applications of machine learning algorithms in solar radiation prediction can be found in the recent literature [27].
As a novel machine learning tool, LSTM has been successfully applied to solar irradiance forecasting [28–30]. Owing to its memory-cell structure, it can preserve the important features that should be remembered during the learning process and thereby improve performance. Therefore, using LSTM to predict irradiance can not only capture the correlation across continuous hours but also extract long-term (e.g., seasonal) behavior trends [30]. Yu et al. [29] proposed an LSTM-based approach for short-term global horizontal irradiance (GHI) prediction under complicated weather conditions; the results indicated that LSTM outperforms ARIMA, SVR, and NN models, especially on cloudy and mixed days. Qing and Niu [30] proposed a novel hourly day-ahead solar irradiance prediction method using weather forecasts based on LSTM networks. Their algorithm uses the hourly weather forecasts of the same day and the data information at the predicted time as the input variables, and the hourly irradiance values of the same anticipated day are taken as the output variable. Experimental results show that the proposed learning algorithm is more accurate than persistence, linear least squares regression (LR), and BPNN due to the consideration of time dependence. Srivastava and Lessmann [28] studied the ability of LSTM to predict solar irradiance, demonstrated its robustness, and showed that an optimally configured LSTM model outperforms GBR and FFNN for day-ahead GHI forecasting. Abdel-Nasser and Mahmoud [31] proposed a method based on LSTM to accurately forecast the output power of PV systems. Liu et al. [32–34] proposed a new hybrid approach for high-accuracy wind speed prediction based on decomposition algorithms (such as the secondary decomposition algorithm (SDA), empirical wavelet transform (EWT), and VMD) and LSTM networks.
However, the LSTM methods mentioned above do not deeply study the effects of different parameters and structures on the experimental results, although these factors affect prediction accuracy. In this paper, two models based on the traditional LSTM network are applied and the effects of various parameters are investigated; meanwhile, four models based on a novel LSTM-MLP structure with two-branch input are proposed. For the new LSTM-MLP model, we use historical irradiance (or historical irradiance and meteorological parameters) as the main input and the meteorological parameters at the current time or the next time as the auxiliary input to predict the irradiance at the next time through the multilayer LSTM-MLP network. Experimental results show that the proposed model achieves better prediction results.
The main innovations of this study are as follows: (1) An LSTM-MLP structure with two branches, including a main input and an auxiliary input, is proposed, which can provide a reference for similar models. (2) It is confirmed that the lagging time plays an important role when the input variables of the LSTM model are few; however, with richer input information, more lagging parameters do not necessarily yield higher accuracy. (3) The meteorological parameters at the next moment, which can be obtained from the weather forecast, play a vital role in prediction accuracy.
The organization of this paper is as follows: The methodology is described in detail in Section 2. Section 3 provides information about the dataset. Experimental results and discussion are presented in Section 4. Finally, conclusions are given in Section 5.
2. Method

2.1. Long Short-Term Memory Network
In the learning phase, the traditional neural network cannot use the information learned in the previous time step to model the data of the current step. This is the main shortcoming of conventional neural networks. RNNs attempt to solve this problem by using loops that pass information from one step of the network to the next, ensuring the persistence of the information. In other words, the RNNs connect the previous information to the current task. Using previous sequence samples may help to understand the current sample.
The LSTM network, which has time-varying inputs and targets, is a special RNN initially introduced by Hochreiter and Schmidhuber [35]. Due to its excellent ability to solve the long-term and short-term dependency problem, the LSTM network often has satisfactory performance in processing time series. A general architecture is composed of a cell (the memory part of the LSTM unit) and three "regulators" (usually called gates) that control the flow of information inside the LSTM unit: an input gate, an output gate, and a forget gate. The memory unit is an essential component of the LSTM network, which can store information over arbitrary time intervals. The input gate, forget gate, and output gate control the actual input signal by adding or deleting information from the cell state.
A schematic of the LSTM block can be seen in Figure 1. Every time a new input comes, its information will be accumulated to the cell if the input gate is activated. The prior cell status could be forgotten in this process if the forget gate is activated. Whether the latest cell output will be propagated to the final state is further controlled by the output gate.
Detailed schematic of the long short-term memory block.
The model input is denoted as x = (x_1, x_2, …, x_T), and the output sequence is denoted as y = (y_1, y_2, …, y_T), where T is the prediction period. In the context of solar irradiance forecasting, x can be considered historical input data (e.g., irradiance and meteorological parameters), and y is the forecast data. The predicted irradiance is iteratively calculated by the following equations [36]:

i_t = σ(W_ix x_t + W_im m_(t−1) + W_ic c_(t−1) + b_i),  (1)
f_t = σ(W_fx x_t + W_fm m_(t−1) + W_fc c_(t−1) + b_f),  (2)
c_t = f_t ⊙ c_(t−1) + i_t ⊙ g(W_cx x_t + W_cm m_(t−1) + b_c),  (3)
o_t = σ(W_ox x_t + W_om m_(t−1) + W_oc c_(t−1) + b_o),  (4)
m_t = o_t ⊙ h(c_t),  (5)
y_t = W_ym m_t + b_y,  (6)

where i_t denotes the input gate, f_t the forget gate, c_t the activation vector of each cell, o_t the output gate, and m_t the activation vector of each memory block; the W terms are weight matrices; the b terms are bias vectors; "⊙" denotes the elementwise (Hadamard) product of two vectors; and σ(·) denotes the standard logistic sigmoid function defined as

σ(x) = 1 / (1 + e^(−x)).  (7)
g(·) is a centered logistic sigmoid function with range (−2, 2), defined as

g(x) = 4 / (1 + e^(−x)) − 2,  g(x) ∈ (−2, 2).  (8)

h(·) is a centered logistic sigmoid function with range (−1, 1), defined as

h(x) = 2 / (1 + e^(−x)) − 1,  h(x) ∈ (−1, 1).  (9)
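To make the update rule concrete, Eqs. (1)–(9) can be transcribed as a single LSTM step in NumPy. This is an illustrative sketch, not the Keras implementation used later in the paper: the dimensions and random weight initialization are hypothetical, and the peephole weights W_ic, W_fc, and W_oc are taken to be elementwise (diagonal), as is conventional.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def g(x):  # centered sigmoid with range (-2, 2), Eq. (8)
    return 4.0 * sigmoid(x) - 2.0

def h(x):  # centered sigmoid with range (-1, 1), Eq. (9)
    return 2.0 * sigmoid(x) - 1.0

def lstm_step(x, m_prev, c_prev, W, b):
    """One LSTM step following Eqs. (1)-(6); peepholes are elementwise."""
    i = sigmoid(W["ix"] @ x + W["im"] @ m_prev + W["ic"] * c_prev + b["i"])
    f = sigmoid(W["fx"] @ x + W["fm"] @ m_prev + W["fc"] * c_prev + b["f"])
    c = f * c_prev + i * g(W["cx"] @ x + W["cm"] @ m_prev + b["c"])
    o = sigmoid(W["ox"] @ x + W["om"] @ m_prev + W["oc"] * c + b["o"])
    m = o * h(c)
    y = W["ym"] @ m + b["y"]
    return y, m, c

# Hypothetical sizes: 5 input features, 8 hidden units, scalar output.
rng = np.random.default_rng(0)
nx, nh = 5, 8
W = {k: 0.1 * rng.standard_normal(s) for k, s in {
    "ix": (nh, nx), "im": (nh, nh), "ic": nh,
    "fx": (nh, nx), "fm": (nh, nh), "fc": nh,
    "cx": (nh, nx), "cm": (nh, nh),
    "ox": (nh, nx), "om": (nh, nh), "oc": nh,
    "ym": (1, nh)}.items()}
b = {k: np.zeros(s) for k, s in
     {"i": nh, "f": nh, "c": nh, "o": nh, "y": 1}.items()}
y, m, c = lstm_step(rng.standard_normal(nx), np.zeros(nh), np.zeros(nh), W, b)
```

Note that because |h(c_t)| < 1 and o_t < 1, the block output m_t is always bounded in (−1, 1), which is why a final affine layer (Eq. (6)) is needed to map it back to the irradiance scale.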
2.2. Model Development
As previously mentioned, the primary objective of this study is to examine the feasibility of the LSTM network for short-term solar irradiance forecasting and to find the optimal LSTM structure for the forecast. In this section, the standard LSTM solar irradiance forecasting pipeline is first introduced. Then, a classical LSTM model with two input structures and a novel model with four different input structures are constructed to evaluate the performance of the LSTM network.
Figure 2 presents a standard pipeline for solar irradiance forecasting with LSTM. The data is divided into training, validation, and test sets. Forward and backward passes are used to process the data and train the network; the error computed after each forward pass describes the training accuracy and drives the backward pass. At the final stage, a successful model is selected for prediction.
The standard LSTM solar irradiance forecasting pipeline.
The structure of the conventional LSTM model (which we call Model I) for solar irradiance forecasting can be seen in Figure 3. The network contains 1 input layer, 2 LSTM layers (or 1 LSTM layer), and 1 output layer. The input layer has two variants: input A is historical irradiance data, and input B is historical irradiance and meteorological parameter data. These structures are denoted I-A and I-B. For input A (I-A), the historical irradiance at times t−1, t−2, …, t−m is fed to LSTM layer 1; for input B (I-B), the historical irradiance and meteorological parameters at times t−1, t−2, …, t−m are fed to LSTM layer 1, where m is the length of the lagging window in time.
The framework of the traditional LSTM Model I.
Meanwhile, the novel LSTM-MLP structure (named Model II) is proposed in Figure 4. A two-branch structure is designed, including one main input, one auxiliary input, one main output, and one auxiliary output. Historical irradiance (or irradiance together with meteorological parameters) serves as the main input, which is fed to the LSTM layers. When the data leaves the LSTM layer, one part is emitted as the auxiliary output, and the other part is concatenated with the meteorological parameters (auxiliary input) at the current or next time and sent to an MLP structure. After several MLP hidden layers, the final output is the main output: the predicted irradiance value at the next time.
The framework of the proposed LSTM Model II.
The simplified expression of the above operation is as follows:

y_aux = F_LSTM(x_main),
h_LSTM = F_LSTM(x_main),
h_dense = dense(h_LSTM ⊕ x_aux),
y_main = F_MLP(h_dense),  (10)

where x_main represents the main input, i.e., the time series of historical irradiance (possibly together with historical meteorological parameters); F_LSTM denotes the LSTM layers; h_LSTM represents the output of the LSTM layers; x_aux denotes the auxiliary input described in Figure 4; ⊕ denotes the concatenation operator; dense denotes the fully connected layer; F_MLP denotes the MLP layers; and y_aux and y_main denote the auxiliary output and main output, respectively.
As can be seen in Figure 4, there are two input options each for the main input and the auxiliary input. According to the different combinations of main inputs A and B and auxiliary inputs C and D in Figure 4, the model can be expressed as II-AC, II-AD, II-BC, or II-BD. In order to find better network parameters, six experiments are designed with the two models mentioned above. These are (1) Model I-A; (2) Model I-B; (3) Model II-AC; (4) Model II-AD; (5) Model II-BC; and (6) Model II-BD, and for each of them the influence of different lagging time parameters (from Lagging 1 to Lagging 12) is discussed.
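The two-branch Model II can be sketched with the Keras functional API, which the study reports using. This is a sketch under assumptions, not the authors' exact code: the layer sizes follow the II-AC/II-AD configuration later reported in Table 7 (one 32-unit LSTM layer and 64-32 MLP hidden layers), the sigmoid output activation follows Section 3, while the ReLU hidden activations, the flat auxiliary-input shape, and the loss weighting between the two outputs are assumptions the paper does not specify.

```python
from tensorflow import keras
from tensorflow.keras import layers

m_lag, n_feat, n_aux = 12, 5, 3  # lagging window, main features, auxiliary features

# Main branch: the lagged history passes through the LSTM layer.
main_in = keras.Input(shape=(m_lag, n_feat), name="main_input")
h_lstm = layers.LSTM(32)(main_in)
aux_out = layers.Dense(1, name="aux_output")(h_lstm)

# Auxiliary branch: weather at the current/next time is concatenated in,
# then passed through the 64-32 MLP to produce the main output.
aux_in = keras.Input(shape=(n_aux,), name="aux_input")
x = layers.concatenate([h_lstm, aux_in])
x = layers.Dense(64, activation="relu")(x)
x = layers.Dense(32, activation="relu")(x)
main_out = layers.Dense(1, activation="sigmoid", name="main_output")(x)

model = keras.Model(inputs=[main_in, aux_in], outputs=[main_out, aux_out])
model.compile(optimizer="adam", loss="mse",
              loss_weights=[1.0, 0.2])  # auxiliary loss weight is an assumption
```

With irradiance normalized to [0, 1], `model.fit([X_main, X_aux], [y, y], epochs=200)` would reproduce the training setup described in Section 3.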
Figure 5 shows the input (or main input) time series structure of the training samples. S(t) is the current data point, n is the number of training samples, and m is the number of input data points in each group, i.e., the lagging time (the length of the lagging window). For example, we use S(t−m), S(t−m+1), …, S(t−2), S(t−1) as the training input and S(t) as the training output. Then the data are shifted: the input becomes S(t−m−1) to S(t−2), the output is S(t−1), and so on.
Input (or main input) time series structure of the train samples.
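The windowing scheme in Figure 5 is the standard sliding-window construction, which can be sketched in a few lines (the function name is hypothetical):

```python
import numpy as np

def make_windows(series, m):
    """Split a series into inputs S(t-m)..S(t-1) and target S(t) per window."""
    series = np.asarray(series, dtype=float)
    X = np.array([series[i:i + m] for i in range(len(series) - m)])
    y = series[m:]
    return X, y

# A series of length 10 with a lagging window of m = 3 yields 7 samples.
X, y = make_windows(range(10), m=3)
```

For a series of length n + m this produces n training pairs, so widening the lagging window both enlarges each input vector and slightly reduces the number of usable samples.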
2.3. Forecasting Accuracy Evaluation
To assess the prediction performance of the involved models, five error measures are utilized in the forecasting experiments: the root mean square error (RMSE), the normalized root mean square error (nRMSE), the mean absolute error (MAE), the mean bias error (MBE), and R (Pearson's correlation coefficient).
These indexes are defined as follows:

RMSE = sqrt( (1/N) Σ_{i=1}^{N} (Y′_i − Y_i)² ),
nRMSE = RMSE / Ȳ,
MAE = (1/N) Σ_{i=1}^{N} |Y_i − Y′_i|,
MBE = (1/N) Σ_{i=1}^{N} (Y′_i − Y_i),
R = Σ_{i=1}^{N} (Y_i − Ȳ)(Y′_i − Ȳ′) / sqrt( Σ_{i=1}^{N} (Y_i − Ȳ)² · Σ_{i=1}^{N} (Y′_i − Ȳ′)² ),  (11)

where N denotes the number of testing instances, Y′_i denotes the predicted value of the models, Ȳ′ denotes the mean value of Y′_i, Y_i denotes the measured value, and Ȳ denotes the mean value of Y_i.
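The five measures of Eq. (11) translate directly into NumPy; a minimal sketch (reporting nRMSE in percent, as the paper does):

```python
import numpy as np

def metrics(y_true, y_pred):
    """RMSE, nRMSE (%), MAE, MBE, and Pearson R per Eq. (11)."""
    y_true, y_pred = np.asarray(y_true, float), np.asarray(y_pred, float)
    err = y_pred - y_true
    rmse = np.sqrt(np.mean(err ** 2))
    return {
        "RMSE": rmse,
        "nRMSE": 100.0 * rmse / y_true.mean(),
        "MAE": np.mean(np.abs(err)),
        "MBE": np.mean(err),
        "R": np.corrcoef(y_true, y_pred)[0, 1],
    }

m = metrics([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Note that MBE keeps the sign of the error, so over- and underprediction can cancel; it indicates bias rather than accuracy.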
Besides, forecast skill (FS) is an indicator that compares a selected model with a reference model (usually the persistence model), regardless of prediction horizon and location [37, 38], which makes it a fair-minded approach to evaluating performance in solar irradiance prediction, as described by the following equation [2]:

FS = 1 − nRMSE / nRMSE_persistence.  (12)
The persistence model is one of the most basic prediction models and is often applied to benchmark other prediction models. Its definitions vary; this paper adopts the most basic one, which assumes that the predicted value at the next time equals the present value [39, 40]:

GHI(t + Δt) = GHI(t).  (13)
To further evaluate the performance of the adopted model against a benchmark model, the promoting percentage of RMSE, P, is employed for further comparison:

P = (RMSE_benchmark − RMSE_comparison) / RMSE_benchmark × 100%,  (14)

where P is the promoting percentage of RMSE, and RMSE_benchmark and RMSE_comparison are the root mean square errors of the benchmark model and the comparison model, respectively.
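Eqs. (12)–(14) can be combined into a few helper functions; plugging in the nRMSE and RMSE figures later reported in Table 8 reproduces the headline numbers (FS = 0.4477 for Model II-BD versus persistence, P = 19.19% versus BPNN).

```python
def forecast_skill(nrmse_model, nrmse_persistence):
    """Eq. (12): FS = 1 - nRMSE / nRMSE_persistence."""
    return 1.0 - nrmse_model / nrmse_persistence

def promoting_percentage(rmse_benchmark, rmse_comparison):
    """Eq. (14): improvement of the comparison model over the benchmark, in %."""
    return (rmse_benchmark - rmse_comparison) / rmse_benchmark * 100.0

def persistence_forecast(ghi):
    """Eq. (13): the forecast for t + dt is simply the value at t."""
    return list(ghi[:-1])

fs = forecast_skill(32.2702, 58.4263)        # Model II-BD vs. persistence
p2 = promoting_percentage(76.9272, 62.1618)  # Model II-BD vs. BPNN
```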
3. Data and Analysis
The data used in this study come from a solar power plant in Denver, Colorado, USA. Average global horizontal irradiance (GHI; in this paper, solar irradiance refers to GHI) and meteorological data (such as ambient temperature, relative humidity, wind velocity, atmospheric pressure, and precipitation) were collected at one-hour resolution from January 1, 2012, to December 31, 2016, from the NREL Solar Radiation Research Laboratory [41]. The data from 2012 to 2015 are used for training and validation; the data from 2016 are used for testing. The main statistical characteristics of solar irradiance in this dataset are shown in Table 1.
Main statistical features of solar irradiance in the dataset.

Samples            Number   Max (GHI, W/m2)   Mean (GHI, W/m2)   Std. (GHI, W/m2)
All samples        43824    1090              188.8933           270.6560
Training samples   35040    1090              187.9399           270.3011
Testing samples    8784     1050              192.6963           272.0491
Pearson's correlation coefficient is a statistic that measures the association between two continuous variables. The relationships between irradiation and wind speed, atmospheric pressure, air temperature, and relative air humidity were analyzed to determine whether these variables should be included as network inputs and which to choose. Table 2 shows the Pearson correlation coefficients between the five weather variables and solar irradiance on the dataset. It can be observed that only temperature and humidity show a relatively high correlation, whereas irradiance is essentially uncorrelated with wind speed, precipitation, and pressure, so these three meteorological parameters are excluded. Figure 6 shows the average hourly irradiance distribution for different months in 2016. There is a strong relationship between the hour of the day and solar irradiance: the irradiance is low at the beginning of the day, increases to a peak at noon, and then gradually decreases in the afternoon. Meanwhile, the peak irradiance differs from month to month: the highest peak occurs between June and July, and the lowest between December and January. Consequently, the time must be used as an input variable.
Pearson's R correlation coefficients between meteorological parameters and solar irradiation.

Meteorological parameter   Pearson's R
Wind speed                  0.0053
Atmospheric pressure        0.0429
Precipitation              −0.0328
Relative air humidity      −0.3044
Air temperature             0.3745
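The screening step described above can be reproduced with `np.corrcoef`. The helper name, the synthetic data, and the |R| threshold of 0.1 are illustrative assumptions; the paper simply excludes the weakly correlated variables (wind speed, precipitation, and pressure) on this basis.

```python
import numpy as np

def select_features(irradiance, candidates, threshold=0.1):
    """Keep variables whose |Pearson R| with irradiance exceeds the threshold."""
    kept = {}
    for name, series in candidates.items():
        r = np.corrcoef(irradiance, series)[0, 1]
        if abs(r) > threshold:
            kept[name] = r
    return kept

# Synthetic demonstration data (not the NREL measurements).
rng = np.random.default_rng(1)
n = 2000
ghi = rng.uniform(0, 1000, n)
candidates = {
    "temperature": 0.4 * ghi + rng.normal(0, 300, n),  # correlated with GHI
    "wind_speed": rng.normal(3, 1, n),                 # independent of GHI
}
kept = select_features(ghi, candidates)
```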
Hourly average solar irradiance data for different months in 2016.
The autocorrelation function (ACF) measures the degree of similarity between a time series and its own lagged series over a continuous time interval. Irradiance is time-series data and can therefore be characterized by its ACF. Let X_t be a time series of length T, and denote by X_(t−h) the series lagged by h periods. The autocorrelation of X_t at lag h is given by

ρ_X(h) = Cov(X_t, X_(t−h)) / Var(X_t) = γ_X(h) / γ_X(0) = E[(X_t − μ_X)(X_(t−h) − μ_X)] / E[(X_t − μ_X)²],  (15)

where γ_X(h) is the autocovariance of X_t at lag h, γ_X(0) is the autocovariance of X_t at lag 0, and μ_X is the expected value of X_t.
From the ACF plot in Figure 7, we can see that the daily period consists of 24 timesteps (where the ACF has its second-largest positive peak). Besides being apparent from the diurnal cycle itself, it can also be seen from Figure 7 that the interval between the maximum positive and maximum negative correlation is 12 hours. Moreover, in the actual model calculations, performance is very similar for lagging times between 12 and 24. Therefore, in this paper, we choose a 12-hour lagging time.
ACF plot of hourly solar irradiance. The abscissa represents lag time, and the ordinate represents the autocorrelation coefficient.
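The sample ACF of Eq. (15) can be computed directly. The sketch below checks, on a synthetic hourly series with a 24-hour diurnal cycle (not the NREL data), that the largest positive peak after lag 0 sits at lag 24 and the strongest negative correlation at lag 12, matching the structure seen in Figure 7.

```python
import numpy as np

def acf(x, max_lag):
    """Sample autocorrelation rho(h) = gamma(h) / gamma(0), per Eq. (15)."""
    x = np.asarray(x, dtype=float)
    dev = x - x.mean()
    gamma0 = np.sum(dev ** 2)
    return np.array([np.sum(dev[h:] * dev[:len(x) - h]) / gamma0
                     for h in range(max_lag + 1)])

# 60 days of synthetic hourly data with a pure 24-hour cycle.
t = np.arange(24 * 60)
rho = acf(np.sin(2 * np.pi * t / 24), max_lag=36)
```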
All models are trained with the Adam optimizer, and the sigmoid function is used in the output layer. The program code of this paper runs on an Intel® Core™ i7-8600 CPU using Python 3.7.5 and Keras 2.3.1 with a TensorFlow 2.0.0 backend.
4. Results and Discussion
In this section, the six models described above are simulated to verify the performance of the proposed method, and the effect of the input length (determined by the lagging time) is discussed. The forecasting results under different input sequence lengths for the different models are shown in Tables 3–6. The details of the forecasting results and their analysis are as follows:
For Model I, since it has only one single-branch input, the number of input variables directly affects the prediction accuracy. As can be seen in Table 3, it is clear that with the increase of lagging time parameter, the RMSE and nRMSE decrease continuously. This fact implies that, for this case study, data from previous points in time is vital for forecasting, especially when only historical irradiance is used for prediction.
However, when the historical irradiance and meteorological parameters are input to the LSTM network together, the influence of the lagging time parameter on forecasting accuracy drops significantly. When the lagging time is only one hour, the RMSE of Model I-A is 110.64 W/m2, while the RMSE of Model I-B is 75.4654 W/m2, which shows that, for a fixed lagging time, the meteorological information substantially aids irradiance prediction.
As can be seen in Tables 3 and 4, prediction accuracy generally increases with lagging time over 1–12 hours. However, a longer lagging window means more input variables and thus more computation time. Weighing these factors, a more reasonable lagging time should be chosen: although the best lagging times are 10 hours and 11 hours for Model I-A and Model I-B, respectively, an 8-hour lagging time is a reasonable compromise in this case. Of course, the optimal lagging time may differ for other datasets.
For Model II-AC and Model II-AD, compared with Model I, we add an independent branch with meteorological parameters as input (C: meteorological parameters at the current time; D: meteorological parameters at the next time), which plays an important role. Comparing the results of Models I-A and II-AC in Tables 3 and 5 at the same lagging time, the prediction accuracy differs noticeably, and the difference is most prominent at small lagging times. For instance, at a lagging time of 1 hour, the RMSE is 110.64 W/m2 for Model I-A but 73.2477 W/m2 for Model II-AC. The best RMSEs of the two models are 75.22 W/m2 and 71.0791 W/m2, respectively, which shows that the proposed new branch improves prediction accuracy. Meanwhile, Table 5 also shows that when historical irradiance is used as the main input, the prediction accuracy is nearly the same whether the auxiliary input is the meteorological parameters at the current time or at the next time.
Comparing Tables 5 and 6, we find that using the meteorological parameters of the next moment better exploits the proposed branch structure. As shown for Model II-BD in Table 6, when historical irradiance and meteorological parameters form the main input and the meteorological parameters at the next moment form the auxiliary input, the prediction is best: the RMSE and nRMSE are 62.1618 W/m2 and 32.2702%, respectively. For Model II-BC, because the current meteorological parameters in the auxiliary input already appear in the main input, the accuracy improvement is not apparent.
The performance of Model I-A with 1–12-hour historical irradiance (from Lagging 1 to Lagging 12).

Lagging             RMSE (W/m2)   nRMSE (%)   MAE (W/m2)   MBE (W/m2)   R
Lagging 1 (1 h)     110.64        55.67       66.32        −1.57        0.9216
Lagging 2 (2 h)     84.77         43.02       45.52        2.65         0.9524
Lagging 3 (3 h)     80.37         40.75       40.88        1.57         0.9574
Lagging 4 (4 h)     79.56         40.35       43.15        −0.07        0.9583
Lagging 5 (5 h)     78.45         39.78       42.71        0.63         0.9596
Lagging 6 (6 h)     76.95         39.00       40.38        1.07         0.9610
Lagging 7 (7 h)     77.53         39.21       42.17        0.22         0.9607
Lagging 8 (8 h)     75.91         38.47       38.26        −0.94        0.9621
Lagging 9 (9 h)     75.81         38.43       37.43        0.85         0.9622
Lagging 10 (10 h)   75.22         38.12       36.90        −2.41        0.9628
Lagging 11 (11 h)   75.26         38.15       37.36        2.02         0.9628
Lagging 12 (12 h)   75.91         38.25       41.15        2.25         0.9624
The performance of Model I-B with 1–12-hour historical irradiance (from Lagging 1 to Lagging 12).

Lagging             RMSE (W/m2)   nRMSE (%)   MAE (W/m2)   MBE (W/m2)   R
Lagging 1 (1 h)     75.4654       39.2068     34.179       −7.2412      0.9615
Lagging 2 (2 h)     73.0181       37.9311     32.2728      −5.3405      0.9637
Lagging 3 (3 h)     72.0597       37.429      33.022       −8.7159      0.9648
Lagging 4 (4 h)     73.1294       37.9803     35.0303      −13.1121     0.9649
Lagging 5 (5 h)     72.5747       37.6879     32.0244      −5.851       0.9641
Lagging 6 (6 h)     73.3373       38.0796     34.9869      −10.9972     0.9638
Lagging 7 (7 h)     72.0033       37.3826     30.9602      −3.4511      0.9648
Lagging 8 (8 h)     71.9089       37.3294     30.8381      −3.7222      0.9647
Lagging 9 (9 h)     71.4177       37.0707     31.2055      −2.3624      0.9656
Lagging 10 (10 h)   72.6381       37.7026     30.7169      −0.6251      0.9642
Lagging 11 (11 h)   71.0791       36.8959     32.8573      −7.3356      0.9656
Lagging 12 (12 h)   71.8434       37.2962     30.6385      −1.9523      0.9649
The performance of Models II-AC and II-AD with 1–12-hour historical irradiance (from Lagging 1 to Lagging 12).

                    Model II-AC                Model II-AD
Lagging             RMSE (W/m2)   nRMSE (%)    RMSE (W/m2)   nRMSE (%)
Lagging 1 (1 h)     73.2477       38.0547      71.843        37.3249
Lagging 2 (2 h)     71.9784       37.391       71.2248       36.9995
Lagging 3 (3 h)     71.4207       37.0971      71.2623       37.0148
Lagging 4 (4 h)     71.3815       37.0725      71.2267       36.9921
Lagging 5 (5 h)     71.1067       36.9256      71.8271       37.2996
Lagging 6 (6 h)     71.3472       37.0462      70.0358       36.3653
Lagging 7 (7 h)     71.2828       37.0086      70.3305       36.5142
Lagging 8 (8 h)     71.6838       37.2126      69.7479       36.2076
Lagging 9 (9 h)     71.1645       36.9392      71.8127       37.2757
Lagging 10 (10 h)   70.2837       36.4806      71.2756       36.9954
Lagging 11 (11 h)   72.045        37.3973      71.3583       37.0408
Lagging 12 (12 h)   70.4213       36.558       70.1675       36.4262
The performance of Models II-BC and II-BD with 1–12-hour historical irradiance (from Lagging 1 to Lagging 12).

                    Model II-BC                Model II-BD
Lagging             RMSE (W/m2)   nRMSE (%)    RMSE (W/m2)   nRMSE (%)
Lagging 1 (1 h)     72.3486       37.5875      67.076        34.8482
Lagging 2 (2 h)     72.7865       37.8108      66.3851       34.4854
Lagging 3 (3 h)     71.5573       37.168       66.7471       34.6695
Lagging 4 (4 h)     72.5258       37.6668      66.0398       34.2982
Lagging 5 (5 h)     72.8937       37.8535      64.2217       33.3502
Lagging 6 (6 h)     70.9322       36.8307      63.4328       32.9368
Lagging 7 (7 h)     70.4761       36.5898      64.4025       33.4365
Lagging 8 (8 h)     71.8012       37.2735      63.4285       32.927
Lagging 9 (9 h)     71.4277       37.0758      62.955        32.678
Lagging 10 (10 h)   71.2123       36.9626      63.0588       32.7305
Lagging 11 (11 h)   70.8344       36.7688      64.965        33.7222
Lagging 12 (12 h)   71.6534       37.1976      62.1618       32.2702
The best parameters and architecture of the LSTM network for 1-hour-ahead forecasting with the proposed six models are shown in Table 7. In Model I, two LSTM layers with 100 and 40 neurons (100-40) are used, with lagging times of 10 and 11, respectively; in Model II, a 64-32 MLP hidden layer is added, and most configurations use only one LSTM layer.
The best parameters and architecture of the six LSTM models for 1-hour-ahead forecasting.

Model         Main input shape   Auxiliary input shape   Structure (hidden layers)   Optimizer/epoch
Model I-A     (k × 10 × 1)       —                       100-40 (LSTM)               Adam/200
Model I-B     (k × 11 × 5)       —                       100-40 (LSTM)               Adam/200
Model II-AC   (k × 10 × 1)       (k × 3 × 1)             32 (LSTM)-64-32 (MLP)       Adam/200
Model II-AD   (k × 8 × 1)        (k × 3 × 1)             32 (LSTM)-64-32 (MLP)       Adam/200
Model II-BC   (k × 7 × 5)        (k × 3 × 1)             32 (LSTM)-64-32 (MLP)       Adam/200
Model II-BD   (k × 12 × 5)       (k × 3 × 1)             30-10 (LSTM)-64-32 (MLP)    Adam/200

k: the sample size of the minibatch.
The performance of the six models with the optimal parameters and structure can be seen in Table 8 and Figure 8. Compared with the persistence model, the performance of the forecast skill (FS) and the promoting percentage of RMSE (P) of each model is significantly improved. Compared with BPNN, the P of each model has also improved, and the improvement advantage of Model II-BD is more visible, reaching 19.19%.
The performance of the six models with the optimal parameters and structure.

Model         RMSE (W/m2)   nRMSE (%)   FS        P1 (%)   P2 (%)
Model I-A     75.22         38.12       0.3476    33.19    2.22
Model I-B     71.0791       36.895      0.36856   36.87    7.60
Model II-AC   70.2837       36.4806     0.37566   37.57    8.64
Model II-AD   69.7479       36.2076     0.3803    38.05    9.33
Model II-BC   70.4761       36.5898     0.3737    37.40    8.39
Model II-BD   62.1618       32.2702     0.4477    44.79    19.19
Persistence   112.5854      58.4263     0         0        −46.35
BPNN          76.9272       39.9215     0.3167    31.67    0

P1: the benchmark model is the persistence model; P2: the benchmark model is the BPNN model.
The RMSE of the six models with different lagging time (from 1 hour to 12 hours) based on the optimal parameters and structure.
The RMSE and time cost curves of the different models with different lagging times are shown in Figure 9. With increasing lagging time (i.e., a growing input dimension), the time cost increases approximately linearly, because more input variables mean more computation (in Figure 9(f) there is a sudden jump in time cost where the number of LSTM layers increases from 1 to 2). Meanwhile, except for Model I-A, the RMSE of the other models does not decrease monotonically with lagging time but only shows a general downward trend, with the curves fluctuating throughout. This indicates that the optimal lagging time is not the maximum lagging time; the appropriate lagging time should be chosen according to the actual dataset and the required accuracy.
The RMSE and time cost curve of the different models with different lagging time. (a) Model I-A, (b) Model I-B, (c) Model II-AC, (d) Model II-AD, (e) Model II-BC, and (f) Model II-BD.
The one-hour-ahead irradiance forecasting results for the proposed Model II-BD with the best parameters and architecture are shown in Figure 10. In Figure 10(a), the blue circles (O) represent the measured values and the red asterisks (∗) the forecasted values; the predicted and measured values remain close most of the time, and the local enlarged view shows more clearly that the difference between them is small. Figure 10(b) shows that the predicted values are strongly correlated with the measured solar irradiance, with a linear regression coefficient of 0.9642. In summary, the forecasted solar irradiance values agree well with the measured values.
Scatter plots of actual vs. predicted values for the proposed Model II-BD with the best parameters and architecture.
The above experimental results show that the Model II-BD structure of the LSTM-MLP model achieves the best prediction accuracy. In the following, "LSTM-MLP model" refers specifically to the LSTM-MLP model with the Model II-BD structure.
Six experimental simulations were performed to verify the performance of the proposed LSTM-MLP model: the BP network, a general RNN, a random forest, an SVM, a general LSTM, and the LSTM-MLP model. The forecasting results are shown in Table 9. As the table shows, the proposed LSTM-MLP model outperformed the other five machine learning models on the RMSE, nRMSE, MAE, MBE, and R criteria. Compared with BPNN, RNN, random forest, SVM, and LSTM, the promoting percentage of RMSE (P) was 19.19%, 20.15%, 11.68%, 19.31%, and 13.48%, respectively. Clearly, the LSTM-MLP model's predictions are better than those of the other five models. This is because the LSTM-MLP model is a hybrid of the LSTM and MLP models in which the MLP receives an additional input containing hidden information about the future, which improves the prediction accuracy.
Performance comparison between the proposed LSTM-MLP model and the general machine learning method.
Model                  | RMSE (W/m2) | nRMSE (%) | MAE     | MBE      | R
BPNN                   | 76.9272     | 39.9215   | 33.2872 | −4.7798  | 0.9601
RNN                    | 77.8444     | 40.4115   | 40.0704 | −16.2864 | 0.9602
Random forest          | 70.3808     | 36.5369   | 30.5618 | −0.9593  | 0.9659
SVM                    | 77.0384     | 39.9931   | 50.9564 | −1.6776  | 0.9593
General LSTM           | 71.8434     | 37.2962   | 30.6385 | −1.9523  | 0.9649
The proposed LSTM-MLP  | 62.1618     | 32.2702   | 26.6538 | −0.4547  | 0.9737
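The criteria in Table 9 can be computed as sketched below. This is a hedged sketch: the nRMSE normalization (here, by the mean of the measured values) and the forecast-skill definition are common conventions and may differ in detail from the paper's exact formulas.

```python
import numpy as np

def rmse(y, yhat):
    """Root mean square error."""
    return float(np.sqrt(np.mean((np.asarray(yhat) - np.asarray(y)) ** 2)))

def nrmse(y, yhat):
    """RMSE normalized by the mean measured value, in percent (one common convention)."""
    return 100.0 * rmse(y, yhat) / float(np.mean(y))

def mae(y, yhat):
    """Mean absolute error."""
    return float(np.mean(np.abs(np.asarray(yhat) - np.asarray(y))))

def mbe(y, yhat):
    """Mean bias error (positive = overprediction)."""
    return float(np.mean(np.asarray(yhat) - np.asarray(y)))

def promoting_percentage(rmse_ref, rmse_model):
    """P: relative RMSE improvement over a reference model, in percent."""
    return 100.0 * (rmse_ref - rmse_model) / rmse_ref

def forecast_skill(rmse_model, rmse_persistence):
    """FS relative to the persistence benchmark (common definition)."""
    return 1.0 - rmse_model / rmse_persistence

# reproducing the 19.19% improvement over BPNN reported in Table 9
p = promoting_percentage(76.9272, 62.1618)  # ≈ 19.19
```

Applying `promoting_percentage` to the other RMSE entries of Table 9 reproduces the remaining improvements (20.15%, 11.68%, 19.31%, 13.48%) quoted in the text.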
Furthermore, data for three weather conditions were randomly selected from the test set; the results are shown in Figure 11 and Table 10. On a sunny day (June 27, 2016), all the prediction curves agree well with the measurement curve; the LSTM model gives the best prediction, but LSTM-MLP also achieves high precision, with an nRMSE of only 4.73%. On a cloudy day (June 8, 2016), the measured values are highly volatile; the predicted values of the different models follow similar trend curves, but the errors are larger. The rapid change of the cloud layer on a cloudy day makes irradiance prediction very difficult. Even so, the proposed model shows a good prediction effect, with an nRMSE of 35.89%. On a rainy day (May 17, 2016), the measured irradiance is low, as shown by the solid red line in Figure 11, and the predicted values (red dotted line) follow the changes in the measured values well. This indicates that the proposed LSTM-MLP model performs better on rainy days. All related results are reported in Table 10.
Comparison between measured and forecasted hourly solar irradiance for three types of weather with different methods. (a) Sunny (June 27, 2016), (b) cloudy (June 8, 2016), and (c) rainy (May 17, 2016).
Root mean square error (RMSE, W/m2) and normalized root mean square error (nRMSE, %) of different models in sunny, cloudy, and rainy weather.
                       | Sunny (June 27, 2016) | Cloudy (June 8, 2016) | Rainy (May 17, 2016)
Model                  | RMSE     | nRMSE      | RMSE     | nRMSE      | RMSE     | nRMSE
BPNN                   | 38.0148  | 7.1308     | 186.9453 | 47.3051    | 42.6140  | 74.164
RNN                    | 49.4941  | 9.2841     | 188.9789 | 47.8197    | 37.2885  | 64.8956
Random forest          | 29.4791  | 5.5297     | 147.7844 | 37.3957    | 43.9405  | 76.4725
SVM                    | 66.8935  | 12.5478    | 171.4332 | 43.3799    | 61.1678  | 106.4545
General LSTM           | 21.8439  | 4.0975     | 183.1270 | 46.3389    | 29.0062  | 50.4815
The proposed LSTM-MLP  | 25.2291  | 4.7325     | 141.8265 | 35.8881    | 26.6292  | 46.3446
To place this work in the context of other published works, the results of the proposed approach are compared with results from other studies in Table 11. The results are comparable.
Comparison between the best result obtained in this study and conventional methods.
In this work, a novel LSTM-MLP structure with two-branch input is proposed. The proposed LSTM-MLP includes one main input, one auxiliary input, one main output, and one auxiliary output. The historical irradiance data (or irradiance and meteorological parameters) serve as the main input, which is fed to the LSTM layers. One part of the LSTM output is emitted as the auxiliary output, while the other part is first combined with the meteorological parameters (auxiliary input) and then fed into a new MLP structure. The output of the MLP's several hidden layers is the main output, i.e., the final irradiance prediction. Four network structures based on LSTM-MLP and two based on the traditional LSTM are designed and developed. A real-world test case in Denver, consisting of 5 years of data, is used to verify and discuss the potential of each model.
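The two-branch structure summarized above can be sketched with the Keras functional API as follows. This is a minimal illustration, not the tuned configuration from the paper: the layer sizes, lag length, and feature dimensions are assumed values.

```python
from tensorflow.keras import layers, Model

# hypothetical dimensions: 6-hour lag, 5 main-input features,
# 4 next-moment meteorological features as auxiliary input
lag, n_feat, n_aux = 6, 5, 4

main_in = layers.Input(shape=(lag, n_feat), name="main_input")  # history branch
aux_in = layers.Input(shape=(n_aux,), name="aux_input")         # next-moment weather

# LSTM branch over the historical window
h = layers.LSTM(32)(main_in)

# one part of the LSTM output is emitted as the auxiliary output
aux_out = layers.Dense(1, name="aux_output")(h)

# the other part is combined with the auxiliary input and fed to an MLP
merged = layers.concatenate([h, aux_in])
x = layers.Dense(32, activation="relu")(merged)
x = layers.Dense(16, activation="relu")(x)
main_out = layers.Dense(1, name="main_output")(x)  # final irradiance prediction

model = Model(inputs=[main_in, aux_in], outputs=[main_out, aux_out])
model.compile(optimizer="adam", loss="mse")
```

Training such a model passes both the historical window and the next-moment weather features per sample, with the measured irradiance as the target for both outputs.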
The experimental results demonstrate that the proposed Model II-BD, which uses historical irradiance and meteorological parameters as the main input and the next-moment meteorological parameters as an auxiliary input, significantly outperforms the other models in terms of three widely used evaluation criteria: the RMSE is 62.1618 W/m2, the nRMSE is 32.2702%, and the FS is 0.4477. Compared with BPNN, the promoting percentage of RMSE (P) of Model II-BD is 19.19%. The meteorological parameters at the next moment, which can be obtained from weather forecasts, play a vital role in the prediction accuracy. The lagging time is a significant variable for the LSTM input, especially when only historical irradiance is used as input (e.g., Model I-A).
Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.
Conflicts of Interest
The authors declare that there are no conflicts of interest regarding the publication of this article.
Acknowledgments
This work was supported by the National Natural Science Foundation of China (Grant nos. 61875171, 61865015, and 61705192) and the Natural Science Foundation of Yunnan Province (Grant no. 2017FD069).