Interval Prediction of Photovoltaic Power Using Improved NARX Network and Density Peak Clustering Based on Kernel Mahalanobis Distance



Introduction
As an alternative to traditional fossil fuels, PV power is a clean and economical renewable energy source. It has become a critical factor in reshaping the power industry through profound technological changes, significantly promoting energy sustainability [1]. The global cumulative installed capacity of PV power reached 714 GW in 2020, an increase of 21.6% over 2019. PV power is easily affected by weather factors such as GHI, AT, and WS, so its output is random and difficult to schedule [2]. With large-scale grid connection, the randomness of PV power output significantly increases the uncertainty of power grid dispatching and threatens the safe operation of the power system. Therefore, comprehensive and accurate PV power prediction is a necessary tool for reducing this uncertainty; it provides a reliable basis for power grid dispatching and improves the stability and economic efficiency of the power system [3]. It is of great significance for enhancing the competitiveness of PV power, ensuring the safety of the power system, and optimizing power grid operation.
To establish an accurate PV power forecasting model, many scholars have carried out a series of studies. Prediction models mainly comprise point forecasts and probability forecasts [4]. For point prediction, the forecasting methods can be classified into four types: (a) persistence, (b) statistical, (c) machine learning, and (d) hybrid methods. The primary methods are ARMA, regression, exponential smoothing, ANN, SVM, etc. [2]. Scholars proposed a hybrid method with NARX and a multilayer neural network to avoid the limitations of a single method and improve forecasting performance. This neural network is a recurrent dynamic architecture with feedback connections enclosing several layers of the network. The NARX network can evaluate the nonlinear relationship between input and output and predict future values of the output time series [5]. It converges faster and generalizes better than other neural network methods and is widely used to solve the nonlinear problem of forecasting PV power [6,7]. Meanwhile, parameter optimization methods are widely applied to improve the forecasting accuracy of the model. A DEPSO-based forecasting model was applied to predict the output of a building-integrated PV system [8]. VanDeventer et al. proposed an SVM optimized by GA for short-term PV power forecasting [9]. Although the above methods can effectively improve the accuracy of PV power forecasting, forecasting error cannot be avoided due to the inherent randomness and uncertainty of the meteorological system. From the perspective of decision-makers, quantitative analysis of uncertainty can ensure supply reliability with minimized operating costs.
In recent years, interval forecasting, an effective form of probabilistic forecasting, has been widely applied to quantify the uncertainty of PV power generation [10]. An interval prediction consists of the maximum and minimum values of the point forecasting results, based on a data distribution with a certain confidence level, and can provide accurate intervals of PV power fluctuations [11]. Compared with parameter estimation methods that depend on specific data distribution assumptions, such as fuzzy inference, the Beta distribution function, and the Gaussian process, KDE is a popular method for estimating a data distribution without prior assumptions about the dataset [12]. In [13], the distribution of forecasting error was estimated based on the seasonal model and KDE to realize the interval prediction of PV power output. Hybrid models combining ELM and KDE were established for interval forecasting of PV power under weather classification [14]. A prediction interval estimation method based on GRU neural networks and KDE was proposed for solar generation [15]. Meanwhile, the factors related to the forecasting error have a significant impact on the density function. This article uses the MKDE method to obtain the complete probability density curve of PV power error together with meteorological features. Moreover, the upper and lower limits are determined according to their confidence intervals, which helps to improve the performance of the interval prediction.
However, the above methods only provide point forecasts and PIs, which cannot fully capture the volatility and uncertainty of PV power output. Some effective clustering methods, based for example on similar days, weather, or seasons, have been introduced into forecasting methods to improve interval prediction accuracy [16]. These methods mainly cluster the historical meteorological data, which cannot fully reflect the impact of weather volatility and uncertainty on PV power output. Differing weather conditions are a significant cause of the error between the forecast value and the actual value of PV power. Compared with other clustering methods, such as fuzzy c-means, K-means, and hierarchical clustering, DPCA is a novel clustering method. This algorithm can automatically find the cluster centers and efficiently cluster data of arbitrary shape [17]. It has been widely used in fault detection, image feature extraction, and target monitoring [18][19][20]. In this article, the dataset containing the errors and weather factors is clustered using DPCA improved by KMD, which helps to improve the effectiveness of interval prediction.
In this paper, a novel interval prediction model is established for PV power. SAE-BRNARX is applied for point forecasting. We introduce the KDPCA and MKDE to estimate the error distribution. Moreover, we calculate the PIs under different confidence levels.
This hybrid model has been validated using real-world PV power datasets. This article has four main contributions: (1) A point forecasting model is established by combining SAE, BR, and the NARX network. This method can extract features, select appropriate network parameters to reduce data redundancy and overfitting, and avoid falling into a locally optimal solution. (2) Due to the uncertainty of intraday weather changes, this paper uses 5-minute level data for clustering. According to the differences in forecasting errors and meteorological factors across periods, the data are clustered into multiple categories to enhance the effectiveness of interval prediction. (3) We use KMD instead of the Euclidean distance to construct the KDPCA, improving the reliability of clustering. KMD can map data from a low-dimensional space to a high-dimensional space to make the data linearly separable. (4) MKDE is used to calculate the joint probability density of prediction error and meteorological factors within each cluster, from which the upper and lower limits of the forecasting intervals are obtained at different confidence levels. Compared with other methods, MKDE can better describe data with an unknown distribution and thus obtain a narrower prediction interval width and higher interval coverage. The rest of the paper is organized as follows. Section 2 introduces the principles and process of the SAE-BRNARX forecasting model in detail. In Section 3, the KDPCA is proposed for clustering data. Section 4 establishes the uncertainty analysis of the forecasting model. In Section 5, the implementation procedure of the PV power interval forecasting model is given in detail. Section 6 presents the evaluation metrics of point and interval forecasting in detail. Section 7 discusses the interval forecasting results on data from four seasons. Finally, the conclusion and future research directions are drawn in Section 8.

SAE-BRNARX Point Forecasting Model
To address overfitting and convergence to locally optimal solutions in forecasting, a model combining the BRNARX network and SAE is proposed to forecast the PV power output.

Bayesian Regularized NARX Network.
NARX is a dynamic RNN model for time series forecasting. It includes an input layer, a hidden layer, an output layer, and an input-output delay structure. NARX has a memory that is used to establish a mapping between inputs and output. Compared with the feedforward network, the NARX model containing exogenous input information has more degrees of freedom and requires fewer parameters in the calculation. It describes the characteristics of dynamic time-varying systems more accurately than other artificial neural networks, and the formula is as follows:
$$y_t = \hat{y}_t + e_t = f\left(y_{t-1}, \ldots, y_{t-n_y}, x_{t-1}, \ldots, x_{t-n_x}; \theta\right) + e_t \quad (1)$$
where $\hat{y}_t$ is the prediction value of $y_t$; $f(\cdot;\theta)$ is the nonlinear mapping function between inputs and output; $\theta$ is the network parameter to be estimated; $y = [y_1, y_2, \ldots, y_N]^T$ is the observed historical value; $x = [x_1, x_2, \ldots, x_N]^T$ is the input of the NARX; $e_t$ is the error term, assumed to have mean 0 and variance $\sigma^2$; $n_y$ and $n_x$ are the delay numbers of outputs and inputs, respectively. In this model, time sequences are embedded in two ways: first, through TDLs, which act as short-memory filters, and second, through feedback signals from the output layer. A multilayer perceptron is used to construct the NARX network as a feedforward neural network, reflecting the mapping relationship between inputs and output. The NARX network transmits information forward through multilayer neurons and optimizes its connection weights to improve the global approximation. It can accurately express complex nonlinear characteristics, such as hysteresis, saturation, and chaos.
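Therefore, according to equation (1), the NARX network with multiple inputs and a single output can be expressed as
$$\hat{y}_t = g_2\left(\sum_{j=1}^{n_h} w_j^{o}\, g_1\left(\sum_{i=1}^{n_i} w_{ji}^{h} r_i + b_j^{h}\right) + b^{o}\right) \quad (2)$$
where $r_i$ is the $i$th element of the regressor vector $[y_{t-1}, \ldots, y_{t-n_y}, x_{t-1}, \ldots, x_{t-n_x}]$; $n_i$ and $n_h$ are the numbers of neurons in the input layer and hidden layer of the neural network; $w_{ji}^{h}$ and $w_j^{o}$ are the connection weights between the $i$th input neuron and the $j$th hidden layer neuron and between the $j$th hidden layer neuron and the output, respectively; $b_j^{h}$ and $b^{o}$ are the thresholds of the $j$th hidden layer neuron and of the output layer, respectively; $g_1(\cdot)$ and $g_2(\cdot)$ are the activation functions of the hidden layer and the output layer, respectively; the network parameter $\theta$ comprises the connection weights $w$ and thresholds $b$.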
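As a hedged illustration of the delay structure in equation (1), the following Python sketch builds the lagged regressor rows that a one-step-ahead NARX model would take as input. The function name `build_narx_regressors`, the restriction to a single exogenous series, and the toy numbers are assumptions made for illustration; they do not come from the paper, and the network itself is not implemented here.

```python
def build_narx_regressors(y, x, n_y, n_x):
    """Build one-step-ahead NARX training pairs.

    Each input row is [y[t-1], ..., y[t-n_y], x[t-1], ..., x[t-n_x]]
    and the matching target is y[t].
    """
    if len(y) != len(x):
        raise ValueError("y and x must have the same length")
    start = max(n_y, n_x)  # first index with a complete delay window
    inputs, targets = [], []
    for t in range(start, len(y)):
        past_y = [y[t - k] for k in range(1, n_y + 1)]  # output feedback delays
        past_x = [x[t - k] for k in range(1, n_x + 1)]  # exogenous input delays
        inputs.append(past_y + past_x)
        targets.append(y[t])
    return inputs, targets


# Toy series (hypothetical values): PV power y and one exogenous feature x.
rows, targets = build_narx_regressors([1, 2, 3, 4, 5], [10, 20, 30, 40, 50], n_y=2, n_x=1)
```

With several exogenous features, the same idea applies by concatenating the delayed values of every feature into each row.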
BR is used to optimize the parameter $\theta$ to improve the accuracy of the network and avoid falling into a locally optimal solution when forecasting PV power with the NARX network. BR reduces the negative impact of large weights on the training results. It reduces the overfitting and error of the model by adapting to the complexity of the network structure, and the performance function to be optimized is as follows:
$$S(\theta) = \beta E_D + \alpha E_\theta$$
where $E_D$, the sum of squared errors between the observed and predicted values over the training set, measures the forecasting ability; $n_D$ is the number of training data samples; $E_\theta = \sum_{i=1}^{n_\theta} \theta_i^2 / 2$ is the regularized penalty term, which controls the network parameters to reduce structural redundancy; $n_\theta$ is the number of parameters; $\alpha$ and $\beta$ are the regularization parameters.
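Bayesian theory treats the network parameters as random variables and obtains their posterior probability distribution, which is used to determine the optimal network parameters, as follows:
$$P(\theta \mid D, \alpha, \beta) = \frac{P(D \mid \theta, \beta)\, P(\theta \mid \alpha)}{P(D \mid \alpha, \beta)}$$
where $D$ denotes the training data. Therefore, the posterior distribution of the regularization parameters of the NARX network is as follows:
$$P(\alpha, \beta \mid D) = \frac{P(D \mid \alpha, \beta)\, P(\alpha, \beta)}{P(D)}$$
The optimal regularization parameters based on the Gaussian distribution are as follows:
$$\alpha^* = \frac{\gamma}{2 E_\theta^*}, \qquad \beta^* = \frac{n_D - \gamma}{2 E_D^*}$$
where $\gamma = n_\theta - 2\alpha^* \operatorname{trace}\left((\nabla\nabla S(\theta^*))^{-1}\right)$ is the number of effective network parameters under Bayesian regularization learning; $E_\theta^*$ and $E_D^*$ are $E_\theta$ and $E_D$ calculated using the optimized network parameter $\theta^*$.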

Sparse Autoencoder.
The SAE neural network is an unsupervised learning algorithm. It uses backpropagation to train the outputs to equal the inputs; that is, the SAE tries to learn an approximation function that makes the output similar to the input. Compared with the plain autoencoder, the SAE has a sparsity constraint that limits the average activation value of the hidden layer neurons. It consists of two neural network layers, an encoder and a decoder, that reconstruct the input data. The input data $x$ is encoded by the SAE to obtain a better representation: $z = f(W^{(1)} x + b^{(1)})$. Then, the reconstructed data can be expressed as $x' = f(W^{(2)} z + b^{(2)})$, where $W$ is the weight matrix and $b$ is the bias. A sparse penalty is added to the loss function of the SAE, which helps to obtain the optimal network weight parameters and avoid overfitting. The objective function is as follows:
$$\mathcal{L} = \sum_{n=1}^{N} \left\| x^{(n)} - x'^{(n)} \right\|^2 + \eta\, \rho(Z) + \lambda \left\| W \right\|^2$$
where $Z = [z^{(1)}, \ldots, z^{(N)}]$ is the encoding of the sample set, and $\eta$ and $\lambda$ weight the sparsity and weight-decay terms; $\rho(Z)$ is the sparse metric function, expressed as
$$\rho(Z) = \sum_{j} \mathrm{KL}\left(\rho^* \,\|\, \hat{\rho}_j\right)$$
where $\hat{\rho}_j = \frac{1}{N}\sum_{n=1}^{N} z_j^{(n)}$ is the average activation value of the $j$th neuron in the hidden layer, which is similar to the activation probability of the $j$th neuron. The Kullback-Leibler (KL) divergence is used to calculate the difference between $\hat{\rho}_j$ and the given value $\rho^*$, as follows:
$$\mathrm{KL}\left(\rho^* \,\|\, \hat{\rho}_j\right) = \rho^* \log\frac{\rho^*}{\hat{\rho}_j} + (1 - \rho^*) \log\frac{1 - \rho^*}{1 - \hat{\rho}_j}$$
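The sparsity term can be made concrete with a short sketch. The Python function below computes the KL-divergence penalty between a target activation and the average activation of each hidden unit, as described above. The function name `kl_sparsity_penalty`, the clipping constant, and the example activations are illustrative assumptions, not part of the paper; training of the encoder and decoder is not shown.

```python
import math


def kl_sparsity_penalty(codes, rho_target):
    """Sum over hidden units of KL(rho_target || average activation).

    codes: list of encoded samples, each a list of hidden activations in (0, 1).
    """
    n_samples = len(codes)
    n_hidden = len(codes[0])
    penalty = 0.0
    for j in range(n_hidden):
        rho_j = sum(row[j] for row in codes) / n_samples  # average activation of unit j
        rho_j = min(max(rho_j, 1e-12), 1.0 - 1e-12)       # keep the logarithms finite
        penalty += rho_target * math.log(rho_target / rho_j)
        penalty += (1.0 - rho_target) * math.log((1.0 - rho_target) / (1.0 - rho_j))
    return penalty


# Hypothetical activations: one hidden unit that is active half of the time.
penalty = kl_sparsity_penalty([[0.4], [0.6]], rho_target=0.05)
```

The penalty is zero when every unit's average activation equals the target and grows as the units become more active than intended.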

Establishment of the SAE-BRNARX Forecasting Model.
The SAE is proposed to extract useful features. This neural network can significantly reduce the redundancy of the data structure, making the extracted features more useful for recognition and forecasting. Then, the BRNARX network uses the extracted data to forecast the PV power output, which reduces overfitting and improves forecasting accuracy.
The system model combining SAE and the BRNARX network is shown in Figure 1. The input features are meteorological factors such as GHI, WS, and AT, together with the previous PV power output, and the target is the current PV power output.
The forecasting process has the following steps:
(1) Min-max normalization is applied to the input and target data. Assume an input vector $X = [x_1, x_2, \ldots, x_n]$, $i = 1, 2, \ldots, n$. The min-max normalization formula is as follows: $x_i' = (x_i - x_{\min}) / (x_{\max} - x_{\min})$, where $x_{\min}$ and $x_{\max}$ are the minimum and maximum of $X$.
(2) The normalized input data is used to train the SAE feature extractor. The input features are encoded using the trained SAE, and the output of the SAE is the set of encoded features of the normalized data.
(3) The encoded features are applied as input for training the BRNARX network. The data is divided into three parts: 60% for training, 15% for validation, and 25% for testing.
(4) The PV power output is predicted for the testing set.
(5) The forecast values of PV power are denormalized to obtain actual values, and the accuracy is compared with that of other forecasting models.
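Steps (1), (3), and (5) above can be sketched as follows. This is a minimal illustration, assuming that the 60/15/25 split is taken in chronological order (the paper does not say how the samples are assigned); the function names and the sample values are hypothetical.

```python
def min_max_normalize(values):
    """Scale values to [0, 1]; also return the bounds needed to invert the scaling."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant series: avoid division by zero
        return [0.0 for _ in values], lo, hi
    return [(v - lo) / (hi - lo) for v in values], lo, hi


def denormalize(scaled, lo, hi):
    """Map normalized values back to the original scale."""
    return [s * (hi - lo) + lo for s in scaled]


def chronological_split(samples, train=0.60, val=0.15):
    """Split into training, validation, and testing parts, keeping time order (assumed)."""
    n = len(samples)
    n_train = round(n * train)
    n_val = round(n * val)
    return samples[:n_train], samples[n_train:n_train + n_val], samples[n_train + n_val:]


scaled, lo, hi = min_max_normalize([2.0, 4.0, 6.0])
train_part, val_part, test_part = chronological_split(list(range(100)))
```

In practice the normalization bounds would be computed on the training part only and reused for the validation and testing parts, so that no information from the test period leaks into training.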

Density Peak Clustering Algorithm Improved by Kernel Mahalanobis Distance
PV power output is affected by meteorological factors. Under different weather conditions, the fluctuation range of PV power output is different and uncertain. At the same time, considering the nonlinear relationship between factors, the KMDDPC algorithm is employed to cluster the samples, which include the meteorological factors and the error between the actual value and the forecast value of PV power. The sample points with similar characteristics are grouped into one cluster to improve the accuracy of interval prediction.

Density Peak Clustering Algorithm.
DPCA is a clustering method based on density peaks. Its primary characteristic is that a cluster center has a higher density than its neighbors and lies relatively far from any point of higher density. Each point $x_i$ is characterized by two parameters: its local density $\rho_i$ and its minimum distance $\delta_i$ to the points of higher density. The local density $\rho_i$ is defined as
$$\rho_i = \sum_{j \neq i} \chi(d_{ij} - d_c), \qquad \chi(a) = \begin{cases} 1, & a < 0 \\ 0, & a \geq 0 \end{cases}$$
where $d_{ij}$ is the Euclidean distance between points $i$ and $j$, and $d_c$ is the cutoff distance. $\delta_i$ represents the minimum distance between point $x_i$ and the points of higher density, expressed as
$$\delta_i = \min_{j:\, \rho_j > \rho_i} d_{ij}$$
and for the point of highest density, $\delta_i = \max_j d_{ij}$. The premise for selecting a cluster center is to establish a decision graph. The decision graph is a two-dimensional plot of the points according to their values of $\rho_i$ and $\delta_i$. The points with high $\rho$ and relatively high $\delta$ can be identified as the cluster centers. To simplify the selection of cluster centers, DPCA computes the decision value $\gamma_i$ of each point $x_i$, which is defined as
$$\gamma_i = \rho_i \times \delta_i$$
In general, the points with the highest decision values $\gamma_i$ are selected as the cluster centers. Once the cluster centers have been selected, each remaining point is assigned to the same cluster as its nearest neighbor of higher density. In addition, the points with low $\rho$ and high $\delta$ can be identified as outliers, which can be automatically found and excluded in the clustering process.
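The quantities that drive the decision graph can be computed directly from a distance matrix. The sketch below is a minimal illustration that assumes a hard cutoff for the local density (the paper's exact density kernel is not recoverable from the text); the function name `density_peaks` and the five one-dimensional points are hypothetical.

```python
def density_peaks(dist, d_c):
    """Return (rho, delta, gamma) for a symmetric distance matrix given as a list of lists."""
    n = len(dist)
    # Local density: number of other points closer than the cutoff distance d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]
    delta = []
    for i in range(n):
        # Distances to all points of strictly higher density.
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # The densest point has no such neighbor, so it takes its largest distance.
        delta.append(min(higher) if higher else max(dist[i]))
    gamma = [r * d for r, d in zip(rho, delta)]  # decision value
    return rho, delta, gamma


points = [0, 1, 2, 10, 11]
dist = [[abs(a - b) for b in points] for a in points]
rho, delta, gamma = density_peaks(dist, d_c=1.5)
# Candidate centers: indices of the m largest decision values (m = 2 here).
centers = sorted(range(len(points)), key=lambda i: gamma[i], reverse=True)[:2]
```

Any distance matrix can be passed in, so replacing the Euclidean distance by another measure only changes how `dist` is built.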

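Kernel Mahalanobis Distance.
KMD is composed of KPCA and MD. The basic idea is to first project the original patterns into a high-dimensional or infinite-dimensional space by an implicit nonlinear mapping $\varphi(x)$. Then, the KMD of the sample data is computed in a low-dimensional kernel principal component space obtained through principal component analysis (PCA), as shown in Figure 2. The derivation of the KMD of the sample data is as follows. Let $X = [x_1, x_2, \ldots, x_n]^T$ be a group of data samples, where $x_i \in R^L$ and $L$ is the number of features. Let the group $X$ be mapped to the feature space by the implicit nonlinear mapping $\varphi(x)$, expressed as
$$\Phi = [\varphi(x_1), \varphi(x_2), \ldots, \varphi(x_n)]^T \quad (14)$$
The sample data is centralized by
$$\tilde{\varphi}(x_i) = \varphi(x_i) - \frac{1}{n}\sum_{j=1}^{n} \varphi(x_j) \quad (15)$$
The covariance matrix can be calculated by
$$S^{\Phi} = \frac{1}{n}\sum_{i=1}^{n} \tilde{\varphi}(x_i)\, \tilde{\varphi}(x_i)^T \quad (16)$$
Assume $\lambda$ and $p$ are the eigenvalues and weight vectors of the covariance matrix $S^{\Phi}$, respectively, which is formulated as follows:
$$\lambda p = S^{\Phi} p \quad (17)$$
Substitute equation (16) into equation (17), as follows:
$$\lambda p = \frac{1}{n}\sum_{i=1}^{n} \tilde{\varphi}(x_i)\, \tilde{\varphi}(x_i)^T p \quad (18)$$
where
$$p = \sum_{i=1}^{n} \alpha_i\, \tilde{\varphi}(x_i) \quad (19)$$
Substitute equation (19) into equation (18), as follows:
$$\lambda \sum_{i=1}^{n} \alpha_i\, \tilde{\varphi}(x_i) = \frac{1}{n}\sum_{i=1}^{n} \tilde{\varphi}(x_i)\, \tilde{\varphi}(x_i)^T \sum_{j=1}^{n} \alpha_j\, \tilde{\varphi}(x_j) \quad (20)$$
Equation (20) can be transformed into
$$n \lambda \alpha = \tilde{K} \alpha \quad (21)$$
where $\tilde{K}_{ij} = \tilde{\varphi}(x_i)^T \tilde{\varphi}(x_j)$ is the centered kernel matrix and $\alpha = [\alpha_1, \ldots, \alpha_n]^T$. The KMD between $x_i$ and $x_j$ is calculated as follows:
$$d_{KM}(x_i, x_j) = \sqrt{(t_i - t_j)^T \Lambda^{-1} (t_i - t_j)} \quad (22)$$
where $t_i = [t_{i1}, \ldots, t_{im}]^T$ with $t_{ik} = \sum_{l=1}^{n} \alpha_l^{(k)} \tilde{K}_{li}$ is the vector of scores of $x_i$ on the first $m$ kernel principal components, $\alpha^{(k)}$ is normalized so that $p_k$ has unit length, and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_m)$.
Figure 1: A schematic diagram of the SAE-BRNARX network.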

Establishment of KMDDPC Clustering Model.
If the distance measure is not appropriate, DPCA has difficulty distinguishing the differences between samples, resulting in poor classification ability. Considering the nonlinear correlation between features, KMD replaces the Euclidean distance in DPCA. KMDDPC is illustrated in Algorithm 1.
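To illustrate how a kernel Mahalanobis distance matrix could be produced and handed to the density peak step, the following sketch uses NumPy. It is written under stated assumptions rather than taken from the paper: an RBF kernel with width `sigma`, a fixed number of retained components, and scores scaled by their standard deviation in the kernel principal component space. All names and data values are hypothetical.

```python
import numpy as np


def kernel_mahalanobis_distances(X, sigma=1.0, n_components=2):
    """Pairwise Mahalanobis distances between kernel principal component scores."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # RBF kernel matrix (assumed kernel choice).
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-sq_dists / (2.0 * sigma ** 2))
    # Center the kernel matrix, which centers the data in feature space.
    J = np.eye(n) - np.full((n, n), 1.0 / n)
    Kc = J @ K @ J
    # Eigen-decomposition; keep the leading components (assumed to be positive).
    eigvals, eigvecs = np.linalg.eigh(Kc)
    order = np.argsort(eigvals)[::-1][:n_components]
    lam, alpha = eigvals[order], eigvecs[:, order]
    # Scores on unit-length principal axes in feature space.
    scores = Kc @ alpha / np.sqrt(lam)
    # Each score column has variance lam / n; dividing by its standard
    # deviation turns Euclidean distance into Mahalanobis distance.
    whitened = scores / np.sqrt(lam / n)
    diff = whitened[:, None, :] - whitened[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=2))


X = [[0, 0], [1, 0], [0, 1], [3, 3], [4, 3], [3, 4]]
D = kernel_mahalanobis_distances(X, sigma=1.0, n_components=2)
```

The resulting matrix `D` plays the role of $d_{ij}$ in the density and distance definitions of the density peak method.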

Uncertainty Analysis of Forecasting Model
The accurate calculation of PV power forecasting uncertainty is of great significance for supporting power grid dispatching and reducing the rotating reserve capacity of power generation equipment. This section applies MKDE and confidence intervals to quantify the error distribution of PV power forecasting, providing the uncertainty analysis needed for interval prediction.

Multivariate Kernel Density Estimation.
The MKDE is a method to analyze the distribution characteristics of multidimensional data. This method does not require prior knowledge of the data distribution, does not attach any assumptions to it, and has significant practical value. Compared with one-dimensional or two-dimensional KDE, MKDE considers the correlation between errors and other meteorological factors to improve the accuracy of the uncertainty analysis.
MKDE is applied to calculate the distribution of the multidimensional datasets. Suppose the $d$-dimensional vector can be expressed as $x = [x_1, x_2, \ldots, x_d]^T$; its joint PDF is defined as
$$\hat{f}(x) = \frac{1}{n}\sum_{i=1}^{n} K_h(X_i - x) = \frac{1}{n h^d}\sum_{i=1}^{n} K(u) \quad (23)$$
where $K$ and $K_h$ are the unscaled and scaled kernel functions, respectively; the smoothing parameter $h$ is the bandwidth; $u = (X_i - x)/h$ is the scaled distance. A diagonal bandwidth matrix and a multiplicative kernel are applied to reduce the complexity of the calculation. The multiplicative kernel is $K(u) = \prod_{j=1}^{d} k(u_j)$, where $k(\cdot)$ is the one-dimensional kernel smoothing function. According to equation (23), the MKDE is formulated as follows:
$$\hat{f}(x) = \frac{1}{n}\sum_{i=1}^{n} \prod_{j=1}^{d} \frac{1}{h_j}\, k\left(\frac{X_{ij} - x_j}{h_j}\right)$$
The selection of the bandwidth parameter determines the PDF. The AIMSE is introduced to set the bandwidths, and the optimal bandwidth is the one that minimizes it. To further simplify the calculation of the bandwidth, the optimal diagonal bandwidth matrix obtained by minimizing the AIMSE is used, as follows:
$$h_i = \sigma_i \left(\frac{4}{(d + 2)\, n}\right)^{1/(d + 4)}, \quad i = 1, 2, \ldots, d$$
where $\sigma_i$ is the standard deviation of the $i$th variable.
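A product-kernel density estimate with a diagonal bandwidth matrix can be sketched in a few lines. The example below assumes a Gaussian one-dimensional kernel and the normal-reference rule of thumb for the bandwidths; both are assumptions made for illustration, and the function names and sample values are hypothetical.

```python
import math


def rule_of_thumb_bandwidths(samples):
    """One bandwidth per dimension: sigma_j * (4 / ((d + 2) * n)) ** (1 / (d + 4))."""
    n, d = len(samples), len(samples[0])
    factor = (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))
    bandwidths = []
    for j in range(d):
        column = [row[j] for row in samples]
        mean = sum(column) / n
        std = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))  # sample std
        bandwidths.append(std * factor)
    return bandwidths


def mkde_pdf(point, samples, bandwidths):
    """Joint density at `point` using a product of one-dimensional Gaussian kernels."""
    total = 0.0
    for row in samples:
        product = 1.0
        for x_j, sample_j, h_j in zip(point, row, bandwidths):
            u = (sample_j - x_j) / h_j  # scaled distance in dimension j
            product *= math.exp(-0.5 * u * u) / (math.sqrt(2.0 * math.pi) * h_j)
        total += product
    return total / len(samples)


# Hypothetical samples of (forecast error, one extracted feature).
data = [[0.0, 0.0], [2.0, 2.0]]
h = rule_of_thumb_bandwidths(data)
density = mkde_pdf([1.0, 1.0], data, h)
```

Integrating this joint density numerically over the feature dimensions would give the error distribution from which interval limits are read.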

Confidence Interval Based on MKDE.
After the joint probability density distribution of the PV power forecasting error is obtained using the MKDE, the CI is used to quantify it. The PV power forecasting error is the difference between the PV power forecast value $P_{fore}$ and the actual PV power value $P_{true}$ at a specified point in time, and its formula is as follows:
$$e = P_{fore} - P_{true}$$
Suppose the vector including the errors and meteorological factors is $X = [e, u_1, u_2, \ldots, u_n]^T$, where $U = [u_1, u_2, \ldots, u_n]^T$ is the vector of meteorological features extracted by SAE. MKDE calculates the joint PDF of the vector $X$, from which the CDF of $X$ is obtained. Its confidence level is calculated as follows:
$$P(x_{low} < X < x_{up}) = 1 - \theta$$
where the interval $[x_{low}, x_{up}]$ is the confidence interval at the confidence level $1 - \theta$, $x_{low}$ is the lower limit of the confidence interval, and $x_{up}$ is the upper limit. $P(x_{low} < X < x_{up})$ is the probability that $X$ falls into the interval $[x_{low}, x_{up}]$. According to the time points corresponding to the lower and upper limits $x_{low}$ and $x_{up}$, the points $e_{low}$ and $e_{up}$ are selected simultaneously. Therefore, the PV power forecasting interval is $[P_{fore} - |e_{low}|, P_{fore} + |e_{up}|]$. The PIs are shown in Figure 3.
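The final step, turning error limits into a power interval, can be sketched as follows. As a simplification, this example reads $e_{low}$ and $e_{up}$ from empirical quantiles of the errors in one cluster instead of from the MKDE-based CDF, and it assumes a central interval with probability $\theta/2$ in each tail; the function names and numbers are hypothetical.

```python
def empirical_quantile(sorted_values, q):
    """Quantile by linear interpolation between order statistics."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1.0 - fraction) + sorted_values[upper] * fraction


def prediction_interval(p_fore, errors, confidence=0.90):
    """Interval [P_fore - |e_low|, P_fore + |e_up|] at the given confidence level."""
    theta = 1.0 - confidence
    ordered = sorted(errors)
    e_low = empirical_quantile(ordered, theta / 2.0)       # lower error limit
    e_up = empirical_quantile(ordered, 1.0 - theta / 2.0)  # upper error limit
    return p_fore - abs(e_low), p_fore + abs(e_up)


# Hypothetical forecast errors of one cluster and a point forecast of 10.0.
lower_bound, upper_bound = prediction_interval(10.0, [-2.0, -1.0, 0.0, 1.0, 2.0], confidence=0.5)
```

Using a separate error sample per cluster is what lets the interval width vary with the weather conditions of each forecast point.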

Implementation Procedure of the PV Power Interval Prediction Model
This section proposes a novel PV power interval prediction model that integrates SAE-BRNARX for point forecasting, KDPCA for clustering data, and MKDE for uncertainty analysis and interval prediction. The implementation of the prediction procedure is shown in Figure 4. The specific calculation process is as follows:

SAE-BRNARX Point Forecasting Model.
(1) The data is preprocessed, including selecting the required data and paradigm, cleaning (removing abnormal data to eliminate the effect of extreme circumstances and filling in data missing during equipment faults), and normalizing the data.
(2) To reduce redundancy between indexes and overfitting, SAE extracts the features by encoding the input data. The encoded data is divided into training and testing datasets.
(3) The training data is used to obtain the parameters of the SAE-BRNARX network for point forecasting of PV power output. Compared with results of other [...]

Algorithm 1: KMDDPC (steps (10) to (13)).
The sample set is represented by $X$, and $x_i$ is the $i$th sample of $X$, $i = 1, 2, \ldots, n$. $KMD_{n \times n}$ denotes the KMD matrix of this sample set. $kmd_{ij}$ is the element in the $i$th row and $j$th column of the KMD matrix, representing the KMD between the $i$th sample and the $j$th sample. $\delta_i^k$ is the KMD between point $i$ and its $k$th nearest neighbor. $\mu^k$ is the mean of $\delta_i^k$ over all the points.
(10) The decision graph is drawn in two-dimensional space with $\rho_i$ as the x-axis and $\delta_i$ as the y-axis
(11) The points are sorted in descending order according to the decision value $\gamma_i = \rho_i \times \delta_i$
(12) The first $m$ data points are taken as cluster centers, and each remaining point is assigned to the same cluster as its nearest neighbor of higher density
(13) end

KMDDPC Clustering Algorithm
(1) The KMDDPC is applied to cluster the training dataset composed of the forecasting error and the extracted features, as described in Section 3. The specific calculation process is shown in Section 3.3.

Uncertainty Analysis of PV Power Error
(1) The MKDE is used to calculate the joint distribution [...]

Compared with the results of other interval prediction methods, the novel model combining SAE-BRNARX, KMDDPC, and MKDE can supply a narrower prediction interval width, higher interval coverage, and a closer distance between the middle of the interval and the actual value at the same confidence levels.

Accuracy of Point Forecasting Results.
The nRMSE, nMAPE, MRE, nMAE, and $R^2$ are used to evaluate the point forecasting accuracy of the PV power prediction model [2]. In their definitions, $n$ is the number of PV power forecasting points; $P_{cap}$ is the total installed capacity of the PV power equipment; $\hat{y}_i$ and $y_i$ are the forecast power and actual power at time point $i$, respectively.
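For orientation, three of these metrics can be computed as in the sketch below. It assumes the common capacity-normalized forms of nRMSE and nMAE and the usual coefficient of determination; the exact definitions used in the paper, including those of nMAPE and MRE, are not recoverable from the text, so they are left out. The function name and the numbers are hypothetical.

```python
import math


def point_metrics(actual, forecast, p_cap):
    """Capacity-normalized RMSE and MAE plus R^2 (assumed standard definitions)."""
    n = len(actual)
    errors = [f - a for f, a in zip(forecast, actual)]
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    nrmse = math.sqrt(ss_res / n) / p_cap
    nmae = sum(abs(e) for e in errors) / (n * p_cap)
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"nRMSE": nrmse, "nMAE": nmae, "R2": r2}


# Hypothetical two-point example with an installed capacity of 10.
metrics = point_metrics(actual=[0.0, 10.0], forecast=[2.0, 8.0], p_cap=10.0)
```

Normalizing by the installed capacity makes the scores comparable across plants of different sizes.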

Performance of Prediction Intervals.
PICP, PINAW, ACE, and nMPICD are applied to evaluate the performance of prediction intervals. PICP indicates the percentage of the testing data lying between the upper and lower limits, and its formula is defined as follows:
$$\mathrm{PICP} = \frac{1}{n}\sum_{i=1}^{n} 1_{[L_i, U_i]}(P_{true}) \times 100\%$$
where $1_{[L_i, U_i]}(\cdot)$ is the characteristic function, indicating whether the actual PV power output $P_{true}$ lies within the range $[L_i, U_i]$: if it does, $1_{[L_i, U_i]}(P_{true}) = 1$; otherwise, $1_{[L_i, U_i]}(P_{true}) = 0$; $L_i$ and $U_i$ are the minimum and maximum of the forecast at the $i$th moment. In practice, reliability requires that the PICP be no less than the confidence level of the forecasting intervals, and the width of the forecasting intervals should be as small as possible. PINAW is an important metric for evaluating the quality of forecasting intervals, and its formula is defined as
$$\mathrm{PINAW} = \frac{1}{n R}\sum_{i=1}^{n} (U_i - L_i) \times 100\%$$
where $R$ is the difference between the maximum and minimum of the forecast, which is applied to normalize the average width of the forecasting intervals as a percentage. ACE denotes the bias between the PICP and PINC, expressed as
$$\mathrm{ACE} = \mathrm{PICP} - \mathrm{PINC}$$
nMPICD is applied to evaluate the normalized mean distance between the middle of the PIs and the actual value $P_{true}$. If two PIs both cover a point, the one with the smaller nMPICD value performs better.
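The three interval metrics with recoverable definitions can be checked on a small example. The sketch below returns PICP, PINAW, and ACE as fractions rather than percentages and takes the normalizing range $R$ as an argument; the function name and the numbers are hypothetical.

```python
def interval_metrics(actual, lower, upper, pinc, value_range):
    """Return (PICP, PINAW, ACE) as fractions for a set of prediction intervals."""
    n = len(actual)
    # PICP: share of actual values that fall inside their interval.
    covered = sum(1 for a, lo, hi in zip(actual, lower, upper) if lo <= a <= hi)
    picp = covered / n
    # PINAW: average interval width normalized by the range R.
    pinaw = sum(hi - lo for lo, hi in zip(lower, upper)) / (n * value_range)
    # ACE: coverage achieved minus the nominal confidence level.
    ace = picp - pinc
    return picp, pinaw, ace


picp, pinaw, ace = interval_metrics(
    actual=[1.0, 2.0, 3.0, 4.0],
    lower=[0.0, 1.5, 3.5, 3.0],
    upper=[2.0, 2.5, 4.5, 5.0],
    pinc=0.5,
    value_range=4.0,
)
```

A positive ACE means the intervals cover more points than the nominal level requires, which is only desirable if PINAW stays small.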

Data Collection.
The installation site of the PV system selected in this paper is Yulara, located in the south of the Northern Territory of Australia, as shown in Figure 5. This area lies in the desert, with a significant temperature difference between day and night, low precipitation, high solar radiation, and drought. This means that the region has rich solar energy resources and low weather-induced fluctuation of PV power, so it can fulfill the local power demand and export the excess energy to other regions. However, the extreme weather conditions in desert locations also degrade the performance of PV systems in many ways, including the overheating of solar photovoltaic panels caused by intense solar radiation, the effect of accumulated dust and soiling, and the rapid aging and performance degradation of PV panels due to severe weather such as dust storms. These conditions challenge the construction of an accurate PV interval prediction model. The data source is DKASC, and the specific installation and specification information is shown in Table 1. The 5-minute level data of each station is collected over a year, including PV power output and meteorological features. The recorded meteorological data include GHI, WS, and AT. The study duration is between 6 a.m. and 7 p.m., from March 2019 to November 2020. According to [21], the dataset is [...]

Effect of Point Forecasting.
The interval prediction scheme is implemented on an ordinary personal computer with an AMD R7-3700K processor and 16.00 GB of RAM. The data samples are divided into four time periods from March 2019 to February 2020 to test the performance of the proposed model. The SAE-BRNARX network is used for point forecasting of PV power in the four time periods, and its parameters are shown in Table 3. The parameters of the comparative methods are shown in Table 4. SAE is used for feature extraction from the meteorological information, as shown in Figure 6. Then, the BRNARX network is applied to predict the PV power output. Figure 7 shows sample time series plots of all models during the last five days of every period, which illustrate the excellent agreement between the values forecast by the SAE-BRNARX network and the actual PV power values. Although the actual power curve fluctuates wildly in spring, the SAE-BRNARX network can still forecast the PV power accurately to a certain degree. At the same time, there are considerable deviations between model estimations and actual PV power during early and late daylight hours and during hours of violent fluctuation in NWP. The proposed models can still adapt to these unexpected changes in the PV power profile. Their performance in these periods mainly determines the differences in forecasting accuracy between models. Figure 8 illustrates the changes of AE and APE for the proposed models during the last five days of every period, which validates the above views.
As shown in Table 5, according to the evaluation metrics nRMSE, nMAPE, MRE, nMAE, and $R^2$, the LSTM network and the BRNARX network have larger prediction errors than SAE-BRNARX, and their prediction effectiveness is second to it. KELM and RF have similar prediction effectiveness, worse than that of the LSTM and BRNARX networks. BPNN has the worst prediction effectiveness. Thus, the effectiveness of the combined forecasting model proposed in this paper is confirmed. The main advantage of the NARX network model is that it considers the simultaneous effects of historical observations and exogenous inputs on the data. SAE and BR are applied to improve the NARX network. The former reduces the redundancy of the exogenous input data of the NARX network, and the latter mitigates the overfitting problem and helps the training avoid locally optimal solutions.
Thus, the fluctuation range of the predicted power is closer to the actual power fluctuation. From Table 6, the numbers of prediction errors (in normalized form) within ±0.02, ±0.05, and ±0.1 in the four time periods are larger than those of the other methods. This indicates that the errors of the proposed method tend toward a normal distribution.

Effect of Interval Prediction.
KMDDPC is used to cluster the samples, which include the forecasting errors and the meteorological features extracted by SAE. The KMDs between every two samples in the four periods are shown in Figure 9. Figure 10 shows the clustering results when the number of clusters is 4 ($K = 4$). The originally linearly inseparable samples are mapped to a high-dimensional space through KMD to achieve linear separability and sample clustering. Figure 11 shows the error fluctuation trend and its clustering, with the errors sorted according to the original time series. It can be seen that $K = 4$ avoids overclassification and better describes the distribution characteristics of the errors. It does not produce clusters with too few samples, as would happen with too many clusters. At the same time, the errors are effectively clustered according to their values and extracted features. This matters because the category of PV power forecast errors is vital for determining the reserve capacity required for grid dispatch. From the economic point of view, it is impossible to configure a large reserve margin for a scenario of tiny probability during grid operation and dispatch. Figure 12 shows the joint cumulative probability distributions of forecasting errors and extracted features calculated by MKDE in the different clusters. Figure 13 demonstrates the interval prediction of PV power under multiple clusters using MKDE at different confidence levels, namely 95%, 90%, and 80%, in the last five days of each period. To further validate the availability and universality of the proposed model, we calculate the PICP, PINAW, ACE, and nMPICD of the four periods at different confidence levels, as shown in Table 7. ACE shows that the PICP is significantly higher than its confidence level. PINAW shows that the widths of the PIs depend on the levels of the CI. Larger prediction intervals have a higher probability of covering the actual values.
If the confidence level decreases, the average distance between the middle of the prediction interval and the actual value becomes smaller, as shown by nMPICD. From the evaluation results in the four time periods, the performance of the interval predictions in summer and winter is lower than that in autumn and spring.
As shown in Table 8, compared with the K-means, FCM, and nonclustering models, the proposed model combining SAE and KMDDPC can successfully track the variation of PV power output at the 90% confidence level. It has a narrower average bandwidth and a closer distance between the middle of the PI and the actual value. Table 9 also illustrates that the interval prediction using SAE data is better than that using NWP data. In the four time periods, the prediction performance of MKDE is better than that of other models, such as KDE, Monte Carlo, and Quantile Regression. The PICP and ACE of the proposed model are, in almost all cases, higher than those of the other models. Furthermore, PINAW and nMPICD are, in almost all cases, lower than those of the other models. This shows that the multiple clusters combined with MKDE are helpful for interval prediction, supplying a narrower prediction interval width and higher interval coverage.

Conclusion
PV power fluctuates considerably across seasons and days, and large errors cannot be avoided in point forecasts. To obtain comprehensive and practical forecasting information for dealing with the uncertainty of PV power generation, a novel interval forecasting method based on SAE-BRNARX and KMDDPC-MKDE is proposed in this paper. By analyzing the PV power datasets from Yulara in four periods, the conclusions can be summarized as follows: (1) From the results of point prediction using SAE-BRNARX, the average values of nRMSE, nMAPE, MRE, nMAE, and $R^2$ for the four periods are 4.45%, 0.90%, −0.15%, 3.39%, and 95.93%, respectively. The prediction accuracy is higher than that of other deep learning and machine learning methods. (2) Compared with other classification methods, KMDDPC can better reflect the nonlinear relationship between features, which makes the classification of the samples into multiple clusters more reasonable. (3) MKDE can effectively simulate the prediction error distribution, taking into account the influence of meteorological factors on PV power output. Its joint probability density curves at different confidence levels are generated to measure the uncertainty associated with point forecasting and PIs. (4) From the results of interval prediction using KMDDPC-MKDE, the average values of PICP, PINAW, ACE, and nMPICD for the four time periods are 93.93%, 9.50%, 3.93%, and 7.10% at the 90% confidence level, respectively. Compared with other methods, this method has narrower bandwidth, higher coverage, and a closer distance between the middle of the PI and the actual value. It supplies a new way to realize interval prediction, which is helpful for the system reliability assessment of PV power plants and the dispatch of the smart grid.
Although the proposed method is shown to be effective for interval prediction of PV power, some limitations are still worth further study. The prediction model needs more meteorological factors, such as wind direction and rainfall, to deal with the fluctuation and randomness of PV power. Some hyperparameters are determined by experience. The model lacks methods for detecting outliers and irrelevant variables. Further research can introduce methods to enhance its performance. For example, a metaheuristic algorithm could be employed to optimize the BR by selecting appropriate parameters to improve the prediction accuracy. A recognition algorithm could be applied for the adaptive detection of outliers. Moreover, the relevant variables could be selected to establish joint probability density curves with the errors. A PI with a narrower width, higher coverage, and a closer distance between the middle of the interval and the actual point can be applied to smart grid operation.

PV: Photovoltaic
NWP: Numerical weather prediction
ANN: Artificial neural network
SVM: Support vector machine
GA: Genetic algorithm
NARX: Nonlinear autoregressive exogenous model
DEPSO: Particle swarm optimization improved by differential evolution
BR: Bayesian regularization
SAE: Sparse autoencoder
PI: Prediction interval
CI: Confidence interval
KDE: Kernel density estimation
GRU: Gated recurrent unit
MKDE: Multivariable kernel density estimation
DPC: Density peaks clustering
KMD: Kernel Mahalanobis distance
MD: Mahalanobis distance
PDF: Probability density function
LSTM: Long short-term memory
RF: Random forest
RNN: Recurrent neural network
nRMSE: Normalized root mean square error
CDF: Cumulative distribution function