Optimal Identification for Dynamic PV Cell Parameter Based on a Data-Extension-Driven Method



Introduction
In recent decades, the utilization of fossil fuels has caused serious environmental pollution [1]. To avoid further environmental deterioration, renewable energy will play an important role in the future, and solar energy in particular will be widely used owing to its easy installation, zero emission, sustainability, and economy [2]. To design a photovoltaic (PV) system effectively and maximize its generation efficiency, a highly accurate PV cell model [3] is required to describe the electrical relationship between the output current and voltage. In general, PV cell models are divided into three types according to the number of parallel-connected diodes: the single-diode model (SDM) [4], the double-diode model (DDM) [5], and the triple-diode model (TDM) [6]. Although the parameters of these models can be acquired from the manufacturers, they easily deviate from the original datasheet values because of environmental factors [3] and the inherent multimodal and nonlinear characteristics of the PV cell [7]. Hence, it is crucial to acquire the real, dynamic parameters of the PV cell under different weather conditions and operating states, which is usually called PV cell parameter identification [8]. Based on PV cell parameter identification, PV system simulation [9], performance assessment, real-time control, and dispatch [10] can be implemented effectively and accurately.
Owing to the importance of PV systems, many researchers have developed methods and carried out comparative experiments for PV cell parameter identification. So far, these methods can be divided into three types, i.e., analytical methods [11], metaheuristic algorithms [12,13], and machine learning [14,15]. The literature on these methods can be summarized as follows:
(i) Analytical Methods. In [16], a hybrid method combining the analytical method with a numerical method was designed to extract the PV cell parameters from the minimal information in the datasheet. Similarly, a statistical method combined with the analytical method [17] was proposed to achieve an accurate identification of PV cell parameters. Although the analytical methods [16,17] can realize a stable identification, they require a series of complex mathematical operations, which leads to a long computation time and a high computation cost.
(ii) Metaheuristic Algorithms. The metaheuristic algorithms [3] are classical model-free optimization methods that avoid complex mathematical operations on the PV cell model. Furthermore, they perform well in both global exploration and local exploitation, which makes them suitable for accurate parameter identification of PV cells with nonlinear and complex characteristics. In [18], a whale optimization algorithm (WOA) inspired by the hunting process of humpback whales was used for parameter identification of the three common PV cell models. Based on this method, two improved WOA variants [19,20] were proposed to further enhance the convergence speed and the global searching ability. Owing to its fast convergence rate, small number of tuning parameters, and strong exploitation ability, the grey wolf optimization (GWO) [21] has been employed for parameter estimation of PV cells.
Furthermore, several variants of GWO were proposed to improve the identification accuracy, including chaotic GWO [22], orthogonal learning-based GWO [23], and analytical plus GWO [24]. As one of the most classical metaheuristic algorithms, particle swarm optimization (PSO) [25] was introduced for parameter identification of PV cells since it offers great flexibility and an efficient search for various complex optimization problems. Several variants of PSO, such as flexible PSO [26], guaranteed convergence PSO [27], parallel PSO (PPSO) [28], time-varying acceleration coefficients PSO (TVACPSO) [29], and enhanced leader PSO (ELPSO) [30], have been proposed to further improve the results and have been applied to parameter identification. Among them, TVACPSO [29] can effectively improve the identification accuracy via an appropriate compromise between local exploitation and global exploration, while the continuous mutation strategy of ELPSO [30] can effectively enhance the global search ability and avoid premature convergence. Besides, many other metaheuristic algorithms have been applied to parameter identification of PV cells, such as the backtracking search algorithm [31], bald eagle search [32], squirrel search algorithm [33], queuing search optimization [34], white shark optimization algorithm [35], and moth flame algorithm [36].
(iii) Machine Learning. To evaluate the generation characteristics of the PV cell under different weather conditions, a feed-forward neural network [37] was adopted to approximate the parameters by taking the irradiation and temperature as the network input. To enhance the optimization stability and accuracy of metaheuristic algorithms, various machine learning techniques were used to extend the limited measured or given output data; thus, a reliable fitness function can be constructed with more fitting data.
In [14], a back propagation neural network (BPNN) was employed to enhance the optimization performance of the original equilibrium optimizer. In a similar way, an extreme learning machine (ELM) [15] and a genetic neural network (GNN) [38] were used to further improve the fitting accuracy and the parameter identification performance.
To guarantee high optimization stability and a high-quality solution simultaneously, the combination of machine learning-based data extension and metaheuristic algorithms has been verified as an effective technique for PV cell parameter identification. Consequently, this work adopts the same data-extension-driven approach for optimal identification of dynamic PV cell parameters, where a generalized regression neural network (GRNN) is introduced for data extension. To verify the effectiveness and advantages of the proposed method, a comprehensive case study with three typical photovoltaic models (i.e., SDM, DDM, and TDM) is carried out for performance comparison with traditional metaheuristic algorithms and other machine learning techniques. Compared with the existing studies, the main novelty and contributions of this work are summarized as follows:
(1) The fitness function of the traditional metaheuristic algorithms for optimal identification of dynamic PV cell parameters is constructed from the inadequate measured output current-voltage (I-V) data provided by the manufacturer. As a result, the traditional metaheuristic algorithms easily produce widely different optimal parameter vectors in different runs, and the identification accuracy is low when the data are seriously inadequate. In contrast, the proposed method provides more reliable I-V data via the GRNN-based data extension; thus, a more reliable fitness function can be constructed from adequate and representative I-V data.
Consequently, the optimization stability and identification accuracy of each metaheuristic algorithm can be improved simultaneously.
(2) Compared with the existing machine learning techniques for PV cell parameter identification, the GRNN-based data-extension-driven method can effectively avoid overfitting due to its stronger nonlinear mapping ability, higher error tolerance, and robustness [39]. Besides, it further improves the computation efficiency owing to its simple structure and fast network calculation.
International Journal of Photoenergy
The rest of this paper is organized as follows: Section 2 provides the mathematical modelling of optimal identification for dynamic PV cell parameters; Section 3 introduces the implementation framework and details of the proposed data-extension-driven method for parameter identification of the PV cell; the case studies are illustrated and discussed in Section 4; and the conclusion is given in Section 5.

Mathematical Modelling
2.1. Electrical Equivalent Circuit of PV Model. Because the PV cell converts solar energy into electricity, its output is closely related to the illumination. In general, the equivalent circuit of a PV cell is constructed around a semiconductor P-N junction diode [14]. The Shockley diode equivalent circuit model is considered the standard photovoltaic model [40].
2.1.1. Single-Diode Model. The equivalent circuit of the single-diode model of a photovoltaic cell is shown in Figure 1. As depicted in Figure 1, the photovoltaic cell consists of a constant current source with photocurrent $I_{ph}$ in parallel with a diode. When sunlight shines on the photovoltaic cell, it produces this photocurrent. According to Kirchhoff's current law, the output current can be described as follows:
$$I_L^{SDM} = I_{ph} - I_d - I_{sh}, \tag{1}$$
where $I_L^{SDM}$ represents the output current of the PV cell using the SDM (written $I_L$ below where the model is clear from context); $I_d$ represents the diode current; and $I_{sh}$ denotes the shunt resistor current.
According to the Shockley principle, the current flowing through the equivalent diode is as follows:
$$I_d = I_{sd}\left[\exp\left(\frac{V_L + I_L R_s}{a V_t}\right) - 1\right], \tag{2}$$
where $I_{sd}$ represents the reverse saturation current of the diode; $V_L$ denotes the output voltage of the PV cell; $a$ denotes the diode's ideality factor; and $V_t$ is the junction thermal voltage, given by
$$V_t = \frac{K T}{q}, \tag{3}$$
where $T$ represents the temperature of the cell (in kelvin); $K = 1.38 \times 10^{-23}$ J/K is the Boltzmann constant; and $q = 1.6 \times 10^{-19}$ C is the electron charge. The current flowing through the shunt resistor can be expressed as follows:
$$I_{sh} = \frac{V_L + I_L R_s}{R_{sh}}, \tag{4}$$
where $R_s$ is the series resistance and $R_{sh}$ is the shunt resistance of the PV module.
In summary, based on Equations (1)-(4), the current-voltage characteristic of the PV cell can be expressed as follows [14]:
$$I_L^{SDM} = I_{ph} - I_{sd}\left[\exp\left(\frac{V_L + I_L R_s}{a V_t}\right) - 1\right] - \frac{V_L + I_L R_s}{R_{sh}}. \tag{5}$$
Equation (5) contains five unknown parameters, namely $I_{ph}$, $I_{sd}$, $R_s$, $R_{sh}$, and $a$; hence, the SDM is also called the five-parameter model.
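Because the output current appears on both sides of the SDM characteristic equation, evaluating the model at a given voltage requires a numerical root search. The following is a minimal sketch of one way to do this with Newton's method; it is not the authors' implementation. The function names and the parameter values in `PARAMS` are illustrative assumptions, and only the constants K and q are taken from the text.

```python
import math

K_BOLTZMANN = 1.38e-23   # J/K, value quoted in the text
Q_ELECTRON = 1.6e-19     # C, value quoted in the text


def thermal_voltage(temp_kelvin):
    """Junction thermal voltage V_t = K*T/q."""
    return K_BOLTZMANN * temp_kelvin / Q_ELECTRON


def sdm_residual(i_l, v_l, i_ph, i_sd, r_s, r_sh, a, v_t):
    """I_ph - I_d - I_sh - I_L; this is zero for points on the SDM I-V curve."""
    v_j = v_l + i_l * r_s                       # voltage across diode and shunt branch
    i_d = i_sd * (math.exp(v_j / (a * v_t)) - 1.0)
    i_sh = v_j / r_sh
    return i_ph - i_d - i_sh - i_l


def sdm_current(v_l, i_ph, i_sd, r_s, r_sh, a, v_t, tol=1e-12, max_iter=100):
    """Solve the implicit SDM equation for I_L with Newton's method."""
    i_l = i_ph                                  # start at the photocurrent
    for _ in range(max_iter):
        v_j = v_l + i_l * r_s
        exp_term = math.exp(v_j / (a * v_t))
        f = i_ph - i_sd * (exp_term - 1.0) - v_j / r_sh - i_l
        # Derivative of f with respect to I_L (always negative).
        df = -i_sd * r_s / (a * v_t) * exp_term - r_s / r_sh - 1.0
        step = f / df
        i_l -= step
        if abs(step) < tol:
            break
    return i_l


# Illustrative (assumed) parameter values; they are not taken from this paper.
PARAMS = dict(i_ph=0.76, i_sd=3.2e-7, r_s=0.036, r_sh=54.0, a=1.48,
              v_t=thermal_voltage(306.15))

for v in (0.0, 0.3, 0.5):
    print(f"V_L = {v:.2f} V -> I_L = {sdm_current(v, **PARAMS):.4f} A")
```

Starting the iteration at the photocurrent is a design choice: for non-negative voltages the residual there is non-positive and the residual is concave and decreasing in the current, so the Newton iterates approach the root from one side.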

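2.1.2. Double-Diode Model.
The double-diode model is shown in Figure 2. According to Kirchhoff's current law, the output current of the double-diode model is equal to the algebraic sum of the internal branch currents, as follows:
$$I_L^{DDM} = I_{ph} - I_{d1} - I_{d2} - I_{sh}, \tag{6}$$
where $I_L^{DDM}$ is the output current of the photovoltaic DDM; $I_{d1}$ and $I_{d2}$ are the currents through the two diodes, respectively; and $I_{sh}$ is the current through the parallel (shunt) resistance.
Substituting $I_{sh}$, $I_{d1}$, and $I_{d2}$ into Equation (6), the current equation can be given as follows [14]:
$$I_L^{DDM} = I_{ph} - I_{sd1}\left[\exp\left(\frac{V_L + I_L R_s}{a_1 V_t}\right) - 1\right] - I_{sd2}\left[\exp\left(\frac{V_L + I_L R_s}{a_2 V_t}\right) - 1\right] - \frac{V_L + I_L R_s}{R_{sh}}, \tag{7}$$
where $I_{sd1}$ and $I_{sd2}$ represent the reverse saturation currents of diode 1 and diode 2, respectively, and $a_1$ and $a_2$ represent the diode ideality constants. Although the DDM has higher accuracy, it contains two more unknown parameters than the SDM, i.e., seven in total: $I_{ph}$, $I_{sd1}$, $I_{sd2}$, $R_s$, $R_{sh}$, $a_1$, and $a_2$.
2.1.3. Triple-Diode Model.
For the triple-diode model, the output current is expressed as follows:
$$I_L^{TDM} = I_{ph} - I_{d1} - I_{d2} - I_{d3} - I_{sh}, \tag{8}$$
where $I_{d1}$, $I_{d2}$, and $I_{d3}$ are the currents through the three diodes, respectively. Substituting the branch currents into Equation (8), the current equation can be given as follows [14]:
$$I_L^{TDM} = I_{ph} - \sum_{j=1}^{3} I_{sdj}\left[\exp\left(\frac{V_L + I_L R_s}{a_j V_t}\right) - 1\right] - \frac{V_L + I_L R_s}{R_{sh}}, \tag{9}$$
where $I_{sd1}$, $I_{sd2}$, and $I_{sd3}$ represent the reverse saturation currents of diode 1, diode 2, and diode 3, respectively, and $a_1$, $a_2$, and $a_3$ are the diode ideality constants. The TDM is also called the nine-parameter model of a PV cell, since it has nine unknown parameters: $I_{ph}$, $I_{sd1}$, $I_{sd2}$, $I_{sd3}$, $R_s$, $R_{sh}$, $a_1$, $a_2$, and $a_3$.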
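The three models differ only in the number of parallel diode branches, so a single routine can cover all of them. The sketch below is an illustration under that reading, not code from the paper; the function names and the numeric values in the usage lines are hypothetical.

```python
import math


def diode_model_residual(i_l, v_l, i_ph, r_s, r_sh, diodes, v_t):
    """I_ph - sum_j I_dj - I_sh - I_L for a model with len(diodes) parallel diodes.

    `diodes` is a list of (i_sd_j, a_j) pairs: one pair gives the SDM,
    two pairs the DDM, and three pairs the TDM.
    """
    v_j = v_l + i_l * r_s                       # voltage across the parallel branches
    i_diodes = sum(i_sd * (math.exp(v_j / (a * v_t)) - 1.0) for i_sd, a in diodes)
    return i_ph - i_diodes - v_j / r_sh - i_l


def unknown_parameter_count(n_diodes):
    """I_ph, R_s, and R_sh plus one (I_sd, a) pair per diode."""
    return 3 + 2 * n_diodes


V_T = 0.0264  # assumed thermal voltage in volts, for illustration only
print([unknown_parameter_count(n) for n in (1, 2, 3)])
print(diode_model_residual(0.70, 0.3, 0.76, 0.036, 54.0,
                           [(3.2e-7, 1.48), (1.0e-7, 2.0)], V_T))
```

Passing the diodes as a list keeps the residual identical across models, which is convenient when the same optimizer is run on the SDM, DDM, and TDM.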

2.2. Objective Function.
For PV cell parameter extraction, the objective function should closely reflect the parameter accuracy. In fact, the real internal parameters of a PV cell cannot be acquired [3]; thus, the identification accuracy cannot be evaluated directly from the differences between the identified parameters and the real parameters. In general, the parameter accuracy is reflected by the output differences of the PV cell, i.e., the differences between the real measured output data and the output data calculated from the identified parameters. Here, the objective function is designed as the root mean square error (RMSE) of the output current differences over multiple data points, which directly reveals the overall difference between the measured and the calculated output data. Hence, the objective function effectively matches the parameter accuracy, i.e., a smaller objective value indicates a higher parameter accuracy. It can be written as follows [3]:
$$\mathrm{RMSE}(x) = \sqrt{\frac{1}{N}\sum_{k=1}^{N} f\left(V_{L,k}, I_{L,k}, x\right)^2}, \tag{10}$$
where $x$ is the solution vector of PV cell parameters; $N$ is the number of comparison data points; and $I_{L,k}$ and $V_{L,k}$ are the $k$th measured output current and output voltage, respectively. For a more intuitive illustration, the error functions $f(V_L, I_L, x)$ and the unknown parameters $x$ under different PV models are shown in Table 1.
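The RMSE objective can be sketched in a few lines. The example below assumes the SDM error function is the right-hand side of the SDM characteristic equation minus the measured current, which is the usual form in the literature; the names `sdm_error` and `rmse`, the assumed temperature, and the synthetic data are illustrative and not taken from the paper.

```python
import math

V_T = 1.38e-23 * 306.15 / 1.6e-19   # thermal voltage at an assumed 306.15 K


def sdm_error(v_l, i_l, x):
    """Error f(V_L, I_L, x) for the SDM with x = (I_ph, I_sd, R_s, R_sh, a)."""
    i_ph, i_sd, r_s, r_sh, a = x
    v_j = v_l + i_l * r_s
    return i_ph - i_sd * (math.exp(v_j / (a * V_T)) - 1.0) - v_j / r_sh - i_l


def rmse(x, data, error_fn=sdm_error):
    """Root mean square of the model error over N measured (V_L, I_L) pairs."""
    return math.sqrt(sum(error_fn(v, i, x) ** 2 for v, i in data) / len(data))


# Synthetic data for illustration: with R_s = 0 the model is explicit in I_L,
# so sdm_error(v, 0.0, x_true) returns the model current at voltage v.
x_true = (0.76, 3.2e-7, 0.0, 54.0, 1.48)
data = [(0.05 * k, sdm_error(0.05 * k, 0.0, x_true)) for k in range(12)]

x_shifted = (0.77, 3.2e-7, 0.0, 54.0, 1.48)   # photocurrent off by 0.01 A
print(rmse(x_true, data), rmse(x_shifted, data))
```

At the generating parameters the objective is zero, and a 0.01 A offset in the photocurrent shifts every residual by the same amount, so the RMSE equals that offset.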

Dynamic PV Parameter Identification by Data-Extension-Driven Method
3.1. Framework for Parameter Identification. As shown in Figure 4, the framework of the proposed data-extension-driven parameter identification has three main parts. The first trains the generalized regression neural network (GRNN) on the measured current and voltage data. The second generates additional I-V data to construct an improved fitness function. The last is the exploration and exploitation carried out by the metaheuristic algorithms.
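The third part of the framework can be illustrated with a basic global-best PSO, one of the metaheuristics considered in this work. The sketch below is a textbook PSO with an inertia weight, not any of the specific variants cited in the Introduction. The coefficients `w`, `c1`, and `c2` and the toy fitness function are assumptions, while the population size of 30 and the 300 iterations follow the settings reported for the case studies.

```python
import random


def pso_minimize(fitness, bounds, pop_size=30, max_iter=300, seed=0,
                 w=0.7, c1=1.5, c2=1.5):
    """Minimize `fitness` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    vel = [[0.0] * dim for _ in range(pop_size)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_val = [fitness(p) for p in pos]
    g = min(range(pop_size), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best so far
    for _ in range(max_iter):
        for k in range(pop_size):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[k][d] = (w * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])
                             + c2 * r2 * (gbest[d] - pos[k][d]))
                # Clamp the new position to the search bounds.
                pos[k][d] = min(max(pos[k][d] + vel[k][d], lo), hi)
            val = fitness(pos[k])
            if val < pbest_val[k]:
                pbest[k], pbest_val[k] = pos[k][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[k][:], val
    return gbest, gbest_val


# Toy fitness for illustration; in the framework it would be the RMSE objective.
best, best_val = pso_minimize(lambda p: (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2,
                              [(-5.0, 5.0), (-5.0, 5.0)])
print(best, best_val)
```

In the proposed framework the `fitness` argument would be the RMSE built from the measured and GRNN-extended data, and `bounds` would hold the search range of each PV cell parameter.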

I-V Operation Data Fitting Based on Generalized Regression Neural Network
GRNN is a form of radial basis function neural network with several advantages, such as fast learning, strong nonlinear mapping ability, and ease of training. GRNN is a forward-propagating network that requires neither a cyclic training process nor an optimization of the network weights [41]. Hence, both the training process and the data extension can be computed quickly. Unlike traditional artificial neural networks (e.g., the back propagation neural network), GRNN does not need the numbers of layers and neurons to be tuned. Moreover, GRNN directly calculates the network output by nonparametric estimation based on the Parzen method. It has been verified that GRNN exhibits strong nonlinear mapping ability, high error tolerance, and robustness [39], especially for learning tasks with a small sample dataset [42]. As a result, it can effectively avoid overfitting; thus, a good prediction accuracy can be acquired for the data extension of the PV cell with inadequate measured data. GRNN contains four structural layers: input layer, pattern layer, summation layer, and output layer. The number of neurons in the input layer is equal to the dimension of the input vector in the learning sample, and each neuron is a simple distribution unit that passes the input variables directly to the pattern layer.
In general, corresponding to the network input $X = [x_1, x_2, \ldots, x_n]^T$, the output is $Y = [y_1, y_2, \ldots, y_k]^T$.
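The number of neurons in the pattern layer is equal to the number of learning samples $m$, and each neuron corresponds to a different sample. The transfer function of the $i$th pattern-layer neuron is as follows:
$$p_i = \exp\left[-\frac{(X - X_i)^T (X - X_i)}{2\delta^2}\right], \quad i = 1, 2, \ldots, m, \tag{11}$$
where $X$ is the network input variable; $X_i$ is the learning sample corresponding to the $i$th neuron; and $\delta$ is the hyperparameter of the model, which needs to be set before training starts.
In the summation layer, two types of neurons are used for summation. The number of nodes in the summation layer is equal to the output sample dimension plus one, i.e., $k + 1$. The output of the first type is the arithmetic sum of the pattern-layer outputs:
$$S_D = \sum_{i=1}^{m} p_i. \tag{12}$$
The outputs of the second type are weighted sums of the pattern-layer outputs:
$$S_{Nj} = \sum_{i=1}^{m} y_{ij}\, p_i, \quad j = 1, 2, \ldots, k, \tag{13}$$
where $y_{ij}$ is the connection weight between the $i$th neuron in the pattern layer and the $j$th neuron in the summation layer, taken as the $j$th element of the output of the $i$th learning sample.
The number of neurons in the output layer equals the dimension $k$ of the output vector in the learning sample, and each neuron divides the corresponding weighted sum of the summation layer by $S_D$:
$$y_j = \frac{S_{Nj}}{S_D}, \quad j = 1, 2, \ldots, k. \tag{14}$$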
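The layer-by-layer description above amounts to a kernel-weighted average of the training outputs. The sketch below shows this for the simplest case of a scalar input (voltage) and a scalar output (current). It is a minimal illustration, not the authors' code, and the sample I-V points and the value of `delta` are hypothetical.

```python
import math


def grnn_predict(x, train_x, train_y, delta):
    """GRNN output for a scalar input x and scalar training targets.

    Pattern layer:   p_i = exp(-(x - x_i)^2 / (2 * delta^2))
    Summation layer: S_D = sum_i p_i,  S_N = sum_i y_i * p_i
    Output layer:    y = S_N / S_D
    """
    p = [math.exp(-((x - xi) ** 2) / (2.0 * delta ** 2)) for xi in train_x]
    s_d = sum(p)
    s_n = sum(yi * pi for yi, pi in zip(train_y, p))
    # If x lies very far from every sample, s_d can underflow to zero.
    return s_n / s_d


# Hypothetical I-V samples, for illustration only.
volts = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
amps = [0.760, 0.758, 0.756, 0.750, 0.700, 0.400]

print(grnn_predict(0.25, volts, amps, delta=0.02))
print(grnn_predict(0.25, volts, amps, delta=0.20))
```

There is no iterative training: the samples themselves are the pattern layer, and `delta` is the only quantity to choose. A small `delta` follows the nearest samples closely, while a large one smooths across many samples.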

Table 1: Error functions and solution vectors of the different PV models (columns: Model, Error function, Solution vector).
3.2.2. GRNN-Based I-V Data Fitting. As the number of training data increases, the extraction accuracy continually improves. Additional I-V data can be obtained by curve fitting, which is realized by GRNN as shown in Equations (11)-(14), and the extended data are then used in the fitness function. By considering the newly generated data, it can be written as follows [15,38]:
$$\mathrm{RMSE}(x) = \sqrt{\frac{1}{N + N_p}\sum_{k=1}^{N + N_p} f\left(V_{L,k}, I_{L,k}, x\right)^2}, \tag{15}$$
where $N$ is the number of measured data points and $N_p$ represents the number of extended data points. Note that the dynamic PV parameter identification is thus an optimization problem whose objective function, Equation (15), is based on both the measured and the extended data. Since both the measured and the extended data of the PV cell are regarded as highly accurate, uncertainties are not considered in the optimization. Compared with the original RMSE in Equation (10), the proposed method constructs a more reliable fitness function in Equation (15) with the additional virtual data generated by GRNN, which provides an efficient guide for the metaheuristic algorithms. Consequently, the proposed method can enhance reliability and resilience at the lowest possible cost.
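The data extension and the extended fitness function can be sketched together as follows. The placement of the virtual voltages (evenly spaced inside the measured range), the value of `delta`, and the synthetic measured curve are assumptions made for illustration; the text does not specify them. The function names are also hypothetical.

```python
import math


def grnn_predict(v, train_v, train_i, delta):
    """Scalar GRNN estimate: kernel-weighted average of the training currents."""
    p = [math.exp(-((v - vi) ** 2) / (2.0 * delta ** 2)) for vi in train_v]
    return sum(ii * pi for ii, pi in zip(train_i, p)) / sum(p)


def extend_iv_data(measured, total_points, delta):
    """Return N_p GRNN-predicted (V, I) pairs so that N + N_p == total_points."""
    train_v = [v for v, _ in measured]
    train_i = [i for _, i in measured]
    n_p = total_points - len(measured)
    v_min, v_max = min(train_v), max(train_v)
    # Assumption: virtual voltages are evenly spaced inside the measured range.
    new_v = [v_min + (v_max - v_min) * (k + 1) / (n_p + 1) for k in range(n_p)]
    return [(v, grnn_predict(v, train_v, train_i, delta)) for v in new_v]


def extended_rmse(x, measured, extended, error_fn):
    """RMSE of error_fn over the N measured plus N_p extended data points."""
    data = list(measured) + list(extended)
    return math.sqrt(sum(error_fn(v, i, x) ** 2 for v, i in data) / len(data))


# Hypothetical measured data: 13 points of a made-up, falling I-V curve.
measured = [(0.05 * k, 0.76 - 1e-6 * (math.exp(0.05 * k / 0.04) - 1.0))
            for k in range(13)]
extended = extend_iv_data(measured, total_points=50, delta=0.03)
print(len(measured), len(extended))
```

With 13 measured points and a total of 50, the sketch generates 37 virtual points, matching the 50% scenario described in the case studies. `error_fn` stands for the model error function of the chosen PV model.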

Case Studies
In this paper, the parameters of three PV cell models are estimated by seven metaheuristic techniques, and 26 pairs of measured I-V data are used for the simulation experiments. From these 26 pairs, six data sets representing 50%, 60%, 70%, 80%, 90%, and 100% of the measurements were randomly selected to compare and evaluate the optimization performance of the metaheuristic algorithms under different amounts of measured data.
The total number of measured and predicted data points in each scenario is set to 50 so that the metaheuristic algorithms are given a reliable fitness function; for example, 37 predicted points are generated for the 50% data set (13 measured points). Each metaheuristic algorithm is evaluated in two ways: without data extension (i.e., using only a subset of the measured data) and with data extension. To further verify the performance of the proposed method, two other data-extension-driven methods, i.e., the extreme learning machine (ELM) [15] and the genetic neural network (GNN) [38], are introduced for comparison.
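The bookkeeping of the six scenarios can be made explicit with a short sketch. The rounding rule for ratios that do not give a whole number of points (e.g., 60% of 26) is not stated in the text, so `round` is an assumption here, as are the function names and the fixed random seed.

```python
import random

TOTAL_MEASURED = 26   # measured I-V pairs available in the case study
TOTAL_POINTS = 50     # measured plus predicted points in every scenario


def scenario_sizes(ratio):
    """Return (number of measured points, number of predicted points)."""
    n_measured = round(ratio * TOTAL_MEASURED)   # rounding rule is an assumption
    return n_measured, TOTAL_POINTS - n_measured


def select_subset(measured, ratio, seed=0):
    """Randomly pick the measured pairs used in one scenario, in original order."""
    n_measured, _ = scenario_sizes(ratio)
    rng = random.Random(seed)
    indices = sorted(rng.sample(range(len(measured)), n_measured))
    return [measured[k] for k in indices]


for ratio in (0.5, 0.6, 0.7, 0.8, 0.9, 1.0):
    n_meas, n_pred = scenario_sizes(ratio)
    print(f"{ratio:.0%}: {n_meas} measured + {n_pred} predicted")
```

Only the 50% case (13 measured and 37 predicted points) is confirmed by the text; the other counts depend on the assumed rounding.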
The maximum number of iterations is set to 300, and each method is executed in 100 independent runs. Note that the optimal identification of dynamic PV cell parameters is executed offline with the measured input data, from which the PV cell parameters are then approximated by the proposed method. In particular, the input data can be measured from experimental or practical systems. Therefore, this work is verified only in a simulation environment, without experimental results.
For SDM, the optimal parameters identified by the various algorithms show that the parameter difference between the original metaheuristic optimization algorithms and those with GRNN-based data extension is small. Besides, Figures 5 and 6 clearly show the convergence of the various algorithms under different data, both without and with data extension. The average RMSE achieved by each method is presented in Figure 7, indicating that each algorithm locates the global optimum more readily with data extension. Moreover, Figure 8 depicts the boxplots of RMSE obtained by the original metaheuristic algorithms and the GRNN-assisted algorithms under different scenarios of measured data for SDM, where each metaheuristic algorithm is executed in 100 runs. In Figure 8, the distribution ranges of RMSE obtained by the GRNN-assisted algorithms are clearly narrower than those without data extension. Figure 9 shows the output curves of the PV cell with the optimal parameters obtained by GRNN-WCA based on 50% measured data for SDM. As can be seen in Figure 9, the output curves obtained from GRNN-WCA match the actual data points well.
Table 4 shows the average RMSE obtained by different metaheuristic optimization algorithms.
The results demonstrate that GRNN-based data extension effectively improves optimization stability and accuracy compared with the original metaheuristic algorithms and the other two data-extension-driven algorithms. Table 5 displays the optimal parameters of the best solutions obtained by different algorithms for DDM under 100% measured data. Furthermore, Figures 10 and 11 illustrate the convergence of the different algorithms. It can be seen that GWO converges slowly. Without the GRNN-based data extension, BSA is more easily trapped in a low-quality local optimum than the other algorithms. On the other hand, the data-extension-driven algorithms effectively boost the search efficiency and solution quality while maintaining stronger convergence stability. Furthermore, the simulation results indicate that more training data lead to higher-quality results. Figure 12 compares the average RMSE obtained by different algorithms for DDM. With the same algorithms, the output data extended by GRNN yield more precise values, and each algorithm can find the global optimum more efficiently. Figure 13 shows the boxplots of the different methods using different numbers of measured data.
Table 6 gives the average RMSE of the different algorithms using various measured data sets under TDM. Compared with the original metaheuristic algorithms, the GRNN-based and GNN-based data-extension-driven algorithms acquire a smaller average RMSE under the different scenarios of measured data. In particular, the ELM-based data-extension algorithm easily acquires a larger average RMSE because the network overfits with a limited number of sample data. For the TDM under 100% measured data, Table 7 shows the optimal parameters of the PV cell obtained by different metaheuristic algorithms.
The fitness curves of all the metaheuristic optimization algorithms are shown in Figures 15 and 16. They illustrate that the GRNN-assisted metaheuristic algorithms converge to a high-quality optimum faster, especially when only 50% of the measured data is available. Figure 17 compares the average RMSE between the original metaheuristic optimization algorithms and those with GRNN-based data extension for TDM. It shows that the average RMSE achieved by PSO with GRNN-based data extension drops by around 30%, which demonstrates the excellent fitting performance of GRNN and the effectiveness of the data-extension fitness function.
Boxplots of the different methods under different numbers of measured data are shown in Figure 18. Similarly, the RMSE values obtained by the GRNN-assisted metaheuristic optimization algorithms show a narrower distribution range and tighter upper and lower limits than those obtained without data extension.
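The quantities that such boxplots display can be computed from the per-run RMSE values with the Python standard library. The sketch below is illustrative only: the function name is hypothetical and the two short lists stand in for the 100 RMSE values of one algorithm, since the actual results are not reproduced here.

```python
import statistics


def summarize_runs(rmse_values):
    """Summary statistics of the RMSE values from independent runs."""
    q1, _, q3 = statistics.quantiles(rmse_values, n=4)   # quartile cut points
    return {
        "mean": statistics.mean(rmse_values),
        "median": statistics.median(rmse_values),
        "q1": q1,
        "q3": q3,
        "iqr": q3 - q1,                                   # height of the box
        "range": max(rmse_values) - min(rmse_values),     # overall spread
    }


# Made-up RMSE values for illustration; not results from the paper.
without_extension = [1.0e-3, 1.4e-3, 2.2e-3, 3.1e-3, 4.0e-3]
with_extension = [1.0e-3, 1.1e-3, 1.2e-3, 1.3e-3, 1.4e-3]

print(summarize_runs(without_extension))
print(summarize_runs(with_extension))
```

A narrower range and interquartile range across runs correspond to the higher optimization stability discussed above.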
In Figure 19, the output curves calculated with the optimal parameters obtained by GRNN-WCA are plotted together with the measured data points, where 50% of the measured data is used. The high consistency between the approximated curves and the actual measured data demonstrates that the proposed data-extension-driven method effectively guides the metaheuristic algorithms to a high-quality optimum.

Comparison among Different Models.
For the three PV models presented, the model complexity increases with the number of parallel-connected diodes, as illustrated in Figures 1-3. Consequently, the identification difficulty also increases with the number of parameters. As a result, the computation efficiency of SDM is the highest among these three models; thus, each metaheuristic optimization algorithm can employ the smallest population size (i.e., 30) for parameter identification of SDM. On the other hand, the model accuracy of TDM is the highest in theory since it can express a more complex relationship between the output current and voltage. Based on the optimization results listed in Tables 2, 4, and 6, the sum of the average RMSE obtained by all the algorithms with the different PV models under each scenario of measured data is given in Figure 20. The sum of the average RMSE with TDM is clearly the smallest under each scenario of measured data, while that with SDM is the largest. This supports the higher model accuracy of TDM compared with the other two PV models.

Conclusions
This work proposes a novel data-extension-driven method for parameter identification with different PV cell models, which includes two main contributions, as follows:
(1) The GRNN-based data-extension strategy can be flexibly integrated into different metaheuristic algorithms by providing a trustworthy fitness function, which enhances the optimization accuracy and stability with a limited number of measured data. Hence, the proposed data-extension-driven method can achieve a high identification accuracy for the PV cell parameters with a small number of optimization executions.
(2) For the small-sample data-extension problem of the PV cell, the performance of GRNN is superior to that of ELM and GNN in terms of both the average RMSE and the distribution of RMSE. Besides, it further improves the computation efficiency of parameter identification owing to its simple structure and fast network calculation.
Like the existing studies, this work assumes that the measured data are acquired under the standard test condition. Such data cannot reveal the operating features of the PV cell under different irradiations and temperatures. Besides, the three PV models presented are not readily applicable to a practical PV cell subject to equipment aging, partial shading, and other environmental influences. To handle these two issues, our future work will focus on the dynamic parameter identification of PV cells under different weather conditions and nonstandard output curves.

Data Availability
The case study's data used to support the findings of this study are available from the corresponding author upon request.