Gray-Related Support Vector Machine Optimization Strategy and Its Implementation in Forecasting Photovoltaic Output Power

School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
Meteocontrol (Shanghai) Data Tech Co., Ltd, Shanghai 200233, China
School of Electronics and Information, Northwestern Polytechnical University, Xi’an, Shaanxi 710072, China
Department of Petroleum Engineering, Ahwaz Faculty of Petroleum Engineering, Petroleum University of Technology (PUT), Ahwaz, Iran
Department of Petroleum Engineering, Amirkabir University of Technology (Tehran Polytechnic), Tehran, Iran


Introduction
Given challenges such as climate change and the fossil energy crisis, renewable energy production has become increasingly vital [1][2][3]. Photovoltaic power production has attracted growing attention and has increased each year because of its plentiful resources and minimal pollution [4][5][6]. Improving the reliability and accuracy of photovoltaic output power prediction is therefore of great value [7,8]. Accurate and consistent prediction results may assist the power grid in improving power quality and reducing system reserve capacities [9]. However, due to climate change, severe weather events have become more common in recent years, making it challenging to construct an accurate and reliable prediction model [10][11][12].
Several photovoltaic output power prediction models, including time series models [13,14], physical models [15], and artificial intelligence models [16,17], have recently been proposed. The precision of a physical model depends heavily on the accuracy of the numerical weather prediction (NWP), yet improving NWP accuracy is currently challenging [18]. A time series model cannot represent the nonlinear properties of photovoltaic output power, so its prediction accuracy is low. An artificial intelligence model, by contrast, is capable of nonlinear fitting [19].
There are essentially three types of hybrid models for predicting photovoltaic output power. The first type predicts photovoltaic output power using an artificial intelligence model paired with an optimization technique [20][21][22][23]. Ni et al. showed that photovoltaic output power can be estimated effectively using a lower upper bound estimation approach combined with extreme learning machines (ELMs) [20]. According to Liu and his colleagues, few investigations have assessed the uncertainty of photovoltaic output power predictions [24]. As a result, several neural networks paired with genetic algorithms were created [25]. According to empirical data, the combined model has greater prediction accuracy and reliability. Wang and Shen proposed a hybrid framework, applicable in many situations, that couples a backpropagation neural network with the mind evolutionary algorithm [26]. Their modeling revealed that the hybrid model outperformed the other methods in predicting photovoltaic output power. While the first type produces acceptable predictive performance, its accuracy is difficult to improve further, because it does not extract the various characteristics of the photovoltaic output power.
The second type was proposed to overcome this issue. First, the photovoltaic output power is decomposed into its constituent components using a signal decomposition method, so that many characteristics can be extracted. An artificial intelligence model is then built to predict each component. Giorgi and his colleagues used wavelet decomposition and least squares support vector machines (LSSVM) to forecast photovoltaic output power [27]. In addition, a thorough error assessment was performed to compare the model's efficiency with that of competing models. Wang and his colleagues forecast photovoltaic output power using a wavelet transform (WT) paired with a deterministic method [28]. Numerical results showed that the recommended technique can significantly enhance prediction reliability. Malvoni and his colleagues used WT to decompose the historical data [29]; LSSVM combined with the group method of data handling was then used to predict photovoltaic output power. Majumder and his colleagues proposed a more reliable photovoltaic output power prediction model for different weather conditions and times [30], which combines variational mode decomposition with an extreme learning machine. Even though signal decomposition is effective for extracting features, a single artificial intelligence model has its disadvantages: its parameters are assigned randomly, and it is prone to falling into local optima. It is therefore difficult for the second type to improve the prediction any further.
The third type has been proposed to address the issues mentioned above. First, a signal decomposition model is applied to the original photovoltaic output power. Subsequently, an artificial intelligence model combined with an optimization method is constructed for prediction. Lin and Pai used seasonal decomposition to process the original photovoltaic output power [31] and then introduced evolutionary algorithm-optimized least squares support vector regression for forecasting. According to empirical data, the suggested model outperformed the competing models in prediction accuracy. Because the significant volatility of photovoltaic output power makes prediction challenging, Shang and Wei developed a new approach based on enhanced empirical mode decomposition and support vector regression tuned by an improved optimization approach [32]. Eseye and his colleagues used WT to divide the original photovoltaic output power into finer components [33]. Support vector regression was then applied, with particle swarm optimization used to tune its parameters and increase prediction accuracy. The findings revealed that the hybrid model was more accurate. Behera and Nayak decomposed the original photovoltaic output power using empirical mode decomposition and predicted it with an ELM optimized by the sine cosine algorithm [34]. The findings revealed that the suggested model worked well in predicting photovoltaic output power.
Even though the third type produces superior prediction results, it nevertheless has the following disadvantages. First, because photovoltaic output power fluctuates strongly and is highly variable, no model with fixed parameters can provide a reliable estimate, and the adaptability of existing prediction models is sometimes overlooked. Second, the present decomposition models have not efficiently extracted distinct information from photovoltaic output power. The purpose of this article is to propose a model that estimates photovoltaic (PV) output power with higher accuracy than previous works. Therefore, a support vector machine optimized by gray wolf optimization (GWO-SVM) was studied, and the performance of this model was assessed by examining the related statistical analyses.

Support Vector Machine
The support vector machine (SVM) is a supervised learning approach grounded in statistical learning theory and is widely used for pattern recognition and data analysis [35,36]. It is also applied to regression analysis and classification [37]. In this study, the technique is used as a regression approach in which a nonlinear mapping $\Phi(x)$ transfers the data from the input space into a high-dimensional feature space [38][39][40]. This nonlinear mapping is realized through an appropriate kernel function $K(x_i, x_j)$ [41]. Because some points cannot be fitted adequately by a hyperplane, slack variables are introduced [42]. Given a training dataset of $m$ points, $D = \{(x_i, y_i) \mid i = 1, 2, 3, \cdots, m\}$, the regression function is $y = w^T \Phi(x) + b$, where $\Phi(x)$ is the nonlinear mapping function and $w$ and $b$ denote the weight vector and the offset, respectively [43,44]. The optimization problem of the support vector regression model is then as follows [45,46]:
$$\min_{w, b, \xi, \xi^*} \; \frac{1}{2}\|w\|^2 + C \sum_{i=1}^{m} \left(\xi_i + \xi_i^*\right)$$
$$\text{s.t.} \quad y_i - w^T \Phi(x_i) - b \le \varepsilon + \xi_i, \quad w^T \Phi(x_i) + b - y_i \le \varepsilon + \xi_i^*, \quad \xi_i, \xi_i^* \ge 0, \quad i = 1, \ldots, m$$
where $C$ is the penalty parameter, $\varepsilon$ is the parameter of the insensitive loss function, and $\xi_i$ and $\xi_i^*$ are the slack variables [47]. A loss is incurred only when the absolute error between the predicted and actual values exceeds $\varepsilon$ [48,49]. This problem is a convex quadratic program. The Lagrangian function is used to incorporate the constraints into the cost function, and the dual problem can be stated in the following manner [50]:
$$\max_{\alpha, \alpha^*} \; -\frac{1}{2} \sum_{i=1}^{m} \sum_{j=1}^{m} (\alpha_i - \alpha_i^*)(\alpha_j - \alpha_j^*) K(x_i, x_j) - \varepsilon \sum_{i=1}^{m} (\alpha_i + \alpha_i^*) + \sum_{i=1}^{m} y_i (\alpha_i - \alpha_i^*)$$
$$\text{s.t.} \quad \sum_{i=1}^{m} (\alpha_i - \alpha_i^*) = 0, \quad 0 \le \alpha_i, \alpha_i^* \le C$$
Here, $\alpha_i$ and $\alpha_i^*$ are the Lagrange multipliers, and the kernel function measures the similarity between pairs of samples in the feature space [51]. The radial basis function (RBF) kernel is used in this study [52]:
$$K(x_i, x_j) = \exp\left(-\gamma \|x_i - x_j\|^2\right)$$
where $\gamma$ denotes the RBF parameter.
International Journal of Photoenergy
Based on the above, two factors are decisive in this learning procedure, namely, the penalty parameter C and the RBF parameter γ, which govern the generalization capacity and the estimation performance, respectively. Consequently, the SVM hyperparameters must be optimized [53,54].
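The two quantities that C, ε, and γ act on can be illustrated with a minimal Python sketch. It assumes the common RBF form exp(-γ‖x_i − x_j‖²) and the standard ε-insensitive loss; the function names and the numerical values are hypothetical and are not taken from the study.

```python
import math

def rbf_kernel(x_i, x_j, gamma):
    """RBF kernel K(x_i, x_j) = exp(-gamma * ||x_i - x_j||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)

def epsilon_insensitive_loss(y_true, y_pred, epsilon):
    """Zero while the error stays inside the epsilon tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)

# Hypothetical values, for illustration only.
print(rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5))        # identical samples
print(rbf_kernel([1.0, 2.0], [3.0, 2.0], gamma=0.5))        # distant samples
print(epsilon_insensitive_loss(10.0, 10.05, epsilon=0.1))   # inside the tube
print(epsilon_insensitive_loss(10.0, 10.5, epsilon=0.1))    # outside the tube
```

A larger γ makes the kernel decay faster with distance, and a larger ε widens the band of errors that are not penalized.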

Gray-Wolf Optimization
Gray wolf optimization (GWO) is an optimization algorithm derived from simulating the social hierarchy and predation behavior of gray wolves [55,56]. The gray wolf pack has a strict social structure that forms a pyramidal hierarchy [57]. The pack is divided into four categories depending on rank, and lower-ranked wolves follow higher-ranked ones. The pack carries out hunting actions such as encircling, attacking, and capturing prey. In GWO, the prey corresponds to the optimum solution for which the pack searches [58]. The search is guided by the fitness values of the wolves and by the relationship among the different levels [59,60].
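For reference, the position update commonly given in the GWO literature can be written as follows. The source does not state these equations, so the notation is an assumption: X is a wolf position, X_p the prey position, t the iteration, r_1 and r_2 are random vectors with components in [0, 1], and a decreases linearly from 2 to 0 over the iterations. The subscripts α, β, and δ refer to the three leading wolves and are unrelated to the Lagrange multipliers of the SVM section.

```latex
% Encircling the prey
\vec{D} = \left| \vec{C} \cdot \vec{X}_p(t) - \vec{X}(t) \right|, \qquad
\vec{X}(t+1) = \vec{X}_p(t) - \vec{A} \cdot \vec{D}

% Coefficient vectors
\vec{A} = 2a\,\vec{r}_1 - a, \qquad \vec{C} = 2\,\vec{r}_2

% Hunting: the three leaders stand in for the unknown prey position
\vec{D}_\alpha = \left| \vec{C}_1 \cdot \vec{X}_\alpha - \vec{X} \right|, \quad
\vec{D}_\beta  = \left| \vec{C}_2 \cdot \vec{X}_\beta  - \vec{X} \right|, \quad
\vec{D}_\delta = \left| \vec{C}_3 \cdot \vec{X}_\delta - \vec{X} \right|

\vec{X}_1 = \vec{X}_\alpha - \vec{A}_1 \cdot \vec{D}_\alpha, \quad
\vec{X}_2 = \vec{X}_\beta  - \vec{A}_2 \cdot \vec{D}_\beta, \quad
\vec{X}_3 = \vec{X}_\delta - \vec{A}_3 \cdot \vec{D}_\delta

\vec{X}(t+1) = \frac{\vec{X}_1 + \vec{X}_2 + \vec{X}_3}{3}
```

When |A| > 1 the wolves tend to move away from the leaders (exploration), and when |A| < 1 they close in on them (exploitation).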
In the training phase, 70 percent of the records were employed, with the remaining 30 percent used to assess the generalizability of the algorithm. All data were normalized to the range from -1 to 1 before being fed into the SVM model [61].
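The split and the normalization can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the source does not say whether the split was random or ordered, nor which scaling formula was used, so an ordered split and min-max scaling are assumed here, and the function names are hypothetical.

```python
def split_train_test(records, train_fraction=0.7):
    """Ordered split: the first 70% for training, the rest for testing (assumed)."""
    n_train = int(round(len(records) * train_fraction))
    return records[:n_train], records[n_train:]

def scale_to_unit_range(values, lo=None, hi=None):
    """Min-max scaling of a list of numbers to the interval [-1, 1]."""
    lo = min(values) if lo is None else lo
    hi = max(values) if hi is None else hi
    if hi == lo:
        raise ValueError("cannot scale a constant feature")
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

# Hypothetical records, for illustration only.
train, test = split_train_test(list(range(10)))
print(len(train), len(test))
print(scale_to_unit_range([0.0, 5.0, 10.0]))
```

Passing the training minimum and maximum as `lo` and `hi` when scaling the test subset is one way to keep test information out of the fitted scaler; whether the study did so is not stated.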

Sensitivity Analysis
To examine the effects of the input parameters on the output, a mathematical approach known as sensitivity analysis (SA) was used [62]. SA has many uses, including determining research priorities, detecting technological flaws, and identifying essential regions [39,63]. There are two types of SA: global and local [64]. Local sensitivity examines the impact of one factor on the target while all other factors stay unchanged. Global sensitivity, on the other hand, investigates the influence of the inputs on the target when all parameters are varied. Figure 1 shows the influence of the input parameters in GWO-SVM on the predicted photovoltaic output power. As can be seen, the air temperature has the most significant influence on the photovoltaic output power. The results show that all defined inputs have a considerable impact on the photovoltaic output power values.
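The local (one-factor-at-a-time) idea described above can be sketched as follows. The source does not state which SA method produced Figure 1, so this is only an illustration of local sensitivity; `toy_model` is a hypothetical stand-in for a trained predictor, and the central-difference step is an assumption.

```python
def local_sensitivity(model, baseline, step=1e-3):
    """Central-difference sensitivity of the output to each input,
    varying one input at a time while the others stay at the baseline."""
    sensitivities = []
    for k in range(len(baseline)):
        up = list(baseline)
        down = list(baseline)
        up[k] += step
        down[k] -= step
        sensitivities.append((model(up) - model(down)) / (2.0 * step))
    return sensitivities

def toy_model(x):
    # Hypothetical stand-in for a trained model; not the GWO-SVM of the study.
    return 2.0 * x[0] + 0.5 * x[1]

print(local_sensitivity(toy_model, [1.0, 1.0]))
```

A global analysis would instead vary all inputs together, for example by sampling them jointly over their ranges.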

Designing a GWO-SVM Model
Based on the previous discussion, C, ε, and γ control the performance of the SVM algorithm. GWO was therefore employed to optimize these parameters in the current investigation. The GWO comprises four components: social hierarchy, tracking, encircling, and attacking prey. To model the wolf hierarchy, four types of gray wolves are employed, namely, alpha (α), beta (β), delta (δ), and omega (ω), with α, β, and δ representing the three best solutions and ω the remaining candidates. The α, β, and δ wolves are ranked according to their fitness values so that the top three solutions can be used to estimate the prey's location. Detailed descriptions of all elements of this method are available in the literature. The GWO terminates once the stopping condition is met.
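A compact Python sketch of this search loop is given below. It follows the commonly published GWO update rather than the authors' implementation, which the source does not show. The search is run over log10(C), log10(ε), and log10(γ), which is an assumption, as are the bounds, the population size, and the iteration count. `placeholder_error` is a hypothetical stand-in: a real fitness function would train the SVM with the decoded parameters and return its validation error.

```python
import random

def gray_wolf_optimize(objective, bounds, n_wolves=20, n_iterations=100, seed=0):
    """Minimize `objective` inside box `bounds` with a basic GWO loop."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_wolves)]
    best, best_value = None, float("inf")
    for iteration in range(n_iterations):
        ranked = sorted(wolves, key=objective)
        leaders = ranked[:3]                         # alpha, beta, delta
        alpha_value = objective(ranked[0])
        if alpha_value < best_value:                 # remember the best alpha seen
            best, best_value = list(ranked[0]), alpha_value
        a = 2.0 * (1.0 - iteration / n_iterations)   # decreases linearly from 2
        new_wolves = []
        for wolf in wolves:
            position = []
            for d, (lo, hi) in enumerate(bounds):
                estimates = []
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    distance = abs(C * leader[d] - wolf[d])
                    estimates.append(leader[d] - A * distance)
                # Average the three leader-guided estimates, then clip to bounds.
                position.append(min(max(sum(estimates) / 3.0, lo), hi))
            new_wolves.append(position)
        wolves = new_wolves
    final = min(wolves, key=objective)
    if objective(final) < best_value:
        best, best_value = list(final), objective(final)
    return best, best_value

def decode(position):
    """Map a log10-space position to (C, epsilon, gamma)."""
    return tuple(10.0 ** p for p in position)

def placeholder_error(position):
    # HYPOTHETICAL error surface with its minimum at log10 values (1, -1, -2).
    target = (1.0, -1.0, -2.0)
    return sum((p - t) ** 2 for p, t in zip(position, target))

bounds = [(-2.0, 3.0), (-3.0, 0.0), (-4.0, 1.0)]   # assumed log10 search ranges
best_position, best_value = gray_wolf_optimize(placeholder_error, bounds)
print(decode(best_position), best_value)
```

Searching in log space is a common choice for parameters such as C and γ that span several orders of magnitude; the study does not say whether it did so.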

Outlier Analysis
Outlier diagnosis is a critical statistical method used to identify data that deviate from the bulk of a larger data collection [65]. Outliers are detected using an efficient technique termed the leverage statistical approach [66]. The critical leverage limit ($H^*$), the Hat indices ($H$), and the standardized residuals ($R$) are all taken into account in this technique. The Hat matrix is written as follows [65,67]:
$$H = X \left(X^t X\right)^{-1} X^t$$
where $X$ is the two-dimensional $n \times k$ matrix and $t$ denotes the matrix transpose. The Hat values of the individual data points are the main-diagonal elements of $H$ [68]. Outliers are identified by means of the Williams plot, which displays the standardized residuals against the Hat indices [69,70]. The valid range of data is a rectangular area bounded by ±3 standard deviations and by the critical leverage limit $H^* = 3(p + 1)/n$, where $p$ is the number of model inputs and $n$ is the number of training points. The fact that the majority of the data fall within $-3 \le R \le 3$ and $0 \le H \le H^*$ indicates that GWO-SVM has a wide applicability domain. Outliers are data whose $R$ or $H$ values lie outside the ranges $[-3, 3]$ and $[0, H^*]$, respectively. The Williams plot of the GWO-SVM outputs is depicted in Figure 2. Except for one point with $R < -3$, the photovoltaic output power values investigated in this research fell within $[0, H^*]$ and $[-3, 3]$, demonstrating that the GWO-SVM model is statistically sound and can capture the internal relations between the photovoltaic output power and the inputs.
Figure 3 shows the photovoltaic output power values calculated using the GWO-SVM method. The values are plotted against the data index for both the training and testing results. As can be observed, the suggested model has a high prediction capacity.
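The leverage calculation behind a Williams plot can be sketched in plain Python as follows. This is an illustration, not the authors' code: it assumes the standard Hat matrix H = X(XᵗX)⁻¹Xᵗ, takes the standardized residuals as given, and uses hypothetical function names and data.

```python
def solve_linear_system(matrix, rhs):
    """Solve matrix @ x = rhs by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("X^t X is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(n):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def hat_values(X):
    """Diagonal of H = X (X^t X)^-1 X^t, one leverage value per row of X."""
    k = len(X[0])
    gram = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    leverages = []
    for row in X:
        z = solve_linear_system(gram, row)           # z = (X^t X)^-1 x_i
        leverages.append(sum(a * b for a, b in zip(row, z)))
    return leverages

def flag_outliers(leverages, standardized_residuals, h_star):
    """True where a point leaves the region 0 <= H <= H* and -3 <= R <= 3."""
    return [h > h_star or abs(r) > 3.0
            for h, r in zip(leverages, standardized_residuals)]

# Hypothetical design matrix: an intercept column plus one input.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
print(hat_values(X))
print(flag_outliers([0.1, 0.9, 0.1], [0.5, 0.2, -3.5], h_star=0.6))
```

The leverage values sum to the number of columns of X, which gives a quick sanity check on any implementation.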

Model Evaluation
The coefficient of determination ($R^2$) indicates how close the predicted values are to the actual values [44]. $R^2$ typically ranges between 0 and 1, and the model predicts more accurately as it approaches unity. The $R^2$ values for the training and testing subsets of the GWO-SVM model are 0.913 and 0.891, respectively, as shown in the cross plot of modeled versus actual values in Figure 4. These near-unity values reflect the model's ability to estimate the photovoltaic output power. Most points in both the training and testing datasets lie close to the bisector line, showing that the GWO-SVM model has been trained properly and illustrating its prediction capability and precision.
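A minimal Python sketch of the coefficient of determination follows. It assumes the common definition 1 − SS_res/SS_tot; the source does not state which formula was used, and the sample values are hypothetical.

```python
def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical values, for illustration only.
print(r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))   # perfect prediction
print(r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))   # predicting the mean
```

Under this definition a model that always predicts the mean scores 0, and a model worse than that can score below 0.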
The relative deviation percentages for the GWO-SVM model are shown in Figure 5. The deviations do not exceed the 50% band, which supports the accuracy of the GWO-SVM model. Table 1 lists the values of the different statistical parameters used to evaluate the model in estimating the target parameter.
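The relative deviation can be sketched as below. The sign convention (predicted minus actual, divided by actual) and the treatment of the band as symmetric are assumptions, since the source does not give the formula; the data are hypothetical.

```python
def relative_deviation_percent(actual, predicted):
    """Relative deviation in percent, 100 * (predicted - actual) / actual.
    Undefined for actual values of zero, which must be handled by the caller."""
    return [100.0 * (p - a) / a for a, p in zip(actual, predicted)]

def within_band(deviations, band=50.0):
    """True if every deviation lies inside the symmetric +/- band (assumed)."""
    return all(abs(d) <= band for d in deviations)

# Hypothetical values, for illustration only.
deviations = relative_deviation_percent([100.0, 200.0], [110.0, 190.0])
print(deviations, within_band(deviations))
```

Because photovoltaic output can be zero or near zero, for example at night, a relative measure of this kind needs such records to be excluded or treated separately.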
To assess the accuracy of the proposed model, its statistical parameters were compared with those of the most accurate models previously suggested for predicting this parameter, reported by Zhang et al. in 2020 [71]. According to Table 2, the model proposed in this paper shows higher accuracy in estimating the target parameter.

Conclusions
This study aimed to evaluate how effectively a statistical learning-based model can predict photovoltaic output power. For that purpose, the GWO was combined with the SVM model. The GWO method performed well in determining the tuning parameters. When compared with actual data points, the estimations proved to be highly accurate. The efficiency of the suggested methodology was established by the close agreement between model outputs and actual values throughout the training and testing phases, as evidenced by statistical analysis. Comparing the results of the suggested model with another reported correlation confirmed the model's accuracy. In contrast to the rigorous mathematical methodologies otherwise used for this prediction, the suggested strategy for predicting photovoltaic output power is user-friendly, making it a helpful tool for academics, especially in related domains.

Data Availability
Data references are described in the text of the article.

Conflicts of Interest
The authors declare that they have no conflicts of interest.