Short-Term Reactive Power Forecasting Based on Real Power Demand Using Holt-Winters ’ Model Ensemble by Global Flower Pollination Algorithm for Microgrid

,


Introduction
The term "load" in load forecasting applications usually refers to active power (kW) or energy (kWh).There are few papers or articles on reactive power prediction.Many important processes are based on the correct understanding of reactive power such as voltage/VAR optimization, power quality improvement, frequency control, and steady-state power flow analysis [1].Therefore, accurate reactive power forecasting is necessary for power engineers to understand future reactive power profiles and improve the quality of grid operations at the transmission and distribution level [2,3].
In recent years, the commercialization of smart grids and the liberalization of energy markets have created additional demands for accurate reactive power prediction [4,5].In literature, the day-ahead and hour-ahead planning of distribution energy resources requires reactive power forecasts to ensure compliance with grid limits [6].These forecasts are necessary at the aggregate level and point load level to achieve good planning.The impact of reactive power on optimal power flow solutions in smart grids and microgrids was discussed [7].Reactive power forecasting is also relevant to the power market.In many countries, power markets only manage active power trades at regular time intervals.Nonetheless, reactive power projections test the technical feasibility of energy pathways from power plants to consumers by transmission system operators reconfiguring market systems when technical boundary conditions are violated [8,9].Moreover, reactive power forecasting is expected to become a fundamental input for the reactive power market shortly [10,11].
As already mentioned above, the current practice of reactive power prediction consists of multiplying the active power prediction by a factor related to the average power factor or using a simple method that relies on the experience of grid operators [12,13].Several scientific papers deal with the prediction of reactive power.Artificial neural networks have been proposed to predict reactive power in houses and substations.Neural networks have also been presented in a probabilistic framework to predict reactive power in bulk utility buses [14,15].Another approach based on linear and piecewise linear relationships between active and reactive power was used to build a regression model of reactive power [16].The Takagi-Sugano fuzzy method is applied to the ultrashort-term prediction of active and reactive power [17].
Reactive power forecasting based on fuzzy logic requires framework verification for each stage of approval.Again, rationalizing each step is also not feasible for analyzing the boundary condition [18].This also affects the accuracy setting and enrolling capacity of the system.It also lacks hypothesis testing, which is required to validate the model under partial variation conditions.Furthermore, ANFIS involves a very complex structure and gradient learning strategies for its realization [19].The ANFIS model becomes more complicated and, at the same time, also requires more computational time.The design of the membership function dynamically also requires a very complex analysis [20].It is required to handle seven parameters at the input to predict the demand for reactive power in a microgrid.The seven parameters include voltage, frequency, time stamp, number of nonlinear load connected, temperature, real power drawn, and power factor.To address such a large number of data sets, it is required to go for regression analysis, which will combine all the input to predict the output at any instant of time for 5 minutes or 300 sec.Thus, short-term forecasting can be best modeled by the autoregressive integrated moving average (ARIMA) model.Here, in this research work, a seasonal variation to ARIMA has been added to make it simpler and practically feasible.These ARIMA and SARIMA models will act as base models or benchmarking models.The research paper has the following contributions and outcomes: (i) The dynamic optimization model for reactive power prediction based on real power demand and six parameters has been modeled using a global flower pollination algorithm (ii) The developed framework (algorithm) is accurately forecasting the short-term demand of reactive power The rest of the research article has been organized as follows: Section 2 represents the problem formulation based on the proposed scheme as mentioned above.Two benchmarking models such as ARIMA and SARIMA have been discussed under the benchmarking model of Section 4. A detailed result analysis has been described in Section 5 followed by the conclusion discussed in Section 6.

Problem Formulation
In order to analyze the reactive power forecasting based on the real power and consequent price prediction, a time series model can be designed.The flow of reactive power (Q) is a function of voltage and real power drawn over a period of time.Therefore, where Q t represents the reactive power drawn over time t and δ represents the functional parameter of voltage.Taking convolution of eq. ( 1) must satisfy The convolution method applied over eq. ( 2) results in the establishment of the white noise signal Q t t∈Q ~WN 0, δ 2 .Now, in order to analyze standard independent time series data, the system can be modeled in terms of the linearindependent model.Let us compare and model a linear time series signal R T t∈Q , representing real-time power transferred between two identified buses as a function of reactive power flow data.Therefore, shows that the flow of both real and reactive power including losses can be made the same with the rest of the white noise data.Both the time and its time inverse are on the same plane of operation; therefore, the series can be modeled in terms of the ARIMA model of the two different (p, q) processes.Figure 1 shows the contour for reactive power energy for the scaled area x.
So, defining for a set of periods of data by equation θ i = 0, the ARIMA function can be written as Equation (4) represents the series analysis of reactive power modeled on influenced as a function of real power and constant ψ i over a period of time.Again, from a regression analysis point of view, this model is termed an ARI-MA(p) process.Similarly, the ARIMA(q) process becomes Again, from eq. ( 4), for ψ i < 1, eq. ( 4) can be written as Applying energy to eq. ( 6), it becomes Now, solving eq. ( 7) for T ⟶ ∞, it becomes Here, eq. ( 11) represents the unique existence solution of eq.(10).Now, deriving the covariance function, it becomes Again, further deriving eq. ( 12), the energy content of the system becomes Again, on simplification, eq. ( 13) becomes Therefore, the inline optimization becomes The energy density equation presented in eq. ( 16) shows the Gaussian distribution based on probability theory.The empirical confidence can be taken into consideration for evaluating the interpolation.Keeping in view the spikes present in the waveform, in this work, Gaussian flower pollination algorithm (GFPA) is used instead of the flower pollination algorithm (FPA).The merits of GFPA over FPA are listed below.
(i) Easy in segregating the collateral spike (ii) Step size can be varied in accordance with spike length (iii) Analysis of periodic exponential term with respect to seasonal variation in reactive power demand Now, the modified optimization equation becomes Being a seasonal regression on time series data, the exponential decaying part presented in eq. ( 1) converges itself to a solution.A detailed process of flowchart is presented in Figure 2.

Data Collection and Analysis
Reactive power analysis over a time period requires a rigorous evaluation of load performance for both linear type and nonlinear type of loads.Therefore, two simulation test 3 International Journal of Energy Research models have been created inside the lab one with 20 kVAR load and 30 kVAR load.Both the loads are connected to the microgrid through two different sources such as solar PV and battery.The 20 kVAR load consists of high-speed blowers, heaters, and 7 induction motors (each 2 hp).In order to capture the line power data such as time stamp, reactive power, real power, current, and energy consumption, a phasor measurement unit has been installed at each nodal point.Here, the nodal point represents the load termination point.All the loads were connected to the main grid through the line scheduler (TPS2301 and SLVS277) with a preloaded time setting.This enables the operator to create an artificial transition between loads, thereby enabling the PMU to capture the required power quality.Thus, the data collected from PMU was saved into a CSV file for further analysis, testing, and evaluation.
The data were collected every 15 seconds for three months.After the collection of the data depending upon the type of load, the data were segregated into 5 different clusters based on their nominal, ordinal, interval, and ratio.All the data have undergone a redundancy check for incomplete data sets like inserting the unavailable data wherever necessary or deleting a particular set of data.Table 1 shows the basic statistical analysis of PMU data.The data has been separated from each other based on the null hypothesis.As observed in Table 1, the mean value is lowest for 45 days of data, i.e., -0.223.This signifies that the data points as collected from PMU are not far away from each other.This also signifies that, at no condition, the null hypothesis is rejected.The average standard deviation by consolidating all the data is limited to 1.01%, which is much smaller than the allowed statistical deviation of 5%.Again, as observed, the kurtosis is lightly tailed, which means that the data are not highly deviating from its normal distribution.
Table 2 represents the null hypothesis analysis of the cluster.As seen, the R-square error is gradually decreasing from cluster C-15 (days) to C-90.This means that as we approach a large data set, the data set becomes more viable for a time series analysis because the goodness of fit has also increased.The average standard error by considering all the clusters becomes 0.698 (average).This signifies that 69.8% of data can take part in the regression analysis, as they represent how distinct the data are inside the data set.The variation in the df 1 and df 2 lies in the range of 0.99 and 1.
This section of the data analysis part represents that all the clusters of data are arranged in proper order as desired by the time series algorithm.All the statistical analysis as shown in Tables 1 and 2 has been utilized by the benchmarking model for further analysis in the next section.

Benchmarking Model
4.1.ARIMA.The autoregressive integrated moving average time series model is an amalgamation of differenced autoregressive and moving average models, i.e., a combination of past output and some random noise.The AR in ARIMA represents the autoregression aspect of time series analysis on its data, MA signifies the moving average of errors, and I stands for integration, indicating that the data has been differenced with its previous values.Therefore, modifying eq. ( 15) in terms of ARIMA, Equation ( 18) represents the ARIMA model reactive power forecasting.Here, n represents the autoregressive, and m represents the moving average error.T St t−1 represents the random noise.Now, based on eq. ( 18), two ADF models have been modeled with two different P values.The critical parameters for the order of (1, 1, 2) have been evaluated.Table 3 shows the ADF statistics for different P values.
Table 4 shows the ARIMA model error analysis for the order (1, 1, 2).Here, in Table 4, the minimum standard deviation has been observed for Ar.L1.D value of 0.213 and that of -0.455 for 0.975 accuracy level.Similarly, Table 5 represents the ARIMA modulus status for the order (1, 1, 2).Tables 6 and 7 represent the ARIMA model error analysis for the order (2, 1, 2) and modulus status.As shown in Table 6, the minimum standard error of 0.116 was recorded for Ar.L2.D value.Similarly, the corresponding probability value is 0.772, and that of the upper boundary is 0.262.The corresponding moving average value is 0.307 (std.error).This signifies that the system becomes valid under the upper limit of -0.211 to 0.262.
Figure 3 shows the autocorrelation analysis with 1storder difference and 2nd-order difference of the original reactive power demand series.The higher-order difference analysis has been conducted to find any possible good solution.It is observed that the autocorrelation error for the second difference is very small as compared to the firstorder difference.Again, for 91 number of observation, the AIC is 1306.99 and BIC is of 1322.058.Similarly, the HQIC becomes 1313.07.
Figure 4 represents the forecasted reactive power as a function of real power demand for 20 days.Here, the analysis has been carried out for 95% of the confidence interval.Similarly, Figure 5 shows the rolling mean and standard deviation analysis for the original forecasted data.The rolling standard deviation shows the deviation from the average of the reactive curve.Here, it can be found that the maximum deviation is 0 07 × 10 4 .

SARIMA.
Seasonal ARIMA is one of the most widely used forecasted models.The direct modeling of the SARIMA model can be modeled by using SARIMA p, d, q × P, D, Q .Here, p, d, q represents the nonseasonal component, and P, D, Q represents the seasonal parameter.
Figure 6 represents the autocorrelation function plot for the SARIMA model.As observed, the initial spike = 1 and seasonal spike = 10.It clearly depicts that autoregression is having order 1 and seasonal AR is of order 1.
Similarly, Figure 7 represents the partial autocorrelation for SARIMA.As observed from the figure, the initial lag spike is at 1, and that of the seasonal spike lies at 10.It means that it has a moving average of 1, and that of the seasonal dependent average order is also 1.
Based on the order and class as obtained from Figure 7, Table 8 represents the SARIMA fit model with respect to reactive power time series data.
The autocorrelation has a direct impact on the partial correlation with an order of 50.37% as reported by Ljung-Box.Similarly, a probability of 87% dependency has been observed from the result analysis.The heteroskedasticity factor has been found to be 0.58, which determines the monotonicity involved in the prediction process.
The standard residual plot along with the histogram is plotted in Figure 8.As observed here, the mean residual plot is zero and all the residuals are uncorrelated.This states that the model is well fitted with the data and p, q, d values.

Extreme Gradient Boosting (XGBoost). The XGBoost is the application-oriented model of random forest (RF).
XGBoost adjusts the weight of weak learners, thereby improving the forecasting rate of each tree.XGBoost improves the generation of complex data sets by avoiding 5 International Journal of Energy Research In eq. ( 19), i represents the iteration level, and t represents the time series data of each level.ŷt represents state of data, and that of ŷ x i−1 represents the previous state of data.Equation ( 19) offers a wide range of hyperparameter selection, which enables the operator to optimize the model performance.
XGBoost parameter analysis for regression study has been presented in Table 9. Figure 9(a) represents the predicted vs. actual graph of reactive power demand with a learning rate (L-Rate) of 0.032.As observed, the sample has an associate rule of 0.11.The associate rule here represents the probability of data points inside the data set or how closely they are related to the data of different data sets.The result shows that (Figures 9(b) and 9(c)) the maximum probability of 0.14 can be achieved with the data set of L-Rate and that of 0.34 for tree analysis.Figures 9(d)-9(f) represent feature importance vs. feature index for three different learning rates.As observed in Figure 9(d), the feature importance is positive for sample-3 only.This means that the accuracy of the model in determining the feature prediction is 0.02 which is very low.In Figure 9(e), the feature importance becomes positive for two samples with the highest value of 0.2 and with a learning rate of 0.067.Therefore, it can be assumed that there exists a scope for further improvement.In the 3rd trial in Figure 9(f), all 3 samples have shown positive feature importance with a maximum accuracy level of 0.43.International Journal of Energy Research

Result Analysis
The experimental setup for the proposed work has been started with data collection for reactive power from the laboratory-operated critical load.Three months of continuous data have been collected, i.e., May 2022 to July 2022, through the phasor measurement unit (PMU) from various sensitive points.The data has been tested for redundancy check using Python programming, and wherever the data was missing is filled with mean data from that series.
In order to create a proper mathematical optimization model for its validation with respect to the problem-formulated modeling, a curve fitting analysis has been carried out over the data set.Four different methods of curve fitting have been applied to the data set such as sine order-8, interpolant, polynomial, and smoothening order-4.The detailed statistical analysis of the method is shown in Table 10.In order to optimize the local sample on the smoothing spine, curve global flower pollination-based optimization has been applied in accordance with the optimization equation as presented under eq.( 27) and eq.( 28).Algorithm 1, presents the pseudocode used in the pollination algorithm for the global optimization of the curve, here referred to as the reactive power fitted curve.
Figures 11 and 12 represent the cost function optimization for two different cost functions of c1 = 0 17 and c2 = 0 21.It is observed that the error in tracking the trajectory path is limited to 0.026 for c1 and 0.022 for c2.At about 10.00 sample time (Figure 12), the actual direction of optimization is the same as that of trajectory.Based on this, Tables 11 and 12 represent a detailed comparative analysis of different optimization algorithms such as PSO, GA, and FPA.Here, FPA shows better performance for 750 samples of observation with min.deviation of 0.47.11 International Journal of Energy Research Figure 13(a) represents the fitness function analysis for FPA with an initial pollen level of 2.33, enabling reactive power optimization.It determines the closeness of the optimized solution to the set point.Here, 5-fold cross-validation has been applied to find the mean fitness value at 4603.58. Figure 13(b) shows the current best individual.Here, a sample pollination size of 10% has been used for the evaluation of the distance between two conjugate pollens.In order to find the best local optimized solution, the average distance between two pollen has been maintained at 0 03 × 10 −3 as shown in Figure 13(c).Fitness scaling for each individual is presented in Figure 13(d).Here, expectation vs. the raw score is presented.As noticed for a score level of 4603, a maximum expectation of 4 has been found.Similarly, each individual's fitness is presented in Figure 13(e), where a fitness value of 4000 is found for every individual.The selection function for each individual is presented in Figure 13(f).Here, individuals 6 and 9 have been taken into consideration for further analysis because these individuals have the highest number of children.Similarly, Figure 14 represents the flower pollination optimization with an initial pollen level of 3.07.
In order to analyze the data using Holt-Winters' exponential smoothing, the following steps have been followed: (i) Prepare the data set to find about trend and seasonal patterns (ii) Linearity inside the elements of the data set is established (iii) Identification of seasonal multiplication or addition for prediction over the data set In this paper, the data set has been tested for both seasonal multiplication and addition models evaluating the effectiveness of the system.Table 13 shows the seasonal multiplication method under different weight conditions, and Table 14 shows the additive method under different weight conditions.Algo-rithm 2 shows the pseudocode for reactive power prediction using Holt-Winters' exponential model.
Figure 15 represents the reactive power flow/demand over 3 different months.Three different patches of reactive power demand have been evaluated.Figure 16 shows the distributed seasonal trend of the reactive power samples.The trend pattern of the data reveals that the prediction is a linear combination of exponential data and sine waveform.Here, a residual of 1 unit has been observed.The seasonal pattern has an accuracy level of 0.98 units, which is 7.09% efficient as compared to ARIMA and 11.38% efficient as compared to the SARIMA model.As desired, the additive and multiplicative trends of the proposed model merge at the end of 3 months of data.However, there exists an RMSE level of 3.28 between the two sets of predictions.
Figure 17 represents Holt-Winters' additive and multiplicative trend analyses for reactive power prediction.The prediction has been done for 5 min intervals of time for the next 5 days.The error between additive and multiplicative forecasting is merged together and found to be zero.This indicates that the prediction is accurate for all seasonal variations.
Table 15 represents the prediction for an 8-day evening peak using HW model.As observed during 8-9 pm, maximum evening peak has been observed of a maximum magnitude of 917.415MW on day 3, with an average demand of 843.39 MW on day 1.All the peak values were also validated against their actual values based on the history data set.Table 16 represents a comparative analysis of error matrices between the benchmarking model and the proposed model.The median analysis of different quantities such as nMBE, nMAE, nRMSE, and RMSE has been conducted.Normalized MBE, which evaluates the underestimation and overestimation of data inside a data set, is found to be negative for ARIMA, whereas the least positive value of 0.42 has been found for HW-GFPA under validation.Similarly, for the testing data set, it is 0.43.The normalized RMSE, which determines the variance present in between actual measured values and forecasted value, is 0.803 (least) for proposed HW-GFPA for validation and of 0.799 for testing.Require: Define objective function, f Q = q q 1 , q 2 , q 3 ⋯q n Ensure: the Initial optimal solution for the reactive power cost function, Initialize q n for DR tn = 1 IterationNo count do for =1: N do Plot initial distribution for Q n k , Q n+1 k ;where Q=reactive power Random choose p,d,q perform pollination eq. ( 27) find new optimized solution check the boundary eq.( 28) update pollen end for end for Algorithm 1: Reactive power local sample evaluation.International Journal of Energy Research

Conclusion
In this research paper, an attempt has been made to find the reactive power demand based on the usage of real power.Out of 1st-year load data, only three months of critical load reactive power has been taken into consideration from the lab experimental setup.The problem formulation, as discussed in Section 2, reveals that reactive power forecasting can be modeled based on the load demand.After detailed modeling, it is understood that the global flower pollination algorithm can be best suited for the point-by-point optimization of the system.Again, in this research paper, Holt-Winters' model has been used for forecasting reactive power.
In order to validate the robustness of the proposed model, ARIMA and SARIMA models have been taken into consideration as the benchmarking models.The GFPA has been validated against PSO and GA regarding its effectiveness in predicting the time series data for two different cost functions.Again, under the result section, a detailed sample forecasting for three different situations such as normal load, critical peak demand during the day time, and critical peak demand during the night time has been presented as forecasted reactive power for 8 days in the month of August 2022, which is also validated in terms of MAPE and average error of actual data during that time interval for the same month.
The result analysis shows that the proposed Holt-Winters' model enabled with GFPA can predict effectively reactive power based on the active power against ARIMA and SARIMA models.In this research article, unit price commitment based on reactive power has not been taken into consideration; however, future work can be made based on this research matter.

Figure 4 :Figure 5 :
Figure 4: Forecasted reactive power as a function of real power demand for 20 days.

Figure 9 :
Figure 9: Reactive power forecasting: (a) sample-1 with a learning rate of 0.032, (b) sample-2 with a learning rate of 0.067, (c) sample-3 with a learning rate of 0.070, (d) feature index for sample-1, (e) feature index for sample-2, and (f) feature index for sample-3.

Figure 11 :Figure 12 :
Figure 11: Performance of c1 against time in search space.

Figure 13 :Figure 14 :
Figure 13: Flower pollination optimization with initial pollen level 2.33: (a) fitness vs. pollen generation, (b) current best individual, (c) average distance between individual, (d) fitness scaling, (e) fitness of individual, and (f) selection function for children/pollen.

Figure 16 :Figure 17 :
Figure 16: Distributed seasonal trend of reactive power sample over three different months.

Table 1 :
Basic statistical analysis of PMU data.

Table 2 :
Null hypothesis analysis of cluster.

Table 3 :
ADF statistics analysis for two different P values.
overfitting of the curve against the data set.The nonlinearity relationship among the complex data set can be handled by providing additive training on each data point, where each previous state of iteration solution (n − 1) is considered as an actual forecast.This enables high-dimensional feature space analysis.Mathematically,

Table 8 :
SARIMA fit model with respect to reactive power time series data.

Table 10 :
Comparative analysis of different curve fitting techniques.

Table 11 :
Comparative analysis of different optimization algorithms for cost function: c1 = 0 17.

Table 12 :
Comparative analysis of different optimization algorithms for cost function: c2 = 0 21.

Table 13 :
Seasonal multiplication method under different weight conditions.

Table 14 :
Seasonal additive method under different weight conditions.Define map evaluation, f Month = m m 1 , m 2 , m 3 ⋯m 4

Table 15 :
Prediction using HW model for 8-day evening peak analysis.

Table 16 :
Comparative analysis of error metrics between the benchmarking model and proposed model.