Very Short-Term Load Forecasting Using Hybrid Algebraic Prediction and Support Vector Regression

This paper presents a model for very short-term load forecasting (VSTLF) based on algebraic prediction (AP) using a modified concept of the Hankel rank of a sequence. Moreover, AP is coupled with support vector regression (SVR) to accommodate weather forecast parameters for improved accuracy over a longer prediction horizon; thus, a hybrid model is also proposed. To increase system reliability during peak hours, this prediction model also aims to provide more accurate peak-loading conditions when considerable changes in temperature and humidity occur. The objective of the hybrid approach is to estimate an increase or decrease in the expected peak load demand by presenting the total MW per degree Celsius change (MW/°C) as a criterion for providing a warning signal to system operators, so that they can prepare the necessary storage facilities and sufficient reserve capacities if urgently needed by the system. The prediction model is applied to the actual 2014 load demand of mainland South Korea during the summer months of July to September to demonstrate its performance.


Introduction
Load forecasting plays an important role in the efficient operation and planning of a power system, helping to maintain system stability, security, reliability, and economy. Operational decisions in power systems, such as economic dispatch, unit commitment, maintenance scheduling, reducing spinning reserve, automatic generation control, and reliability analysis, depend on the future behavior of loads. Therefore, accurate load forecasting helps the electric utility make these operational decisions appropriately [1].
Substantial deviation from actual load demand may lead to the allocation of insufficient reserve capacity, resulting in a limited supply of electricity at the consumer end. This in turn reduces energy quality and system reliability and can lead to load shedding, such as the rolling blackout of September 15, 2011, in neighborhoods of Seoul, Busan, and other major cities [2]. On the other hand, an overestimation may cause unnecessary investments or establishments which run under capacity and therefore result in uneconomic operating conditions. One of the major objectives of this paper is to help prevent the former problem by providing a warning signal to the system operators if a large mismatch between the actual load and the forecasted day-ahead load demand is expected, giving them enough time to prepare the necessary storage facilities and sufficient reserves to address system security and reliability issues that might occur. This is the very purpose of very short-term load forecasting (ranging from minutes to hours): to support real-time control and security evaluation [3].
Thus, the application of the very short-term load prediction concept serves not only for advanced security evaluation of the system but also for assisting in real-time scheduling of large storage facilities. Figure 1 depicts an overview of the very short-term load prediction application for real-time control. Actual load demand and weather-related factors such as ambient temperature and humidity are important input parameters for producing a more accurate forecast for the next few hours. The updated load forecast can then be easily compared with the offline day-ahead load prediction to assist in real-time control and to update the scheduling of reserve and storage facilities as needed.
An exhaustive review of recent studies and previous forecasting methodologies pertaining to short-term load forecasting (STLF), that is, hours to weeks ahead prediction, can be found in [4]. The focus of this paper, however, is mainly on very short-term load forecasting (VSTLF), that is, a minutes to hours ahead horizon, with emphasis on peak-loading conditions. Different VSTLF methods reported in the literature include persistence, extrapolation, time series, fuzzy logic, Kalman filtering, neural networks, and support vector regression. Persistence forecasting is the simplest method; it assumes that the forecast data will be the same as the last measured values [5]. This is not practical for VSTLF because very short-term load series change in real time. Extrapolation predicts the load based on the past by using a least-square algorithm [6] or by using a curve fitting algorithm based on a shape similarity criterion [7]. Similar to the extrapolation method, the autoregression method uses a simple linear combination of the previous load series for prediction. Its coefficients were tuned online using the least mean square algorithm in [8]. The method was extended to autoregressive integrated moving average (ARIMA) for load forecasting, and parameters were updated via a recursive least-square algorithm with a forgetting factor in [9]. ARIMA was extended to seasonal autoregressive integrated moving average to capture the seasonal load feature in [10]. The Kalman filter was also applied to VSTLF, where loads were separated into deterministic and stochastic components, and both were predicted via Kalman filters in [11]. Fuzzy logic methods convert input data to fuzzy values which are then compared with patterns extracted from the training process. The most similar fuzzy value was chosen and then mapped to the prediction in [8].
Finally, support vector regression (SVR) was developed for VSTLF, where it was used with kernel functions to create complex nonlinear decision boundaries in [12]. This paper exploits a different application of SVR as a compensator for the main load prediction, resulting in a hybrid prediction model. The hybrid model utilizes the proposed algebraic prediction (AP) as the main forecasting algorithm for an iterative hour-ahead prediction. The main objective is to obtain the most accurate prediction of the peak-loading conditions for online application. The system then checks whether there is a significant difference between the previous day's weather data and the current weather forecast; if so, it calls upon the SVR to adjust the initial peak load output from the AP and compensate for the expected change in load behavior.

Methodology
This section breaks down the proposed hybrid solution method for the very short-term load forecasting model. Problem formulation and implementation for both solution methods are discussed in detail, starting with the AP, followed by the SVR and how weather-related parameters affect the prediction results.
Figure 2 illustrates a simple flowchart and overview of the hybrid solution method. First, an initial forecast is made using AP. Then the weather conditions of the current day and the previous day are compared to check for a significant change; if one is found, the SVR is implemented to update the initial prediction by adjusting the peak-loading conditions; if not, the initial prediction is deemed final.
... model of the process and then projecting this model into the future. This algebraic prediction technique is developed based on the rank of the Hankel matrix to identify the skeleton of the algebraic sequence. The Hankel matrix, named after Hermann Hankel, is widely used for system identification when a realization of an underlying state-space model is desired from a given sequence of output data. It has been the key tool for solving the state-space realization problem since 1965 as provided in [14]. The concept of the Hankel rank of a sequence is proposed in [15] for the identification of a numerical sequence. It is important to note, however, that the Hankel rank of a sequence is a concept independent of the state-space realization of the system. The Hankel rank just describes algebraic relationships between elements of the sequence without pretending to approximate the analytical model of an underlying dynamical system [13]. Also, these algebraic relationships are exact, as illustrated in the following Hankel matrix construction and formulation.
For a given number of observations with $2n$ elements as expressed in (1), one could build a model of the process to extrapolate the past behavior into the future.

$$x_0, x_1, \ldots, x_{2n-1} \tag{1}$$

In (1), $x_{2n-1}$ is the value of the observation at the present moment. Assuming that the sequence is an algebraic progression and its Hankel rank is equal to $n$, it is possible to determine the next element of the sequence, $x_{2n}$, in the form of a Hankel matrix having ascending skew-diagonals from left to right, whose determinant is set to zero as shown in (2).

$$\det \begin{bmatrix} x_0 & x_1 & \cdots & x_n \\ x_1 & x_2 & \cdots & x_{n+1} \\ \vdots & \vdots & \ddots & \vdots \\ x_n & x_{n+1} & \cdots & x_{2n} \end{bmatrix} = 0 \tag{2}$$

With $x_{2n}$ as the only unknown in the series, (2), that is, the determinant of the Hankel matrix equated to zero, can be broken down into the set of linear algebraic equations (3) and (4), which are solved for $x_{2n}$.
This is the general formulation for algebraic progressions such as arithmetic progression and geometric progression. Unfortunately, such a formulation does not apply to real-life time series because of the disturbance and noise present in actual systems.
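For an exact algebraic progression, the determinant condition in (2) can be checked numerically. The sketch below is an illustration only, not the implementation used in the paper: it assumes the $2n$ observations are supplied as a plain list, uses exact rational arithmetic, and exploits the fact that the determinant is linear in the unknown bottom-right entry $x_{2n}$. The function names `det` and `next_element` are hypothetical.

```python
from fractions import Fraction


def det(matrix):
    """Determinant by Laplace expansion along the first row (adequate for small n)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0
    for j, value in enumerate(matrix[0]):
        minor = [row[:j] + row[j + 1:] for row in matrix[1:]]
        total += (-1) ** j * value * det(minor)
    return total


def next_element(observations):
    """Solve det(H) = 0 for x_{2n}, given the 2n values x_0 .. x_{2n-1}."""
    if len(observations) == 0 or len(observations) % 2 != 0:
        raise ValueError("expected a non-empty sequence of 2n observations")
    n = len(observations) // 2
    # Append a placeholder 0 in the position of the unknown x_{2n}.
    x = [Fraction(v) for v in observations] + [Fraction(0)]
    # (n+1) x (n+1) Hankel matrix: entry (i, j) is x_{i+j}.
    hankel = [[x[i + j] for j in range(n + 1)] for i in range(n + 1)]
    leading = [row[:n] for row in hankel[:n]]
    d_leading = det(leading)
    if d_leading == 0:
        raise ValueError("leading n x n Hankel block is singular; x_{2n} is not determined")
    # det(H) = det(H with x_{2n} = 0) + x_{2n} * det(leading block), so set it to zero.
    return -det(hankel) / d_leading
```

For example, `next_element([1, 3, 5, 7])` continues the arithmetic progression and `next_element([2, 6])` continues the geometric progression with ratio 3. With noisy measured load data the determinant condition no longer holds exactly, which is the limitation noted above.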

Forecasting Strategy.
To create a more practical model of prediction using the concept of the Hankel rank, (3) is improved to identify a better representation of the skeleton algebraic sequence in order to predict a more accurate future value of a time series. Equation (3) is modified as shown in (5), with (4) still retained to solve for the value of $x_{2n}$. Notice that additional rows are augmented above the original matrices to allow for more information to increase the accuracy of the prediction. We can simplify (5) using the matrix representation in (6). It is critical that the number of measurements exceeds the number of system parameters. Thus, the measurement error can be filtered out in the estimation process and good quality estimates can be obtained. In the method of linear least error squares estimation, the objective is to minimize the sum of the squares of the errors or residuals. Using the customary process of getting the pseudoinverse, we get (7), where $[\mathbf{P}^{T}\mathbf{P}]^{-1}\mathbf{P}^{T}$ is called the left pseudo-inverse of $\mathbf{P}$ and $\hat{\mathbf{a}}$ is the optimal or best least-square estimate of $\mathbf{a}$.
The construction of the P matrix is very similar to building the Hankel matrix using the series of hourly data from the previous days. The most critical part of applying the proposed methodology, however, is knowing the optimal value of row-vectors, which proved to be the breakthrough for the implementation of the proposed prediction strategy.
The method of linear least error squares estimation finds the mean value of a set of measurements for the vector a. The mean value is generally accepted to be the best estimate when the set of measurements has a Gaussian error distribution [4]. It is also adversely affected by the presence of bad data, which is why it is necessary to use as much historical data as possible to establish the best estimate. Hence, it is also important to disregard irrelevant data during data preparation and preprocessing. The proposed method is tested using sets of ten to forty days of historical data to determine the adequate amount of data to be used.
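The overdetermined estimate can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the rows of P are taken to be sliding windows of length `n` over the history, the right-hand side (called `b` here, an assumed name) holds the value following each window, and `m` denotes the number of rows. The estimate is obtained from the normal equations, that is, the left pseudo-inverse applied to `b`.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("normal equations are singular or ill-conditioned")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, size):
            factor = M[r][col] / M[col][col]
            for c in range(col, size + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * size
    for i in range(size - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, size))
        x[i] = (M[i][size] - tail) / M[i][i]
    return x


def ap_least_squares_forecast(history, n):
    """One-step forecast from a least-squares fit of n coefficients."""
    x = [float(v) for v in history]
    m = len(x) - n  # number of rows (measurements)
    if m <= n:
        raise ValueError("need more rows than parameters (m > n)")
    P = [x[j:j + n] for j in range(m)]  # assumed layout: sliding windows
    b = x[n:]                           # value that follows each window
    # Normal equations: (P^T P) a = P^T b
    PtP = [[sum(P[r][i] * P[r][k] for r in range(m)) for k in range(n)]
           for i in range(n)]
    Ptb = [sum(P[r][i] * b[r] for r in range(m)) for i in range(n)]
    a_hat = solve_linear(PtP, Ptb)
    # Apply the fitted coefficients to the most recent window.
    return sum(a * v for a, v in zip(a_hat, x[-n:]))
```

For long windows on real load data, the matrix P^T P can be poorly conditioned, so a QR- or SVD-based least-squares solver is usually the numerically safer choice; the explicit normal equations are used here only to mirror the pseudo-inverse expression.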

Support Vector Regression. Support vector regression (SVR) is an extension of the support vector machine (SVM) algorithm for numeric prediction. SVM is a state-of-the-art machine learning algorithm that is applicable to classification tasks [16]. Using the training data, it finds the maximum margin hyperplane between two classes by applying an optimization method. The decision boundary is defined by a subset of the training data, called support vectors. By using a kernel function, nonlinear decision boundaries can be formed while keeping the computational complexity low.
SVR also produces a decision boundary that can be expressed in terms of a few support vectors and can be used with kernel functions to create complex nonlinear decision boundaries. Similar to linear regression (LR), SVR tries to find a function that best fits the training data. In contrast to LR, SVR defines a tube around the regression line using a user-specified parameter $\varepsilon$, within which errors are ignored, and it also tries to maximize the flatness of the line (in addition to minimizing the error) [17].
The SVR formulation for time series prediction is expressed as follows. Given training data $(x_i, y_i)$, the problem in (8) is solved, where $x_i$ is mapped to a higher dimensional space by $\phi$ and $\xi_i$ is the upper training error ($\xi_i^{*}$ is the lower) subject to the $\varepsilon$-insensitive tube $|y_i - (w^{T}\phi(x_i) + b)| \le \varepsilon$. The parameters which control the regression quality are the cost of error $C$, the width of the tube $\varepsilon$, and the mapping function $\phi$ [18].
The constraints of (8) imply that we would like to put most data $x_i$ in the tube $|y_i - (w^{T}\phi(x_i) + b)| \le \varepsilon$. If $x_i$ is not in the tube, there is an error $\xi_i$ or $\xi_i^{*}$ which we would like to minimize in the objective function. For traditional least-square regression, $\varepsilon$ is always zero and data are not mapped onto higher dimensional spaces. Hence, SVR is a more general and flexible treatment of regression problems [19].
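For reference, the standard $\varepsilon$-insensitive SVR primal problem, which is consistent with the quantities described above, can be written as below. The training-set size $l$, the weight vector $w$, and the offset $b$ follow common textbook notation and are assumptions here rather than symbols confirmed by this text.

```latex
\begin{aligned}
\min_{w,\,b,\,\xi,\,\xi^{*}} \quad
  & \tfrac{1}{2}\, w^{T} w + C \sum_{i=1}^{l} \left( \xi_i + \xi_i^{*} \right) \\
\text{subject to} \quad
  & y_i - \left( w^{T}\phi(x_i) + b \right) \le \varepsilon + \xi_i, \\
  & \left( w^{T}\phi(x_i) + b \right) - y_i \le \varepsilon + \xi_i^{*}, \\
  & \xi_i,\ \xi_i^{*} \ge 0, \qquad i = 1, \ldots, l.
\end{aligned}
```

Points inside the tube have $\xi_i = \xi_i^{*} = 0$ and contribute nothing to the objective, which is why only points on or outside the tube end up as support vectors.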
The main parameters in SVR are $\varepsilon$ and $C$. As mentioned above, $\varepsilon$ defines the error-insensitive tube around the regression function and thus controls how well the function fits the training data [20]. The parameter $C$ controls the tradeoff between training error and model complexity: a smaller $C$ increases the number of training errors, while a larger $C$ increases the penalty for training errors and results in behavior similar to that of a hard-margin SVM [21].
The principle of the SVR formulation is similar to that of SVM and, once trained, the SVR generates predictions from the resulting regression function. For the implementation of SVR in this paper, the input training data $x_i$ and $y_i$ represent the weather-related parameters and load deviation, respectively. Once the SVR has been trained on the prepared input data pairs, the trained SVR module can be used to approximate the value of $y_i$ given an input $x_i$.
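The role of the tube width and the kernel form of a trained predictor can be illustrated with a small sketch. It assumes the usual kernel expansion of a trained SVR, f(x) = sum_i beta_i K(x_i, x) + b, with an RBF kernel; the names `rbf_kernel`, `svr_predict`, `eps_insensitive_loss`, and the kernel width `gamma` are illustrative assumptions and do not come from the paper.

```python
import math


def rbf_kernel(u, v, gamma):
    """Radial basis function kernel exp(-gamma * ||u - v||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(u, v)))


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Kernel expansion f(x) = sum_i beta_i * K(x_i, x) + bias.

    dual_coefs holds beta_i = alpha_i - alpha_i^*, which a trained model
    would supply; here they are simply passed in.
    """
    return sum(beta * rbf_kernel(sv, x, gamma)
               for sv, beta in zip(support_vectors, dual_coefs)) + bias


def eps_insensitive_loss(y_true, y_pred, eps):
    """Errors inside the tube of half-width eps cost nothing."""
    return max(0.0, abs(y_true - y_pred) - eps)
```

A prediction that misses the target by less than `eps` gives zero loss, while larger misses are penalized linearly and weighted by the cost parameter in the training objective.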

Effect of Weather Variables. The lack of weather input into time series models usually limits their forecasting ability. The effect of weather variables such as temperature and humidity has been examined in [22-25]. A minimum distance measurement is used in [24] to identify the appropriate historical patterns of load and temperature readings to estimate network weights in a neural network approach. This is to overcome the problems of drastic changes in weather patterns. In a similar way, historical patterns of load, temperature, and humidity are identified as a basis for the training of the SVR module. It is observed that when there is a significant change between the previous day's and the present day's weather conditions, the prediction error of AP also increases.

Implementation and Testing. Algebraic prediction (AP) using the concept of the Hankel matrix has been implemented using the historical data of mainland South Korea for the year 2014. To show AP's performance in comparison with other forecasting models, AR, ARMA, and SVR methods are also implemented. AP's performance is also tested with an increasing amount of historical input data as well as with varying the value of row-vectors of the modified Hankel matrix, which is surprisingly found to be critical in finding the most accurate prediction.
Finally, an iterative one-step forward prediction is implemented to assess the performance of each forecasting model during the days where significant changes in weather conditions are found.

Training the SVR Module. The flowchart in Figure 3 maps the process of building the SVR module in relation to the earlier illustration in Figure 2. The process of building the trained SVR module starts with preparing a fair amount of historical data to build a more reliable compensating module. Once the trained SVR module is built, the preliminary peak load output of the algebraic prediction method can be easily adjusted during days of significant change in weather conditions.
Step-by-step procedures are outlined as follows, starting from setting up the SVR module.
Step 1. Gather the hourly historical data of load, temperature, and humidity of the past 2-3 years.
Step 2. Do an AP simulation on the summer (or winter) months on the gathered historical data.
Step 3. Based on the simulation results, identify the days where the discrepancy between actual and predicted peak load is significant.
Step 4. Verify the collected data by keeping only the days with a correspondingly significant deviation of weather parameters (i.e., the changes in maximum temperature, average temperature, and average humidity, $\Delta T_{\mathrm{MAX}}$, $\Delta T_{\mathrm{AVE}}$, and $\Delta H_{\mathrm{AVE}}$) between the current day and the previous day.
Step 5. Train the SVR on the filtered data with $\Delta T_{\mathrm{MAX}}$, $\Delta T_{\mathrm{AVE}}$, and $\Delta H_{\mathrm{AVE}}$ as the input training set and the normalized value of peak load discrepancies as the output training set. Identify the appropriate kernel function as well as the optimal parameter settings for $C$ and $\varepsilon$.
Step 6. Save the trained SVR module.
Consequently, the following procedure is done for realtime operation once the trained SVR module is prepared for peak load compensation.
Step 1. Make a 14-hour-ahead prediction using AP to determine the peak-loading conditions.
Step 2. Determine whether the most recent weather forecast deviates significantly from the previous day's weather conditions.
Step 3. If significant variations are found, proceed to Step 4. If none, use the initial AP results as the final peak load forecast.
Step 4. Predict the corresponding increase or decrease in the peak load using the trained SVR module with the current $\Delta T_{\mathrm{MAX}}$, $\Delta T_{\mathrm{AVE}}$, and $\Delta H_{\mathrm{AVE}}$ as input parameters.
Step 5. Update the AP results using the peak load compensation formula, applied to the set of predicted loads for the coming hours. In that formula, the starting value equals the initial prediction for hour 1, the peak load is taken from the AP results, the peak load adjustment comes from the trained SVR module, and the result is the updated forecast at each hour based on SVR.
Step 6. Save the new set of hourly prediction with updated peak-loading conditions.
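The real-time steps above can be sketched as a small decision routine. The compensation formula itself is not available in this text, so the rescaling used below is only one plausible choice consistent with the quantities listed in Step 5: it keeps the hour-1 value fixed and stretches the profile so that the peak moves by the SVR adjustment. The significance `threshold`, the callable `svr_adjustment` standing in for the trained SVR module, and all function names are hypothetical.

```python
def compensate_peak(ap_profile, delta_svr):
    """Rescale an hourly AP profile so its peak shifts by delta_svr.

    Assumed scaling (not the paper's formula): hour 1 stays fixed and
    every other hour is stretched in proportion to its rise above hour 1.
    """
    base = ap_profile[0]
    peak = max(ap_profile)
    if peak == base:
        # No rise above hour 1, so this sketch leaves the profile unchanged.
        return list(ap_profile)
    scale = (peak + delta_svr - base) / (peak - base)
    return [base + (y - base) * scale for y in ap_profile]


def final_forecast(ap_profile, weather_delta, threshold, svr_adjustment):
    """Apply SVR compensation only when the weather change is significant.

    weather_delta: tuple (dT_max, dT_avg, dH_avg) versus the previous day.
    threshold: hypothetical significance limit applied to each component.
    svr_adjustment: callable returning the peak load change in MW.
    """
    if all(abs(v) < threshold for v in weather_delta):
        return list(ap_profile)  # Step 3: keep the initial AP result
    return compensate_peak(ap_profile, svr_adjustment(weather_delta))
```

With an AP profile of `[100, 120, 150, 130]` and an adjustment of `+10`, this scaling yields a profile whose peak is 160 while the first hour stays at 100.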
For the actual implementation, the only data available are the 2014 hourly historical demand data of mainland South Korea. The trained SVR module for the summer season is therefore based on only 3 months of summer data, that is, July, August, and September. The best SVR training settings were found to be a radial basis function (RBF) kernel with parameter values of $\varepsilon = 0.001$ and $C = 10$.
For the input and output training sets used to build the SVR, the input is a vector with three components, $\Delta T_{\mathrm{MAX}}$, $\Delta T_{\mathrm{AVE}}$, and $\Delta H_{\mathrm{AVE}}$, while the output is a normalized value from the set of all peak load discrepancies. An input-output pair is considered viable when $\Delta T_{\mathrm{MAX}}$ and $\Delta T_{\mathrm{AVE}}$ are both positive along with a positive output value, because an increase in temperature during the summer season also means an increase in peak load demand.
While it would be more reasonable to consider hourly temperature and humidity data for correcting the AP output, only the maximum and average values were used as input to the SVR compensating module to make the actual implementation much easier to conduct and set up.
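The viability rule for training pairs can be expressed as a simple filter. The sketch below is illustrative only: the dictionary keys (`dT_max`, `dT_avg`, `dH_avg`, `dPeak`) are hypothetical, and normalizing by the largest discrepancy is an assumption, since the text states only that the output is normalized.

```python
def is_viable(d_t_max, d_t_avg, peak_discrepancy):
    """A summer pair is viable when both temperature changes and the
    peak load discrepancy are positive."""
    return d_t_max > 0 and d_t_avg > 0 and peak_discrepancy > 0


def build_training_set(days):
    """Return (inputs, outputs) for SVR training from per-day records.

    days: list of dicts with hypothetical keys dT_max, dT_avg, dH_avg
    (weather changes versus the previous day) and dPeak (actual minus
    predicted peak load).
    """
    viable = [d for d in days
              if is_viable(d["dT_max"], d["dT_avg"], d["dPeak"])]
    if not viable:
        return [], []
    largest = max(d["dPeak"] for d in viable)
    inputs = [(d["dT_max"], d["dT_avg"], d["dH_avg"]) for d in viable]
    # Assumed normalization: divide by the largest discrepancy.
    outputs = [d["dPeak"] / largest for d in viable]
    return inputs, outputs
```

Days with a temperature drop or a negative discrepancy are simply dropped by this filter; a winter module would need the opposite sign convention.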

Simulation Results
The performance of AP is first tested for its sensitivity to an increasing amount of historical data and to varying the value of row-vectors of the input P matrix. Figure 4 shows a summary of the lowest MAPE for each number of days of historical data considering different values of row-vectors. Similarly, Figure 5 shows how the MAPE changes with respect to the value of row-vectors of the P matrix. When implementing the AP model, it is important to identify the optimal amount of historical data needed for a specific system. It is also found by trial that the value of row-vectors should be an integer greater than the period of the underlying sequence to get a decent prediction result (e.g., for an hourly interval, it should be at least 25 because there are 24 hours in a day; for 30-minute intervals, it should be at least 49 because there are 48 30-minute intervals in a day). Furthermore, using twice the period as this value (i.e., 48 for hourly prediction) is observed to yield a more accurate prediction according to the simulations conducted. These findings could be area specific, so it is important to check whether the same behavior is observed in other locations.
As seen in Table 1, the main advantage of using AP, besides lower MAPE and RMSD, is that it takes less than a second to yield an output, unlike the other methods. This is very significant because we want to use as much historical data as possible for better accuracy. Therefore, AP is more practical for use in real-time application.
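The two error measures used for comparison can be computed as in the following sketch, which assumes the usual definitions of mean absolute percentage error and root mean square deviation over paired actual and predicted hourly loads. The function names are illustrative.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs((a - p) / a)
                       for a, p in zip(actual, predicted)) / len(actual)


def rmsd(actual, predicted):
    """Root mean square deviation, in the units of the load (e.g., MW)."""
    return math.sqrt(sum((a - p) ** 2
                         for a, p in zip(actual, predicted)) / len(actual))
```

MAPE is scale free and therefore convenient for comparing models across load levels, while RMSD stays in MW and weights large misses more heavily.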
To better establish the effectiveness of AP, Figure 6 shows an implementation of one-hour-ahead load prediction for the final week of August. To further demonstrate its performance, an iterative one-step prediction is implemented for the same week, completing a 14-hour-ahead prediction (i.e., from 8 am to 9 pm) to predict the daily peak-loading conditions, as shown in Figure 7.
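The iterative one-step scheme can be sketched generically: each predicted hour is appended to the history and used as input for the next step. The function below is an illustration under that assumption; `one_step` stands for any one-hour-ahead predictor (AP in the paper) and is a hypothetical callable that takes the history list and returns the next value.

```python
def iterate_forecast(history, one_step, horizon=14):
    """Build a multi-hour forecast by feeding predictions back as inputs."""
    extended = list(history)  # copy so the caller's data is not modified
    forecasts = []
    for _ in range(horizon):
        next_value = one_step(extended)
        forecasts.append(next_value)
        extended.append(next_value)
    return forecasts
```

The predicted daily peak is then `max(iterate_forecast(...))`. Because every step reuses earlier predictions, errors can accumulate over the horizon, which is one reason a separate compensation stage for the peak is useful.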
The 14-hour-ahead prediction is performed for the whole month of August and it is found that 60% of the time, the peak load prediction is accurate enough.However, for the remaining 40%, a compensation module is needed to consider weather-related parameters such as forecasted temperature and humidity to obtain a more accurate peak-loading condition.Thus, an SVR compensating module is proposed on the application of 14-hour-ahead prediction to cover for the change in weather conditions.
From the summer months of July, August, and September of 2014, seventeen (17) viable pairs for training of the SVR module were identified (Table 2). One way to check for a good training set (given that the optimal SVR parameter settings are already known) is to feed each member of the set back as an input to the trained SVR module. All of these are accomplished by trial and error. Table 3 shows the test output obtained when each training input is fed to the trained module, compared with the corresponding actual output. From Table 3, it can be easily deduced that the sample error during the building of the SVR compensation module is zero, meaning that optimal SVR parameter settings help decrease the MAPE of the training set. Looking at the column "MW/°C," the sensitivity of loading conditions per degree change in temperature is seen to vary from 400 MW to 1.5 GW of load demand.

Mathematical Problems in Engineering
Using the peak load change based on the output of the SVR compensation module, the AP output can be scaled down or up depending on the sign of the SVR output. Figure 8 shows the AP-SVR prediction results for August 4, 12, 14, and 27 in comparison with the other forecasting models.

Conclusion
A model for very short-term load forecasting applying a hybrid approach is presented. Algebraic prediction (AP) based on the concept of Hankel rank and linear least error square estimation is coupled with support vector regression (SVR). The SVR is used to factor in weather-related parameters and improve the accuracy of the algebraic prediction output. It is shown that, from a set of viable pairs of historical data, an SVR compensating module can be built once the optimal SVR settings for the training set are known. Due to the limited amount of historical data, the trained SVR module could not be tested for a different year. Furthermore, several test cases should still be carried out to expand the number of viable pairs and increase the reliability of the SVR module. Nonetheless, coupling SVR with AP's output as compensation for changes in temperature and humidity is shown to be feasible, with promising results given an adequate amount of historical data.

Figure 1: Overview of very short-term load prediction for online application.

Figure 3: Setting up trained SVR module for peak load compensation.

Figure 8: Comparison output prediction using AP, AR, ARMA, SVR, and hybrid AP-SVR models during days with significant weather changes.

Table 1: MAPE and RMSD comparison of different forecasting models using 20-day input during the month of August 2014.

Table 1 shows the performance of AP in comparison with other forecasting models, that is, AR, ARMA, and SVR models. An initial test is conducted to compare the performance of the prediction methods. The table shows a summary of mean absolute percentage error (MAPE), root mean square deviation (RMSD), and computation time, that is, the time it takes to finish a set of hourly load predictions.

Table 2: Training set from the months July, August, and September 2014.

Table 3: Percentage error when each input from the training set is fed to the trained SVR module.