A Simple Hybrid Model for Short-Term Load Forecasting

The paper proposes a simple hybrid model to forecast the electrical load data based on the wavelet transform technique and double exponential smoothing. The historical noisy load series data is decomposed into deterministic and fluctuation components using suitable wavelet coefficient thresholds and wavelet reconstruction method. The variation characteristics of the resulting series are analyzed to arrive at reasonable thresholds that yield good denoising results. The constitutive series are then forecasted using appropriate exponential adaptive smoothing models. A case study performed on California energy market data demonstrates that the proposed method can offer high forecasting precision for very short-term forecasts, considering a time horizon of two weeks.


Introduction
Short-term load forecasting (STLF) is an integral part of power system operations, as it is essential for ensuring the supply of electrical energy at minimum expense. This type of forecasting is used to predict load demands from an hour to a week ahead so that the day-to-day operations of a power system can be planned efficiently to minimize operational costs [1]. The ability to forecast the load accurately even a few hours ahead is beneficial from different points of view, ranging from technical to commercial [2]. This importance has led to the development of several mathematical models and techniques. These techniques can be broadly classified as (i) classical time series models and (ii) machine intelligence models. The classical time series models have been criticized by researchers for their weak nonlinear fitting capability [3]. Moreover, they require huge amounts of data to arrive at optimal models and perform poorly for data with inherent special events [4][5][6]. The artificial neural network (ANN) is one of the most popular machine intelligence models [7] and has been used for short-term load forecasting with encouraging results [8][9][10]. An excellent review of neural networks for short-term load forecasting has been presented in [11]. However, no single model has performed well in short-term load forecasting [12]. This has led to the development of hybrid models that try to extract the best features of different models and integrate them to achieve good forecasting results [13][14][15]. Hybrid models include a time series and ANN-based model as in [16], a combination of ANN and fuzzy expert systems as in [17], and the integration of a wavelet-based approach with ANN as in [18]. It is to be noted that the aim of any forecasting model for STLF is to identify the different characteristics of load over different time horizons and incorporate them into the model [19]. It is here that wavelet multiresolution analysis is found to be useful. Wavelet transforms are found to capture all the information in a time series and associate it with specific time horizons and locations in time, and thus help to unfold the inner load characteristics that are useful for more precise forecasting [20,21].

This paper describes a hybrid model based on the discrete wavelet transform for short-term load forecasting. The idea is to combine the wavelet transform technique with the exponential smoothing method, which is intuitively appealing, easy to update, and has minimal computer storage requirements [22]. Noisy characteristics in the power load data affect the forecasting precision, so the data is first preprocessed using the discrete wavelet transform according to the methodology outlined in Section 3. The proposed model is verified by forecasting the electrical load of the California energy market. The hybrid model is found to provide good forecasting precision for four-hour-ahead forecasts.

Theoretical Background
In modeling time series data, noise and nonstationarity are two problems that a forecasting methodology should take care of. The presence of noisy characteristics in the data prevents the full capture of the dependency between the past and the future behavior of the time series. The nonstationarity of the data implies that the time series switches its dynamics among different regions [23]. Thus, in the time series data, one observes that the intrinsic property of the data (the deterministic feature) is superimposed by rapid variations on much shorter time scales. Therefore, it seems appropriate to separate the fluctuating component from the deterministic component, and wavelet decomposition is an excellent technique that can separate the two [24,25].

Discrete Wavelet Transform
A dyadic discretized wavelet is given by

$$\psi_{m,n}(t) = 2^{-m/2}\,\psi\left(2^{-m}t - n\right), \tag{1}$$

where the control parameters $m, n \in \mathbb{Z}$. The discrete wavelet transform (DWT) of a signal $x(t)$ can be written as

$$T_{m,n} = \int_{-\infty}^{\infty} x(t)\,\psi_{m,n}(t)\,dt, \tag{2}$$

where $T_{m,n}$ is known as the wavelet (or detail) coefficient at scale and location indices $m, n$. The dyadic grid wavelet leads to the construction of an orthonormal wavelet basis. Orthonormal dyadic wavelets are associated with scaling functions and their dilation equations. The scaling function is associated with the smoothing of the signal and is given by

$$\phi_{m,n}(t) = 2^{-m/2}\,\phi\left(2^{-m}t - n\right). \tag{3}$$

The scaling function can be convolved with the signal to produce approximation coefficients as

$$S_{m,n} = \int_{-\infty}^{\infty} x(t)\,\phi_{m,n}(t)\,dt. \tag{4}$$

The multiresolution decomposition theorem [26] gives a method of generating approximation coefficients and detail wavelet coefficients at different scales. The approximation and wavelet coefficients at scale index $m+1$ can be generated using the coefficients at the previous scale as

$$S_{m+1,n} = \frac{1}{\sqrt{2}}\sum_{k} c_k\,S_{m,2n+k}, \tag{5}$$

$$T_{m+1,n} = \frac{1}{\sqrt{2}}\sum_{k} b_k\,S_{m,2n+k}, \tag{6}$$

where $c_k$ and $b_k$ are the scaling and wavelet filter coefficients, respectively. In general, the discrete input signal is taken to be the signal approximation coefficients at scale index $m = 0$. For a signal of length $N = 2^M$ the range of scales that can be investigated is $0 < m < M$, and the input signal can be expressed as

$$x_0(t) = x_M(t) + \sum_{m=1}^{M} d_m(t), \tag{7}$$

where the mean signal approximation at scale $M$ is

$$x_M(t) = S_{M,n}\,\phi_{M,n}(t), \tag{8}$$

and the detail signal approximation corresponding to scale index $m$ is given by

$$d_m(t) = \sum_{n=0}^{2^{M-m}-1} T_{m,n}\,\psi_{m,n}(t); \tag{9}$$

that is, adding the approximation of the signal at scale index $M$ to the sum of all detail signal components across scales $0 < m \le M$ gives the approximation of the original signal at scale index 0.
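The recursive generation of approximation and detail coefficients described above can be illustrated with a minimal Python sketch for the Haar case, where each coarser coefficient is built from a pair of neighbouring finer-scale approximation coefficients. The function names `haar_decompose` and `haar_reconstruct` and the sample values are illustrative assumptions, not part of the original work.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_decompose(signal, levels):
    """Return (approximation at the coarsest scale, [details at scales 1..levels])."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = [float(v) for v in signal]  # scale-0 approximation is the signal itself
    details = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Haar filters: sum of each pair gives the approximation, difference the detail.
        details.append([(a - b) / SQRT2 for a, b in pairs])
        approx = [(a + b) / SQRT2 for a, b in pairs]
    return approx, details


def haar_reconstruct(approx, details):
    """Invert haar_decompose, going from the coarsest scale back to scale 0."""
    current = list(approx)
    for detail in reversed(details):
        rebuilt = []
        for s, t in zip(current, detail):
            rebuilt.extend([(s + t) / SQRT2, (s - t) / SQRT2])
        current = rebuilt
    return current


# Hypothetical 8-sample series, decomposed down to a single approximation coefficient.
load = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_decompose(load, levels=3)
restored = haar_reconstruct(approx, details)
```

Reconstructing from the unmodified coefficients returns the input series, which mirrors the statement that the coarse approximation plus all detail components gives back the signal at scale index 0.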

Wavelet Denoising.
It is generally assumed that the noisy part of a signal shows random characteristics and mainly reflects the inherent uncertainties in the data, while the original signal is composed of deterministic components and mainly reflects the deterministic characteristics of the data studied. When a discrete wavelet transform (DWT) is applied to time series data, small wavelet coefficients are presumed to be dominated by noise and to carry little useful information, whereas the useful information of the original series is concentrated in a limited number of wavelet coefficients [27]. Wavelet shrinkage denoising is not to be confused with smoothing. Smoothing removes high frequencies and retains low ones, whereas denoising attempts to remove whatever noise is present regardless of the frequency content of the signal [28]. Wavelet shrinkage denoising involves shrinking in the wavelet transform domain and consists of three essential steps: a linear forward wavelet transform, a nonlinear shrinkage denoising, and a linear inverse wavelet transform. If the observed data $X(t)$ is assumed to be of the form

$$X(t) = f(t) + \varepsilon(t), \tag{10}$$

where $f(t)$ is the true signal and $\varepsilon(t)$ is noise, then the denoising procedure applies these three steps to $X(t)$, with the shrinkage step defined as follows.
Given a threshold $\lambda$ for data $U$, the rule

$$\eta_\lambda(U) = \operatorname{sgn}(U)\,\max\left(|U| - \lambda,\ 0\right) \tag{11}$$

defines nonlinear soft thresholding. Setting the threshold $\lambda$ is an essential part of denoising. Different thresholding techniques have been provided in [29,30]. The present work uses a hybrid method called heursure [31], which determines at each multiresolution level a threshold that is either the universal threshold or the one given by Stein's unbiased risk estimator (SURE).
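As a rough illustration of the shrinkage step, the sketch below applies soft thresholding to a list of detail coefficients. The heursure rule itself is not reproduced here; as a stand-in, the helper `universal_threshold` computes the standard universal threshold with a median-based noise estimate, which is only one of the two candidates that heursure chooses between. Both function names and the sample values are illustrative assumptions.

```python
import math
import statistics


def soft_threshold(coeffs, lam):
    """Shrink each coefficient toward zero by lam; values within [-lam, lam] become 0."""
    return [math.copysign(max(abs(c) - lam, 0.0), c) for c in coeffs]


def universal_threshold(finest_detail, n):
    """Universal threshold sigma * sqrt(2 ln n) for a series of length n.

    Assumption: sigma is estimated from the finest-scale detail coefficients
    as median(|d|) / 0.6745. This is not the heursure rule used in the paper.
    """
    sigma = statistics.median(abs(c) for c in finest_detail) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(n))


# Hypothetical detail coefficients: small values are removed, large ones are shrunk.
shrunk = soft_threshold([-3.0, -0.5, 0.2, 2.0], lam=1.0)
```

Unlike hard thresholding, the surviving coefficients are also reduced in magnitude by the threshold, so the shrinkage function stays continuous.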

Exponential Smoothing.
Exponential smoothing is a pragmatic approach to forecasting wherein the prediction is constructed from an exponentially weighted average of past observations [32]. Its robustness and accuracy have led to its widespread use in a variety of applications. In the present work, the wavelet preprocessed data is used for prediction using double exponential smoothing. The specific formula for double exponential smoothing with additive trend and no seasonality [33,34] is

$$S_t = \alpha X_t + (1 - \alpha)\left(S_{t-1} + T_{t-1}\right), \tag{12}$$

$$T_t = \gamma\left(S_t - S_{t-1}\right) + (1 - \gamma)\,T_{t-1}, \tag{13}$$

$$\hat{X}_t(m) = S_t + m\,T_t. \tag{14}$$

Here $\alpha$ is the smoothing parameter for the level of the series, $\gamma$ is the smoothing parameter for the trend, $S_t$ is the smoothed level of the series computed after $X_t$ is observed, $T_t$ is the smoothed additive trend at the end of period $t$, $X_t$ is the observed value of the time series at period $t$, and $\hat{X}_t(m)$ is the forecast for $m$ periods ahead from origin $t$. The double exponential smoothing methodology is used in the present work to forecast the constitutive series obtained via Haar wavelet decomposition.
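The level and trend recursions of double exponential smoothing with additive trend can be sketched in a few lines of Python. The function name `double_exponential_smoothing` is an illustrative assumption; the initial level and trend follow the choice described in step (4) of the Methodology (first observation, and average slope over the series).

```python
def double_exponential_smoothing(series, alpha, gamma, horizon):
    """Return forecasts for 1..horizon periods ahead of the last observation."""
    n = len(series)
    if n < 2:
        raise ValueError("need at least two observations")
    level = series[0]                           # initial level: first observation
    trend = (series[-1] - series[0]) / (n - 1)  # initial trend: average slope
    for value in series[1:]:
        previous_level = level
        # Level update: weighted average of the new observation and the one-step forecast.
        level = alpha * value + (1 - alpha) * (previous_level + trend)
        # Trend update: weighted average of the latest level change and the old trend.
        trend = gamma * (level - previous_level) + (1 - gamma) * trend
    return [level + m * trend for m in range(1, horizon + 1)]


# Hypothetical series with a constant slope of 1 per period.
forecasts = double_exponential_smoothing([1.0, 2.0, 3.0, 4.0, 5.0], alpha=0.3, gamma=0.2, horizon=4)
```

For a perfectly linear series the recursion reproduces the line for any choice of the two smoothing parameters, which makes it a convenient sanity check.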

Methodology
The DWT provides a handy tool for decomposing the time series into deterministic and fluctuation components, and the wavelet denoising method is a means of achieving the decomposition. The wavelet threshold denoising method is influenced by several key issues such as the choice of wavelet, the choice of decomposition level, threshold estimation, and the thresholding rule [35]. In the present work, the Haar wavelet transform is used for the decomposition due to the following characteristics of the wavelet function.
(1) As real-world data is collected periodically, it is reasonable to treat it as a piecewise constant function [20], and the Haar wavelet transform is the most appropriate for such functions.
(2) The Haar wavelet transform is not affected by border distortions when one performs filtering/convolution on finite-length signals and thus eliminates the boundary effect that is a problem with other wavelet transforms.
With the Haar wavelet transform chosen, different resolution levels are tested, and it is observed that resolution level three is optimum for this case study. There are several threshold estimation methods that can be used to select a threshold, such as the universal threshold algorithm, the minimax algorithm, Stein's unbiased risk estimation (SURE) algorithm [29,30], and the heuristic SURE algorithm [31]. The quality of the denoising results is analyzed [36] to determine the optimum threshold. It is observed here that the heursure threshold selection criterion gives the best denoising results with multilevel thresholding. With the key issues of the wavelet denoising method fixed, the proposed methodology is given as a flowchart in Figure 1 and is outlined as follows.
(1) Decompose the available load series $X(t)$ via the Haar wavelet transform into one approximation series denoted as $a_3$ and three detail series denoted as $d_1$, $d_2$, $d_3$.
(2) The wavelet coefficients in the detail series are thresholded using the heursure threshold selection criterion to obtain the new detail series $\tilde{d}_1$, $\tilde{d}_2$, $\tilde{d}_3$.
(3) The approximation series $a_3$ and the detail series $\tilde{d}_1$, $\tilde{d}_2$, $\tilde{d}_3$ are used by the inverse Haar wavelet transform to construct the deterministic series ($X_d$). The fluctuation series ($X_f$) is obtained by $X_f = X - X_d$.
(4) The deterministic series $X_d = (x_1, x_2, \ldots, x_n)$ is now forecasted using the double exponential smoothing method through (12)-(14). The strategy is summarized in the following steps: (i) the initial values for $S_t$ and $T_t$ are chosen as $S_1 = x_1$; $T_1 = (x_n - x_1)/(n - 1)$; (ii) the parameter $\alpha$ is varied from 0 to 1 in steps of 0.05 and $\gamma$ is varied from 1 to 0 in decreasing steps of 0.05. For every possible combination of $\alpha$ and $\gamma$, the four-step-ahead forecast is computed. The optimum parameters are those that minimize the mean squared error of the four-step-ahead forecast. With these parameters the four-hour-ahead deterministic component ($\hat{X}_d$) is forecast.
(5) The mean-corrected fluctuation series (obtained by subtracting the sample mean) is modeled using the double exponential smoothing method in a procedure similar to the one outlined in step (4) to forecast the four-hour-ahead fluctuation component ($\hat{X}_f$) of the load series.
(6) The four-hour-ahead load forecast is now obtained as $\hat{X} = \hat{X}_d + \hat{X}_f$.

The one-day forecasts are carried out using an adaptive scheme. The 24-hour (one-day) forecast is obtained by considering a moving window of 336 hours (2 weeks) prior to the four hours whose load is to be forecast and shifting the window by four hours until the entire day is covered. Forecasts on three test days, one in each season, are considered. For a fair comparison, the days considered are January 15 (a Saturday and a holiday), April 11 (a Tuesday and a working day), and July 14 (a Friday, a day close to the weekend).
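The parameter search in step (4) and the recombination in step (6) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the four-step-ahead error is measured on the last four points of the window, which are held out from fitting, and the sample mean removed in step (5) is added back to the fluctuation forecast. The names `des_forecast`, `fit_des_grid`, and `forecast_load` are hypothetical.

```python
def des_forecast(series, alpha, gamma, horizon=4):
    """Double exponential smoothing with additive trend; returns `horizon` forecasts."""
    n = len(series)
    level = series[0]
    trend = (series[-1] - series[0]) / (n - 1)
    for value in series[1:]:
        previous = level
        level = alpha * value + (1 - alpha) * (previous + trend)
        trend = gamma * (level - previous) + (1 - gamma) * trend
    return [level + m * trend for m in range(1, horizon + 1)]


def fit_des_grid(series, horizon=4):
    """Pick (alpha, gamma) on a 0.05 grid by the smallest hold-out mean squared error."""
    # Assumption: the last `horizon` points of the window are held out for scoring.
    train, holdout = series[:-horizon], series[-horizon:]
    grid = [round(0.05 * i, 2) for i in range(21)]  # 0.00, 0.05, ..., 1.00
    best = None
    for alpha in grid:
        for gamma in reversed(grid):                # 1.00 down to 0.00
            predicted = des_forecast(train, alpha, gamma, horizon)
            mse = sum((a - p) ** 2 for a, p in zip(holdout, predicted)) / horizon
            if best is None or mse < best[0]:
                best = (mse, alpha, gamma)
    return best[1], best[2]


def forecast_load(deterministic, fluctuation, horizon=4):
    """Forecast both components separately and add them to get the load forecast."""
    mean_f = sum(fluctuation) / len(fluctuation)
    centered = [f - mean_f for f in fluctuation]    # mean-corrected fluctuation series
    alpha_d, gamma_d = fit_des_grid(deterministic, horizon)
    alpha_f, gamma_f = fit_des_grid(centered, horizon)
    det = des_forecast(deterministic, alpha_d, gamma_d, horizon)
    flu = des_forecast(centered, alpha_f, gamma_f, horizon)
    # Assumption: the removed sample mean is added back; the source does not state this.
    return [d + f + mean_f for d, f in zip(det, flu)]


# Hypothetical 336-hour window: a linear deterministic part and no fluctuation.
window_d = [100.0 + 0.5 * t for t in range(336)]
window_f = [0.0] * 336
next_four_hours = forecast_load(window_d, window_f)
```

A 24-hour forecast in the adaptive scheme would call `forecast_load` six times, each time on the 336-hour window that ends just before the four hours being forecast.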

Results
The forecast methodology outlined in the previous section is applied to the data considered. The forecast accuracy is examined using two evaluation metrics: the root mean square error (RMSE) and the mean absolute percentage error (MAPE). They are defined as follows:

$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{t=1}^{N}\left(X_t - \hat{X}_t\right)^2}, \tag{15}$$

$$\mathrm{MAPE} = \frac{100}{N}\sum_{t=1}^{N}\left|\frac{X_t - \hat{X}_t}{X_t}\right|, \tag{16}$$

where $X_t$ is the actual load, $\hat{X}_t$ is the forecast load, and $N$ is the number of forecast points.

The hourly load data for the year 2000 is shown in Figure 2. Observe the unstable mean and variance present in this series. This behavior makes forecasting a difficult task. The values of the RMSE and MAPE metrics, used to evaluate the accuracy of the proposed methodology in producing day-ahead forecasts for days in different time periods and different seasons, are presented in Table 1. The results are compared with the double exponential smoothing method (DEM). The parameters of the adaptively fitted DEM models used to forecast the 24-hour load data of January 15, April 11, and July 14 from the wavelet decomposed series of the load data are given in Table 2. The days considered include a holiday, a working day, and a day close to the weekend in different seasons. This has been done to show the effectiveness of the model in handling special events and days. The evaluation metrics of the four-hour-ahead forecasts of the proposed model on the different days (mentioned in the case study) for the three seasons are presented in Table 3.
The results presented confirm the assumption that wavelet shrinkage denoising produces constitutive (deterministic and fluctuating) series that can be predicted more accurately by the exponential smoothing methods.
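The two accuracy metrics used in this section can be computed as in the following minimal Python sketch. The function names and the sample values are illustrative assumptions and are not taken from the case study.

```python
import math


def rmse(actual, forecast):
    """Root mean square error between actual and forecast values."""
    n = len(actual)
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / n)


def mape(actual, forecast):
    """Mean absolute percentage error, in percent; actual values must be nonzero."""
    n = len(actual)
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / n


# Hypothetical loads: errors of 10 units on actual values of 100 and 200.
example_rmse = rmse([100.0, 200.0], [110.0, 190.0])
example_mape = mape([100.0, 200.0], [110.0, 190.0])
```

RMSE is expressed in the units of the load, while MAPE is scale free, which is why the two are usually reported together.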

Conclusion
This paper proposes a technique of applying the simple Haar wavelet transform in a hybrid model that utilizes the double exponential smoothing model for short-term load forecasting. The heursure wavelet thresholding method is used to separate the load series into deterministic and fluctuation (noisy) components. The problems associated with wavelet denoising, such as selection of an appropriate wavelet function, a suitable level of decomposition, and threshold selection, are handled to a certain extent by analyzing the load data. It is seen that the trend in both components can be taken care of effectively using double exponential smoothing models.
The results demonstrate that the proposed wavelet model is advantageous in reducing the complex structure of load series and thereby enhancing the accuracy of short-term forecasts.

Figure 1: Flow chart of the algorithm.
The original series, denoised series, and fluctuation series of two weeks in summer, spring, and winter are shown in Figures 3(a), 3(b), and 3(c), respectively.

Figure 2: Hourly load data of California energy market in the year 2000.

Table 1: Forecasting accuracy metrics for the 24-hour-ahead forecast of a typical day in each season.

Table 2: The parameters of the adaptively fitted double exponential smoothing models.