A Hybrid Short-Term Power Load Forecasting Model Based on the Singular Spectrum Analysis and Autoregressive Model

Short-term power load forecasting is one of the most important issues in the economic and reliable operation of electric power systems. Taking the randomness, tendency, and periodicity of short-term power load into account, a new method (the SSA-AR model) that combines univariate singular spectrum analysis with an autoregressive model is proposed. First, singular spectrum analysis (SSA) is employed to decompose and reconstruct the original power load series. Second, the autoregressive (AR) model is used to forecast based on the reconstructed power load series. The data employed are the hourly power load series of the Mid-Atlantic region in the PJM electricity market. The empirical results show that, compared with the single autoregressive (AR) model, the SSA-based linear recurrent formula method (SSA-LRF), and the backpropagation neural network (BPNN) model, the proposed SSA-AR method performs better in short-term power load forecasting.


Introduction
Short-term power load forecasting is one of the most important issues in the economic and reliable operation of power systems. Many operating decisions in an electric power system, such as unit commitment, dispatch scheduling of the generating capacity, reliability analysis, security assessment, and maintenance scheduling of the generators, are based on short-term power load forecasts.
In recent years, scholars at home and abroad have carried out many studies of short-term power load forecasting. Current methods can be divided into two categories: load-series-based forecasting methods and affecting-factors-based forecasting methods. Although the power load is random and uncertain, it also has an apparent tendency. The load-series-based methods therefore build on the internal structure of the short-term power load series; they include ARIMA, ARMAX [1,2], neural networks [3,4], the grey prediction model [5,6], wavelet analysis [7,8], and other forecasting methods. However, these methods have some shortcomings: the load-series-based forecasting method can only be used for data fitting and is not suitable for the treatment of regularity; the neural network method cannot express the relation between the input variables explicitly; the grey prediction model is intended for small samples; and the wavelet analysis forecasting method transforms the original sequence with orthogonal wavelets to obtain subsequences in different frequency bands and then predicts and reconstructs from these subsequences, which gives higher accuracy than using the original series directly.
The affecting-factors-based method starts from the factors that affect the power load and uses regression analysis to predict the load once the relation between the variables has been determined [9]. However, it is very difficult to identify all the factors accurately and comprehensively, since they depend on geographical factors, the economic situation, climate, and other conditions that often differ considerably from one period to another. It is therefore hard to find a single equation that suits all forecasting cases. Although the affecting-factors-based method is better than the load-series-based method in theory, it is much less practical to operate. Moreover, the affecting-factors-based method can be used for forecasting medium- and long-term power load. According to our knowledge and related work [10,11], the key to short-term power load forecasting is to grasp the primary components that reflect the variation tendency, and it is necessary to find a method that can depict the fluctuation characteristics of the power load series.
Singular spectrum analysis (SSA) is a typical time-series analysis method that has been used for industrial production forecasting [12], signal detection [13], electricity price forecasting [14], murmur detection from heart sounds [15], and so on. The aim of SSA is to decompose the original time series into the sum of a small number of independent and interpretable components, such as a slowly varying trend, oscillatory components, and structureless noise. Some of these components are then used for time series forecasting. At present, the SSA method is widely applied in many domains, such as geography and sociology [16,17].
In terms of its fluctuation characteristics, the power load shows randomness, trend, and periodicity, all of which can be extracted with the SSA method. That is to say, the stochastic noise components that degrade the forecasting accuracy can be eliminated. Afshar and Briceño have already applied this method to load forecasting with linear recurrent formulae (LRF) [18,19]. However, the relation between power loads at different times is not simply linear; it usually presents as a complicated nonlinear relationship. SSA-LRF is therefore not ideal in this respect. Meanwhile, using a time-series-based model to forecast power load can not only overcome the problem of invalid linear fitting, but also make up for the failure to handle the regularity of the time series. Therefore, in this paper, the SSA approach and the autoregressive (AR) model are combined to forecast the short-term power load, giving the SSA-AR forecasting model.
The rest of this paper is organized as follows. In Section 2, the SSA methodology and the AR model are described briefly, and the hybrid SSA-AR power load forecasting model is introduced. The empirical analysis is performed in Section 3, where the forecasting results of several different forecasting models are presented and discussed. Section 4 concludes this paper.

The Basic Principle of SSA-AR Model
A Brief Introduction to SSA Method.
The singular spectrum analysis (SSA) method consists of two phases, decomposition and reconstruction. In the decomposition phase, the original sequence is arranged into a time-delay matrix, which is then decomposed. In the reconstruction phase, the time series is reconstructed via grouping and diagonal averaging. The reconstructed series is then used to forecast new data points.
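In the first step (embedding), the series $(x_1, \ldots, x_N)$ is arranged into the $L \times K$ time-delay (trajectory) matrix $X$, where $L$ is the window length and $K = N - L + 1$. In the second step, the $L \times L$ matrix $XX^{T}$ is calculated and its eigentriples $(\lambda_i, U_i, V_i)$ are determined by singular value decomposition (SVD). Denote the eigenvalues of $XX^{T}$ by $\lambda_i$ ($i = 1, 2, \ldots, L$) in descending order and let $U_i$ and $V_i$ be the $i$th left and right singular vectors of $X$, respectively. Set $d = \operatorname{rank}(X)$. Then, the trajectory matrix $X$ can be rewritten as
$$X = \sum_{i=1}^{d} E_i, \qquad E_i = \sqrt{\lambda_i}\, U_i V_i^{T},$$
where $\sqrt{\lambda_i}$ is the $i$th singular value of $X$ and $E_i$ ($i = 1, \ldots, d$) are matrices of rank one.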

Reconstruction.
After decomposition, the time series is reconstructed via grouping and diagonal averaging. In the grouping step, the indices $i = 1, \ldots, d$ are grouped into $m$ disjoint subsets $I_1, \ldots, I_m$, which corresponds to splitting the elementary matrices $E_i$ ($i = 1, \ldots, d$) into $m$ groups. Each group contains a set of indices $I = \{i_1, \ldots, i_p\}$. Then, the resultant matrix $X_I$ is defined as
$$X_I = E_{i_1} + \cdots + E_{i_p},$$
so that the trajectory matrix $X$ is represented as a sum of $m$ resultant matrices:
$$X = X_{I_1} + \cdots + X_{I_m}.$$
Next, diagonal averaging transforms each matrix $X_{I_j}$ ($j = 1, \ldots, m$) into a time series.
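The decomposition and reconstruction steps described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' MATLAB implementation: the function name `ssa_reconstruct`, the single group formed by the leading components, and the synthetic test series are all hypothetical choices made for the example.

```python
import numpy as np

def ssa_reconstruct(series, window, n_components):
    """Reconstruct a series from its leading SSA components (illustrative sketch)."""
    x = np.asarray(series, dtype=float)
    n = x.size
    k = n - window + 1  # number of lagged vectors
    # Embedding: column j of the trajectory matrix is x[j : j + window].
    traj = np.column_stack([x[j:j + window] for j in range(k)])
    # SVD of the trajectory matrix; s holds the singular values, largest first.
    u, s, vt = np.linalg.svd(traj, full_matrices=False)
    # Grouping: sum the first n_components rank-one elementary matrices.
    r = n_components
    grouped = (u[:, :r] * s[:r]) @ vt[:r, :]
    # Diagonal averaging: entry (i, j) belongs to time index i + j.
    recon = np.zeros(n)
    counts = np.zeros(n)
    for i in range(window):
        recon[i:i + k] += grouped[i, :]
        counts[i:i + k] += 1
    return recon / counts, s

# Synthetic example (assumed data): trend plus a 24-step cycle plus noise.
t = np.arange(240)
clean = 10.0 + 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 24)
noisy = clean + np.random.default_rng(0).normal(0.0, 0.3, t.size)
smooth, sing_vals = ssa_reconstruct(noisy, window=48, n_components=4)
```

Keeping every component returns the input series unchanged, which is a convenient sanity check; keeping only the leading ones discards the components treated as noise.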

A Brief Introduction to AR Model.
The principle of the autoregressive (AR) model is to express the present value through the current disturbance and a finite number of past observations. Given a time series $x = (x_1, x_2, \ldots, x_N)$, the $p$-order autoregressive model is written as
$$x_t = \varphi_1 x_{t-1} + \varphi_2 x_{t-2} + \cdots + \varphi_p x_{t-p} + \varepsilon_t,$$
where $\varphi_i$ ($i = 1, 2, \ldots, p$) are the undetermined coefficients of the model (also called autoregressive coefficients) and $\varepsilon_t$ is a random disturbance with zero mean and nonzero variance.
If the lag operator is denoted by $B$, then $B x_t = x_{t-1}$ and $B^{p} x_t = x_{t-p}$. The $p$-order autoregressive model can then be presented as
$$\left(1 - \varphi_1 B - \varphi_2 B^{2} - \cdots - \varphi_p B^{p}\right) x_t = \varepsilon_t.$$
The AR($p$) process is covariance stationary only if the reciprocals of all roots of the lag operator polynomial are less than one in modulus, that is, they all fall within the unit circle.
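The AR model and its stationarity condition can be illustrated with a short NumPy sketch. It assumes an AR model without intercept estimated by ordinary least squares; the helper names `fit_ar`, `inverse_roots`, `is_stationary`, and `forecast_next`, as well as the simulated AR(2) series, are hypothetical and serve only as an example.

```python
import numpy as np

def fit_ar(series, p):
    """Estimate AR(p) coefficients (no intercept) by ordinary least squares."""
    x = np.asarray(series, dtype=float)
    # Column i-1 of the design matrix holds the lag-i values x[t - i] for t = p..n-1.
    lags = np.column_stack([x[p - i:x.size - i] for i in range(1, p + 1)])
    target = x[p:]
    phi, *_ = np.linalg.lstsq(lags, target, rcond=None)
    return phi

def inverse_roots(phi):
    """Reciprocals of the roots of 1 - phi_1 z - ... - phi_p z^p."""
    # They are the roots of z^p - phi_1 z^(p-1) - ... - phi_p.
    return np.roots(np.concatenate(([1.0], -np.asarray(phi, dtype=float))))

def is_stationary(phi):
    """True when every inverse root lies strictly inside the unit circle."""
    return bool(np.all(np.abs(inverse_roots(phi)) < 1.0))

def forecast_next(series, phi):
    """One-step-ahead forecast from the last p observations."""
    x = np.asarray(series, dtype=float)
    p = len(phi)
    return float(np.dot(phi, x[-1:-p - 1:-1]))  # x[-1], x[-2], ..., x[-p]

# Simulated AR(2) series (assumed data) to exercise the helpers.
rng = np.random.default_rng(0)
true_phi = np.array([0.6, -0.3])
noise = rng.standard_normal(5000)
sim = np.zeros(5000)
for t in range(2, sim.size):
    sim[t] = true_phi[0] * sim[t - 1] + true_phi[1] * sim[t - 2] + noise[t]
est_phi = fit_ar(sim, 2)
```

Checking the inverse roots numerically mirrors the unit circle criterion stated above: a coefficient vector such as `[1.2]` fails the check, while the simulated AR(2) coefficients pass it.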

Introduction to SSA-AR Model.
The application of the SSA-AR model to short-term power load forecasting is divided into two steps: first, the SSA method is used to decompose and reconstruct the original power load series; then, the AR model is used to forecast with the reconstructed series. The specific calculation procedure of the SSA-AR model for short-term power load forecasting is shown in Figure 1.

Empirical Analysis
The hourly power load series of the Mid-Atlantic region in the PJM electricity market from 18 June to 18 July 2013, containing 744 sample points (31 days of 24 hourly values), is employed as the experimental data. The data are plotted against time in Figure 2.

Decomposing and Reconstructing the Original Power Load Series Based on SSA.
As mentioned earlier, the window length $L$ is the only parameter in the decomposition stage. If the time series has a periodic component, the window length is taken proportional to that period to get better separability. Therefore, $L = 24 \times 7 = 168$ h is assumed here, which corresponds to the weekly variation of the power load time series. This window length results in 168 eigentriples. Then, the singular value decomposition (SVD) is applied to the time-delay matrix by MATLAB programming. Figure 3 illustrates the trend of the 168 singular values. As shown in Figure 3, the singular values decay very quickly. The first 30 singular values account for 99.6% of the power load series and can nearly be regarded as the weekly trend. The singular values beyond the 30th are close to zero and make up the random component of the original power load series. Therefore, these components should be omitted when the series is reconstructed. The first 30 singular values in descending order (taken to the base-10 logarithm) are listed in Table 1.
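The way the leading singular values can be ranked and their share computed is sketched below in NumPy. The sketch rests on two assumptions that the text does not confirm: the share is taken as the cumulative sum of singular values divided by their total, and the series is a synthetic hourly load with daily and weekly cycles rather than the PJM data. The variable names are illustrative.

```python
import numpy as np

# Synthetic hourly load (assumed data): 31 days with daily and weekly cycles.
rng = np.random.default_rng(1)
hours = np.arange(744)
load = (40000.0
        + 6000.0 * np.sin(2 * np.pi * hours / 24)
        + 2000.0 * np.sin(2 * np.pi * hours / 168)
        + 500.0 * rng.standard_normal(hours.size))

window = 24 * 7  # weekly window length, 168 h
k = load.size - window + 1
# Time-delay (trajectory) matrix with `window` rows and k columns.
traj = np.column_stack([load[j:j + window] for j in range(k)])

# Singular values only, returned in descending order.
sing = np.linalg.svd(traj, compute_uv=False)

# Cumulative share of the leading singular values (assumed definition).
share = np.cumsum(sing) / np.sum(sing)
share_30 = share[29]           # share carried by the first 30 singular values
log_sing = np.log10(sing[:30])  # base-10 logarithms of the first 30 values
```

On real data, the index at which `share` flattens gives a practical cut-off for the number of components to keep in the reconstruction.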
Based on the above analysis, the power load series is reconstructed using the first 30 eigentriples. Figure 4 shows the gap between the reconstructed power load series (RS) and the original power load series (OS). As shown in Figure 4, the reconstruction error is satisfactorily small.
Figure 4 also shows that the reconstructed power load series fits the original series well; the correlation coefficient between the two reaches 0.9988. Therefore, the decomposition and reconstruction of the original power load series not only greatly reduce the matrix dimension, but also extract the main components of the original series. As a result, the dynamic trend of the series can be described better and a better prediction can be obtained.

Forecasting the Reconstructed Power Load Series Based on AR Model.
Because the AR model is only applicable to stationary time series, the first step is to examine the stationarity of the reconstructed power load series with the ADF unit root test. Using the software EViews 6.0, we obtain a $p$ value of 0.0623 > 0.05, which means that the null hypothesis of a unit root cannot be rejected. That is to say, the reconstructed power load series is a nonstationary time series. The stationarity of its first-order difference series is then tested, and the result is listed in Table 2. It can be seen that the first-order difference series is stationary.
The autocorrelation coefficient and partial correlation coefficient of the first-order difference series are shown in Figure 5. The autocorrelation coefficient decreases gradually with fluctuation, and the partial correlation coefficient tends to zero after the third order. Therefore, the AR(3) model can be established as follows:
$$\Delta x_t = \varphi_1 \Delta x_{t-1} + \varphi_2 \Delta x_{t-2} + \varphi_3 \Delta x_{t-3} + \varepsilon_t, \qquad (8)$$
where $\Delta x_t$ denotes the first-order difference of the reconstructed power load series, $\varphi_1$, $\varphi_2$, and $\varphi_3$ are the autoregressive coefficients, and $\varepsilon_t$ is the random disturbance. Then, (8) is estimated with the ordinary least squares (OLS) method, and the regression result is listed in Table 3. From Figure 6, we can see that the reciprocals of the roots of the lag operator polynomial of the AR(3) model fall within the unit circle, which indicates that this process is covariance stationary.
Then, the sample size is expanded from 744 to 745, and the AR(3) model is used to forecast the new value. By calculation, the 745th first-order difference value equals −3620.273. Therefore, the 745th point value can be forecasted, which equals 42867.027. In the same way, the value at any subsequent point of a given period can also be forecasted.
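The step from the forecasted difference to the forecasted level can be written out explicitly. The notation is assumed here for illustration: $\tilde{x}_t$ denotes the reconstructed series and a hat denotes a forecast. Under the reading that the level forecast is the last reconstructed value plus the forecasted difference,

$$\hat{x}_{745} = \tilde{x}_{744} + \Delta\hat{x}_{745},$$

the two reported figures imply a last reconstructed value of

$$\tilde{x}_{744} = 42867.027 - (-3620.273) = 46487.300.$$

Forecasts further ahead follow by applying the same recursion, with each newly forecasted difference added to the previous forecasted level.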

Comparing the Forecasting Results of Different Models.
The AR model, the SSA-LRF model, and the BPNN model are selected as the comparative models. The AR model is applied to the original power load series without any treatment that extracts the main trend. Although the AR model can represent the overall tendency of the original series well, its predicted values are not fully satisfactory. The SSA-LRF model combines the SSA method with the linear recurrent formula (LRF), in which the LRF is a simple linear combination of the known data whose coefficients are determined by the SSA method. The BPNN (backpropagation neural network) is made up of neurons. A sufficient number of neurons must be connected properly in the network, and the BPNN model must be trained appropriately, before it can simulate all types of nonlinear characteristics. The BPNN model is the most widely applied of all artificial neural network models and has been used in many fields related to forecasting. The hourly power load of the Mid-Atlantic region in the PJM market over the 24 hours of July 19, as predicted by the above four forecasting methods, is depicted in Figure 7.
In order to measure the performance of the four forecasting methods, two indices, the hourly mean error (HME) and the hourly peak error (HPE), are employed:
$$\mathrm{HME} = \frac{1}{24}\sum_{h=1}^{24} \frac{\left|L_h^{\mathrm{ACT}} - L_h^{\mathrm{FOR}}\right|}{L_h^{\mathrm{ACT}}} \times 100, \qquad \mathrm{HPE} = \max_{1 \le h \le 24} \frac{\left|L_h^{\mathrm{ACT}} - L_h^{\mathrm{FOR}}\right|}{L_h^{\mathrm{ACT}}} \times 100,$$
where $L_h^{\mathrm{ACT}}$ and $L_h^{\mathrm{FOR}}$ are the actual and predicted power load of hour $h$, respectively. The calculation results of the absolute error rate, HME, and HPE are shown in Figure 8. It can be seen that the HME and HPE of the SSA-AR model are 0.58 and 2.57, respectively, and that the absolute error rate of the SSA-AR model is the smallest among the four models. These results indicate that the SSA-AR model performs better than the SSA-LRF, AR, and BPNN models in short-term power load forecasting.
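The two error indices can be computed with a few lines of NumPy. The sketch assumes that HME is the mean, and HPE the maximum, of the hourly absolute percentage error over the forecast day; the function name `hourly_errors` and the three-point demonstration data are hypothetical.

```python
import numpy as np

def hourly_errors(actual, forecast):
    """Return the hourly absolute error rate (%), its mean (HME), and its maximum (HPE)."""
    actual = np.asarray(actual, dtype=float)
    forecast = np.asarray(forecast, dtype=float)
    # Absolute percentage error of each hour relative to the actual load.
    rate = np.abs(actual - forecast) / actual * 100.0
    return rate, float(rate.mean()), float(rate.max())

# Tiny illustrative example (assumed values, not PJM data).
actual_demo = [100.0, 200.0, 400.0]
forecast_demo = [99.0, 204.0, 400.0]
rate_demo, hme_demo, hpe_demo = hourly_errors(actual_demo, forecast_demo)
```

For a day-ahead comparison, `actual` and `forecast` would each hold the 24 hourly loads of the forecast day, and the function would be called once per model.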

Conclusions
In this paper, a hybrid short-term power load forecasting model based on singular spectrum analysis (SSA) and the autoregressive (AR) model is proposed. Short-term power load forecasting is vital to the fundamental operational functions of the electricity market, such as unit commitment, economic dispatch, interchange evaluation, scheduled maintenance, and security assessment. The SSA-AR model is employed here as a tool for short-term power load forecasting. First, the power load series is analyzed with the SSA method to obtain its effective and predictable components. Then, the AR method is used to forecast the future values of the power load series. The hybrid SSA-AR power load forecasting method is examined using the experimental data of the Mid-Atlantic region in the PJM electricity market. The results show that the proposed method predicts the power load series well. However, one point needs emphasis: this method does not take the factors influencing the power load fluctuation into account. If outside conditions, such as the political, economic, or climate situation, change suddenly, this method may not work. Therefore, this method is applicable to short-term power load forecasting in the absence of drastic changes in outside conditions.

Figure 1: The specific flow chart of the SSA-AR model.

Figure 2: The hourly load curve of the Mid-Atlantic region from June 18 to July 18 in 2013.

Figure 4: The gap between the reconstructed power load series (RS) and the original power load series.

Figure 5: The autocorrelation and partial correlation of the reconstructed series.

Figure 6: Unit circle test of the covariance stationarity.

Figure 7: The comparison of prediction results of different models and the actual value.

Figure 8: The forecasting error comparison of four methods: (a) absolute error rate; (b) HME and HPE.

Table 1: The first 30 singular values in descending order (taken to the base-10 logarithm).

Table 2: The result of the ADF test on the first-order difference power load series.

Table 3: The regression result of the AR(3) model of the reconstructed power load series.