Electricity Price Forecast Using Combined Models with Adaptive Weights Selected and Errors Calibrated by Hidden Markov Model

A combined forecast with weights adaptively selected and errors calibrated by a hidden Markov model (HMM) is proposed to model the day-ahead electricity price. First, several single models were built to forecast the electricity price separately. Then the validation errors from every individual model were transformed into two discrete sequences, an emission sequence and a state sequence, to build the HMM, yielding a transition matrix and an emission matrix that represent the forecasting ability state of the individual models. The combining weights of the individual models were decided by the state transition matrices of the HMMs and by the ratio of best-predicted samples of each individual model among all the models in the validation set. The individual forecasts were averaged with these weights to obtain the combined forecast. The residuals of the combined forecast were calibrated by the possible error calculated from the emission matrix of the HMM. A case study of the day-ahead electricity market of Pennsylvania-New Jersey-Maryland (PJM), USA, suggests that the proposed method outperforms individual price forecasting techniques such as support vector machine (SVM), generalized regression neural networks (GRNN), day-ahead modeling, and self-organized map (SOM) similar days modeling.


Introduction
Since the 1990s, the vertically integrated monopoly utilities of electric power industries around the world have been deregulated into competitive markets, aiming to break monopolies and increase operating efficiency. It is crucial for all market participants to predict the electricity price with high accuracy. Their bidding actions depend on the forecasts, and their profits are affected accordingly; thus price forecasting draws great interest.
Electricity price is affected by various uncertainties, such as power load, weather, and bidders' expectations. These factors interact and have an intricate impact on price. Electricity price is more volatile than load, with unexpected spikes (unusual prices), high frequency, and multiple seasonality (e.g., daily and weekly periodicity), so it is more difficult to predict than power load. There are primarily two categories of electricity price forecasting models: time series modeling and artificial intelligence (AI) modeling. Time series modeling, such as autoregressive moving average (ARMA) and generalized autoregressive conditional heteroscedasticity (GARCH), forecasts future prices from available historical prices by mining the relations contained in the data. Contreras et al. [1] used an ARMA model to forecast next-day electricity prices for the mainland Spain and Californian markets. A novel technique based on wavelet transform and ARIMA models was proposed to forecast day-ahead electricity prices in [2]. A more robust time series model, GARCH, was developed to forecast day-ahead electricity prices in [3,4]. Time series modeling tries to mine the information contained in previous data but pays less attention to external influences, which leads to undesirable forecasts given the unstable character of prices.
AI modeling usually exploits more circumstantial influence factors than time series modeling and thus presents more desirable results. Artificial neural networks (ANNs) were developed to forecast electricity prices and showed better performance than time series modeling [5,6]. An ANN model based on similar days was proposed to forecast day-ahead electricity prices in the Pennsylvania-New Jersey-Maryland (PJM) market [7]. A technique combining the Probability Neural Network (PNN) and Orthogonal Experimental Design (OED) was developed in [8], showing better performance than its counterparts.
Limited by the complexity of AI models, the information contained in historical prices is not fully used. A hybrid model, with support vector machines (SVM) to capture the nonlinear patterns and ARIMA to solve the residual regression estimation problem, was proposed in [9], showing the great potential of hybrid modeling. Another hybrid model combining SVM and GARCH was developed in [10] to forecast the day-ahead price of the PJM market.
Time series modeling and AI modeling have different weaknesses and strengths in price forecasting, since they place different emphasis on how to exploit the information that influences electricity price. Combining several predictions from different methods has been suggested to smooth the fluctuations that often occur in single-model forecasting. The performance of traditional combined forecast models relies on the combining weights of the individual models, which are usually fixed and determined by the historical performance of the models. Fixed weights are not always the best choice because the forecasting abilities of individual models vary with the circumstances: sometimes one model performs better, and at other times it does not. It is therefore necessary to select the combining weights of the individual models according to their performance under the given circumstances. However, it is a big challenge to analyze the circumstances and then to determine the proper combining weights of the individual models under them. On the other hand, neither a single model nor a combined model can make full use of the influence information, and the modeling residuals usually contain information that has not been exploited by the models. Analyzing the residual series and then estimating the residual of the next step helps to improve forecasting accuracy [11].
A Markov chain is a random process that undergoes transitions from one state to another. It has an important property: the next state depends only on the current state and not on the sequence of events that preceded it. Markov chains can be used to analyze forecasting performance [12][13][14]. A hidden Markov model (HMM) is a statistical Markov model in which the system being modeled is assumed to be a Markov process with unobserved (hidden) states. We can apply HMM to exploit the information contained in the forecasting error sequence. The forecasting errors can be treated as the observations of the HMM, and the forecasting abilities of a model under certain circumstances can be regarded as the states [15]. In this paper, a hybrid method consisting of a combined model with circumstance-based adaptive weights and an error calibration technique is proposed to forecast the day-ahead electricity price. Several individual models were developed to forecast the electricity price separately; then their performances under different circumstances were evaluated to build hidden Markov models (HMMs). Together with the general past performance of the individual models, the state sequences of the HMMs were used to decide the combining weights; the emission sequences of the HMMs were exploited to calibrate the errors of the combined model.
The rest of the paper is organized as follows. Section 2 describes the fundamentals of HMM and the principle of combined forecasting by HMM; Section 3 presents the approach of combined modeling and error calibration with HMM. Experiments with the proposed technique and the compared methods are shown in Section 4. Finally, the conclusions are presented in Section 5.

Principle of Combined Forecast with Weights Selected by HMM
In this section the basic ideas of combined forecast with weights selected by HMM are discussed.
2.1. Principle of HMM. An HMM can be regarded as a dual random process: a sequence of emissions that can be observed, and an invisible sequence of states from which the emissions are generated. There are two kinds of HMM, discrete and continuous. Here we discuss the former and apply it to build the combined model. For simplicity, we give only a brief introduction to the discrete HMM; more details of the HMM principle and how HMM works can be found in [15,16]. A discrete HMM can be described by five parameters, $\lambda = (S, O, A, B, \pi)$, where
(1) $S$: the set of states in which the observations are generated, $S = \{s_1, \ldots, s_N\}$, where $N$ is the number of states;
(2) $O$: the set of emissions or observations, $O = \{v_1, \ldots, v_M\}$, where $M$ is the number of potential observations (or emissions) in each state;
(3) $A$: the transition matrix, which describes the probability of a transition from a given state to another state, $A = (a_{ij})_{N \times N}$, where $a_{ij} = P(q_{t+1} = s_j \mid q_t = s_i)$;
(4) $B$: the emission matrix, whose $(j, k)$ entry gives the probability of emitting symbol $v_k$ given that the model is in state $s_j$, $B = (b_{jk})_{N \times M}$, where $b_{jk} = P(o_t = v_k \mid q_t = s_j)$;
(5) $\pi$: the vector of the initial state distribution, $\pi = (\pi_1, \pi_2, \ldots, \pi_N)$.
HMM mainly aims to resolve three problems: (1) to evaluate the most likely state path of a given sequence of emissions; (2) to estimate transition and emission probabilities of a given sequence of emissions; (3) to calculate the posterior probability that the model is in a particular state at any point in the sequence.

Combined Forecast by HMM.
The basic idea of combined forecasting is to form a weighted sum of the forecasts from different models in order to reduce the defects of any individual modeling method. In this paper, we use HMM to determine the weights of the combined model.
In electricity price forecasting, a sequence of errors generated from price modeling can be considered as an HMM process. The intervals in which the forecasting errors fall form the sequence of observations (the emission sequence), and the forecasting abilities of the individual models are regarded as the states of the HMMs. The HMMs are built from the validation errors of the individual models. The next states of the HMMs, which describe the abilities of the individual models, are then used to decide the combining weights. The possible next emissions of the individual models are averaged with the combining weights and then used to calibrate the combined forecast.

Error Calibration by HMM.
With the state probability vector of the next step assessed in Section 2.2 and the emission matrix B of the HMM, the probabilities of the emissions in the next step can be calculated. Since the different emissions represent the intervals into which the error falls, as mentioned in Section 2.2, we can convert the emissions and their probabilities into an expected value of the forecasting error. This expected value is then used for error calibration of the combined model.
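As a hedged illustration of this calculation, suppose the HMM has $N$ states and $M$ emission symbols and is currently in state $s_j$, so that the next-state probabilities are the $j$th row $\mathbf{a}_j$ of $A$. The symbols $\boldsymbol{\beta}$, $c_k$, and $\tilde{e}$ are notation introduced here only for illustration, and representing each error interval by a single value $c_k$ (for instance its midpoint) is an assumption rather than something specified in the text.

$$\boldsymbol{\beta} = \mathbf{a}_j B, \qquad \beta_k = \sum_{l=1}^{N} a_{jl}\, b_{lk}, \qquad \tilde{e} = \sum_{k=1}^{M} \beta_k\, c_k .$$

Here $\beta_k$ is the probability that the next error falls in the $k$th interval, and $\tilde{e}$ is the resulting expected error that would be used for calibration.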

Approach of Combined Forecasting and Error Calibration by HMM
This section describes how to build a combined model with error calibration based on HMM techniques, as shown in Figure 1. Considering that the prices in different hours differ greatly, we build 24 combined models to forecast the hourly prices one by one. The approach is the same for every hour, so we take one hour as an example to show the modeling approach. The proposed method consists of the following 7 steps.
Step 1 (initialization). This step includes data pretreatment and candidate model selection. We divide the experimental data into three sets: a training set, a validation set, and a test set. The first is used to train the models, the second is used to tune the models' parameters according to their performance, and the last is used to evaluate the modeling algorithms.
Step 2. Individual modeling for combined forecast.
The following process is repeated for each individual model.

Substep 1. Build the individual model and calculate the validation error vector e and forecasting price p.
We train the model $m_i$ with the training set, tune the parameters of $m_i$ with the validation set, and then test $m_i$ with the test set. From these steps, we obtain the validation error vector $\mathbf{e}$ (see (1)) and the forecasting price vector $\mathbf{p}$ (see (3)) separately:
$$\mathbf{e} = (e_{h,1}, e_{h,2}, \ldots, e_{h,n_v}), \qquad (1)$$
where $e_{h,d}$ is the validation error of $m_i$ for the $h$th hour on the $d$th day and $n_v$ is the number of days in the validation set. Each error $e$ is calculated from the forecast price $\hat{y}$ and the actual price $y$. Similarly,
$$\mathbf{p} = (p_{h,1}, p_{h,2}, \ldots, p_{h,n_t}), \qquad (3)$$
where $p_{h,d}$ is the forecasting price of $m_i$ for the $h$th hour on the $d$th day and $n_t$ is the number of days in the test set.
Substep 2. Calculate the emission sequence and the state sequence of $m_i$. In this step, for each model we transform the error sequence into a discrete emission (observation) sequence and classify the states according to the error, which denotes the forecasting ability of the model.
Substep 3. Calculate the emission sequence and the state sequence of $m_i$. In this step, we transform the error sequence into a discrete emission sequence and obtain the state sequence according to the modeling performance, which denotes the forecasting ability of the model.
As discrete HMMs are discussed here, the emission sequence needs to be discretized. We divide the range over which $e$ spreads into several intervals and mark the interval into which each $e$ falls with an emission value (an element of the emission set). We then obtain the emission vector $\mathbf{s}^{e}$ according to the intervals into which the errors fall:
$$\mathbf{s}^{e} = (s^{e}_{h,1}, s^{e}_{h,2}, \ldots, s^{e}_{h,n_v}), \qquad (4)$$
where $s^{e}_{h,d}$ is the emission of the model $m_i$, $s^{e}_{h,d} \in O$, and $O$ is the emission set.
Then we determine the state sequence $\mathbf{s}^{s}$ according to criteria based on the model's performance. For simplicity, we set just three states to reveal the forecasting ability of the model: $s_1$, the state of underestimation (when the target price is significantly underestimated); $s_2$, the state of proper prediction (when the target price is estimated with acceptable accuracy); and $s_3$, the state of overestimation (when the target price is significantly overestimated):
$$\mathbf{s}^{s} = (s^{s}_{h,1}, s^{s}_{h,2}, \ldots, s^{s}_{h,n_v}), \qquad (5)$$
where $s^{s}_{h,d}$ is the state of the individual candidate in the $h$th hour of the $d$th day, $s^{s}_{h,d} \in S$, and $S$ is the state set.
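The discretization described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the interval boundaries (`INTERIOR_EDGES`), the tolerance for a proper forecast (`PROPER_TOL`), and the function names are hypothetical, since the exact cut points and state criteria are not given in the text. The sketch also assumes that a negative error means the price was underestimated.

```python
from bisect import bisect_right

# Hypothetical interior boundaries: equal-width cut points over [-0.08, 0.08]
# giving 5 intervals. The actual boundaries are an assumption of this sketch.
INTERIOR_EDGES = [-0.048, -0.016, 0.016, 0.048]

# Hypothetical tolerance for a "proper" forecast; the real criterion is not shown.
PROPER_TOL = 0.016


def to_emissions(errors, edges=INTERIOR_EDGES):
    """Map each error to an emission symbol 1..5 (index of its interval).

    Errors beyond the outer bounds fall into the first or last interval.
    """
    return [bisect_right(edges, e) + 1 for e in errors]


def to_states(errors, tol=PROPER_TOL):
    """Map each error to a state: 1 = underestimate, 2 = proper, 3 = overestimate.

    Assumes error = forecast - actual (possibly normalised), so a negative
    error means the price was underestimated.
    """
    states = []
    for e in errors:
        if e < -tol:
            states.append(1)
        elif e > tol:
            states.append(3)
        else:
            states.append(2)
    return states
```

Both functions return 1-based labels so that they can index the rows and columns of the transition and emission matrices directly.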

Substep 4. Estimate transition matrix A and emission matrix B.
In this step, the maximum likelihood estimates of the transition matrix A and the emission matrix B are calculated from the known $\mathbf{s}^{s}$ and $\mathbf{s}^{e}$. This can be easily accomplished with the function hmmestimate in Matlab (readers interested in its theory may see Durbin, R., S. Eddy, A. Krogh, and G. Mitchison, Biological Sequence Analysis, Cambridge, UK: Cambridge University Press, 1998), so we do not give a detailed description here.
Given an initial state distribution, together with the state sequence and the emission sequence, we can estimate the transition matrix A (see (6)) and the emission matrix B (see (7)) for the HMM of the $h$th hour:
$$A = \begin{pmatrix} a_{11} & \cdots & a_{1N} \\ \vdots & \ddots & \vdots \\ a_{N1} & \cdots & a_{NN} \end{pmatrix}, \qquad (6)$$
where $a_{ij}$ is the transition probability from the $i$th state to the $j$th state and $N$ is the number of states, and
$$B = \begin{pmatrix} b_{11} & \cdots & b_{1M} \\ \vdots & \ddots & \vdots \\ b_{N1} & \cdots & b_{NM} \end{pmatrix}, \qquad (7)$$
where $b_{jk}$ is the probability of the $k$th emitting symbol under the state $s_j$ and $M$ is the number of emission classes.
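When both the state sequence and the emission sequence are known, the maximum likelihood estimate reduces to normalised counts. The following Python sketch shows that idea; the function name `estimate_hmm` is hypothetical, and the code is a plain count-based estimate that may differ in details (for example, the treatment of the initial state or of pseudocounts) from Matlab's hmmestimate.

```python
def estimate_hmm(states, emissions, n_states, n_symbols):
    """Count-based estimate of the transition matrix A and emission matrix B.

    `states` and `emissions` are equal-length lists of 1-based labels.
    Rows that never occur are left as zeros instead of being smoothed.
    """
    trans = [[0.0] * n_states for _ in range(n_states)]
    emis = [[0.0] * n_symbols for _ in range(n_states)]

    # Count transitions between consecutive samples of the state sequence.
    for cur, nxt in zip(states, states[1:]):
        trans[cur - 1][nxt - 1] += 1
    # Count which symbol was emitted while the model was in each state.
    for s, o in zip(states, emissions):
        emis[s - 1][o - 1] += 1

    def normalise(rows):
        out = []
        for row in rows:
            total = sum(row)
            out.append([c / total if total else 0.0 for c in row])
        return out

    return normalise(trans), normalise(emis)
```

For the proposed method, one such pair of matrices would be estimated per individual model and per hour from the validation set.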
Substep 5. Obtain the probabilities of the next state.
In Substep 4, we obtained the transition matrix A and the emission matrix B. As discussed previously, A describes the probability of a transition from a given state to another state, and B gives the probability of each emitting symbol under the different states. So for a given state $s^{s}_{h,d}$ (suppose $s^{s}_{h,d} = s_j$, $1 \le j \le N$) in the $h$th hour on the $d$th day, the probabilities of the next state (in the $h$th hour on the $(d+1)$th day) are given by the vector $\mathbf{a}_j$, the $j$th row of the transition matrix A.
Step 3. Calculate probabilities of the next emission and estimate the possible error generated by the model.
The probabilities of the next state $\mathbf{a}_j$ obtained in Substep 5 are multiplied by the emission matrix B to calculate the probabilities of the next emission.
The emission probabilities are then transformed into a continuous possible error $\varepsilon_i$ using the intervals defined in Substep 3.
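A minimal Python sketch of this step is given below. The function names are hypothetical, and converting the emission probabilities into a single error value by weighting one representative value per interval (such as its midpoint) is an assumption of this sketch, since the exact conversion is not spelled out in the text.

```python
def next_emission_probs(A, B, current_state):
    """Probabilities of each emission symbol at the next step.

    `current_state` is a 1-based state label. The matching row of A holds the
    next-state probabilities, which are propagated through B.
    """
    next_state_probs = A[current_state - 1]
    n_symbols = len(B[0])
    return [
        sum(next_state_probs[j] * B[j][k] for j in range(len(B)))
        for k in range(n_symbols)
    ]


def expected_error(emission_probs, interval_values):
    """Probability-weighted error using one representative value per interval.

    Using interval midpoints as representatives is an assumption.
    """
    return sum(p * v for p, v in zip(emission_probs, interval_values))
```

The first function is the row-vector-by-matrix product of the next-state probabilities and B; the second turns the resulting distribution over intervals into one possible error value for the model.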
Step 4. Calculate the combining weights of the next step.
In this step, the combining weights of the different models are determined. The number of validation samples in the proper state is used to evaluate the ability of each model: $r_i$ denotes the historical forecasting ability of $m_i$, and $c_i$ is the number of samples in state $s_2$ by model $m_i$ in the validation set. The proper forecasting probability $\rho_i$ of $m_i$ in the vector $\mathbf{a}_j$ and the ability $r_i$ are then used to calculate $w_i$, the combining weight of model $m_i$ in the next step.
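One possible reading of this step is sketched below in Python. The exact formulas are not shown in the text, so the sketch makes two labeled assumptions: the historical ability of a model is its share of proper-state samples among all models, and the weight is proportional to the product of that ability and the next-step proper-state probability, normalised to sum to 1. The function name is hypothetical.

```python
def combining_weights(proper_counts, proper_probs):
    """Weights proportional to (historical ability) x (next-step proper probability).

    proper_counts[i]: validation samples for which model i was in the proper state.
    proper_probs[i]:  probability that model i is in the proper state at the next step.
    Normalising the abilities and the weights to sum to 1 is an assumption.
    """
    total_count = sum(proper_counts)
    if total_count == 0:
        raise ValueError("no model has any proper-state sample")
    ability = [c / total_count for c in proper_counts]
    raw = [r * p for r, p in zip(ability, proper_probs)]
    total = sum(raw)
    if total == 0:
        # No model is expected to be in the proper state: fall back to abilities.
        return ability
    return [x / total for x in raw]
```

With this choice, a model that was often proper in the validation set but is unlikely to be proper in the next step still receives a reduced weight, which matches the adaptive intent described above.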
Step 5. Get the combined forecast. The forecasts from the individual models are averaged with the weights obtained in Step 4 to get a combined forecast:
$$\hat{p} = \sum_{i} w_i\, p_i, \qquad (10)$$
where $\hat{p}$ is the combined forecast and $p_i$ is the forecast by $m_i$.
Step 6. Calculate the expected value of the forecasting error of the combined model. The possible errors from the individual models are averaged with the combining weights obtained in Step 4 to get the possible error of the combined forecast:
$$\varepsilon = \sum_{i} w_i\, \varepsilon_i, \qquad (11)$$
where $\varepsilon$ is the possible error of the combined forecast and $\varepsilon_i$ is the possible error from $m_i$.
Step 7. Obtain the final forecast by calibrating the combined forecast with the expected value of the error. New prices and loads are appended during the modeling process so that the models adapt to new circumstances.
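Steps 5 to 7 can be sketched together in Python as follows. The function name is hypothetical, and the calibration rule is an assumption: the sketch takes the error to be forecast minus actual, in the same units as the price, so the expected error is subtracted from the combined forecast. If the errors are relative rather than absolute, the correction would need to be rescaled accordingly.

```python
def calibrated_forecast(weights, forecasts, possible_errors):
    """Return (combined forecast, combined possible error, calibrated forecast).

    Assumes error = forecast - actual in price units, so the expected error
    is subtracted from the combined forecast.
    """
    combined = sum(w * p for w, p in zip(weights, forecasts))
    error = sum(w * e for w, e in zip(weights, possible_errors))
    return combined, error, combined - error
```

For example, with weights 0.5, 0.3, and 0.2, individual forecasts of 40, 50, and 45, and possible errors of 2, -1, and 0.5, the combined forecast is 44, the combined possible error is 0.8, and the calibrated forecast is 43.2.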

Numerical Results
To account for influences from different aspects, we select various methods as modeling candidates: intelligent algorithm modeling (SVM and generalized regression neural networks (GRNN)), time series modeling (GARCH and GARCHX), and two direct methods. One direct method is day-ahead modeling, in which we take the hourly price of the previous day as the forecasting price. The other direct method is SOM similar days modeling, in which we find the similar days in the validation set by SOM and then average the hourly prices of the similar days as the forecasting price. For simplicity, we use $m_1$ to $m_6$ to represent SVM, GRNN, GARCH, GARCHX, day-ahead modeling, and SOM similar days modeling, respectively.
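The simplest of these candidates, day-ahead modeling, can be sketched in a few lines of Python, together with the MAPE measure used to compare the candidates. The function names are hypothetical, and the MAPE shown is the usual definition (mean of absolute errors relative to the actual price, in percent), which is assumed here because the text does not write it out.

```python
def day_ahead_forecast(daily_prices):
    """Forecast each day with the hourly prices of the previous day.

    `daily_prices` is a list of days, each a list of hourly prices. The result
    has one entry per day from the second day onward.
    """
    return [list(day) for day in daily_prices[:-1]]


def mape(actual, forecast):
    """Mean absolute percentage error (in percent) over two flat sequences."""
    terms = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(terms) / len(terms)


def flatten(days):
    """Concatenate a list of daily price lists into one flat list."""
    return [price for day in days for price in day]
```

To score the day-ahead model, the forecasts returned for days 2 onward are compared with the actual prices of those days, for example `mape(flatten(prices[1:]), flatten(day_ahead_forecast(prices)))`.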
The performance of intelligent algorithms is sensitive to the input, so the modeling data are pretreated to eliminate scale effects before modeling. All the numeric data are scaled to [0, 1]:
$$x'_{ij} = \frac{x_{ij} - x_j^{\min}}{x_j^{\max} - x_j^{\min}},$$
where $x'_{ij}$ is the scaled value of the $j$th attribute of the $i$th sample, $x_{ij}$ is the raw sample value, $x_j^{\min}$ is the minimum of the $j$th attribute over all the samples (all the data from January 1 to July 31), and $x_j^{\max}$ is the maximum of the $j$th attribute over all the samples (ditto).
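The scaling can be sketched in Python as below. The function name is hypothetical, and mapping a constant attribute to 0 is a choice made in this sketch to avoid division by zero; the text does not discuss that case.

```python
def min_max_scale(samples):
    """Scale every attribute (column) of `samples` to the range [0, 1].

    `samples` is a list of rows; each row lists the attributes of one sample.
    Constant attributes are mapped to 0.0 to avoid dividing by zero.
    """
    columns = list(zip(*samples))
    mins = [min(col) for col in columns]
    maxs = [max(col) for col in columns]
    scaled = []
    for row in samples:
        scaled.append([
            (x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, mins, maxs)
        ])
    return scaled
```

In practice the minima and maxima would be taken from the data available before the test period and then reused to scale new samples, so that no information from the test set leaks into the scaling.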
The input for the SVM, GRNN, and SOM models is $\{L_t, L_{t-24}, P_{t-24}, P_{t-48}, P_{t-168}, P_{t-192}, D_t, D_{t-24}\}$. $L_t$ is the forecast load of the target hour (it can be predicted day-ahead with high accuracy, so here we use the actual load as the forecast), and $P_{h,d,w}$ is the $h$th hourly price on the $d$th day of the $w$th week in 2010. As the data from August 1 to August 31 are used to test the model, the prices of the previous 30 weeks are used to calculate the daily variable $D_t$. The exogenous variables in GARCHX are   and  ,−1 . The parameters of the individual models and their MAPE (for the validation set) are listed in Table 1. From Table 1, we can see that SVM outperforms the other models, while the remaining models have similar results.
Figure 2 shows the distribution of the validation errors of the candidates. SVM clearly outperforms the other candidates, and day-ahead modeling, whose distribution is right-tailed, differs from the others.
Table 2 lists the correlation coefficients between the validation errors of the candidates.
From Table 2, we can see that $m_3$, $m_4$, and $m_5$ show high correlation. Since each model makes some contribution and we mainly focus on the selection of combining weights and the error calibration, all six models are taken as candidates for the combined model.

Build HMM with Validation Errors of the Individual Models.
In this part, HMMs are built from the validation error series of the selected models. As discussed in Step 2 of Section 3, the validation error sequence of $m_i$ is discretized to build the HMM. Considering that most of the errors fall in the range [−0.08, 0.08], we divide the range into 5 intervals. With the process described in detail in Substeps 3 and 4 of Section 3, we get the transition matrix A and the emission matrix B for $m_i$.
Given $s^{s}_{h,d}$, the model state of the $h$th hour on the $d$th day, the state probabilities for the same hour of the next day can be easily obtained from the transition matrix A. Then the probabilities and the historical abilities of the individual models are multiplied to generate the combination weights, according to (9). The combined forecast is obtained by multiplying the forecasts of the different models by the weights derived from the probabilities of their probable states, according to (10).

Calibrate the Combined Forecast with the Possible Combining Error.
In this step, the possible errors of the next step for the different candidates are estimated from their emission matrices B, as described in Substep 4 of Section 3; they are then used to calibrate the combined forecast, according to (11).

Performance Comparison and Analysis.
Table 3 shows the performance of the different models. It can be seen that the combination model significantly outperforms all the individual models. $m_1$ and $m_2$ also forecast better than the other individual models, just as in the validation set. Figure 3 compares the actual prices, the SVM forecast, and the combined forecast with error calibration. The forecasts of the other models are omitted from the figure for simplicity, since they are not as good as SVM. We can see from Figure 3 that most of the prices are properly predicted by the combination model. Some extreme prices are too low or too high to be captured by either of the two models. SVM overestimates the hourly prices, especially for hours that had high prices on the previous day.
The effect of error calibration is shown in Figure 4. The colors in Figure 4 show the value of the forecasting error: red denotes a large positive $e$ (strong overestimation) and blue denotes a large negative $e$ (marked underestimation), as indicated by the color bar on the right. The largest differences in forecasting error can be seen in the black circle in the right part of Figure 4. The calibration reduces the extreme errors (too high or too low) of the combined model; it also increases the number of proper forecasts, whose errors fall around zero, displayed by the grey zone.
Figure 5 shows the error distributions of the different models. The errors span the range [−0.4, 0.6]. The minimum and maximum errors reveal that some prices are not properly forecast by some models. If we probe it deeply, we can find that SOM similar day modeling and day-ahead

Conclusions
This paper proposes a comprehensive combined forecasting technique for the day-ahead price based on HMM. Several models, namely SVM, GRNN, GARCH, GARCHX, day-ahead modeling, and SOM similar day modeling, are selected as candidates.
The error distribution of each model is used to determine the states of the HMM, and the intervals in which the errors fall are marked as the emissions of the HMM. The state sequence and the emission sequence are then used to estimate the HMM. Given the state of the current hour, the state probabilities for the next day can be obtained from the transition matrix A. These probabilities are used to form the combination weights for the combined forecast. The HMMs are thus used both to calculate the weights of the combined model and to calibrate its error. The combined forecast can adapt to varying circumstances by changing its weights dynamically with the HMM, and the error calibration technique helps to reduce the error generated by the combined model. The case study forecasting summer prices in the PJM market shows that the proposed method outperforms the comparison methods, including SVM.

Figure 2: Validation error distribution of the candidate models.

4.3. Selecting Weights for Combined Forecast. Suppose that we have obtained all the forecasting errors of the 6 models for the $h$th hour on the $d$th day, as well as the forecasting price for the $h$th hour on the $(d+1)$th day. For each model, we separate the range in which the errors fall into three zones, $s_1$, $s_2$, and $s_3$, as discussed in Substep 3, indicating the forecasting ability of the model under certain circumstances.

Figure 5: Histogram of errors by the different models.
The proposed method is validated on the PJM day-ahead electricity market. Considering that the electricity price in summer is more volatile than in other seasons, we apply the method to forecast the hourly prices of August 2010. The data from July serve as validation data, and the data from June serve as training data.

Table 1: Parameters of the candidate models and validation MAPE. $P_{t-24}$ is the price 24 hours before the target hour, and so on. $D_t$ is a daily variable reflecting the price fluctuation across different day types.

Table 2: Correlation coefficients of validation error sequences between the candidate models.

Table 3: Test MAPE of the candidate models.