Short-Term City Electric Load Forecasting with Considering Temperature Effects: An Improved ARIMAX Model

Short-term electric load is significantly affected by weather, especially by temperature in summer. Such external factors can produce mutation structures in load data, so city electric load under the influence of temperature cannot be forecasted as easily as usual. This research analyzes the relationship between electricity load and daily temperature in a city and proposes an improved ARIMAX model to deal with the mutation data structures. The information criterion values of the improved ARIMAX model are found to be smaller than those of the classic method, and its relative error is lower than that of the AR, ARMA, and Sigmoid-Function ANN models, so the forecasting results fit the data more accurately. The improved model is valuable for handling mutation data structures in load forecasting and is an effective technique for forecasting electric load with temperature effects.


Introduction
Short-term load forecasting (STLF) is mainly used to forecast the power load for the next few days or the coming week [1][2][3]. It plays an important role in modern electricity Demand Side Management (DSM), as its accuracy directly affects the economic cost of operators in the electricity market. Accurate load forecasting supports the secure, stable, and economic operation of the power grid. It also helps in making reasonable arrangements for maintenance plans. Meanwhile, power load forecasting can optimize power system dispatch and reduce production cost.
Short-term daily peak power load in summer fluctuates regularly, showing an obvious periodic characteristic. It is greatly affected by temperature, wind, precipitation, and other meteorological factors, and there are significant mutation structures in load data [4][5][6]. Traditional methods in power load forecasting include regression models, gray models, support vector machines, neural networks, and time series models. Cancelo et al. [7] forecast the electricity load from a day to a week ahead for Red Eléctrica de España (REE). Hippert et al. [8] adapted large neural networks to electricity load forecasting to handle nonlinear time series data. Amaral and Castro Souza [9] used smooth transition periodic autoregressive (STPAR) models for short-term load forecasting. Amjady and Keynia [10] proposed a new neural network learning algorithm based on a new modified harmony search technique. This learning algorithm widely searches the solution space in various directions, by which overfitting and trapping in local minima and dead bands can be avoided. Wangdi et al. [11] adapted the ARIMAX model to determine predictors of malaria for the subsequent month, and the test showed that prediction accuracy was greatly improved. Chadsuthi et al. [12] studied seasonal leptospirosis transmission and its association with rainfall and temperature by using the ARIMAX model, showing that factoring in rainfall (with an 8-month lag) yields the best model for the northern region.
The above forecasting methods are effective in dealing with mutation structures and intelligent algorithms. However, they are not ideal in practical operation due to the limitations of data and laboratory equipment, and their generalization capability is weak. Traditional time series forecasting methods emphasize the role of time without considering the effects of external factors. Thus, the forecasting accuracy of time series methods is poor, which is an obvious defect [13,14].
Based on the above research, an improved ARIMAX model is proposed here by combining traditional time series analysis with regression analysis to forecast short-term electric load, which has strong practical value in the short-term power load forecasting field. This model fills the gap regarding external effects on electric load. The prediction results show that the improved ARIMAX model has smaller information criterion values than AR($p$) or ARMA($p$, $q$) [15,16].
Mathematical Problems in Engineering

Sigmoid-Function ANN Model
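An ANN (Artificial Neural Network) is a very practical forecasting technology in the short-term electric load forecasting field, especially for nonlinear data. The basic concepts of the ANN are shown as

$$\mathrm{Net}_j = \sum_i w_{ij} o_i,$$

where $\mathrm{Net}_j$ is the state of network unit $u_j$. Consider the sigmoid function

$$f(x) = \frac{1}{1 + e^{-x}}.$$

The output unit:

$$o_j = f(\mathrm{Net}_j) = \frac{1}{1 + e^{-\mathrm{Net}_j}}.$$

Taking the first-order derivative of the output unit, $f'(\mathrm{Net}_j)$ is obtained as follows:

$$f'(\mathrm{Net}_j) = o_j (1 - o_j).$$

Output layer unit:

$$\delta_j = (t_j - o_j)\, o_j (1 - o_j).$$

Hidden layer unit:

$$\delta_j = o_j (1 - o_j) \sum_k \delta_k w_{jk}.$$

Weight tuning function:

$$\Delta w_{ij}(n + 1) = \eta\, \delta_j o_i,$$

where $t_j$ is the target output of unit $u_j$ and $\eta$ is the learning rate.

The specific algorithm of the Sigmoid-Function ANN model is as follows:
(1) The initial value of each weight or threshold is defined as $w_{ij}(0)$, where $w_{ij}(0)$ is a small random number.
…
(6) When the sample index $p$ reaches $P$, judge whether $E \le \varepsilon$.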
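The update rules above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the function names, the target value `t`, and the learning rate `eta` are assumptions introduced here, and the standard backpropagation form of the error terms is assumed.

```python
import math

def sigmoid(net):
    """Sigmoid activation f(Net) = 1 / (1 + exp(-Net))."""
    return 1.0 / (1.0 + math.exp(-net))

def sigmoid_derivative(o):
    """Derivative f'(Net), written through the unit output o = f(Net)."""
    return o * (1.0 - o)

def output_delta(t, o):
    """Error term of an output unit with target t (assumed symbol) and output o."""
    return (t - o) * sigmoid_derivative(o)

def hidden_delta(o, downstream_deltas, downstream_weights):
    """Error term of a hidden unit: its derivative times the weighted
    sum of the error terms of the units it feeds."""
    back = sum(d * w for d, w in zip(downstream_deltas, downstream_weights))
    return sigmoid_derivative(o) * back

def weight_update(eta, delta_j, o_i):
    """Weight change for the connection i -> j: eta * delta_j * o_i."""
    return eta * delta_j * o_i

# One hand-checkable step for a single output unit fed by one input.
o_in = 1.0                                    # output of the upstream unit
w = 0.0                                       # current weight
o_out = sigmoid(w * o_in)                     # sigmoid(0) = 0.5
d_out = output_delta(1.0, o_out)              # (1 - 0.5) * 0.5 * 0.5
w_new = w + weight_update(0.5, d_out, o_in)   # weight after one update
```

Repeating this step over all training samples and stopping once the total error falls below the accuracy requirement corresponds to the loop that the algorithm steps describe.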

Time Series Theory
Time series analysis is a typical time-domain analysis method. It can be used to reveal the internal laws of a sequence from the perspective of autocorrelation. Typical time-domain analysis steps are the following:
(1) Observing sequence features.
(2) Selecting the appropriate fitted model according to the features computed by SAS.
(3) Model testing and optimization.
(4) Using the fitted model to infer the nature of the sequence.
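The core contents of the time series analysis method were proposed by the statistician George E. P. Box. A sequence $\{\varepsilon_t\}$ is named a white noise sequence, displayed as $\varepsilon_t \sim WN(0, \sigma_\varepsilon^2)$.

Definition 1. The model is named the autoregressive moving average model, abbreviated as ARMA($p$, $q$), if it contains the following structure:

$$x_t = \phi_0 + \phi_1 x_{t-1} + \cdots + \phi_p x_{t-p} + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \cdots - \theta_q \varepsilon_{t-q}, \quad \phi_p \ne 0,\ \theta_q \ne 0.$$

Introducing the delay operator $B$ (with $B x_t = x_{t-1}$), it can also be presented as

$$\Phi(B) x_t = \phi_0 + \Theta(B) \varepsilon_t,$$

where $\Phi(B) = 1 - \phi_1 B - \cdots - \phi_p B^p$ and $\Theta(B) = 1 - \theta_1 B - \cdots - \theta_q B^q$.

Cointegration Theory. The cointegration theory was put forward by Engle and Granger in 1987 [17]. If a cointegration relationship holds, the model can be estimated without requiring that all sequences be stationary. The typical cointegration test is the Engle-Granger (EG) test [18][19][20].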
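To make the ARMA structure concrete, the recursion can be simulated directly. The sketch below is illustrative only: `simulate_arma` is a hypothetical helper, the sign convention (moving average terms subtracted) is an assumption that matches the $1 - \theta B$ form used for the fitted model later in the paper, and values before the start of the sample are taken as zero.

```python
import random

def simulate_arma(phi, theta, eps, phi0=0.0):
    """Generate x_t = phi0 + sum_i phi[i] * x_{t-1-i}
                      + eps_t - sum_j theta[j] * eps_{t-1-j}.
    Terms that would reach before the start of the sample are dropped."""
    x = []
    for t in range(len(eps)):
        ar = sum(phi[i] * x[t - 1 - i]
                 for i in range(len(phi)) if t - 1 - i >= 0)
        ma = sum(theta[j] * eps[t - 1 - j]
                 for j in range(len(theta)) if t - 1 - j >= 0)
        x.append(phi0 + ar + eps[t] - ma)
    return x

# Impulse response of an ARMA(1, 1) with phi_1 = 0.5 and theta_1 = 0.4.
impulse = simulate_arma(phi=[0.5], theta=[0.4], eps=[1.0, 0.0, 0.0])

# The same model driven by Gaussian white noise (seeded for repeatability).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
series = simulate_arma([0.5], [0.4], noise)
```

Setting `theta=[]` gives a pure AR($p$) process and `phi=[]` a pure MA($q$) process, which are the two special cases fitted in the sections that follow.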
Definition 2. Supposing that the response variable sequence $\{y_t\}$ and the input variable sequences $\{x_{1t}\}, \ldots, \{x_{kt}\}$ are all stationary, the regression model of the response sequence on the input variable sequences is established as

$$y_t = \mu + \sum_{i=1}^{k} \frac{\Theta_i(B)}{\Phi_i(B)} B^{l_i} x_{it} + \varepsilon_t,$$

where $\Phi_i(B)$ and $\Theta_i(B)$ are the autoregressive and moving average polynomials of the $i$th input variable, $l_i$ is its delay order, and $\{\varepsilon_t\}$ is the regression residual sequence.

In the actual modeling process, an improved ARIMAX model is proposed to forecast the short-term electric load. The specific process is displayed below.

The Improved ARIMAX Modeling Process
(2) Check the stationarity of the log-transformed sequences. If the sequences are stationary, move on to the next step; if not, difference the logarithmic sequences and test stationarity again, then apply second-order and higher-order differencing until stationarity is satisfied.
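(5) Explore the correlation coefficient between the stationary $N$th-order differenced logarithmic sequences $\nabla^N \ln y_t$ and $\nabla^N \ln x_t$ to determine the structure of the improved ARIMAX model. This step is the improved part relative to the traditional ARIMAX model. Therefore, the revised ARIMAX model can be calculated as follows:

$$\nabla^N \ln y_t = \mu + \sum_{i=1}^{k} \frac{\Theta_i(B)}{\Phi_i(B)} B^{l_i} \nabla^N \ln x_{it} + \varepsilon_t, \qquad \varepsilon_t = \frac{\Theta(B)}{\Phi(B)} a_t,$$

where $\Phi_i(B)$, $\Theta_i(B)$, and $l_i$ are the autoregressive polynomial, moving average polynomial, and delay order of the $i$th input sequence, $\{\varepsilon_t\}$ is the residual sequence, and $\{a_t\}$ is a zero mean white noise sequence.
Based on the above steps, the improved ARIMAX model can be applied to the load forecasting process.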
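The log-transform and differencing loop in the modeling steps can be sketched as follows. This is a simplified illustration under stated assumptions: the helper names are hypothetical, the stationarity check is passed in as a function because the paper relies on a unit root test computed in SAS, and the `small_range` predicate is only a toy stand-in for such a test.

```python
import math

def log_transform(series):
    """Natural logarithm of every observation (values must be positive)."""
    return [math.log(v) for v in series]

def difference(series):
    """First-order difference: x_t - x_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]

def difference_until_stationary(series, is_stationary, max_order=2):
    """Log-transform, then difference until is_stationary accepts the
    sequence or max_order is reached. Returns (order, sequence)."""
    current = log_transform(series)
    order = 0
    while not is_stationary(current) and order < max_order:
        current = difference(current)
        order += 1
    return order, current

def small_range(seq, tol=1e-9):
    """Toy stand-in for a unit root test: accept a constant sequence."""
    return max(seq) - min(seq) < tol

# Hypothetical load growing by 10 percent per day: ln(load) is linear,
# so one difference leaves the constant ln(1.1).
load = [100.0 * 1.1 ** t for t in range(6)]
order, result = difference_until_stationary(load, small_range)
```

In practice the predicate would wrap a proper unit root test, and the same differencing order would be applied to both the response and the input sequences before their correlation is examined.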

Load Forecasting with ARMA Model
5.1. Load Data. The table in the appendix shows the daily maximum power load data and the maximum temperatures in a city from 1st June to 14th August (see Table 12). In this section, classical time series models and ANN models are first applied to these data to forecast load. In Section 6, an improved ARIMAX model is established to compare the prediction accuracy [21][22][23].
Figure 2 shows that the load data have an upward trend and clear cyclical fluctuations, indicating that the sequence is nonstationary [24,25].

Establishing ARMA Model.
After the time series analysis of the load data in SAS, the autocorrelation table is obtained. Table 1 shows that the autocorrelation coefficients of the sequence are always positive [26][27][28]. It can be inferred that the daily peak power load data form a nonstationary series with a monotonic trend, as shown in Figure 2.
At the same time, the partial autocorrelation table can be obtained. Table 2 shows that only the first-order partial autocorrelation coefficient is significantly greater than two standard errors [29]. The remaining partial autocorrelation coefficients rapidly decline to zero and fluctuate randomly within the range of two standard errors. Thus the partial autocorrelation can be regarded as truncated after the first order.
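The autocorrelation and partial autocorrelation checks described here can be reproduced with a short sketch. The code below is an illustration, not the SAS computation: the function names are hypothetical, the partial autocorrelations come from the Durbin-Levinson recursion, and the two-standard-error band is approximated by $2/\sqrt{n}$, which is an assumption. The value $n = 75$ assumes one observation per day from 1st June to 14th August.

```python
import math

def acf(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]

def pacf_from_acf(rho):
    """Durbin-Levinson recursion. rho[0] must be 1.
    Returns the partial autocorrelations at lags 1 .. len(rho) - 1."""
    pacf, prev = [], []          # prev holds phi_{k-1,1} .. phi_{k-1,k-1}
    for k in range(1, len(rho)):
        num = rho[k] - sum(prev[j] * rho[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(prev[j] * rho[j + 1] for j in range(k - 1))
        phi_kk = num / den
        prev = [prev[j] - phi_kk * prev[k - 2 - j] for j in range(k - 1)]
        prev.append(phi_kk)
        pacf.append(phi_kk)
    return pacf

def significant_lags(values, n):
    """Lags (1-based) whose coefficient exceeds the 2 / sqrt(n) band."""
    bound = 2.0 / math.sqrt(n)
    return [k + 1 for k, v in enumerate(values) if abs(v) > bound]

# Theoretical ACF of an AR(1) process with coefficient 0.5: the PACF
# cuts off after lag 1, the pattern the text reads from Table 2.
ar1_rho = [1.0, 0.5, 0.25, 0.125]
ar1_pacf = pacf_from_acf(ar1_rho)
cutoff = significant_lags(ar1_pacf, n=75)
```

A partial autocorrelation that is significant only at lag 1 is the usual signature of an AR(1) process, which motivates the model choice in the next paragraph.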
According to the white noise test, the $P$ value (probability) of the test statistic is less than 0.05; thus the sequence is not white noise. The AR(1) model is then applied to forecast the power load data. In the residual autocorrelation test of the AR(1) model, the $P$ value of the statistic is larger than 0.05; thus the model is adequate.
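A white noise check of this kind is commonly based on a portmanteau statistic. The sketch below assumes the Ljung-Box form of the statistic and compares it with tabulated 5 percent chi-square critical values instead of computing a $P$ value; the function names and the small lookup table are introduced here for illustration only.

```python
def autocorrelations(series, max_lag):
    """Sample autocorrelations r_0 .. r_max_lag."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t + k] for t in range(n - k)) / c0
            for k in range(max_lag + 1)]

def ljung_box_q(series, max_lag):
    """Ljung-Box statistic Q = n (n + 2) * sum_k r_k^2 / (n - k)."""
    n = len(series)
    r = autocorrelations(series, max_lag)
    return n * (n + 2) * sum(r[k] ** 2 / (n - k)
                             for k in range(1, max_lag + 1))

# Upper 5 percent points of the chi-square distribution, keyed by
# degrees of freedom (only a few common lag choices are listed).
CHI2_CRITICAL_95 = {2: 5.991, 6: 12.592, 12: 21.026}

def looks_like_white_noise(series, max_lag):
    """True when Q stays below the 5 percent critical value, which
    corresponds to a P value above 0.05."""
    return ljung_box_q(series, max_lag) < CHI2_CRITICAL_95[max_lag]

# A steadily trending sequence is strongly autocorrelated, so the
# white noise hypothesis is rejected.
trend = [float(t) for t in range(1, 31)]
trend_is_white_noise = looks_like_white_noise(trend, 6)
```

When the check is applied to the residuals of a fitted model, the degrees of freedom are usually reduced by the number of estimated parameters; that adjustment is omitted here for brevity.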
After the SAS processing, the fitted AR(1) model is obtained. In order to optimize the ARMA model, the minic option is used to detect the best order [30]. Setting the ARMA($p$, $q$) model as $x_t = \mu + (\Theta(B)/\Phi(B)) \varepsilon_t$, the option selects ARMA(3, 2), and the corresponding ARMA(3, 2) model is fitted.
Firstly, logarithmic transformation is applied to the load sequence $\{y_t\}$ and the temperature sequence $\{x_t\}$. Thus the sequences can meet the homogeneity of variance assumption. The white noise test of the $\ln y$ sequence indicates that it is not a white noise sequence. The unit root test shows that the $P$ value of the $\tau$ statistic is significantly greater than 0.05, which suggests that the $\ln y$ sequence is nonstationary and has at least one unit root. The analysis of the $\ln x$ sequence is similar to that of $\ln y$ [23][24][25].
Secondly, the first-order difference operator is applied to the $\{\ln y_t\}$ and $\{\ln x_t\}$ sequences to get the stationary sequences $\{\Delta \ln y_t\}$ and $\{\Delta \ln x_t\}$:

$$\ln y_t,\ \ln x_t \rightarrow \Delta \ln y_t,\ \Delta \ln x_t.$$

Thirdly, the stationarity test and the white noise test are applied to the first-order differenced logarithmic sequences $\{\Delta \ln y_t\}$ and $\{\Delta \ln x_t\}$. The test result shows that the $P$ value of the white noise test is greater than 0.05, which means that the $\{\Delta \ln y_t\}$ and $\{\Delta \ln x_t\}$ sequences are purely random white noise sequences [26]. The $P$ value of the $\tau$ statistic is less than 0.05, showing that the $\{\Delta \ln y_t\}$ and $\{\Delta \ln x_t\}$ sequences are stationary series. This completes the test analysis.

Computing
Secondly, the $\Delta \ln x$ model is established (see Table 6). The test results obtained by SAS in Tables 3 to 5 show that $\Delta \ln x$ is a stationary white noise sequence (the $P$ value in Table 3 is larger than 0.05, while the $P$ value in Table 4 is smaller than 0.05). The best order for the ARMA($p$, $q$) model is [AR(0), MA(4)]. Therefore, the fitting model is the ARMA(0, 4), or MA(4), model. The constant term is not significant, so the noint option is used to remove the intercept. The final fitting model is shown as [31]

$$\Delta \ln x_t = (1 - 0.45098 B^4) \varepsilon_t.$$

It can be found in Table 7 that only the zero-order delay cross-correlation coefficient is significantly nonzero, which means that there is no lag effect between the response sequence and the input sequence. Therefore, the model should relate the two sequences in the same period.
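The lag structure read from Table 7 rests on sample cross-correlations at different delays. A minimal sketch of that computation is given below; the function name is hypothetical, and the prewhitening (filtering both sequences through the fitted ARIMA model before correlating them) that the paper applies is deliberately left out to keep the example short.

```python
import math

def cross_correlation(x, y, lag):
    """Sample cross-correlation between x_t and y_{t+lag} for lag >= 0.
    Both sequences must have the same length."""
    if len(x) != len(y):
        raise ValueError("sequences must have equal length")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [v - mean_x for v in x]
    dy = [v - mean_y for v in y]
    cov = sum(dx[t] * dy[t + lag] for t in range(n - lag))
    scale = math.sqrt(sum(d * d for d in dx) * sum(d * d for d in dy))
    return cov / scale

# Hypothetical input sequence and a response that follows it in the
# same period: the lag-0 coefficient is 1, later lags are smaller.
temp_change = [1.0, 2.0, 3.0, 4.0, 5.0]
load_change = [2.0 * v + 1.0 for v in temp_change]
same_period = cross_correlation(temp_change, load_change, 0)
one_day_lag = cross_correlation(temp_change, load_change, 1)
```

If only the lag-0 value stands out from the significance band, a same-period regression between the two differenced sequences is sufficient, which is the conclusion drawn in the text.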
The regression analysis in Table 8 shows that the final regression coefficient is 0.37098.
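For a single input sequence, a regression coefficient of this kind is the ordinary least squares slope of the differenced log load on the differenced log temperature. The sketch below shows that calculation on made-up numbers; `ols_fit` is a hypothetical helper, and whether the fitted model in Table 8 retains an intercept is not stated in the text, so the intercept here is an assumption.

```python
def ols_fit(x, y):
    """Ordinary least squares for y = intercept + slope * x.
    Returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Made-up data lying exactly on the line y = 1 + 2 x.
intercept, slope = ols_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
```

Because both sequences are in differenced logarithms, the slope can be read as an elasticity: the approximate percentage change in load associated with a one percent change in temperature.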
A statistical test is carried out on the residual sequence, showing that it is a stationary white noise sequence (Pr > 0.05). The fitted model for the residual sequence is $\varepsilon_t = a_t$, where $a_t$ is a zero mean white noise sequence [32][33][34][35].
Table 7 shows that there is a significant zero-order correlation between the two sequences. The same-period model between $\Delta \ln y$ and $\Delta \ln x$ is established based on the parameter estimates in Table 8 and the tests in Tables 9 and 10. The $P$ value in Table 9 is larger than 0.05; thus the autocorrelation check of residuals shows that the model is effective for forecasting loads. The load from 15th to 31st August is forecasted according to the improved ARIMAX model.
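A same-period model in differenced logarithms can be turned back into a load forecast. As a sketch, assume the model has the form $\Delta \ln y_t = \beta\, \Delta \ln x_t + a_t$ with no intercept and zero mean white noise $a_t$, and take $\beta = 0.37098$ from Table 8; the exact fitted equation is not shown in the text, so this form is an assumption. Setting $a_t$ to zero gives $\hat{y}_t = y_{t-1} (x_t / x_{t-1})^{\beta}$. The function names and the numbers in the example are hypothetical.

```python
BETA = 0.37098  # regression coefficient reported in Table 8

def forecast_next_load(prev_load, prev_temp, next_temp, beta=BETA):
    """One-step forecast from ln y_t = ln y_{t-1} + beta * (ln x_t - ln x_{t-1}),
    assuming no intercept and a zero-mean residual."""
    return prev_load * (next_temp / prev_temp) ** beta

def forecast_path(last_load, last_temp, future_temps, beta=BETA):
    """Chain one-step forecasts over a list of future temperatures."""
    loads = []
    load, temp = last_load, last_temp
    for nxt in future_temps:
        load = forecast_next_load(load, temp, nxt, beta)
        loads.append(load)
        temp = nxt
    return loads

# Hypothetical values: last observed load 1000 at 30 degrees, followed
# by a warmer day and then a return to 30 degrees.
path = forecast_path(1000.0, 30.0, [33.0, 30.0])
```

Multi-day forecasts of this kind depend on temperature values for the forecast days, so their accuracy is limited by the quality of the temperature forecast as well as by the model.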
Here $\hat{y}_i$ is the prediction value, $y_i$ is the actual value, and $n$ is the sample size. Table 11 shows that the MAE of the improved ARIMAX model is the smallest, the Sigmoid-Function ANN ranks second, followed by the AR model, and the ARMA model has the largest MAE. According to the AIC and SBC criteria in Table 11, the improved ARIMAX model is also better than the S-ANN, AR, and ARMA models. The revised ARIMAX model is therefore more effective and yields more accurate load forecasts.
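The comparison measures named here can be computed as in the following sketch. The definitions are assumptions: MAE is taken as the mean absolute error, and AIC and SBC use the usual likelihood-based forms, $-2 \ln L + 2k$ and $-2 \ln L + k \ln n$; the paper's own formulas are not shown in the text. All numbers in the example are made up.

```python
import math

def mean_absolute_error(actual, predicted):
    """Average of |predicted_i - actual_i| over the sample."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)

def aic(log_likelihood, n_params):
    """Akaike information criterion: -2 ln L + 2 k."""
    return -2.0 * log_likelihood + 2.0 * n_params

def sbc(log_likelihood, n_params, n_obs):
    """Schwarz Bayesian criterion: -2 ln L + k ln n."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)

# Made-up example: two forecasts that each miss by 10 units.
mae = mean_absolute_error([100.0, 200.0], [110.0, 190.0])
aic_value = aic(log_likelihood=-50.0, n_params=2)
sbc_value = sbc(log_likelihood=-50.0, n_params=2, n_obs=100)
```

For all three measures a smaller value indicates a better model, and SBC penalizes additional parameters more heavily than AIC once the sample has more than about seven observations.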
In Figure 3, the blue line is the actual daily maximum power load data, while the red line is the forecast of the improved ARIMAX model. The difference between the improved ARIMAX forecast and the actual power load data is the smallest among these models. Residual stationarity and white noise tests show that the residual is a stationary white noise sequence, that is, $\varepsilon_t = a_t$. There is second-order delay correlation between $\ln y$ and $\ln x$.

Conclusion
Based on the above analysis, the improved ARIMAX model can effectively extract the autocorrelation information of load data. As an effective method for short-term load forecasting, the model gives more accurate prediction results than traditional time series models. Prediction accuracy of this

4.1. Modeling Steps. These are as follows: (1) Perform logarithmic transformation on the original response sequence and the input sequences in order to meet the homogeneity of variance assumption.

Table 3: Autocorrelation check for white noise.
$\Delta \ln y_t$ and $\Delta \ln x_t$ Sequences Model. Firstly, the $\Delta \ln y$ model is established. The test shows that $\Delta \ln y$ is a stationary white noise sequence; thus the fitting model is

Table 6: Fitting parameter of $\Delta \ln x$ model.

Table 7: Cross-correlation coefficient table of $\Delta \ln y$ and $\Delta \ln x$.

Table 8: Parameter estimates of REG procedure.

Table 9: Autocorrelation check of residuals.

Table 10: Crosscorrelation check of residuals with input $\ln x$.
$\{\Delta \ln x_{1t}\}, \ldots, \{\Delta \ln x_{kt}\}$ and the response variable sequence $\{\Delta \ln y_t\}$. The cross-correlation coefficients between the independent variables and the response variable are calculated after filtering by the ARIMA analysis process.

Table 11: Predictions of different models.

Table 12: Daily maximum power load and temperature data in some city in 2010.

Notation
$\mathrm{Net}_j$: The state of network unit $u_j$
$o_j$: Output unit
$\delta_j$: Output (hidden) layer unit
$w_{ij}(0)$: Initial value of weight or threshold
$x_p$: Training samples
$\varepsilon$: Accuracy requirements
$\{x_t\}$: Time series
$\mu$: Mean of time series $\{x_t\}$
$\phi_i$: Autoregressive coefficient